Journal of Library and Information Science in Agriculture


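Factors Influencing the Communication Effectiveness of Intangible Cultural Heritage Short Videos: A Multimodal Machine Learning Approach

LIU Yihan, CHU Yuxia, ZHAI Yujia

  1. School of Management, Tianjin Normal University, Tianjin 300387
  • Received: 2025-10-17 Published online: 2025-12-17
  • About the authors:

    LIU Yihan (2001- ), female, master's student, School of Management, Tianjin Normal University; research interest: digital humanities

    CHU Yuxia (2000- ), female, master's student, School of Management, Tianjin Normal University; research interest: digital humanities

    ZHAI Yujia (1988- ), male, PhD, associate professor, School of Management, Tianjin Normal University; research interests: innovation diffusion, digital humanities, etc.

  • Funding:
    National Social Science Fund of China project "Knowledge Reorganization and Generation Optimization of Intangible Cultural Heritage Short Videos Based on Multimodal Intelligent Analysis" (24BTQ045)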

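Abstract (translated from the Chinese version):

[Purpose/Significance] To systematically analyze how multimodal symbols work together in intangible cultural heritage (ICH) short videos, this study builds a communication effect evaluation model that fuses multimodal data to identify key influencing factors, providing a scientific basis for optimizing the digital production and dissemination strategies of ICH content. [Method/Process] The study integrates the elaboration likelihood model (ELM) with media ritual theory to build a dual-path analytical framework. Taking ICH videos on Douyin as its object, it fuses multimodal content features from video, audio, and text with author attribute data, and uses random forest and other models to analyze feature importance and predict communication effectiveness. [Results/Conclusions] The experiments show that multimodal fusion significantly improves the accuracy of communication effect prediction. Author attributes contribute the most, followed by the text and video modalities; the audio modality is weak on its own but adds complementary semantic value. A limitation is that the study does not consider how dissemination changes over time or the influence of external events. In sum, ICH dissemination should follow a coordinated optimization strategy: take the author's social influence as the core driver and, on that basis, improve the quality of every content modality at the same time in order to maximize overall communication effectiveness.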

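Keywords: information diffusion, Sora, short video communication effect, intangible cultural heritage (ICH), multimodal data, feature engineering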
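
The feature-importance analysis described in the abstract can be illustrated with a minimal Python sketch. It assumes scikit-learn's `RandomForestRegressor`, a regression-style target, early fusion by simple concatenation, and impurity-based importances summed per modality; none of these choices is confirmed by the source. The data are synthetic, the block names and widths are hypothetical, and the target coefficients are chosen so that the author block dominates by construction. The ranking it produces therefore shows only the mechanics and is not evidence about the study's findings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 600

# Hypothetical feature blocks: names and widths are illustrative only.
blocks = {"author": 3, "text": 5, "video": 5, "audio": 5}
features = {
    name: rng.normal(size=(n_samples, width)) for name, width in blocks.items()
}

# Synthetic target (NOT the study's data). The coefficients are an
# assumption that builds the author > text > video > audio ordering in.
target = (
    3.0 * features["author"][:, 0]
    + 1.0 * features["text"][:, 0]
    + 0.5 * features["video"][:, 0]
    + 0.1 * features["audio"][:, 0]
    + rng.normal(scale=0.5, size=n_samples)
)

# Early fusion: concatenate all blocks into one feature matrix.
X = np.hstack([features[name] for name in blocks])
X_train, X_test, y_train, y_test = train_test_split(
    X, target, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Sum the per-column impurity importances inside each block.
group_importance = {}
start = 0
for name, width in blocks.items():
    block_scores = model.feature_importances_[start:start + width]
    group_importance[name] = float(block_scores.sum())
    start += width

ranking = sorted(group_importance, key=group_importance.get, reverse=True)
r2 = model.score(X_test, y_test)  # held-out R^2 of the fused model

for name in ranking:
    print(f"{name:>6}: {group_importance[name]:.3f}")
print(f"held-out R^2: {r2:.3f}")
```

Summing importances per block is one simple way to compare modalities of different widths. Impurity-based importances can be biased toward high-cardinality features, so a permutation-based check on held-out data would be a reasonable complement.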

Abstract:

[Purpose/Significance] Short video platforms have become the core arena for the digital presentation and dissemination of intangible cultural heritage (ICH). However, the "Matthew effect" in the digital attention economy often buries high-quality ICH content. Existing research largely suffers from "modal segmentation": it examines single modalities such as text or visuals in isolation and therefore cannot explain how these elements jointly drive user engagement. To address this gap, this study constructs a communication effect evaluation model based on multimodal machine learning. Its innovation lies in integrating computational communication methods with traditional persuasion theories, moving beyond simple content analysis to a quantifiable predictive framework. By identifying key influencing factors through data fusion, the study provides a scientific basis for optimizing digital production strategies for ICH content, which can help raise the visibility of traditional culture and lower the barriers to its digital dissemination.

[Method/Process] This study integrates the elaboration likelihood model (ELM) and media ritual theory to establish a "cognitive-behavioral-cultural" dual-path analytical framework. Theoretically, it maps content quality (video, audio, and text) to the "central route" and source credibility (author attributes) to the "peripheral route." Empirically, it takes ICH videos on Douyin as its subject, with data collected from May 2024 to May 2025. After rigorous cleaning, the dataset comprised 2,869 valid samples. The study uses a multimodal feature engineering approach: visual and textual features are extracted to represent content quality; audio features, including FBank and MFCC, are extracted with the OpenSMILE toolkit to capture prosodic and spectral characteristics; and author data are collected to quantify social influence. The Random Forest algorithm is then used to fuse these heterogeneous data sources, analyze feature importance, and predict communication effectiveness.

[Results/Conclusions] The empirical results show that the multimodal fusion model significantly outperforms single-modality approaches in predicting communication effects, which supports the view that ICH dissemination results from complex interaction among symbols. Feature importance analysis reveals a distinct hierarchy. Author attributes make the highest contribution, indicating that the "peripheral route," driven by the creator's social capital, is the decisive factor in a video's popularity ("communication heat"), with persuasive power far exceeding that of the content itself. Among the content modalities, text and video rank next and serve as critical tools for user retention, while the audio modality adds supplementary semantic value by setting the emotional atmosphere. A limitation is that the study does not account for dynamic temporal changes or external trending events. Effective ICH dissemination requires a synergistic strategy: prioritize the accumulation of the author's social influence as the core driver while simultaneously optimizing visual and textual quality. Future research should incorporate time-series analysis to capture dynamic communication trends.

Key words: information diffusion, Sora, short video communication effect, intangible cultural heritage (ICH), multimodal data, feature engineering
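
For readers unfamiliar with the FBank and MFCC audio features named in the method description, the sketch below shows how they can be computed for a single audio frame using only the Python standard library. It is not the OpenSMILE pipeline used in the study, whose configuration the abstract does not give. The frame length, sample rate, filter count, and test tone are all assumptions chosen for illustration, and pre-emphasis and framing of a full signal are omitted.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window one frame and return its one-sided power spectrum."""
    n = len(frame)
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):  # naive DFT, fine for a short frame
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [
        int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
        for m in mel_points
    ]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for f in range(n_filters):
        left, center, right = bins[f], bins[f + 1], bins[f + 2]
        for k in range(left, center):  # rising edge
            bank[f][k] = (k - left) / (center - left)
        for k in range(center, right):  # falling edge, peak at center
            bank[f][k] = (right - k) / (right - center)
    return bank


def log_fbank(frame, sample_rate, n_filters=10):
    """FBank: log energy in each mel filter for one frame."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    return [
        math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
        for row in bank
    ]


def mfcc(log_energies, n_coeffs=5):
    """MFCC: unnormalized DCT-II of the log filterbank energies."""
    m = len(log_energies)
    return [
        sum(e * math.cos(math.pi * c * (j + 0.5) / m)
            for j, e in enumerate(log_energies))
        for c in range(n_coeffs)
    ]


# Illustrative input: one 256-sample frame of a 1 kHz tone at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(256)]

fbank = log_fbank(frame, sample_rate)
coeffs = mfcc(fbank)
loudest_filter = max(range(len(fbank)), key=fbank.__getitem__)
print("loudest mel filter index:", loudest_filter)
```

In practice such per-frame vectors are usually summarized over a whole clip (for example by mean and variance) before being combined with other modalities; whether the study did so is not stated in the abstract.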

CLC number: G206

Cite this article

LIU Yihan, CHU Yuxia, ZHAI Yujia. Factors Influencing the Communication Effectiveness of Intangible Cultural Heritage Short Videos: A Multimodal Machine Learning Approach[J/OL]. Journal of Library and Information Science in Agriculture. https://doi.org/10.13998/j.cnki.issn1002-1248.25-0556.