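Affective Computing for Social Robots from the Perspective of Human-AI Interaction: A Literature Review and Theoretical Model Construction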

doi:10.13998/j.cnki.issn1002-1248.25-0734

Journal of Library and Information Science in Agriculture, 2026, Vol. 38, Issue 1: 4-17. doi: 10.13998/j.cnki.issn1002-1248.25-0734

Special topic: Embodied Intelligence

WU Dan¹,³, XU Huaqing¹,²

1. Human-Computer Interaction and User Behavior Research Center, Wuhan University, Wuhan 430072
2. School of Information Management, Wuhan University, Wuhan 430072
3. School of Information Management, Central China Normal University, Wuhan 430079

Received: 2025-11-22 Online: 2026-01-05 Published: 2026-03-10
About the authors:
WU Dan (1978- ), female, PhD, professor; research interests: human-computer interaction, smart libraries, user information behavior
XU Huaqing (2000- ), male, PhD candidate; research interest: human-computer interaction
Funding:
Innovation Group Project of the Hubei Provincial Natural Science Foundation, "Human-Centered Artificial Intelligence Innovative Applications" (2023AFA012)

Abstract

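Abstract (translated from the Chinese version):

[Purpose/Significance] Against the strategic backdrop of embodied intelligence shifting from industrial automation to public-welfare services, social robots face two practical difficulties: insufficient interaction stickiness and poor contextual understanding. Affective computing, the core technology that enables machines to perceive, understand, and simulate human emotions, is key to making embodied intelligence socially capable. This study analyzes the technical pathways of multimodal perception, dynamic adaptation strategies, and ethical boundaries, and provides a theoretical reference for building a responsible human-AI interaction system. [Method/Process] Following PRISMA guidance, the study searched Web of Science for literature from the past 10 years at the intersection of embodied intelligence and affective computing. Records were screened against criteria of embodiment, technical completeness, and empirical interaction evidence; entries whose full text could not be obtained were excluded on grounds of content completeness, and 97 core articles were finally selected. The perception layer is analyzed along four dimensions (robust visual perception, paralinguistic decoding, physiological signal insight, and fusion of multi-source heterogeneous data), and generative adaptation strategies driven by large language models are discussed. [Results/Conclusions] Affective computing for social robots is undergoing a paradigm shift from single-signal statistics to multimodal semantic fusion, and from static rule mapping to generative dynamic adaptation. The study confirms that multimodal perception is in essence a deep deconstruction of human intent rather than simple data statistics. On this basis, it constructs a dynamic interaction framework with contextual understanding as the starting point, adaptive action as the core, and ethical constraints as the bottom line. The framework stresses that affective adaptation should move from mechanical imitation to cognitive empathy, achieving personalized and coherent interaction through generative strategies driven by large language models. Ethical boundaries are not externally imposed regulations but logical constraints built into algorithmic decision-making, intended to address endogenous risks such as privacy asymmetry and psychological manipulation. Future innovation paradigms should be grounded in the ecological validity of real-world environments, counter the fading of the novelty effect through lifelong learning mechanisms that incorporate long-term memory, and establish human-in-the-loop safety circuit breakers, thereby ensuring sovereignty, security, and technology for good as embodied intelligence enters the human inner world.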

Keywords: human-AI interaction, social robots, embodied intelligence, affective computing, systematic literature review

Abstract:

[Purpose/Significance] Against the backdrop of a strategic transition from industrial efficiency to embodied intelligence within the "silver economy," social robots are evolving from functional tools into social companions. However, the field faces a critical bottleneck: weak interaction stickiness and empathetic resonance, which leads to high abandonment rates. Affective computing (AC) is the core technology for bridging this gap, enabling machines to detect, interpret, and simulate human emotions. Unlike previous literature, which often treats AC as a standalone algorithmic task, this study re-examines the value of AC from a Human-AI Interaction (HAI) perspective. This approach responds to the requirements of China's 15th Five-Year Plan for secure and controllable AI governance by integrating technical pathways with ethical boundaries. By situating social robots within complex social relationships, the study provides a theoretical roadmap for robots to transition from mechanical entities to responsible social agents, thereby supporting the high-quality development of population-centric services.

[Method/Process] This study conducts a systematic literature review guided by the PRISMA framework to ensure rigor and comprehensiveness. The Web of Science Core Collection served as the primary data source, with a search window from 2015 to 2025 chosen to capture the paradigm shifts triggered by deep learning and large language models. A tripartite search logic, combining subject entities (social robots), core technologies (affective computing), and interaction contexts (human-robot interaction), was used to retrieve relevant literature. After multi-level screening based on embodiment, technical integrity, and empirical validity, 97 high-quality articles were selected. The study uses CiteSpace for keyword clustering and citation burst analysis, mapping the evolution of the field across three stages: foundational signal processing (2018-2019), dynamic adaptation models (2020-2022), and generative-driven intelligence and ethical regulation (2023-2025). This systematic approach supports a deep synthesis of multimodal perception technologies, including robust vision, paralinguistic decoding, and physiological signal sensing.

[Results/Conclusions] The findings reveal a significant paradigm shift in affective computing for social robots: from simple signal statistics to deep situational understanding, and from static rule-based responses to generative dynamic adaptation. The study proposes a holistic interaction framework built on three pillars: situational understanding, adaptive action, and ethical constraints. Situational understanding leverages multimodal semantic fusion to decode human intent beyond surface-level data, while adaptive action ensures cross-modal consistency in physical expression through generative AI and long-term memory architectures. Ethical constraints are identified as an internal safety mechanism rather than external regulation, addressing risks such as privacy asymmetry, cultural bias in datasets, and psychological manipulation stemming from high anthropomorphism. The study concludes that the future of social robotics lies in three innovative paradigms: enhancing ecological validity through real-world deployment, constructing lifelong learning mechanisms to sustain long-term relationships, and embedding "human-in-the-loop" ethical fuses directly into algorithmic architectures. Despite these advances, the research is currently limited by a lack of diverse cultural data and long-term field studies. Future research should prioritize cross-cultural design and the development of explainable affective decision-making modules to ensure the sustainable and benevolent development of embodied intelligence in complex social environments.

Keywords: human-AI interaction, social robots, embodied intelligence, affective computing, systematic literature review

CLC number: G250

WU Dan, XU Huaqing. Affective Computing for Social Robots from the Perspective of Human-AI Interaction: A Literature Review and Theoretical Model Construction[J]. Journal of Library and Information Science in Agriculture, 2026, 38(1): 4-17.

Figures/Tables (6): Table 1, Figure 1, Figure 2, Table 2, Table 3, Figure 3

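Region/Dimension	14th Five-Year Plan period	15th Five-Year Plan period	Core changes and research implications
Domestic: strategic positioning	Emphasis on building technological reserves: "14th Five-Year Plan for Intelligent Manufacturing Development" and "14th Five-Year Plan for Robot Industry Development"	Status as a future industry established: "Recommendations of the CPC Central Committee for Formulating the 15th Five-Year Plan for National Economic and Social Development"	Elevated from technological research to a national strategic industry, affirming the central position of robots
Domestic: application scenarios	Industrial and specialized fields: emphasis on "Robot+ applications", focused on standardized operations	People's livelihood and the silver economy: the "high-quality population development" section of the 15th Five-Year Plan Recommendations	Focus shifts to non-standardized social spaces, establishing the necessity of affective computing in elderly care
Domestic: ethical regulation	Soft ethical guidance, based on the "Ethical Norms for New Generation Artificial Intelligence"	Rigid safety system: 15th Five-Year Plan Recommendations	Shift from ethical advocacy to institutional safeguards, requiring research to build in "controllable and governable" constraint mechanisms
International: EU	Risk-tiering concept: draft Artificial Intelligence Act	Legally binding constraints: Artificial Intelligence Act	Prohibitions took effect in 2025, delineating the forbidden zones and red lines for affective computing
International: US	Executive order: "Maintaining American Leadership in Artificial Intelligence"	Substantive risk control: AI Risk Management Framework and "National Biometric Information Privacy Act"	Shift toward state-level legislation and algorithmic accountability, with an emphasis on transparency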

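Perception modality	Key technologies	Interaction problems addressed
Visual perception	3D face orientation features, thermal imaging fusion, stacked convolutional autoencoder (SCAE), lightweight AU detection	Abrupt lighting changes (e.g., night or strong light), facial occlusion, and loss of facial features under non-frontal poses; high latency and insufficient computing resources when high-accuracy vision algorithms run on the robot's onboard hardware
Auditory perception	SSL Transformer, far-field speech recognition, deep CNN intensity grading, paralinguistic cue decoding	Indoor reverberation, overlapping speakers, and mechanical noise from the robot's own motors; failure to recognize irony and puns, and loss of the urgency and emotional intensity carried by tone when relying on text transcription alone
Physiological and behavioral	BCI/EEG classification, BVP-based engagement prediction, EDA/HRV monitoring, emotional gait analysis	Surface expressions (e.g., polite fake smiles) masking genuine negative emotions or cognitive load; special populations (e.g., children with ASD, older adults with aphasia) unable to express needs accurately through ordinary facial or verbal channels
Multimodal real-time recognition	LLM-driven (TriMER/GPT), broad-deep fusion network (BDFN), sequential recurrent convolution network (SRCN)	Ambiguity in emotion understanding caused by spatiotemporal asynchrony of heterogeneous signals; high inference latency of cloud-based large models disrupting the synchrony of human-AI interaction (e.g., pauses at turn-taking)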

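Strategy dimension	Core interaction problem	Key approaches/models
Scenario-driven decision-making	How can robots move beyond static rules and respond dynamically to users' deeper intentions?	Cognitive appraisal based on intent inference; "showing vulnerability" to evoke caregiving instincts; TriMER/LLM-Gen generative real-time decision-making based on large models
Cross-modal embodied expression	How can the "uncanny valley" effect, caused by the disconnect between high fidelity in a single modality and the overall expression, be resolved?	Face2Gesture end-to-end synchronized "speech-face-body" generation; emotional conveyance through non-verbal sounds; de-anthropomorphized expression through animal forms
Long-term relationship building	How can the fading of the novelty effect be overcome and interaction stickiness sustained over long periods?	Recall of past events based on autobiographical memory; mutual-learning framework for bidirectional human-robot adaptation; real-time feedback-based parameter updating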

Related articles (4)

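[1]	Hao Yali, Liang Ying, Ding Ruoxi. Mechanism analysis and risk regulation of affective-intelligent embodied intelligence embedded in social governance from the perspective of "instrumental-value" rationality[J]. Journal of Library and Information Science in Agriculture, 2026, 38(1): 18-27.
[2]	Ren Fubing, Luo Ya. Evolution mechanism of users' online collective behavior from the perspective of cognitive bias[J]. Journal of Library and Information Science in Agriculture, 2025, 37(9): 18-31.
[3]	Hao Yali, Song Yifei, A Zhongping, Liang Ying. Situation analysis and guidance strategies for online public opinion on agriculture-related emergencies based on affective computing[J]. Journal of Library and Information Science in Agriculture, 2025, 37(10): 37-52.
[4]	Liu Yang, Lyu Shuyue, Li Ruojun. Concepts, tasks, and applications of social robots in information behavior research[J]. Journal of Library and Information Science in Agriculture, 2024, 36(3): 4-20.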