Journal of Library and Information Science in Agriculture

   

Multimodal Learning Technology Aimed at Exploring the Innovative Path of Library Intelligence Service

Yuanyuan SANG   

  1. Hefei University of Technology, Hefei 230009
  • Received:2024-08-30 Online:2025-03-05

Abstract:

[Purpose/Significance] The evolution of smart libraries has entered a new stage, marked by the integration of multimodal learning technologies that combine information from modalities such as speech, images, and video. These technologies are reshaping traditional information service systems by providing a more interactive, efficient, and personalized user experience. Unlike earlier studies that focus on single-mode interaction, this research examines the role of multimodal technologies in transforming library services and increasing user engagement. It contributes to library science in three respects: improving knowledge dissemination, enhancing user-centered services, and addressing emerging challenges in digital information management. The findings enrich the theoretical framework of smart libraries and provide practical insights into the design and deployment of advanced information services.

[Method/Process] This study takes a multidisciplinary approach, drawing on library science, information technology, and human-computer interaction theory. It systematically reviews the historical development and theoretical foundations of multimodal learning technologies and emphasizes their relevance to intelligent library ecosystems. The analysis is organized around four key application areas: intelligent navigation, intelligent question-and-answer systems, user education with intelligent support, and immersive reading experiences. These areas are explored through case studies and a detailed analysis of current library practice. To evaluate the practical impact of these technologies, the study uses qualitative methods, analyzing user feedback and system performance metrics. It also identifies current barriers to adoption, such as data privacy concerns, technology costs, and disparities in user acceptance across demographic groups.

[Results/Conclusions] The results show that multimodal learning technologies significantly enhance the functionality and user experience of smart libraries. They improve the accuracy of information retrieval, support more interactive and immersive learning environments, and enable personalized services tailored to individual needs. Despite these advantages, challenges remain, particularly in securing user data, reducing deployment costs, and increasing accessibility for underprivileged users. The study proposes actionable strategies to address these issues, including enhancing system interoperability, refining ethical frameworks, and fostering human-computer collaboration to lower barriers to technology adoption. It also identifies gaps in current research, such as the need for more empirical studies of long-term user interaction patterns and of the scalability of multimodal systems in large library networks. Future studies could explore the integration of emerging technologies such as augmented reality (AR) and artificial intelligence (AI) into multimodal library services to further improve their efficiency and reach. By providing a framework and practical strategies, this study contributes to the ongoing discourse on smart library innovation and supports more sustainable and inclusive information service models. It underscores the potential of multimodal technologies to reshape library science and advance the global digital information landscape.

Key words: multimodal learning, smart libraries, intelligent services, path innovation, multimodal large language models, future learning center

CLC Number: G252

Table 1

Key application areas and scenarios for multimodal learning technologies

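Application area and scenario | Technology type | Typical cases
Smart education: provides immersive learning experiences that combine visual, auditory, and text input to improve learning outcomes | Multimodal teaching system | Coursera's multimodal learning system; Edmodo's AI-assisted teaching platform; TAL Education's smart classroom
Healthcare: helps doctors assess patient health through images and speech | Multimodal medical diagnosis | Microsoft's Healthcare NExT; iFLYTEK's smart healthcare; Bayer's multimodal diagnosis system
Autonomous driving: uses camera, radar, and speech data for environment perception and decision-making | Multimodal perception system | Tesla's autonomous driving system; Waymo's autonomous driving platform; Baidu's Apollo technology
Entertainment: combines vision, speech, and motion capture to provide a more immersive gaming experience | Multimodal interactive entertainment system | Sony's PlayStation VR; Meta's Horizon Worlds; the Valve Index VR gaming experience
E-commerce: analyzes consumer behavior through images, speech, and text to provide personalized recommendations | Multimodal recommendation system | Amazon's Alexa shopping assistant; Alibaba DAMO Academy's multimodal recommendation engine; JD.com's AI recommendation
Smart finance: combines different types of data for risk assessment and fraud detection | Multimodal risk control and assessment system | Ping An Bank's multimodal risk control system; Visa's AI risk control platform; JPMorgan Chase's financial monitoring system
Smart home: controls smart devices through speech, image, and gesture recognition | Multimodal smart home control system | Google Nest's multimodal interaction platform; Xiaomi's AIoT smart home; Amazon's Echo Show
Intelligent customer service: uses speech, text, and facial expression recognition to provide intelligent services to users | Multimodal intelligent customer service system | Alibaba's "AliMe" (阿里小蜜); JD.com's "Jing Xiaozhi" (京小智); Tencent's intelligent customer service
Medical training: delivers training in virtual environments that combine images, speech, and haptic feedback | Multimodal medical training system | Johnson & Johnson's virtual training platform; the da Vinci robot surgical training system; Mindray's virtual training
Security: achieves comprehensive security monitoring through surveillance video, speech, and behavior analysis | Multimodal security system | Huawei's smart security platform; Hikvision's AI security system; Dahua Technology's multimodal monitoring technology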

Fig.1

Logical framework for multimodal intelligent interaction that enables smart library services
