Journal of Library and Information Science in Agriculture ›› 2023, Vol. 35 ›› Issue (8): 4-18. doi: 10.13998/j.cnki.issn1002-1248.23-0251

• Invited Review •

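Review of Deep Learning for Language Modeling (深度学习语言模型的研究综述)

WANG Sili (王思丽)1, ZHANG Ling (张伶)2, YANG Heng (杨恒)1, LIU Wei (刘巍)1

  1. Literature and Information Center of Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000;
  2. School of Management, Xinxiang Medical University, Xinxiang 453003
  • Received: 2023-04-20 Online: 2023-08-05 Published: 2023-12-04
  • About the authors: WANG Sili (b. 1985), female, PhD, associate research librarian; research interests: knowledge discovery and knowledge organization. ZHANG Ling (b. 1987), female, PhD, lecturer; research interests: knowledge discovery and knowledge organization. YANG Heng (b. 1992), male, master's degree, librarian; research interests: natural language processing and deep learning. LIU Wei (b. 1980), male, master's supervisor, associate research librarian; research interests: knowledge computing and knowledge mining.
  • Funding:
    Gansu Province Philosophy and Social Science Planning Project "Research on Enhancing the Public Opinion Supervision Capacity of News Media Based on Big Data Technology" (2021YB158); Natural Science Foundation of Gansu Province "Research on the Management Model and Reuse Mechanism of Healthcare Big Data Assets in Gansu Province" (23JRRA581)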

Abstract: [Purpose/Significance] Deep learning language models are currently one of the main methods for improving the language intelligence of machines, and they have become an indispensable technical means for the automatic processing and analysis of data resources and the intelligent mining of information and knowledge. However, the library and information science (LIS) field still faces difficulties in using them for technology development and application services. This study therefore systematically reviews the research progress, technical principles, and development methods of deep learning language models, with the aim of providing a theoretical basis and practical methodological paths that help librarians and fellow practitioners understand and apply them. [Method/Process] The data used in this study were collected from the WOS core database, the CNKI literature database, the arXiv preprint repository, the GitHub open-source software hosting platform, and open resources on the Internet. Based on these data, the paper first systematically surveys the background, basic feature representation algorithms, and representative application development tools of deep learning language models, traces their evolution and technical principles, and analyzes the advantages, disadvantages, and applicability of each algorithm model and development tool. Second, it analyzes in depth the challenges facing the development and application of deep learning language models and proposes two strategies for expanding their application capabilities.
[Results/Conclusions] The main challenges in applying and developing deep learning language models include the large number of parameters, which makes accuracy hard to tune; the dependence on large amounts of accurate training data, which makes the models hard to change; and potential intellectual property and information security issues. Future work could expand and improve their application capabilities in two ways: targeting specific domains and feature engineering. For specific domains, the focus is on collecting and preparing domain data, selecting the model architecture, involving domain experts, and optimizing for specific tasks, so that the model's data sources are more reliable and secure and its results are more accurate and practical. For feature engineering, the strategies include choosing appropriate features, feature pre-processing, feature selection, and feature dimensionality reduction. These strategies can help improve the performance and efficiency of deep learning language models and make them better suited to specific tasks or domains. In sum, LIS institutions should leverage deep learning language model technologies, guided by the needs of scientific research and social development and building on their existing literature data resources and knowledge services, to deliver innovative intelligent knowledge management and application services in professional or vertical domains and to develop technologies and systems with independent intellectual property rights; this is their path to long-term sustainable development.

Key words: deep learning, language model, neural network, pre-trained model, word embedding

CLC Number:

  • G202

Cite this article

王思丽, 张伶, 杨恒, 刘巍. 深度学习语言模型的研究综述[J]. 农业图书情报学报, 2023, 35(8): 4-18.

WANG Sili, ZHANG Ling, YANG Heng, LIU Wei. Review of Deep Learning for Language Modeling[J]. Journal of Library and Information Science in Agriculture, 2023, 35(8): 4-18.