Journal of Library and Information Science in Agriculture ›› 2023, Vol. 35 ›› Issue (10): 4-33. doi: 10.13998/jcnki.issn1002-1248.23-0850

• Expert Written Discussion •

The AI-Driven Fifth Research Paradigm (AI4S): Transformation and Observations

SUN Tan1,2*, ZHANG Zhixiong3,4,5*, ZHOU Lihong6*, WANG Dongbo7*, ZHANG Hai8*, LI Baiyang9*, YONG Suhua10*, ZUO Wangmeng11*, YANG Guanglei11*

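  1. Chinese Academy of Agricultural Sciences, Beijing 100081;
    2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100125;
    3. National Science Library, Chinese Academy of Sciences, Beijing 100190;
    4. Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190;
    5. Key Laboratory of Academic Journal New Publication and Knowledge Service of National Press and Publication Administration, Beijing 100190;
    6. School of Information Management, Wuhan University, Wuhan 430064;
    7. School of Information Management, Nanjing Agricultural University, Nanjing 210095;
    8. Commerce and Management College, Jiaxing Nanhu University, Jiaxing 314001;
    9. Data Intelligence and Cross Innovation Laboratory, Nanjing University, Nanjing 210024;
    10. School of Marxism, Nanjing University of Information Science and Technology, Nanjing 210044;
    11. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001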
  • Received: 2023-09-15 Online: 2023-10-05 Published: 2024-02-28
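  • Corresponding authors: *SUN Tan (1970-), male, PhD, research librarian (Grade 2), doctoral supervisor; research interests: description and organization of digital information. E-mail: suntan@caas.cn; ZHANG Zhixiong, PhD, professor-level senior engineer (senior Grade 2), doctoral supervisor, deputy director of the National Science Library, Chinese Academy of Sciences; research interests: semantic annotation, digital libraries, information monitoring, and scholarly communication. E-mail: zhangzhx@mail.las.ac.cn; ZHOU Lihong (1983-), professor, PhD, doctoral supervisor, vice dean of the School of Information Management, Wuhan University; research interests: data sharing and smart services. E-mail: l.zhou@whu.edu.cn; WANG Dongbo, professor, doctoral supervisor; research interests: digital humanities and intelligent information organization. E-mail: db.wang@njau.edu.cn; ZHANG Hai, associate professor; research interests: digital humanities and information behavior. E-mail: 1033462760@qq.com; LI Baiyang, PhD, assistant professor, research fellow; research interests: data intelligence, digital literacy, and library services. E-mail: libaiyang@nju.edu.cn; YONG Suhua (1977-), female, associate professor, School of Marxism, Nanjing University of Information Science and Technology, and research fellow at the NUIST base of the Jiangsu Research Center for Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era; research interests: history of science and technology and higher education research. E-mail: 2218530445@qq.com; ZUO Wangmeng (1977-), professor, doctoral supervisor, School of Computer Science and Technology, Harbin Institute of Technology; research interests: visual perception and cognition. E-mail: cswmzuo@gmail.com; YANG Guanglei, PhD, assistant professor, Machine Learning Research Center, School of Computer Science and Technology, Harbin Institute of Technology; research interests: domain adaptation, unsupervised learning, and pre-trained models. E-mail: yangguanglei@hit.edu.cn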
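  • Funding:
    National Social Science Fund of China project "Research on Cognitive Search Models Integrating Multiple Knowledge Organization Systems" (20BTQ014); National Key R&D Program of China project "Key Technologies and Software for Deep Mining and Intelligent Analysis of Scientific and Technological Literature Content" (2022YFF0711900); National Social Science Fund of China project "Research on Digital Scholarly Services of Chinese University Libraries in the Context of Open Data" (17CT0042); Major Project of the National Social Science Fund of China "Construction and Application of a Cross-Language Knowledge Base of Ancient Chinese Classics" (21&ZD331); General Project of Humanities and Social Sciences Research of the Ministry of Education "Evolution and Effectiveness of the Mainland's Preferential Policies toward Taiwan since the Reform and Opening-up" (22YJAGAT001); Jiangsu Social Science Fund project "Evolution of Jiangsu-Taiwan Relations since the Reform and Opening-up" (20LSB001); Philosophy and Social Science Research Project of Jiangsu Universities "An Analysis of the Implementation Performance of the Taiwan-Related '31 Measures'" (2018SJA0145).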

The Transformation and Observations of AI for Science (AI4S) Driven by Artificial Intelligence

SUN Tan1,2*, ZHANG Zhixiong3,4,5*, ZHOU Lihong6*, WANG Dongbo7*, ZHANG Hai8*, LI Baiyang9*, YONG Suhua10*, ZUO Wangmeng11*, YANG Guanglei11*   

  1. Chinese Academy of Agricultural Sciences, Beijing 100081;
    2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100125;
    3. National Science Library, Chinese Academy of Sciences, Beijing 100190;
    4. Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190;
    5. Key Laboratory of Academic Journal New Publication and Knowledge Service of National Press and Publication Administration, Beijing 100190;
    6. School of Information Management, Wuhan University, Wuhan 430064;
    7. School of Information Management, Nanjing Agricultural University, Nanjing 210095;
    8. Commerce and Management College, Jiaxing Nanhu University, Jiaxing 314001;
    9. Data Intelligence and Cross Innovation Laboratory, Nanjing University, Nanjing 210024;
    10. School of Marxism, Nanjing University of Information Science and Technology, Nanjing 210044;
    11. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
  • Received: 2023-09-15 Online: 2023-10-05 Published: 2024-02-28

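Abstract (Chinese version): "AI for Science" (AI4S) is an ongoing scientific and technological revolution: a new research paradigm that deeply integrates artificial intelligence (AI) with scientific research to help discover new knowledge and solve scientific problems. As research on AI4S advances, its development trajectory, opportunities and challenges, needs and tasks, and implementation paths merit further discussion. To this end, the journal invited seven experts to contribute to this written discussion.
1) Knowledge services supporting AI4S: AI4S places higher demands on current knowledge services, including multi-level knowledge discovery and acquisition, cross-disciplinary research and innovation, and user-centered participatory services, which shift knowledge service scenarios toward diversification, intelligence, specialization, and personalization. The role of knowledge services in the AI4S environment must therefore be redefined and their new tasks in comprehensively supporting scientific and technological (S&T) innovation clarified, by adopting a broad view of literature and balancing universal access with specialized depth, so as to support interdisciplinary innovation.
2) Building the knowledge base for AI4S: The essence of AI is the acquisition and use of knowledge, and S&T literature is the main carrier of human knowledge. Fully recognizing the research paradigm shift brought by AI, the National Science Library, Chinese Academy of Sciences has proposed building an S&T literature knowledge base for AI4S. It is actively mining the scientific knowledge and high-quality data contained in S&T literature and working to build domain-specific intelligent knowledge bases for AI4S, turning the "S&T literature repository" into an "S&T knowledge engine" that supports the intelligent services AI4S requires, such as evidence-based querying, situational awareness, inference and prediction, and generation of insights.
3) Scientific data driving AI4S: Effective aggregation of scientific data lays the data foundation for realizing the power of AI4S. It is a prerequisite for libraries to transform their roles and functions in the AI era and a necessary condition for transforming research services, deepening research support, and accelerating S&T innovation. Libraries still face many macro-level and meso-level challenges in aggregating scientific data effectively to support AI4S. These can be addressed by clarifying the role and function of libraries in scientific data management, fostering an environment for scientific data management, building a collaborative network for scientific data management, and improving scientific data management service capacity.
4) AI4S and intelligent language models for classical literature: AI4S technology can be used to analyze documents and texts, enabling faster and more comprehensive understanding of large volumes of historical documents and cultural materials. Intelligent language models for classical literature are an important breakthrough for AI in the study of ancient texts, bringing new opportunities and challenges. As multimodal and generative GPT models become widespread, such models in the AI4S context will place more emphasis on integrating diverse information, improving adaptability, strengthening knowledge representation, and serving a wider range of application scenarios.
5) Library digital scholarly services for AI4S: LLM-based AI4S aligns with the idea of using AIGC to drive smart library development, bringing both opportunities and challenges for library digital scholarly services. Two features shape the response: the fit between the platformization trend of AI4S and the middle-platform character of digital scholarly services, and the library community's long tradition of serving research. On this basis there are three paths for re-engineering digital scholarly service platforms: building an AI4S service platform independently, purchasing and using third-party AI4S platforms, and upgrading embedded knowledge services as components of scientific intelligence.
6) Historical evolution and logical structure of AI4S: AI4S is a change of scientific paradigm led by the full application of AI technology across disciplines. Its logical architecture comprises being driven by "data + models", building a knowledge ecosystem through machine conjecture, and extending application scenarios through algorithmic thinking. In the era of digital-intelligent civilization, for AI4S to drive scientific progress and social development, it is necessary to promote the value of technology for good, to select sound theoretical arguments and schemes for extending AI4S to the social sciences and humanities, and to improve the mechanisms through which human decision-making and machine intelligence are integrated and jointly built.
7) Development opportunities and outlook for AI4S: With the development of generative AI, pre-training algorithms and large pre-trained models have brought major opportunities for AI4S across disciplines and have shown great application potential and value in fields such as industrial inspection, robotics, and medicine. Several key factors also deserve attention, including the limits on the technical conditions for deploying large pre-trained models, the sustainability of data and computing resources, and the transparency, fairness, and accessibility of the technology.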

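Keywords (Chinese version): AI4S, intelligent knowledge service, scientific and technological literature, knowledge base, scientific data aggregation, library digital scholarly services, intelligent language model for classical literature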

Abstract: "AI for Science" (AI4S) is a new scientific research paradigm that deeply integrates AI technology with scientific research to promote the discovery of new knowledge and the solution of scientific problems. As the application of AI4S in the natural sciences, humanities, and social sciences advances, its development trajectory, opportunities and challenges, needs and tasks, and implementation paths deserve further discussion. To advance AI4S research, promote scientific and technological (S&T) innovation and progress, and help strengthen the discipline of information resources management, our journal invited seven experts to contribute to this written discussion on AI4S.
1) Supporting knowledge services for AI4S: In the current landscape of intelligent knowledge services, the requirements for supporting AI4S have increased, including the need for multi-level knowledge discovery and acquisition, cross-disciplinary research and innovation, and user-centered participatory services. In addition, knowledge service scenarios are moving towards diversification, complexity, depth, specialization, and personalization in ubiquitous knowledge discovery, generative content services, and multi-round interactive service exploration. In response, professional S&T information organizations need to reassess the role of knowledge services in the AI4S environment and their significance in comprehensively supporting the S&T innovation process. This involves establishing a broad view of literature, working more deeply with full-text knowledge elements, balancing universal access with specialized depth, autonomously developing core products, and deeply engaging with professional fields to support interdisciplinary innovation.
2) Building the knowledge base of AI4S: The essence of artificial intelligence (AI) lies in the acquisition and use of knowledge, and S&T literature is the primary carrier of human knowledge. Fully recognizing the paradigm shift in scientific research brought about by AI, the National Science Library, Chinese Academy of Sciences has proposed the concept of building an S&T literature knowledge base for AI4S. It is actively mining the scientific knowledge and high-quality data contained in the S&T literature and striving to build a domain-specific intelligent knowledge base for AI4S, transforming the "S&T literature database" into an "S&T knowledge engine" that supports the intelligent services required by AI4S, such as evidence-based querying, situational awareness, inference and prediction, and generation of insights.
3) Powering AI4S with scientific data: Effective aggregation of scientific data is the foundation for unleashing the capabilities of AI4S. It is essential for libraries to adapt their roles and functions in the AI era and is a crucial prerequisite for transforming scientific research services, deepening scientific research support, and accelerating S&T innovation. Currently, libraries face various macro-level and meso-level challenges in effectively aggregating valuable scientific data to support AI4S. To address these challenges, the following paths can be pursued: defining the roles and functions of libraries in scientific data management; fostering an environment conducive to scientific data management; establishing a collaborative network for scientific data management; and enhancing the service capacity of scientific data management.
4) AI4S and intelligent language modeling for classical literature: AI4S technology can be used to analyze documents and texts, enabling a faster and more comprehensive understanding of large volumes of historical documents and cultural materials. The development of intelligent language modeling for classical literature represents a significant breakthrough in the field of ancient literature research, bringing new opportunities and challenges. With the increasing popularity of multimodal and generative GPT models, intelligent language modeling of classical literature in the context of AI4S will focus on integrating diverse information, enhancing adaptability, improving knowledge representation, and addressing a wider range of application scenarios.
5) Library digital scholarly services for AI4S: LLM-based AI4S is consistent with the idea of using AIGC to drive the development of smart libraries, and presents both opportunities and challenges for digital scholarly services in libraries. Given the fit between the trend towards AI4S platformization and the "middle-platform" character of digital scholarly services, as well as the longstanding tradition of libraries in serving scholarly research, the reengineering path for the library's digital scholarly services platform includes three approaches: building an AI4S service platform independently, purchasing and using third-party AI4S platforms, and upgrading embedded knowledge services as a component of scientific intelligence. This approach addresses the dilemmas of financial resources, human resources, and cognitive and practical gaps, and emphasizes the importance of user needs in the AI4S environment. It also focuses on knowledge organization and service delivery to meet user needs in the AI4S landscape.
6) Historical evolution and logical structure of the scientific intelligence paradigm (AI4S): AI4S is a scientific paradigm change led by the full application of AI technology to various disciplines. Its logical structure includes a "data + model" driven approach, a knowledge ecosystem created by machine conjecture, and application scenarios expanded by algorithmic thinking. In the era of digital civilization, AI4S-driven scientific progress and social development must carry forward the value of technology for good, effectively select the theoretical arguments and proposals for extending AI4S to the social sciences and humanities, and improve the mechanisms for integrating human decision-making and machine intelligence.
7) Development opportunities and prospects of AI4S in the era of generative AI: With the advances in generative AI, pre-training algorithms and large-scale pre-trained models have provided significant opportunities for AI4S in various disciplines. These technologies have shown great potential and value for applications in fields such as industrial inspection, robotics, and medicine. Key factors also deserve attention, including the constraints on the technical conditions for implementing large pre-trained models, the sustainability of data and computing resources, and the transparency, fairness, and accessibility of the technology.

Key words: AI4S, intelligent knowledge service, scientific and technological literature, knowledge base, scientific data aggregation, library digital scholarly services, intelligent language model for classical literature

CLC number: TP18; G350; G250

Cite this article

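SUN Tan, ZHANG Zhixiong, ZHOU Lihong, WANG Dongbo, ZHANG Hai, LI Baiyang, YONG Suhua, ZUO Wangmeng, YANG Guanglei. The AI-Driven Fifth Research Paradigm (AI4S): Transformation and Observations[J]. Journal of Library and Information Science in Agriculture, 2023, 35(10): 4-33.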

SUN Tan, ZHANG Zhixiong, ZHOU Lihong, WANG Dongbo, ZHANG Hai, LI Baiyang, YONG Suhua, ZUO Wangmeng, YANG Guanglei. The Transformation and Observations of AI for Science (AI4S) Driven by Artificial Intelligence[J]. Journal of Library and Information Science in Agriculture, 2023, 35(10): 4-33.