Transformation and Observations of the Fifth Scientific Research Paradigm (AI4S) Driven by Artificial Intelligence

doi:10.13998/jcnki.issn1002-1248.23-0850

Abstract

"AI for Science" (AI4S) is a new scientific research paradigm that deeply integrates AI technology with scientific research to promote the discovery of new knowledge and the solution of scientific problems. As AI4S is applied more widely in the natural sciences and in the humanities and social sciences, its development trajectory, opportunities and challenges, needs and tasks, and paths to realization deserve further discussion. To advance AI4S research, promote scientific and technological (S&T) innovation and progress, and strengthen the discipline of information resources management, our journal invited seven experts to take part in this academic conversation on AI4S.

1) Supporting knowledge services for AI4S: In the current landscape of intelligent knowledge services, the requirements for supporting AI4S have increased, including the need for multi-level knowledge discovery and acquisition, cross-disciplinary research and innovation, and user-friendly participatory services. In addition, knowledge service scenarios are becoming more diverse, complex, deep, specialized, and personalized in ubiquitous knowledge discovery, generative content services, and the exploration of multi-round interactive services. In response, professional S&T information organizations need to reassess the role of knowledge services in the AI4S environment and their significance in comprehensively supporting the S&T innovation process. This involves establishing a broad literature perspective, deepening full-text knowledge elements, balancing universality with specialized depth, autonomously developing core products, and deeply engaging with professional fields to support interdisciplinary innovation.

2) Building the knowledge base of AI4S: The essence of artificial intelligence (AI) lies in the acquisition and use of knowledge, and S&T literature is the primary carrier of human knowledge. Fully recognizing the paradigm shift in scientific research brought about by AI, the Document Information Center of the Chinese Academy of Sciences has proposed the concept of building an S&T literature knowledge base for AI4S. It is actively mining the scientific knowledge and high-quality data contained in the S&T literature, striving to build a domain-specific intelligent knowledge base for AI4S, and transforming the "S&T literature database" into a "scientific knowledge engine" that supports the intelligent services AI4S requires, such as evidence querying, situational awareness, inference and prediction, and insight generation.

3) Powering AI4S with scientific data: Effective aggregation of scientific data is the foundation for unleashing the powerful capabilities of AI4S. It is essential for libraries adapting their roles and functions in the AI era, and it is a crucial prerequisite for catalyzing the transformation of scientific research services, deepening scientific research support, and accelerating S&T innovation. Currently, libraries face various macro-level and meso-level challenges in effectively aggregating valuable scientific data to support AI4S. To address these challenges, the following approaches can be pursued: defining the roles and functions of libraries in scientific data management; promoting a conducive environment for scientific data management; establishing a collaborative network for scientific data management; and enhancing the service capacity of scientific data management.

4) AI4S and intelligent language modeling for classical literature: AI4S technology can be used to analyze documents and texts, enabling a faster and more comprehensive understanding of vast numbers of historical documents and cultural materials. The development of intelligent language modeling for classical literature represents a significant breakthrough in the field of ancient literature research, bringing new opportunities and challenges. With the increasing popularity of multimodal and generative GPT models in the context of AI4S, the intelligent language modeling of classical literature will focus on integrating diverse information, enhancing adaptability, improving knowledge representation, and addressing a wider range of application scenarios.

5) Library digital scholarly services for AI4S: The concept of using LLM-based AI4S and AIGC to drive the development of smart libraries is consistent with the vision for digital scholarly services in libraries, and presents both opportunities and challenges. Given the trend towards AI4S platformization, the characteristics of "middle-end" digital scholarly services, and the longstanding tradition of libraries in serving scholarly research, the reengineering path for the library's digital scholarly services platform includes three approaches: building an AI4S service platform independently, purchasing and using third-party AI4S platforms, and promoting embedded knowledge services as a component of scientific intelligence. This approach addresses the dilemmas of limited financial and human resources and of cognitive and practical gaps, and it centers knowledge organization and service delivery on user needs in the AI4S environment.

6) Historical evolution and logical structure of the scientific intelligence paradigm (AI4S): AI4S is a change of scientific paradigm dominated by the full application of AI technology to various disciplines. Its logical structure comprises a "data + model" driving mechanism, a knowledge ecology created by machine conjecture, and application scenarios expanded by algorithmic thinking. In the era of digital civilization, AI4S-driven scientific progress and social development must uphold the value of science and technology for good, carefully select the theoretical arguments and proposals for extending AI4S to the social sciences and humanities, and improve the mechanisms for integrating human decision-making and machine intelligence.

7) Development opportunities and prospects of AI4S in the era of generative AI: With the advances in generative AI, pre-training algorithms and large-scale pre-trained models have provided significant opportunities for AI4S in various disciplinary domains. These technologies have shown immense potential and value for applications in diverse fields such as industrial inspection, robotics, and medicine. It is also crucial to attend to key factors such as the constraints of technical implementation conditions for large pre-trained models, the sustainability of data and computing resources, and the transparency, fairness, and accessibility of the technology.

Key words: AI4S, intelligent knowledge service, scientific and technological literature, knowledge base, scientific data aggregation, library digital scholarly services, intelligent language model for classical literature

CLC Number: TP18

SUN Tan, ZHANG Zhixiong, ZHOU Lihong, WANG Dongbo, ZHANG Hai, LI Baiyang, YONG Suhua, ZUO Wangmeng, YANG Guanglei. The Transformation and Observations of AI for Science (AI4S) Driven by Artificial Intelligence[J]. Journal of Library and Information Science in Agriculture, 2023, 35(10): 4-32.

