Journal of Library and Information Science in Agriculture

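Research on Building an AI+Expert-Driven Consumer-End Data System for Scientific and Technical Literature Information Resources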

Guanghui YE, Kai TU, Lina HU, Li HAN, Zhiming FENG

  1. School of Information Management, Central China Normal University, Wuhan 430079
  • Received: 2024-07-01 Published: 2024-11-26
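  • About the authors:

    Guanghui YE (1986- ), master's supervisor, associate professor, PhD; research interests: information fusion and information retrieval

    Kai TU (2002- ), master's student; research interests: scientometrics and user information behavior

    Lina HU (2002- ), master's student; research interest: data mining

    Li HAN (2001- ), master's student; research interest: data mining

    Zhiming FENG (2002- ), master's student; research interest: data mining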

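  • Funding:
    Humanities and Social Sciences Project of the Ministry of Education, "Research on a Coupled Model of Public Opinion Evolution Computation and Decision-Making for Emergencies under Synopticon Governance" (23YJC870011)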

Building Consumption Data Systems driven by AI Plus Expert for Scientific and Technical Literature Information Resources

Guanghui YE, Kai TU, Lina HU, Li HAN, Zhiming FENG   

  1. School of Information Management, Central China Normal University, Wuhan 430079
  • Received: 2024-07-01 Online: 2024-11-26

Abstract (translated from the Chinese):

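[Purpose/Significance] Constrained by the limitations of traditional literature classification systems, the high-value consumer-end annotation data generated by users cannot yet be integrated into scientific and technical literature services as a data element. As a result, these services can neither adapt to the open science era nor meet readers' varied knowledge needs. This study explores the potential of AI to provide a technical breakthrough and builds an AI+expert-driven consumer-end data system for scientific and technical literature information resources, with the aim of advancing the optimization of scientific and technical literature services. [Method/Process] The study first analyzes the value representation of building a consumer-end data system for scientific and technical literature information resources, then proposes principles for its construction, and next deconstructs and analyzes the risks of involving AI in that construction. Finally, according to the degree of AI involvement in data annotation, it designs three innovative models in which AI and experts collaborate with users to annotate scientific and technical literature information resources. [Results/Conclusions] All three models focus on guiding users to complete data annotation collaboratively. In the AI+expert-assisted data annotation model, AI serves as a tool that performs surface-level information processing according to rules set by experts and assists users in annotation. In the AI+expert collaborative data annotation model, AI reviews the pre-annotation tags for scientific and technical literature, users shift from generating tags themselves to judging and selecting AI-generated tags, and experts assist in reviewing the quality of the final tags. In the AI+expert-led data annotation model, users provide the annotation requirements, experts guide the process, and the AI4S platform completes the annotation automatically.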

Keywords: scientific and technical literature information resources, AI, system construction, data annotation, model design

Abstract:

[Purpose/Significance] Constrained by traditional literature classification systems, scientific and technical literature information resources suffer from inadequate disclosure and underutilization. At the same time, high-quality user-generated data cannot yet be integrated as data elements into scientific and technical literature services, which prevents these services from adapting to the open science context and meeting the diverse knowledge needs of readers. This study aims to harness the technological breakthrough potential of AI to build a consumer-end data system for scientific and technical literature information resources driven by AI and experts. Such a system is intended to address shortcomings of traditional services, such as the lack of supporting reading information and low interactivity between users, and thereby to promote the optimization of scientific and technical literature information resource services. [Method/Process] First, the study analyzes the four-dimensional value representation of consumer-end data systems for scientific and technical literature information resources: the intrinsic value, the tool value, the academic value, and the future value of annotation data. Then, following the processing flow of consumer-end data (the collection phase, the utilization phase, and the management phase), the paper proposes principles for the construction of consumer-end data systems. Furthermore, the paper deconstructs and analyzes the risks associated with the involvement of AI in the construction of consumer-end data systems, which fall into four types: machine algorithm risks, annotation content risks, annotation data risks, and application risks. Finally, based on the degree of AI involvement in data annotation work, the paper designs three innovative models in which AI and experts collaborate with users to annotate scientific and technical literature information resources: the AI plus expert-assisted data annotation model, the AI plus expert collaborative data annotation model, and the AI plus expert-led data annotation model. [Results/Conclusions] Under the AI plus expert-assisted data annotation model, AI acts as a tool that completes surface-level information processing based on rules set by experts and assists users in data annotation. In the AI plus expert collaborative data annotation model, AI completes the review of pre-annotation tags for scientific and technical literature information resources, users shift from generating tags themselves to evaluating and selecting AI-generated tags, and experts assist in the final review of tag quality. In the AI plus expert-led data annotation model, users provide data annotation requirements, experts guide the process, and data annotation is completed automatically by the AI4S platform.
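
The division of labor among the three annotation models can be illustrated with a minimal Python sketch. It only makes the roles concrete and does not reproduce the paper's implementation: the names `Annotation`, `assisted`, `collaborative`, `ai_led`, and `toy_ai_tagger` are hypothetical, and the "AI" here is a trivial keyword extractor standing in for a real model or an AI4S platform.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    """Hypothetical record of one annotated document."""
    doc_id: str
    tags: List[str] = field(default_factory=list)
    tagged_by: str = ""
    expert_reviewed: bool = False


def toy_ai_tagger(text: str) -> List[str]:
    """Stand-in for an AI tagger: unique lowercase words longer than 3 letters."""
    seen: List[str] = []
    for word in text.lower().split():
        word = word.strip(".,;:")
        if len(word) > 3 and word not in seen:
            seen.append(word)
    return seen


def assisted(doc_id: str, text: str,
             expert_rule: Callable[[str], str],
             user_tagger: Callable[[str], List[str]]) -> Annotation:
    # Assisted model: AI applies an expert-defined rule to the surface text,
    # and the user still writes the tags.
    cleaned = expert_rule(text)
    return Annotation(doc_id, user_tagger(cleaned), tagged_by="user")


def collaborative(doc_id: str, text: str,
                  ai_tagger: Callable[[str], List[str]],
                  user_selector: Callable[[List[str]], List[str]],
                  expert_reviewer: Callable[[List[str]], List[str]]) -> Annotation:
    # Collaborative model: AI proposes tags, the user selects among them,
    # and an expert reviews the final set.
    candidates = ai_tagger(text)
    chosen = user_selector(candidates)
    final = expert_reviewer(chosen)
    return Annotation(doc_id, final, tagged_by="user+ai", expert_reviewed=True)


def ai_led(doc_id: str, text: str, requirement: str,
           ai_tagger: Callable[[str], List[str]]) -> Annotation:
    # Led model: the user only states a requirement (here, a substring the
    # tags must contain, purely as an assumption); AI does the annotation.
    tags = [t for t in ai_tagger(text) if requirement in t]
    return Annotation(doc_id, tags, tagged_by="ai")
```

In this sketch the user's share of the work shrinks from writing tags (`assisted`), to choosing among proposed tags (`collaborative`), to stating a requirement (`ai_led`), which mirrors the increasing degree of AI involvement described in the abstract.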

Key words: scientific and technical literature information resources, AI, system construction, data annotation, pattern design

CLC number: G251

Cite this article

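Guanghui YE, Kai TU, Lina HU, Li HAN, Zhiming FENG. Research on Building an AI+Expert-Driven Consumer-End Data System for Scientific and Technical Literature Information Resources[J/OL]. Journal of Library and Information Science in Agriculture. https://doi.org/10.13998/j.cnki.issn1002-1248.24-0640.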

Guanghui YE, Kai TU, Lina HU, Li HAN, Zhiming FENG. Building Consumption Data Systems driven by AI Plus Expert for Scientific and Technical Literature Information Resources[J/OL]. Journal of Library and Information Science in Agriculture. https://doi.org/10.13998/j.cnki.issn1002-1248.24-0640.