面向AI4S的科技文献多模态知识抽取工具链研究

doi:10.13998/j.cnki.issn1002-1248.26-0178

农业图书情报学报

葛澜¹, 黄永文¹, 孔令博¹, 孙坦²,³, 赵瑞雪¹,⁴, 罗婷婷¹, 鲜国建¹,²

1. 中国农业科学院农业信息研究所，北京 100081
2. 农业农村部农业大数据重点实验室，北京 100081
3. 中国农业科学院，北京 100081
4. 国家新闻出版署农业融合出版知识挖掘与知识服务重点实验室，北京 100081

收稿日期:2026-04-03 出版日期:2026-06-25
通讯作者: 鲜国建 E-mail:xianguojian@caas.cn
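About the authors: GE Lan (葛澜), master's student; research interests: knowledge organization and knowledge services
HUANG Yongwen (黄永文), PhD, research fellow; research interests: knowledge organization and knowledge services
KONG Lingbo (孔令博), PhD candidate, associate research librarian; research interests: knowledge services and intelligence analysis
SUN Tan (孙坦), PhD, research librarian (Level 2); research interests: digital information description and organization
ZHAO Ruixue (赵瑞雪), PhD, research fellow; research interests: agricultural information management systems
LUO Tingting (罗婷婷), master's degree, associate research fellow; research interests: big data integration and governance
Funding:
National Social Science Fund of China, General Program, "Research on Semantic Organization and Association Discovery Services for Multimodal Scientific and Technological Resources" (22BTQ079); 2026 Science and Technology Innovation Program task "Innovative Leading Talents" of the Agricultural Information Institute, Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2026-AII)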

Multimodal Knowledge Extraction Toolchain for Scientific Literature towards AI4S

GE Lan¹, HUANG Yongwen¹, KONG Lingbo¹, SUN Tan²,³, ZHAO Ruixue¹,⁴, LUO Tingting¹, XIAN Guojian¹,²

1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081
2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081
3. Chinese Academy of Agricultural Sciences, Beijing 100081
4. Key Laboratory of Knowledge Mining and Knowledge Services in Agricultural Converging Publishing, National Press and Publication Administration, Beijing 100081

Received:2026-04-03 Online:2026-06-25
Contact: XIAN Guojian E-mail:xianguojian@caas.cn

摘要/Abstract

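Abstract (Chinese version):

[Purpose/Significance] AI-driven scientific discovery (AI4S) and large language models urgently need high-quality multimodal corpora, while traditional knowledge organization based on the external features of documents cannot support deep knowledge services. Multimodal, multi-granular knowledge extraction from the full text of scientific literature is therefore needed to mine structured knowledge units systematically. [Method/Process] The study surveys existing tools, builds a knowledge representation model for literature, and designs and implements a pipeline that covers document acquisition, structure parsing, multimodal content extraction, and storage. Three problems (insufficient accuracy of basic information, disordered structure of academic statements, and missing information in supporting materials) are each addressed with a targeted optimization that is integrated into the toolchain. [Results/Conclusions] Empirical work in the rice breeding domain and validation on the SciWatch platform show that the toolchain converts unstructured PDF documents into a structured, linked knowledge base. The optimized extraction strategies and models for basic information, academic statements, and supporting materials significantly improve the precision and recall of knowledge unit extraction and can support large-scale knowledge extraction from scientific literature. The results offer a scalable solution and practical reference for domain knowledge mining and discovery and for the evolution of large language model technology.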

关键词: 知识抽取, 多模态知识, AI4S, 科技文献, 知识单元, 大语言模型

Abstract:

[Purpose/Significance] Amid the deepening integration of the new round of technological revolution and industrial transformation, artificial intelligence-driven scientific discovery (AI4S) and large language models have created an urgent demand for high-quality multimodal corpora. Traditional coarse-grained knowledge organization at the document level is no longer sufficient for deep knowledge services. This study constructs a toolchain for extracting multimodal, multi-granular knowledge units from scientific and technological literature, enabling systematic mining of structured knowledge units from large document collections and improving the depth and efficiency of knowledge services.
[Method/Process] Mainstream knowledge extraction tools, both domestic and international, were systematically reviewed, then compared and screened in terms of technical principles, functional characteristics, strengths, limitations, and processing efficiency. An application requirement framework was built at four levels: research subject identification, context tracing, content analysis, and evidence localization. Taking rice breeding as the empirical scenario, a knowledge representation model for multimodal information was constructed following the physical organization of documents. The model divides a document into 22 knowledge units in four categories (basic information, main structure, supporting material system, and academic statements), with clear boundaries between units and rich associations among them. Combining the extraction requirements of each type of knowledge unit with the results of the tool survey, a pipeline-style extraction framework for multimodal, multi-granular knowledge units was designed. The framework covers document acquisition, physical structure analysis, logical structure reconstruction, multimodal content extraction, and knowledge unit fusion and storage, forming a cascaded toolchain from original PDF documents to semi-structured data and then to structured knowledge. To address three major problems (insufficient accuracy of basic information, disordered structure of academic statements, and missing information in supporting materials), three optimizations were integrated into the full toolchain: domain-adaptive retraining of GROBID, fused parsing of XML and Markdown, and hierarchical extraction instructions for the DeepSeek large language model.
[Results/Conclusions] Preliminary experiments show that the toolchain extracts multimodal, multi-granular data well. In the optimization experiments, the overall micro-average F1 score of the header model increased by nearly 3 percentage points, improving the model's balance and generalization across documents in diverse formats. The scattered distribution and weak structure of academic statement information were addressed, giving robust structured extraction of more than ten statement types such as acknowledgements, conflicts of interest, and data availability. Introducing the DeepSeek large language model enabled mining and association of figure and table titles, formal citation sentences, and related discussion sentences; the model achieved an F1 score above 0.99 for figure and table titles and above 0.93 for formal citation sentences. Verification on the SciWatch platform demonstrated the extraction and presentation of figures and tables, knowledge association, and contextual linking, supporting deep literature understanding and cross-validation. The toolchain automates the extraction and structured application of the knowledge units in scientific literature, covering preprocessing, multimodal information recognition, relation extraction, knowledge fusion, and storage. The results provide a scalable solution and practical reference for domain knowledge mining and the evolution of knowledge service technology.
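
The cascade described in the abstract (acquisition, physical structure analysis, logical structure reconstruction, multimodal content extraction, fusion and storage) can be pictured as a chain of stage functions. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Paper` class, every stage name, and every placeholder value are hypothetical and only illustrate how each stage enriches the same record.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Paper:
    """One document moving through the cascade (all field names are hypothetical)."""
    pdf_path: str
    semi_structured: dict = field(default_factory=dict)  # stands in for XML/Markdown output
    knowledge_units: dict = field(default_factory=dict)  # structured result
    trace: List[str] = field(default_factory=list)       # stages applied so far


Stage = Callable[[Paper], Paper]


def acquire(paper: Paper) -> Paper:
    # Placeholder: a real stage would download or locate the PDF.
    paper.trace.append("acquire")
    return paper


def parse_physical_structure(paper: Paper) -> Paper:
    # Placeholder: a real stage would run layout analysis on the PDF.
    paper.semi_structured["blocks"] = ["title block", "body block", "figure block"]
    paper.trace.append("physical_structure")
    return paper


def rebuild_logical_structure(paper: Paper) -> Paper:
    # Placeholder: group layout blocks into logical parts of the paper.
    blocks = paper.semi_structured["blocks"]
    paper.semi_structured["sections"] = {
        "header": blocks[:1], "body": blocks[1:2], "floats": blocks[2:],
    }
    paper.trace.append("logical_structure")
    return paper


def extract_multimodal_content(paper: Paper) -> Paper:
    # Placeholder: map logical parts onto knowledge-unit categories.
    sections = paper.semi_structured["sections"]
    paper.knowledge_units = {
        "basic_information": sections["header"],
        "main_structure": sections["body"],
        "supporting_materials": sections["floats"],
    }
    paper.trace.append("multimodal_extraction")
    return paper


def fuse_and_store(paper: Paper) -> Paper:
    # Placeholder: attach provenance; a real stage would write to a database.
    paper.knowledge_units["source"] = paper.pdf_path
    paper.trace.append("fuse_and_store")
    return paper


PIPELINE: List[Stage] = [
    acquire,
    parse_physical_structure,
    rebuild_logical_structure,
    extract_multimodal_content,
    fuse_and_store,
]


def run_pipeline(paper: Paper, stages: List[Stage] = PIPELINE) -> Paper:
    """Apply each stage in order; each stage receives the previous stage's output."""
    for stage in stages:
        paper = stage(paper)
    return paper
```

Keeping the stages as plain functions in a list makes it straightforward to swap one tool for another at a given step, which is the main appeal of a pipeline design.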

Key words: knowledge extraction, multimodal knowledge, AI4S, scientific literature, knowledge unit, large language model

CLC number: G254.9

葛澜, 黄永文, 孔令博, 孙坦, 赵瑞雪, 罗婷婷, 鲜国建. 面向AI4S的科技文献多模态知识抽取工具链研究[J/OL]. 农业图书情报学报. https://doi.org/10.13998/j.cnki.issn1002-1248.26-0178.

GE Lan, HUANG Yongwen, KONG Lingbo, SUN Tan, ZHAO Ruixue, LUO Tingting, XIAN Guojian. Multimodal Knowledge Extraction Toolchain for Scientific Literature towards AI4S[J/OL]. Journal of Library and Information Science in Agriculture. https://doi.org/10.13998/j.cnki.issn1002-1248.26-0178.

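Figures/Tables (17)

Table 1

Comparative analysis of open-source extraction tools

Tool	Technical basis	Strengths	Weaknesses	Processing efficiency
GROBID	CRF algorithm	Extracts, parses, and restructures PDF documents in batch into structured XML/TEI-encoded documents; recognizes 68 fine-grained labels covering publication metadata and all aspects of full-text structure; high accuracy and runtime efficiency	Poor extraction quality on Chinese-language documents; limited ability to handle complex multimodal PDF documents	About 2.5 PDF/s with concurrency, up to 36 PDF/s for metadata extraction; runs on CPU with <2 GB memory, no GPU required
MinerU	Layout analysis + OCR	Preserves the structure of the original document; handles complex multimodal PDF documents, including header and footer removal, retention of headings, paragraphs, and table structure, formula and table format conversion, and OCR; outputs Markdown files	Limited deep understanding of unstructured data; adaptability to specialized tasks and non-English settings needs improvement	About 32 s/page on CPU, >10 000 tokens/s with GPU acceleration; 4-core CPU + 8 GB memory recommended, GPU optional for acceleration
PaddleX	PaddlePaddle toolkit	Supports many task scenarios, including image classification, object detection, image segmentation, OCR, document-image layout analysis, and document-image information extraction; offers low-code development and a unified API that makes models easy to chain	Not specialized for literature knowledge extraction; limited ability on complex reasoning tasks	2~5 s per page on GPU, throughput 12~30 pages/min; requires a GPU (≥4 GB VRAM), CPU fallback is 5~10 times slower
RAGFlow	Deep document understanding + large language model	Extracts information from unstructured data in complex formats; supports unlimited-context scenarios; template-based text chunking keeps results controllable and explainable and lowers hallucination risk; compatible with heterogeneous data sources	Efficiency for literature knowledge extraction needs improvement; limited handling of multilingual documents	Minutes per document (including embedding and LLM inference); requires a high-performance GPU and ≥16 GB memory
DeepSeek	MoE + MLA architecture	Combined with retrieval-augmented generation (RAG), retrieves external knowledge bases in real time to improve the accuracy and domain specificity of generated content; performs especially well on multilingual documents and complex reasoning tasks	Efficiency for literature knowledge extraction needs improvement; limited ability to handle complex multimodal PDF documents	Seconds to about 10 s per request, longer for multi-turn interaction; requires a large-VRAM GPU or API calls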
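
GROBID, the first tool in the table above, emits TEI-encoded XML, so downstream steps typically start by reading fields out of that XML. The sketch below assumes a header shaped like GROBID's usual TEI output (title under `titleStmt`, authors under `analytic`); `SAMPLE_TEI` is an invented document and `parse_header` is a hypothetical helper, not part of GROBID.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample shaped like a TEI header; not real extraction output.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a" type="main">A sample title</title>
      </titleStmt>
      <sourceDesc>
        <biblStruct>
          <analytic>
            <author>
              <persName><forename type="first">Ada</forename><surname>Lovelace</surname></persName>
            </author>
            <author>
              <persName><forename type="first">Alan</forename><surname>Turing</surname></persName>
            </author>
          </analytic>
        </biblStruct>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def parse_header(tei_xml: str) -> dict:
    """Return the main title and author names found in a TEI header."""
    root = ET.fromstring(tei_xml)

    title_el = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    title = "".join(title_el.itertext()).strip() if title_el is not None else None

    authors = []
    for pers in root.findall(".//tei:analytic/tei:author/tei:persName", TEI_NS):
        # Keep forename and surname children in document order.
        parts = [
            (child.text or "").strip()
            for child in pers
            if child.tag.endswith("forename") or child.tag.endswith("surname")
        ]
        authors.append(" ".join(p for p in parts if p))

    return {"title": title, "authors": authors}
```

Using only the standard library keeps the example self-contained; a production parser would also need to handle affiliations, missing elements, and multiple title variants.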

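Knowledge type	Knowledge unit	Metric	Single-column	Double-column	Mixed-column
Basic information	Abstract	levenshtein_distance	244.50	109.70	104.40
Basic information	Abstract	normalized_similarity	0.92	0.89	0.90
Basic information	Title	levenshtein_distance	0.10	85.90	86.30
Basic information	Title	normalized_similarity	1.00	0.91	0.91
Basic information	Authors	Precision	0.97	0.82	0.98
Basic information	Authors	Recall	1.00	1.00	0.98
Basic information	Authors	F1	0.99	0.90	0.98
Basic information	Keywords	Precision	0.82	0.86	0.89
Basic information	Keywords	Recall	0.98	0.56	0.80
Basic information	Keywords	F1	0.89	0.68	0.85
Basic information	Dates	Precision	1.00	1.00	1.00
Basic information	Dates	Recall	0.72	0.84	0.84
Basic information	Dates	F1	0.84	0.91	0.92
Basic information	Author-affiliation links	Precision	1.00	1.00	0.84
Basic information	Author-affiliation links	Recall	0.94	0.81	0.74
Basic information	Author-affiliation links	F1	0.97	0.89	0.79
Basic information	E-mail	Precision	1.00	1.00	0.89
Basic information	E-mail	Recall	0.65	0.75	0.53
Basic information	E-mail	F1	0.79	0.86	0.67
Basic information	Corresponding-author status	Precision	1.00	1.00	0.86
Basic information	Corresponding-author status	Recall	0.15	0.33	0.60
Basic information	Corresponding-author status	F1	0.27	0.50	0.71
Basic information	ORCID	Precision	0.78	0.00	0.75
Basic information	ORCID	Recall	0.64	0.00	0.75
Basic information	ORCID	F1	0.70	0.00	0.75
Basic information	Affiliation (extracted and classified correctly)	Precision	0.90	0.86	0.88
Basic information	Affiliation (extracted and classified correctly)	Recall	0.88	0.86	0.87
Basic information	Affiliation (extracted and classified correctly)	F1	0.89	0.86	0.88
Basic information	Affiliation (extracted correctly)	Precision	0.99	0.95	0.96
Basic information	Affiliation (extracted correctly)	Recall	0.98	0.95	0.95
Basic information	Affiliation (extracted correctly)	F1	0.99	0.95	0.96
Main structure	Body text	levenshtein_distance	3 504.70	1 070.10	1 349.50
Main structure	Body text	normalized_similarity	0.90	0.96	0.96
Supporting material system	Figure extraction	Precision	1.00	1.00	1.00
Supporting material system	Figure extraction	Recall	0.97	1.00	0.96
Supporting material system	Figure extraction	F1	0.98	1.00	0.98
Supporting material system	Figure completeness	Precision	0.95	0.79	0.90
Supporting material system	Figure completeness	Recall	0.92	0.79	0.87
Supporting material system	Figure completeness	F1	0.94	0.79	0.88
Supporting material system	Figure title	Precision	0.86	0.96	0.93
Supporting material system	Figure title	Recall	0.70	0.96	0.82
Supporting material system	Figure title	F1	0.77	0.96	0.88
Supporting material system	Table image	Precision	1.00	1.00	0.86
Supporting material system	Table image	Recall	0.94	1.00	1.00
Supporting material system	Table image	F1	0.97	1.00	0.92
Supporting material system	Table title	Precision	0.82	1.00	1.00
Supporting material system	Table title	Recall	0.56	1.00	1.00
Supporting material system	Table title	F1	0.67	1.00	1.00
Supporting material system	Table footnote	Precision	0.67	0.95	0.93
Supporting material system	Table footnote	Recall	0.86	0.95	0.82
Supporting material system	Table footnote	F1	0.75	0.95	0.88
Supporting material system	Table HTML	Precision	0.94	1.00	0.86
Supporting material system	Table HTML	Recall	1.00	1.00	1.00
Supporting material system	Table HTML	F1	0.97	1.00	0.92
Supporting material system	Table HTML structure	Precision	0.85	0.97	0.81
Supporting material system	Table HTML structure	Recall	0.90	0.97	0.94
Supporting material system	Table HTML structure	F1	0.88	0.97	0.87
Supporting material system	Table HTML content	Precision	0.86	0.94	0.73
Supporting material system	Table HTML content	Recall	0.91	0.94	0.85
Supporting material system	Table HTML content	F1	0.89	0.94	0.78
Supporting material system	LaTeX formula	Precision	1.00	1.00	1.00
Supporting material system	LaTeX formula	Recall	1.00	1.00	1.00
Supporting material system	LaTeX formula	F1	1.00	1.00	1.00
Supporting material system	References	Precision	0.99	0.99	0.99
Supporting material system	References	Recall	0.97	0.99	0.98
Supporting material system	References	F1	0.98	0.99	0.98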
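
The text-valued units in the table above (abstract, title, body text) are scored with `levenshtein_distance` and `normalized_similarity`. The page does not define either metric, so the sketch below makes assumptions: the distance is the standard character-level edit distance, and the similarity is one minus the distance divided by the longer string's length. The authors may have used a different normalization.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn `a` into `b` (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / max(len(a), len(b)); 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein_distance(a, b) / longest
```

Under this definition the raw distance grows with text length while the similarity stays in [0, 1], which would be consistent with body text showing distances in the thousands alongside similarities of 0.90 or higher.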

Label	Accuracy (%)	Precision (%)	Recall (%)	F1 (%)
<abstract>	98.95	88.89	84.21	86.49
<address>	98.95	93.88	95.83	94.85
<affiliation>	98.74	93.75	93.75	93.75
<author>	99.79	100	96.55	98.25
<copyright>	96.86	61.54	76.19	68.09
<date>	99.58	87.5	87.5	87.5
<doctype>	99.16	100	66.67	80
<editor>	100	100	100	100
<email>	99.79	100	96.43	98.18
<keyword>	98.74	81.25	81.25	81.25
<pubnum>	100	100	100	100
<reference>	97.9	78.95	71.43	75
<submission>	99.37	93.75	88.24	90.91
<title>	98.53	85	80.95	82.93
<web>	100	100	100	100
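
The abstract reports a micro-average F1 for the header model, while the table above lists per-label scores. The sketch below shows how a micro-average differs from a macro-average when computed from per-label counts. The counts in `TOY_COUNTS` are invented for illustration and are not taken from the paper.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from raw counts; zero when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def micro_average(counts: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool the counts over all labels, then compute one P/R/F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return prf(tp, fp, fn)


def macro_f1(counts: Dict[str, Counts]) -> float:
    """Compute F1 per label, then take the unweighted mean."""
    scores = [prf(*c)[2] for c in counts.values()]
    return sum(scores) / len(scores)


# Invented counts: one frequent label and one rare label.
TOY_COUNTS: Dict[str, Counts] = {
    "<label_a>": (8, 2, 1),
    "<label_b>": (1, 0, 3),
}
```

Because the micro-average pools counts, frequent labels dominate it; the macro-average weights every label equally, so a rare, poorly recognized label pulls it down more.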

Statement module	XML precision	XML recall	XML F1	Markdown precision	Markdown recall	Markdown F1
Acknowledgments	1.000 0	1.000 0	1.000 0	1.000 0	1.000 0	1.000 0
Data Availability	1.000 0	1.000 0	1.000 0	1.000 0	0.928 6	0.963 0
Author Contributions	1.000 0	0.611 1	0.758 6	1.000 0	1.000 0	1.000 0
Funding	1.000 0	0.882 4	0.937 5	0.894 7	1.000 0	0.944 4
Conflict of Interest	1.000 0	0.950 0	0.974 4	1.000 0	0.950 0	0.974 4
Supplementary Material	1.000 0	0.307 7	0.470 6	1.000 0	1.000 0	1.000 0
Additional Information	0.000 0	0.000 0		1.000 0	1.000 0	1.000 0
Publisher's Note	0.000 0	0.000 0		1.000 0	1.000 0	1.000 0
Ethical Approval	1.000 0	0.666 7	0.800 0	1.000 0	1.000 0	1.000 0
Open Access	1.000 0	0.272 7	0.428 6	0.875 0	0.636 4	0.736 8
Statement	1.000 0	0.250 0	0.400 0	1.000 0	1.000 0	1.000 0
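
As a check on how the columns in the table above relate, the F1 values are consistent with the usual harmonic mean of precision and recall. The worked example below uses the XML-based row for Author Contributions; it assumes the standard definition, which the page itself does not state.

```latex
F_1 = \frac{2PR}{P + R},
\qquad
F_1^{\text{XML, Author Contributions}}
  = \frac{2 \times 1.0000 \times 0.6111}{1.0000 + 0.6111}
  = \frac{1.2222}{1.6111}
  \approx 0.7586
```

When both precision and recall are zero, the expression becomes 0/0 and is undefined, which would explain the empty XML F1 cells for Additional Information and Publisher's Note.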

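Configuration	Avg. precision	Avg. recall	Avg. F1
Without Markdown rigid extraction	0.978 5	0.858 9	0.870 8
Without Markdown flexible capture	1.000 0	0.679 7	0.755 2
Without XML heading channel	0.978 5	0.961 6	0.968 3
Without XML attribute channel	0.978 5	0.957 0	0.966 0
Full Markdown-XML framework	0.979 1	0.966 9	0.971 0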
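
The ablation table above names four components: rigid extraction and flexible capture on the Markdown side, and a heading channel and an attribute channel on the XML side. The page does not describe how they work, so the sketch below is one plausible reading rather than the authors' method. It assumes that "rigid" means matching a Markdown heading, "flexible" means matching an inline label such as `Funding:`, the heading channel reads a `<head>` element, and the attribute channel reads a `type` attribute on `<div>`. The keyword map, function names, and fusion order (XML first, Markdown fills gaps) are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical keyword stems; the paper covers more statement types.
KEYWORDS = {
    "Acknowledgments": ("acknowledg",),
    "Funding": ("funding",),
    "Data Availability": ("data availability",),
}


def classify(label: str):
    """Map a heading, inline label, or attribute value to a statement type."""
    low = label.lower()
    for name, stems in KEYWORDS.items():
        if any(stem in low for stem in stems):
            return name
    return None


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}div' down to 'div'."""
    return tag.rsplit("}", 1)[-1]


def from_xml(xml_text: str) -> dict:
    found = {}
    for div in ET.fromstring(xml_text).iter():
        if local(div.tag) != "div":
            continue
        head_text = next((c.text or "" for c in div if local(c.tag) == "head"), "")
        # Attribute channel first, then heading channel.
        name = classify(div.get("type", "")) or classify(head_text)
        body = " ".join("".join(p.itertext()).strip() for p in div if local(p.tag) == "p")
        if name and body:
            found.setdefault(name, body)
    return found


def from_markdown(md_text: str) -> dict:
    found = {}
    # Rigid extraction: a heading line followed by its first paragraph line.
    for m in re.finditer(r"^#+\s*(.+?)\s*\n+([^#\n].*)$", md_text, flags=re.M):
        name = classify(m.group(1))
        if name:
            found.setdefault(name, m.group(2).strip())
    # Flexible capture: an inline label such as "**Funding:** ..." at line start.
    for m in re.finditer(r"^\**([A-Za-z ]{3,40}?)\**\s*:\s*\**\s*(.+)$", md_text, flags=re.M):
        name = classify(m.group(1))
        if name:
            found.setdefault(name, m.group(2).strip())
    return found


def fuse(xml_text: str, md_text: str) -> dict:
    """Assumed fusion rule: keep XML results, let Markdown fill what XML missed."""
    merged = from_xml(xml_text)
    for name, body in from_markdown(md_text).items():
        merged.setdefault(name, body)
    return merged


SAMPLE_XML = """<back>
  <div type="acknowledgement"><p>We thank the farmers.</p></div>
  <div><head>Data Availability Statement</head><p>Data are available on request.</p></div>
</back>"""

SAMPLE_MD = "## Acknowledgments\n\nWe thank the farmers.\n\n**Funding:** Supported by grant X.\n"
```

In this toy input the XML side finds two statements and the Markdown side contributes the funding statement that has no XML counterpart, which mirrors the recall gain the full framework shows in the table.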

Extracted content	Precision	Recall	F1
Figure/table title	1.000 0	0.993 8	0.996 8
Formal citation sentence	0.956 0	0.909 3	0.932 0
