Journal of Library and Information Science in Agriculture

A Multi-Task Knowledge Extraction Method for Traditional Chinese Medicine Ancient Books Integrating Chain-of-Thought

AN Bo 1,2

  1. Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100081
    2. University of Chinese Academy of Social Sciences, Beijing 102488
  • Received: 2025-06-09  Online: 2025-10-09

Abstract:

[Purpose/Significance] Although traditional Chinese medicine (TCM) classics contain valuable knowledge, they remain difficult to process automatically because of their complex page layouts, the coexistence of traditional, simplified, and variant characters, alias-rich terminology, and strong cross-paragraph semantic dependencies. Existing pipelines often split optical character recognition (OCR), normalization, entity recognition, relation extraction, and entity alignment into separate stages, which leads to error propagation; moreover, many studies focus on modern clinical texts rather than historical sources. This paper addresses these gaps with an end-to-end pipeline that transforms ancient page images into a structured knowledge graph. The central contribution is CoTCMKE, a chain-of-thought (CoT)-guided, ontology-constrained joint model that performs named entity recognition (NER), relation extraction (RE), and entity alignment (EA) simultaneously. By making intermediate reasoning explicit and binding predictions to a TCM ontology, the framework improves batch digitization efficiency, extraction accuracy, and interpretability for digital humanities and library and information science (LIS) applications. [Method/Process] We built a unified pipeline with three steps. 1) Text recognition: a multimodal large language model (MLLM) recognizes text directly from complex pages with mixed vertical/horizontal layouts and performs context-aware traditional-to-simplified conversion. 2) Ontology construction: following the principles of semantic completeness, multimodal friendliness, evolvability, and interoperability, experts curate an ontology of core TCM concepts (e.g., diseases, symptoms, formulae, herbs) with aliases and constraints that guide decoding and ensure consistency. 3) Knowledge extraction: CoTCMKE integrates CoT reasoning with ontology constraints for multi-task extraction, covering entity localization and normalization, ontology-consistent relation generation, and cross-passage/cross-volume entity alignment. Constraint-aware decoding applies immediate checks and backtracks whenever a generated entity or relation violates ontology rules or alias mappings. For data, we used Shang Han Lun. Qwen2.5-VL-32B assists OCR, conversion, and initial auto-labeling; two TCM-trained annotators independently review and reconcile the results. The final datasets contain 2 340 NER items, 1 880 RE items, and 450 EA pairs, evaluated with 10-fold cross-validation. The MLLM was adapted via LoRA with early stopping. Comparisons cover traditional deep models, a unified information extraction framework, prompt-only inference, and a LoRA-SFT baseline. [Results/Conclusions] On Shang Han Lun, CoTCMKE outperformed LoRA-SFT by +3.1 F1 on NER, +1.6 on RE, and +1.3 on EA. In cross-book transfer to Jin Kui Yao Lue, the model maintained stable performance without retraining, indicating robustness and scalability. Ablation results showed that CoT reduced boundary and ambiguity errors, while ontology constraints curbed illegal triples and alias fragmentation; combining both yielded the best overall results. The analysis yielded three observations: 1) explicit medical relation templates act as semantic guardrails; 2) proactive alias consolidation before decoding reduces entity scattering and improves alignment; 3) explicit type-path guidance helps disambiguate fine-grained categories (e.g., pulse findings vs. general symptoms).
The framework supports the automatic construction of "formula-symptom-herb" triples, alias and variant normalization, and evidence-linked semantic search and navigation, which benefit LIS workflows, education, and research. Current limitations include the scope of the curated ontology and its focus on two classics. Future work will extend the approach to additional TCM classics and broader historical corpora, support continual incremental learning, and deliver knowledge services built on the constructed graphs.
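
To make the joint formulation concrete, the sketch below shows what a single CoTCMKE response covering all three tasks could look like. This is an illustrative schema only; the field names and example entities are assumptions, not taken from the paper.

# Hypothetical output schema for one CoTCMKE pass over a passage of
# Shang Han Lun. Field names and example values are illustrative.
example_output = {
    "reasoning": [                       # explicit chain-of-thought steps
        "Locate candidate mentions and normalize variant characters",
        "Check each candidate relation against the ontology templates",
        "Consolidate aliases before emitting alignment pairs",
    ],
    "entities": [                        # NER: localized and normalized
        {"mention": "桂枝湯", "type": "Formula", "canonical": "桂枝汤"},
        {"mention": "惡風", "type": "Symptom", "canonical": "恶风"},
    ],
    "relations": [                       # RE: ontology-consistent triples
        {"head": "桂枝汤", "relation": "treats", "tail": "恶风"},
    ],
    "alignments": [                      # EA: cross-passage/volume merging
        {"surface": "桂枝湯", "canonical": "桂枝汤", "scope": "cross-volume"},
    ],
}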

Key words: traditional Chinese medicine knowledge graph, multimodal large language model, chain-of-thought

CLC Number: T39

Fig.1

Overall framework diagram of the method

Fig.2

Prompt for text recognition of ancient Chinese medicine books

Fig.3

Prompt for traditional-simplified Chinese conversion
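
Figs. 2 and 3 show the recognition and conversion prompts. As a rough sketch, such prompts could be issued to a Qwen2.5-VL endpoint through an OpenAI-compatible API as below; the endpoint URL, model tag, and prompt wording are assumptions, not the paper's actual prompts.

# Sketch: send one page image plus an instruction to a Qwen2.5-VL
# endpoint served behind an OpenAI-compatible API (e.g., vLLM).
# Endpoint URL, model tag, and prompt text are illustrative assumptions.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("page_001.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="Qwen2.5-VL-32B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": ("Recognize the vertical/horizontal text on this page "
                      "in reading order, then convert traditional characters "
                      "to context-appropriate simplified Chinese.")},
        ],
    }],
)
print(resp.choices[0].message.content)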

Fig.4

Partial ontology structure of the knowledge graph for ancient Chinese medicine literature
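
Fig.4 sketches part of the ontology. The components the abstract relies on (concept types, alias tables, and legal relation templates) can be pictured as plain data structures like the following; every identifier here is an illustrative assumption.

# Illustrative encoding of the ontology pieces the abstract describes:
# entity types, alias mappings, and legal relation templates.
# All identifiers are assumptions for illustration, not the paper's schema.
ENTITY_TYPES = {"Disease", "Symptom", "Formula", "Herb", "PulseFinding"}

# Alias table: variant/traditional forms map to one canonical entity.
ALIASES = {
    "桂枝湯": "桂枝汤",   # traditional form -> simplified canonical form
}

# Relation templates acting as "semantic guardrails":
# only (head type, relation, tail type) combinations listed here are legal.
RELATION_TEMPLATES = {
    ("Formula", "treats", "Symptom"),
    ("Formula", "contains", "Herb"),
    ("Disease", "manifests_as", "Symptom"),
}

def is_legal(head_type: str, relation: str, tail_type: str) -> bool:
    """Check a candidate triple against the ontology templates."""
    return (head_type, relation, tail_type) in RELATION_TEMPLATES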

Fig.5

Instruction and context

Fig.6

Prompt for result return with ontology constraints

Fig.7

Pseudocode of CoTCMKE
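
Fig.7 gives the pseudocode of CoTCMKE. A minimal runnable rendering of its check-and-backtrack loop, as described in the abstract (validate each generated triple immediately, regenerate on violation), might look as follows; generate_candidate, the candidate format, and the retry budget are assumptions.

# Sketch of constraint-aware decoding with backtracking: one LLM step
# proposes a typed triple, it is checked against the ontology at once,
# and a violating candidate is rejected and regenerated.
def normalize(mention, aliases):
    """Map variant/traditional forms to one canonical entity name."""
    return aliases.get(mention, mention)

def decode_with_backtracking(passage, generate_candidate, is_legal,
                             aliases, max_steps=20, max_retries=3):
    """Collect ontology-consistent triples for one passage (sketch)."""
    accepted, rejected = [], []
    retries = 0
    for _ in range(max_steps):
        # Hypothetical LLM decoding step: returns
        # (head, head_type, relation, tail, tail_type), or None when done.
        cand = generate_candidate(passage, accepted, rejected)
        if cand is None:
            break
        head, htype, rel, tail, ttype = cand
        head, tail = normalize(head, aliases), normalize(tail, aliases)
        if is_legal(htype, rel, ttype):      # immediate ontology check
            accepted.append((head, rel, tail))
            retries = 0
        elif retries < max_retries:          # backtrack and regenerate
            rejected.append(cand)
            retries += 1
        else:
            break                            # retry budget exhausted
    return accepted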

Table 1

Hyperparameter settings for model fine-tuning

Hyperparameter Value
Learning rate 2×10⁻⁵
Batch size 14
LoRA rank 8
Optimizer AdamW
Max epochs 10
Early stopping patience 3
Max sequence length 512
Gradient clipping 1.0
LR scheduler Linear decay
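
The settings in Table 1 map directly onto a standard LoRA fine-tuning loop. A minimal sketch with Hugging Face transformers and peft is given below; only the numeric values come from Table 1, while the model tag, target modules, and dataset wiring are assumptions (loading the vision-language checkpoint would in practice use its model-specific class).

# LoRA fine-tuning sketch reproducing Table 1's settings. The model tag,
# target modules, and datasets (train_ds, dev_ds) are assumptions.
from transformers import (AutoModelForCausalLM, Trainer, TrainingArguments,
                          EarlyStoppingCallback)
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = get_peft_model(model, LoraConfig(
    r=8,                                  # Table 1: LoRA rank
    target_modules=["q_proj", "v_proj"],  # assumption, not from the paper
))

args = TrainingArguments(
    output_dir="cotcmke-lora",
    learning_rate=2e-5,                   # Table 1: learning rate
    per_device_train_batch_size=14,       # Table 1: batch size
    num_train_epochs=10,                  # Table 1: max epochs
    max_grad_norm=1.0,                    # Table 1: gradient clipping
    lr_scheduler_type="linear",           # Table 1: linear decay
    optim="adamw_torch",                  # Table 1: AdamW
    eval_strategy="epoch",                # evaluate each epoch
    save_strategy="epoch",
    load_best_model_at_end=True,          # required for early stopping
)

trainer = Trainer(
    model=model, args=args,
    train_dataset=train_ds, eval_dataset=dev_ds,   # assumed datasets
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # Table 1
)
trainer.train()
# Table 1's 512-token max sequence length is applied at tokenization time,
# e.g., tokenizer(..., max_length=512, truncation=True).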

Table 2

Comparison of named entity recognition tasks

Model Precision Recall F1
SikuBERT+BiLSTM+CRF 90.2 87.7 88.9
UIE 92.2 90.1 91.1
Qwen-VL(Prompt-Only) 85.7 80.5 83.1
Qwen-VL(LoRA-SFT) 93.1 91.0 92.0
CoTCMKE 95.2 94.9 95.1

Table 3

Comparison of relation extraction tasks

Model Precision Recall F1
SikuBERT+MLP 85.2 83.5 84.3
HGERE 88.7 86.9 87.8
HIORE 90.1 88.4 89.2
UIE 91.3 87.4 89.3
Qwen-VL(Prompt-Only) 86.2 83.1 84.6
Qwen-VL(LoRA-SFT) 92.1 91.9 92.0
CoTCMKE 94.6 93.2 93.6

Table 4

Comparison of entity alignment tasks

Model Precision Recall F1
Sentence-BERT 86.4 83.2 84.8
BERT-INT 89.3 87.5 88.4
PARIS 85.7 81.4 83.5
Qwen-VL(Prompt-Only) 91.8 89.5 90.6
Qwen-VL(LoRA-SFT) 94.1 93.2 93.6
CoTCMKE 95.0 94.8 94.9

Fig.8

Transfer learning test results on Synopsis of the Golden Chamber (Jin Kui Yao Lue)

Table 5

Ablation experiments on named entity recognition

Model Precision Recall F1
Qwen-VL+SFT 93.1 91.0 92.0
Qwen-VL+CoT (-Ontology) 94.3 93.3 93.8
CoTCMKE 95.2 94.9 95.1