Journal of Library and Information Science in Agriculture ›› 2023, Vol. 35 ›› Issue (8): 66-77. doi: 10.13998/j.cnki.issn1002-1248.23-0300

• Research Article •

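Representation Model of Agricultural Knowledge Graph Based on the HARP Framework

CHEN Caiming1, FENG Jianzhong1,*, BAI Linyan2,3, WANG Jian1, XIE Nengfu1, ZOU Jun1

  1. Agricultural Information Institute of CAAS, Beijing 100081;
  2. Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094;
  3. International Research Center of Big Data for Sustainable Development Goals, Beijing 100094
  • Received:2023-05-16 Online:2023-08-05 Published:2023-12-04
  • Corresponding author: *FENG Jianzhong (1971- ), male, PhD, research professor; research interests: information technology and digital agriculture. E-mail: fengjianzhong@caas.cn
  • About the authors: CHEN Caiming (1998- ), male, master's degree; research interests: agricultural knowledge graphs and their applications. BAI Linyan (1981- ), PhD; research interests: geographic remote sensing technology. WANG Jian (1976- ), PhD, associate research professor; research interests: theory and technology of specialized agricultural information search. XIE Nengfu (1975- ), PhD, research professor; research interests: blockchain applications in agriculture, large-scale agricultural knowledge processing, and agricultural intelligent computing. ZOU Jun (1997- ), male, master's degree; research interests: agricultural digital twins.
  • Funding:
    National Science and Technology Innovation 2030 Major Project on New Generation Artificial Intelligence, topic "Agricultural Intelligent Knowledge Service Platform" (2021ZD0113702-02); Xinjiang Production and Construction Corps Science and Technology Research Program (Key Fields) project "Research on Integrated Demonstration and Application Technology of 'Internet Plus' Smart Agriculture in Kunyu City" (2019AB002); Science and Technology Innovation Project of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-2023-AIl)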

Abstract: [Purpose/Significance] In the era of big data, the volume of data is growing at an exponential rate, and agriculture is one of the fields most affected. The use of agricultural knowledge graphs, a key infrastructure for managing agricultural knowledge, has expanded significantly. As the number of nodes and relationships in these graphs increases, so does their complexity, which poses new challenges for training and representing such large-scale knowledge graphs. It is therefore important, for both research and application, to investigate how to speed up the embedding of agricultural knowledge graphs while preserving their structure and reducing resource consumption. Unlike previous studies, this research concentrates on a hierarchical representation model for agricultural knowledge graphs. [Method/Process] To address this problem, we propose a hierarchical representation model for agricultural knowledge graphs based on the HARP framework. The model exploits the inherent hierarchical features of the agricultural knowledge graph and adopts an improved random walk strategy based on relational paths to semantically model relationship objects within the graph. This approach effectively retains the hierarchy and the asymmetric relationship structure of the nodes in the graph. The model is supported by both theoretical analysis and empirical evidence. [Results/Conclusions] The experiments yield three main findings. First, compared with the HARP framework, the hierarchical random walk with path (HRWP) model using the Leiden algorithm preserves the spatial structure more effectively and converges more quickly to the maximum modularity. Second, the training time of the fusion model employing HRWP is generally less than the sum of the training times of the two models, and HRWP has little effect on the time complexity of the original algorithm. Third, when traditional algorithms are combined with HRWP, the indicators improve by 2% on average, with a substantial improvement for non-neural network models. We therefore conclude that the proposed model can represent the agricultural knowledge graph accurately and effectively shorten training time. One remaining limitation is that the hierarchical nature of relationship objects needs more detailed discussion, which is left to future research. The findings offer agricultural knowledge management systems an effective approach to handling the growing complexity of knowledge graphs.

Key words: knowledge graph, random walk, representation learning, the hierarchical random walk with path (HRWP) framework
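
As a rough illustration of the random walk over relational paths mentioned in the abstract, the following Python sketch walks a tiny directed knowledge graph and records the relation taken at each step. The example triples, the function names `build_adjacency` and `relation_path_walk`, and the uniform choice among outgoing edges are all assumptions made for illustration; the abstract does not specify how HRWP samples paths, and the Leiden-based graph coarsening is not shown here.

```python
import random
from collections import defaultdict

def build_adjacency(triples):
    """Index (head, relation, tail) triples by head; edges stay directed."""
    adj = defaultdict(list)
    for head, relation, tail in triples:
        adj[head].append((relation, tail))
    return adj

def relation_path_walk(adj, start, length, rng):
    """Return [node, relation, node, ...] following at most `length` edges.

    Assumption: each outgoing edge is chosen uniformly at random.
    The walk stops early at a node with no outgoing edges.
    """
    walk = [start]
    node = start
    for _ in range(length):
        edges = adj.get(node)
        if not edges:
            break
        relation, node = rng.choice(edges)
        walk.extend([relation, node])
    return walk

# Hypothetical agricultural triples, invented for this example.
triples = [
    ("wheat", "subclass_of", "cereal"),
    ("cereal", "subclass_of", "crop"),
    ("wheat", "affected_by", "stripe_rust"),
    ("stripe_rust", "subclass_of", "fungal_disease"),
]

adj = build_adjacency(triples)
walk = relation_path_walk(adj, "wheat", 3, random.Random(0))
print(walk)
```

Because edges are followed only from head to tail, a walk started at `crop` cannot move back down to `cereal`. This is one simple way to keep the asymmetric, hierarchical direction of relations in the sampled sequences, which could then be fed to a skip-gram style embedding model.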

CLC Number: 

  • TP391.1

Cite this article

CHEN Caiming, FENG Jianzhong, BAI Linyan, WANG Jian, XIE Nengfu, ZOU Jun. Representation Model of Agricultural Knowledge Graph Based on the HARP Framework[J]. Journal of Library and Information Science in Agriculture, 2023, 35(8): 66-77.