Journal of Library and Information Science in Agriculture, 2020, Vol. 32, Issue (3): 29-36. doi: 10.13998/j.cnki.issn1002-1248.2019.12.12-1085

• Research Paper •

Discipline Development Trend Analysis Based on Text Semantic Understanding

YU Li1,2

  1. National Science Library, Chinese Academy of Sciences, Beijing 100190;
  2. State Key Laboratory of Resources and Environmental Information System, Beijing 100101
  • Received: 2019-12-12 Online: 2020-03-05 Published: 2020-03-23
  • About the author: YU Li (ORCID: 0000-0002-4374-8743) (born 1986), female, PhD, librarian; research interests: knowledge graphs and bibliometrics.
  • Funding:
    National Natural Science Foundation of China, Young Scientists Fund project "Semantic Relation Annotation and Evaluation of Geographic Entities in Chinese Web Text" (Grant No. 41801320); Open Fund of the State Key Laboratory of Resources and Environmental Information System

Abstract: [Purpose/Significance] Academic papers are an important strategic resource for scientific and technological innovation and a first-hand record of research trends in a discipline; they give later researchers a valuable methodological and innovative basis. At present, the knowledge organization of academic papers still lacks structured descriptions of fine-grained knowledge, which hinders the shift of scientific and technological information services toward computational and precise services. [Method/Process] First, this paper proposes a semantic analysis framework that works on the text content itself and semi-automatically identifies "research topics" and "key technologies" from paper abstracts. Second, it designs a phrase-level, multi-level clustering method: clustering in the horizontal direction merges synonymous phrases, and clustering in the vertical direction builds hierarchical relations. Finally, using abstracts from representative journals in geographic information science as experimental data, it applies bibliometric analysis to identify the hot research topics and key technologies of the field over the past 10 years and how they developed over time. [Results/Conclusions] The proposed method can provide algorithmic and data support for information analysis based on an understanding of text content.

Key words: artificial intelligence, semantic annotation, neural network, phrase clustering, bibliometric analysis
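
The pipeline outlined in the abstract (merge synonymous phrases, link narrower phrases to broader ones, then count topics per year) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not specify the clustering algorithm, so a hand-written synonym table stands in for horizontal clustering and a word-suffix heuristic stands in for vertical clustering. The names `SYNONYMS`, `canonical`, `parent_of`, and `top_n_by_year`, as well as the sample records, are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical synonym table: surface form -> canonical phrase.
# The paper merges synonyms by clustering; a lookup table stands in for that here.
SYNONYMS = {
    "cnn": "convolutional neural network",
    "convolutional neural networks": "convolutional neural network",
    "neural networks": "neural network",
}


def canonical(phrase):
    """Horizontal step: map a phrase to the representative of its synonym group."""
    key = " ".join(phrase.lower().split())  # lowercase and collapse whitespace
    return SYNONYMS.get(key, key)


def parent_of(phrase, vocabulary):
    """Vertical step (assumed heuristic, not from the paper): the longest
    proper word-level suffix of `phrase` found in `vocabulary` is treated
    as its broader parent. Returns None when no such suffix exists."""
    words = phrase.split()
    for start in range(1, len(words)):
        candidate = " ".join(words[start:])
        if candidate in vocabulary:
            return candidate
    return None


def top_n_by_year(records, n):
    """Count canonical phrases per year and keep the n most frequent per year."""
    counts = defaultdict(Counter)
    for year, phrase in records:
        counts[year][canonical(phrase)] += 1
    return {year: counter.most_common(n) for year, counter in sorted(counts.items())}


if __name__ == "__main__":
    # Made-up (year, extracted phrase) pairs, for illustration only.
    records = [
        (2018, "CNN"),
        (2018, "Convolutional neural networks"),
        (2018, "neural networks"),
        (2019, "cnn"),
        (2019, "CNN"),
        (2019, "spatial interpolation"),
    ]
    vocabulary = {canonical(phrase) for _, phrase in records}
    for phrase in sorted(vocabulary):
        print(phrase, "->", parent_of(phrase, vocabulary))
    print(top_n_by_year(records, 1))
```

Counting after canonicalization matters: without the horizontal merge, "CNN" and "convolutional neural networks" would be tallied as separate topics and the per-year ranking would understate both.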

CLC Number:

  • G251

Cite this article

YU Li. Discipline Development Trend Analysis based on Text Semantic Understanding[J]. Journal of Library and Information Science in Agriculture, 2020, 32(3): 29-36.