Journal of Library and Information Science in Agriculture ›› 2021, Vol. 33 ›› Issue (5): 14-27. doi: 10.13998/j.cnki.issn1002-1248.20-1061

• Invited Article •

Difference Analysis of Research Topics in a Specific Domain Based on Different Content Levels

ZHAO Lei (赵磊), ZHANG Chengzhi (章成志)*

  1. Department of Information Management, School of Economics & Management, Nanjing University of Science & Technology, Nanjing 210094
  • Received: 2020-11-26 Online: 2021-05-05 Published: 2021-06-03
  • Corresponding author: *ZHANG Chengzhi (ORCID: 0000-0001-8121-4796), male, PhD, professor and doctoral supervisor; research interests: information organization, information retrieval, data mining, and natural language processing. Email: zhangcz@njust.edu.cn
  • About the author: ZHAO Lei (ORCID: 0000-0001-9077-5720), male, master's student in information science; research interests: text mining and scientometrics
  • Funding:
    Key Project of the Jiangsu Provincial Social Science Fund, "Research on Intelligence-Driven Construction of Fine-Grained Scholar Profiles" (20TQA001)

Abstract: [Purpose/Significance] This paper examines whether research topics differ across three content levels of a paper: the title and abstract, the citation content, and the full text. It analyzes whether the topics in the title and abstract can reveal the research content of the full text, and what role citation content plays relative to the content of the citing paper, in order to provide theoretical support for analyzing a paper's full-text research content from its title and abstract. [Method/Process] An empirical study is conducted on Chinese journal papers in the COVID-19 domain. Feature words are extracted from the title and abstract, the citation content, and the full text of each paper and grouped with a clustering algorithm; research topics are then identified by manual interpretation and compared to analyze the topic differences among the three levels. [Results/Conclusions] The results show that research topics differ across the title and abstract, the citation content, and the full text. Compared with the title and abstract, the full text contains more topic content, but the difference between the two is small, so the topics in the title and abstract can be used to represent the research content of the full text. Citation content is topically related to the content of the citing paper, and the two can complement each other.

Key words: COVID-19, feature word extraction, word clustering, topic analysis, topic model
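
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract names neither the feature word extraction method nor the clustering algorithm, so everything below is an assumption made for illustration: feature words are ranked with a smoothed TF-IDF score, the clustering and manual interpretation steps are omitted, and the overlap between content levels is measured with the Jaccard coefficient. The function names (`top_feature_words`, `jaccard`) and the tiny pre-tokenized corpus are hypothetical and are not the authors' data or code.

```python
import math
from collections import Counter


def top_feature_words(docs, k):
    """Return the k words with the highest summed TF-IDF over tokenized docs."""
    n = len(docs)
    # Document frequency: number of documents in which each word occurs.
    df = Counter(w for doc in docs for w in set(doc))
    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for w, c in tf.items():
            # Smoothed IDF (an assumption) keeps ubiquitous words above zero.
            idf = math.log((1 + n) / (1 + df[w])) + 1
            scores[w] += (c / len(doc)) * idf
    # Sort by score descending, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return {w for w, _ in ranked[:k]}


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Invented toy corpus: one token list per paper for each content level.
levels = {
    "title_abstract": [["covid", "vaccine", "trial"],
                       ["covid", "transmission", "model"]],
    "citation": [["vaccine", "trial", "antibody"],
                 ["transmission", "model", "sars"]],
    "full_text": [["covid", "vaccine", "trial", "antibody", "dose"],
                  ["covid", "transmission", "model", "sars", "contact"]],
}

# Extract feature words per level, then compare every pair of levels.
features = {name: top_feature_words(docs, k=4) for name, docs in levels.items()}
names = list(features)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        overlap = jaccard(features[first], features[second])
        print(f"{first} vs {second}: {overlap:.2f}")
```

In a real study on Chinese text, a word segmentation step would be needed before this point, and the extracted feature words would go on to clustering and manual topic labeling as the abstract describes; a high overlap between the title-and-abstract set and the full-text set would be in line with the reported finding that the two levels differ only slightly.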

CLC number:

  • G237.5

Cite this article

ZHAO Lei, ZHANG Chengzhi. Difference Analysis of Research Topics in a Specific Domain Based on Different Content Levels[J]. Journal of Library and Information Science in Agriculture, 2021, 33(5): 14-27.