
Journal of Library and Information Science in Agriculture, 2021, Vol. 33, Issue (5): 14-27. doi: 10.13998/j.cnki.issn1002-1248.20-1061

• Special manuscript

Difference Analysis of Research Topics in a Specific Domain Based on Different Content Levels

ZHAO Lei, ZHANG Chengzhi*   

  1. Department of Information Management, School of Economics & Management, Nanjing University of Science & Technology, Nanjing 210094
  • Received: 2020-11-26 Online: 2021-05-05 Published: 2021-06-03

Abstract: [Purpose/Significance] This paper examines whether research topics differ across three content levels of a paper: the title and abstract, the citation content, and the full text. It analyzes whether the topics in the title and abstract reveal the research content of the full text, and what effect citation content has on the content of the citing literature, in order to provide theoretical support for analyzing full-text research content from titles and abstracts. [Method/Process] In an empirical study of Chinese journal papers on COVID-19, feature words are extracted from the titles and abstracts, the citation content, and the full texts. The feature words are grouped with a clustering algorithm, research topics are identified by manual interpretation of the clusters, and the topics of the three parts are then compared. [Results/Conclusions] The research topics differ across the title and abstract, the citation content, and the full text. The full text covers more topic content than the title and abstract, but the difference is small, so the topics in the title and abstract can be used to represent the research content of the full text. The citation content is related to the topics of the citing literature, and the two complement each other.

Key words: COVID-19, feature word extraction, word clustering, topic analysis, topic model
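
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the weighting scheme or the clustering algorithm, so everything below is an assumption made for illustration: feature words are ranked by a smoothed TF-IDF weight, the clustering and manual interpretation steps are left out, and the overlap between content levels is measured with the Jaccard index. The function names `top_feature_words` and `jaccard` and the toy token lists are hypothetical and are not data or code from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def top_feature_words(docs, k):
    """Return the k words with the highest summed TF-IDF weight.

    docs is a list of token lists (assumed already segmented and
    stop-word filtered). TF-IDF is an illustrative choice here.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = Counter()
    for doc in docs:
        for word, count in Counter(doc).items():
            # Smoothed IDF keeps a positive weight for ubiquitous words.
            idf = math.log((1 + n_docs) / (1 + df[word])) + 1.0
            scores[word] += (count / len(doc)) * idf
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def jaccard(a, b):
    """Jaccard index of two word collections (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy tokens standing in for the three content levels.
levels = {
    "title_abstract": [
        ["covid", "clinical", "features", "patients"],
        ["covid", "vaccine", "trial"],
    ],
    "citation": [
        ["coronavirus", "clinical", "features", "wuhan"],
        ["pneumonia", "patients", "epidemiology"],
    ],
    "full_text": [
        ["covid", "clinical", "features", "patients", "treatment", "symptoms"],
        ["covid", "vaccine", "trial", "antibody", "dose"],
    ],
}

# Extract feature words per level, then compare every pair of levels.
features = {name: top_feature_words(docs, k=4) for name, docs in levels.items()}
for left, right in combinations(features, 2):
    overlap = jaccard(features[left], features[right])
    print(f"{left} vs {right}: Jaccard = {overlap:.2f}")
```

A higher Jaccard value between two levels indicates more shared feature words. In a fuller pipeline, the extracted words would be clustered and the clusters interpreted as topics before the levels are compared.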

CLC Number: G237.5