Journal of Library and Information Science in Agriculture, 2023, Vol. 35, Issue (3): 52-70. doi: 10.13998/j.cnki.issn1002-1248.23-0214

Interdisciplinarity Measurement Method of Scientific Research Papers based on Adaptive Feature Selection

WANG Jinfei1, SUN Wei1,2,*, ZHANG Xuefu1,2, YANG Lu1   

  1. Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100081;
  2. Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081
  • Received: 2023-02-04 Online: 2023-03-05 Published: 2023-05-31

Abstract: [Purpose/Significance] Interdisciplinary research can solve complex problems in the natural environment and human society creatively by integrating knowledge across fields. As interdisciplinary research output grows, evaluating interdisciplinarity becomes increasingly necessary. Establishing an effective method that measures the interdisciplinarity of scientific research papers comprehensively is therefore an urgent problem.

[Method/Process] This study uses data from scientific research papers as its source, decomposes the interdisciplinarity of papers along multiple dimensions, constructs a feature set for it, and on that basis proposes a machine-learning-based adaptive method for measuring interdisciplinarity comprehensively. The study should help researchers understand the interdisciplinary papers in a field. The work proceeds as follows. First, the basic concepts of interdisciplinarity are reviewed, related concepts are distinguished, and the interdisciplinarity indicators of different dimensions are analyzed. Based on the connotation and characteristics of interdisciplinary research, feature indicators for the interdisciplinarity of research papers are extracted from three dimensions: subject attributes, knowledge network topology, and the text content of knowledge integration. Second, a machine-learning-based interdisciplinarity measurement method is constructed. By analyzing the information gain and feature similarity of the indicators in the feature set, an adaptive feature selection method is proposed, and a machine learning classifier is used to maximize classification accuracy. At the same time, the feature subset that best expresses interdisciplinarity is selected adaptively with the minimum number of features. The selected subset is then used to calculate the interdisciplinarity of each paper, and the results are analyzed together with those computed from the original feature set. Finally, an empirical study in the field of plant nanobiotechnology verifies the effectiveness of the indicator system and the adaptive feature selection, identifies and screens highly interdisciplinary papers in the field, measures the interdisciplinarity of papers, and identifies key influencing factors based on the subject attribute, knowledge network topology, and knowledge integration text content features.

[Results/Conclusions] The main empirical results show that, among the subject attributes, balance and disparity have the greater effect on the evaluation of interdisciplinarity. The knowledge network topology features perform well overall. Among the knowledge integration text content features, distribution breadth has the greater effect on the evaluation, and a fitness-weighted sum of the features further improves the results. The results demonstrate that the proposed adaptive feature selection can effectively screen the feature indicators related to interdisciplinarity, improve the reliability of the results, and achieve a comprehensive, in-depth measurement of the interdisciplinarity of scientific research papers. The method avoids the subjectivity that can affect qualitative evaluation and the problem that different indicators may produce contradictory results. It offers a new approach to measuring interdisciplinarity.
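
The abstract names information gain, feature similarity, and a fitness-weighted sum but does not give the algorithm itself. The following is only a minimal sketch of how those three ingredients could be combined, not the authors' method. It assumes that similarity is measured with the Pearson correlation, that numeric features are binned into two equal-width bins, and that each selected feature's normalized information gain serves as its "fitness" weight. The feature names, the `max_similarity` threshold, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, bins=2):
    """Information gain of a numeric feature after equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature: avoid dividing by zero
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(labels)
    conditional = 0.0
    for b in set(binned):
        subset = [y for x, y in zip(binned, labels) if x == b]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_features(features, labels, max_similarity=0.9):
    """Greedy selection: keep informative features, skip near-duplicates.

    max_similarity is an assumed threshold, not a value from the paper.
    """
    gains = {name: information_gain(col, labels) for name, col in features.items()}
    selected = []
    for name in sorted(gains, key=gains.get, reverse=True):
        if gains[name] <= 0:
            continue  # feature carries no information about the label
        if all(abs(pearson(features[name], features[s])) < max_similarity
               for s in selected):
            selected.append(name)
    return selected, gains


def weighted_score(features, selected, gains, i):
    """Gain-weighted sum of the selected feature values for paper i."""
    total = sum(gains[s] for s in selected)
    if total == 0:
        return 0.0
    return sum(gains[s] / total * features[s][i] for s in selected)


# Toy data: four papers, label 1 = judged highly interdisciplinary (hypothetical).
features = {
    "balance": [0.1, 0.2, 0.8, 0.9],
    "balance_copy": [0.1, 0.2, 0.8, 0.9],  # redundant duplicate
    "noise": [0.5, 0.1, 0.5, 0.1],         # uninformative
}
labels = [0, 0, 1, 1]
selected, gains = select_features(features, labels)
print(selected, [weighted_score(features, selected, gains, i) for i in range(4)])
```

On this toy input the redundant and uninformative columns are dropped, so the score depends only on the one feature that separates the two groups. A fuller version would wrap the selection in a loop that checks classifier accuracy for each candidate subset, as the abstract describes.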

Key words: interdisciplinarity, adaptive feature selection, paper measurement

CLC Number: TP391.1