
Journal of Library and Information Science in Agriculture, 2016, Vol. 28, Issue (11): 50-53. doi: 10.13998/j.cnki.issn1002-1248.2016.11.012

• Literature study

Research on Text Classification Model Based on Random Forests

LUO Xin   

  1. School of Business Administration, South China University of Technology, Guangzhou, Guangdong 510640, China
  • Received: 2016-05-17 Online: 2016-11-05 Published: 2016-11-08

Abstract: Text classification is a key technology for processing large amounts of text data, and it can alleviate the information explosion problem to a certain extent. The random forests algorithm proposed by Breiman generalizes well, is robust and insensitive to noise, and can handle continuous attributes, which makes it well suited to building text classification models. This paper constructs a text classification model based on the random forests algorithm and evaluates it on the Reuters-21578 text categorization collection to verify the model's validity and classification accuracy. The results show that the model can be applied well to text classification; that, compared with the CART, REPTree and J48 models, it achieved the best performance, with an F1-measure of 0.777; and that it is simple, intuitive and effective to operate and gives reliable results, offering a new approach for text classification research.
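The F1-measure reported above combines precision and recall into a single score. The abstract does not say how the score was averaged across categories, so the following is only a minimal sketch that assumes per-class F1 with a macro average. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} from parallel lists of true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # label p was predicted but is wrong
            fn[t] += 1  # true label t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (an assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels for illustration only.
y_true = ["earn", "earn", "acq", "acq", "grain"]
y_pred = ["earn", "acq", "acq", "acq", "grain"]
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

A micro-averaged or support-weighted F1 would generally give a different number on the same predictions, so a reported value such as 0.777 is only comparable across models when the averaging scheme is the same.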

Key words: Random forests

CLC Number: TP391
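[1] Liu Huailiang, Zhang Zhiguo, Ma Zhihui, Sun Lei. An empirical comparative study of Chinese text classification based on SVM and KNN[J]. Information Studies: Theory & Application, 2008, 31(6): 941-944.
[2] Du Xuan. Research on a naive Bayes text classification algorithm based on weighted complement sets[J]. Computer Applications and Software, 2014, 31(9): 253-255.
[3] Breiman L. Random Forests[J]. Machine Learning, 2001, 45(1): 5-32.
[4] Wu Xiaoyu, He Jinghan, Zhang Pei, Hu Jun. Short-term load forecasting for power systems based on a random forest algorithm improved by grey projection[J]. Automation of Electric Power Systems, 2015, 39(12): 50-55.
[5] Yang Fan, Lin Chen, Zhou Qifeng, Fu Changhong, Luo Linkai. A random-forest-based potential k-nearest-neighbor algorithm and its application to gene expression data classification[J]. Systems Engineering - Theory & Practice, 2012, 32(4): 815-825.
[6] Zhan Shu, Yao Yao, Gao He. Classification of brain magnetic resonance images based on random forests[J]. Journal of Electronic Measurement and Instrumentation, 2013, 27(11): 1067-1072.
[7] Lai Chengguang, Chen Xiaohong, Zhao Shiwei, Wang Zhaoli, Wu Xushu. A flood risk assessment model based on random forests and its application[J]. Journal of Hydraulic Engineering, 2015, 46(1): 58-66.
[8] Breiman L, Friedman J, Olshen R, et al. Classification and Regression Trees[M]. New York: Chapman & Hall, 1984.
[9] Breiman L. Bagging Predictors[J]. Machine Learning, 1996, 24(2): 123-140.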