
Journal of Library and Information Science in Agriculture, 2023, Vol. 35, Issue (12): 49-59. doi: 10.13998/j.cnki.issn1002-1248.23-0813


Online Social Spammer Detection Based on Deep Learning

ZHANG Jiyang1,2, ZHANG Peng1,*, GONG Siyu3, SONG Naipeng1   

    1. Research Center for Network Public Opinion Governance, China People's Police University, Langfang 065000;
    2. Ningbo Entry Exit Border Inspection Station, Ningbo 315040;
    3. Yiling Branch of Yichang Public Security Bureau, Yichang 443100
  • Received: 2023-12-27 Online: 2023-12-05 Published: 2024-04-07

Abstract: [Purpose/Significance] The growth of the Internet has driven the rapid development of social networks, which give users a convenient channel for publishing, spreading and receiving information. However, the low barrier to entry has also given rise to the "Internet water army", that is, online social spammers who are paid to post comments with specific content and to deliberately spread false information. They have become a major problem for the Internet ecosystem. Detecting the Internet water army, preventing its malicious attacks, and countering its negative effects on the security of online public opinion are therefore of great significance. [Method/Process] First, we analyzed the development and characteristics of online social spammers, summarized the algorithms and features used in previous studies, and identified three research perspectives: text features, interaction features and graph structure features. We then proposed an online social spammer detection method based on deep learning. Drawing on three aspects of a user (basic information, historical posts and interaction behavior), we extracted six types of features: basic information, recent posts, social intimacy, interaction behavior, number of microblogs and membership level. Through deep feature extraction followed by vector concatenation and fusion, user feature vectors of equal length were formed. Finally, a convolutional neural network was used as the classifier to build an automatic, high-precision and high-efficiency spammer detection model. Two Chinese online spammer datasets collected from the Sina Weibo platform were selected for the experiment. Their features were concatenated and aligned to form the Weibo Spammer 2023 dataset, which served as the training dataset; this prevented the features of any single dataset from being too scattered and reducing model generalization. To address overfitting during training, we added dropout layers to the model. [Results/Conclusions] The online spammer detection model constructed in this experiment shows significant improvement in metrics such as precision and accuracy. The ablation experiment also shows that each of the six extracted features has a positive effect on detection. Empirical analysis indicates that the model achieves high detection accuracy and efficiency, and it can provide technical support and theoretical guidance for online spammer identification. By using machine learning methods to actively identify online social spammer accounts, real-time monitoring and prevention of key spammer accounts can prevent malicious network events in a more timely and effective way and reduce the risk of illegal forces damaging the public opinion ecosystem.
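The pipeline summarized in the abstract (per-user features concatenated into a fixed-length vector, then a convolutional classifier regularized with dropout) can be illustrated with a minimal pure-Python sketch. This is not the authors' model: the vocabulary, the field names (`followers`, `following`, `post_count`, `member_level`, `recent_text`), the kernel and the weights are all hypothetical, and only four numeric fields plus a bag-of-words vector stand in for the six feature types.

```python
import math
import random
from collections import Counter

# Hypothetical vocabulary, chosen only for illustration.
VOCAB = ["free", "click", "follow", "win", "news"]


def bag_of_words(text, vocab=VOCAB):
    """Count how often each vocabulary word occurs in a whitespace-tokenized text."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def fuse_features(user, vocab=VOCAB):
    """Concatenate per-user features into one fixed-length vector."""
    numeric = [
        float(user["followers"]),
        float(user["following"]),
        float(user["post_count"]),
        float(user["member_level"]),
    ]
    # log1p compresses heavy-tailed counts so the fields are closer in scale.
    numeric = [math.log1p(x) for x in numeric]
    return numeric + bag_of_words(user["recent_text"], vocab)


def conv1d_relu(x, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(x[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(len(x) - k + 1)
    ]


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def predict(user, kernel, weight, bias, rng, training=False):
    """Return a score in (0, 1) from toy, untrained weights."""
    features = fuse_features(user)
    hidden = dropout(conv1d_relu(features, kernel), 0.5, rng, training)
    pooled = max(hidden)  # global max pooling over the feature map
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


# Hypothetical account record.
user = {
    "followers": 3,
    "following": 1200,
    "post_count": 5000,
    "member_level": 0,
    "recent_text": "Click to win free gifts click now",
}
vector = fuse_features(user)  # always 4 + len(VOCAB) entries
score = predict(user, [0.5, 0.5, 0.5], 1.0, -2.0, random.Random(0))
```

Because every account maps to a vector of the same length, one set of convolution weights can be applied to all users. Dropout is active only when `training=True`, so inference is deterministic. A real system would learn the kernel and weights from labeled data with a deep learning framework rather than fix them by hand.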

Key words: network spammer, deep learning, CNN, bag-of-words

CLC Number: G356.8