Spam Detection from Big Data based on Evolutionary Data Mining Systems

Ehsani Chimeh, H.; Karami, M.

doi:10.47176/TMI.2018.1

Document Type: Original Article

Authors

H. Ehsani Chimeh ¹

M. Karami ²

¹ Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran

² Department of Electrical Engineering, Shahid Beheshti University, Tehran, Iran

Abstract

Services that let users publish news and discuss events, personalities, and environments create opportunities for new types of spam and spammers. For example, popular topics that attract the most discussion can be exploited to generate traffic, visits, and income. When something happens, thousands of users write about it, and it quickly becomes a subject of discussion. These topics are targeted by spammers, whose posts contain the common words used in popular discussions. Spam often contains links that direct users to websites unrelated to the topic, and because these URLs are shortened, it is difficult for users to tell where they lead. This type of spam reduces the value and efficiency of real-time search services, whose users are directed to material unrelated to what they searched for, so a method for identifying spammers is needed. Existing methods for dealing with spammers fall into three categories: detection-based, prevention-based, and degradation-based approaches. This research takes the detection-based approach. It uses an intelligent method that first loads the large data set into the program and then performs feature extraction based on a genetic algorithm. In the next step, the data are classified to detect spam using a combination of a self-organizing map neural network and a probabilistic neural network, with a support vector machine that uses a radial basis function kernel.
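
The genetic-algorithm feature step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a bit-mask encoding of features, tournament selection, one-point crossover, and bit-flip mutation, and it scores each mask with a small probabilistic neural network (a Gaussian radial basis function response per class). The self-organizing map and support vector machine stages are left out for brevity. All function names, the toy data, and the parameter values (sigma, population size, mutation rate, per-feature penalty) are hypothetical.

```python
import math
import random

def pnn_predict(train_x, train_y, x, sigma=0.5):
    """Probabilistic neural network: pick the class whose training points
    give the largest mean Gaussian (radial basis function) response at x."""
    total, count = {}, {}
    for row, label in zip(train_x, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(row, x))
        total[label] = total.get(label, 0.0) + math.exp(-dist2 / (2 * sigma ** 2))
        count[label] = count.get(label, 0) + 1
    return max(total, key=lambda c: total[c] / count[c])

def select(rows, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]

def fitness(mask, train_x, train_y, val_x, val_y, penalty=0.01):
    """Validation accuracy of the PNN on the masked features, minus an
    assumed small penalty per selected feature. An empty mask scores 0."""
    if not any(mask):
        return 0.0
    tx, vx = select(train_x, mask), select(val_x, mask)
    hits = sum(pnn_predict(tx, train_y, x) == y for x, y in zip(vx, val_y))
    return hits / len(val_y) - penalty * sum(mask)

def ga_select(train_x, train_y, val_x, val_y,
              pop_size=20, generations=10, mutation=0.05, rng=None):
    """Genetic search over feature masks (needs at least two features)."""
    if rng is None:
        rng = random.Random(0)
    n = len(train_x[0])
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:  # each distinct mask is evaluated only once
            cache[key] = fitness(mask, train_x, train_y, val_x, val_y)
        return cache[key]

    # Start from the all-features mask plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        next_pop = sorted(pop, key=score, reverse=True)[:2]  # elitism
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randint(1, n - 1)              # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    return best, score(best)

def make_toy_data(n_rows, n_features, rng):
    """Hypothetical stand-in for extracted message features: column 0 tracks
    the spam label, the remaining columns are pure noise."""
    xs, ys = [], []
    for _ in range(n_rows):
        label = rng.randint(0, 1)
        noise = [rng.gauss(0, 1) for _ in range(n_features - 1)]
        xs.append([label + rng.gauss(0, 0.2)] + noise)
        ys.append(label)
    return xs, ys

rng = random.Random(42)
train_x, train_y = make_toy_data(40, 6, rng)
val_x, val_y = make_toy_data(20, 6, rng)
best_mask, best_score = ga_select(train_x, train_y, val_x, val_y, rng=rng)
print("selected feature mask:", best_mask, "fitness:", round(best_score, 3))
```

Because the all-features mask is in the initial population and the two best masks survive each generation, the returned mask never scores below the all-features baseline on the validation split. On real data the fitness would more plausibly come from cross-validation than from a single split.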

Keywords

Spam Detection

Big Data

Genetic Algorithm

Self-Organizing Neural Network (SOM)

Probabilistic Neural Network (PNN)


Volume 1, Issue 1
Winter 2018
Pages 1-9

Receive Date 29 January 2018
Revise Date 14 February 2018
Accept Date 04 March 2018

Transactions on Machine Intelligence
