Implementation of the Naive Bayes Algorithm in Spam Detection in SMS Messages

Authors

  • Ulfi Muzayyanah Fadil, Universitas Islam Negeri Sumatera Utara, Medan
  • Kalfida Eka Wati Siregar, Universitas Islam Negeri Sumatera Utara, Medan
  • Wily Supi Ramadani, Universitas Islam Negeri Sumatera Utara, Medan

DOI:

https://doi.org/10.64803/cessmuds.v1.26

Keywords:

Naive Bayes, spam detection, SMS, text classification, TF-IDF

Abstract

This study applies the Naive Bayes algorithm to detecting spam in Short Message Service (SMS) messages. The motivation is the growing volume of spam messages carrying advertisements, fraud attempts, and malicious content, which calls for an automated system that separates spam from non-spam (ham). The method comprises collecting labeled SMS data, preprocessing (text cleaning, tokenization, stopword removal, and stemming), and feature extraction with Term Frequency-Inverse Document Frequency (TF-IDF). A Naive Bayes model was trained on a dataset obtained from Kaggle and tested in Google Colab, with classification performance evaluated by accuracy, precision, and recall. The Multinomial Naive Bayes model achieved an accuracy of 96.86%, recognizing ham messages reliably and also performing well on spam messages. These findings indicate that Naive Bayes is an effective and efficient classifier for Indonesian-language text messages and a suitable basis for developing faster, more capable automatic SMS spam detection systems.
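
The pipeline described in the abstract (tokenization, TF-IDF weighting, Multinomial Naive Bayes, then accuracy, precision, and recall) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the toy messages, the function names, and the smoothing constant `alpha` are assumptions made for illustration, and stopword removal and stemming are left out to keep the example short.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and keep alphabetic tokens (stand-in for cleaning + tokenization)."""
    return re.findall(r"[a-z]+", text.lower())

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training set."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights for one tokenized message; unseen terms are dropped."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    totals = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            totals[y][t] += w
    model = {}
    for y in sorted(set(labels)):
        denom = sum(totals[y].values()) + alpha * len(vocab)
        log_prior = math.log(labels.count(y) / len(labels))
        log_like = {t: math.log((totals[y][t] + alpha) / denom) for t in vocab}
        model[y] = (log_prior, log_like)
    return model

def predict(model, vec):
    """Return the class with the highest log posterior score."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)

def evaluate(y_true, y_pred, positive="spam"):
    """Accuracy, plus precision and recall for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Invented toy messages, NOT taken from the study's Kaggle dataset.
train_set = [
    ("win a free prize now claim your cash", "spam"),
    ("free cash offer click the link now", "spam"),
    ("are we still meeting for lunch today", "ham"),
    ("please call me when you get home", "ham"),
    ("see you at the office tomorrow morning", "ham"),
]
test_set = [("claim your free prize", "spam"), ("call me after lunch", "ham")]

train_docs = [tokenize(text) for text, _ in train_set]
train_labels = [label for _, label in train_set]
idf = fit_idf(train_docs)
model = train_nb([tfidf(d, idf) for d in train_docs], train_labels, set(idf))

predictions = [predict(model, tfidf(tokenize(text), idf)) for text, _ in test_set]
accuracy, precision, recall = evaluate([y for _, y in test_set], predictions)
print(predictions, accuracy, precision, recall)
```

Scores are accumulated in log space so that products of many small probabilities do not underflow. On a real dataset the same steps would be run with a proper train/test split and with stopword removal and stemming suited to the language of the messages.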

Published

2025-10-27

How to Cite

Implementation of the Naive Bayes Algorithm in Spam Detection in SMS Messages. (2025). Proceedings of The International Conference on Computer Science, Engineering, Social Science, and Multi-Disciplinary Studies, 1, 165-170. https://doi.org/10.64803/cessmuds.v1.26