Implementation of the Naive Bayes Algorithm in Spam Detection in SMS Messages
DOI:
https://doi.org/10.64803/cessmuds.v1.26
Keywords:
Naive Bayes, spam detection, SMS, text classification, TF-IDF
Abstract
This study applies the Naive Bayes algorithm to the detection of spam in Short Message Service (SMS) messages. It is motivated by the growing volume of spam messages carrying advertisements, fraud attempts, and malicious content, which calls for an automated system that can separate spam from non-spam. The method consists of collecting labeled SMS data, preprocessing (text cleaning, tokenization, stopword removal, and stemming), and feature extraction using Term Frequency-Inverse Document Frequency (TF-IDF). The Naive Bayes model was trained on a Kaggle dataset and tested in Google Colab, and its classification performance was evaluated using accuracy, precision, and recall. The Multinomial Naive Bayes model achieved an accuracy of 96.86%, recognizing ham (non-spam) messages reliably and also performing well at detecting spam messages. These findings indicate that the Naive Bayes algorithm is effective and efficient at classifying Indonesian-language text messages, making it a suitable basis for developing faster and more capable automatic SMS spam detection systems.
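The pipeline described in the abstract (preprocessing, TF-IDF weighting, Multinomial Naive Bayes, then accuracy, precision, and recall) can be sketched in plain Python as follows. This is an illustrative sketch only, not the authors' code: the messages are invented English toy examples rather than the Kaggle data, the stopword list is hypothetical, stemming is omitted, and the smoothed IDF formula and Laplace smoothing value are assumptions, since the abstract does not specify them.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; the study's actual list is not given.
STOPWORDS = {"the", "a", "to", "you", "at", "is", "for", "me", "your"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords (stemming omitted)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def fit_idf(docs):
    """Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}

def tfidf(doc, idf):
    """Raw term count times IDF; terms unseen in training are ignored."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    class_counts = Counter(labels)
    totals = defaultdict(lambda: defaultdict(float))
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            totals[y][t] += w
    model = {}
    for y in class_counts:
        denom = sum(totals[y].values()) + alpha * len(vocab)
        log_like = {t: math.log((totals[y][t] + alpha) / denom) for t in vocab}
        model[y] = (math.log(class_counts[y] / len(labels)), log_like)
    return model

def predict(model, vec):
    """Return the class with the highest log prior plus weighted log-likelihood."""
    scores = {y: prior + sum(w * ll[t] for t, w in vec.items())
              for y, (prior, ll) in model.items()}
    return max(scores, key=scores.get)

def evaluate(y_true, y_pred, positive="spam"):
    """Accuracy, plus precision and recall for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Invented toy messages, for illustration only.
train = [
    ("win a free prize now", "spam"),
    ("free cash offer claim now", "spam"),
    ("claim your free voucher today", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("call me when you get home", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]
test = [("free prize claim now", "spam"), ("lunch meeting tomorrow", "ham")]

train_docs = [tokenize(text) for text, _ in train]
train_labels = [label for _, label in train]
idf = fit_idf(train_docs)
model = train_nb([tfidf(d, idf) for d in train_docs], train_labels, vocab=set(idf))

predictions = [predict(model, tfidf(tokenize(text), idf)) for text, _ in test]
metrics = evaluate([label for _, label in test], predictions)
print(predictions, metrics)
```

On this toy data both held-out messages are classified correctly, which only confirms that the sketch runs end to end; it says nothing about the 96.86% accuracy reported for the real dataset. In practice a library implementation such as scikit-learn's `TfidfVectorizer` and `MultinomialNB` would likely replace the hand-written functions.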
License
Copyright (c) 2025 Ulfi Muzayyanah Fadil, Kalfida Eka Wati Siregar, Wily Supi Ramadani (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.





