Comparative Analysis of the Performance of K-Nearest Neighbor (K-NN) and Naive Bayes Algorithms on User Satisfaction Levels of the Tokopedia Application

Authors

  • Muhammad Syahputra Novelan, Universitas Pembangunan Panca Budi
  • Muhammad Iqbal, Universitas Pembangunan Panca Budi

DOI:

https://doi.org/10.64803/cessmuds.v1.37
   

Keywords:

K-Nearest Neighbor, Naive Bayes, Tokopedia, Sentiment Analysis, Machine Learning

Abstract

The rapid growth of digital technology has shaped the development of e-commerce platforms in Indonesia, where Tokopedia is one of the most widely used online marketplaces. As user expectations rise, measuring user satisfaction has become essential for maintaining service quality and customer loyalty. This study compares the performance of two machine learning classification algorithms, K-Nearest Neighbor (K-NN) and Naive Bayes, in analyzing and predicting user satisfaction with the Tokopedia application. The dataset combines online reviews with structured survey responses from active Tokopedia users. The methodology comprises data collection, text preprocessing (tokenization, stop-word removal, and stemming), feature extraction using Term Frequency–Inverse Document Frequency (TF-IDF), and model implementation with the two algorithms. Both models were evaluated on accuracy, precision, recall, and F1-score. In the experiments, K-NN outperformed Naive Bayes, with higher accuracy and more consistent classification of user sentiment into “satisfied” and “dissatisfied” categories. K-NN handled the diverse, nonlinear patterns in user-generated reviews more effectively, whereas Naive Bayes, although computationally efficient, was limited in capturing dependencies between terms, which its conditional-independence assumption does not model. The findings underline the importance of choosing an appropriate machine learning algorithm for user satisfaction analysis. The study also contributes to the understanding of sentiment-based evaluation models for e-commerce platforms and offers Tokopedia and similar companies insights for improving customer experience and service strategies.
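
The pipeline summarized in the abstract (preprocessing, TF-IDF features, K-NN and Naive Bayes classification, and evaluation by accuracy, precision, recall, and F1-score) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation. The reviews, the stop-word list, the choice of k = 3, cosine similarity as the K-NN distance, and all function names are assumptions made for illustration. Stemming is omitted, and the Naive Bayes variant here is a multinomial model over raw term counts with Laplace smoothing rather than TF-IDF weights.

```python
import math
from collections import Counter

# Hypothetical toy reviews for illustration only; NOT the study's dataset.
TRAIN = [
    ("fast delivery and great service", "satisfied"),
    ("easy to use and great prices", "satisfied"),
    ("love the app fast checkout", "satisfied"),
    ("slow delivery and bad service", "dissatisfied"),
    ("app crashes and refund is slow", "dissatisfied"),
    ("bad experience payment failed", "dissatisfied"),
]
TEST = [
    ("great app and fast service", "satisfied"),
    ("payment failed and slow refund", "dissatisfied"),
]
STOP_WORDS = {"and", "the", "to", "is"}  # illustrative, not a real stop list


def tokenize(text):
    """Lowercase, split on whitespace, drop stop words (stemming omitted)."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a dict; terms unseen in training are ignored."""
    vec = {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(vec, train_vecs, train_labels, k=3):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(vec, pair[0]), reverse=True)
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def nb_fit(docs, labels):
    """Collect per-class term counts and class priors."""
    vocab = {t for d in docs for t in d}
    counts = {c: Counter() for c in sorted(set(labels))}
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    priors = {c: labels.count(c) / len(labels) for c in counts}
    return vocab, counts, priors


def nb_predict(doc, model):
    """Pick the class with the highest log-posterior (Laplace smoothing)."""
    vocab, counts, priors = model
    scores = {}
    for c in counts:
        total = sum(counts[c].values()) + len(vocab)
        scores[c] = math.log(priors[c]) + sum(
            math.log((counts[c][t] + 1) / total) for t in doc if t in vocab)
    return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive="satisfied"):
    """Accuracy, precision, recall and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def run_comparison(k=3):
    """Train both classifiers on TRAIN and score them on TEST."""
    train_docs = [tokenize(text) for text, _ in TRAIN]
    train_labels = [label for _, label in TRAIN]
    test_docs = [tokenize(text) for text, _ in TEST]
    y_true = [label for _, label in TEST]
    idf = fit_idf(train_docs)
    train_vecs = [tfidf(d, idf) for d in train_docs]
    knn_pred = [knn_predict(tfidf(d, idf), train_vecs, train_labels, k) for d in test_docs]
    model = nb_fit(train_docs, train_labels)
    nb_pred = [nb_predict(d, model) for d in test_docs]
    return {"knn": {"predictions": knn_pred, **evaluate(y_true, knn_pred)},
            "naive_bayes": {"predictions": nb_pred, **evaluate(y_true, nb_pred)}}


if __name__ == "__main__":
    for name, result in run_comparison().items():
        print(name, result)
```

On a toy set this small both classifiers behave the same, so the output says nothing about the study's reported ranking. The sketch only shows where each stage named in the abstract fits. A realistic replication would add an Indonesian stemmer and a full stop-word list, and would tune k on held-out data.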

Published

2025-11-02

Section

Articles

How to Cite

Comparative Analysis of the Performance of K-Nearest Neighbor (K-NN) and Naive Bayes Algorithms on User Satisfaction Levels of the Tokopedia Application. (2025). Proceedings of The International Conference on Computer Science, Engineering, Social Science, and Multi-Disciplinary Studies, 1, 218-224. https://doi.org/10.64803/cessmuds.v1.37