A Comparative Analysis Study: Evaluating the Performance of Classification Algorithms on Limited Datasets

Authors

  • Agus Wantoro Universitas Aisyah Pringsewu
  • Aviv Fitria Yuliana Universitas Aisyah Pringsewu
  • Dita Septasari Universitas Aisyah Pringsewu
  • Dwi Yana Ayu Andini Universitas Aisyah Pringsewu
  • Ikna Awaliyani Universitas Aisyah Pringsewu

Keywords:

classification, small data, multi algorithm, machine learning

Abstract

This research evaluates and compares the performance of several classification algorithms when the amount of data is limited. Six algorithms were tested, namely Naïve Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, AdaBoost, and k-NN, on small-scale public data from different domains. Experiments used 5-fold cross-validation, and evaluation was based on the confusion matrix, measuring accuracy, precision, recall, F1-score, AUC, and model computation time. In the comparison of accuracy, precision, and recall, Naïve Bayes consistently performed best, with a value of 0.896. In the timing tests, Naïve Bayes built its model fastest, while Random Forest was slowest. In the AUC tests, Naïve Bayes ranked first, followed by k-NN, while SVM ranked last. In the F1-score tests, Random Forest and Naïve Bayes performed best, while Decision Tree performed worst. The strong results for Naïve Bayes are attributed to its ease of implementation, its computational speed, and its ability to work well with large, medium, and limited data, as well as with many features. Users should choose the algorithm suited to the data at hand. In addition, cross-validation proved to give a more reliable performance estimate. These findings offer practical recommendations for researchers and practitioners selecting classification algorithms for small-scale datasets, and they highlight the importance of validation techniques and data processing for improving model generalization when data are limited.
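The evaluation procedure described in the abstract (5-fold cross-validation, with metrics derived from the confusion matrix) can be illustrated with a minimal sketch. This is not the authors' code. The function names `k_fold_indices` and `binary_metrics` are hypothetical, and the sketch assumes a binary task with the positive class labelled 1 and unshuffled, unstratified folds; the abstract does not state any of these details.

```python
from typing import Dict, List, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, test) pairs.

    Assumption: contiguous folds, no shuffling or stratification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from the binary confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a full experiment, each classifier would be trained on the `train` indices of every fold and scored on the matching `test` indices, with the per-fold metrics averaged. AUC and timing are left out here because AUC needs predicted scores rather than hard labels, and timing depends on the model implementation.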

Author Biographies

Aviv Fitria Yuliana, Universitas Aisyah Pringsewu

Faculty of Technology and Informatics

Dita Septasari, Universitas Aisyah Pringsewu

Information Technology Education

Dwi Yana Ayu Andini, Universitas Aisyah Pringsewu

Faculty of Technology and Informatics

Ikna Awaliyani, Universitas Aisyah Pringsewu

Information Technology Education

Published

2025-12-29

How to Cite

Wantoro, A., Yuliana, A. F., Septasari, D., Ayu Andini, D. Y., & Awaliyani, I. (2025). Studi Perbandingan Analisis: Evaluasi Kinerja Algoritma Klasifikasi pada Dataset Terbatas. SEMNASTIK - APTIKOM 2025, 1(1), 89–94. Retrieved from http://ojssemnastik2025.aptikomlampung.id/index.php/semnastik2025/article/view/12
