Abstract

Data mining is a way to extract new patterns from a large dataset and turn them into useful information. An algorithm often used in machine learning is C4.5. This algorithm is known to be a strong classifier, but it has several weaknesses, such as overlapping and overfitting of the data. To address these weaknesses, attribute selection is needed to identify the relevant attributes without reducing the accuracy of the algorithm itself. Particle Swarm Optimization (PSO) is an optimization algorithm that can be used for attribute selection. Compared with other data mining algorithms and optimization techniques, PSO is easy to use, efficient, and conceptually simple. In this study, the accuracy of C4.5 optimized with PSO is shown to be higher than that of the C4.5 algorithm alone: C4.5+PSO achieves an accuracy of 76.32%, while C4.5 alone achieves 66.80%.
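
As an illustration of how PSO can act as an attribute selector, the following is a minimal sketch of a binary PSO in Python. It is not the authors' implementation. Each particle is a 0/1 mask over the attributes, and a fitness function scores each mask. In a setup like the one described above, the fitness would presumably be the accuracy of a C4.5 tree trained on the selected attributes. Here a hypothetical stand-in, `toy_fitness`, is used so the example stays self-contained. The names `binary_pso`, `toy_fitness`, and `RELEVANT`, and the parameter values (`w`, `c1`, `c2`, swarm size, iteration count), are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical ground truth for the toy example: attributes 0, 2 and 5 matter.
RELEVANT = {0, 2, 5}

def toy_fitness(mask):
    """Stand-in for classifier accuracy (an assumption, not C4.5):
    reward selected relevant attributes, lightly penalise mask size."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    return hits - 0.1 * sum(mask)

def binary_pso(fitness, n_attrs, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: positions are 0/1 attribute masks, velocities are real
    numbers mapped to selection probabilities by a sigmoid."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_attrs)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_attrs)]
           for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                v = max(-6.0, min(6.0, v))
                vel[i][d] = v
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

best_mask, best_fit = binary_pso(toy_fitness, n_attrs=8)
print("selected attributes:", [i for i, bit in enumerate(best_mask) if bit])
print("fitness:", best_fit)
```

To approximate the comparison in the abstract, `toy_fitness` would be replaced by a function that trains a decision tree on the masked attributes and returns its cross-validated accuracy. That step depends on the dataset and the C4.5 implementation used, so it is left out here.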

Keywords

C4.5; K-Means; K-Medoid; Particle Swarm Optimization (PSO)

Article Details

How to Cite
Safira, D., & Mustakim. (2021). Perbandingan Algoritma C4.5 dengan C4.5+Particle Swarm Optimization untuk Klasifikasi Angkatan Kerja. Jurnal Komputer Terapan, 7(2), 272–279. https://doi.org/10.35143/jkt.v7i2.5143
