Comparison of the C4.5 Algorithm with C4.5+Particle Swarm Optimization for Labor Force Classification

Devy Safira; Mustakim

doi:10.35143/jkt.v7i2.5143

Authors

Devy Safira, Universitas Islam Negeri Sultan Syarif Kasim Riau
Mustakim, Universitas Islam Negeri Sultan Syarif Kasim Riau

Keywords:

C4.5, K-Means, K-Medoid, Particle Swarm Optimization (PSO)

Abstract

For large datasets, data mining offers a way to turn raw data into models that yield useful information. One algorithm often used in machine learning is C4.5. It is known to be a strong classifier, but it has several weaknesses, such as overlapping and overfitting of data. To address these, attribute selection is needed to identify the relevant attributes without reducing the accuracy of the algorithm itself. Particle Swarm Optimization (PSO) is an optimization algorithm that can be used for attribute selection. Its advantages are that it is easy to use, efficient, and conceptually simple compared with other data mining algorithms and optimization techniques. In this study, the precision of C4.5 optimized with PSO is shown to be higher than that of C4.5 alone: C4.5+PSO reaches a precision of 76.32%, while C4.5 reaches 66.80%.
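
To make the idea of PSO as an attribute selector concrete, the following is a minimal sketch of a binary PSO search over attribute masks. It is not the study's implementation. The function names (`binary_pso`, `toy_fitness`), the parameter values, and the set of "relevant" attributes are all assumptions made for illustration, and the toy fitness function only stands in for what would in practice be the score of a C4.5 tree trained on the selected attributes.

```python
import math
import random


def binary_pso(fitness, n_attributes, n_particles=12, n_iterations=40,
               inertia=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Search for a 0/1 attribute mask that maximizes `fitness`.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each particle is a bit mask: 1 keeps the attribute, 0 drops it.
    positions = [[rng.randint(0, 1) for _ in range(n_attributes)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_attributes for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]
    best_scores = [fitness(p) for p in positions]
    g = max(range(n_particles), key=best_scores.__getitem__)
    global_best, global_score = best_positions[g][:], best_scores[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(n_attributes):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                v = (inertia * velocities[i][d]
                     + c1 * r1 * (best_positions[i][d] - positions[i][d])
                     + c2 * r2 * (global_best[d] - positions[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                velocities[i][d] = v
                # The sigmoid turns the velocity into the probability of a 1 bit.
                prob = 1.0 / (1.0 + math.exp(-v))
                positions[i][d] = 1 if rng.random() < prob else 0
            score = fitness(positions[i])
            if score > best_scores[i]:
                best_scores[i], best_positions[i] = score, positions[i][:]
                if score > global_score:
                    global_score, global_best = score, positions[i][:]
    return global_best, global_score


# Hypothetical stand-in for "score of C4.5 on the selected attributes":
# attributes 0, 2 and 5 are assumed to be the informative ones.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in RELEVANT)
    return hits - 0.5 * extras  # reward relevant attributes, penalize the rest


best_mask, best_score = binary_pso(toy_fitness, n_attributes=8)
print(best_mask, best_score)
```

In a real experiment, `toy_fitness` would be replaced by a function that trains the classifier on the masked attributes and returns a validation score, which is far more expensive per evaluation and is the main reason the swarm size and iteration count are kept small.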
