Blood Donation Classification with Decision Tree Method using C4.5 Algorithm

Authors

  • Jefri Junifer Pangaribuan, Universitas Pelita Harapan, Indonesia (https://orcid.org/0000-0001-5508-0763)
  • Alexander Putra, Universitas Pelita Harapan, Indonesia

DOI:

https://doi.org/10.59653/ijmars.v2i03.961

Keywords:

Blood Donation, C4.5 Algorithm, Classification, Data Mining, Decision Tree

Abstract

Donating blood is an altruistic act driven by concern for others and a personal commitment to health. It is crucial for patients who need transfusions because of excessive bleeding. However, blood donations have declined globally. To address this, the medical community needs a method to predict whether a donor will donate again, so that proactive measures can be taken to ensure an adequate blood supply. This study uses the Blood Transfusion Service Center Data Set from the University of California, Irvine (UCI) Machine Learning Repository and applies the Decision Tree method with the C4.5 algorithm. C4.5, an improvement over Iterative Dichotomiser 3 (ID3), supports missing values, pruning, and continuous attributes. The aim of this study is to explore how the C4.5 algorithm in decision tree classification can predict whether an individual will donate blood again. The analysis uses five attributes (Recency, Frequency, Monetary, Time in months, and Decision) as determinants of repeat donation likelihood. Evaluated with a confusion matrix, the C4.5 model achieved an accuracy of 77.68% (an error rate of 22.32%), a sensitivity of 30.19%, and a specificity of 92.40%.
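The four reported figures all follow from the cells of a single binary confusion matrix. The sketch below shows how each one is derived. The abstract does not state the cell counts, so the values used here (16 true positives, 37 false negatives, 13 false positives, 158 true negatives) are assumptions: they were chosen only because they reproduce the reported percentages. The class name ConfusionMatrix and its method names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; 'positive' means the donor donates again."""
    tp: int  # predicted to donate again, and did
    fn: int  # predicted not to donate, but did
    fp: int  # predicted to donate again, but did not
    tn: int  # predicted not to donate, and did not

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn

    def accuracy(self) -> float:
        # Share of all cases that were classified correctly.
        return (self.tp + self.tn) / self.total

    def error_rate(self) -> float:
        # Complement of accuracy.
        return (self.fp + self.fn) / self.total

    def sensitivity(self) -> float:
        # True positive rate: repeat donors that the model found.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # True negative rate: non-returning donors that the model found.
        return self.tn / (self.tn + self.fp)


# ASSUMED counts, not taken from the paper: picked to be consistent
# with the percentages reported in the abstract.
cm = ConfusionMatrix(tp=16, fn=37, fp=13, tn=158)

for name, value in [
    ("accuracy", cm.accuracy()),
    ("error rate", cm.error_rate()),
    ("sensitivity", cm.sensitivity()),
    ("specificity", cm.specificity()),
]:
    print(f"{name}: {value * 100:.2f}%")
```

Under these assumed counts, the gap between sensitivity and specificity is easy to read: most non-returning donors are classified correctly, while only 16 of the 53 returning donors are, so overall accuracy is carried mainly by the majority class.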

References

Angraini, Y., Fauziah, S., & Putra, J. L. (2020). Analisis Kinerja Algoritma C4.5 dan Naive Bayes Dalam Memprediksi Keberhasilan Sekolah Menghadapi UN. Jurnal Ilmu Pengetahuan Dan Teknologi Komputer (JITK), 5(2), 285–290. https://doi.org/10.33480/jitk.v5i2.1233

Ayodonor - Palang Merah Indonesia. (2016). Palang Merah Indonesia. https://ayodonor.pmi.or.id/

Barus, O. P., Nathasya, C., & Pangaribuan, J. J. (2023). The Implementation of RFM Analysis to Customer Profiling Using K-Means Clustering. Mathematical Modelling of Engineering Problems, 10(1), 298–303. https://doi.org/10.18280/mmep.100135

Barus, O. P., Romindo, & Pangaribuan, J. J. (2023). Classification of Hearing Loss Degrees with Naive Bayes Algorithm. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 7(4), 751–757. https://doi.org/10.29207/resti.v7i4.4683

Charbuty, B., & Abdulazeez, A. (2021). Classification Based on Decision Tree Algorithm for Machine Learning. Journal of Applied Science and Technology Trends, 2(01), 20–28. https://doi.org/10.38094/jastt20165

Irawan, Y. (2021). Penerapan Algoritma Decision Tree C4.5 Untuk Memprediksi Kelayakan Calon Pendonor Melakukan Donor Darah Dengan Klasifikasi Data Mining. JTIM : Jurnal Teknologi Informasi Dan Multimedia, 2(4), 181–189. https://doi.org/10.35746/jtim.v2i4.75

Kaptoge, S., Di Angelantonio, E., Moore, C., Walker, M., Armitage, J., Ouwehand, W. H., Roberts, D. J., Danesh, J., Thompson, S. G., Donovan, J., … Roberts, D. J. (2019). Longer-term efficiency and safety of increasing the frequency of whole blood donation (INTERVAL): extension study of a randomised trial of 20 757 blood donors. The Lancet Haematology, 6(10), e510–e520. https://doi.org/10.1016/S2352-3026(19)30106-1

Liao, S. H., Widowati, R., & Puttong, P. (2022). Data Mining Analytics Investigate Facebook Live Stream Users’ Behaviors and Business Models: The Evidence from Thailand. Entertainment Computing, 41, 100478. https://doi.org/10.1016/j.entcom.2022.100478

Maimon, O., & Rokach, L. (2005). Introduction to Knowledge Discovery in Databases. In O. Maimon & L. Rokach (Eds.), Data Mining and Knowledge Discovery Handbook (pp. 1–17). Springer US. https://doi.org/10.1007/0-387-25465-X_1

Pangaribuan, J. J., & Suharjito. (2014). Diagnosis of Diabetes Mellitus Using Extreme Learning Machine. 2014 International Conference on Information Technology Systems and Innovation, ICITSI 2014 - Proceedings, 33–38. https://doi.org/10.1109/ICITSI.2014.7048234

Pangaribuan, J. J., Tanjaya, H., & Kenichi. (2017). Mendeteksi Penyakit Jantung Menggunakan Machine Learning dengan Algoritma Logistic Regression. Journal Information System Development, 6(2), 40–48.

Priyasadie, N., & Isa, S. M. (2021). Educational Data Mining in Predicting Student Final Grades on Standardized Indonesia Data Pokok Pendidikan Data Set. International Journal of Advanced Computer Science and Applications, 12(12), 212–216. https://doi.org/10.14569/IJACSA.2021.0121227

Ramadani, S., Hidayat, S., & Ramahdanty, R. (2023). Application of Data Mining on Inventory Grouping using Clustering Method. Jurnal Teknik Informatika C.I.T Medicom, 15(5), 228–239. https://doi.org/10.35335/cit.Vol15.2023.608.pp228-239

Resti, Y., Aryanto, R., Yahdin, S., & Kresnawati, E. S. (2023). Rain Event Prediction Performance Using Decision Tree Method. AIP Conference Proceedings, 2689(1), 120006. https://doi.org/10.1063/5.0117434

Rusyana, N. R., Renaldi, F., & Destiani, D. (2023). Prediction Analysis of Four Disease Risk Using Decision Tree C4.5. ICCoSITE 2023 - International Conference on Computer Science, Information Technology and Engineering: Digital Transformation Strategy in Facing the VUCA and TUNA Era, 90–94. https://doi.org/10.1109/ICCoSITE57641.2023.10127710

Sathiyanarayanan, P., Pavithra, S., Saranya, M. S., & Makeswari, M. (2019). Identification of Breast Cancer Using The Decision Tree Algorithm. 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), 1–6. https://doi.org/10.1109/ICSCAN.2019.8878757

Sumiati, S., Repi, V. V. R., Hendriyati, P., Anharudin, A., Yusta, A., & Triayudi, A. (2023). Classification of Cardiac Disorders Based on Electrocardiogram Data Using a Decision Tree Classification Approach with the C45 Algorithm. IAES International Journal of Artificial Intelligence (IJ-AI), 12(3), 1128–1138. https://doi.org/10.11591/ijai.v12.i3.pp1128-1138

Syukmana, F., Wahyudi, E., Gata, W., Wahono, H., Febianto, N. I., Kuntoro, A. Y., Nitra, R. O., Effendi, L., Saputra, D. D., & Sulaeman, O. R. (2020). Predicting Relegation Clubs in Italian Serie A with Method based C4.5 Decision Tree Algorithm. Journal of Physics: Conference Series, 1471(1), 012016. https://doi.org/10.1088/1742-6596/1471/1/012016

Visa, S., Ramsay, B., Ralescu, A., & Van Der Knaap, E. (2011). Confusion Matrix-based Feature Selection. Proceedings of The 22nd Midwest Artificial Intelligence and Cognitive Science Conference.

Why Blood Donation is Important – and Who Benefits. (2021, May 19). The American National Red Cross. https://www.redcrossblood.org/local-homepage/news/article/blood-donation-importance.html

Yang, A., Zhang, W., Wang, J., Yang, K., Han, Y., & Zhang, L. (2020). Review on the Application of Machine Learning Algorithms in the Sequence Data Mining of DNA. Frontiers in Bioengineering and Biotechnology, 8. https://doi.org/10.3389/fbioe.2020.01032

Yeh, I.-C. (2008). Blood Transfusion Service Center Data Set. In UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/Blood+Transfusion+Service+Center

Zulfikar, W. B., Gerhana, Y. A., & Rahmania, A. F. (2018). An Approach to Classify Eligibility Blood Donors Using Decision Tree and Naive Bayes Classifier. 6th International Conference on Cyber and IT Service Management, CITSM, 2018. https://doi.org/10.1109/CITSM.2018.8674353

Published

2024-06-23

How to Cite

Pangaribuan, J. J., & Putra, A. (2024). Blood Donation Classification with Decision Tree Method using C4.5 Algorithm. International Journal of Multidisciplinary Approach Research and Science, 2(03), 1248–1259. https://doi.org/10.59653/ijmars.v2i03.961