PREDICTION OF LIFE EXPECTANCY FOR ASIAN POPULATION USING MACHINE LEARNING ALGORITHMS
DOI:
https://doi.org/10.24191/mjoc.v7i2.18218
Keywords:
Life Expectancy, Data Classification, Data Mining, Asian Population
Abstract
Predicting life expectancy has become increasingly important as human life is affected by many factors, including social, economic, environmental, educational, lifestyle, and health conditions. Many studies on life expectancy have been carried out, but those focusing on the Asian population are limited. This study applies machine learning algorithms to predict life expectancy using an Asian population dataset. Three tree classifier models are compared, namely J48, Random Tree, and Random Forest, using 10-fold and 20-fold cross-validation. Random Forest with 10-fold cross-validation achieves the highest accuracy, at 84%. The study further identifies the most significant factors influencing life expectancy prediction, which include socioeconomic factors and educational status, health conditions, and infectious disease.
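The k-fold cross-validation protocol named in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the function names, the toy data, and the majority-class baseline are hypothetical placeholders standing in for the tree classifiers, and only the Python standard library is assumed.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit_predict, features, labels, k, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        predictions = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in held_out],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, held_out))
        scores.append(correct / len(held_out))
    return sum(scores) / k


def majority_class(train_x, train_y, test_x):
    """Placeholder baseline (assumption): predict the most common training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return [most_common] * len(test_x)


# Hypothetical toy data: 100 samples, 70 labelled "high" and 30 labelled "low".
toy_x = [[i] for i in range(100)]
toy_y = ["high"] * 70 + ["low"] * 30

acc_10 = cross_val_accuracy(majority_class, toy_x, toy_y, k=10)
acc_20 = cross_val_accuracy(majority_class, toy_x, toy_y, k=20)
print(f"10-fold: {acc_10:.2f}, 20-fold: {acc_20:.2f}")
```

Any classifier exposing the same `fit_predict(train_x, train_y, test_x)` shape could replace the baseline, which would allow the 10-fold and 20-fold settings to be compared on equal terms.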
License
Copyright (c) 2025 Nurul Shahira Pisal, Shuzlina Abdul-Rahman, Mastura Hanafiah, Saidatul Izyanie Kamarudin

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




