PREDICTION OF CUSTOMER CHURN FOR ABC MULTISTATE BANK USING MACHINE LEARNING ALGORITHMS

Authors

  • Hui Shan Hon, School of Management, Universiti Sains Malaysia, 11800 Minden, Pulau Pinang, Malaysia
  • Khai Wah Khaw, School of Management, Universiti Sains Malaysia, 11800 Minden, Pulau Pinang, Malaysia
  • XinYing Chew, School of Computer Sciences, Universiti Sains Malaysia, 11800 Minden, Pulau Pinang, Malaysia
  • Wai Peng Wong, School of Information Technology, Monash University, Malaysia Campus, Selangor, Malaysia

DOI:

https://doi.org/10.24191/mjoc.v8i2.21393

Keywords:

Bank Customer Churn, Machine Learning, Supervised Machine Learning

Abstract

Customer churn is the tendency of customers to cease doing business with a company within a given period. ABC Multistate Bank faces the challenge of retaining its clients. The purpose of this study is to apply machine learning algorithms to develop the most effective model for predicting bank customer churn. Six supervised machine learning methods, K-Nearest Neighbors, Support Vector Machine, Naïve Bayes, Decision Tree, Random Forest, and Extreme Gradient Boosting (XGBoost), are applied to churn prediction using the Bank Customer Data of ABC Multistate Bank obtained from Kaggle. The results showed that XGBoost outperformed the other five classifiers, with an accuracy of 84.76%, an F1 score of 56.95%, and an area under the ROC curve (AUC) of 71.64%. The bank may use the XGBoost model to identify customers who are at risk of leaving, concentrate its retention efforts on them, and potentially increase profit. Future research should explore additional machine learning approaches to determine the most accurate models for bank customer churn datasets.
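The abstract compares classifiers on accuracy, an F1 score, and a ROC-based figure. As a rough illustration of how such metrics are typically computed for a binary churn label, the following is a minimal sketch in plain Python. It assumes the ROC figure refers to the area under the ROC curve, and every name and number in it (`y_true`, `scores`, the 0.5 threshold) is hypothetical toy data, not the study's dataset, code, or results.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive (churn) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    churner is scored above a random non-churner (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = churned, 0 = stayed (imbalanced, as churn data often is).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# Hypothetical churn probabilities produced by some classifier.
scores = [0.1, 0.2, 0.3, 0.35, 0.6, 0.4, 0.8, 0.7, 0.45, 0.3]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
print("ROC AUC :", roc_auc(y_true, scores))
```

On imbalanced churn data, accuracy can remain high while the F1 score stays much lower, because the F1 score only rewards correctly identified churners. This is one common reason the three figures in a churn study can differ widely.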

Published

2023-10-10

How to Cite

Hon, H. S., Khaw, K. W., Chew, X., & Wong, W. P. (2023). PREDICTION OF CUSTOMER CHURN FOR ABC MULTISTATE BANK USING MACHINE LEARNING ALGORITHMS. Malaysian Journal of Computing, 8(2), 1602–1619. https://doi.org/10.24191/mjoc.v8i2.21393