PREDICTION OF MALAYSIAN WOMEN DIVORCE USING MACHINE LEARNING TECHNIQUES

Authors

  • Nazim Aimran, Center of Statistical and Decision Science Studies, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia
  • Adzhar Rambli, Center of Statistical and Decision Science Studies, Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia
  • Asyraf Afthanorhan, Faculty of Business and Management, Universiti Sultan Zainal Abidin, Kampung Gong Badak, Kuala Terengganu, Malaysia
  • Adzmel Mahmud, Faculty of Economics and Administration, University of Malaya, Malaysia
  • Azlin Sapri, Population and Family Research Division, National Population and Family Development Board, Malaysia
  • Airena Aireen, Population and Family Research Division, National Population and Family Development Board, Malaysia

DOI:

https://doi.org/10.24191/mjoc.v7i2.17077

Keywords:

Artificial Neural Network, Data Mining, Decision Tree, Divorce, Logistic Regression, Malaysian Women

Abstract

This paper compares the performance of three machine learning techniques, namely Decision Tree, Logistic Regression and Artificial Neural Network, in predicting divorce among Malaysian women. Secondary data were obtained from the Fifth Malaysia Population and Family Survey (MPFS-5) conducted by the National Population and Family Development Board (LPPKN). The dataset comprised 7,644 instances, each an ever-married Malaysian woman aged 15 to 59 years. Divorce is currently a serious problem in the Malaysian community, for various reasons. In 2019, the divorce rate in Malaysia rose by 12% from the previous year. During the first three months of the Movement Control Order (MCO), i.e. from March 18 to June 18, 2020, the Syariah Court of Malaysia recorded 6,569 divorce cases. Worse, a total of 90,766 divorce cases were recorded from January to October 2020. Six predictive models were compared: Decision Tree (C5.0 and CHAID), Logistic Regression (Forward Stepwise and Backward Stepwise), and Artificial Neural Network (Multi-Layer Perceptron and Radial Basis Function). Among the six predictive models, the Decision Tree model (C5.0) was found to be the best at classifying divorce among Malaysian women. The accuracy of the C5.0 model was 77.96%, followed by the Artificial Neural Network (Multi-Layer Perceptron) model at 74.68% and the Logistic Regression (Forward Stepwise) model at 67.89%. In descending order of importance, the predictors of divorce among Malaysian women are the wives’ employment status (0.1531), the husbands’ employment status (0.1396), type of marriage (0.1327), race/ethnicity (0.1327), distant relationship (0.1212), the wives’ qualification level (0.1115), age group (0.1053) and religion (0.0998).
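
The comparison criterion named in the abstract, classification accuracy, is the share of instances whose predicted class matches the true class. The following is a minimal Python sketch of that calculation and of ranking models by it. It is not the authors' workflow: the function names accuracy and rank_models and the toy labels are hypothetical, and only the three accuracy figures are taken from the abstract.

```python
# Accuracy figures as reported in the abstract (percent correctly classified).
reported_accuracy = {
    "Decision Tree (C5.0)": 77.96,
    "Artificial Neural Network (Multi-Layer Perceptron)": 74.68,
    "Logistic Regression (Forward Stepwise)": 67.89,
}


def accuracy(true_labels, predicted_labels):
    """Return the fraction of instances whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one instance is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def rank_models(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy labels (1 = divorced, 0 = not divorced); NOT MPFS-5 data.
toy_true = [1, 0, 0, 1, 0, 1, 0, 0]
toy_pred = [1, 0, 1, 1, 0, 0, 0, 0]
toy_accuracy = accuracy(toy_true, toy_pred)

ranking = rank_models(reported_accuracy)
best_model = ranking[0][0]

for name, score in ranking:
    print(f"{name}: {score:.2f}%")
```

Accuracy alone can be misleading when one class is much rarer than the other, so a fuller comparison would usually also report per-class measures such as sensitivity and specificity.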

Published

2022-10-01

How to Cite

Aimran, N., Rambli, A., Afthanorhan, A., Mahmud, A., Sapri, A., & Aireen, A. (2022). PREDICTION OF MALAYSIAN WOMEN DIVORCE USING MACHINE LEARNING TECHNIQUES. Malaysian Journal of Computing, 7(2), 1067–1081. https://doi.org/10.24191/mjoc.v7i2.17077