Advancing Corporate Credit Risk Assessment in Emerging Markets: A Comparative Analysis of Machine Learning Classifiers in South Africa
Abstract
This study examines the predictive power of machine learning techniques in corporate credit rating assessment using firm-level financial data from 208 companies across three key sectors in South Africa. By employing statistical models alongside advanced classifiers, including logistic regression, support vector machines, random forest, decision trees, k-nearest neighbors, and XGBoost, the analysis evaluates model performance using accuracy, sensitivity, specificity, precision, and the Matthews correlation coefficient. The empirical design incorporates financial ratios capturing liquidity, solvency, profitability, and efficiency, thereby aligning predictive analytics with established financial theory. Results demonstrate that while traditional models provide a baseline framework, ensemble and kernel-based methods deliver superior classification accuracy, particularly when sectoral heterogeneity is considered. These findings underscore the growing role of artificial intelligence in improving credit risk assessments, enhancing financial inclusion, and supporting regulatory oversight in emerging markets. The study offers theoretical contributions to credit risk modeling and provides policy recommendations for integrating explainable machine learning into financial supervision and lending practices.
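The five evaluation metrics named in the abstract can all be derived from a confusion matrix. The following is a minimal sketch, assuming a binary rating target (for example, investment grade versus non-investment grade); the study's actual label definition is not given here, and the function name and the example counts are hypothetical, chosen only for illustration.

```python
from math import sqrt


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute evaluation metrics from binary confusion-matrix counts.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # Share of all firms classified correctly.
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        # Share of actual positives that were detected (recall).
        "sensitivity": _safe_div(tp, tp + fn),
        # Share of actual negatives that were detected.
        "specificity": _safe_div(tn, tn + fp),
        # Share of predicted positives that were correct.
        "precision": _safe_div(tp, tp + fp),
        # Matthews correlation coefficient, in [-1, 1]; uses all four cells.
        "mcc": _safe_div(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient draws on all four cells of the confusion matrix, which makes it more informative when rating classes are imbalanced.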
This work is licensed under a Creative Commons Attribution 4.0 International License.