Relative performance of Neural Networks and Binary Logistic Regression in a Variable Selection framework
Castro Gbêmêmali Hounmenou, Emile Codjo Agbangba, Génevieve Amagbégnon, Reine Marie Ndéla Marone
DOI: http://dx.doi.org/10.15439/2024F8304
Citation: Position Papers of the 19th Conference on Computer Science and Intelligence Systems, M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 40, pages 25–32 (2024)
Abstract. This study evaluates how well Multilayer Perceptron neural networks and binary logistic regression predict a binary response variable from a set of descriptors in a variable selection context. The data concern the identification of prenatal factors linked to premature birth in women already in labor. Two selection methods were used to identify the variables most relevant for predicting the probability of premature birth: stepwise selection based on binary logistic regression, and the Olden method based on neural networks. Each selection method was then combined with both binary logistic regression and multilayer perceptron neural network models. The resulting procedures were compared on sensitivity, precision, correct classification rate, F-score, and Area Under the Curve (AUC) in order to choose the best model. The analysis indicates that the best procedure for selecting variables when predicting a binary variable is stepwise selection followed by a multilayer perceptron neural network.
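The performance criteria named in the abstract can be illustrated with a minimal sketch. The paper's own code and data are not reproduced here; the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions for illustration only. AUC is computed with the standard pairwise (Mann-Whitney) formulation, which may differ from the routine the authors used.

```python
def classification_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, precision, accuracy and F-score at a given threshold.

    y_true: list of 0/1 labels (1 = positive class, e.g. premature birth).
    scores: predicted probabilities of the positive class.
    threshold: hypothetical cut-off; 0.5 is an assumption, not the paper's choice.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(y_true)  # rate of correct classification
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
metrics = classification_metrics(toy_labels, toy_scores)
toy_auc = auc(toy_labels, toy_scores)
```

In a comparison like the one described, the same functions would be applied to the held-out predictions of each selection-plus-model combination, so that all candidates are scored on identical criteria.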