Machine learning for survival analysis: a comparative study on intensive care unit (ICU) patient data and simulations
Lukáš Boček, Lubomír Štěpánek
DOI: http://dx.doi.org/10.15439/2025F6352
Citation: Proceedings of the 20th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 43, pages 647–652 (2025)
Abstract. Survival analysis models the time until a specific event occurs, often in the presence of censored observations. While classical methods such as the Cox model are widely used, modern machine learning (ML) approaches can offer greater flexibility and predictive power. This paper compares classical and ML-based survival models on both real-world and simulated datasets. We demonstrate that techniques such as CoxBoost and penalized Cox regression outperform tree-based models such as Random Survival Forests in most settings. Explainable Artificial Intelligence (AI) tools are applied to improve the transparency and interpretability of model predictions.
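The abstract does not say which metric is used to compare the models. As an illustration only, the sketch below assumes a concordance index in the style of Harrell, a common way to score survival models when some observations are censored. The function name `concordance_index`, the toy data, and the two "models" are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell-style C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time per subject
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # subject actually experienced the event (was not censored).
        # Tied times are skipped here, which is a simplifying assumption.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event received the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five subjects, two of them censored.
times = [5, 8, 12, 3, 10]
events = [1, 0, 1, 1, 0]
model_a = [0.9, 0.4, 0.1, 0.95, 0.3]  # ranks risk in line with the event order
model_b = [0.2, 0.5, 0.9, 0.1, 0.6]   # ranks risk in the opposite order

c_a = concordance_index(times, events, model_a)
c_b = concordance_index(times, events, model_b)
print(c_a, c_b)
```

A value of 1 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0 means every comparable pair is ranked in reverse. Pairs in which the earlier subject is censored are excluded because their true ordering is unknown.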