Claim Frequency Estimation in Motor Third-Party Liability (MTPL): Classical Statistical Models versus Machine Learning Methods
Ondřej Vít, Lubomír Seif, Lubomír Štěpánek
DOI: http://dx.doi.org/10.15439/2025F5118
Citation: Communication Papers of the 20th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 45, pages 161–166 (2025)
Abstract. This paper compares classical statistical models and machine learning techniques for claim frequency estimation in compulsory motor third-party liability (MTPL) insurance. We evaluate Generalized Linear Models (GLMs), hurdle models, and feedforward neural networks on real-world insurance data. Emphasis is placed on the trade-off between interpretability and predictive power, especially in segments with scarce data. Our findings show that expert-driven data preparation enables GLMs to perform competitively with complex neural networks. Hurdle models further improve performance in zero-inflated settings. Neural networks offer improved predictive performance in some segments but struggle in underrepresented ones. The results highlight that careful preprocessing is as important as model complexity.
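To make the two classical model families named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It fits a Poisson GLM with a log link and a log-exposure offset by Newton's method on synthetic data, and it evaluates the probability mass function of a hurdle model with a zero-truncated Poisson count part. The data, the coefficient values, and the names `fit_poisson_glm` and `hurdle_poisson_pmf` are all assumptions made for illustration.

```python
import math
import numpy as np

# Synthetic portfolio (illustrative only, not the paper's data).
rng = np.random.default_rng(0)
n_policies = 20_000
x = rng.normal(size=n_policies)                    # one standardized rating factor
exposure = rng.uniform(0.2, 1.0, size=n_policies)  # policy-years in force
X = np.column_stack([np.ones(n_policies), x])      # intercept + rating factor
beta_true = np.array([-2.0, 0.5])                  # assumed "true" coefficients
claims = rng.poisson(exposure * np.exp(X @ beta_true))


def fit_poisson_glm(X, y, exposure, n_iter=25):
    """Poisson GLM with log link and log(exposure) offset, fitted by Newton's method."""
    offset = np.log(exposure)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta + offset)   # expected claim counts
        grad = X.T @ (y - mu)            # score vector
        hess = X.T @ (X * mu[:, None])   # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def hurdle_poisson_pmf(k, p_positive, lam):
    """P(N = k) under a hurdle model.

    p_positive is the probability of at least one claim (the "hurdle" part);
    positive counts follow a zero-truncated Poisson with parameter lam.
    """
    if k == 0:
        return 1.0 - p_positive
    log_poisson = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return p_positive * math.exp(log_poisson) / (1.0 - math.exp(-lam))


beta_hat = fit_poisson_glm(X, claims, exposure)
print("estimated coefficients:", beta_hat)
print("hurdle P(N=0), P(N=1):",
      hurdle_poisson_pmf(0, 0.1, 0.3), hurdle_poisson_pmf(1, 0.1, 0.3))
```

In the hurdle model the zero probability is set separately from the count distribution, which is why it can help when a portfolio has more zero-claim policies than a plain Poisson model implies. In practice both parts would typically be fitted with an established statistics library rather than a hand-written solver.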