Let's estimate all parameters as probabilities: Precise estimation using Chebyshev's inequality, Bernoulli distribution, and Monte Carlo simulations

Lubomír Štěpánek; Filip Habarta; Ivana Malá; Luboš Marek

DOI: http://dx.doi.org/10.15439/2023F1144

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 1223–1227 (2023)

Abstract. In parameter estimation, a simulation must be time-efficient and must also yield sufficiently precise estimates. Estimates are usually obtained by Monte Carlo simulation that relies on variability pre-estimated from a small sample. However, pre-estimated variability can be imprecise or, even worse, underestimated, which biases the estimation. In this work, we address this issue and suggest estimating all parameters as probabilities. A probability is bounded above by $1$, and the variance of the Bernoulli and binomial distributions is bounded above as well; combined with Chebyshev's inequality, this gives a theoretical upper bound on the estimator's variability within the Monte Carlo simulation and estimation process. The bound cannot be underestimated or estimated inaccurately, so the estimate's precision is ensured up to a given decimal digit with very high probability. If a known process expresses the parameter of interest as a probability, we can determine how many Monte Carlo iterations are needed to ensure the estimate reaches a given level of precision. We also analyze the asymptotic time complexity of the proposed estimation strategy and illustrate the approach with a short case study estimating the constant $\pi$.
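
The iteration count implied by this argument can be sketched in Python. The sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the standard Chebyshev bound for a sample proportion, $P(|\hat{p} - p| \ge \varepsilon) \le 1/(4 n \varepsilon^2)$, which follows from $\mathrm{Var}(\hat{p}) = p(1-p)/n \le 1/(4n)$. It also assumes the textbook quarter-circle construction, in which a uniform point in the unit square falls inside the unit circle with probability $\pi/4$. The function names `required_iterations` and `estimate_pi` and the tolerance values are hypothetical.

```python
import math
import random


def required_iterations(eps, delta):
    """Smallest n with 1 / (4 * n * eps**2) <= delta.

    eps   -- tolerated absolute error of the estimated probability
    delta -- tolerated probability of exceeding that error
    Uses the worst-case Bernoulli variance 1/4, so no pilot
    estimate of the variability is needed.
    """
    return math.ceil(1.0 / (4.0 * eps ** 2 * delta))


def estimate_pi(n, seed=0):
    """Estimate pi as 4 * (share of uniform points inside the unit circle)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n


# Hypothetical targets: error in pi below 0.05 with probability at least 0.99.
eps_pi = 0.05
delta = 0.01

# pi_hat = 4 * p_hat, so an error of eps_pi in pi means eps_pi / 4 in p.
n = required_iterations(eps_pi / 4.0, delta)
pi_hat = estimate_pi(n)
print(n, pi_hat)
```

Because Chebyshev's inequality holds for any distribution with finite variance, the resulting count is conservative: it is typically much larger than a normal approximation would suggest, but it does not depend on any pre-estimated variability.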
