Challenges in Causal Inference from Personal Monitoring Devices

Tomasz Wiktorski

DOI: http://dx.doi.org/10.15439/2018F378

Citation: Position Papers of the 2018 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 16, pages 99–102 (2018)

Abstract. Personal Monitoring Devices (PMDs) collect immense amounts of data about the health and wellness of hundreds of millions of people. One obstacle to the prevailing data analytics approaches to PMD data is the limited value of correlation-based conclusions in a health context. Causal inference seems a natural solution, but general causal inference methodologies are difficult to apply to PMD data due to the size and complexity of the observational data. Some methods, such as randomized trials, are largely infeasible in the PMD context due to the lack of control over the investigated population. In this paper, we give an overview of existing approaches to causal inference, including recent works that attempt to take advantage of time series data to automatically derive causality using extended difference-in-differences or Granger methods. We then outline challenges and opportunities for causal inference in the health context. Finally, we propose the following challenge: can we establish a new standard of evidence and a study design process that (1) allows for drawing causal conclusions from large observational datasets and (2) can suggest interventions to enforce causal links discovered in the data?
