An efficient approach towards the generation and analysis of interoperable clinical data in a knowledge graph
Jens Dörpinghaus, Sebastian Schaaf, Vera Weil, Tobias Hübenthal
DOI: http://dx.doi.org/10.15439/2021F29
Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 59–68 (2021)
Abstract. Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the life sciences and bioinformatics. Contextual information is widely used for NLP and knowledge discovery in the life sciences because it strongly influences both the exact meaning of natural language and the interpretation of data queries. The contributions of this paper are an efficient approach towards interoperable data, a runtime analysis of 14 real-world use cases represented by graph queries, and a unique view on clinical data and its application that combines methods from algorithmic optimisation, graph theory and data science.