Knowledge Extraction and Applications utilizing Context Data in Knowledge Graphs

Jens Dörpinghaus; Andreas Stefan

DOI: http://dx.doi.org/10.15439/2019F3

Citation: Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 18, pages 265–272 (2019)

Abstract. Context is widely considered in NLP and knowledge discovery because it strongly influences the exact meaning of natural language. The scientific challenge is not only to extract such context data, but also to store it for use in further NLP approaches. Here, we propose a multi-step, knowledge-graph-based approach that utilizes context data for NLP and for knowledge expression and extraction. We introduce the graph-theoretic foundation for a general context concept within semantic networks and present a proof of concept based on biomedical literature and text mining. We discuss the impact of this novel approach on text analysis, various forms of text recognition, and knowledge extraction and retrieval.
