Named Property Graphs
Dominik Tomaszuk, Łukasz Szeremeta
DOI: http://dx.doi.org/10.15439/2018F103
Citation: Proceedings of the 2018 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 15, pages 173–177 (2018)
Abstract. The amount of information stored and processed by computer systems is constantly increasing. The relational model is still popular, but despite its simplicity it has many disadvantages that increasingly exclude it from large-scale applications. The property graph model seems to be a good alternative for describing real-world data together with its relationships, and databases based on property graphs are therefore becoming more and more popular. In this paper we introduce the Named Property Graph model, which allows graphs to be grouped into separate units and information about them to be described. We also present the Cypher_n query language, which supports our proposal, together with mapping algorithms, use cases with chemical data, and SDFEater, our tool for processing data. The presented solutions are fully backward compatible with existing databases.