Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 21

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems

How well a multi-model database performs against its single-model variants: Benchmarking OrientDB with Neo4j and MongoDB

DOI: http://dx.doi.org/10.15439/2020F76

Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 463–470

Abstract. Digitalization is currently a key factor for progress, with a rising need for collecting, storing, and processing large amounts of data. In this context, NoSQL databases have become a popular storage solution, each specialized for a specific type of data. In addition, the multi-model approach is designed to combine the benefits of different types of databases by supporting several data models. Despite its versatility, a multi-model database might not always be the best option, because it risks performing worse than its single-model variants. It is hence crucial for software engineers to have access to benchmarks comparing the performance of multi-model and single-model variants. Moreover, in the current Big Data era, it is important that such benchmarks take cluster infrastructure into account.
