Big Data Techniques, Systems, Applications, and Platforms: Case Studies from Academia
Atanas Radenski, Todor Gurov, Kalinka Kaloyanova, Nikolay Kirov, Maria Nisheva, Peter Stanchev, Eugenia Stoimenova
Citation: Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 8, pages 883–888 (2016)
Abstract. Big data is a broad term with numerous dimensions, most notably: big data characteristics, techniques, software systems, application domains, computing platforms, and big data milieu (industry, government, and academia). In this paper we briefly introduce fundamental big data characteristics and then present seven case studies of big data techniques, systems, applications, and platforms, as seen from an academic perspective (industry and government perspectives are not the subject of this publication). While we feel that it is difficult, if at all possible, to encapsulate all of the important big data dimensions in a strict and uniform, yet comprehensible language, we believe that a set of diverse case studies spread over the principal big data dimensions, like the one offered in this paper, can indeed benefit the broad big data community by helping experts in one realm better understand current trends in the other realms.