Quality of Histograms As Indicator Of Approximate Query Quality

Agnieszka Chądzyńska-Krasowska, Marcin Kowalski

DOI: http://dx.doi.org/10.15439/2016F543

Citation: Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 8, pages 9–15 (2016)

Full text

Abstract. We consider concept of approximate query in RDBMS i.e. query that returns results which may differ from common (exact) query results in a way but its evaluation requires less resources. In the work we focus mostly on time and storage space aspects. We follow one of the state-of-the-art trends using synopses of data as the input of approximate query evaluation. We propose some measures of approximate query results quality. Basing on them we present steps of adaptive elaboration of synopses quality measure that should be mutually corresponding.

References

V. Ganti, M.-L. Lee, and R. Ramakrishnan, “Icicles: Self-tuning samples for approximate query answering.” in VLDB, vol. 176. Citeseer, 2000, p. 187.
S. Chaudhuri, G. Das, and V. Narasayya, “Optimized Stratified Sampling for Approximate Query Processing,” ACM Trans. Database Syst., vol. 32, no. 2, p. 9, 2007.
K. Chakrabarti, M. N. Garofalakis, R. Rastogi, and K. Shim, “Approximate Query Processing Using Wavelets,” VLDB J., vol. 10, no. 2-3, pp. 199–223, 2001.
G. Cormode, M. N. Garofalakis, P. J. Haas, and C. Jermaine, “Synopses for massive data: Samples, histograms, wavelets, sketches,” Foundations and Trends in Databases, vol. 4, no. 1-3, pp. 1–294, 2012. http://dx.doi.org/10.1561/1900000004.
B. Mozafari and N. Niu, “A handbook for building an approximate query engine,” IEEE Data Eng. Bull., vol. 38, no. 3, pp. 3–29, 2015. [Online]. Available: http://sites.computer.org/debull/A15sept/p3.pdf
D. Ślȩzak and V. Eastwood, “Data warehouse technology by infobright,” in Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. ACM, 2009, pp. 841–846.
D. Ślȩzak, P. Synak, A. Wojna, and J. Wróblewski, “Two database related interpretations of rough approximations: Data organization and query execution,” Fundam. Inform., vol. 127, no. 1-4, pp. 445–459, 2013. http://dx.doi.org/10.3233/FI-2013-920. [Online]. Available: http://dx.doi.org/10.3233/FI-2013-920
D. Ślȩzak and M. Kowalski, “Towards approximate SQL–Infobright’s approach,” in Rough Sets and Current Trends in Computing. Springer, 2010, pp. 630–639.
M. Kowalski, D. Ślȩzak, and P. Synak, “Approximate assistance for correlated subqueries,” in Proceedings of the 2013 Federated Conference on Computer Science and Information Systems, Kraków, Poland, September 8-11, 2013., M. Ganzha, L. A. Maciaszek, and M. Paprzycki, Eds., 2013, pp. 1455–1462. [Online]. Available: http://fedcsis.org/2013/
S. Chaudhuri, “An overview of query optimization in relational systems,” in Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. ACM, 1998, pp. 34–43.
R. P. Kooi, “The optimization of queries in relational databases,” 1980.
V. Poosala and Y. E. Ioannidis, “Selectivity estimation without the attribute value independence assumption,” in VLDB, vol. 97, 1997, pp. 486–495.
Y. E. Ioannidis and V. Poosala, “Balancing histogram optimality and practicality for query result size estimation,” in ACM SIGMOD Record, vol. 24, no. 2. ACM, 1995, pp. 233–244.
H. V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. C. Sevcik, and T. Suel, “Optimal histograms with quality guarantees,” in VLDB, vol. 98, 1998, pp. 24–27.