VDDA: automatic visualization-driven data aggregation in relational databases
  • Authors: Uwe Jugel ; Zbigniew Jerzak ; Gregor Hackenbroich ; Volker Markl
  • Keywords: Relational databases ; Data aggregation ; Visual aggregation ; Dimensionality reduction ; Data visualization ; Line rasterization ; Overplotting
  • Journal: The VLDB Journal
  • Publication year: 2016
  • Publication date: February 2016
  • Volume: 25
  • Issue: 1
  • Pages: 53-77
  • Full-text size: 2,986 KB
  • Affiliations: Uwe Jugel (1)
    Zbigniew Jerzak (1)
    Gregor Hackenbroich (1)
    Volker Markl (2)

    1. SAP SE, Walldorf/Dresden, Germany
    2. Technische Universität Berlin, Berlin, Germany
  • Journal category: Computer Science
  • Journal subject: Database Management
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 0949-877X
Abstract
Contemporary RDBMS-based systems for visualizing high-volume numerical data struggle to cope with the hard latency requirements and high ingestion rates of interactive visualizations. Existing solutions for lowering the volume of large data sets disregard the spatial properties of visualizations, resulting in visualization errors. In this work, we introduce VDDA, a visualization-driven data aggregation that models visual aggregation at the pixel level as data aggregation at the query level. Based on the M4 aggregation for producing pixel-perfect line charts from highly reduced data subsets, we define a complete set of data reduction operators that simulate the overplotting behavior of the most frequently used chart types. Relying only on relational algebra and common data aggregation functions, our approach is generic and applicable to any visualization system that consumes data stored in relational databases. We demonstrate our visualization-driven data aggregation using real-world data sets from high-tech manufacturing, stock markets, and sports analytics, reducing data volumes by up to two orders of magnitude while preserving the pixel-perfect visualizations that would be produced from the raw data.
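
As a rough illustration of the M4 aggregation that the abstract builds on, the Python sketch below groups a time series by pixel column and keeps, per column, only the tuples with the minimum and maximum timestamp and the minimum and maximum value. The function name `m4_reduce`, its parameters, and the in-memory list representation are assumptions made for this example; the paper expresses the aggregation with relational operators, which this sketch does not reproduce.

```python
from collections import defaultdict


def m4_reduce(points, t_start, t_end, width):
    """Sketch of an M4-style reduction (hypothetical helper, not the paper's code).

    points : iterable of (t, v) tuples
    t_start, t_end : visible time range of the chart (t_end > t_start)
    width  : chart width in pixels
    Returns the kept tuples sorted by time.
    """
    span = t_end - t_start
    groups = defaultdict(list)
    for t, v in points:
        if not (t_start <= t <= t_end):
            continue  # outside the visible range
        # Map the timestamp to a pixel column; t_end is clamped into the last column.
        k = min(int(width * (t - t_start) / span), width - 1)
        groups[k].append((t, v))

    kept = set()
    for column in groups.values():
        kept.add(min(column, key=lambda p: p[0]))  # first tuple (min t)
        kept.add(max(column, key=lambda p: p[0]))  # last tuple (max t)
        kept.add(min(column, key=lambda p: p[1]))  # bottom tuple (min v)
        kept.add(max(column, key=lambda p: p[1]))  # top tuple (max v)
    return sorted(kept)


# One pixel column: the interior point (1.5, 5) is neither an extreme in
# time nor in value, so it is dropped.
series = [(0, 5), (1, 9), (1.5, 5), (2, 1), (3, 4)]
reduced = m4_reduce(series, t_start=0, t_end=3, width=1)
```

Each pixel column contributes at most four tuples, so the output size is bounded by four times the chart width regardless of how many rows the underlying table holds.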
