Probabilistic nearest neighbor query processing on distributed uncertain data
详细信息    查看全文
  • 作者:Daichi Amagata ; Yuya Sasaki ; Takahiro Hara…
  • 关键词:Probabilistic nearest neighbor query ; Uncertain databases ; Distributed query processing
  • 刊名:Distributed and Parallel Databases
  • 出版年:2016
  • 出版时间:June 2016
  • 年:2016
  • 卷:34
  • 期:2
  • 页码:259-287
  • 全文大小:1,014 KB
  • 参考文献:1.AbdulAzeem, Y.M., ElDesouky, A.I., Ali, H.A.: A framework for ranking uncertain distributed database. DKE 92, 1–19 (2014)CrossRef
    2.Antova, L., Jansen, T., Koch, C., Olteanu, D.: Fast and simple relational processing of uncertain data. In: ICDE, pp. 983–992 (2008)
    3.Atallah, M.J., Qi, Y.: Computing all skyline probabilities for uncertain data. In: PODS, pp. 279–287 (2009)
    4.Bernecker, T., Emrich, T., Kriegel, H.P., Renz, M., Zankl, S., Züfle, A.: Efficient probabilistic reverse nearest neighbor query processing on uncertain data. PVLDB 4(10), 669–680 (2011)
    5.Beskales, G., Soliman, M.A., IIyas, I.F.: Efficient search for the top-k probable nearest neighbors in uncertain databases. PVLDB 1(1), 326–339 (2008)
    6.Börzsönyi, S., Kossmann, D., Stocker, K.: The skyline operator. In: ICDE, pp. 421–430 (2001)
    7.Cheema, M.A., Lin, X., Wang, W., Zhang, W., Pei, J.: Probabilistic reverse nearest neighbor queries on uncertain data. IEEE TKDE 22(4), 550–564 (2010)
    8.Chen, J., Cheng, R., Mokbel, M., Chow, C.Y.: Scalable processing of snapshot and continuous nearest-neighbor queries over one-dimensional uncertain data. VLDB J. 18(5), 1219–1240 (2009)CrossRef
    9.Cheng, R., Chen, L., Chen, J., Xie, X.: Evaluating probability threshold k-nearest-neighbor queries over uncertain data. In: EDBT, pp. 672–683 (2009)
    10.Cheng, R., Kalashnikov, D.V., Prabhakar, S.: Querying imprecise data in moving object environments. IEEE TKDE 16(9), 1112–1127 (2004)
    11.Cormode, G., Li, F., Yi, K.: Semantics of ranking queries for probabilistic data and expected ranks. In: ICDE, pp. 305–316 (2009)
    12.Dalvi, N., Suciu, D.: Efficient query evaluation on probabilistic databases. VLDB J. 16(4), 523–544 (2007)CrossRef
    13.Ding, X., Jin, H.: Efficient and progressive algorithms for distributed skyline queries over uncertain data. IEEE TKDE 24(8), 1448–1462 (2012)
    14.Fu, T.Y., Peng, W.C., Lee, W.C.: Parallelizing itinerary-based knn query processing in wireless sensor networks. IEEE TKDE 22(5), 711–729 (2010)
    15.Ge, T., Zdonik, S., Madden, S.: Top-k queries on uncertain data: on score distribution and typical answers. In: SIGMOD, pp. 375–388 (2009)
    16.Hua, M., Pei, J., Zhang, W., Lin, X.: Ranking queries on uncertain data: a probabilistic threshold approach. In: SIGMOD, pp. 673–686 (2008)
    17.Kriegel, H.P., Kunath, P., Renz, M.: Probabilistic nearest-neighbor query on uncertain objects. In: DASFAA, pp. 337–348 (2007)
    18.Li, F., Yi, K., Jestes, J.: Ranking distributed probabilistic data. In: SIGMOD, pp. 361–374 (2009)
    19.Li, J., Saha, B., Deshpande, A.: A unified approach to ranking in probabilistic databases. PVLDB 2(1), 502–513 (2009)
    20.Li, X., Wang, Y., Li, X., Wang, X., Yu, J.: Gdps: an efficient approach for skyline queries over distributed uncertain data. Big Data Res. 1, 23–36 (2014)CrossRef
    21.Lian, X., Chen, L.: Probabilistic group nearest neighbor queries in uncertain databases. IEEE TKDE 20(6), 809–824 (2008)
    22.Lian, X., Chen, L.: Shooting top-k stars in uncertain databases. VLDB J. 20(6), 819–840 (2011)CrossRef
    23.Lin, X., Xu, J., Hu, H., Lee, W.: Authenticating location-based skyline queries in arbitrary subspaces. IEEE TKDE 26(6), 1479–1493 (2014)
    24.Liu, X., Yang, D., Ye, M., Lee, W.: U-skyline: a new skyline query for uncertain databases. IEEE TKDE 25(4), 945–960 (2013)
    25.Pei, J., Jiang, B., Lin, X., Yuan, Y.: Probabilistic skylines on uncertain data. In: VLDB, pp. 15–26 (2007)
    26.Pripužić, K., Žarko, I.P., Aberer, K.: Distributed processing of continuous sliding-window k-nn queries for data stream filtering. World Wide Web 14(5–6), 465–494 (2011)
    27.Sarma, A.D., Benjelloun, O., Halevy, A., Widom, J.: Working models for uncertain data. In: ICDE, pp. 7–27 (2006)
    28.Soliman, M.A., Ilyas, I.F., Chen-Chuan Chang, K.: Top-k query processing in uncertain databases. In: ICDE, pp. 896–905 (2007)
    29.Tang, M., Li, F., Phillips, J.M., Jestes, J.: Efficient threshold monitoring for distributed probabilistic data. In: ICDE, pp. 1120–1131 (2012)
    30.Tang, M., Li, F., Tao, Y.: Distributed online tracking. In: SIGMOD (2015)
    31.Wang, Y., Li, X., Li, X., Wang, Y.: A Survey of Queries Over Uncertain Data. Knowledge and Information Systems. Springer, London (2013)
    32.Ye, M., Lee, W., Lee, D., Liu, X.: Distributed processing of probabilistic top-k queries in wireless sensor networks. IEEE TKDE 25(6), 76–91 (2013)
    33.Yiu, M.L., Mamoulis, N.: Efficient processing of top-k dominating queries on multi-dimensional data. In: VLDB, pp. 483–494 (2007)
    34.Yuen, S.M., Tao, Y., Xiao, X., Pei, J., Zhang, D.: Superseding nearest neighbor search on uncertain spatial databases. IEEE TKDE 22(7), 1041–1055 (2010)
    35.Zhang, X., Chomicki, J.: Semantics and evaluation of top-k queries in probabilistic databases. Distrib Parallel Databases 26(1), 67–126 (2009)CrossRef
    36.Zhang, Y., Lin, X., Zhu, G., Zhang, W., Lin, Q.: Efficient rank based knn query processing over uncertain data. In: ICDE, pp. 28–39 (2010)
  • 作者单位:Daichi Amagata (1)
    Yuya Sasaki (2)
    Takahiro Hara (1)
    Shojiro Nishio (1)

    1. Department of Multimedia Engineering Graduate School of Information Science and Technology Osaka University, 1-5 Yamadaoka, Suita, Osaka, 565-0871, Japan
    2. Department of Systems and Social Informatics, Graduate School of Information Science Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan
  • 刊物类别:Computer Science
  • 刊物主题:Database Management
    Data Structures
    Information Systems Applications and The Internet
    Operating Systems
    Memory Structures
  • 出版者:Springer Netherlands
  • ISSN:1573-7578
文摘
A nearest neighbor (NN) query, which returns the most similar object to a user-specified query object, plays an important role in a wide range of applications and hence has received considerable attention. In many such applications, e.g., sensor data collection and location-based services, objects are inherently uncertain. Furthermore, due to the ever increasing generation of massive datasets, the importance of distributed databases, which deal with such data objects, has been growing. One emerging challenge is to efficiently process probabilistic NN queries over distributed uncertain databases. The straightforward approach, that each local site forwards its own database to the central server, is communication-expensive, so we have to minimize communication cost for the NN object retrieval. In this paper, we focus on two important queries, namely top-k probable NN queries and probabilistic star queries, and propose efficient algorithms to process them over distributed uncertain databases. Extensive experiments on both real and synthetic data have demonstrated that our algorithms significantly reduce communication cost.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700