基于概率数据库的偏好查询研究

设为首页

收藏本站

网站地图 | English | 公务邮箱

远程访问

NSTL服务站

基于概率数据库的偏好查询研究

详细信息本馆镜像全文| 推荐本文 | | 获取CNKI官网全文

英文题名：Research of Preference Queries on Probabilistic Database
作者：王晓伟
论文级别：博士
学科专业名称：计算机科学与技术
中文关键词：不确定数据 ; 概率数据库 ; 偏好查询 ; top-k ; skyline
英文关键词：Uncertain data ; probabilistic database ; preference query ; top-k ; skyline
学位年度：2011
导师：贾焰
学科代码：0812
学位授予单位：国防科学技术大学
论文提交日期：2011-10-01

摘要

在商业数据管理、金融数据分析、传感器、RFID、地理信息系统等许多重要的现代应用中，数据普遍带有不确定性特征，查询和分析的精确度对应用是否成功具有决定性影响。传统的数据库技术建立在确定性理论的基础上，无法有效处理此类不确定数据。概率数据库技术能够直接管理不确定数据，是近年来数据库领域的研究热点。概率数据库的应用层通过偏好查询满足用户的个性需求，其中最重要的偏好查询是概率skyline和概率top-k查询。
     本文针对基于概率数据库的偏好查询技术，从提高查询效率和通用性出发，对概率skyline查询和概率top-k查询的若干关键问题展开研究，主要工作包括：
     1、概率skyline查询的非索引算法。现有的概率skyline算法普遍基于R树索引，而R树索引技术在通用性方面存在限制。非索引算法不需要任何预处理或索引结构，具有最广泛的适用性，是索引算法的必要补充。基本思想是从已访问的对象中提取概率信息，来估计未来访问对象的skyline概率。分别提出了基于内存R树和有序链表的两种非索引算法。实验结果表明，两种算法均明显优于朴素的非索引算法，且在对数据维度的适应性方面互为补充。
     2、概率skyline查询的分布式算法。现实应用中，数据通常分散于多个分布节点，分布式概率skyline查询的延迟主要取决于通信开销。以降低通信开销为主要目标，提出了基于概要共享的分布式概率skyline算法框架，然后分别提出了两种具体算法。其中，VOS算法将虚对象集合作为间接表示数据分布的概要，用基于距离的虚对象集合压缩方法来降低共享虚对象导致的额外通信开销；GRID算法着眼于克服VOS算法对数据分布的依赖性，将对象映射到固定宽度的网格中，提出网格概要的压缩方法来减少共享网格带来的额外通信开销。实验结果表明，VOS和GRID均优于现有算法。其中前者适于处理具有独立分布特征或维度较高的数据，后者更适合处理具有反相关特征的数据。
     3、支持多种概率top-k查询的层次索引。概率top-k查询需要同时考虑评分函数值和存在概率值，因此具有不同语义。每种语义都有专门的查询处理方法，缺少能应用于一类重要语义的索引技术。提出两种层次索引，能应用于满足特定性质的一类概率top-k查询。首先提出基于skyline的SL索引，为了提高SL索引的鲁棒性，进一步提出了基于支配频率的FL索引，并提出了高效的索引建立算法。实验结果表明，SL索引和FL索引均能显著减少需要访问的对象数量，其中FL索引具有更好的鲁棒性。
     4、概率逆top-k查询。前述研究成果都是从用户角度出发的查询优化问题，逆top-k查询能够帮助生产者分析产品影响的用户群体，在商业分析等领域有重要应用。现有逆top-k算法针对确定数据，在现实应用中，数据往往由于数据集成等原因带有不确定性。提出了概率逆top-k查询的合理定义，并提出了基于物化视图的查询算法。该算法将数据空间划分为网格，预先计算网格顶点的查询结果，后续查询利用物化视图避免对偏好集合的全盘访问。实验结果表明，基于物化视图的查询算法能够有效减少需要计算的偏好数量。
     综上所述，本文针对基于概率数据库的两类偏好查询，提出了较为高效和通用的的解决方法，对于概率数据库的研究和应用具有一定的理论意义和应用价值。
Uncertainty is inherent in the real world and the precision of the query and analysisis the major concern in many modern applications like business data management,finance analysis, sensor, RFID, and GIS, etc. However, Traditional databases techniqueswhich are based on deterministic theory are not prepared to handle imprecise data. Theprobabilistic database techniques which are based on probability theory can manageuncertain data directly and emerges as a hot field in the database community. In theapplication layer of databases preference queries are used to fulfill the diverserequerment of individual users, among which skyline and top-k query are the mostimportant ones.
     This dissertation focuses on efficient and general solutions for preference querieson probabilistic database, and addresses several challenging problems of probabilisticskyline and probabilstic top-k queries. The main contributions can be concluded asfollows:
     (1) Non-indexing probabilistic skyline algorithm. Existing probabilistic skylinealgorithms are generally based on R-tree indices, which are not general enough.Non-indexing algorithms, which require no preprocessing or index structures, have thewidest applicability and are necessary complements to the indexing algorithms. Twonon-indexing algorithms that based on in memory R-trees and ordered list respectivelyare proposed. The basic idea is to estimate the skyline probability of future objects fromthe objects accessed before. Experimental results show that both algorithms outperformthe naive algorithm, and are complementary to each other.
     (2) Distributed probabilistic skyline algorithm. In practical applications, data isoften distributed across multiple nodes, and applications use distributed queries toobtain the holistic results. The query delay mainly depends on the communicationoverhead, and it is improtant to design communication efficient distributed algorithms.A framework based on summary sharing is first proposed, and then two specificalgorithms namely VOS and GRID are proposed respectively. The VOS algorithm usesvirtual objects as the summary of data distribution. In order to reduce additionalcommunication cost caused by summary sharing, a distance based compression schemeis further proposed. GRID algorithm is proposed to overcome the shortage of datadependency on data distribution of the VOS algorithm. In the GRID algorithm, eachobject is mapped to a cell of a regular grid, and a compression scheme of grid isproposed to reduce the additional communication cost of grid sharing. Experimentalresults show that, VOS and GRID outperform the existing algorithm in communicationcost, and the former is more suitable for independent like or high dimentional datasets,while the latter is suitable for anti-correlated datasets.
     (3) The probabilistic top-k indexing scheme. There are a collect of semantics of theprobabilistic top-k query because it needs to rank tuples taking into account both theprobability and scoring function. Different query algorithms are designed for each querysemantic, and there is lack of indexing schemes applicable to a class of important querysemantics. Two layer indexes that can be used by a collect of important probabilistictop-k queries are proposed. The first one is based on the concept of skyline, the secondone is based on dominate frequency, and is proposed to achieve better robustness.Morever, an efficient indexing building algorithm is proposed. Experimental resultsshow that both indexing schemes reduce the objects accessed by probabilistic top-kqueries, and the latter is more robust.
     (4) Probabilistic reverse top-k query. Above research topics all consider the queryevaluation problem from the perspective of users. The reverse top-k query is importantin business analysis because it helps manufactures learn preferences from the data oftop-k preference. Existing reverse top-k queries assume the underlying data is certain,however, in the real-world applications data is often imprecise because of dataintegration. A reasonable definition of probabilistic reverse top-k query is first proposed,and then an efficient query algorithm based on materialized views is presented. Thealgorithm partitions the data space to a grid, and stores the pre-computing query resultsof grid vertices. Subsequent queries use the materialized set preferences to avoid fullscan. Experimental results show that the query algorithm reduces the number ofpreferences need to calculated and scales well in cardinality of preference set.
     In summary, this dissertation focuses on the most important preference queries ofprobabilistic database, and provides efficient and general solutions. These works haveacademic and practical value for advancing the theory and practicability of probabilisticdatabase.

引文

[1]李德仁,王树良,李德毅.空间数据挖掘理论与应用[M].北京:科学出版社,2006.
    [2] Li J. T., Guo J. B., Luo H. Y. RFID Technology and Application[J].TechnologyLetter of Institute of Computing Technology,2004,11:1~10.
    [3]许嘉,于戈,谷峪,王艳秋. RFID不确定数据管理技术[J].计算机科学与探索,2009,3(6):561~576.
    [4] Jeffery S. R., Garofalakis M., Franklin M. J. Adaptive Cleaning for RFID DataStreams[C].//Proceedings of the32nd International Conference on Very LargeData Bases (VLDB). Seoul, Korea: VLDB Endowment,2006:163~174.
    [5]贺少帅,杨敏华,李伟建.遥感数据模糊不确定性来源及其处理方法的探讨[J].测绘科学,2008,33(6):107~109.
    [6]史文中.空间数据与空间分析不确定性原理[M]:北京:科学出版社,2005.
    [7]李德毅,刘常昱,杜鹢,韩旭.不确定性人工智能[J].软件学报,2004,15(11):1583~1594.
    [8] Green T., Tannen V. Models for Incomplete and ProbabilisticInformation[C].//Proceedings of the9th International Conference on ExtendingDatabases Technology: Advances in Database Technology (EDBT). Munich,Germany: ACM,2006:278~296.
    [9]汤新红.单主体多类型偏好逻辑[D].重庆:西南大学,2009.
    [10] Girard P. Modallogics for Belief and Preference Change[D].Standford: StandfordUniversity,2008.
    [11] Andreka H., Ryan M., Schobbens P.-Y. Operators and Laws for CombiningPreferential Relations [J].Journal of Logic and Computation,2002,12:12~53.
    [12] Kie ling W. Foundations of Preference in Database Systems[C].//Proceedings ofthe28th International Conference on Very Large Data Bases (VLDB), HongKong,2002:311~322.
    [13] Ilyas I. F. Rank-Aware Query Processing and Optimization[D].West Lafayette:Purdue University,2004.
    [14]邓波.分布式序敏感查询处理关键技术研究[D].长沙:国防科学技术大学,2006.
    [15] Bentley J. L., Kung H. T., Schkolnick M., Thompson C. D. On the AverageNumber of Maxima in a Set of Vectors and Applications[J].Journal of theAssocation for Computing Machinery,1978,25(4):536~543.
    [16] Borzsony S., Kossmann D., Stocker K. The Skyline Operator[C].//Proceedings ofthe17th International Conference on Data Engineering (ICDE). Hannover,Germany: IEEE Computer Society,2001:421~430.
    [17] Soliman M. A., Ilyas I. F., Chang K. C.-C. Top-k Query Processing in UncertainDatabases[C].//Proceedings of the23rd International Conference on DataEngineering (ICDE).Istanbul, Turkey: IEEE Computer Society,2007:896~905.
    [18] Hua M., Pei J., Zhang W., Lin X. Ranking Queries on Uncertain Data: AProbabilistic Threshold Approach[C].//Proceedings of the2008ACM SIGMODInternational Conference on Management of Data (SIGMOD).Vancouver,Canada: ACM,2008:673~686.
    [19] Zhang X., Chomicki J. On the Semantics and Evaluation of Top-k Queries inProbabilistic Databases[C].//Proceedings of the24th International Conference onData Engineering (ICDE).Cancun, Mexico: IEEE Computer Society,2008:556~563.
    [20] Cormode G., Li F., Yi K. Semantics of Ranking Queries for Probabilistic Dataand Expected Ranks[C].//Proceedings of the25th International Conference onData Engineering (ICDE).Shanghai: IEEE Computer Society,2009:305-316.
    [21] Pei J., Jiang B., Lin X., Yuan Y. Probabilistic Skylines on UncertainData[C].//Proceedings of the33rd International Conference on Very Large DataBases (VLDB).Vienna, Austria: VLDB Endowment,2007:15~26.
    [22] Klir G. J., Folger T. A. Fuzzy Sets, Uncertainty and Information[M].New Jersey:Pemtice-Hall,1988.
    [23] Fuhr N., Rolleke T. A Probabilistic Relational Algebra for the Integration ofInformation Retrieval and Database Systems[J].ACM Transactions on DatabaseSystems,1997,14(1):32~66.
    [24] Lakshmanan L. V. S., Lepone N., Ross R., Subrahmanian V. S. Probview: AFlexible Probabilistic Database System[J].ACM Transactions on DatabaseSystems,1997,22(3):419~469.
    [25] Sarma A. D., Benjelloun O., Halevy A., Widom J. Working Models forUncertain Data[C].//Proceedings of the22nd International Conference on DataEngineering(ICDE).Atlanta, USA: IEEE Computer Society,2006:7~7.
    [26] Dalvi N., Suciu D. The Dichotomy of Conjunctive Queries on ProbabilisticStructures[C].//Proceedings of the26th ACM SIGMOD-SIGACT-SIGARTSymposium on Principles of Database Systems (PODS).Beijing: ACM,2007:293~302.
    [27] Wang D. Z., Michelakis E., Garofalakis M., Hellerstein J. M. Bayesstore:Managing Large, Uncertain Data Repositories with Probabilistic GraphicalModels[J].Proceedings of the VLDB Endowment,2008,1(1):340~351.
    [28] Poole D. First-Order Probabilistic Inference[C].//Proceedings of the18thInternational Joint Conference on Artificial Intelligence.San Francisco, USA:Morgan Kaufmann Publishers,2003:985~991.
    [29] Sen P., Deshpande A., Getoor L. Prdb: Managing and Exploiting RichCorrelations in Probabilistic Databases[J].The VLDB Journal,2009,18(5):1065~1090.
    [30] Nierman A., Jagadish H. V. Protdb: Probabilistic Data in Xml[C].//Proceedingsof the28th International Conference on Very Large Data Bases (VLDB).HongKong: VLDB Endowment,2002:646~657.
    [31] Abiteboul S., Senellart P. Querying and Updating Probabilistic Information inXML[C].//Proceedings of the9th International Conference on ExtendingDatabases Technology: Advances in Database Technology (EDBT).Munich,Germany: ACM,2006:1059~1068.
    [32] Senellart P., Abiteboul S. On the Complexity of Managing Probabilistic XmlData[C].//Proceedings of the26th ACM SIGMOD-SIGACT-SIGARTSymposium on Principles of Database Systems (PODS).Beijing, China: ACM,2007:283~292.
    [33] Babcock B., Babu S., Datar M., Motwani R., Widom J. Models and Issues inData Stream[C].//Proceedings of the21st ACM SIGACT-SIGMOD-SIGART onPrinciples of Database Systems (PODS).Madison, USA: ACM,2002:1~16.
    [34] Cormode G., Garofalakis M. Sketching Probabilistic DataStreams[C].//Proceedings of the2007ACM SIGMOD International Conferenceon Management of Data (SIGMOD).Beijing: ACM,2007:281~292.
    [35] Jayram T. S., Kale S., Vee E. Efficient Aggregation Algorithms for ProbabilisticData[C].//Proceedings of the18th Annual ACM-SIAM Symposium on DiscreteAlgorithms (SODA).New Orleans, USA: SIAM,2007:346~355.
    [36] Jin C., Yi K., Chen L., Yu J. X., Lin X. Sliding-Window Top-k Queries onUncertain Streams[J].//Proceedings of the VLDB Endowment,2008:301~312.
    [37] Magnani M., Montesi D. A Survey on Uncertainty Management in DataIntegration[J].ACM Journal of Data and Information Quality,2010,2(1):1~33.
    [38] Dong X., Halevy A., Yu C. Data Integration with Uncertainty[J].The VLDBJournal,2009,18(2):469~500.
    [39] Sarma A. D. Managing Uncertain Data[D].Standford: Standford University,2009.
    [40] Cheng R., Prabhakar S., Kalashnikov D. V. Querying Imprecise Data in MovingObject Environments[J].IEEE Transaction on Knowledge and Data Engineering,16(9):1112~1127.
    [41] Cheng R., Xia Y., Prabhakar S., Shah R., Vitter J. S. Efficient Indexing Methodsfor Probabilistic Threshold Queries over Uncertain Data[C].//Proceedings of the30th International Conference on Very Large Data Bases (VLDB).Toronto,Canada: VLDB Endowment,2004:876~887.
    [42] Tao Y., Cheng R., Xiao X., Ngai W. K., Kao B., Prabhakar S. IndexingMulti-Dimensional Uncertain Data with Arbitrary Probability DensityFunctions[C].//Proceedings of the31st International Conference on Very LargeData Bases (VLDB).Trondheim, Norway: VLDB Endowment,2005:922~933.
    [43] Singh S., Mayfield C., Prabhakar S., Shah R., Hambrusch S. Indexing UncertainCategorical Data[C].//Proceedings of the23rd International Conference on DataEngineering(ICDE).Istanbul, Turkey: IEEE Computer Society,2007:616~625.
    [44] Re C., Dalvi N., Suciu D. Query Evaluation on Probabilistic Databases[J].IEEEData Engineering Bulletin,2006,29(1):25~31.
    [45] Tung A. K. H., Zhang Z. Uncertain Data Clustering: Models, Methods andApplications[C].//Proceedings of the9th International Conference on Web-AgeInformation Management (WAIM).Zhangjiajie: Springer,2008.
    [46] Tsang S., Kao B., Yip K. Y., Ho W.-S., Lee S. D. Decision Trees for UncertainData[C].//Proceedings of the25th International Conference on Data Engineering(ICDE). Shanghai: IEEE Computer Society,2009:64~78.
    [47] Aggarwal C. C., Yu P. S. Outlier Detection with Uncertain Data[C].//Proceedingsof SDM. Auckland, New Zealand: Citeseer,2008.
    [48] Zhang Q., Li F., Yi K. Finding Frequent Items in Probabilistic Data[C].//Proceedings of the2008ACM SIGMOD International Conference onManagement of Data (SIGMOD).Vancouver, Canada: ACM,2008:819~832.
    [49] Chau M., Cheng R., Kao B., Ng J. Uncertain Data Mining: An Example inClustering Location Data[J].Advances in Konwledge Discovery and Data Mining,2006:199~204.
    [50] Ngai W. K., Kao B., Chui C. K., Cheng R., Chau M., Yip K. Y. EfficientClustering of Uncertain Data[C].//Proceedings of6th IEEE InternationalConference on Data Mining (ICDM).Hong Kong: IEEE Computer Society,2006:436~445.
    [51] Lee S. D., Kao B., Cheng R. Reducing Uk-Means to K-Means[C].//Proceedingsof the7th IEEE International Conference on Data Mining (ICDM).Omaha, USA:IEEE Computer Society,2007:483~488.
    [52] Kriegel H. P., Pfeifle M. Density-Based Clustering of UncertainData[C].//Proceedings of the11th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (SIGKDD).Chicago, USA: ACM,2005:672~677.
    [53] Xu H. J., Li G. H. Density-Based Probabilistic Clustering of UncertainData[C].//Proceedings of the2008International Conference on ComputerScience and Software Engineering (CSSE).Wuhan: IEEE Computer Society,2008:474~477.
    [54] Bi J., Zhang T. Support Vector Classification with Input Data Uncertainty[J].Advances in Neural Information Processing Systems,2004,17(5):161~168.
    [55] Bhattacharyya C., Pannagadatta K. S., Smola A. J. A Second Order ConeProgramming Formulation for Classifying Missing Data[C].//Proceedings of the2004Conference on Advances in Neural Information Processing Systems.Vancouver, Canada: MIT Press,2004:153~160.
    [56]于浩,王斌,肖刚,杨晓春.基于距离的不确定离群点检测[J].计算机研究与发展,2010,47(3):474~484.
    [57] Chui C.-K., Kao B., Hung E. Mining Frequent Itemsets from UncertainData[J].Advances in Knowledge Discovery and Data Mining,2007:47~58.
    [58] Chui C.-K., Kao B. A Deremental Approach for Mining Frequent Itemsets fromUncertain Data[C].//Proceedings of the12th Pacific-Asia conference onAdvances in knowledge discovery and data mining(PAKDD). Osaka, Japan:Springer,2008:64~75.
    [59] Leung C. K.-S., Carmichael C., Hao B. Y. Efficient Mining of Frequent Patternsfrom Uncertain Data[C].//Proceedings of the7th IEEE International Conferenceon Data Mining (ICDM). Omaha, USA: IEEE Computer Society,2007:489~494.
    [60]魏小娟,杨婧,李翠平,陈红. Skyline查询处理综述[J].软件学报,2008,19(6):1386~1400.
    [61] Chomicki J. Skyline with Presorting[C].//Proceedings of the the19thInternational Conference on Data Engineering (ICDE). Bangalore, India: IEEEComputer Society,2003:717~719.
    [62] Godfrey P., Shipley R., Gryz J. Maximal Vector Computation in Large DataSets[C].//Proceedings of the31st International Conference on Very Large DataBases(VLDB). Trondheim, Norway: VLDB Endowment,2005:229~240.
    [63] Kossmann D., Ramsak F., Rost S. Shooting Stars in the Sky: An OnlineAlgorithm for Skyline Queries[C].//Proceedings of the28th InternationalConference on Very Large Data Bases (VLDB). Hong Kong, China: VLDBEndowment,2002:275~286.
    [64] Papadias D., Tao Y., Fu G., Seeger B. An Optimal and Progressive Algorithm forSkyline Queries[C].//Proceedings of the2003ACM SIGMOD InternationalConference on Management of Data (SIGMOD), San Diego, California: ACM,2003:467~478.
    [65] Pei J., Jin W., Ester M., Tao Y. Catching the Best Views of Skyline: A SemanticApproach Based on Decisive Subspaces[C].//Proceedings of the31stInternational Conference on Very Large Data Bases (VLDB), Trondheim,Norway: ACM Press,2005:253~264.
    [66] Yuan Y., Lin X., Liu Q., Wang W., Yu J. X., Zhang Q. Efficient Computation ofthe Skyline Cube[C].//Proceedings of the31st International Conference on VeryLarge Data Bases (VLDB), Trondheim, Norway: ACM Press,2005:241~252.
    [67] Xia T., Zhang D. Refreshing the Sky: The Compressed Skycube with EfficientSupport for Frequent Updates[C].//Proceedings of the2006ACM SIGMODInternational Conference on Management of Data (SIGMOD), Chicago, Illinois,USA: ACM Press,2006:491~502.
    [68] Tao Y., Xiao X., Pei J. Efficient Skyline and Top-k Retrieval inSubspaces[J].IEEE Transactions on Knowledge and Data Engineering,2007,19(8):1072~1088.
    [69] Balke W.-T., Güntzer U., Zheng J. X. Efficient Distributed Skylining for WebInformation Systems[C].//Proceedings of the7th International Conference onExtending Databases Technology: Advances in Database Technology (EDBT).Heraklion, Greece: ACM,2004:573~574.
    [70] Wang S., Ooi B. C., Tung A. K. H., Xu L. Efficient Skyline Query Processing onPeer-to-Peer Networks[C].//Proceedings of the23rd International Conference onData Engineering(ICDE). Istanbul, Turkey: IEEE Computer Society,2007:1126~1135.
    [71] Wu P., Zhang C., Feng Y., Zhao B. Y., Agrawal D., Abbadi A. E. ParallelizingSkyline Queries for Scalable Distribution[C].//Proceedings of the9thInternational Conference on Extending Databases Technology: Advances inDatabase Technology (EDBT), Munich, Germany: ACM,2006:112~130.
    [72] Vlachou A., Doulkeridis C., Kotidis Y., Vazirgiannis M. Skypeer: EfficientSubspace Skyline Computation over Distributed Data[C].//Proceedings of the the23rd International Conference on Data Engineering (ICDE). Istanbul, Turkey:IEEE Computer Society,2007:416~425.
    [73] Cui B., Lu H., Chen Q. X. L., Dai Y., Zhou Y. Parallel Distributed Processing ofConstrained Skyline Queries by Filtering[C].//Proceedings of the the24thInternational Conference on Data Engineering (ICDE). Cancun, Mexico: IEEEComputer Society,2008:546~555.
    [74] Zhu L., Tao Y., Zhou S. Distributed Skyline Retrieval with Low BandwidthConsumption[J].IEEE Transactions on Knowledge and Data Engineering,2009,21(3):384~400.
    [75] Tao Y., Papadias D. Maintaining Sliding Window Skylines on DataStreams[J].IEEE Transactions on Knowledge and Data Engineering,2006,18(3):377~391.
    [76] Lin X., Yuan Y., Wang W., Lu H. Stabbing the Sky: Efficient SkylineComputation over Sliding Windows[C].//Proceedings of the21st InternationalConference on Data Engineering (ICDE). Tokyo, Japan: IEEE Computer Society,2005:502~513.
    [77]孙圣力,戴东波,黄震华,张齐勋,周立新.概率数据流上skyline查询处理算法[J].电子学报,2009,37(2):285~293.
    [78] Zhang W., Lin X., Zhang Y., Wang W., Yu J. X. Probabilistic Skyline Operatorover Sliding Windows[C].//Proceedings of the the25th International Conferenceon Data Engineering(ICDE). Shanghai: IEEE Computer Society,2009:1060~1071.
    [79] Yiu M. L., Mamoulis N., Dai X., Tao Y., Vaitis M. Efficient Evaluation ofProbabilistic Advanced Spatial Queries on Existentially Uncertain Data[J].IEEETransactions on Knowledge and Data Engineering,2009,21(1):108~122.
    [80] Chan C. Y., Jagadish H. V., Tan K. L., Tung A. K. H., Zhang Z. On HighDimensional Skylines[C].//Proceedings of International Conference onExtending Database Technology (EDBT). Munich, Germany: ACM,2006:478~495.
    [81] Chaudhuri S., Gravano L., Marian A. Optimizing Top-k Selection Queries overMultimedia Repositories[J].IEEE Transactions on Knowledge and DataEngineering,2004,16(8):992~1009.
    [82] Kimelfeld B., Sagiv Y. Finding and Approximating Top-k Answers in KeywordProximity Search[C].//Proceedings of the25th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS). Chicago, USA:ACM,2006:173~182.
    [83] Han J., Wang J., Lu Y., Tzvetkov P. Mining Top-k Frequent Closed Patternswithout Minimum Support[C].//Proceedings of the2002IEEE InternationalConference on Data Mining (ICDM). Maebashi, Japan: IEEE Computer Society,2002:211~218.
    [84] Güntzer U., Balke W.-T., Kieβling W. Towards Efficient Multi-Feature Queriesin Heterogeneous Environments[C].//Proceedings of the2001InternationalSymposium on Information Technology. Las Vegas, USA: IEEE ComputerSociety,2001:622~628.
    [85] Donjerkovic D., Ramakrishnan. R. Probabilistic Optimization of Top nQueries[C].//Proceedings of25th International Conference on Very Large DataBases (VLDB). Edinburgh, UK: VLDB Endowment,1999:411~422.
    [86] Bruno N., Chaudhuri S., Gravano L. Top-k Selection Queries over RelationalDatabases. Mapping Strategies and Performance Evaluation[J].ACMTransactions on Database Systems,2002,27(2):153~187.
    [87] Fagin R. Combining Fuzzy Information from Multiple Systems[C].//Proceedingsof the15th ACM SIGACT-SIGMOD-SIGART Symposium on Principles ofDatabase Systems (PODS). Montreal, Canada: ACM,1996:216~226.
    [88] Fagin R., Lotem A., Naor M. Optimal Aggregation Algorithms forMiddleware[C].//Proceedings of the20th ACM SIGACT-SIGMOD-SIGARTSymposium on Principles of Database Systems (PODS). Santa Barbara, USA:ACM,2001:102~113.
    [89] Chang Y.-C., Bergman L., Castelli V., Li C.-S., Lo M.-L., Smith J. R. The OnionTechnique: Indexing for Linear Optimization Queries[C].//Proceedings of the2000ACM SIGMOD International Conference on Management of Data(SIGMOD). Dallas, United States: ACM,2000:391~402.
    [90] Xin D., Chen C., Han J. Towards Robust Indexing for RankedQueries[C].//Proceedings of the32nd International Conference on Very LargeData Bases (VLDB). Seoul, Korea: VLDB Endowment,2006:235~246.
    [91] Hristidis V., Koudas N., Papakonstantinou Y. Prefer: A System for the EfficientExecution of Multi-Parametric Ranked Queries[J]. ACM SIGMOD Record,2001,30(2):259~270.
    [92] Das G., Gunopulos D., Koudas N., Tsirogiannis D. Answering Top-k QueriesUsing Views[C].//Proceedings of32nd International Conference on Very LargeData Bases (VLDB). Seoul, Korea: VLDB Endowment,2006:451~462.
    [93] Cao P., Wang Z. Efficient Top-k Query Calculation in DistributedNetworks[C].//Proceedings of the23rd Annual ACM Symposium on Principlesof Distributed Computing (PODC). St. John's, Canada: ACM,2004:206~215.
    [94] He Y., Shu Y., Wang S., Du X. Efficient Top-k Query Processing in P2PNetwork[C].//Proceedings of the15th International Conference on Database andExpert Systems Applications (DEXA). Zaragoza, Spain: Springer,2004:381~390.
    [95] Nejdl W., Siberski W., Thaden U., Balke W.-T. Top-k Query Evaluation forSchema-Based Peer-to-Peer Networks[C].//Proceedings of the3rd InternationalSemantic Web Conference. Hiroshima, Japan: Spinger,2004:137~151.
    [96] Balke W.-T., Nejdl W., Siberski W., Thaden U. Progressive Distributed Top-kRetrieval in Peer-to-Peer Networks[C].//Proceedings of the21st InternationalConference on Data Engineering (ICDE). Tokyo, Japan: IEEE Computer Society,2005:174~185.
    [97] Doulkeridis A. V. C. On Efficient Top-k Query Processing in Highly DistributedEnvironments[C].//Proceedings of the2008ACM SIGMOD InternationalConference on Management of Data (SIGMOD). Vancouver, Canada: ACM,2008:9~12.
    [98] Patt-Shamir B., Shafrir A. Approximate Top-k Queries in SensorNetworks[C].//Proceedings of the3rd Colloquium on Structural Information andCommunication Complexity. Chester, UK: Spinger,2006:319~333.
    [99] Babcock B., Olston C. Distributed Top-k Monitoring[C].//Proceedings of the2003ACM SIGMOD International Conference on Management of Data(SIGMOD). San Diego, USA: ACM,2003:28~39.
    [100] Metwally A., Agrawal D., Abbadi A. E. Efficient Computation of Frequent andTop-k Elements in Data Streams[C].//Proceedings of the10th InternationalConference on Database Theory (ICDT). Edinburgh, Scotland: Spinger,2005:398~412.
    [101] Mouratidis K., Bakiras S., Papadias D. Continuous Monitoring of Top-k Queriesover Sliding Windows[C].//Proceedings of the ACM SIGMOD InternationalConference on Management of Data (SIGMOD). Chicago, USA: ACM,2006:635~646.
    [102] Li J., Saha B., Deshpande A. A Unified Approach to Ranking in ProbabilisticDatabases[C].//Proceedings of the35th International Conference on Very LargeData Bases (VLDB).Lyon, France: VLDB Endowment,2009:502~513.
    [103] Jin C., Yi K. Dynamic Structures for Top-k Queries on Uncertain Data[J].Algorithms and Computation,2007:427~438.
    [104] Patil M., Shah R., Thankachan S. V. Fully Dynamic Data Structure for Top-kQueries on Uncertain Data[J].Arxiv Preprint,2010.
    [105]孙永佼,王国仁. P2P环境中不确定数据top-k查询处理算法[J].计算机研究与发展,2010,46(Suppl.):280~286.
    [106] Ye M., Liu X., Lee W.-C., Lee D. L. Probabilistic Top-k Query Processing inDistributed Sensor Networks[C].//Proceedings of the26th InternationalConference on Data Engineering (ICDE). Long Beach, USA: IEEE ComputerSociety,2010:585~588.
    [107] Li F., Yi K., Jestes J. Ranking Distributed Probabilistic Data[C].//Proceedings ofthe2009ACM SIGMOD International Conference on Management of Data(SIGMOD). Providence, USA: ACM,2009:361~374.
    [108] Guttman A. R-Trees: A Dynamic Index Structure for Spatial Searching[C].//Proceedings of the1984ACM SIGMOD International Conference onManagement of Data (SIGMOD). Boston, USA: ACM,1984:47~57.
    [109]张明波,陆锋,申排伟,程昌秀. R树家族的演变和发展[J].计算机学报,2005,28(3):289~300.
    [110] Tan K.-L., Eng P.-K., Ooi B. C. Efficient Progressive SkylineComputation[C].//Proceedings of the27th International Conference on VeryLarge Data Bases (VLDB). Roma, Italy: Morgan Kaufmann Publishers,2001:301~310.
    [111] B hm C., Fiedler F., Pfeifle M. Skyline Operator for Existentially UncertainData[C].//Proceedings of QUeST. Settle, USA: ACM,2009.
    [112] Chaudhuri S., Dalvi N., Kaushik R. Robust Cardinality and Cost Estimation forthe Skyline Operator[C].//Proceedings of the22nd International Conference onData Engineering (ICDE). Atlanta, USA: IEEE Computer Society,2006:64~64.
    [113] Ratnasamy S., Francis P., Handley M., Karp R., Schenker S. A ScalableContent-Addressable Network[C].//Proceedings of the2001conference onApplications, technologies, architectures, and protocols for computercommunications (SIGCOMM). San Diego, USA: ACM,2001:161~172.
    [114] Jagadish H. V., Ooi B. C., Vu Q. H. Baton: A Balanced Tree Strucure forPeer-to-Peer Networks[C].//Proceedings of the31st International Conference onVery Large Data Bases (VLDB). Trondheim, Norway: VLDB Endowment,2005:661~672.
    [115] Vlachou A., Norvag K. Bandwidth-Constrained Distributed SkylineComputation[C].//Proceedings of the8th ACM International Workshop on DataEngineering for Wireless and Mobile Access, Providence. Rhode Island, USA:ACM,2009:17~24.
    [116] Ding X., Jin H. Efficient and Progressive Algorithms for Distributed SkylineQueries over Uncertain Data[C].//Proceedings of the30th InternationalConference on Distributed Computing Systems (ICDCS). Genoa, Italy: IEEEComputer Society,2010:149~158.
    [117] Yu C. T., Chang C. C. Distributed Query Processing[J].ACM ComputingSurveys,1984,16(4):399~433.
    [118] Bernstein P. A., Goodman N., Wong E., Reeve C. L., James J., Rothnie B. QueryProcessing in a System for Distributed Databases(Ssd-1)[J].ACM Transactionson Database Systems,1981,6(4):602~625.
    [119] Faloutsos C., Roseman S. Fractals for Secondary Key Retrieval[C].//Proceedingsof the8th ACM SIGACT-SIGMOD-SIGART Symposium on Principles ofDatabase Systems (PODS).Philadelphia, USA: ACM,1989:247~252.
    [120] Goldstain J., Ramakrishnan R., Shaft U., Yu J. Processing Queries by LinearConstraints[C].//Proceedings of the1997ACM SIGACT-SIGMOD-SIGARTSymposium on Principles of Database Systems (PODS). Tucson, USA:1997:257~267.
    [121] Heo J.-S., Cho J., Whang K.-Y. The Hybrid-Layer Index: A Synergic Approachto Answering Top-k Queries in Arbitrary Subspace[C].//Proceedings of the26thInternational Conference on Data Engineering (ICDE). Long Beach, USA: IEEEComputer Society,2010:445~448.
    [122] Vlachou A., Doulkeridis C., Kotidis Y., Norvag K. Reverse Top-kQueries[C].//Proceedings of the26th International Conference on DataEngineering (ICDE). Long Beach, USA: IEEE Computer Society,2010:365~376.
    [123] Jestes J., Cormode G., Li F., Yi K. Semantics of Ranking Queries forProbabilistic Data[J].IEEE Transcations on Knowledge and Data Engineering,2010:1084~4627.
    [124] Ilyas I. F., Beskales G., Soliman M. A. A Survey of Top-k Query ProcessingTechniques in Relational Database Systems[J].ACM Computer Survey,2008,40(4):1~58.
    [125] Lian X., Chen L. Monochromatic and Bichromatic Reverse Skyline Search overUncertain Databases[C].//Proceedings of the2008ACM SIGMOD InternationalConference on Management of Data (SIGMOD). Vancouver, Canada: ACM,2008:213~226.
    [126] Wu X., Tao Y., Wong R. C. W., Ding L., Yu J. X. Finding the Influence Setthrough Skylines[C].//Proceedings of the12th International Conference onExtending Database Technology (EDBT). Saint-Petersburg, Russia: ACM,2009:1030~1041.
    [127]周傲英,金澈清,王国仁,李建中.不确定性数据管理技术研究综述[J].计算机学报,2009,32(1):1~16.

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700