高性能数据立方体及其语义研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
数据立方体技术是联机分析处理的主要手段。随着数据规模的扩大和维数的增加,数据立方体的操作代价急剧增加,需要进行优化处理。目前数据立方体的研究包括:物化、索引、近似、压缩、约简以及联机聚集等。
     形式概念分析理论(FCA)是以形式化的概念和概念层次为基础的数学分析工具。研究发现,概念格作为FCA的核心结构与数据立方体格都基于序结构,并且以数据仓库中的基本表作为形式背景,FCA理论中与概念相对应的等价特征组与数据立方体的覆盖等价类对数据单元具有相同的划分结果。本文将FCA和概念格理论引入数据立方体的研究,进行高性能数据立方体及其语义研究。研究表明,FCA及相关理论的引入,为数据立方体研究提供了一个新的有力的分析工具,利用该工具可以从数据内部特性入手,实现结构简单、体积较小且性能较优的数据立方体,并使数据立方体语义的理解更深刻,更易于实现。主要的研究工作如下:
     (1)提出基于形式概念格结构表达的数据立方体。
     首先对数据立方体与形式概念格进行相关分析,以概念格结构表达数据立方体,提出聚集概念和聚集概念格结构(ACL)。ACL是一种完全的数据立方体结构,由于其内具有相同聚集值的若干单元用一个聚集概念表示,因此能实现与商立方体相同的约简。另外,ACL结构中概念间的泛化和例化关系反映了约简后数据之间的层次关联,可表达比商立方体更清晰的数据立方体语义关系。
     其次,在ACL基础上,本文提出约简聚集概念结构(RAC)。基于形式概念分析理论中G偏序关系的性质研究发现,由于基本表的完备性,基本表中各个元组与ACL结构中的对象概念一一对应,因此基本表可以看作是所有对象概念的集合。RAC结构对ACL进一步约简,去除所有对象概念和特殊概念(Ω,null)。与基本表联合,RAC仍然是完全的立方体结构,但能实现比商立方体和ACL结构更大的约简,且仍能保持所有非对象聚集概念之间的语义关系。
     第三,基于形式概念分析理论中M偏序集的性质,提出基于ACL和RAC高效的查询方法。该方法利用属性概念内涵m″确定在ACL和RAC上的查询搜索路径,避免全范围的搜索,查询效率较高。
     最后,对形式背景进行讨论,将概念格的属性约简理论应用于数据立方体,通过合并相对必要属性、删除绝对不必要属性实现形式背景的简化,最终实现数据立方体相关操作的简化。
     (2)研究形式背景的属性蕴含关系,采用关系系统存储,提出基于属性蕴含的约简聚集概念数据立方体结构(RAC-AI)。
     根据形式概念分析理论,研究形式背景中描述概念格的两类非平凡属性蕴含:前提是伪内涵的蕴含和前提是真前提的蕴含。研究通过属性蕴含而不再依赖概念格结构确定概念内涵。在RAC结构基础上,提出两种基于属性蕴含的约简聚集概念数据立方体结构(RAC-AI):基于前提是伪内涵和基于前提是真前提的RAC-AI结构。RAC-AI结构摒弃RAC复杂的概念格结构,增加属性蕴含表,记录形式背景中所有非平凡的蕴含,并采用主流的关系系统存储所有非对象聚集概念。理论分析和实验结果表明,RAC-AI体积小,结构简单,构建和增量维护代价较低,查询响应速度也较快,是目前综合性能较优的数据立方体。
     (3)数据立方体语义关系的挖掘和应用直接影响联机分析处理的各种操作。本文研究基于FCA和概念格理论的数据立方体语义操作实现。
     首先讨论形式背景的净化和约简,消除形式背景中的冗余信息。现有的数据立方体语义研究都未考虑对数据本身进行约简,大量冗余信息的存在干扰了对语义的理解和发现。
     其次,利用形式概念分析的M偏序集理论,将M偏序关系作为生成概念分层的一种启发式的规则,形成属性级别的概念分层语义,而现有的概念分层一般只进行到维级别。
     第三,利用M偏序关系和非平凡的属性蕴含,实现数据立方体单元之间的上卷和下钻语义操作。通过分析等价特征组上界和下界的特性,获得等价特征组的结构,实现具有相同聚集值单元之间的上卷和下钻语义操作。利用非平凡的属性蕴含获取任意概念的父概念和子概念的内涵,实现不同聚集值单元的上卷和下钻语义操作。该方法不依赖任何特殊结构,实现从数据立方体任意单元出发的上卷和下钻操作,重复这个过程,能在数据立方体格中漫游,而不必生成完整的数据立方体。现有的数据立方体上卷和下钻语义操作一般只进行到视图级别,能达到单元级别的一般要依赖复杂特殊的结构实现。
     (4)范围查询是应用于多维数据立方体的有效的分析工具,预计算技术是提高范围查询响应速度的一种方法。本文在现有prefix sum技术和分块技术基础上,提出基于前缀区域的不规则方体的分块方法PRC,这种分块方法利于从起始单元开始的前缀区域聚集值的计算。对d维数据立方体(假定每维的度都为n),PRC在分块及区域求和时使用回归分割技术,在不增加额外空间的基础上,实现范围查询和数据更新的代价都为O(log~dn)。
Data cube is the main means to implement On Line Analytical Processing (OLAP). With sharply increasing of number of data dimensions and data scale, operation cost on it is very high and optimization should be done for data cube. The current technologies of data cube include partially materialization, index methods, approximate query, data compress, data reduction, online aggregation and so on.
     The Formal Concept Analysis (FCA) is a mathematical tool for analysis based on formal concept and concept hierarchy and has been widely used in the fields of knowledge discovery and data analysis. A comparative examination of data cube lattice and formal concept lattice shows that each of them is based on order-structure and furthermore if base table in data warehousing is looked as a concept context, equivalent character groups corresponding to concept in FCA and cover equivalence classes of cells in data cube have the same partitions for data. The theories of FCA and concept lattice are introduced into the research of data cube in this paper firstly and research on high performance data cube and implementation of semantics in data cube are carried out in succession. Study shows that introduce of FCA theories into data cube can provide a new powerful analytical tool for data cube. And using this tool and beginning with data characteristics, high performance data cube that has simple structure, low cost and small size can be obtained. Semantics in data cube can be understood deeply and be realized easily. Some achievements are obtained as follows:
     (1)Data cube structures based on concept lattice are put forward.
     Firstly, correlations of concept lattice and data cube lattice are analyzed and concept lattice is applied to data cube lattice. Aggregate Concept and Aggregate Concept Lattice (ACL) are brought forward in this paper. ACL can contain all aggregations of cube integrally. As a same aggregate value possessed by several cells is recorded with one aggregate concept, ACL can reduce data cube in same ratio with queotient cube. In addition relations of generalization and specialization among aggregate concepts in ACL can express the hierarchical relations among cells after cube reduction and keep semantic relations of data cube.
     Secondly, based on ACL, Reductive Aggregate Concept structure (RAC) is proposed in this paper. Based on properties of G partial relations in theories of FCA and completeness of base table in data warehousing, the one to one corresponding of object concepts derived from base table in ACL and tuples in base table is found. So base table can be looked as set of all object concepts. All object concepts and special (Ω, null) are removed based on ACL in RAC. Combined with base table, RAC is still a fully pre-computed cube and can realize reduction of data cube in more high ratio compared with ACL and quotient cube. And in RAC hierarchical relations among non-object concepts can be still preserved.
     Thirdly, based on properties of M partial relations in theories of FCA, an efficient method of query answering in ACL and RAC is proposed. The search path in ACL and RAC when query can be determined using intent of attribute concept m" avoiding whole scope searching and query can be achived effectively.
     Lastly, formal context is discussed and attribute reductive theories of concept lattice are applied to research of data cube. The base table can be reduced and raletive operations of data cube can be simplified through removing unnecessary attributes and combining relative necessary attributes.
     (2) The attribute implicit relations in formal context are studied and based on relational system, Reductive Aggregate Concept structure based on Attribute Implication (RAC-AI) is proposed in this paper.
     According to theories of FCA, two kinds of nontrivial attribute implication which describes concept lattice are discussed. One is attribute implication in which preconditions are real-preconditions and the other is attribute implication in which preconditions are quasi-intensions. The intent of concept can be obtained through attribute implications but not concept lattice. Based on RAC two structure of RAC-AI that preconditions are real-preconditions and quasi-intensions are put forward. In RAC-AI discarding the complex structure of concept lattice, the table of attribute implication is added that record all nontrivial attribute implication and all non-object aggregate concepts are stored using mainstream relational system. Both theories and experiments show that RAC-AI has simple structure and small cube size. The costs of construction and maintenance incrementally are low, too. Furthermore, speed of response for query is high. So RAC-AI is a good structure of cube with high integrative performance.
     (3)The semantic relations among cells in data cube are more important for efficient query and OLAP. In this paper implementation of semantic operation in data cube based on theories of FCA and concept lattice is studied.
     Firstly, purification and reduction of formal context are discussed to remove redundant information. The current semantic research did not consider to reduct data themselves and plenty of redundant information left over in data disturb semantic finding and understanding.
     Secondly, based on properties of M partial relations in theories of FCA, M partial relations can be used as heuristic rules for concept hierarchy and semantic concept hierarchy can be obtained at attribute level but not dimension level liking in current methods.
     Thirdly, using properties of M partial relations and nontrivial attribute implication in theories of FCA, semantic meaning of roll-up and drill-down among data cells is realized. Through analyzing properties of upper bound and lower bound of equivalent character groups, computational methods of upper bound and lower bound are proposed to obtaine the structure of equivalent character groups and then obtain semantic meaning of roll-up and drill-dwon among data cells whose aggregates are same in cube. Based on properties of M partial relations and nontrivial attribute implication, intent of parents and children concept of any concept can be obtained and then obtain semantic meaning of roll-up and drill-down among data cells whose aggregates are different in cube. This method avoids depending any special structure and achieves roll-up and drill-down operations starting from any data cell. Roaming can be done in data cube through repeating this process and full data cube is not necessary. The current methods of semantic data cube were achieved at view level commonly or achieved at cell level depending on complex special structure.
     (4) Range query is a very effective tool to analyze data for Multi-dimensional data cubes. Pre-computing is one mean to improve range query. This paper work on partition scheme and analyze how to organize cells and carry on prefix sum in partitions and propose a Prefix Region data Cube structure (PRC). PRC partitions data cube into several irregular boxes in favor of pre-computing of prefix region. Technology of recursive partition is used to partition data cube and organize cells of partitions to carry prefix sum in PRC. For data cube with d dimensions (suporsed that the cardinality of each dimension is n) without adding any space overhead compared to storing the original array both of the range query and update costs of PRC are O(lgo~dn).
引文
[1]W.H.Inmon.Building the Data Warehouse.J.Wiley & Sons Inc.,second edition,1996.
    [2]A.Berson,S.J.Smith.Data Warehousing,Data Mining and OLAP.McGraw-Hill,1997.
    [3]S.Chaudhuri,U.Dayal.An overview of data warehousing and OLAP technology.ACM SIGMOD,1997,26(1):517-526.
    [4]Y.Zhuge,H.Garcia-Molina,J.Hammer,et al.View maintenance in a warehousing environment.In SIGMOD,New York,USA,ACM,1995,pages 316-327.
    [5]S.Chaudhuri,U.Dayal,V.Ganti.Database technology for decision support systems.IEEE Computer,2001,34(12):48-55.
    [6]P.Gray,H.J.Watson.Present and future directions in data warehousing.DATA BASE,1998,29(3):83-90.
    [7]E.F.Codd,S.B.Codd,C.T.Salley.Providing OLAP(on-line analytical processing) to user-analysis:an it mandate.Technical report,E.F.Codd Associates,1993.
    [8]M.Golfarelli,D.Maio,S.Rizzi.The dimensional fact model:a conceptual model for data warehouses.International Journal of Cooperative Information Systems,1998,7(2-3):215-247.
    [9]Y.Cui,J.Widom,J.Wiener.Tracing the lineage of view data in a warehousing environment.ACM Transactions on Database Systems,2000,25(2):179-227.
    [10]R.Kimball.The Data Warehouse Toolkit.John Wiley & Sons,Inc.,1996.
    [11]C.Salka.Ending the MOLAP ROLAP debate:Usage based aggregation and flexible HOLAP(abstract).In ICDE,Washington.DC,USA,IEEE Computer Society,1998,page 180.
    [12]J.Gray,A.Bosworth,A.Layman,et al.Data cube:A relational aggregation operator generalizing group-by,cross-tab,and sub-totals.In ICDE,New Orleans,IEEE Computer Society,1996,pages 152-159.
    [13]S.Agarwal,R.Agrawal,P.Deshpande,et al.On the computation of multidimensional aggregates.In VLDB,San Francisco,USA,Morgan Kaufmann Publishers Inc.,1996,pages 506-521.
    [14]V.Harinarayan,A.Rajaraman,J.Ullman.Implementing data cubes efficiently.In SIGMOD,New York,USA,ACM,1996,pages 205-216.
    [15]Y.Zhao,P.M.Deshpande,Jeffrey F.Naughton.An Array-based Algorithm for Simultaneous Multidimensional.In:ACMSIGMOD,Tucson,Arizona,USA,ACM,1997,pages 159-170.
    [16]C.Ho,R.Agrawal,N.Megiddo,et al.Range queries in OLAP data cubes.In ACMSIGMOD,New York,USA,ACM,1997,pages 73-88.
    [17]M.Fang,N.Shivakumar,H.Garcia-Molina,et al.Computing iceberg queries efficiently.In VLDB,San Francisco,USA,Morgan Kaufmann Publishers Inc.,1998,pages 299-310.
    [18]S.Agarwal,R.Agrawal,A.Gupta.On computing the Data Cub.Research report 10026.IBM Almaden Research Center,San Jose,California 1996.
    [19]K.A.Ross,D.Srivastava.Fast Computation of Sparse Datacubes.In Proc.Int.Conf of Very Large Data Bases(VLDB),San Francisco,USA,Morgan Kaufmann Publishers Inc.,1997,pages 116-125.
    [20]K.Beyer,R.Ramakrishnan.Bottom-up computation of sparse and iceberg CUBEs.In ACM SIGMOD,New York,USA,ACM,1999,pages 359-370.
    [21]W.Wang,H.Lu,J.Feng,et al.Condensed cube:An effective approach to reducing data cube size.In ICDE,Washington.DC,USA,IEEE Computer Society,2002,pages 155-165.
    [22]L.Lakshmanan,J.Pei,J.Han.Quotient cube:How to summarize the semantics of a data cube.In VLDB,Hong Kong,Morgan Kaufmann,2002,pages 778-789.
    [23]K.Morfonios,Y.Loannidis.CURE for Cubes:Cubing Using a ROLAP Engine.In VLDB,Seoul,Korea,VLDB Endowment,2006,pages 379-390.
    [24]S.Sarawagi,M.Stonebraker.Efficient Organization of Large Multidimensional Arrays.In IEEE ICDE.Data Engineering,Washington.DC,USA,IEEE Computer Society,1994,pages 328-336.
    [25]Z.Shao,J.Han,D.Xin.MM-Cubing:Computing Iceberg Cubes by Factorizing the Lattice Space.In Proc.of the 16th International Conference on Scientific and Statitistical Database Management(SSDBM),Washington.DC,USA,IEEE Computer Society,2004,pages 213-222.
    [26]J.Li,Y.Li,J.Srivastava.Efficient aggregation algorithm on very large compressed data warehouses.J.Comput.Sci.Technol.,2000,15(3):213-229.
    [27]J.Li,J.Srivastava.Efficient aggregation algorithms for compressed data warehouses.IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,2002,14(3):515-529.
    [28]T.Johnson,D.Shasha.Some Approaches to Index Design fro Cube Forests.Data Engineering Bulletin,1997,20(1):27-35.
    [29]Y.Sismanis,A.Deligiannakis,N.Roussopoulos,et al.Dwarf:Shrinking the PetaCube.In SIGMOD,Madison,ACM,2002,pages 464-475.
    [30]Y.Sismanis,A.Deligiannakis,Y.Kotidis,et al.Hierarchical Dwarf:Shrinking the petacube Shrinking the petacube.In SIGMOD,California,USA,ACM,2003,pages 17-24.
    [31]L.Lakshmanan,J.Pei,Y.Zhao.QC-trees:An efficient summary structure for semantic OLAP.In ACM SIGMOD,New York,USA,ACM,2003,pages 64-75.
    [32]Y.Feng,K.Agraowal,A.Abbadi,et al.Range CUBE:Efficient Cube Computation by Exploiting Data Correlation.In ICDE,Washington.DC,USA,IEEE Computer Society,2004,pages 658-669.
    [33]J.Han,J.Pei,G.Dong,et al.Efficient Computation of Iceberg Cubes with Complex Measures.In SIGMOD,New York,USA,ACM,2001,pages 1-12.
    [34]D.Xin,J.Han,X.Li,et al.Star cubing:Computing Iceberg Cubes by Top-Down and Bottom-up Integration.In VLDB,Berlin,Morgan Kaufmann Publishers Inc.,2003,pages 476-487.
    [35]D.Xin,J.Han,et al.Computing Iceberg Cubes by Top-down and Bottom-up Intergration:The StarCubing Approach.IEEE Transactions on Knowledge and Data Engineering,2007,19(1):111-126.
    [36]D.Xin,Z.Shao,J.Han,et al.C-Cubing:Efficient Computation of Closed Cubes by Aggregation-Based Checking.In ICDE,Atlanta,GA,USA,IEEE Computer Society,2006,page 4.
    [37]X.Li,F.Han,H.Gonzalez.High-dimensional OLAP:A Minimal Cubing Approach.In VLDB,Toronto,Canada,VLDB Endowment,2004,pages 528-539.
    [38] Z. Chen, V. R. Narasayya. Efficient Computation of Multiple Group By Queries. In SIGMOD,New York, USA, ACM, 2005, pages 263-274.
    [39] A. Shukla, P. Deshpande, J. Naughton, et al. Storage estimation for multidimensional aggregates in the presense of hierarchies. In VLDB, San Francisco, USA, Morgan Kaufmann Publishers Inc., 1996, pages 522-531.
    [40] Y. Sismanis, A. Deligiannakis. The Polynomial Complexity of Fully Materialized Coalesced Cubes. In VLDB, San Francisco, USA, Morgan Kaufmann Publishers Inc., 2004, pages 540-551.
    [41] H. J. Karloff, M. Mihail. On the Complexity of the View-Selection Problem. In Proc. Of Symposium on Principles of Database Systems, New York, USA, ACM, 1999, pages 167-173.
    [42] N. Roussopoulos. View Indexing in Relational Databases. ACM Transactions on Database Systems, 1982,7(2): 258-290.
    [43] H. Gupta, V. Harinarayan, A. Rajaraman, et al. Index selection for OLAP. In ICDE, Washington. DC, USA, IEEE Computer Society, 1997, pages 208-219.
    [44] E. Baralis, S. Paraboschi, E. Teniente. Materialized view selection in a multidimensional database. In VLDB, San Francisco, USA, Morgan Kaufmann Publishers Inc., 1997, pages 156-165.
    [45] A. Shukla, P. M. Deshoande, J. F. Naughton. Materialized view selection in a multidimentional database. In VLDB, New York, USA, Morgan Kaufmann, 1998, pages 488-499.
    [46] H. Gupta. Selection of Views to Materialize in Data Warehouse. In ICDT, London, UK, Springer-Verlag, 1997, pages 98-112.
    [47] H. Gupta, I. Mumick. Selection of views to materialize under a maintenance cost constraint. In ICDT, London, UK, Springer-Verlag, 1999, pages 453-470.
    [48] H. Gupta, I. S. Mumick. Selection of views to materialize in a data warehouse. TKDE, 2005, 17(1): 24-43.
    [49] M. Lee, J. Hammer. Speeding up materialized view selection in data warehouses using a randomized algorithm. International Journal of Cooperative Information Systems, 2001, 10(3): 327-353.
    [50] P. Kalnis, N. Mamoulis, D. Papadias. View Selection Using Randomized Search. Data Knowledge, 2002,42(1): 89-111.
    [51] D. Theodoratos, T. Sellis. Data Warehouse Configuration. In VLDB, San Francisco, USA, Morgan Kaufmann Publishers Inc., 1997, pages 126-135.
    [52] C. Zhang, J. Yang. Genetic Algorithm for Materialized View Selection in Data Warehouse Environments. In Proc. Warehousing and Knowledge Discovery, London, UK, Springer-Verlag, 1999, pages 116-125.
    [53] J. Horng, J. Chang, B. Liu, et al. Materialized view selection using genetic algorithms in a data warehouse system. In Proc. Of World Congress on Evolutionary Comutation, Washington. DC, USA, IEEE Computer Society, 1999, pages 2221-2227.
    [54] C. Zhang, X. Yao, J. Yang. An evolutionary approach to materialized views selection in a data warehouse environment. IEEE Transactions on Systems, Man and Cybernetics, Part C, 2001, 31(3): 282-294.
    [55] J. Yu, X. Yao, C. Choi, et al. Materialized view selection as constrained evolutionary optimization.IEEE TRANSACTIONS ON SYSTEMS,MAN,AND CYBERNETICS PART C:APPLICATIONS AND REVIEWS,2003,33(4):458-467.
    [56]A.Bauer,W.Lehner.On solving the view selection problem in distributed data warehouse architectures.In SSDBM,Cambridge,MA,USA,IEEE Computer Society,2003,pares 9-11.
    [57]张宜红,徐宏炳,王能斌.实视图选取策略及其实现技术.软件学报,1998,9(12):927-931.
    [58]R.Chirkova,A.Y.Halevy,D.Suciu.A formal perspective on the view selection problem.In VLDB,Secaucus,N J,USA,Springer-Verlag New York,Inc.,2002,pages 216-237.
    [59]R.Chirkova,C.Li.Materializing views with minimal size to answer queries.In ACM SIGMOD,New York,USA,ACM,2003,pages 38-48.
    [60]S.Agrawal,S.Chaudhuri,V.Narasayya.Automated selection of materialized views and indexes in sql databases.In VLDB,San Francisco,USA,Morgan Kaufmann Publishers Inc.,2000,pages 496-505.
    [61]潭红星,周龙骧.多维数据实视图的选择.软件学报,2002,13(6):1090-1096.
    [62]G.Colliat.OLAP,relational and multidimensional database systems.SIGMOD,1996,25(3):64-69.
    [63]S.Kulkarni,M.K.Mohania.Concurrent maintenance of views using multiple versions.In International Database Engineering and Applications Symposium,Washington.DC,USA,IEEE Computer Society,1999,pages 254-259.
    [64]C.Li,M.Bawa,J.D.Ullman.Minimizing view sets without losing query-answering power.In ICDT,London,UK,Springer-Verlag,2001,pages 99-113.
    [65]Y.Kotidis,N.Toussopoulos.DynaMat:A Dynamic View management System for Data Warehouse.In Proc.ACM SIGMOD,New York,USA,ACM,1999,pages 371-382.
    [66]S.Acharya,P.Gibbons,V.Poosala,et al.The aqua approximate query answering system.In ACM SIGMOD,New York,USA,ACM,1999,pages 574-576.
    [67]S.Acharya,P.Gibbons,V.Poosala.Aqua:A Fast Decision Support Systems Using Approximate Query Answers.In VLDB,New York,USA,Morgan Kaufmann,1999,pages 754-757.
    [68]V.Poosala,V.Ganti.Fsat Approximate Answers to Aggregate Queries on a Data Cube.In Proceedings of the 11th International Conference on Scientific on Scientific and Statistical Database Management,Washington.DC,USA,IEEE Computer Society,1999,pages 24-33.
    [69]D.Gunopulos,G.Kollios,V.J.Tsotras.Approximating Multi-dimensional Aggregate Range Queries over Real Attributes.In ACM SIGMOD,New York,USA,ACM,2000,pages 463-474.
    [70]F.Furfaro,G.M.Mazzeo,D.Sacca,et al.Hierarchical,Binary,Histograms for Summarizing Multi-dimensional Data.In Proceedings of the 2005 ACM symposium on Applied computing,New York,USA,ACM,2005,pages 598-603.
    [71]S.Chaudhuri,G.Das,M.Datar,et al.Overcoming Limitations of Sampling for Aggregation Queries.In ICDE,Washington.DC,USA,IEEE Computer Society,2001,pages 534-542.
    [72]S.Acharya,P.B.Gibbons,V.Poosala.Compressional Samples for Approximate Answering of Group-By Queries.In ACM SIGMOD,New York,USA,ACM,2000,pages 487-498.
    [73]S.Chaudhuri,G.Das,V.Narasayya.A Robust Optimization-Based Approach for Approximate Answering of Aggregate Queries.In ACM SIGMOD,New York,USA,ACM,2001,pages 295-306.
    [74]B.Babcock,S.Chaudhuri,G.Das.Dynamic Sample Selection for Approximate Query Processing.In ACM SIGMOD,New York,USA,ACM,2003,pages 539-550.
    [75]J.S.Vitter,M.Wang,B.Iyer.Data Cube Approximation and Histograms via Wavelets.In Proceedings of the seventh international conference on Information and knowledge management,New York,USA,ACM,1998,pages 96-104.
    [76]J.S.Vitter,M.Wang.Approximate computation of multidimensional aggregates of sparse data using wavelets.In SIGMOD,New York,USA,ACM,1999,pages 193-204.
    [77]Y.Matias,J.S.Vitter,M.Wang.Dynamic Maintenance of Wavelet-based Histograms.In VLDB,San Francisco,USA,Morgan Kaufmann Publishers Inc.,2000,pages 101-110.
    [78]K.Chakrabarti,M.Garofalakis,R.Rastogi,et al.Approximate query processing using wavelets.In VLDB,Secaucus,NJ,USA,Springer-Verlag New York,Inc.,2001,pages 199-223.
    [79]Y.L.Wu,D.Agrawal,A.E.Abbadi.Using Wavelet Decomposition to Support Pregressive and Approximate Range-Sum Queries over Data Cubes.In Proceedings of the ninth international conference on Information and knowledge management,New York,USA,ACM,2000,pages 414-421.
    [80]Y.Feng,S.Wang.Compressed data cube for approximate OLAP query processing.Journal of Computer Science and Technology,2002,5(17):625-635.
    [81]A.Cuzzocrea.Overcoming limitations of Approximate Query Answering in OLAP.In Proceedings of the 9th International Database Engineering & Application Symposium,Montreal,Canada,IEEE Computer Society,2005,pages 200-209.
    [82]N.Roussopoulos,Y.Kotidis,M.Roussopoulos.Cubetree:Organization of and Bulk Incremental Updates on the Data Cube.In ACM SIGMOD,New York,USA,ACM,1997,pages 89-99.
    [83]D.Barbara,M.Sullivan.Quasi-cube:Exploiting approximation in multidimensional database.In ACMS IGMOD,New York,USA,ACM,1997,pages 12-17,.
    [84]D.Barbara,X.Wu.Using loglinear models to compress datacube.In Proceedings of the First International Conference on Web-Age Information Management,London,UK,Springer-Verlag,2000,pages 311-322.
    [85]J.Shanmugasundaram,U.Fayyad,P.S.Bradley.Compressed Data Cubes for OLAP Aggregate Query Approximation on Continuous Dimensions.In ACM SIGKDD,New York,USA,ACM,1999,pages 223-232.
    [86]S.Geffner,D.Agrawal,A.E.Abbadi,et al.Relative prefix sums:an efficient approach for querying dynamic OLAP data cubes.In ICDE,Washington.DC,USA,IEEE Computer Society,1999,pages 328-335.
    [87]W.Liang,H.Wang,M.Orlowska.Range queries in dynamic OLAP data cubes.Data and Knowledge Engineer,2000,34(1):21-38.
    [88]G.Chan,Q.Li,L.Feng.Design and selection of materialized views in a data warehousing environment:Acase study.In DOLAP,New York,USA,ACM,1999,pages 42-47.
    [89]M.Riedewald,D.Agrawal,A.E.Abbadi.Flexible data cubes for online aggregation.In 8th Internal Conf on Database Theory,London,UK,Springer-Verlag,pages 2001,159-173.
    [90]高宏,李建中,李金宝.数据仓库系统中层次式cube存储结构.软件学报,2003,14(7):1258-1266.
    [91]S.Geffner,D.Agrawal,A.E.Abbadi.The dynamic data cube.In EDBT,London,UK, Springer-Verlag,2000,pages 237-253.
    [92]M.Riedewald,D.Agrawal,A.E.Abbadi,et al.Space efficient data cubes for dynamic environments.In Proc.of the International Conf on Data Warehousing and Knowledge Discovery,London,UK,Springer-Verlag,2000,pages 24-33.
    [93]C.Poon.Dynamic orthogonal range queries in OLAP.Theoretical Computer Science,2003,296(3):487-510.
    [94]F.Bengtsson,J.Chen.Space-Efficient Range-Sum Queries in OLAP.In Proc.of 6th International Conf.on Data Warehousing and Knowledge Discovery,Amsterdam,The Netherlands,Elsevier Science Publishers B.V.,2004,pages 83-102.
    [95]J.M.Hellerstein,P.J.Haas,H.J.Wang.Online aggregation.In SIGMOD,New York,USA,ACM,1997,pages 171-182.
    [96]R.Schmidt,C.S.Propolyne.A fast wavelet-based technique for progressive evaluation of polynomial range-sum queries.In 8th International Conference on Extending Database Technology,Berlin,Springer-Verlag,2002,pages 664-681.
    [97]R.Schmidt,C.Shahabi.How to evaluate multiple range-sum queries progressively.In Symposium onPrinciples of Database Systems,New York,USA,ACM,2002,pages 133-141.
    [98]M.Riedewald,D.Agrawal,A.E.Abbadi.Pcube:Update-efficient online aggregation with progressive feedback and error bounds.In SSDBM,Santa Barbara,CA,USA,University of California at Santa Barbara,2000,pages 95-108.
    [99]I.Lazaridis,S.Mehrotra.Progressive approximate aggregate queries with a multiresolution tree structure.In SIGMOD,New York,USA,ACM,2001,pages 401-412.
    [100]李红松,黄厚宽.无需附加空间的数据立方体联机聚集.软件学报,2006,17(4):806-813.
    [101]A.Cuzzocrea.Improving range-sum query evaluation on data cubes via polynomial approximation.Data & Knowledge Engineering,2006,56(2):85-121.
    [102]A.Cuzzocrea,W.Wei.Approximate range-sum query answering on data cube with probabilistic guarantees.Journal of Intelligent Information Systems,2007,28(2):161-197.
    [103]D.W.Kim,E.J.Lee,M.Kim,et al.An efficient processing of range-MIN/MAX queries over data cube.Information Sciences,1998,112:223-237.
    [104]H.Li,T.Ling,S.Lee.Range-Max/Min Query in OLAP Data Cube.In Proceedings of the 11th International Conference on Database and Expert Systems Applications,London,UK,Springer-Verlag,2000,pages 467-476.
    [105]K.Wang,Y.Jiang,J.X.Yu,et al.Divide-and-Approximate:A Novel Constraint Push Strategy for Iceberg Cube Mining.IEEE Transaction on Knowledge and Data Engineering,2005,17(3):354-368.
    [106]L.Chou,X.Zhang.Computing complex iceberg cubes by multiway aggregation and bounding.In Proc.Of the 6th Intl.Conf On Data Warehousing and Knowledge Discovery,London,UK,Springer-Verlag,2004,pages 108-117.
    [107]X.Zhang,P.Chou,G.Dong.Efficient Computation of Iceberg Cubes by Bounding Aggregate Functions.IEEE Transaction on Knowledge and Data Engineering,2007,19(7):903-918.
    [108]孙延凡,陈红,王珊.FreeCube:有效减少Data Cube体积.计算机科学,2003,30(增 刊):241-245.
    [109]孙延凡,陈红.GSFC-基于图结构的Free Cube存储方法.计算机研究与发展,2004,41(10):1652-1660.
    [110]J.Feng,H.Si,Y.Feng.Indexing and incremental Updating Condensed Data Cube.In Proc.Of the 15th Intl Conf.on Scientific and Statistical Database Management,Cambridge,MA,USA,IEEE Computer Society,2003,pages 23-32.
    [111]J.Feng,Q.Fang,H.Ding.PrefixCube:prefix-sharing condensed data cube.In Proc.Of ACM 7th Intl Workshop on Datawarehouse and OLAP,New York,USA,ACM,2004,pages 38-47.
    [112]L.Xiang,Y.Feng.H.Gui.Construction and compression of Dwarf.Journal of Zhejiang University SCIENCE,2005,6A(6):519-527.
    [113]李盛恩,王珊.封闭数据立方体技术研究.软件学报,2004,15(8):1165-1171.
    [114]X.Dong,H.huang,H.Li.HQC:An efficient method for rolap with hierarchical dimensions.In RSFDGrC2005,Berlin,Springer-Verlag,2005,pages 211-220.
    [115]杨科华,董逸生,胡孔法.语义Cube的层次聚类方法.计算机研究与发展,2005,42(11):1989-1996.
    [116]C.Li,K.Tung,S.Wang.Incremental maintenance of quotient cube based on galois lattice.Journal of Computer Science & Technology,2004,19(3):302-308.
    [117]C.Li,C.Gao,K.Tung,et al.Incremental Maintenance of Quotient Cube for Median.In ACM SIGKDD,New York,USA,ACM,2004,pages 226-235.
    [118]向隆刚,龚健雅.一种高度浓缩和语义保持的数据立方.计算机研究与发展,2007,44(5):837-844.
    [119]R.Wille.Restructuring lattice theory:An approach based on hierarchies of concepts.Ordered Sets.Reidel,Dordrecht,1982,445-470.
    [120]B.Ganter,R.Wille.Formal Concept Analysis.Mathematical Foundations,Berlin,Springer,1999.
    [121]B.Ganter.Two basic algorithms in concept analysis.FB4-Preprint No.831,TH Darmstadt,1984.
    [122]B.Ganter.Formal Concept Analysis:algorithmic aspects.Preliminary lecture notes,Darmstadt,2002,http://www.math.tu-dresden.de/-ganter/c102/.
    [123]G.Stumme,R.Taouil,Y.Bastide,et al.Fast computation of concept lattices using data mining techniques.In Proceedings of 7~(th) International Workshop on Knowledge Representation Meets Databases(KRDB 2000),Berlin,Germany,2000,pages 129-139.
    [124]M.Chein.Algorithme de recherch6 des sons-mmatrices d'une matrice.Bull.Math.Soc.Sci.R,S.Roumanie,1969,13(61):21-25.
    [125]L.Nourine,O.Raynaud.A fast algorithm for building lattices.Information Processing Letters,1999,71:199-204.
    [126]R.Godin,R.Missaoui,H.Alaoui.Incremental concept formationalgorithms based on Galois(concept) lattices.Computational Intelligence,1995,11(2):246-267.
    [127]J.P.Bordat.Calcul pratique du treillis de Galois d'une correspondance.Mathematiques et Sciences Humaines,1986,24:31-47.
    [128]C.Lindig.Fast concept analysis.In Working with Conceptual Structures-Contributions to ICCS 2000,Shaker Verlag,Aachen,Germany,2000,pages 152-161.
    [129]C.Carpineto,G.Romano.Galois:an order-theoretic approach to conceptual clustering.In Proceedings of 10th International Conference on Machine Learning(ICML'93),Amherst:Elsevier,1993,pages 33-40.
    [130]D.Merwe,D.Kourie.AddAtom:an incremental algorithm for constructing concept-and concept sublattices.Technical report of the Department of Computer Science,University of Pretoria,South Africa,2002.
    [131]D.Merwe,S.Obiedkov,D.Kourie.AddIntent:a new incremental algorithm for constructing concept lattices.In Proceedings of 2nd International Conference on Formal Concept Analysis(ICFCA 2004),Berlin,Springer-Verlag,2004,pages 372-358.
    [132]W.X.Zhang,L.Wei,J.J.Qi.Attribute reduction theory and approach to concept lattice.In Proceeding of the 10th International Conference on Rough Sets,Fuzzy Sets,Data Mining,and Granular Computing(RSFDGrC),Science in China Press,co-published with Springer-Verlag GmbH,2005 pages 713-726.
    [133]张文修,魏玲,祁建军.概念格的属性约简理论与方法.中国科学(E辑),2005,35(6):628-639.
    [134]C.Lindig,G.Snelting.Assessing modular structure of legacy code based on mathematical concept analysis.In Proceedings of the International Conference on Software Engineering (ICSE 97),Boston,USA,1997,pages 349-359.
    [135]A.Deursen,T.Kuipers.Identifying objects using cluster and concept analysis.In Proceedings of the 21st International Conference on Software Engineering(ICSE'99),Los Angeles,California,1999,pages 246-255.
    [136]R.Godin,G.Mineau,R.Missaoui,et al.Applying concept formation methods to software reuse.International Journal of Knowledge Engineering and Software Engineering,1995,5(1):119-142.
    [137]C.Lindig.Concept-based component retrieval.In Proceedings of IJCAI-95 Workshop on Formal Approaches to the Reuse of Plans,Proofs,and Programs,Montreal,1995,pages 21-25.
    [138]R.Godin,H.Mili,G W Mineau,et al.Design of class hierarchies based on concept (Galois)lattices.Theory and Application of Object Systems,1998,4(2):117-134.
    [139]G.Snelting,T.Tip.Reengineering class hierarchies using concept analysis.In Proceedings of the 6~(th) ACM SIGSOFT Symposium on the Foundations of Software Engineering,New York,USA,ACM,1998,pages 99-110.
    [140]P.Njiwoua,E.Mephu Nguifo.Forwarding the choice of bias,LEGAL-F:using feature selection to reduce the complexity of LEGAL.In Proceedings of Benelux Conference on Machine Learning(BENELEARN),Netherlands,1997,pages 89-98.
    [141]Z.Xie,W.Hsu,Z.Liu,et al.Concept lattice based composite classifiers for high predictability.Journal of Experimental and Theoretical Artificial Intelligence,2002,14:143-156.
    [142]C.Carpineto,G.Romano.Information retrieval through hybrid navigation of lattice representations.International Journal of Human-Computer Studies,1996,45:553-578.
    [143]D.Richards,P.Compton.Combining formal concept analysis and ripple down rules to support reuse.In Proceedings of Software Engineering Knowledge Engineering(SEKE'97), Skokie USA,Springer Verlag,1997,pages 177-184.
    [144]G.D.Oosthuizen.The Application of Concept Lattice to Machine Learning.Technical Report,University of Pretoria,South Africa,1996.
    [145]M.Siff,T.Reps.Identifying modules via concept analysis.IEEE Transaction on Software Engineering,1999,25(6):749-768.
    [146]P.Tonella.Using a concept lattice of decomposition slices for program understanding and impact analysis.IEEE Trans on Software Engineering,2003,29(6):495-509.
    [147]F.H.Gatzemeier,O.Meyer.Text schema mining using graphs and formal concept analysis.In Conceptual Structures-Integration and Interfaces(ICCS),Berlin,Springer-Verlag,2002,pages 107-121.
    [148]Y.Y.Yao.Concept lattices in rough set theory.In Proceedings of 2004 AnnualMeeting of the North American Fuzzy Information Processing Society(NAFIPS),2004,pages 796-801.
    [149]G.D.Oosthuizen.Rough sets and concept lattices.In Proc.of Rough sets,and Fuzzy sets and Knowledge Discovery(RSKD '93).London:Springer-Verlag,pages 24-31,1994.
    [150]R.E.Kent.Rough concept analysis.In Rough Sets,and Fuzzy Sets and Knowledge Discovery(RSKD '93),London:Springer-Verlag,pages 248-255,1993.
    [151]J.Saquer,J.S.Deogun.Concept approximations based on rough sets and similarity measures.Appl Math ComputSci,2001,11(3):655-674.
    [152]胡学钢.从数据库中提取知识的模型研究:[学位论文].合肥:合肥工业大学,2000.
    [153]曲开社,翟岩慧.偏序集、包含度与形式概念分析.计算机学报,2006,29(2):219-226.
    [154]C.Hahn,S.Warren,J.London.Edited synoptic cloud reports from ships and land stations over the globe.1994,http://cdiac.esd.Ornl.gov/cdiac/ndps/ndp026b.html.
    [155]J.Han,M.Kamber.Data Mining Concepts and Techniques.second edition,Morgan Karfmann Publisher,2001.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700