列存储系统的若干关键技术研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
当今,许多企事业单位的高管人员,迫切需要高性能的分析型数据库管理系统,用于分析大数据,辅助决策。列存储技术在处理大数据方面,显著优于行存储技术,所以吸引了许多学者的研究。列存储技术的研究取得了一些成果,但是关于列存储系统的存储优化、查询优化和查询执行等关键技术还有待进一步研究。
     在列存储系统中,按列存储数据,使得在查询处理时能够只读取查询所需要的列,避免读入无关的列。按列存储的数据具有很好的可压缩性,在查询处理过程中可以直接对压缩数据进行处理。这两点使得列存储系统在查询处理过程中的数据I/O效率比行存储高得多,有利于提高查询处理的速度。另一方面,对按列存储的数据进行查询处理时,需要将分散存储在不同位置的多列数据进行元组重构。元组重构形成了列存储系统中的一个重要性能瓶颈。
     本文以国家工信部核高基重大专项课题“数据仓库专用DBMS原型系统研制”(2010ZX01042-001-003-04)和国家自然科学基金项目“数据仓库中行列混合存储引擎的优化模型”(61070031)为依托,以提高列存储系统的查询性能为目标,对影响列存储系统性能的一些关键技术进行了深入研究。本文主要做了以下几个方面的工作:
     (1)研究列存储系统中数据存储布局对元组重构性能的影响后,提出了一个以列存储为基础,结合组合多列的存储模型。该模型对历史查询使用数据的方式进行分析,分析一个逻辑表中的哪些列经常一起被查询输出,将这些列进行物化,供后续查询使用。对需要物化的多列,首先形成逻辑上的一个投影并进行水平划分,然后对划分的每一块,在块内按列组织并压缩后存储。这样能充分利用列存储的优势,同时也能减少元组重构的开销,为后续查询提供了最优存储。
     (2)传统B+树索引是稀疏的,对其搜索的路径较长,对其进行插入和搜索的效率较低,不适合分析型应用。对此,本文提出了一种精简的、适合于列存储的B+树结构——RB+树。RB+树几乎是一棵满的平衡二叉树,一页能容纳更多的索引项,因而能用较矮的RB+树存储大量的索引项。按这种结构树组织数据,搜索数据的路径短,搜索效率高。关于RB+树索引的创建和维护,分别对行号索引和列值索引提出了自底向上的高效创建方法和维护方法。
     (3)研究了数据库中的数据压缩技术,包括轻量级的压缩方法、压缩粒度的选择和压缩方法的选择策略。特别对位图压缩技术进行了深入的研究,提出了一种富扩展划分位图索引和一种自适应的划分字对齐压缩方法(APWAH)。富扩展划分位图包含了一些统计信息,为直接使用划分位图进行聚集操作提供了方便。(?)PWAH能根据位向量中0-1分布情况,自适应地选择最合适的0-填充段长和1-填充段长,提高了压缩效率和查询处理效率。同时研究了区级压缩,区级压缩同时具有压缩率高和压缩管理方便的优点。本文提出根据数据的分布情况,自适应地选择区的大小。一个区由若干块构成,每区的块数不一定相同。这样可以根据相邻数据块之间的相似性,灵活地进行区划分,不受区大小的限制,保证区内数据分布特征相似性强,区之间数据分布特征相似性弱,以便对每个区选择更合适的压缩方法。关于压缩方法的选择,建立了一个数据分布特征模型,并根据提出的模型建立了选择压缩方法的决策方案。
     (4)研究缓冲区管理技术,提出了一种适应于列存储系统的三级缓冲区管理方案。在全局级,使用两条链分别管理系统的自由缓冲区和所有查询使用的缓冲区,对使用的缓冲区按综合自适应置换策略进行置换。一个缓冲区是否可被置换,不仅考虑正在执行的查询,同时还考虑了一定量的后续查询。在查询级,每个执行的查询都用一条主链管理它使用的缓冲区,一个查询处理中每出现一个并发操作阶段,都从主链中产生一条相应的分支链来管理并发操作阶段使用的缓冲区。在操作阶段级,对每个操作阶段设计了一种灵活且自适应的缓冲区分配策略(MG-x-y-z)和与它的访问模式相适应的置换策略。提出的三级缓冲区管理方案充分考虑了分析型工作负载的特点、数据访问模式特点和可用的缓冲区情况,也考虑了数据预取。
     (5)研究列存储系统中的物化技术后,针对现有物化技术的不足,提出了基于带值路径的物化技术(PVM)。PVM在物理执行树中增加了带值路径,并使用传递块来保存执行的中间结果。通过这种方法,避免了查询执行过程中对原始数据的重读。对带值路径中包含的位向量,使用本文提出的APWAH压缩方法进行压缩,减少或避免了因中间结果太大而造成的额外I/O。
     本文研究的内容是我们所研制的原型系统中的关键技术。研究的结果对提高系统的总体性能起到了决定性的作用。
Analytical DBMS with high performance is greatly required by high level managers in many enterprises for big data analysis and decision making assistance. Column-store is still on research due to its high efficiency in big data query, and many research progenies have been made. However, some key technologies in column-stores such as storage optimization, query optimization and query execution are still needed to be further studied.
     Data in relational table is stored by columns in column-stores. Processing a query by reading columns that are only needed by the query is able to avoid reading unnecessary columns in the table from disk in column stores. The data stored by columns has better compressibility than that stored by row, and queries can be evaluated directly on compressed data. These two characteristics make column store better input efficiency than traditional row store, and improve query processing performance. On the other hand, when processing queries on data stored by column, attribute values stored separately in different columns are extracted to reconstruct into tuples. Tuple reconstruction is an important performance bottleneck in column stores.
     Relying on the subject of major science and technology project "Development of a DBMS prototype for data warehouse"(Grant No.2010ZX01042-001-003-04) supported by Ministry of Industry and Informatilization of China, and on the project "An optimized storage model with row and column mixture in data warehouse"(Grant No.61070031) supported by National Natural Science Foundation of China, this thesis studied some key technologies in column-stores, aiming at improving query performance. The major contributions of this thesis are as follows:
     (1)We proposed a combined multi-column store model after studied the effects of data storage layout on performance of tuple reconstruction. System analyses automatically users' query history, the column groups that were output together frequently from a logical table are generated, then every one of these column groups are materialized. For a column group needed to be materialized, a logical projection is firstly formed and a horizontal partiontion to the projection is generated. With a block, data are organized by column, then compressed and stored. This approach takes fully advantage of column store, and reduces substantially cost of tuple-reconstruction, and provides optimal storage for later queries.
     (2) Traditional B+tree indices are sparse, and searching paths within them are long. It is not fit for analytic workload due to its low insertion and search efficiency. In the thesis, we proposed the structure of reduced B+tree (RB+tree for short) which suites for column stores. A RB+tree is almost a full and balanced binary tree, in which each node is able to accommodate more indexed entries, therefore, search paths are short. So, RB+tree is able to improve the performance of query processing.A bottom-to-up creation and maintaining approach for rowID indices and column value indices in RB+tree was presented in this thesis.
     (3) We studied data compression technologies in database, including lightweight compression, the selection strategy of compression granularity and compression scheme. Based on intensive study of bitmap compressions, we proposed a rich-extend partition bitmap index approach and a self-adaptive partition word-aligned bit vector compression approach (APWAH). Each rich-extend partition bitmap index contained the statistics information which could facilitate directive aggregate operation on partition bitmaps. APWAH adaptively selectes the most suitable0-fill and1-fill segment length according to0-1distribution in a bit vector, thus it could hava better compression effects and improve query performance. Sector-level compression was studied as well. Two merits of sector-level compression were its high compressing ratio and its simplity of compression management. This thesis also presented that compressing subsystem in column stores should select sector size adaptive for data distribution. One sector is composited with some blocks, the number of blocks in different sectors might be different. Thus compressing subsystem could flexibly partion a column into some sectors according to similarity of adjoining block, and it is not nesesary that all sectors are of the same size. This ensures that the similarity of data in a sector was strong, and the similarity of data between adjoining sectors was weak, to select most appropriate compressing approach for each sector. Moreover, a data distribution feature model was built, and a decision scheme for selecting compressing approach was created with proposed model.
     (4) We proposed a3-level buffer management solution suitable for column-stores after studied buffer managing technologies. On the global level, a free buffer chain and in-use buffer chain are constructed, resectively. The blocks in the in-use chain are replaced by general adaptive replacement policy. The buffer replacement decision is made according to not only the running queries but also some afterwards queries. On the query level, a main chain is maintained for every query' in-use buffer. When a concurrent operation phase occurs, a branch buffer chain is generated from the main buffer chain to manage buffer used by the concurrent phase, respectively. On the operation phase level, we designed a flexible and adaptive buffer allocation approach-MG-x-y-z for every operation phase and replacement policy suitable for access pattern of the operation phase. The proposed3-level buffer strategy took fully into account not only the characteristics of analytic workload, pattern of accessing data and available buffer, but also other factors like pre-fetching data etc.
     (5) Finaly, we proposed a materializing technology based paths with value (PVM) after studied the materialization strategies in column stores. PVM added paths with value to query execute tree, and utilized passing-block to hold intermediate results. This technique avoided reading original data blocks repeatingly. For the intermediate results of bit vectors in PVM, executing subsystem calle the proposed APWAH compression approach to compress it, which could reduce or avoid large I/O cost due to large bit vector in intermediate results.
     The mentioned technologies above were the key components of our column store DBMS prototype system. These researched results improve the overall performance of the system.
引文
[1]Martin L. Kersten, Stratos I, et al. The researcher's guide to the data deluge_querying a scientific database in just a few seconds[C].//Proc of the 37 Int Conf on very large data bases. Seattle, Washington:ACM,2011
    [2]Dawei J,Beng Chin O, et al. The performance of mapreduce:an in-depth study[C].//Proc of PVLDB 2010,pp:472-483
    [3]Daniel.J.Abadi. Query execution in column-oriented database systems [D]. Boston:MIT, 2008
    [4]Mike S, Daniel J, Adam B, et al. C-Store:A column oriented DBMS [C].//Proc of the 31st Int Conf on Very Large Data Bases. New York:ACM,2005, pp:553-564
    [5]Daniel J. Abadi, Samuel R. Madden, and Miguel Ferreira. Integrating compression and execution in column-oriented database systems. In SIGMOD, Chicago, IL, USA,2006, pp: 671-682
    [6]Alan Halverson, Jennifer L.Beckmann, Jeffrey F.Naughton, and David J.DeWitt. A comparison of C-store and row-store in a common framework[C].//Proc of the 32nd VLDB Conference, Seoul, Korea,2006
    [7]D.S.Batory. On searching transposed files. ACM TODS, Vol.4, No.4, Dec.,1979, pp:531-544
    [8]Harry K.T. Wong, Hsiu-Fen Liu, et el. Bit transposed files[C].//Proc of VLDB 85, Stockholm, pp:448-457
    [9]George P, and Setrag N. A decomposition storage model[C].//Proc of the 1985 ACM SIGMOD Int Conf on Management of Data. New York:ACM,1985, pp:268-279
    [10]P. Boncz, and M.Kersten. Monet:An impressionist sketch of an advanced database system[C].//Proc of of Basque International Workshop on Information Technology, San Sebastian, Spain, July 1995
    [11]http://www.monetdb.org
    [12]P.Boncz. Monet:A next-generation DBMS kernel for query-intensive applications [D]. Universiteit van Amsterdam, Amsterdam, The Netherlands, May 2002
    [13]Peter B, Marcin Z, and Niels N. Monetdb/X100:hyper-pipelining query execution[C]. //Proc of the 2005 CIDR Conf. New York:ACM,2005, pp:225-237
    [14]Roger M., and Blaine F. Sybase IQ multiplex-designed for analytics[C].//Proc of the 30th Int Conf on Very Large Data Bases. San Francisco:Morgan Kaufmann,2004, pp:1227 -1230
    [15]F.Teklitz. Sybase Business Intelligence and data warehousing solutions for Sybase IQ. DM Review Magazine 2005
    [16]http://www.sybase.com
    [17]Anastassia A, David J, Mark D, et al. Weaving relations for cache performance[C].//Proc of the 27th Int Conf on Very Large Data Bases. San Francisco:Morgan Kaufmann,2001, pp:169-180
    [18]R.A Hankins. Data Morphing:An adaptive, cache-conscious storage [C].//Proc of VLDB'03, Berlin,2003.
    [19]Daniel.J.Abadi. Column-oriented database systems[C].//Proc of VLDB'09, Lyon Franch, 2009, pp:449-450
    [20]Allison L., and David J. Read-optimized databases, in depth [C].//Proc of the 34th Int Conf on Very Large Data Bases. New York:ACM,2008, pp:502-513
    [21]A.Pavlo, E.Paulson, and A.Rasin. A Comparison of approaches to large-scale data analysis [C].//Proc of the 2009 ACM SIGMOD Int Conf on Management of Data. Providence, Rhode Island, USA,2009
    [22]Daniel J, Samuel R, and Nabil H. Column-stores vs. row-stores:how different are they really?[C].//Proc of the 2008 ACM SIGMOD Int Conf on Management of Data.New York: ACM,2008, pp:967-980
    [23]Stavros Harizopoulos,Velen Liang, Daniel J.Abadi et al. Performance tradeoffs in read-optimized databases [C]//Proc of the 32nd Int Conf on Very Large Data Bases. New York:ACM,2006, pp:487-498
    [24]Bruno.N. Teaching an old elephant new tricks[C].//In CIDR'09,2009, Asilomar, California, USA.
    [25]Daniel Boβwetter. SPAX-PAX with super-pages[C].//ADBIS 2009, LNCS 5739,2009, pp: 362-377
    [26]Xiangwu Ding, Mei Wang, and Jiajin Le. A column-based self-organizing hybrid storage model for data warehouse[C].//Proc. the 2nd International Conference on Information Science and Engineering. ICISE, Vol.7,2010, pp:4819-4822
    [27]Xiangwu Ding, Wenbing Yu, and Jiajin Le. An adaptive projection strategy and its implementation in column stores[C].//Proc. of the IEEE Joint International Information Technology and Artificial Intelligence Conference (ITAIC 2011), pp:468-473.
    [28]Daniel J.Abadi. Column-stores for wide and sparse data[C].//In CIDR'07,2007, pp:1-6
    [29]Yongqiang He, et al. RCFile:A Fast and space-efficient data placement structure in mapreduce-based warehouse systems[C].//ICDE'11, Hannover,2011
    [30]A.Floratou. Column-oriented storage techniques for mapreduce. Proc of the VLDB'11, Seattle, WA,2011
    [31]Goetz Graefe. Efficient columnar storage in B-tree. SIGMOD Record, Vol.36, No.1,2007, pp:3-6
    [32]于利胜,张延松,王珊等.基于行存储模型的模拟列存储策略研究[J].计算机研究与发展.Vol.47,No,5,2010,pp:878-885
    [33]胡玉乐,孙莉,王梅,刘国华.RB+树—一种列存储数据的树型索引结构.计算机研究与发展.Vo1.48,SuppI.1,2011年(增刊)
    [34]Stochinger, K. Bitmap indices for speeding up high-dimension data analysis. In LNCS 2002,pp:881-890
    [35]A Hamadou. An efficient bitmap indexing strategy based on word aligned hybrid for data warehouses[C].//In 2008 International Conference on Computer Science and Software Engineering,2008, pp:486-491
    [36]Corporation, O. A mapping mechanism to support bitmap index and other auxiliary structures on tables stored as primary B+trees[C].//In SIGMOD 2003, pp:78-88
    [37]Truls A. Bj(?)rklund, Johannes Gehrke,(?)ystein Torbjornsen. Inverted indexes vs. bitmap indexes in decision support systems.In CIKM'09,2009, pp:1509-1512
    [38]Kesheng Wu, Ekow J. Otoo and Arie Shoshani. A performance comparison of bitmap indexes. In CIKM'09,2001, pp:559-561
    [39]K.Wu, E. Otoo,and A.Shoshani. On the performance of bitmap indices for high cardinality attributes[C].//Proc of the VLDB'04,2004, Toronto, Canada, pp:24-35
    [40]N. Koudas. Space efficient bitmap indexing[C].//In Proc. of ACM Conference on Information and Knowledge Management (CIKM'00),2000, pp:194-201
    [41]M.C. Wu, and A. Buchmann. Encoded bitmap indexing for data warehouses[C].//In Proc. of Int. Conference on Data Engineering (ICDE'98),1998. pp:220-230.
    [42]D. Rinfret, P. O'Neil, and E. O'Neil. Bit-sliced index arithmetic[C].//In Proc. of ACM SIGMOD Int. Conference on Management of Data,2001, pp:47-57
    [43]K.L. Wu, and P.S. Yu. Range-based bitmap indexing for high cardinality attributes with skew[C].//In Int. Computer Software and Applications Conference (COMPSAC'98),1998, pp:61-67
    [44]Patrick O., and Asllan Q. Improved query performance with variant indexes[C].//Proc of the 1997ACM SIGMOD Int Conf on Management of Data. New York:ACM,1997, pp:38-49
    [45]Morteza Z, Somnuk P, and Su-Cheng H. An adequate design for large data warehouse systems:Bitmap index versus B-tree index[J].//International journal of computers and communications,2008(2), pp:39-46
    [46]T. Apaydin, G. Canahuate, H. Ferhatosmanoglu, and A. S. Tosun. Approximate encoding for direct access and query processing over compressed bitmaps[C].//Proc of VLDB'06, Korea,2006, pp:846-857
    [47]Peter A.Alsberg, and P.A. Space and time savings through large data base compression and dynamic restructuring. In Proceedings of the IEEE,Vol.63, No.8, August 1975
    [48]Gautam Ray, Jayant R. Haritsa, and S. Seshadri. Database compression:a performance enhancement tool. COMAD 1995
    [49]Christian Lemke, Kai-Uwe S., Franz F.and Alexander Z. Speeding up queries in column stores a case for compression. Springer-Verlag Berlin Heidelberg,2010, pp:117-129
    [50]Allison L. Holloway, Vijayshankar Raman, Garret Swart and David J. DeWitt. How to barter bits for chronons:compression and bandwidth trade offs for database scans[C].// Proc of SIGMOD 2007 June11-14, Beijing, China
    [51]G.Graefe and L.Shapiro. Data compression and database performance. In ACM/IEEE-CS Symp. On Applied Computing, April 1991, pp:22-27
    [52]Balakrishna R. Iyer and David Wilhite. Data compression support in databases[C].//Proc of VLDB'94, Santigo,Chile,1994, pp:695-704
    [53]Mark A. Roth, Scott J. Van Horn. Database compression[C].//In Proc of SIGMOD 1993, New York, NY, USA:ACM, pp:31-39
    [54]S. S. Ruth and P. J. Keutzer. Data compression for business files. Datamation 18(September 1972)
    [55]Jonathan Goldstein, Raghu Ramakrishnan and Uri Shaft. Compressing relations and indexes[C].//Proc of ICDE'98,1998, pp:370-379
    [56]V. Raman and G. Swart. How to wring a table dry:entropy compression of relations and querying of compressed relations [C].//Proc of VLDB'06, Korea,2006
    [57]T.Westmann, D.Kossmann, S.Helmer, and G. Moerkotte. The implementation and performance of compressed databases. SIGMOD Rec, Vol.29, No.3,2000, pp:55-67
    [58]M. Zukowski, S. Heman, Niels Nes, P. Boncz, et el.Super-scalar RAM-CPU cache compression[C].//Proc of ICDE 2006
    [59]Hatsukazu T, Albertoleon G. Efficient run-length encodings [J]. IEEE Trans on Information Theory,1982,6(28), pp:880-890
    [60]Jacob Ziv, and Abraham Lempel. An universal algorithm for sequential data compression. IEEE Transactions on Information Theory, Vol... IT-23, NO.3, May 1977,1977(23), pp:337-343
    [61]M.Stabno, and R.Wrembel. RLH:bitmap compression technique based on run-length and huffman encoding[C].//Proc of DOLAP'07, pp:41-48
    [62]I.H.Witten, R.Neal, and J.Cleary. Arithmetic coding for data compression. Communications of the ACM 1987, Vol.30, No.6, pp:520-540
    [63]Jeffrey, Scott and Vitter. Design and analysis of dynamic huffman codes. In ACM 1987, pp: 825-845
    [64]Scientific Data Management Research Group. Fastbit:an efficient compressed bitmap index technology. Retrieved from http://sdm.lbl.gov/fastbit
    [65]K.Wu, E. J. Otoo, and A. Shoshani. Compressing bitmap indexes for faster search operations[C].//In Proc. of International Conference on Scientific and Statistical Database Management (SSDBM'02),2002, pp:99-108
    [66]Kesheng Wu, Ekow J. Otoo, and Arie Shoshani.An efficient compression scheme for bitmap indices [C].//Proc of VLDB'04
    [67]Corrales, F., D. Chiu, and J. Sawin. Variable length compression for bitmap indices. In DEXA2011, Springer-Verlag, Berlin Heidelberg, pp:381-95
    [68]KeSheng Wu, Aarie Shoshani, and Kurt Stockinger. Analyses of multi-level and multi-component compressed bitmap indexes[C].In ACM.2010
    [69]G. Antoshenkov. Byte-aligned bitmap compression[J]. Technical report, Oracle Corp.,1994
    [70]K.Wu,E.J.Otoo, and A.Shoahani. Optimizing bitmap indice with efficient compression. ACM Transactions on Database Systems, Vol.31, No.1,2006, pp:1-38
    [71]Keshang W, Ekow J.,and Arie S. Word-aligned hybrid compression for bitmap indices,Revision 1.3.0 [OL].(2004-12)[2012-04]//crd-legacy.lbl.gov/-kewu/fastbit
    [72]Francois Deliege, Torben Bach Pedersen. Position list word aligned hybrid:optimizing Space and Performance for Compressed Bitmaps[C].//Proc of EDBT'10, Lausanne, Switzerland
    [73]D. G. Severance. A practitioner's guide to data base compression. Inf. Sys.8,1 (1983),51
    [74]K. Wu, E. J. Otoo, A. Shoshani, and H. Nordberg. Notes on design and implementation of compressed bit vectors. Technical Report LBNL/PUB-3161, Lawrence Berkeley National Laboratory, Berkeley, CA,2001
    [75]王振玺,乐嘉锦等.列存储数据区级压缩模式与压缩策略选择方法.计算机学报,Vo1.33,No.8,2010,pp:1523-1530
    [76]丁祥武,李清炳,王梅.列存储查询执行中间结果的自适应位向量压缩技术[J].计算机研究与发展增刊,Vo1.49,suppl.1,2012,pp:114-119
    [77]S.Khoshafian, G. Copeland, Thomas Jagodis, Haran Boral, and Patrick Valduriez. A query processing strategy for the decomposed storage model[C].//Proc of ICDE'87,1987, pp:636-643
    [78]Stratos I, Martin L, and Stefan M. Self-organizing tuple reconstruction in column-stores[C]. //Proc of the 2009 ACM SIGMOD Int Conf on Management of Data. New York:ACM, 2009,pp:297-308
    [79]Daniel J, Daniel S, David J, et al. Materialization strategies in a column-oriented DBMS[C].//Proc of ICDE'07. Los Alamitos, CA:IEEE Computer Society,2007, pp: 466-475
    [80]Ziyang Liu. Efficient and scalable data evolution with column[C].//Proc of EDBT'11,2011, pp:105-116
    [81]P.Boncz, and M.Kersten. MIL primitives for querying a fragmented world[J].The VLDB Journal,8(2), March 1999, pp:101-119
    [82]Hector Carcia-Molina, Jeffrey D Ullman, and Jennifer Widom. Database system implementation(Second Edition)[M]. New York:Prentice Hall,2000
    [83]MG Ivanova. An architechure for recycling intermediates in a column-store[C].//Proc of SIGMOD'09,2009, pp:309-320
    [84]Herodotos.H,Nedyalko B.and Shivanth B.Join optimizatin techniques for partitioned tables[C].//Proc of the 36th Int Conf on Very Large Data Bases. New York:ACM,2010, pp:253-265
    [85]Patrick O'Neil, and Goetz Graefe. Multi-table joins through bitmapped join indices[C].//Proc of SIGMOD 1995. Vol.24,No.3, pp:8-11
    [86]Stefan M., Peter B..and Martin L. What happens during a join? dissecting CPU and memery optimization effects[C].//Proc of VLDB'00,2000, Cairo, Egypt
    [87]Kurt Stockinger, Kesheng Wu, and Arie Shoshani. Evaluation strategies for bitmap Indices with Binning[C].//Proc of LNCS 2004, pp:120-129
    [88]丁祥武,余文兵,刘国华.VPM:列存储系统中基于带值路径的物化技术[J].计算机研究与发展.2012,Vo1.49,No.10,pp:2086-2094
    [89]Zhiyuan Chen, Johannes Gehrke, and Flip Korn. Query optimization in compressed database systems[C].//Proc of SIGMOD 2001, pp:271-282
    [90]Sihem Amer-Yahia, and Theodore Johnson. Optimizing queries on compressed bitmaps[C].//Proc of VLDB'00,2000, Cairo, Egypt
    [91]Kamer M, and Kesheng W. Efficient joins with compressed bitmap indexes[C].//Proc of the 18th ACM conference on Information and knowledge management,New York:ACM, 2009, pp:1017-1026
    [92]TeraData database[OL]. http://www.teradata.com/products-and-services/database/.2011
    [93]R. Agrawal, and R. Srikant. Fast algorithms for mining association rules[C].//In Proc of 20th Intl. Conf. on Very Large Data Bases (VLDB), September,1994, Santiago, Chile
    [94]Pat O'Neil, Betty O'Neil, and Xuedong Chen. The star schema benchmark revision 3[EB/OL]. (2009-06-05) [2010-05-01], http://www.cs.umb.edu/-poneil/StarSchemaB.PDF
    [95]Zhe Li, and Kenneth Ross A. PERF join:an alternative to two-way semijoin and bloomjoin[C].//Proc of the 4th ACM conference on Information and knowledge management, New York:ACM,1995, pp:137-144
    [96]M. Kersten, and S. Manegold. Cracking the database store[C].//Proc of CIDR 2005.
    [97]R.Bayer, and E.M.McCreight.Organization and maintenance of large ordered indices. Report of boeing scientific reach lab, No.20
    [98]Theodore Johnson, and Denis Shasha. Utilization of B-tree with insets,deletes and modifies.1989 ACM 0010-4892/79/0600-0121
    [99]Beng Chin Ooi,and Kian-Lee Tan.B-trees:bearing fruits of all kinds[C].//Proc of ADC2002,Melbourne,Australia
    [100]Douglas Comer. The ubiquitous B-tree. The ACM of computing surveys,Vol.11,No.2,June,1979, pp:121-137
    [101]Sebastiaan J.van, and Oege de Moor. A memory efficient reachablility data structure through bit vector compression[C].//Proc of SIGMOD 2011, pp:913-924
    [102]Yannis Smaragdakis.General adaptive replacement policies. [C].//Proc of ISMM'04, October,2004, Vancouver, Canada
    [103]G. Sacco, and M. Schkolnick. Buffer management in relational database systems. ACM Transactions on Database Systems, Vol.11, No.4, December 1986, pp:473-498
    [104]Hong-Tai Chou, and David J. Dewitt. An evaluation of buffer management strategies for relational database systems [C].//Proc of of VLDB 85, Stockholm, pp:127-141
    [105]Douglas W. Cornell, and Philip S. Yu. Integration of buffer management and query optimization in relational database environment [C].//Proc of VLDB 89, Amsterdam, pp: 247-256
    [106]Belady.L.A. A study of replacement algorithms for virtural storage computers. IBM Sys, January,1996, pp:78-101
    [107]L.Donghee, C.Jongmoo, and K. Jong. LRFU:A spectrum of policies that subsumes the least recently used and least frequently used polices [J]. IEEE Transations on Computers, 50(12), December,2001, pp:1352-1361
    [108]E.J.O'Neil, P E.O'Neil, and G.Weikum. The LRU-K page replacement algorithm for database disk buffering[C].//Proc of ACM SIGMOD Conf.,1993, pp:297-306
    [109]R. Turner, and H. Levy. Segmented FIFO page replacement[C].//Proc of ACM SIGMETRICS Conference on Measurement and Modeling of computer systems,1981
    [110]M. H. J. Baylis, D. G. Fletcher, and D. J. Howarth. Paging studies made on the I.C.T. ATLAS computer. Information Processing, IFIP Congress Booklet D,1968
    [111]S. Yannis, K. Scott, and W. Paul. EELRU:simple and effective adaptive page replacement algorithm[C].//Proc of ACM SIGMETRICS, Janauary 1999, pp:92-112
    [112]S.Yannis, K.Scott, and W.Paul. The EELRU adaptive replacement algorithm[J]. Performance Evaluation,53(2), July 2003, pp:93-123
    [113]G.Glass, and P.Cao. Adaptive page replacement based on memory reference behavior[C].//Proc of ACM SIGMETRICS Conference,1997, pp:115-126
    [114]W.Christopher, F.Eduardo.B, and L.Tomas. Minimization of demand paging for the LRU stack model of program Behavior. Information Processing Letters, v(16),1983, pp:99-104
    [115]Peter J. Denning. The working set model for program. In ACM on Communications. Vol.11,No.5,1968, pp:323-333
    [116]G. M.Sacco, and M.Schkolnick. A mechanism for managing the buffer pool in a relational database system using the hot set model [C].//Proc of VLDB 85, Mexico City,1982, pp:257-262
    [117]C. Faloutsos, R.Ng, and T. Sellis. Flexible and adaptable buffer management techniques for database management systems. IEEE Trsactions on computers,Vol.44, No.4,1995, pp:546-560
    [118]S.B.Yao. Approximating block accesses in database organizations. Communications of ACM,Vol.20,No.4,1977, pp:260-261
    [119]王珊等编著.数据仓库技术与联机分析处理[M].科学出版社,1998,pp:219-228

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700