Research on the Materialized View Selection Problem in Data Warehouses
Abstract
As computer technology has spread across industries, the volume of stored data has grown steadily over time. Organizations are no longer satisfied with day-to-day transaction processing; they increasingly want to extract useful decision-making information from large, complex accumulations of historical business data, and to act on it promptly. The data warehouse concept arose to meet this need.
     A data warehouse is a subject-oriented, integrated, relatively stable, time-variant collection of data used to support management decisions. It maintains massive amounts of data and supports complex queries that typically access large volumes of data, yet a decision support system must respond to queries quickly, so a data warehouse needs high performance. Materialized views are an important technique for improving that performance because they greatly speed up queries. However, storing them consumes space, and keeping them consistent with the base tables incurs maintenance cost. The materialized view selection problem is therefore to choose views that improve query efficiency as much as possible while occupying as little storage as possible. This problem is the focus of the thesis.
     The thesis first describes the materialized view selection problem and builds a mathematical model of it. Second, it reviews several existing methods for solving the problem and, building on them, proposes ACS-VSP, an ant colony system (ACS) algorithm for materialized view selection, which is the main contribution. In simulation experiments, ACS-VSP is compared with a genetic algorithm that has previously been applied successfully to this problem, and the ant colony algorithm performs better. Third, because the query distribution changes over time in real applications, a dynamic adjustment algorithm for materialized views is proposed. It lets the set of materialized views track user needs more closely and greatly improves the system's response time to user queries. Finally, the results are applied to a student score query and analysis system to demonstrate the practical value of the proposed selection algorithm.
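The trade-off described in the abstract (lower query cost versus limited storage) can be illustrated with a small sketch. The abstract does not give the thesis's mathematical model, so everything below is an assumption made for illustration: the toy views and queries, the names `view_size`, `queries`, `query_cost`, and `best_selection`, and the simplified cost model in which answering a query from a view costs the view's size. Maintenance cost is left out, and exhaustive search stands in for the ant colony and genetic algorithms, which are not reproduced here.

```python
from itertools import combinations

# Hypothetical toy instance (not taken from the thesis).
BASE_COST = 100  # assumed cost of answering any query from the base table

# Candidate view -> storage size; also used as the cost of scanning that view.
view_size = {"v1": 50, "v2": 20, "v3": 10}

# Query -> (access frequency, set of views that can answer it).
queries = {
    "q1": (10, {"v1"}),
    "q2": (5, {"v1", "v2"}),
    "q3": (1, {"v3"}),
}


def query_cost(selected):
    """Frequency-weighted cost of all queries given the materialized set."""
    total = 0
    for freq, usable in queries.values():
        # Use the cheapest selected view, or fall back to the base table.
        options = [view_size[v] for v in usable & selected]
        total += freq * min(options, default=BASE_COST)
    return total


def best_selection(budget):
    """Exhaustively find the cheapest view set whose total size fits the budget."""
    best = (query_cost(set()), frozenset())
    names = sorted(view_size)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            chosen = set(combo)
            if sum(view_size[v] for v in chosen) > budget:
                continue  # violates the storage constraint
            cost = query_cost(chosen)
            if cost < best[0]:
                best = (cost, frozenset(chosen))
    return best


if __name__ == "__main__":
    for budget in (0, 30, 60):
        cost, chosen = best_selection(budget)
        print(budget, cost, sorted(chosen))
```

Exhaustive search examines every subset of candidate views, so it is only feasible for a handful of views. That exponential growth is the usual motivation for heuristic methods such as the genetic and ant colony algorithms mentioned in the abstract.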
