嵌入式主动实时数据库ARTs-EDB的索引技术
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
实时数据库(Real-Time Database, RTDB)中的数据和事务都具有显式的定时限制,系统的正确性不仅依赖于逻辑结果,更依赖于逻辑结果产生的时间。RTDB的高性能要求以内存数据库(Main-Memory Database, MMDB)作为低层支持。索引是提高数据库系统执行效率的一种有效工具,索引选择问题是数据库物理设计中一个重要的优化问题,在内存数据库上建立索引必然要受内存的快速存取以及高效利用影响。
     ARTs-EDB是自行研发、拥有自主知识产权的嵌入式主动实时数据库系统。以内存数据库作为其底层支持。大量的数据放到内存中,使得在内存中找到目标数据所用的时间不可忽略,因此,在内存数据库中,除了要使用传统的索引机制外,还要引入用于查找内存中的数据的索引。而非主键索引是一个比主键索引使用频率更高、设计更为复杂的索引机制。
     在分析了用于查找内存数据的非主键索引的特性之后,设计并实现了一个适合于内存数据库的非主键索引机制。此索引机制使用倒排表实现,支持区间查找,并使用一个高效算法实现联合查找,而索引机制本身不需要创建联合索引。实验证明,此索引结构比传统的散列表结构有更优的查找性能和更高的内存利用率。此索引结构的缺点在于创建的代价高,因此,系统引入了热度评价模块来支持此索引机制。
     热度评价模块根据属性在当前一段时间内使用的频率等因素对其进行合理的分类。系统依据属性的类型决定是否要基于此属性创建非主键索引、是否要保存此非主键索引。使用有限的内存空间保存“热门”数据的非主键索引,使得系统在不频繁的创建非主键索引的前提下,有较高的“索引命中率”。
     在自主研制的嵌入式主动实时数据库管理系统ARTs-EDB上,实验结果表明上述索引机制较之传统的索引机制,在时空性能上有大幅度的提高。
Real-time database (RTDB) needs to face the challenge of simultaneously satisfying data integrity and timing requirements. The transactions and data in real-time database systems have explicit time constraints, and the correctness of transaction execution relies on not only the logical results but also the time constraint. In general, RTDB requires the support of main-memory database system. Index is an efficient tool that can improve the DBS’executing performance. Establishing index in MMDB must be influenced by quick access and efficient use of the memory.
     ARTs-EDB is an embedded active real-time database system, which is researched and developed independently and possesses proprietary intellectual property rights. It is based on the support of main-memory database system. A large number of data reserves in the main-memory, leading to the result that the time spending on searching the aiming data in the main-memory can not be ignored. Therefore, in the MMDB, except for the use of traditional index, another index that supports to search data in the main-memory should be introduced. And the non-primary index is used more frequently and designed more complicatedly than the primary index.
     After analyzing the characteristics of the non-primary index used to search the data in the main memory, a new non-primary index system is designed and implemented which fits to MMDB. The index is implemented as a inverted table index that supports range and united search, and the united search is implemented with a quick algorithm rather than a united index. The experiment demonstrates that the new index structure has better performance of searching and needs less main memory than the hashing table. The defect of the index structure is that it needs a more complicated establishing algorithm. Therefore, the system introduces popularity diagnose sub-system to support the new index structure.
     Popularity diagnose sub-system classifies the attribute by the use frequency during a latest period of time at the present and some other elements. The system determines whether it needs establishing and reserving the non-primary index by the classification of the attribute. Reserving the popular index with limited main memory supports high index shooting average, on the premise that the system needn’t establish non-primary index frequently.
     The experiments’result based on ATRs-EDB shows that the above index strategy can greatly improve the time and spatial performance than the traditional index strategy.
引文
[1]刘云生.现代数据库技术.第1版.北京:国防工业出版社, 2001. 108~109,250~252
    [2] Kao. Ben, Lam. Kam-yiu, B. Adelberg et al. Maintaining temporal consistency of discrete objects in soft real-time database systems. IEEE Transactions, 2003, 52(3): 373~389
    [3] Suhee Kim, S. H. Son, J.A. Stankovic. Performance evaluation on a real-time database. In: Proceedings of IEEE Eighth Real-Time and Embedded Technology and Applications Symposium, California: San Jose, 2002. 253~265
    [4]许贵平,刘云生.基于抢占阀值的嵌入式实时内存数据库事务调度.华中科技大学学报, 2003, 31(6): 52~53
    [5] R. Rajeev, S. Seshadri, P. Bohannon. Improving predictability of transaction execution times in real-time databases. Real-Time Systems,2000,19(3): 283~302
    [6] Tei-Wei Kuo, yuan-Ting Kao, Chin-Fu Kuo. Two-version based concurrency control and recovery in real-time client/server database. IEEE Transaction on Knowledge and Data Engineering, 2003, 52(4): 506~524
    [7] Kam-yiu Lam, Tei-Wei Kuo, Wai-Hung Tsang et al. The reduced ceiling protocol for concurrency control in real-time databases with mixed transactions. The Computer Journal, 2000, 43(1): 65~80
    [8] Kwok-Wa Lam, Sang H.Son, Sheung-Lun Hung et al. Scheduling transactions with stringent real-time constraints. Information Systems, 2000, 25(6): 131~152
    [9] K. W. Lam, V. C. S. Lee, S. L. Hung. Transaction scheduling in distributed real-time systems. Real-Time Systems, 2000, 19(2): 169~193
    [10] J. A. Stankovic, W. Zhao. On real-Time Transactions. ACM SIGMOD RECORD, 1998, 25(3): 10~15
    [11] R. Aranha. Implementation of a real-time database system. Information Systems, 1996, 21(1):55~74
    [12] Ozgur Ulusoy. Current Research on real-Time Database. ACM SIGMOD RECORD, 1992, 21(4):16~21
    [13] P. Bohannon, R. Rastogi, S. Seshadri et al. Detection and recovery techniques fordatabase corruption. IEEE Transaction on Knowledge and Data Engineering, 2003, 15(5): 1120~1136
    [14]刘云生,廖国琼,付蔚.一个支持实时内存数据库的恢复系统.小型微型计算机系统. 2003, 24(3): 460~463
    [15]刘云生,许贵平.内存数据库的图论存取方法.计算机学报, 2001, 24(10): 1095~1101
    [16] A. Caprara, M. Fischetti, D. Maio. Exact and approximate algoximate algorithms for the index selection problem in physical database design. IEEE Transactions on Knowledge and Data Engineering, 1995, 7(6): 955~967
    [17]孙宁平,中村良三,孙文玲. AVL平衡树插入算法的平衡特性.中央民族大学学报,2000, 9(1): 30-38
    [18]周龙骧.数据库管理系统实现技术.第1版.武汉:中国地质大学出版社, 1990. 23~60
    [19] J. Nievergelt, H. Hinterberger, K. C. Sevcik. The grid file: an adaptable, symmetric, multikey file structure. ACM Transactions on Database Systems, 1984, 9(1): 38~71
    [20] Liang peng, S. See, Yueqin Jiang et al. Performance evaluation in computational grid environments. In: Proceedings of High Performance Computing and Grid. Asia Pacific Region, Jack. 2004. 54~62
    [21] R. L. Rivest. Partial match retrieval algorithms. SIAM J. Computing, 1976, 5(1): 16~50
    [22] W. A. Burkhard. Hashing and tree algorithms for partial match retrieval. ACM Transactions on Database Systems, 1976,1(2):175~187
    [23] O. Gunther, J. Bilmes. Tree-based access methods for spatial database: implementation and performance evaluation. IEEE Transaction on Knowledge and Data Engineering, 1991, 3(3): 342~356
    [24] Y. Theodoridis, E. Stefanakis, T. Sellis. Efficient cost models spatial queries using R-trees. IEEE Transactions on Knowledge and Data Engineering, 2000, 12(1): 19~32
    [25] S. T. Leutennegger, M. A. Lopez. The effect of buffering on the performance of R-trees. IEEE Transactions on Knowledge and Data Engineering, 2000, 12(1): 33~34
    [26]徐长醒,刘云生,许贵平.基于图的主动数据库E-RG规则执行模型研究.小型微型计算机系统, 2002,23(5): 600~602
    [27]刘云生,郭刚.实时内存数据库中复杂触发器条件的评价.华中理工大学学报, 1999. 8(2): 93~95
    [28]刘云生.关于实时数据库事务.软件学报, 1995, 6(10):614~622
    [29] Yunsheng Liu. On Real-Time Database Transactions. CMPSCI Tech.Report, Univ.of Mass.,1994:56~60
    [30] G. Greco, S. Greco, E. Zumpano. Alogical framework for querying and repairing inconsistent database. IEEE Transactions on Knowledge and Data Engineer, 2003, 15(6): 1389~1408
    [31] J. A. Stankovic, W. Zhao. On Real-Time Transactions. ACM SIGMOD RECORD, 1988, 25(3): 33~37
    [32] A. Datta, S. H. Son. A study of concurrency control in real-time active database systems. IEEE Transactions on Knowledge and Data Engineering, 2002, 14(3): 465~484
    [33] Danilo Montesi, Elisa Bertino, Peter Deamley. Rules Termination Analysis Investigating the Interaction between Transactions and Triggers. In: Proceedings of International Database Engineering and Application Symposium IEEE 2002: 285~294
    [34]王珊. SYBASE原理高级系统管理与性能调优.第3版.北京:中国水利水电出版社, 1998. 361~362
    [35] H. Garcia-Molina, K. Salem. Main memory database system: an overview. IEEE Transaction on Knowledge and Data Engineering, 1992, 4(6): 509~516
    [36] Jing Huang, L. Gruenwald. A data priority reload technique for real-time main memory databases. In: Proceedings of the Eighth Euro micro Workshop on Real-Time Systems, Laquila, Joson, 1996: 96~101
    [37] Vijay Kumar, Albert Burger. Performance Measurement of Main Memory Database Recovery Algorithms Based on Updata-in-Place and Shadow Approaches. IEEE Transactions on knowledge and data engineering, 1992, 4(6): 567~571
    [38]吴绍春.一种内存数据库定义及相关技术探讨.江汉石油学院学报, 1996,18(4):91~94
    [39] Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom.数据库系统实现.第1版.杨冬青,唐世渭,徐其钧等译.北京:机械工业出版社, 2001. 88~90
    [40] Bin Cui, Beng Chin Coi, Jianwen Su et al. Indexing high-dimensional data for efficient in-memory similarity search. IEEE Transactions on Knowledge and DataEngineering, 2005, 17(3): 339~353
    [41] Y. Theodoridis, E. Stefanakis, T. Sellis. Efficient cost models for spatial queries using R-trees. IEEE Transactions on Knowledge and Data Engineering, 2000,12(1):19~32
    [42] R. Baldoni, F. Quaglia, P. Fornara. An index-based checkpointing algorithm for autonomous distributed systems. IEEE Transactions on Parallel and Distributed Systems, 1999, 10(2): 181~192
    [43] A. Caprara, M. Fischetti, D. Maio. Exact and approximate algorithms for the index selection problem in physical database design. IEEE Transactions on Knowledge and Data Engineering, 1995, 7(6):955~967

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700