Research and Implementation of Middleware between XML and Relational Databases Based on Zhang Encoding
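Abstract
As the volume of XML data exchanged between Web applications keeps growing, storing XML documents in databases reliably and efficiently, and exchanging data between XML and databases, are becoming increasingly important. Storing XML data in a relational database makes it possible to reuse the mature indexing, storage, and query techniques of relational systems. However, the mismatch between the complex hierarchical structure of XML data and the flat table structure of a relational database causes many complications during storage.
     Starting from this application background, this thesis analyzes current XML storage techniques in China and abroad, together with their strengths and weaknesses, and studies the related theory and technology. It first proposes a method for storing XML documents in a relational database based on region encoding. The method is a model-mapping method that stores XML documents under a uniform relational schema. Based on this storage method, an algorithm is designed to convert XPath queries into SQL queries. The conversion takes three steps. First, an XPath query graph is generated from the XPath expression. Second, the query graph is decomposed into subgraphs according to the location steps, and the determination conditions for XPath axes under the Zhang encoding scheme are used to obtain the sets of Zhang codes of the elements matched by the node tests of the location steps that these subgraphs represent. Finally, SQL statements are generated from those Zhang code sets using the same conditions. In addition, the thesis designs an algorithm for publishing relational data as XML documents. The algorithm can restore XML documents stored in the relational database, and it can also return the results of user-submitted XPath queries as XML document fragments.
     The system is designed as three modules: a data storage module, a query translation module, and a data format conversion module. For implementation, it is divided into a data access layer, a business logic layer, and a user interface layer. Experiments show that storing XML documents in a relational database with the model-mapping method avoids the shortcomings of structure mapping, exploits the data management strengths of relational databases, and achieves effective storage of XML data under a relational schema.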
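To make the region encoding concrete, the following minimal Python sketch labels each element with a (begin, end, level) triple in one depth-first pass, in the spirit of the Zhang encoding named above. The function names, the choice to count only element start and end events, and the sample document are assumptions made for illustration; the thesis's actual relational schema and numbering are not given in this abstract.

```python
import xml.etree.ElementTree as ET


def label_regions(xml_text):
    """Return (tag, begin, end, level) rows in document order.

    Assumption: positions count element start/end events only;
    text and attributes are ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    rows = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        begin = counter
        slot = len(rows)
        rows.append(None)  # reserve the slot so rows stay in document order
        for child in elem:
            visit(child, level + 1)
        counter += 1  # the end position is assigned after all descendants
        rows[slot] = (elem.tag, begin, counter, level)

    visit(root, 1)
    return rows


def is_ancestor(a, d):
    """a contains d when d's region lies strictly inside a's region."""
    return a[1] < d[1] and d[2] < a[2]


def is_parent(a, d):
    """A parent is an ancestor exactly one level above."""
    return is_ancestor(a, d) and a[3] == d[3] - 1


rows = label_regions("<book><title/><chapter><section/></chapter></book>")
```

With labels of this kind, an ancestor-descendant test reduces to two integer comparisons, which is what lets XPath axes be evaluated with ordinary relational predicates.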
With the unceasing growth in the amount of XML data exchanged among Web applications, how to store XML documents and exchange data between XML and databases reliably and efficiently is becoming more and more important. The indexing, storage, and query facilities of a relational database can be used if XML documents are stored in it. However, many complicated problems appear during storage because of the mismatch between the complicated hierarchical structure of an XML document and the flat structure of a relational database.
     Starting from this background, this paper analyzes the advantages and disadvantages of current XML document storage technologies and tools, at home and abroad, and studies the related theory and technology. First, the paper addresses how to store XML documents in an RDBMS by using region encoding. This is a model-mapping method, which stores XML documents using a uniform relational schema. Based on this storage method, the paper designs an algorithm that transforms an XPath query into an SQL query. The transformation is divided into three steps. First, an XPath query graph is constructed from the XPath expression. Second, the query graph is split into several sub-graphs according to the location steps; by applying the axis determination conditions based on the Zhang encoding scheme, we obtain the sets of Zhang codes of the elements matched by the node tests of the location steps that these sub-graphs represent. Third, applying the same conditions, SQL is generated from the Zhang code sets produced in the previous step. In addition, the paper presents an algorithm for publishing relational data in XML format. This algorithm can restore XML documents stored in the RDBMS, and it also lets users receive query results as XML fragments after submitting XPath expressions.
     The whole system is designed as three modules: a data storage module, a query transformation module, and a data format transformation module. In the implementation, the system is divided into a data access layer, a business logic layer, and a user interface layer. Experiments show that storing XML documents in an RDBMS with the model-mapping method avoids the limitations of the structure-mapping method. The method takes advantage of the data management power of the RDBMS and achieves efficient storage of XML documents in a relational schema.
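As a rough illustration of how axis conditions over region codes can become SQL, the sketch below translates a very small XPath subset (only `/name` and `//name` steps) into one self-join over a hypothetical `node(tag, start_pos, end_pos, level)` table. It is a simplification under stated assumptions, not the thesis's algorithm: the table, its column names, and the helper `xpath_to_sql` are invented for this example, and the query-graph decomposition described above is collapsed into a single join.

```python
import re
import sqlite3

# One location step: an axis abbreviation followed by a name test.
STEP = re.compile(r"(//|/)([A-Za-z_][\w.-]*)")


def xpath_to_sql(xpath):
    """Translate e.g. '/book//section' into (sql, params)."""
    steps = STEP.findall(xpath)
    if not steps or "".join(axis + name for axis, name in steps) != xpath:
        raise ValueError(f"unsupported XPath: {xpath!r}")
    tables, conds, params = [], [], []
    for i, (axis, name) in enumerate(steps, start=1):
        cur = f"n{i}"
        tables.append(f"node {cur}")
        conds.append(f"{cur}.tag = ?")
        params.append(name)
        if i == 1:
            if axis == "/":  # an absolute first step must match the root
                conds.append(f"{cur}.level = 1")
            continue
        prev = f"n{i - 1}"
        # Containment: the current region lies inside the previous one.
        conds.append(f"{prev}.start_pos < {cur}.start_pos")
        conds.append(f"{cur}.end_pos < {prev}.end_pos")
        if axis == "/":  # the child axis also fixes the level difference
            conds.append(f"{cur}.level = {prev}.level + 1")
    last = f"n{len(steps)}"
    sql = (
        f"SELECT DISTINCT {last}.tag, {last}.start_pos, {last}.end_pos "
        f"FROM {', '.join(tables)} WHERE {' AND '.join(conds)} "
        f"ORDER BY {last}.start_pos"
    )
    return sql, params


# Sample rows for <book><title/><chapter><section/></chapter></book>.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (tag TEXT, start_pos INTEGER, end_pos INTEGER, level INTEGER)"
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?)",
    [("book", 1, 8, 1), ("title", 2, 3, 2), ("chapter", 4, 7, 2), ("section", 5, 6, 3)],
)


def run(xpath):
    sql, params = xpath_to_sql(xpath)
    return conn.execute(sql, params).fetchall()
```

Here `run("/book//section")` finds the nested section through the containment predicate alone, while `run("/book/section")` returns no rows because the child axis also requires the levels to differ by exactly one.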
