XML数据库的规范化理论研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
在网络数据交换日益增多的今天,XML作为一种半结构化数据以其简单易标记和跨平台等优点被越来越广泛的应用到数据存储和数据传输领域,成为Internet上的主要的数据表示和交换标准之一,应用范围非常广泛。XML数据库是一项在最近几年发展起来的新技术。和关系数据库类似,在XML数据文档中由其模式定义形成的树型结构可能包含数据冗余,从而引起数据的更新、插入和删除异常。引起数据冗余的根本原因是其中包含异常数据依赖,包括部分函数依赖、传递函数依赖和多值依赖。由于Web的开放性,XML数据异常的危害性要远远大于关系数据异常的危害性。
     研究XML数据依赖是进行XML数据库技术中其他相关研究如XML数据的存储与发布技术、XML数据查询与优化技术等的基础。目前对于XML数据依赖的研究主要集中在XML文档的强函数依赖这一领域,对于XML树型结构中引入空值后的弱函数依赖方面的研究就更少。
     本文的主要工作是在已有的DTD规范基础上,采用路径表达式和树元组的表示方法对XML数据进行规范化研究,主要内容如下:
     1、基于路径表达式和树元组给出XML函数依赖、部分函数依赖、传递函数依赖和多值依赖的概念,给出了部分函数依赖、传递函数依赖和多值依赖的推理规则。
     2、基于XML数据依赖形式化定义,给出XML不同级别范式的定义,提出XML文档规范化规则——元素提升规则、元素创建规则和元素上移规则。在规范化基础上给出XML文档规范化算法,并分析了算法的正确性、可终止性、时间复杂性、无损联接性和函数依赖保持性等。
     3、在XML树型结构中引入空值概念,提出XML弱函数依赖的逻辑蕴含问题,给出一组适合XML空值模型的函数依赖推理规则集;定义了单依赖集合,证明了单依赖集合判定定理和单依赖集合判定可终止定理。
Nowadays, network data exchanging is increasing day by day, as a half- structured data, XML is widely applied in data storage and data transmission fields because of its simplicity, easily marked and running in various platforms. XML database which is sprung up in recent years is a new technology. Like the relational database, the tree structure formed by schema definition may contain data redundancy in the XML data documents, which leads exception in data updating, data inserting and data deleting. The radical reason that brings data redundancy is exception of data dependency, such as partial functional dependency , transitive functional dependency and multiple value dependency.
     In fact, studying the XML functional dependency plays a fundamental role in other related studies in the XML database technology such as storage and distribution of data, XML data querying and data optimization. Currently, the research on XML functional dependency is focused on strong functional dependency of XML document. The research on weak XML functional dependency—the functional dependency when introducing null in XML tree structure is even seldom.
     This paper studies the XML data standardization based on the existing DTD rules, according to path expressions and tree component present, the main achievements are listed as follow:
     1.The conceptions and inference rules of XML partial functional dependency, transitive functional dependency and multiple value dependency are given, basing on path expressions and tree component.
     2. Different levels of XML paradigms are defined according to XML formalized definition. It also deduces the XML document standardization rules----element upgrading rules, element establishing rules and element upper moving rules. It gives the XML document standardization algorithm, analyzing the correctness, term- inability, time complexity and nondestructive reliance and maintenance functional dependency of the algorithm .
     3. It introduces the conception of null in XML tree structure, puts forward XML weak functional dependency containment issue and a set of functional dependency inferences rules suit for XML null model, defines single dependency set, testifies single dependency theorem and single dependency term- inability theorem.
引文
[1]Tim Bray,Jean Paoli,C.M.Sperberg-McQueen.Extensible Markup Language(XML)1.0,w3c Recommendation.http://www.w3.org/TR/1998/REC-xml-19980210.html.February 10~(th),1998.
    [2]W3C.XML Specification DTD("XMLspec").http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm.Jun,1998.
    [3]S.Abiteboul,R.Hull and V.Vianu.Foundations of Databases.Addison-Wesley,1995:166-221.
    [4]R.Ramakrishnan,J.Gehrke.Database Management Systems.USA:McGraw-Hill Higher Education,2000:107-125.
    [5]S.Abiteboul,V.Vianu.Regular path queries with constraints.In:Proceedings of ACM Symposium on Principles of Database Systems(PODS).Tucson,Arizona,1997:122-133.
    [6]Peter Buneman,Wenfei Fan and Scott Weinstein.Some undecidable implication problems for path constraints.Technical Report MS-CIS-97-14,Department of Computer and Information Science,University of Pennsylvania,1997:1-24.
    [7]Peter Buneman,Wenfei Fan and Scott Weinstein.The decidability of some restricted implication problems for path constraints.Technical Report MS-CIS-97-15,Department of Computer and Information Science,University of Pennsylvania,1997.
    [8]Peter Buneman,Wenfei Fan and Scott Wemstein.Path Constraints on Semistructured and Structured Data.In:Proceedings of ACM Symposium on Principles of Database Systems(PODS).Seattle,Washington,1998:129-135.
    [9]Peter Buneman,Wenfei Fan and Scott Weinstein.Interaction between Path and Type Constraints.Technical Report MS-CIS-98-16,University of Pennsylvania,1998.
    [10]Mary F.Fernandez,Daniela Florescu,Alon Y.Levy,etc.Verifying Integrity Constraints on Web Sites.Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence.Sweden,1999:614-619.
    [11]吕腾,闫萍.XML函数依赖及其推理规则.计算机研究与发展,2005,42(5):792-796
    [12]胡小明,陈子阳,高翔,刘国华.基于树元组的函数依赖推理规则.燕山大学学 报,2005,29(1):75-80
    [13]陈子阳.基于DTD路径编码的XML逻辑蕴涵问题研究.燕山大学学报,2005,29(5):381-384
    [14]吕腾,闫萍,王真星.XML函数依赖及其与键的关系.小型微型计算机系统,2005,26(9):1677-1679
    [15]刘艳,郝忠孝.对XML DTD中的键和函数依赖的探讨.齐齐哈尔大学学报,2005,21(2):51-53
    [16]谈子敬,施伯乐.函数依赖和规范化在关系和XML间的传播.软件学报,2005,16(04):533-539
    [17]张忠平,王超,朱扬勇.基于约束的XML文档规范化算法.计算机研究与发展,2005,42(5):755-764
    [18]苏召,刘国华.弱函数依赖及其推理规则.计算机应用,2007,27(05):1228-1231
    [19]Mike Hogan.XML:The Foundation for The Future.http://www.XML.org/XML/XML foundation_future.htm.July 2001.
    [20]Dan Suciu.On Database Theory and XML.In Proceedings of the ACM PODS International Conference.Sep.,2001.Santa Barbara CA,USA,2001,30(3):39-45
    [21]Kimbro Staken.Introduction to Native XML Databases.http://www.XML.com/pub/a/2001/10/31/nativeXMLdb.html.October 2001
    [22]H.V.Jagadish,S.Al-Khalifa,L.V.S.Lakshmanan,etc.TIMBER:A Native XML Database.The VLDB Journal,April 2002(11):274-291
    [23]Ronald Bourret.XML Database Products.The XML Cover Pages.http://www.oasis-open.org/over/.November 2001.
    [24]Marcelo Arenas,Leonid Libkin.A Normal Form for XML Documents.In:Proceedings of ACM Symposium on Principles of Database Systems(PODS).Madison,Wisconsin,USA,2002:85-96.
    [25]施伯乐,何继潮,崔靖.关系数据库理论及应用.河南:河南科学技术出版社.1989:355-445
    [26]谈子敬,施伯乐.DTD的规范化.计算机研究与发展.2004.41(4):594-600
    [27]张忠平,王超,朱扬勇.基于约束的XML文档规范化算法.计算机研究与发展.2005,42(5):755-764
    [28]谈子敬,庞引明,施伯乐XML上的函数依赖推理.软件学报,2003,14(9):1564-1570.
    [29]吕腾,顾宁,施伯乐.XMLDTD的一种范式.计算机研究与发展,2004,41(4):615-620
    [30]吴永辉.用于XML模式和DTD规范化设计的层次模式设计.软件学报,2004,15(07):1099-1106
    [31]刘方鑫主编.数据库原理与技术.北京:电子工业出版社.2002
    [32]张广玲.XML不完全信息树下的完全函数依赖弱保持.信息技术,2007,(1):39-42
NGLC 2004-2010.National Geological Library of China All Rights Reserved.
Add:29 Xueyuan Rd,Haidian District,Beijing,PRC. Mail Add: 8324 mailbox 100083
For exchange or info please contact us via email.