Research on Information Retrieval Methods Based on the Structural Semantics of XML Documents and Their Application
Abstract
With the development of electric power informatization, power supply enterprises have each built their own management information systems and accumulated large amounts of data in operation. Studying how to query data quickly and flexibly from the massive information held by different power enterprises helps make full use of information resources and supports management decision-making. This thesis studies XML indexing, XML retrieval based on structural semantics, the CIM model of substation equipment information and its XML representation, XML-based retrieval of substation equipment information, and the condition evolution patterns of transformer families. The main results are as follows:
     1. A new XML index structure is proposed, consisting of an inverted element tag index (ETI), an inverted element content index (ECI), and a node level-path index (NLPI). The structure covers both the textual content and the structural information of XML documents, and it is also suited to implementing XML structural semantic retrieval algorithms.
     2. The concept of XML structural semantics is further studied and extended. Rules that hold when judging whether multiple nodes are semantically related are proposed and proved, providing a theoretical basis for XML structural semantic retrieval algorithms. On this basis, a new XML structural semantic search algorithm for "tag-keyword" queries is proposed. When judging semantic relatedness among multiple nodes, the algorithm avoids a large number of pairwise checks of whether nodes are connected, which greatly improves retrieval speed.
     3. A substation equipment information model based on the CIM standard and an XML data specification for substation equipment information are proposed, and the components of an XML-based substation equipment information retrieval system and their key technologies are analyzed. Building on the CIM standard makes the XML documents of substation equipment information compatible with other CIM-compliant information models in the power industry. With the XML data specification, heterogeneous substation equipment data from different power supply enterprises can follow the same specification and carry the same semantics, which helps improve the retrieval efficiency of the XML search engine.
     4. The thesis proposes, for the first time, applying cluster analysis to the condition evolution patterns of transformer families, in order to determine how family quality defects affect a transformer's condition in condition assessment. An agglomerative hierarchical clustering algorithm based on value distance and curve slope distance is proposed for this analysis. A case study shows that it outperforms the traditional agglomerative hierarchical clustering algorithm. A scoring method is given that derives the impact score of family quality defects from the family's condition evolution pattern. Finally, the clustering results are used to predict the condition change of another transformer in the same family, and the prediction agrees with the actual outcome, which indicates that studying family condition evolution patterns is valuable for comprehensive transformer condition assessment and fault prediction.
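The abstract names the three indexes (ETI, ECI, NLPI) without giving their layout. The following is only a minimal sketch of how such indexes might be organized, not the thesis's actual design. The preorder node numbering, the whitespace tokenization, the path format, the sample document, and the function names `build_indexes` and `tag_keyword_query` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indexes(xml_text):
    """Build three illustrative indexes over one XML document.

    eti:  tag     -> ids of elements with that tag        (assumed ETI layout)
    eci:  keyword -> ids of elements whose text has it    (assumed ECI layout)
    nlpi: node id -> (level, path from the root)          (assumed NLPI layout)
    """
    root = ET.fromstring(xml_text)
    eti = defaultdict(list)
    eci = defaultdict(list)
    nlpi = {}
    next_id = 0
    # Explicit stack; children are pushed in reverse so ids follow preorder.
    stack = [(root, 0, "/" + root.tag)]
    while stack:
        node, level, path = stack.pop()
        node_id = next_id
        next_id += 1
        eti[node.tag].append(node_id)
        nlpi[node_id] = (level, path)
        # Naive whitespace tokenization of the element's own text.
        for word in (node.text or "").split():
            eci[word.lower()].append(node_id)
        for child in reversed(list(node)):
            stack.append((child, level + 1, path + "/" + child.tag))
    return eti, eci, nlpi


def tag_keyword_query(eti, eci, tag, keyword):
    """Return ids of elements that carry `tag` and contain `keyword`."""
    return sorted(set(eti.get(tag, [])) & set(eci.get(keyword.lower(), [])))


# Hypothetical sample document, not taken from the thesis.
SAMPLE = (
    "<substation>"
    "<transformer><name>T1</name><status>normal</status></transformer>"
    "<transformer><name>T2</name><status>attention</status></transformer>"
    "</substation>"
)

eti, eci, nlpi = build_indexes(SAMPLE)
hits = tag_keyword_query(eti, eci, "name", "T2")
```

Here `hits` holds the ids of `name` elements containing "T2", and `nlpi` can then be consulted for each hit's level and path. A structural semantic search would use that level and path information to decide which hits are related, a step this sketch does not attempt.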
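The abstract does not define the value distance or the curve slope distance used in item 4, so the sketch below fills them in with assumed definitions: mean absolute difference of the values, mean absolute difference of the first differences, a weighted sum of the two, and average linkage. The weight, the function names, and the sample series are hypothetical and serve only to show the general shape of such an algorithm.

```python
def value_distance(a, b):
    """Mean absolute difference between two equal-length series (assumed)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def slope_distance(a, b):
    """Mean absolute difference between the series' step-to-step slopes (assumed)."""
    slopes_a = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    slopes_b = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    return sum(abs(x - y) for x, y in zip(slopes_a, slopes_b)) / len(slopes_a)


def combined_distance(a, b, weight=0.5):
    """Weighted sum of value and slope distance; the weighting is an assumption."""
    return weight * value_distance(a, b) + (1 - weight) * slope_distance(a, b)


def agglomerative(series, n_clusters, weight=0.5):
    """Average-linkage agglomerative clustering over series indices."""
    clusters = [[i] for i in range(len(series))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(
                    combined_distance(series[p], series[q], weight)
                    for p in clusters[i]
                    for q in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair and drop the absorbed cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical condition curves: two rising, two falling.
curves = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [4.0, 3.0, 2.0, 1.0],
    [4.2, 3.1, 2.1, 1.0],
]
groups = agglomerative(curves, n_clusters=2)
```

With these curves the two rising series end up in one group and the two falling series in the other. Including a slope term means curves with similar values but opposite trends are kept apart, which is presumably the motivation for combining the two distances.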
