Research and Application of Spatial Association Rule Mining Algorithms
Abstract
Spatial data mining extracts implicit predictive information from spatial databases and identifies the most valuable knowledge to guide scientific decision-making; it has become an active area of research and application. In spatial association rule mining, methods based on spatial transactions are currently among the most widely used techniques, but constructing and pruning frequent itemsets remains one of the main difficulties in applying them to massive spatial data.
     With the rapid development of digital electric power systems, the application of spatial data mining to power systems has become a research focus. In the Grid Visual Manage System, the number of nodes and line elements searched during topology analysis is the main factor affecting the efficiency of grid analysis. Existing mining algorithms have shortcomings and cannot effectively speed up topology analysis, so effective spatial association rule mining algorithms are needed to improve the efficiency of grid analysis in this system.
     Existing transverse spatial mining algorithms improve the construction and pruning of candidate frequent itemsets, but they cannot efficiently extract single-layer transverse spatial association rules that contain a large number of spatial objects. To address this shortcoming, this work first proposes ASTMAS (An algorithm of spatial transaction mining based on alternate search), which is suited to mining associations among different spatial objects under the same spatial pattern. By changing the traditional way of constructing frequent itemsets and the search strategy of existing binary mining algorithms, ASTMAS extracts single-layer transverse spatial association rules containing any number of spatial objects from massive spatial data. The algorithm generates candidate frequent itemsets in both directions, by numeric ascending and numeric descending, and so extracts spatial association rules through alternate search. When computing support counts, it uses numeric characteristics to reduce the number of spatial transactions scanned. Simulation experiments indicate that it is more efficient than existing algorithms. Applied to the Grid Visual Manage System, the algorithm removes devices unrelated to the power supply source, reduces the number of nodes or line elements searched by topology analysis, and improves the execution efficiency of the "power supply range analysis" function; a system performance evaluation demonstrates its practicality.
     Second, because existing mining algorithms based on spatial transactions cannot efficiently extract cross-layer transverse spatial association rules, this work proposes AMSTMDA (An algorithm of multilayer spatial transaction mining based on digital ascending), a cross-layer (multilayer) algorithm suited to mining associations among different spatial objects under different spatial patterns. By improving both the technique for constructing frequent itemsets and the way spatial data is stored, AMSTMDA extracts cross-layer transverse spatial association rules from massive spatial data. The algorithm represents spatial topological relations as binary numbers to improve data storage, and generates candidate frequent itemsets by numeric ascending to mine spatial topological associations. Simulation experiments indicate that the algorithm is efficient. Applied to the Grid Visual Manage System, the algorithm removes devices unrelated to the power outage operation, reduces the number of nodes or line elements searched by topology analysis, and improves the execution efficiency of the "optimal power outage scheme analysis" function; a system performance evaluation demonstrates its practicality.
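The abstract does not give the internals of ASTMAS or AMSTMDA, so the following is only a minimal sketch of the general ideas it names, under stated assumptions: each spatial transaction is stored as a binary number (one bit per spatial object), candidate itemsets are plain integers generated alternately in ascending and descending order, and support counting skips transactions that are numerically too small to contain the candidate. The function names `encode`, `support`, and `alternate_search`, the pruning rules, and the toy data are all hypothetical illustrations, not the authors' implementation.

```python
from bisect import bisect_left


def encode(itemset, items):
    """Encode a set of item names as an integer bitmask (hypothetical helper)."""
    return sum(1 << items.index(name) for name in itemset)


def support(candidate, sorted_tx):
    """Count transactions that contain every item of `candidate`.

    Numeric shortcut (an assumption about what "numeric characteristics"
    could mean): if t & c == c then t >= c, so transactions smaller than
    the candidate are skipped without being scanned.
    """
    start = bisect_left(sorted_tx, candidate)
    return sum(1 for t in sorted_tx[start:] if t & candidate == candidate)


def alternate_search(transactions, n_items, min_support):
    """Return all frequent itemsets as bitmasks, alternating between
    ascending candidates (1, 2, 3, ...) and descending candidates
    (2**n - 1, 2**n - 2, ...)."""
    sorted_tx = sorted(transactions)
    frequent, infrequent = set(), set()
    low, high = 1, (1 << n_items) - 1
    take_low = True
    while low <= high:
        if take_low:
            c, low = low, low + 1
        else:
            c, high = high, high - 1
        take_low = not take_low
        if any(c & f == c for f in frequent):
            # c is a subset of a known frequent itemset, so it is frequent.
            frequent.add(c)
        elif any(c & g == g for g in infrequent):
            # c is a superset of a known infrequent itemset: prune it.
            continue
        elif support(c, sorted_tx) >= min_support:
            frequent.add(c)
        else:
            infrequent.add(c)
    return frequent


# Toy data: three hypothetical spatial objects and five transactions.
items = ["A", "B", "C"]
tx = [encode(s, items) for s in (["A", "B"], ["A", "B", "C"],
                                 ["A", "C"], ["B"], ["A", "B"])]
print(sorted(alternate_search(tx, len(items), min_support=2)))  # [1, 2, 3, 4, 5]
```

In this toy run the itemset {C} (bitmask 4) is accepted without scanning any transaction, because the descending side has already found {A, C} (bitmask 5) to be frequent. The sketch enumerates all 2^n - 1 candidates and is meant only to show the encoding and the two-directional search order, not to scale to massive data.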
