Research on a GIS-Platform-Based Spatial Query Language and Spatial Data Mining
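Abstract
The volume and complexity of spatial information are growing far faster than people can interpret it. Current spatial databases support data entry, modification, statistics, and querying, but they cannot effectively uncover the relationships, rules, and trends hidden behind spatial data, which leads to the situation described as "rich in spatial data, poor in spatial knowledge". It is therefore increasingly important to mine knowledge automatically from spatial databases and to find the non-explicit, implicit knowledge, spatial relationships, and other patterns they contain. Starting from the means of obtaining spatial data, namely the query language, this thesis studies two aspects of spatial data mining: spatial online analytical processing, its elementary stage, and spatial association rule analysis, its advanced stage.
     In view of the characteristics of spatial data, this thesis proposes SQDL-G, a spatial query language based on a GIS platform. By expressing spatial predicates as spatial operators and introducing a sub-query structure into query expressions, SQDL-G can readily express spatial queries with complex conditions and executes more efficiently than traditional query languages. With the query language in place, the thesis then studies the elementary stage of spatial data mining, spatial online analytical processing (SOLAP), and proposes a multi-spatial-dimension SOLAP based on a GIS platform. Through SQDL-G, the spatial information of both fact objects and their surrounding objects is brought into the spatial data cube to build main and auxiliary spatial dimensions, and a range of analysis methods is proposed around these spatial dimensions. The thesis also studies spatial association rules, a problem in the advanced stage of spatial data mining, and proposes a buffer-analysis-based method for mining spatial association rules that builds on the co-location pattern. The method avoids manual selection of a center object and is suited to analyzing objects of many spatial types.
     Overall, this thesis makes a useful exploration of spatial data mining. SQDL-G, a spatial query language with rich semantics and varied expression, supports the construction of multi-spatial-dimension data cubes and takes part in buffer-based association rule analysis, moving spatial data mining and knowledge discovery toward standardization and engineering practice.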
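The idea of a data cube with a main and an auxiliary spatial dimension can be illustrated with a small sketch. It is not the thesis's implementation, and it does not use SQDL-G, whose syntax is not given in the abstract. All names and data here (`facts`, `regions`, `stations`, `region_of`, `station_band`, the 2.0 distance threshold) are hypothetical. The main spatial dimension is assumed to be the region containing a fact object; the auxiliary spatial dimension is assumed to be a distance band relative to the nearest surrounding object.

```python
from collections import defaultdict
from math import hypot

# Hypothetical fact objects: a location and one numeric measure each.
facts = [
    {"id": "f1", "xy": (1.0, 1.0), "sales": 10},
    {"id": "f2", "xy": (2.0, 8.0), "sales": 20},
    {"id": "f3", "xy": (7.0, 2.0), "sales": 5},
    {"id": "f4", "xy": (8.0, 8.5), "sales": 40},
]
# Assumed main spatial dimension: axis-aligned regions (x0, y0, x1, y1).
regions = {"west": (0, 0, 5, 10), "east": (5, 0, 10, 10)}
# Assumed surrounding objects that define the auxiliary spatial dimension.
stations = [(1.5, 1.5), (8.0, 8.0)]


def region_of(xy):
    """Return the name of the region whose box contains the point."""
    x, y = xy
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return "outside"


def station_band(xy, near=2.0):
    """Classify a point by its distance to the nearest surrounding object."""
    d = min(hypot(xy[0] - sx, xy[1] - sy) for sx, sy in stations)
    return "near_station" if d <= near else "far_from_station"


def build_cube(fact_rows):
    """Sum the measure for every (main dimension, auxiliary dimension) cell."""
    cube = defaultdict(int)
    for f in fact_rows:
        cube[(region_of(f["xy"]), station_band(f["xy"]))] += f["sales"]
    return dict(cube)


def roll_up(cube, axis):
    """Aggregate the cube along one dimension, keeping only key[axis]."""
    out = defaultdict(int)
    for key, value in cube.items():
        out[key[axis]] += value
    return dict(out)


cube = build_cube(facts)
by_region = roll_up(cube, 0)   # keep only the main spatial dimension
by_band = roll_up(cube, 1)     # keep only the auxiliary spatial dimension
```

Materializing the spatial relation (here, the distance band) as an ordinary dimension value is what lets the usual roll-up operation work on it without recomputing geometry at query time.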
The rapid growth in the volume and complexity of spatial information far exceeds people's capacity to interpret it. Although spatial databases support data input, modification, statistics, queries, and other functions, they cannot effectively reveal the relations, rules, trends, and other features hidden behind spatial data, which results in a situation of "rich spatial data, poor spatial knowledge". Automatically mining knowledge from spatial databases to find non-explicit, implicit knowledge, spatial relations, and other patterns is therefore becoming increasingly important. This thesis starts from the spatial query language and then studies spatial online analytical processing and spatial association rules.
     According to the characteristics of spatial data, this thesis proposes a new GIS-based spatial query language, Semantic Query Description Language for Geography (SQDL-G). The language converts spatial predicates into spatial operators and introduces a sub-query structure. It can express complex spatial queries and is more efficient than traditional query languages. With the support of the spatial query language, the thesis studies the initial stage of spatial data mining, spatial online analytical processing (SOLAP). Working with SQDL-G, the SOLAP system brings fact objects and surrounding objects into the data cube to set up main spatial dimensions and auxiliary spatial dimensions. Multi-spatial-dimension SOLAP makes full use of the topological capabilities of GIS, materializes the spatial relations between objects, and enhances the analytical ability of SOLAP. The thesis also studies one direction in the advanced stage of spatial data mining, spatial association rules. It proposes a new method for discovering spatial association rules, based on the co-location pattern and buffer analysis. The method does not require manual selection of a center object and suits all kinds of spatial objects.
     Overall, this thesis makes a useful exploration of spatial data mining. The spatial query language SQDL-G, with rich semantics and varied expression, supports the construction of multi-spatial-dimension data cubes and takes part in buffer-based association rule analysis, making spatial data mining and knowledge discovery more standardized and engineering-oriented.
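The buffer-based association idea can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of the thesis, whose details the abstract does not give. Every object is treated as a buffer center, so no center object is chosen by hand. The rule A → B is scored by the fraction of A objects whose circular buffer contains at least one B object. The data, the names `objects`, `in_buffer`, and `buffer_rules`, and the parameters `radius` and `min_conf` are all hypothetical, and objects are simplified to points.

```python
from itertools import permutations
from math import hypot

# Hypothetical typed point objects: (type, (x, y)).
objects = [
    ("school", (0.0, 0.0)),
    ("school", (10.0, 0.0)),
    ("bus_stop", (0.5, 0.5)),
    ("bus_stop", (9.0, 0.0)),
    ("bus_stop", (20.0, 20.0)),
    ("park", (0.0, 1.0)),
]


def in_buffer(center, other, radius):
    """True if `other` lies inside the circular buffer around `center`."""
    return hypot(center[0] - other[0], center[1] - other[1]) <= radius


def buffer_rules(objs, radius, min_conf):
    """Return {(a, b): confidence} for rules 'near an a there is a b'.

    Confidence is the share of type-a objects with at least one type-b
    object inside their buffer. Each object serves as a buffer center.
    """
    types = sorted({t for t, _ in objs})
    rules = {}
    for a, b in permutations(types, 2):
        a_objs = [xy for t, xy in objs if t == a]
        b_objs = [xy for t, xy in objs if t == b]
        hits = sum(
            1 for p in a_objs
            if any(in_buffer(p, q, radius) for q in b_objs)
        )
        conf = hits / len(a_objs)
        if conf >= min_conf:
            rules[(a, b)] = conf
    return rules


rules = buffer_rules(objects, radius=1.5, min_conf=0.6)
for (a, b), conf in sorted(rules.items()):
    print(f"{a} -> {b}: confidence {conf:.2f}")
```

For line or polygon objects, the point-distance test in `in_buffer` would need to be replaced by a geometric buffer intersection test from a GIS library; the sketch keeps to points so that it stays self-contained.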