Spatial Data Mining Techniques and Their Application in Urban Land Grading
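Abstract
The rapid development of modern spatial data acquisition and computer network technologies has caused the spatial data held in geographic information systems (GIS) to grow explosively. These data meet the potential needs of research on the Earth's resources and environment and broaden the available sources of information, but their complexity and volume far exceed those of ordinary transactional data, and current spatial data processing methods lag behind. As a result, the rich knowledge contained in spatial data resources goes unused. Spatial data mining theory and techniques emerged to meet the growing demand for higher-level spatial information.
     Spatial data mining is a branch of data mining, and spatial cluster analysis is one of its important research areas. Spatial clustering algorithms have therefore long been an active research topic and have been studied extensively for many years, although the work has concentrated mainly on distance-based cluster analysis. After a systematic study of traditional spatial clustering algorithms, such as partitioning and hierarchical methods, this thesis concludes that these algorithms easily fall into local optima, do not preserve the global distribution of the objects during execution, and are sensitive to outliers ("isolated points"). These shortcomings severely limit the use of clustering algorithms in GIS.
     Genetic algorithms, by contrast, imitate the natural selection and evolutionary mechanisms of biological evolution and are population-based, global, stochastic optimization algorithms. They are therefore a natural candidate for solving spatial clustering problems.
     Taking the optimization of clustering results at both the local and the global level as its starting point, this thesis analyzes the respective strengths of genetic algorithms and conventional clustering methods and studies a spatial clustering method based on genetic algorithms. Theoretical analysis and simulation experiments show that the method is feasible and that, in the experiments, it met the theoretical expectations.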
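The combination of global genetic search with local clustering refinement can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose encoding, fitness function, and operators are not given in the abstract. The sketch assumes that each individual is a list of k cluster centers, that fitness is the within-cluster sum of squared errors, and that one k-means update serves as the local step. All function names and parameter values are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def sse(points, centers):
    """Within-cluster sum of squared errors; lower means a better clustering."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans_step(points, centers):
    """Local refinement: move each center to the mean of its current members."""
    labels = assign(points, centers)
    refined = []
    for j, center in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            refined.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            refined.append(center)  # an empty cluster keeps its old center
    return refined

def ga_cluster(points, k, pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Genetic search over sets of k centers, with a k-means step per generation."""
    rng = random.Random(seed)
    # Each individual is a list of k centers drawn from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    best = min(population, key=lambda ind: sse(points, ind))
    for _ in range(generations):
        # Local level: refine every individual with one k-means update.
        population = [kmeans_step(points, ind) for ind in population]
        population.sort(key=lambda ind: sse(points, ind))
        if sse(points, population[0]) < sse(points, best):
            best = population[0]
        # Global level: keep the better half and breed replacements from it.
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover on the center list
            if rng.random() < mutation_rate:
                # Mutation: replace one center with a randomly chosen data point.
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = parents + children
    return best, assign(points, best)
```

Calling `ga_cluster(points, k)` on a list of coordinate tuples returns the best set of centers found and one cluster label per point. Keeping the better half of the population unchanged means the best solution found so far is never lost between generations.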
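     Finally, because subjective human judgment currently dominates land grading, the improved spatial clustering algorithm is applied to land grading with base land price blocks as samples, and the results are fairly satisfactory. The research still has some shortcomings; the thesis discusses them and outlines directions for further work.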
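How clustering base land price blocks can produce grades may be sketched as follows. The abstract does not state which attributes of a block are clustered or how clusters are mapped to grades. The sketch assumes each block is represented only by its price, uses a plain one-dimensional k-means in place of the thesis's improved algorithm, and numbers the clusters by descending mean price so that grade 1 is the most expensive. The function name `grade_blocks` is hypothetical.

```python
def grade_blocks(prices, n_grades, iterations=50):
    """Cluster block prices and return one grade per block (1 = highest prices)."""
    ordered = sorted(prices)
    # Deterministic start: centers spread evenly across the sorted prices.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * n_grades)]
               for i in range(n_grades)]

    def nearest(price):
        # Index of the center closest to this price.
        return min(range(n_grades), key=lambda j: abs(price - centers[j]))

    for _ in range(iterations):
        labels = [nearest(p) for p in prices]
        for j in range(n_grades):
            members = [p for p, label in zip(prices, labels) if label == j]
            if members:
                centers[j] = sum(members) / len(members)

    labels = [nearest(p) for p in prices]
    # Assumed mapping: rank clusters by mean price, most expensive first.
    by_price = sorted(range(n_grades), key=lambda j: -centers[j])
    grade_of = {cluster: rank + 1 for rank, cluster in enumerate(by_price)}
    return [grade_of[label] for label in labels]
```

For example, `grade_blocks(prices, 3)` on a list of block prices returns a grade from 1 to 3 for each block, in input order. Because the grades come from the price distribution itself, no manually chosen price thresholds are needed.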
Modern spatial data acquisition technology and computer network technology are developing rapidly, causing the spatial data in GIS to expand quickly. Although these data meet the potential demand for the study of the Earth's resources and environment and broaden the sources of available information, the complexity and volume of spatial data far exceed those of ordinary transactional data. Current spatial data processing methods lag behind, so the wealth of knowledge contained in spatial data resources is left unused. Spatial data mining theory and technology came into being to meet the growing demand for high-level spatial information. Spatial data mining is a research branch of data mining, and spatial clustering analysis is an important research area within it. Spatial clustering algorithms have been a very active research topic in the field and have been widely studied for many years. However, the studies have concentrated mainly on distance-based cluster analysis. This thesis makes a systematic study of traditional spatial clustering algorithms, such as partitioning and hierarchical algorithms, and concludes that they easily fall into local optima, do not preserve the global distribution of the objects during execution, and are sensitive to outliers. These deficiencies greatly limit the application of clustering algorithms in the field of GIS.
     Meanwhile, genetic algorithms mimic the natural selection and evolutionary mechanisms of biological evolution and are population-based, global, stochastic optimization algorithms. Genetic algorithms can therefore be considered for solving the spatial clustering problem. Considering the relationship between the part and the whole in optimizing clustering results, we combine the genetic algorithm with conventional clustering to design a spatial clustering algorithm based on the genetic algorithm. Theoretical analysis and simulation tests show that the method is feasible and that it met the theoretical requirements in the tests.
     Finally, we analyze the current state of land grading and find that subjective human factors dominate it. Using the improved spatial clustering algorithm, with base land price blocks as samples, we carry out land grading and obtain fairly satisfactory results. Some issues remain to be studied, and they will be an important area of future research.
