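Research on Adaptive Spatial Clustering Methods

English title: A Methodology of Adaptive Spatial Clustering Analysis
Author: Liu Qiliang (刘启亮)
Thesis level: Master's
Discipline: Geodesy and Surveying Engineering
Chinese keywords: spatial data mining; spatial clustering; adaptive; Delaunay triangulation
English keywords: spatial data mining; spatial clustering; adaptiveness; Delaunay triangulation
Degree year: 2011
Supervisors: Deng Min (邓敏); Li Guangqiang (李光强)
Discipline code: 081601
Degree-granting institution: Central South University (中南大学)
Submission date: 2011-05-01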

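Abstract

Spatial clustering analysis is one of the main research topics in geospatial data mining and knowledge discovery. It aims to discover latent distribution patterns of spatial entities and to detect spatial anomalies, and it is widely applied in astronomy, geography, geology, meteorology, cartography, public health and many other fields. As practical applications deepen, there is an urgent need for spatial clustering methods with good adaptivity: on the one hand, they should adapt automatically to complex distributions of spatial data, such as clusters of different shapes and densities; on the other hand, they should be convenient for users, for example by requiring few parameters. To this end, this thesis studies adaptive spatial clustering analysis methods fairly systematically. The main contributions are:
     (1) Starting from the basic characteristics and properties of spatial data, the research characteristics of spatial clustering analysis are analyzed. The spatial clustering problem is then explicitly defined, and a theoretical framework for spatial clustering analysis is established around five core components: spatial data cleaning, spatial clustering tendency analysis, spatial clustering feature extraction, spatial clustering algorithm design, and spatial clustering validity evaluation. Finally, existing spatial clustering algorithms are summarized and analyzed fairly systematically, and their scope of application and performance are characterized.
     (2) A field-theory-based adaptive spatial clustering algorithm is proposed. From the perspective of spatial data fields, an aggregation field suited to spatial clustering is proposed, together with a new similarity measure for spatial clustering, the aggregation force. On this basis, a field-theory-based adaptive spatial clustering algorithm (FTASC) is proposed. The algorithm obtains the neighbors of each entity through vector computation of aggregation forces and generates a series of distinct spatial clusters through a recursive search strategy. Simulated experiments, comparison with classic algorithms, and a practical application show that the proposed algorithm can discover spatial clusters of arbitrary shape and varying density, and can cluster without parameters.
     (3) An adaptive spatial clustering algorithm based on Delaunay triangulation is proposed. Delaunay triangulation is used to describe proximity relationships among spatial entities, and, following a global-to-local strategy, tailored Delaunay edge-length constraint criteria are constructed to discover spatial aggregation structures, yielding an adaptive spatial clustering algorithm (ASCDT). Experimental analysis and comparison show that ASCDT can automatically discover complex spatial clusters and is robust to noise points. Building on ASCDT, spatial obstacles that may be present during clustering (such as rivers and mountains) are taken into account, and a spatial clustering method that considers spatial obstacles (ASCDT+) is further developed.
     (4) A hybrid spatial clustering algorithm based on graph theory and density is proposed. Combining the strengths of graph-based and density-based spatial clustering algorithms, a spatial clustering algorithm that accounts for thematic attributes (HGDSC) is proposed. Its main idea is as follows: first, the idea of graph-based spatial clustering is used to construct proximity relationships among entities for spatial datasets with complex distributions; then an improved density-based spatial clustering method performs the clustering while taking thematic attributes into account. Experimental analysis and comparison show that HGDSC not only adapts to complex spatial distributions but also accounts for the similarity of thematic attributes among entities, and requires little manual intervention.
     (5) A spatial clustering validity evaluation method based on ideas from mechanics is proposed. First, representative spatial clustering validity evaluation methods are summarized. Then, drawing on mechanics in physics and combined with basic laws of geography, a mechanics-based spatial clustering validity index (SCV) is proposed. Experimental comparison shows that this index evaluates clustering results for two-dimensional geospatial data more accurately and efficiently.
     (6) A visual spatial clustering analysis software prototype with independent intellectual property rights, EasyCluster V1.0, was developed. Its main functional modules include spatial data cleaning, acquisition of prior information for spatial clustering, 21 classic and improved spatial clustering algorithms, and 3 spatial clustering validity methods (with 35 selectable spatial clustering validity functions).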
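
Item (3) describes modeling proximity with Delaunay edges and applying edge-length constraints to expose clusters. The following is a minimal sketch of that general idea only, not the ASCDT algorithm itself: the abstract does not give the thesis's global-to-local criteria, so a single global cutoff of mean plus `k` standard deviations of edge length is assumed here for illustration. The function names, the parameter `k`, and the brute-force triangulation are all hypothetical choices; in practice a library routine such as `scipy.spatial.Delaunay` would replace the O(n^4) helper.

```python
from itertools import combinations
from math import hypot
from statistics import mean, pstdev


def delaunay_edges(points):
    """Brute-force Delaunay edges for small demos (O(n^4)).

    A triangle is kept when no other point lies strictly inside its
    circumcircle (the empty-circumcircle property).
    """
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        (ax, ay), (bx, by), (cx, cy) = points[i], points[j], points[k]
        d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
        if abs(d) < 1e-12:  # collinear triple: no circumcircle
            continue
        a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
        ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
        uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
        r = hypot(ax - ux, ay - uy)
        others = (p for m, p in enumerate(points) if m not in (i, j, k))
        if all(hypot(px - ux, py - uy) >= r - 1e-9 for px, py in others):
            edges.update({(i, j), (i, k), (j, k)})
    return edges


def cluster_by_edge_length(points, k=1.0):
    """Drop Delaunay edges longer than mean + k * std, then label the
    connected components that remain (assumed, simplified criterion)."""
    edges = delaunay_edges(points)
    length = {
        (i, j): hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        for i, j in edges
    }
    lengths = list(length.values())
    cutoff = mean(lengths) + k * pstdev(lengths)

    parent = list(range(len(points)))  # union-find over point indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), dist in length.items():
        if dist <= cutoff:
            parent[find(i)] = find(j)

    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(points))]


if __name__ == "__main__":
    demo = [(0, 0), (1, 0), (0, 1), (1, 1.2), (0.4, 0.6),
            (10, 10), (11, 10), (10, 11), (11, 11.2), (10.4, 10.6)]
    print(cluster_by_edge_length(demo))
```

A single global cutoff like this cannot separate clusters of very different densities, which is presumably why the thesis moves from global to local criteria.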

English abstract

Spatial clustering plays an important role in spatial data mining and knowledge discovery. It aims to partition a spatial database into several clusters without any prior knowledge (e.g., the probability distribution or the number of clusters). Spatial clustering has a wide range of applications in fields such as astronomy, geography, geology, meteorology, cartography and public health. Currently, applications involving complicated spatial databases place a new demand on spatial clustering algorithms: adaptiveness. First, spatial clustering algorithms should be adaptive to complicated spatial databases, in which clusters may be adjacent to each other, may have arbitrary geometrical shapes and/or different densities, and may be accompanied by a large amount of noise. Second, spatial clustering algorithms should be adaptive to the requirements of users, such as different kinds of applications and minimal prior knowledge for determining the input parameters. On that account, a methodology of adaptive spatial clustering analysis is developed in this thesis. The primary contents of the thesis can be summarized as follows:
     (1) The special characteristics of spatial clustering are first analyzed based on the features and properties of spatial data. Then, a detailed definition of spatial clustering is given, and a framework for spatial clustering is proposed that comprises spatial data cleaning, spatial clustering trend analysis, spatial clustering feature extraction, spatial clustering algorithm design and spatial clustering validity assessment. Finally, current spatial clustering algorithms are reviewed and compared.
     (2) A field-theory-based adaptive spatial clustering algorithm, FTASC, is proposed. A novel data field for spatial clustering, called the aggregation field, is first developed. Then a novel concept, the aggregation force, is used to measure the degree of aggregation among entities. The FTASC algorithm does not require any input parameters, and a series of iterative strategies are implemented to obtain different clusters according to various spatial distributions. Two experiments are designed to illustrate the advantages of the FTASC algorithm. The practical experiment indicates that FTASC can effectively discover local aggregation patterns. The comparative experiment further demonstrates that FTASC outperforms the classic DBSCAN algorithm.
     (3) An adaptive spatial clustering algorithm based on Delaunay triangulation, ASCDT, is proposed. The ASCDT algorithm employs both statistical features of the edges of the Delaunay triangulation and a novel definition of spatial proximity based upon Delaunay triangulation to detect spatial clusters. By default, the algorithm can automatically discover clusters of complicated shapes and non-homogeneous densities in a spatial database, without the need to set parameters or supply prior knowledge. The user can also modify the parameter to suit special applications. In addition, the algorithm is robust to noise. Experiments on both simulated and real-world spatial databases demonstrate the effectiveness and advantages of the ASCDT algorithm. Based on the ASCDT algorithm, a novel adaptive spatial clustering algorithm that considers spatial obstacles, ASCDT+, is further developed.
     (4) A graph- and density-based hybrid spatial clustering algorithm, HGDSC, is proposed. First, Delaunay triangulation with edge-length constraints is used to model the spatial proximity relationships among spatial entities. Then, a modified density-based clustering strategy is used to identify the spatial clusters. The algorithm has two main desirable properties. First, both spatial and non-spatial attributes are taken into account, so entities in the same cluster are similar in both the spatial and non-spatial domains. Second, the algorithm can adapt to a complex spatial database that may contain clusters of arbitrary shapes and/or non-homogeneous densities and/or a large amount of noise. Experiments on both synthetic and real-world spatial datasets demonstrate the effectiveness and practicability of the HGDSC algorithm.
     (5) A spatial clustering validity index based on gravitational theory, SCV, is proposed. The construction principle of spatial clustering validity functions is first investigated. Then, as in the FTASC algorithm, the aggregation force is used to describe the spatial clustering problem, and a novel validity index for two-dimensional spatial hard clustering is developed. Experiments on both simulated and real-world data sets show that the index developed in this thesis can properly evaluate spatial hard clustering schemes that include both arbitrarily shaped clusters and outliers.
     (6) A spatial clustering software package named EasyCluster is developed. It provides four main groups of functions: spatial data cleaning, spatial clustering information extraction, 21 spatial clustering algorithms, and 3 spatial clustering validity indices (35 spatial clustering validity functions).
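
Item (4) combines a proximity graph with a density-style expansion that also checks non-spatial attributes. The sketch below illustrates that combination under stated assumptions and should not be read as the HGDSC algorithm: it takes an already-built neighbor graph (for example, Delaunay edges that survived a length constraint) and grows clusters DBSCAN-style, counting only neighbors whose attribute value is close. The names `graph_density_clusters`, `max_attr_diff` and `min_similar` are hypothetical, and a single numeric attribute per entity is assumed.

```python
from collections import deque


def graph_density_clusters(neighbors, attr, max_attr_diff, min_similar=2):
    """Grow clusters over a neighbor graph using attribute similarity.

    neighbors: dict mapping entity -> iterable of spatially adjacent entities
    attr:      dict mapping entity -> numeric thematic attribute
    An entity is treated as a core when at least `min_similar` of its graph
    neighbors differ from it by at most `max_attr_diff` (assumed criterion).
    Returns a dict entity -> cluster id, with -1 for unassigned entities.
    """

    def similar(u):
        return [v for v in neighbors.get(u, ())
                if abs(attr[u] - attr[v]) <= max_attr_diff]

    labels = {u: -1 for u in attr}
    cluster_id = 0
    for seed in attr:
        if labels[seed] != -1 or len(similar(seed)) < min_similar:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            u = queue.popleft()
            close = similar(u)
            if len(close) < min_similar:
                continue  # border entity: it joins but does not expand
            for v in close:
                if labels[v] == -1:
                    labels[v] = cluster_id
                    queue.append(v)
        cluster_id += 1
    return labels


if __name__ == "__main__":
    # Two spatially touching triangles with very different attribute values,
    # plus one entity (6) whose attribute matches none of its neighbors.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4, 6], 6: [5]}
    values = {0: 10.0, 1: 10.5, 2: 11.0, 3: 50.0, 4: 51.0, 5: 50.5, 6: 90.0}
    print(graph_density_clusters(graph, values, max_attr_diff=2.0))
```

In the demo, entities 2 and 3 are spatial neighbors but end up in different clusters because their attribute values differ, which is the behavior the abstract attributes to taking both domains into account.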

