Abstract
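From the perspective of the spatial data field, a new measure of spatial outlier degree is developed by drawing on the Gaussian potential function. A field-theory-based spatial outlier detection method is then proposed. The method uses spatial clustering to obtain spatial clusters with strong local correlation and constructs reasonable, stable spatial neighborhoods. On this basis, a repair strategy based on the gradient of thematic attribute change is applied to weaken the influence of potential outliers within each spatial neighborhood, and the outlier degree of each entity is computed with the proposed measure in order to detect spatial outliers. Experimental results and a case study demonstrate the correctness of the method.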
Spatial outlier detection is one of the major data mining methods. Detecting outliers contributes to the discovery of implicit knowledge, significant changes, surprising patterns, and meaningful insights. In geography, a spatial outlier is an object whose non-spatial attribute value differs significantly from the values of its spatial neighbors. Most current spatial outlier detection methods assume that all the objects under consideration are correlated. In fact, spatial correlation decreases as distance increases. In addition, an object may be wrongly identified as a spatial outlier when its spatial neighborhood contains several real outliers. From the viewpoint of the spatial data field, a Gaussian-like potential function is used to measure the spatial outlier degree, and a field-theory-based spatial outlier detection algorithm is proposed. First, spatial clustering is employed to extract local autocorrelation patterns, called clusters. Then, the clusters are used to construct reasonable and stable spatial neighborhoods by means of constrained Delaunay triangulation. Finally, a robust spatial outlier measure is proposed to determine the spatial outliers in each cluster. Experimental results show that the proposed method is effective for detecting spatial outliers in spatial point datasets.
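The idea of weighting attribute differences by a Gaussian-like potential can be illustrated with a minimal sketch. It is not the measure defined in the paper: the function names, the parameters `k` and `sigma`, and the use of k nearest neighbors in place of cluster-constrained Delaunay neighborhoods are all assumptions made for illustration, and the clustering and gradient repair steps are omitted.

```python
import math


def gaussian_weight(distance, sigma):
    """Gaussian-like potential: influence decays quickly with distance."""
    return math.exp(-(distance / sigma) ** 2)


def outlier_scores(points, values, k=4, sigma=1.0):
    """Score each point by the Gaussian-weighted mean absolute difference
    between its attribute value and those of its k nearest neighbors.

    Hypothetical stand-in: k nearest neighbors replace the Delaunay-based
    neighborhoods described in the abstract.
    """
    scores = []
    for i, (xi, yi) in enumerate(points):
        # Distances from point i to every other point, nearest first.
        nearest = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )[:k]
        weights = [gaussian_weight(d, sigma) for d, _ in nearest]
        diffs = [abs(values[i] - values[j]) for _, j in nearest]
        total = sum(weights)
        scores.append(sum(w * d for w, d in zip(weights, diffs)) / total)
    return scores


# Toy data: a 5 x 5 grid with a constant attribute and one planted anomaly.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
values = [10.0] * len(points)
values[12] = 100.0  # grid centre (2, 2)

scores = outlier_scores(points, values)
top = max(range(len(scores)), key=scores.__getitem__)
print(top, scores[top])
```

In this toy grid the planted anomaly receives the highest score, but its four direct neighbors also receive nonzero scores because the anomaly sits inside their neighborhoods. This is the contamination effect that the repair strategy in the abstract is meant to weaken.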