Co-location Pattern Mining of Fuzzy Objects over a Range of Fuzziness Thresholds
Abstract
A spatial co-location pattern represents a group of spatial objects whose instances are frequently associated in space. Spatial co-location pattern mining is an important research direction in spatial data mining and has a wide range of real-world applications. The problem has been studied extensively on certain and uncertain data, with many results, but almost no work has addressed fuzzy data. Fuzzy data arise in many fields, such as GIS and biomedical image databases. This thesis studies co-location pattern mining of fuzzy objects over a range of fuzziness thresholds.
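To make "frequently associated" concrete, the following is a minimal sketch of the participation index, the prevalence measure commonly used in the co-location literature. The abstract does not say which measure the thesis uses, so this is an assumption; the function name, the toy data, and the distance threshold `d` are hypothetical, and only size-2 patterns over crisp points are handled.

```python
from itertools import product
from math import dist


def participation_index(instances, feature_a, feature_b, d):
    """Participation index of the size-2 pattern {feature_a, feature_b}.

    instances: dict mapping a feature name to a list of (x, y) points.
    d: neighbor distance threshold (Euclidean).
    """
    a_pts, b_pts = instances[feature_a], instances[feature_b]
    # Row instances: one instance of each feature, within distance d of each other.
    rows = [
        (i, j)
        for (i, p), (j, q) in product(enumerate(a_pts), enumerate(b_pts))
        if dist(p, q) <= d
    ]
    # Participation ratio: share of a feature's instances that occur in some row.
    pr_a = len({i for i, _ in rows}) / len(a_pts)
    pr_b = len({j for _, j in rows}) / len(b_pts)
    # The index is the smallest participation ratio over the pattern's features.
    return min(pr_a, pr_b)


# Hypothetical toy data: three instances of A, two of B.
data = {
    "A": [(0, 0), (5, 5), (10, 10)],
    "B": [(0, 1), (5, 6)],
}
pi = participation_index(data, "A", "B", d=1.5)
```

A pattern is reported as prevalent when this value reaches a user-chosen minimum; in the toy data two of the three A instances have a B neighbor, so A's ratio is the limiting one.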
     First, the concepts, properties, related work, and algorithms of spatial co-location pattern mining are introduced.
     Second, spatial co-location pattern mining for fuzzy objects is studied. The relevant definitions and algorithms are proposed: a basic algorithm (FB) and four improved algorithms that prune fuzzy objects, reduce joins between instances, improve the pruning step, and compute distances on a grid. The running time and mining results of the algorithms are analyzed experimentally.
     Third, because co-location pattern mining for fuzzy objects can only be carried out at a single fuzziness threshold, the thesis studies mining over a range of fuzziness thresholds. The relevant definitions are given and a basic algorithm is built on them. To improve its efficiency, two improved algorithms are proposed, one that reduces the number of mining passes and one that narrows the mining scope, and experiments compare the running time of the basic and improved algorithms.
     Fourth, fuzzy-object co-location pattern mining is applied to the Three Parallel Rivers project.
     Finally, the thesis is summarized and future work is discussed.
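The abstract does not give the thesis's definitions, so the following is only a sketch of one common formulation from the fuzzy-object literature. It assumes a fuzzy object is a set of points with membership degrees, that its cut at threshold alpha keeps the points with membership at least alpha, and that the distance between two objects at alpha is the smallest distance between their cuts. All names and data are hypothetical. Under these assumptions a cut can only grow as alpha falls, so a pair that is within distance `d` at one threshold is within `d` at every lower threshold. This shows one way that results could be reused across a threshold range instead of mining each threshold from scratch.

```python
from math import dist


def alpha_cut(obj, alpha):
    """Points of a fuzzy object whose membership is at least alpha."""
    return [p for p, mu in obj if mu >= alpha]


def alpha_distance(obj_a, obj_b, alpha):
    """Smallest Euclidean distance between the two alpha-cuts (inf if a cut is empty)."""
    cut_a, cut_b = alpha_cut(obj_a, alpha), alpha_cut(obj_b, alpha)
    return min((dist(p, q) for p in cut_a for q in cut_b), default=float("inf"))


def neighbor_thresholds(obj_a, obj_b, d, alphas):
    """Thresholds in alphas at which the two objects are within distance d.

    Thresholds are tested from high to low. Once the pair is a neighbor at
    one threshold, every lower threshold is accepted without recomputation.
    """
    ordered = sorted(alphas, reverse=True)
    for k, alpha in enumerate(ordered):
        if alpha_distance(obj_a, obj_b, alpha) <= d:
            return ordered[k:]
    return []


# Hypothetical fuzzy objects: lists of ((x, y), membership) pairs.
obj_1 = [((0, 0), 1.0), ((1, 0), 0.6), ((2, 0), 0.3)]
obj_2 = [((5, 0), 1.0), ((4, 0), 0.7), ((3, 0), 0.2)]
result = neighbor_thresholds(obj_1, obj_2, d=3.5, alphas=[0.2, 0.5, 0.8])
```

Here the pair is too far apart at threshold 0.8, becomes a neighbor at 0.5, and is therefore accepted at 0.2 without a further distance computation.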