Research on Outlier Detection and Its Optimization Algorithms
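Abstract
In a dataset, outliers are data patterns that are abnormally isolated from the bulk of regular data. Outliers are often treated as noise and discarded, yet in practice some of the data carrying important information turn out to be outliers. Outlier detection uses statistics, machine learning, intelligent computing, visualization, and other techniques to find the outliers in a dataset so that users can analyze and handle them.
     Because outliers may contain important knowledge, outlier detection is widely applied in areas such as preventing telecommunications and credit card fraud, medical insurance, market analysis, and weather forecasting, so the related research has both academic and practical significance. However, with large high-dimensional datasets growing ever more complex, discovering and handling abnormal behavior quickly and effectively remains a challenging problem.
     This work applies clustering and classification methods to find abnormal objects in datasets and studies optimization algorithms related to outlier detection. We propose outlier detection methods based on spectral clustering and on RBF artificial neural networks, define the concept of a key outlying attribute subset for high-dimensional datasets, and implement attribute reduction to optimize outlier detection. The main work and results are as follows:
     ① The basic principles and typical algorithms of spectral clustering are analyzed and studied fairly comprehensively, and the properties of spectral clustering are used to cluster complex datasets. An improved random-walk-based spectral clustering algorithm is proposed; it introduces a density-sensitive distance measure to compute the similarity between objects more precisely, and it determines the optimal number of clusters automatically by computing the relevant eigenvalues of the stochastic matrix. The stable clustering obtained with this algorithm is a prerequisite for effective outlier detection.
     ② Spectral clustering is applied to outlier detection for the first time, and its feasibility is proved by defining an extended multicut and piecewise constant eigenvectors. A spectral-clustering-based outlier detection algorithm is proposed: it first clusters the dataset, then computes the outlier factor of the objects in every cluster and identifies outliers from these values. During spectral clustering, the adjacency matrix is built from shared neighbors, which yields a fairly sparse matrix whose eigenvectors can be computed quickly with the Lanczos algorithm.
     ③ An outlier detection model is built with an RBF artificial neural network. The model uses subtractive clustering to select the hidden node centers effectively and to train faster. During training, an adjustment term is added to the conventional error function to eliminate fluctuation of the hidden-layer nodes. An outlier degree is defined for every input sample; once the network outputs are determined, samples whose actual output deviates severely from the expected output can be identified as outliers by their outlier degree.
     ④ To address the low efficiency of finding outliers in large high-dimensional datasets, we introduce concepts from rough set theory and propose an outlier detection method based on attribute reduction. If the outlier partition obtained on some attribute subset is sufficiently similar to the one obtained on the full attribute set, outlier detection on such a dataset can be carried out directly on that subset (the key outlying attribute subset). In addition, an efficient method for finding key outlying attribute subsets is proposed, and experiments verify its effectiveness.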
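The subtractive clustering step mentioned in item ③ can be sketched as follows. This is a minimal illustration rather than the model described in the abstract: the function name `subtractive_centers`, the radii `r_a` and `r_b`, the single `stop_ratio` threshold (a simplification of the usual accept/reject rule), and the sample points are all assumptions made for this example.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def subtractive_centers(points, r_a=1.0, r_b=None, stop_ratio=0.5):
    """Pick candidate RBF centers by subtractive clustering.

    Assumption: a single stopping threshold is used; selection stops when
    the highest remaining potential falls below stop_ratio times the
    first peak.
    """
    if r_b is None:
        r_b = 1.5 * r_a          # common choice: penalty radius larger than r_a
    alpha = 4.0 / r_a ** 2
    beta = 4.0 / r_b ** 2

    # Potential of each point: high when many points lie close to it.
    potential = [
        sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[best]
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Subtract the influence of the new center from every potential.
        potential = [
            p - peak * math.exp(-beta * sq_dist(x, points[best]))
            for p, x in zip(potential, points)
        ]
    return centers

# Hypothetical sample: two dense groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
centers = subtractive_centers(points)
print(centers)
```

With these assumed settings, one center is picked from each dense group, and the isolated point never accumulates enough potential to become a center.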
An outlier in a dataset is an observation or data pattern that is considerably dissimilar to or inconsistent with the remainder of the data. In most cases, outliers are discarded as noise. In some real-life applications, however, the objects that carry important information are precisely the outliers. Outlier detection aims to find the outliers in a dataset by using statistics, machine learning, intelligent computing, visualization, and other techniques, so that they can be analyzed and studied further.
     Since rare events may contain important knowledge, outlier detection has many useful applications, such as defending against telecommunications and credit card fraud, medical insurance, market analysis, and weather forecasting. The study of outlier detection is therefore significant for both research and practice. How to find and deal with abnormal behavior in large high-dimensional datasets efficiently and effectively is a challenging problem.
     In this work we focus on finding abnormal objects in datasets with clustering and classification methods, and on the implementation and optimization of key techniques for outlier detection. We propose outlier detection methods based on spectral clustering and on RBF neural networks, and we implement attribute reduction with rough sets to speed up the search for outliers. The main results are outlined as follows:
     ① The basic theory and classical algorithms of spectral clustering are analyzed and studied thoroughly. Spectral methods make clustering on complex datasets possible. An improved algorithm based on random walks is proposed, which introduces a density-sensitive distance metric to calculate the similarity between objects more accurately and automatically selects the optimal number of clusters according to the eigenvalues of the stochastic matrix. The stable clustering obtained with this algorithm is the premise of effective outlier detection.
     ② Spectral clustering is applied to outlier detection for the first time, and its feasibility is proved by defining an extended multicut and piecewise constant eigenvectors. An outlier detection algorithm based on spectral clustering is proposed, which first partitions the dataset, then calculates the outlying factor of the objects in each cluster and identifies the outliers according to these values. In the spectral clustering process, a sparse matrix is obtained by building the adjacency matrix from shared neighbors, and its leading eigenvectors can be computed quickly by the Lanczos method.
     ③ An outlier detection model is constructed with an RBF neural network, which uses the subtractive clustering algorithm to select the hidden node centers and thereby achieves faster training. In the network training process, a regularization term is added to the traditional error function to minimize the variances of the nodes in the hidden layer. By defining a degree of outlierness for each input sample, we can, once the network output is determined, identify as outliers the samples whose actual output deviates seriously from the expected output.
     ④ To address the inefficiency of finding outliers in large high-dimensional datasets, an attribute-reduction-based detection method is proposed by introducing concepts from rough set theory. By defining outlying partition similarity, we can mine outliers on the key outlying attribute subset rather than on the full attribute set, as long as the outlying partitions produced by the two are similar enough. An efficient method for finding the key outlying attribute subset is proposed, and experimental results confirm its effectiveness.
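The idea in item ④ of comparing the outlier partition on an attribute subset with the one on the full attribute set can be illustrated with a small sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: outliers are taken as the points with the largest k-th-nearest-neighbour distance, partition similarity is measured with the Jaccard index, the search is a plain brute-force scan over subsets (not the efficient method the work proposes), and the function names, threshold, and data are hypothetical.

```python
import math
from itertools import combinations

def knn_scores(data, attrs, k=2):
    """Distance to the k-th nearest neighbour, using only columns in attrs."""
    scores = []
    for i, p in enumerate(data):
        dists = sorted(
            math.sqrt(sum((p[a] - q[a]) ** 2 for a in attrs))
            for j, q in enumerate(data) if j != i
        )
        scores.append(dists[k - 1])
    return scores

def outlier_set(data, attrs, n_outliers, k=2):
    """Indices of the n_outliers points with the largest k-NN distance."""
    scores = knn_scores(data, attrs, k)
    ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_outliers])

def partition_similarity(a, b):
    """Jaccard index of two outlier sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def smallest_key_subset(data, n_outliers, threshold=0.9):
    """Brute-force search for a smallest subset that reproduces the
    full-attribute outlier set with similarity >= threshold."""
    all_attrs = tuple(range(len(data[0])))
    reference = outlier_set(data, all_attrs, n_outliers)
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            found = outlier_set(data, attrs, n_outliers)
            if partition_similarity(found, reference) >= threshold:
                return attrs
    return all_attrs

# Hypothetical data: rows 5 and 6 are outliers in attributes 0 and 1.
data = [
    (1.0, 1.0, 3.0, 0.2),
    (1.1, 0.9, 0.4, 3.0),
    (0.9, 1.1, 0.5, 0.3),
    (1.0, 1.2, 0.6, 0.2),
    (1.2, 1.0, 0.4, 0.2),
    (6.0, 1.0, 0.5, 0.3),
    (1.0, -5.0, 0.6, 0.2),
]
full = outlier_set(data, (0, 1, 2, 3), n_outliers=2)
sim_01 = partition_similarity(outlier_set(data, (0, 1), 2), full)
sim_23 = partition_similarity(outlier_set(data, (2, 3), 2), full)
key = smallest_key_subset(data, n_outliers=2)
print(full, sim_01, sim_23, key)
```

In this toy setting, attributes 0 and 1 together reproduce the full-attribute outlier set, while attributes 2 and 3 do not, so detection could run on the two-attribute subset alone. The brute-force scan is exponential in the number of attributes and is only meant to make the similarity criterion concrete.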
引文
[1] Mehmed Kantardzic著,闪四清等译.数据挖掘-概念、模型、方法和算法[M].北京:清华大学出版社, 2003.
    [2] Z.A. Bakar, R. Mohemad and A. Ahmad. A Comparative Study for Outlier Detection Techniques in Data Mining [C]. IEEE Conference on Cybernetics and Intelligent Systems, 2006, 1-6.
    [3] D. J. Miller and J. Browning. A Mixture Model and EM-based Algorithm for Class Discovery, Robust Classification, and Outlier Rejection in Mixed Labeled Unlabeled Data Sets [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(11): 1468-1483.
    [4] Zhi Li, Hong Ma and Yongdao Zhou. A Unifying Method for Outlier and Change Detection from Data Streams [C]. International Conference on Computational Intelligence and Security, 2006, 580-585.
    [5] A. Schaum. Advanced Methods of Multivariate Anomaly Detection [C]. IEEE Aerospace Conference, 2007, 1-7.
    [6] M. M. Breunig, H. Kriegel and R. T. Ng. OPTICS-OF Identifying Local Outliers [C]. Proceedings of the 3rd European Conference on Principles and Practice of Knowledge Discovery in Databases, 1999, 262-270.
    [7] Y. Pei, O. R. Zaiane and Y. Gao. An Efficient Reference-Based Approach to Outlier Detection in Large Datasets [C]. Proceedings of the Sixth International Conference on Data Mining, 2006, 478-487.
    [8] S. Goldenstein and C. Vogler. When Occlusions are Outliers [C]. Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop, 2006, 89-98.
    [9] A. Pawling, N. V. Chawla and G. Madey. Anomaly Detection in a Mobile Communication Network [J]. Comput Math Organ Theory, 2007, 13(4): 407-422.
    [10] D. Wang, D. S. Yeung and C. C. Tsang. Structured One-Class [J]. IEEE Transactions on Systems, Man, and Cybernetics, 2006, 36(6): 1283-1295.
    [11] J. A. Ting, A. D. Souza and S. Schaal. Automatic Outlier Detection_A Bayesian Approach [C]. IEEE International Conference on Robotics and Automation, 2007, 2489-2494.
    [12] D. Birant, A. Kut. Spatio-temporal outlier detection in large databases [C]. 28th Int. Conf. Information Technology Interfaces, 2006, 179-184.
    [13] X. Song, M. Wu and C. Jermaine. Conditional Anomaly Detection [J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(5): 631-645.
    [14] Z. Wang, J. Li and H. Yu. Research of Spatial Outlier Detection Based on Quantitative Value of Attributive Correlation [C]. Proceedings of the 6th World Congress on Intelligent Control and Automation, 2006, 5906-5910.
    [15] M. Breitenbach and G. Z. Grudic. Clustering Through Ranking On Manifolds [C]. Proceedings of the 22 nd International Conference on Machine Learning, 2005, 73-80.
    [16] H. Xiong, G. Pandey and M. Steinbach. Enhancing Data Analysis with Noise Removal [J]. IEEE Transactions on Knowledge and Data Engineering, 2006, 18(3): 304-319.
    [17] A. Ghoting, M. E. Otey and S. Parthasarathy. LOADED: link-based outlier and anomaly detection in evolving data sets [C]. Proceedings of the Fourth IEEE International Conference on Data Mining, 2004, 387-390.
    [18] A.H. Pilevar and M. Sukumar. GCHL: A grid-clustering algorithm for high-dimensional very large spatial data bases [J]. Pattern Recognition Letters, 26(7): 999-1010.
    [19] R. S. Menjoge and . E. Welsch. A diagnostic method for simultaneous feature selection and outlier identification in linear regression [J]. Computational Statistics & Data Analysis, 2010, 54(12): 3181-3193.
    [20] Z. Q. Wang, S. K. Wang and T. Hong. A spatial outlier detection algorithm based multi-attributive correlation [C]. Proceedings of the Third International Conference on Machine Learning and Cybernetics, 2004, 1727-1732.
    [21] C. C. Aggarwal and P. S. Yu. An effective and efficient algorithm for high-dimensional outlier detection [J]. The VLDB Journal, 2005, 14(2): 211-221.
    [22] T. Zhu. An Outlier Detection Model Based on Cross Datasets Comparison for Financial Surveillance [C]. Proceedings of the IEEE Asia-Pacific Conference on Services Computing, 2006, 601-604.
    [23] N. Iyer and P. P. Bonissone. Automated Risk Classification and Outlier Detection [C]. Proceedings of the IEEE Symposium on Computational Intelligence in Multicriteria Decision Making, 2007, 272-279.
    [24] R. Fransens, C. Strecha and L. V. Gool. Robust Estimation in the Presence of Spatially Coherent Outliers [C]. Proceedings of the Conference on Computer Vision and Pattern Recognition Workshop, 2006, 102-110.
    [25] Tarem Ahmed and Mark Coates. Multivariate Online Anomaly Detection Using Kernel Recursive Least Squares [C]. 26th IEEE International Conference on Computer Communications, 2007, 625-633.
    [26] T.D. Nguyen and R. Welsch. Outlier detection and least trimmed squares approximation using semi-definite programming [J]. Computational Statistics & Data Analysis, 2010, 54(12): 3212-3226.
    [27] N. R. Adam, V. P. Janeja and V. Atluri. Neighborhood based detection of anomalies in high dimensional spatio-temporal of Sensor Datasets [C]. ACM Symposium on Applied Computing, 2004, 586-583.
    [28] K. Lekadir, R. Merrifield and G. Z. Yang. Outlier Detection and Handling for Robust 3-D Active Shape Models Search [J]. IEEE Transactions on Medical Imaging, 2007, 26(2): 212-222.
    [29] M. Wu and C. Jermaine. Outlier Detection by Sampling with Accuracy Guarantees [C]. Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, 2006, 767-772.
    [30] C. Strecha, R. Fransens and L. V. Gool. Combined Depth and Outlier Estimation in Multi-View Stereo [C]. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006, 2394-2401.
    [31] G. Kollios, D. Gunopulos and N. Koudas. Efficient biased sampling for approximate clustering and outlier detection in large data sets [J]. IEEE Transactions on Knowledge and Data Engineering, 2003, 15(5): 1170-1187.
    [32] P. D. Urso. Fuzzy Clustering for Data Time Arrays With Inlier and Outlier Time Trajectories [J]. IEEE Transactions on Fuzzy Systems, 2005, 13(5): 583-604.
    [33] M. Victor, W. Zibetti and J. Mayer. Outlier Robust and Edge-Preserving Simultaneous Super-Resolution [C]. IEEE International Conference on Image Processing, 2006, 1741-1744.
    [34] M. Novotny and H. Hauser. Outlier-Preserving Focus+Context Visualization in Parallel Coordinates [J]. IEEE Transactions on Visualization and Computer graphics, 2006, 12(5): 893-900.
    [35] S. Kim and S.Cho. Prototype based outlier detection [C]. International Joint Conference on Neural Networks, 2006, 820-826.
    [36] S. Y. JIANG and Q. H. LI. GLOF: a new approach for mining local outlier [C]. Proceedings of the Second International Conference on Machine Learning and Cybernetics, 2003, 157-162.
    [37]周晓云,孙志挥等.高维类别属性数据流离群点快速检测算法.软件学报, 2007, 18(4): 933-942.
    [38]孙焕良,鲍玉斌,于戈.一种基于划分的孤立点检测算法.软件学报, 2006, 17(5): 1009-1016.
    [39]薛安荣,鞠时光,何伟华等.局部离群点挖掘算法研究.计算机学报, 2007, 30(8): 1455-1463.
    [40] Y. F. Jin and Q. S. Zhu. An Exceptional Reduction Algorithm for Outliers Analyzing in High-Dimension Space. Proc. of 6th World Congress on Intelligent Control and Automation, 2006, 5911-5914.
    [41] B. Mohar. The Laplacian spectrum of graphs [J]. Graph Theory, Combinatorics, and Applications, 1988: 871-898.
    [42] B. Mohar. Some applications of Laplace eigenvalues of graphs [J]. Graph Symmetry: Algebraic Methods and Applications, 1997: 225-275.
    [43] F. Chung. Spectral Graph Theory [M]. Providence: AMS and CBMS, 1997.
    [44] D. Verma and M. Meila. A Comparison of Spectral Clustering Algorithms. UW CSE. Technical report 03-05-01, 2003.
    [45] J. Shi and J. Malik. Normalized cuts and image segmentation [J]. IEEE Trans. Pattern Anal. Mach. Intell.,2000, 22(8): 888-905.
    [46] D. Wagner and F. Wagner. Between min cut and graph bisection [C]. Proceedings of the 18th International Symposium on Mathematical Foundations of Computer Science (MFCS), 1993, 744-750.
    [47] C. Ding, X. F. He and H. Y. Zha. A min-max cut algorithm for graph partitioning and data clustering [C] The IEEE International Conference on Data mining, 2001, 107-114.
    [48] A. Y. Ng, M. I. Jordan and Y. Weiss. On spectral clustering: Analysis and an algorithm [C]. Advances in Neural Information Processing Systems, 2002, 849-856.
    [49] M. Meila and J. Shi. A random walks view of spectral segmentation [C]. Proceedings of the 8th International Workshop on Artificial Intelligence and Statistics (AISTATS), 2001, 3-7.
    [50] M. Meila. The multicut lemma. UW Statistics Technical Report 417, 2001.
    [51] D. Zhou, O. Bousquet, and T. N. Lal. Learning with local and global consistency [C]. Advances in Neural Information Processing Systems 16, 2004, 32l-328.
    [52] M. Belkin, I. Matveeva and P. Niyogi. Regularization and semi-supervised learning on large graphs [C]. Proceedings of COLT, 2004, 624-638.
    [53] V. Vapnik. Statistical Learning Theory [M]. New York: John Wiley & Sons Press, 1998.
    [54]王玲,薄列峰,焦李成.密度敏感的谱聚类[J].电子学报, 2007, 35(8): 1577-1581.
    [55] C. H. Wang.“Recognition of semiconductor defect patterns using spatial filtering and spectral clustering [J]. Expert Systems with Applications, 2008, 34(3): 1914-1923.
    [56] A. Azran and Z. Ghahramani. Spectral Methods for Automatic Multiscale Data Clustering [C]. Proceedings of the IEEE Computer Society Conference on Computer Vision and PatternRecognition, 2006, 190-197.
    [57] U. V. Luxburg. A tutorial on spectral clustering [J]. Stat Comput, 2007, 17(4): 395-416.
    [58] K. Wagstaff and C. Cardie. Clustering with Instance-level Constraints [C] Proceedings of Seventeenth International Conference on Machine Learning, 2000.
    [59] M. Meila. Comparing Clusterings: An Axiomatic View [C]. Proceedings of the 22nd International Conference on Machine Learning, 2005, 577-584.
    [60] D. R. Wilson and T. R. Martinez. Improved Heterogeneous Distance Functions [J]. Journal of Artificial Intelligence Research, 1997, 6: 1-34.
    [61] C. Englund and A. Verikas. A hybrid approach to outlier detection in the offset lithographic printing process [J]. Engineering Applications of Artificial Intelligence, 2005, 18(6): 759-768.
    [62] Kyung A. Yoon and DooHwan Bae. A pattern-based outlier detection method identifying abnormal attributes in software project data[J]. Information and Software Technology, 2010, 52(2): 137-151.
    [63] Sanghamitra Bandyopadhyay and Santanu Santra. A genetic approach for efficient outlier detection in projected space[J]. Pattern Recognition, 2008, 41(4): 1338-1349.
    [64] Freedom N. Gumedze, Sue J. Welham and Beverley J. Gogel. A variance shift model for detection of outliers in the linear mixed model[J]. Computational Statistics & Data Analysis, 2010, 54(9): 2128-2144.
    [65] S. M. Guo, L. C. Chen, and J. S. H. Tsai. A boundary method for outlier detection based on support vector domain description[J]. Pattern Recognition, 2009, 42(1): 77-83.
    [66] J. Chen and J. A. Romagnoli. A strategy for simultaneous dynamic data reconciliation and outlier detection[J]. Computers & Chemical Engineering, 1998, 22(4): 559-562.
    [67] Edelgard Hund, D. Luc Massart and Johanna Smeyers Verbeke. Robust regression and outlier detection in the evaluation of robustness tests with different experimental designs[J]. Analytica Chimica Acta, 2002, 463(1): 53-73.
    [68] Patrick Wiegand, Randy Pell and Enric Comas. Simultaneous variable selection and outlier detection using a robust genetic algorithm[J]. Chemometrics and Intelligent Laboratory Systems, 2009, 98(2): 108-114.
    [69] Jianxin Pan. Discordant outlier detection in the growth curve model with Rao's simple covariance structure[J]. Statistics & Probability Letters, 2004, 69(2): 135-142.
    [70] Qi Zhou, Shaonan Li and Xiaopeng Li. Detection of outliers and establishment of targets in external quality assessment programs[J]. Clinica Chimica Acta, 2006, 372(1): 94-97.
    [71]崔贯勋.基于密度的离群数据挖掘算法研究[D].重庆:重庆大学, 2007.
    [72] Shuyan Chen, Wei Wang and Henk van Zuylen. A comparison of outlier detection algorithmsfor ITS data[J]. Expert Systems with Applications, 2010, 37(2): 1169-1178.
    [73] Asmund Ukkelberg and Odd S. Borgen. Outlier detection by robust alternating regression[J]. Analytica Chimica Acta, 1993, 277(2): 489-494.
    [74] James W. Wisnowski, Douglas C. Montgomery and James R. Simpson. A Comparative analysis of multiple outlier detection procedures in the linear regression model[J]. Computational Statistics & Data Analysis,2001, 36(3): 351-382.
    [75] Gerald C. Lalor and Chaosheng Zhang. Multivariate outlier detection and remediation in geochemical databases[J]. The Science of The Total Environment, 2001, 281(1): 99-109.
    [76] Desire L. Massart, Leonard Kaufman and Peter J. Rousseeuw. Least median of squares: a robust method for outlier and model error detection in regression and calibration[J]. Analytica Chimica Acta, 1986, (187): 171-179
    [77] Y. Barnett and T. Lewis. Outliers in Statistical Data [M]. New York: John Wiley and Sons, 1994.
    [78] P. Rousseeuw and A. Leory. Robust Regression and Outlier Detection [M]. New York: John Wiley and Sons, 1987.
    [79] E. Knorr and R. Ng. Algorithms for Mining Distance-Based Outliers in Large Datasets [C]. Proc. of International Conf. on Very Large Databases, 1998, 392-403.
    [80]蔡博文.高维数据集中离群数据挖掘方法的研究[D].合肥:合肥工业大学, 2006.
    [81] Ralf Ostermark. A fuzzy vector valued KNN-algorithm for automatic outlier detection[J]. Applied Soft Computing, 2009, 9(4): 1263-1272.
    [82] M. R. Brito, E. L. Chavez and A. J. Quiroz. Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection[J]. Statistics & Probability Letters, 1997, 35(1): 33-42.
    [83] Mohamed Chaouch and Camelia Goga. Design-based estimation for geometric quantiles with application to outlier detection[J]. Computational Statistics & Data Analysis, 2010, 54(10): 2214-2229.
    [84] Alberto Luceno. Multiple outliers detection through reweighted least deviances[J]. Computational Statistics & Data Analysis, 1998, 26(3): 313-326.
    [85] B. Walczak. Outlier detection in multivariate calibration[J]. Chemometrics and Intelligent Laboratory Systems, 1995, 28(2): 259-272.
    [86] S. Ramaswamy, R. Rastogi and S. Kyuseok. Efficient algorithms for mining outliers from large data sets [C]. Proceedings of the ACM SIGMOD International Conference on Management of Data, 2000, 93-104.
    [87]薛安荣.空间离群点挖掘技术的研究[D].镇江:江苏大学, 2008.
    [88] C. Hartmann, P. Vankeerberghen and J. Smeyers-Verbeke. Robust orthogonal regression for the outlier detection when comparing two series of measurement results[J]. Analytica Chimica Acta, 1997, 344(1): 17-28.
    [89] Wentong Cui and Xuefeng Yan. Adaptive weighted least square support vector machine regression integrated with outlier detection and its application in QSAR[J]. Chemometrics and Intelligent Laboratory Systems, 2009, 98(2): 130-135.
    [90] J. A. Fernandez Pierna, F. Wahl and O. E. de Noord. Methods for outlier detection in prediction[J]. Chemometrics and Intelligent Laboratory Systems, 2002, 63(1): 27-39.
    [91] Siddhartha Chib and Ram C. Tiwari. Outlier detection in the state space model[J]. Statistics & Probability Letters, 1994, 20(2): 143-148.
    [92] M. J. Gomez, Z. De Benzo and C. Gomez. Comparison of methods for outlier detection and their effects on the classification results for a particular data base[J]. Analytica Chimica Acta, 1990, 239, 229-243.
    [93]汤俊.基于可疑金融交易识别的离群模式挖掘研究[D].武汉:武汉理工大学, 2007.
    [94] Yumin Chen, Duoqian Miao and Hongyun Zhang. Neighborhood outlier detection[J]. Expert Systems with Applications, 2010, 37(12): 8745-8749.
    [95] M. E Tarter. Density estimation applications for outlier detection[J]. Computer Programs in Biomedicine, 1979, 10(1): 55-60.
    [96] Chang Chun Lin and An Pin Chen. Fuzzy discriminant analysis with outlier detection by genetic algorithm[J]. Computers & Operations Research, 2004, 31(6): 877-888.
    [97] Gerd Puchwein and Anton Eibelhuber. Outlier detection in routine analysis of agricultural grain products by near-infrared spectrometry[J]. Analytica Chimica Acta, 1989, 223, 95-103.
    [98] Price John and Long James. Method and apparatus for biological fluid analyte concentration measurement using generalized distance outlier detection[J]. Laboratory Automation & Information Management, 1997, 33(2): 145.
    [99] M. M. Breunig, H. P. Kriegel and R. T. Ng. LOF: Identifying density based local outliers [C]. Proceedings of ACM Conference, 2000, 93-104.
    [100]杨风召,朱扬勇,施伯乐. IncLOF:动态环境下局部异常的增量挖掘算法[J].计算机研究与发展, 2004, 41(3): 477-484.
    [101] J. Wen and K. H. Anthony. Mining Top-n Local Outliers in Large Databases [C]. Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001, 293-298.
    [102] H. J. Escalante. A Comparison of Outlier Detection Algorithms for Machine Learning. Programming and Computer Software, 2005, 228–237.
    [103]金义富.高维稀疏离群数据集延伸知识发现研究[D].重庆:重庆大学, 2007.
    [104] Johanna Hardin and David M. Rocke. Outlier detection in the multiple cluster setting using the minimum covariance determinant estimator[J]. Computational Statistics & Data Analysis, 2004, 44(4): 625-638.
    [105] Fang Liu and Baolin Wu. Multi-group cancer outlier differential gene expression detection[J]. Computational Biology and Chemistry, 2007, 31(2): 65-71.
    [106] Bernard Van Cutsem and Isak Gath. Detection of outliers and robust estimation using fuzzy clustering[J]. Computational Statistics & Data Analysis, 1993, 15(1): 47-61.
    [107]鞠可一,周德群,张玉强.高维离群检测算法及其应用[J].系统工程, 2008, 26(11): 116-122.
    [108] J. A. S. Almeida, L. M. S. Barbosa and S.J. Formosinho. Improving hierarchical cluster analysis: A new method with outlier detection and automatic clustering[J]. Chemometrics and Intelligent Laboratory Systems, 2007, 87(2): 208-217.
    [109] M.F. Jiang, S. S. Tseng and C. M. Su. Two-phase clustering process for outlier detection [J]. Pattern recognition letters, 2001, 22(6): 691-700.
    [110] D. Yu, G. Sheikholeslami and A. Zhang. Findout: finding outliers in very large datasets [J]. Knowledge and Information Systems, 2002, 4(4): 387-412.
    [111] H. D. K. Moonesinghe and P. N. Tan. Outlier Detection Using Random Walks [C]. Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence, 2006, 1-8.
    [112]刘豫.大型结构动力特性的Lanczos算法程序设计.西南民族大学学报(自然科学版), 2008, 34(3): 590-594.
    [113] Z. Y. He, X. F. Xu and S. C. Deng. Discovering cluster-based local outliers [J]. Pattern Recognition Letters, 2003, 24(9): 1641-1650.
    [114] S. X. Yu and J. Shi. Multiclass Spectral Clustering [C]. Proceedings of the Ninth IEEE International Conference on Computer Vision, 2003, 1-7.
    [115] C. Hennig. Cluster-wise assessment of cluster stability [J]. Computational Statistics & Data Analysis, 2007, 52(1): 258-271
    [116] Philip T. Quinlan. Structural change and development in real and artificial neural networks[J]. Neural Networks, 1998, 11(4): 577-599
    [117] R. J. Kuo, P. Wu and C. P. Wang. An intelligent sales forecasting system through integration of artificial neural networks and fuzzy neural networks with fuzzy weight elimination[J]. Neural Networks, 2002, 15(7): 909-925.
    [118] Paulo J. Lisboa and Azzam F.G. Taktak. The use of artificial neural networks in decisionsupport in cancer: A systematic review[J]. Neural Networks, 2006, 19(4): 408-415.
    [119] S. D. Hunt and J. R. Deller Jr. Selective training of feedforward artificial neural networks using matrix perturbation theory[J]. Neural Networks, 1995, 8(6): 931-944.
    [120] R. J. Kuo. Multi-sensor integration for on-line tool wear estimation through artificial neural networks and fuzzy neural network[J]. Engineering Applications of Artificial Intelligence, 2000, 13(3): 249-261.
    [121] Marco Castellani and Hefin Rowlands. Evolutionary Artificial Neural Network Design and Training for wood veneer classification[J]. Engineering Applications of Artificial Intelligence, 2009, 22(4): 732-741.
    [122] Zhi Shang. Application of artificial intelligence CFD based on neural network in vapor–water two-phase flow[J]. Engineering Applications of Artificial Intelligence, 2005, 18(6): 663-671.
    [123] Cedric Archambeau, Jean Delbeke and Claude Veraart. Prediction of visual perceptions with artificial neural networks in a visual prosthesis for the blind[J]. Artificial Intelligence in Medicine, 2004, 32(3): 183-194.
    [124] Kaining Wang and Anthony N. Michel. Robustness and perturbation analysis of a class of artificial neural networks[J]. Neural Networks, 1994, 7(2): 251-259.
    [125] Chien Hung Wei. Analysis of artificial neural network models for freeway ramp metering control[J]. Artificial Intelligence in Engineering, 2001, 15(3): 241-252.
    [126] J. M. Benitez and J. L. Castro. Are Artificial neural networks Black Boxes [J]. IEEE Transactions on Neural Networks, 1997, 8(5): 1156-1164.
    [127]南晋华.决策神经网络模型及应用研究[D].武汉:华中科技大学, 2008.
    [128]段录平.基于RBF神经网络的数据挖掘研究[D].哈尔滨:哈尔滨理工大学, 2007.
    [129] Shane Naughton, Padraig Cunningham and Fergal Somers. Asynchronous transfer mode traffic modelling and dimensioning using artificial neural networks[J]. Engineering Applications of Artificial Intelligence, 1999, 12(3): 321-342.
    [130] Fernando Morgado Dias, Ana Antunes and Alexandre Manuel Mota. Artificial neural networks: a review of commercial hardware[J]. Engineering Applications of Artificial Intelligence, 2004, 17(8): 945-952.
    [131] Humar Kahramanli and Novruz Allahverdi. Rule extraction from trained adaptive neural networks using artificial immune systems[J]. Expert Systems with Applications, 2009, 36(2): 1513-1522.
    [132] Zhi Hua Zhou, Yuan Jiang and Yu Bin Yang. Lung cancer cell identification based on artificial neural network ensembles[J]. Artificial Intelligence in Medicine, 2002, 24(1): 25-36.
    [133] Douglas K. Swift and Cihan H. Dagli. A study on the network traffic of Connexion by Boeing: Modeling with artificial neural networks[J]. Engineering Applications of Artificial Intelligence, 2008, 21(8): 1113-1129.
    [134] Rey Chue Hwang, Yu Ju Chen and Huang Chu Huang. Artificial intelligent analyzer for mechanical properties of rolled steel bar by using neural networks[J]. Expert Systems with Applications, 2010, 37(4): 3136-3139.
    [135] R. J. Kuo, C. H. Chen and Y. C. Hwang. An intelligent stock trading decision support system through integration of genetic algorithm based fuzzy neural network and artificial neural network[J]. Fuzzy Sets and Systems, 2001, 118(1): 21-45.
    [136] Hany El Kadi. Modeling the mechanical behavior of fiber-reinforced polymeric composite materials using artificial neural networks-A review[J]. Composite Structures, 2006, 73(1): 1-23.
    [137] Padraig Cunningham, John Carney and Saji Jacob. Stability problems with artificial neural networks and the ensemble solution[J]. Artificial Intelligence in Medicine, 2000, 20(3): 217-225.
    [138] Alfonso Iglesias Nuno, Bernardino Arcay and J.M. Cotos. Optimisation of fishing predictions by means of artificial neural networks, anfis, functional networks and remote sensing images[J]. Expert Systems with Applications, 2005, 29(2): 356-363.
    [139] R. J. Kuo. Intelligent diagnosis for turbine blade faults using artificial neural networks and fuzzy logic[J]. Engineering Applications of Artificial Intelligence, 1995, 8(1): 25-34.
    [140] Tawfiq Al Saba and Ibrahim El Amin. Artificial neural networks as applied to long-term demand forecasting[J]. Artificial Intelligence in Engineering, 1999, 13(2): 189-197.
    [141] Christiane Ziegler, Annette Harsch and Wolfgang Gopel. Natural neural networks for quantitative sensing of neurochemicals: an artificial neural network analysis[J]. Sensors and Actuators B: Chemical, 2000, 65(1): 160-162.
    [142] M. Emin Tagluk, Mehmet Akin and Necmettin Sezgin. Class?f?cation of sleep apnea by using wavelet transform and artificial neural networks[J]. Expert Systems with Applications, 2010, 37(2): 1600-1607.
    [143] H. B. Bahar and D. H. Horrocks. Dynamic weight estimation using an artificial neural network[J]. Artificial Intelligence in Engineering, 1998, 12(1): 135-139.
    [144] Gisbert Schneider and Paul Wrede. Artificial neural networks for computer-based molecular design[J]. Progress in Biophysics and Molecular Biology, 1998, 70(3): 175-222.
    [145] S. J. Lee, S. R. Lee and Y. S. Kim. An approach to estimate unsaturated shear strength using artificial neural network and hyperbolic formulation[J]. Computers and Geotechnics, 2003,30(6): 489-503.
    [146] Bert A. Mobley, Eliot Schechter and William E. Moore. Predictions of coronary artery stenosis by artificial neural network[J]. Artificial Intelligence in Medicine, 2000, 18(3): 187-203.
    [147] R. P. Leger, Wm. J. Garland and W. F. S. Poehlman. Fault detection and diagnosis using statistical control charts and artificial neural networks[J]. Artificial Intelligence in Engineering, 1998, 12(1): 35-47.
    [148] J.Jose Vieira, Fernando Morgado Dias and Alexandre Mota. Artificial neural networks and neuro-fuzzy systems for modelling and controlling real systems: a comparative study[J]. Engineering Applications of Artificial Intelligence, 2004, 17(3): 265-273.
    [149] Costas Papaloukas, Dimitrios I. Fotiadis and Aristidis Likas. An ischemia detection method based on artificial neural networks[J]. Artificial Intelligence in Medicine, 2002, 24(2): 167-178.
    [150] Peter A. Lucon and Richard P. Donovan. An artificial neural network approach to multiphase continua constitutive modeling[J]. Composites Part B: Engineering, 2007, 38(7): 817-823.
    [151] Steven Walczak and Narciso Cerpa. Heuristic principles for the design of artificial neural networks[J]. Information and Software Technology, 1999, 41(2): 107-117.
    [152] Michele Romano, Shie Yui Liong and Minh Tue Vu. Artificial neural network for tsunami forecasting[J]. Journal of Asian Earth Sciences, 2009, 36(1): 29-37.
    [153] H. Schobesberger and C. Peham. Computerized Detection of Supporting Forelimb Lameness in the Horse Using an Artificial Neural Network[J]. The Veterinary Journal, 2002, 163(1): 77-84.
    [154]赵婧宏,潘维民.人工神经网络算法在数据挖掘中的应用[EB/01]. http://www.paper. edu.cn, 2007.
    [155] A. Mitiche and M. Lebidoff. Pattern classification by a condensed neural network [J]. Neural Networks, 2001, 14 (6) 575-580.
    [156] R. Feraud and F. Clerot. A Methodology to Explain Neural Network Classification [J]. Neural Networks, 2002, 15(2): 237-246.
    [157] G. Bloch, P. Thomas and D. Theilliol. Accommodation to outliers in identification of nonlinear SISO systems with neural networks [J]. Neurocomputing, 1997, 14 (1): 85-99.
    [158] C. C. Chuang and J. T. Jeng. CPBUM neural networks for modeling with outliers and noise [J]. Applied Soft Computing, 2007, 7(3): 957-967.
    [159] W. Zhao, D. Chen and S. Hu. Detection of outlier and a robust BP algorithm against outlier [J]. Computers and Chemical Engineering, 2004, 28(8): 1403-1408.
    [160] R. J. Bullen, D. Cornford and I. T. Nabney. Outlier detection in scatterometer data-neural network approaches [J]. Neural Networks, 2003, 16(3): 419-426.
    [161] J. Bourquin, H. Schmidli and P. V. Hoogevest. Pitfalls of artificial neural networks (ANN) modelling technique for data sets containing outlier measurements using a study on mixture properties of a direct compressed dosage form [J]. European Journal of Pharmaceutical Sciences, 1998, 7(1): 17-28.
    [162] A. Muñoz and J. Muruzábal. Self-organizing maps for outlier detection [J]. Neurocomputing, 1998, 18(1): 33-60.
    [163] T. C. Lu, J. C. Juang and G. R. Yu. On-line Outliers Detection by Neural Network with Quantum Evolutionary Algorithm [C]. Proceedings of International Conference on Innovative Computing, Information and Control, 2007, 254-254.
    [164] S. S. Shirish and A. G. Ashok. Use of Instance Typicality for Efficient Detection of Outliers with Neural Network Classifiers [C]. Proceedings of the 9th International Conference on Information Technology, 2006, 225-228.
    [165] T. L. Lin and L. Z. Jiu. A RBF Neural Network Model for Anti-money Laundering [C]. 2008 International Conference on Wavelet Analysis and Pattern Recognition, 2008, 209-215.
    [166] H. Sarimveis, A. Alexandridis and G. Bafas. A fast training algorithm for RBF networks based on subtractive clustering [J]. Neurocomputing, 2003, 51(4): 501-505.
    [167] S. L. Chiu. Extracting fuzzy rules for pattern classification by cluster estimation [C]. Proceedings of the 6th International Fuzzy Systems Association World Congress, 1995, 1-4.
    [168] M. Eftekhari and S. D. Katebi. Extracting compact fuzzy rules for nonlinear system modeling using subtractive clustering, GA and unscented filter [J]. Applied Mathematical Modelling, 2008, 32(12): 2634-2651.
    [169] N. B. Karayiannis. Reformulated Radial Basis Neural Networks Trained by Gradient Descent [J]. IEEE Transactions on Neural Networks, 1999, 10(3): 657-671.
    [170] H. J. Lee and S. Cho. Application of LVQ to novelty detection using outlier training data [J]. Pattern Recognition Letters, 2006, 27(13): 1572-1579.
    [171] A. Hinneburg, C. C. Aggarwal and D. A. Keim. What is the nearest neighbor in high dimensional spaces [C]. Proceedings of the 26th VLDB Conference, 2000, 506-515.
    [172] Jiangtao Cui, Zhiyong An and Yong Guo. Efficient nearest neighbor query based on extended B+-tree in high-dimensional space[J]. Pattern Recognition Letters, 2010, 31(12): 1740-1748.
    [173] Ingo Schmitt and Soren Balko. Filter ranking in high-dimensional space[J]. Data & Knowledge Engineering, 2006, 56(3): 245-286.
    [174] Jack Lukaszuk and Ratko Orlandic. On accessing data in high-dimensional spaces: A comparative study of three space partitioning strategies[J]. Journal of Systems and Software, 2004, 73(1): 147-157.
    [175] Michael N. Jones, Walter Kintsch and Douglas J. K. Mewhort. High-dimensional semantic space accounts of priming[J]. Journal of Memory and Language, 2006, 55(4): 534-552.
    [176] Dong Ho Lee and Hyoung Joo Kim. An efficient nearest neighbor search in high-dimensional data spaces[J]. Information Processing Letters, 2002, 81(5): 239-246.
    [177] Dong Ho Lee, Shin Heu and Hyoung Joo Kim. An efficient algorithm for hyperspherical range query processing in high-dimensional data space[J]. Information Processing Letters, 2002, 83(2): 115-123.
    [178] Jae Hyouk Lee and Naichung Conan Leung. Higher dimensional knot spaces for manifolds with vector cross products[J]. Advances in Mathematics, 2007, 213(1): 140-164.
    [179] Daniele V. Finocchiaro and Marco Pellegrini. On computing the diameter of a point set in high dimensional Euclidean space[J]. Theoretical Computer Science, 2002, 287(2): 501-514.
    [180] William F. Eddy, Audris Mockus and Shingo Oue. Approximate single linkage cluster analysis of large data sets in high-dimensional spaces[J]. Computational Statistics & Data Analysis, 1996, 23(1): 29-43.
    [181] E. Knorr and R. Ng. Finding Intensional Knowledge of Distance-based Outliers [C]. Proceedings of the 25th VLDB conference, 1999, 211-222.
    [182] S. D. Bay and M. Schwabacher. Mining Distance Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule [C]. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, 2003, 29-38.
    [183] A. Ghoting, S. Parthasarathy and M. E. Otey. Fast mining of distance-based outliers in high-dimensional datasets [J]. Data Mining and Knowledge Discovery, 2008, 16(3): 349-364.
    [184] C. Aggarwal and P. S. Yu. Outlier detection for high dimensional data [C]. Proceedings of SIGMOD, 2001, 37-46.
    [185] Z. Chen and J. Tang. Modeling and efficient mining of intensional knowledge of outliers [C]. Proceedings of the Seventh International Database Engineering and Applications Symposium, 2003, 44-53.
    [186] J. Zhang and H. Wang. Detecting outlying subspaces for high-dimensional data: the new task, algorithms, and performance [J]. Knowledge and Information Systems, 2006, 10(3): 333-355.
    [187] Z. Meng and Z. Shi. A fast approach to attribute reduction in incomplete decision systems with tolerance relation-based rough sets [J]. Information Sciences, 2009, 179(16): 2774-2793.
    [188] M. Inuiguchi, Y. Yoshioka and Y. Kusunoki. Variable-precision dominance-based rough set approach and attribute reduction [J]. International Journal of Approximate Reasoning, 2009, 50(8): 1199-1214.
    [189] Wei Hua Xu and Wen-Xiu Zhang. Measuring roughness of generalized rough sets induced by a covering[J]. Fuzzy Sets and Systems, 2007, 158(22): 2443-2455.
    [190] John N. Mordeson. Rough set theory applied to (fuzzy) ideal theory[J]. Fuzzy Sets and Systems, 2001, 121(2): 315-324.
    [191] Jari Kortelainen. On relationship between modified sets, topological spaces and rough sets[J]. Fuzzy Sets and Systems, 1994, 61(1): 91-95.
    [192] Francis E. H. Tay and Lixiang Shen. Economic and financial prediction using rough sets model[J]. European Journal of Operational Research, 2002, 141(3): 641-659.
    [193] Zdzisław Pawlak. Rough set approach to knowledge-based decision support[J]. European Journal of Operational Research, 1997, 99(1): 48-57.
    [194] Ruixia Yan, Jianguo Zheng and Jinliang Liu. Research on the model of rough set over dual-universes[J]. Knowledge-Based Systems, 2010, 23(8): 817-822.
    [195] Y. Y. Yao. Two views of the theory of rough sets in finite universes[J]. International Journal of Approximate Reasoning, 1996, 15(4): 291-317.
    [196] Padmini Srinivasan, Miguel E. Ruiz and Donald H. Kraft. Vocabulary mining for information retrieval: rough sets and fuzzy sets[J]. Information Processing & Management, 2001, 37(1): 15-38.
    [197] Y. Yao and Y. Zhao. Attribute reduction in decision-theoretic rough set models [J]. Information Sciences, 2008, 178(17): 3356-3373.
    [198] S. Zhao and C.C. Tsang. On fuzzy approximation operators in attribute reduction with fuzzy rough sets [J]. Information Sciences, 2008, 178(16): 3163-3176.
    [199] R. Jensen and Q. Shen. Fuzzy rough attributes reduction with application to web categorization [J]. Fuzzy Sets and Systems, 2004, 141(3): 469-485.
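    [200] Jin Yifu, Zhu Qingsheng and Xing Yongkang. An outlier data clustering algorithm based on key-domain subspaces [J]. Journal of Computer Research and Development, 2007, 44(4): 651-659.
    [201] F. Angiulli and C. Pizzuti. Outlier Mining in Large High-Dimensional Data Sets [J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(2): 203-215.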
    [202] F. Angiulli and L. Palopoli. Detecting Outlying Properties of Exceptional Objects [J]. ACM Transactions on Database Systems, 2009, 34(1): 1-62.
