Research on Several Problems in Statistical-Learning-Based Pattern Recognition and Their Applications
Abstract
Pattern recognition based on statistical learning is an important research area in artificial intelligence. Statistical pattern recognition has already been studied in depth, and a number of related techniques have been applied successfully and efficiently in a wide range of fields. Nevertheless, many challenges remain and many problems call for further exploration; feature dimensionality reduction and kernel methods are two topics that attract particular attention. This dissertation investigates several of these key problems; the work consists of four parts, summarized as follows.
     The first part consists of Chapter 2. To address the small-sample-size problem of the supervised locality preserving projection (SLPP) feature-extraction algorithm, we propose a generalized supervised locality preserving projection (GSLPP) algorithm. In the small-sample-size case, the optimization problem defined by GSLPP can be solved equivalently in a lower-dimensional space, which effectively overcomes the small-sample-size problem; in the large-sample case, GSLPP is equivalent to SLPP.
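     As an illustration of where the small-sample-size problem arises, the following is a minimal sketch of the generalized eigenvalue problem underlying LPP-type projections (the supervised affinity weighting used by SLPP/GSLPP is assumed but not spelled out here):
\[
\min_{\mathbf{a}} \sum_{i,j}\bigl(\mathbf{a}^{T}\mathbf{x}_{i}-\mathbf{a}^{T}\mathbf{x}_{j}\bigr)^{2}W_{ij}
\;\Longrightarrow\;
\mathbf{X}\mathbf{L}\mathbf{X}^{T}\mathbf{a}=\lambda\,\mathbf{X}\mathbf{D}\mathbf{X}^{T}\mathbf{a},
\qquad \mathbf{L}=\mathbf{D}-\mathbf{W},\;\; D_{ii}=\sum_{j}W_{ij}.
\]
When the number of training samples is smaller than the data dimensionality, $\mathbf{X}\mathbf{D}\mathbf{X}^{T}$ is singular and the eigenproblem cannot be solved directly; reformulating it equivalently in a lower-dimensional space, as GSLPP does, avoids this singularity.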
     The second part consists of Chapters 3 and 4 and discusses improvements to the minimum class variance support vector machine (MCVSVM). Because MCVSVM does not take the local structure of the data into account, Chapter 3 proposes a minimum class locality preserving variance support vector machine (MCLPVSVM). This algorithm inherits the advantages of the traditional support vector machine (SVM) and of MCVSVM while making full use of the intrinsic geometric structure of the data, thereby further improving generalization. Chapter 4 addresses the fact that, in the small-sample-size case, MCVSVM uses only the information in the non-null space of the within-class scatter matrix; we study how the information in the null space can be exploited to improve generalization. We first build a new classifier in the null space, the null space classifier (NSC), and then fuse MCVSVM with the NSC to obtain an ensemble classifier (EC). The EC exploits the information in both the non-null space and the null space of the within-class scatter matrix and exhibits stronger generalization ability.
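     For reference, a minimal sketch of the MCVSVM primal in its usual soft-margin form, which replaces the SVM regularizer $\|\mathbf{w}\|^{2}$ with the within-class scatter of the data (notation assumed):
\[
\min_{\mathbf{w},b,\boldsymbol{\xi}}\;\tfrac{1}{2}\,\mathbf{w}^{T}\mathbf{S}_{w}\mathbf{w}+C\sum_{i}\xi_{i}
\quad\text{s.t.}\quad
y_{i}\bigl(\mathbf{w}^{T}\mathbf{x}_{i}+b\bigr)\geq 1-\xi_{i},\;\;\xi_{i}\geq 0,
\]
where $\mathbf{S}_{w}$ is the within-class scatter matrix. In the small-sample-size case $\mathbf{S}_{w}$ is singular, so a direct solution uses only its non-null space; this limitation motivates the NSC and the ensemble classifier EC of Chapter 4, while MCLPVSVM in Chapter 3 incorporates locality-preserving structure into this objective (its exact form is given there).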
     The third part consists of Chapter 5. Following the basic idea that support vector regression (SVR) can be realized by constructing an SVM classification problem, we extend the MCVSVM classifier to regression estimation and propose a minimum variance support vector regression (MVSVR) algorithm. MVSVR inherits the robustness and strong generalization of MCVSVM, can be transformed into a standard SVR problem for solution, and, when the scatter matrix is singular, can be solved equivalently in a new data space.
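     For orientation, the standard $\varepsilon$-insensitive SVR primal into which MVSVR is said to be transformable is sketched below (this is the textbook SVR formulation, not the MVSVR objective itself, which is given in Chapter 5):
\[
\min_{\mathbf{w},b,\boldsymbol{\xi},\boldsymbol{\xi}^{*}}\;\tfrac{1}{2}\|\mathbf{w}\|^{2}+C\sum_{i}\bigl(\xi_{i}+\xi_{i}^{*}\bigr)
\quad\text{s.t.}\quad
\begin{cases}
y_{i}-\mathbf{w}^{T}\mathbf{x}_{i}-b\leq\varepsilon+\xi_{i},\\
\mathbf{w}^{T}\mathbf{x}_{i}+b-y_{i}\leq\varepsilon+\xi_{i}^{*},\\
\xi_{i},\,\xi_{i}^{*}\geq 0.
\end{cases}
\]
By analogy with MCVSVM, one would expect MVSVR to replace the regularizer $\|\mathbf{w}\|^{2}$ with a scatter-weighted term $\mathbf{w}^{T}\mathbf{S}\mathbf{w}$; the precise formulation and the treatment of a singular scatter matrix are given in Chapter 5.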
     The fourth part consists of Chapter 6, in which we theoretically analyze the properties of the optimal solutions of the primal optimization problem of support vector data description (SVDD). We first transform the SVDD primal problem into an equivalent convex constrained quadratic optimization problem. We then prove that the centre of the hypersphere constructed from the optimal solution is unique, whereas under certain conditions the radius is not, and we give a necessary and sufficient condition for the radius to be non-unique. We also analyze the properties of the centre and radius from the viewpoint of the dual problem, and give a method for computing the radius when it is not uniquely determined by the optimal solution.
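     The SVDD primal whose optimal solutions are analyzed is, in its standard form (a sketch; $\phi$ denotes the kernel-induced feature map):
\[
\min_{R,\mathbf{a},\boldsymbol{\xi}}\;R^{2}+C\sum_{i}\xi_{i}
\quad\text{s.t.}\quad
\bigl\|\phi(\mathbf{x}_{i})-\mathbf{a}\bigr\|^{2}\leq R^{2}+\xi_{i},\;\;\xi_{i}\geq 0.
\]
Substituting $\tilde{R}=R^{2}$ gives a linear objective with constraints that are convex quadratic in $(\tilde{R},\mathbf{a},\boldsymbol{\xi})$, which is one natural route to the convex constrained reformulation mentioned above; the uniqueness of the centre $\mathbf{a}$ and the conditions under which the radius is not unique then follow from the optimality conditions.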
     In summary, the first part of this dissertation addresses feature dimensionality reduction, and the second to fourth parts address kernel methods.
Pattern recognition based on statistical learning theory is an important field of study in artificial intelligence. Statistical pattern recognition has been studied in depth, and some of the resulting techniques have been applied successfully in many fields. However, pattern recognition still faces many challenges, and many issues need to be explored and studied further; feature dimensionality reduction and kernel methods are two important topics among them. Motivated by these challenges, several issues are addressed in this study, organized into the following four parts.
     In the first part, which consists of Chapter 2, a new algorithm called generalized supervised locality preserving projection (GSLPP) is proposed to address the drawback of supervised locality preserving projection (SLPP), which encounters the so-called "small sample size" problem in the high-dimensional, small-sample case. The relationship between SLPP and GSLPP is analyzed theoretically: in the large-sample case GSLPP is equivalent to SLPP, whereas in the small-sample-size case GSLPP can be solved equivalently in a lower-dimensional space.
     In the second part, which consists of Chapters 3 and 4, we discuss how to improve the performance of the minimum class variance support vector machine (MCVSVM). In contrast to the traditional support vector machine (SVM), MCVSVM makes effective use of the class distributions but does not take the underlying geometric structure into full consideration. Therefore, in Chapter 3 a minimum class locality preserving variance support vector machine (MCLPVSVM) is presented by introducing the basic ideas of locality preserving projections (LPP) into MCVSVM. This method inherits the characteristics of the traditional SVM and of MCVSVM, fully considers the geometric structure of the samples, and shows better learning performance. On the other hand, in the small-sample-size case MCVSVM uses only the information in the non-null space of the within-class scatter matrix. To improve classification performance further, Chapter 4 first presents the null space classifier (NSC), which is rooted in the null space, and then proposes a novel ensemble classifier (EC) that combines MCVSVM and the NSC. Unlike MCVSVM and the NSC, the EC takes into consideration the information in both the non-null space and the null space.
     In the third part, which consists of Chapter 5, MCVSVM is extended to the regression task, based on the idea that support vector regression (SVR) can be regarded as a classification problem in the dual space, and a novel regression algorithm called minimum variance support vector regression (MVSVR) is proposed. This method inherits the characteristics of MCVSVM, such as a more robust solution and better generalization performance, and can be transformed into a standard SVR problem; when the scatter matrix is singular, it can be solved equivalently in a transformed data space.
     In the fourth part, which consists of Chapter 6, the properties of the solutions of support vector data description (SVDD) are explored. Most previous research on SVDD, one of the most successful and widely applied kernel methods, has focused on efficient implementations and practical applications; very little has addressed the properties of the SVDD solutions themselves. In Chapter 6 the primal optimization problem of SVDD is first transformed into a convex constrained optimization problem; the uniqueness of the centre of the ball is then proved, and the non-uniqueness of the radius is investigated together with a necessary and sufficient condition for it. The properties of the centre and radius are also examined from the perspective of the dual optimization problem, and a method is suggested for computing the radius when it is not unique.
     As a whole, this study addresses feature dimensionality reduction in the first part and kernel methods in the second through fourth parts.