Research on Classification-Based Methods for Complex Data Processing
Abstract
Pattern classification is one of the core techniques of machine learning: samples in a dataset that share consistent attributes are mapped to a given class, and this mapping is modeled as a concrete classifier. In recent years, pattern classification has produced many important research results, including classical algorithms such as decision trees, Bayes classifiers, k-nearest neighbors, neural networks, genetic algorithms, and support vector machines. As its application domains continue to expand, the data that must be classified have become increasingly complex and diverse, and the construction of classification models and the design of classifiers face ever greater challenges.
     This thesis investigates how to handle high-dimensional small-sample data and multi-source feature data in classification, and how to exploit the complex structural information of the data distribution to effectively improve classification performance. The main contributions are as follows:
     (1) To address dimensionality reduction for high-dimensional complex data, a new method based on orthogonal local discriminant embedding (O-LDE) is proposed and combined with nearest-neighbor classification. First, two adjacency graphs, within-class and between-class, are built to preserve the local neighborhood information of the data. Second, to cope with the small-sample-size problem, the affinity matrices are redefined and the optimization objective is modified accordingly. Third, the objective is solved by constructing an orthogonal basis, completing the embedding from the high-dimensional space to a low-dimensional manifold. Finally, classification is performed with the nearest-neighbor rule in the low-dimensional space. By preserving within-class compactness and between-class separability, O-LDE extracts effective discriminative information while compressing the dimensionality of the feature space. Experiments on the public Leukemia dataset show that it outperforms dimensionality-reduction methods such as LDA, LLDE, and LDE in tumor recognition from gene expression profiles.
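A minimal Python sketch of this pipeline follows, assuming the standard LDE-style graph construction over k nearest neighbors. The function names (adjacency, olde_fit), the neighborhood size, the Tikhonov regularizer standing in for the thesis's redefined affinity matrices, and the final QR step standing in for its orthogonal-basis construction are all illustrative assumptions, not the thesis's exact procedure.

```python
# Sketch of an O-LDE-style embedding followed by k-NN, under the assumptions above.
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import NearestNeighbors, KNeighborsClassifier

def adjacency(X, y, k, same_class):
    """k-NN adjacency keeping only same-class (within) or cross-class (between) edges."""
    n = len(X)
    _, idx = NearestNeighbors(n_neighbors=min(n - 1, k) + 1).fit(X).kneighbors(X)
    W = np.zeros((n, n))
    for i in range(n):
        for j in idx[i, 1:]:                          # idx[i, 0] is the point itself
            if (y[i] == y[j]) == same_class:
                W[i, j] = W[j, i] = 1.0
    return W

def olde_fit(X, y, dim, k=5, reg=1e-3):
    """Return an orthogonal projection matrix of shape (D, dim)."""
    Ww = adjacency(X, y, k, same_class=True)          # within-class graph
    Wb = adjacency(X, y, k, same_class=False)         # between-class graph
    Lw = np.diag(Ww.sum(1)) - Ww                      # graph Laplacians
    Lb = np.diag(Wb.sum(1)) - Wb
    Sw = X.T @ Lw @ X + reg * np.eye(X.shape[1])      # regularized: X'LwX is singular
                                                      # in the small-sample-size case
    Sb = X.T @ Lb @ X
    vals, vecs = eigh(Sb, Sw)                         # generalized symmetric eigenproblem
    V = vecs[:, np.argsort(vals)[::-1][:dim]]         # top discriminant directions
    Q, _ = np.linalg.qr(V)                            # orthogonalize the basis
    return Q

# Usage: embed, then classify with 1-NN in the low-dimensional space.
# Q = olde_fit(X_train, y_train, dim=10)
# knn = KNeighborsClassifier(n_neighbors=1).fit(X_train @ Q, y_train)
# y_pred = knn.predict(X_test @ Q)
```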
     (2) To address the difficulty that the multi-source features of complex data are hard to combine into a single classifier for decision-making, a Bayesian ensemble learning algorithm based on grouped feature-subset selection (the Bayesian Ensemble Algorithm based on Grouped Feature Selection, EGFS+BC) is proposed. First, the features are grouped by source, and for each source a portion of the features is drawn at random as the initial feature subset. Then, feature subsets are selected dynamically with the goal of improving both the accuracy of the Bayes base classifiers and the diversity among them. Finally, suitable base classifiers are trained on the selected subsets and combined by weighted voting within the ensemble-learning framework. The method exploits the differences and complementarity among features from different sources; experiments on the public multi-source DDSM dataset show higher classification accuracy than classifiers such as k-NN, Boost C5, and Neural Net.
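The selection-plus-ensemble scheme can be sketched as below, with several simplifications: sklearn's GaussianNB stands in for the Bayes base classifier, the dynamic selection is reduced to greedy add/drop moves scored by cross-validated accuracy (the diversity criterion is omitted for brevity), and names such as select_subset and fit_ensemble are hypothetical.

```python
# Sketch of EGFS+BC: per-source feature-subset selection + weighted-vote NB ensemble.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def select_subset(X, y, group, n_iter=30):
    """Greedy add/drop moves on one feature source, scored by 3-fold CV accuracy."""
    group = list(group)
    subset = list(rng.choice(group, size=max(1, len(group) // 2), replace=False))
    best = cross_val_score(GaussianNB(), X[:, subset], y, cv=3).mean()
    for _ in range(n_iter):
        cand = list(subset)
        rest = [f for f in group if f not in cand]
        if rest and (len(cand) < 2 or rng.random() < 0.5):
            cand.append(int(rng.choice(rest)))        # try adding a feature
        elif len(cand) > 1:
            cand.pop(rng.integers(len(cand)))         # try dropping a feature
        score = cross_val_score(GaussianNB(), X[:, cand], y, cv=3).mean()
        if score > best:                              # keep moves that help accuracy
            subset, best = cand, score
    return subset, best

def fit_ensemble(X, y, groups):
    """groups: list of feature-index arrays, one per feature source."""
    members = []
    for g in groups:
        sub, acc = select_subset(X, y, g)
        members.append((GaussianNB().fit(X[:, sub], y), sub, acc))
    return members

def predict(members, X):
    # Weighted voting: each base classifier votes with its CV accuracy as weight.
    votes = sum(w * clf.predict_proba(X[:, sub]) for clf, sub, w in members)
    return members[0][0].classes_[votes.argmax(axis=1)]
```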
     (3) To better exploit the latent within-class structural information of complex data, grouped SVM methods based on learning the structure of the sample space are proposed, including a clustering-based grouped SVM (Clustered Group SVM, GC-SVM) and a grouped fuzzy SVM with EM-based partitioning of the sample space (Grouped Fuzzy SVM Algorithm with EM-based Partition of Sample Space, EMG-FSVM). First, to describe the within-class sample structure explicitly, the positive- and negative-class sample spaces are each partitioned into groups according to a similarity measure (clustering and EM, respectively). Then, sub-SVM classifiers are trained on cross-combinations of the positive and negative groups. Finally, an unknown sample is classified by the particular sub-SVM selected via the Mahalanobis distance between the sample and each group center. The approach decomposes a large, complex quadratic-programming problem into a series of small, simple QP subproblems, which shortens classifier training time and, to some extent, also increases classification speed. Experiments on synthetic data and real breast-lesion data show that the method achieves better classification performance and stability than SVMs with various kernels.
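A condensed sketch of the grouped-SVM idea follows, assuming k-means as the partitioner (roughly the GC-SVM variant; swapping in sklearn.mixture.GaussianMixture would approximate the EM-based partition of EMG-FSVM, minus the fuzzy memberships). The routing rule, hyperparameters, and names such as fit_grouped_svm are illustrative.

```python
# Sketch of a grouped SVM: partition each class, train one SVM per group pair,
# and route test samples by Mahalanobis distance to the group centers.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def fit_grouped_svm(X, y, k_pos=2, k_neg=2):
    Xp, Xn = X[y == 1], X[y == 0]
    lp = KMeans(n_clusters=k_pos, n_init=10).fit_predict(Xp)
    ln = KMeans(n_clusters=k_neg, n_init=10).fit_predict(Xn)
    models = []
    for i in range(k_pos):                            # cross-combine pos/neg groups
        for j in range(k_neg):
            Pi, Nj = Xp[lp == i], Xn[ln == j]         # assumes each group is non-trivial
            svm = SVC(kernel='rbf').fit(np.vstack([Pi, Nj]),
                                        np.r_[np.ones(len(Pi)), np.zeros(len(Nj))])
            # center and inverse covariance of each group, for Mahalanobis routing
            stats = [(G.mean(0), np.linalg.pinv(np.cov(G.T))) for G in (Pi, Nj)]
            models.append((svm, stats))
    return models

def predict_one(models, x):
    def nearest(model):                               # distance to closest group center
        _, stats = model
        return min((x - m) @ C @ (x - m) for m, C in stats)
    svm, _ = min(models, key=nearest)                 # route to that pair's sub-SVM
    return int(svm.predict(x[None, :])[0])
```

Because each sub-SVM is trained on only one positive and one negative group, its quadratic program is much smaller than the full problem, which is where the reported training-time saving comes from.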
