Research on Feature Selection in Machine Learning
Abstract

With the continuing development of computer technology, the datasets collected in various fields keep growing in scale, and the large numbers of redundant and irrelevant features in high-dimensional data in particular pose a great challenge to machine learning. Feature selection arose to address the computational problems of high-dimensional data: by removing redundant and irrelevant features, it improves the generalization performance and runtime efficiency of machine learning algorithms. As research has deepened, the influence of complex interrelations among features on machine learning algorithms has gradually been recognized, and how to identify and retain beneficial combinations of interacting features during feature selection remains an open problem. This thesis is devoted to distinguishing redundancy from dependency in feature interactions, and studies filter feature selection algorithms that can select feature subsets of high relevance, strong interdependence, and low redundancy. It proposes a feature evaluation and selection algorithm based on the Banzhaf power index, an optimization method for feature selection algorithms based on the Shapley value, and a feature selection algorithm based on dynamic weighting. For the application of gene expression data to disease diagnosis, it further proposes a gene selection algorithm based on dynamic relevance analysis. Experimental results on public benchmark datasets show that all the proposed algorithms achieve good performance and meet the expected goals.
Machine learning is the process of analyzing data from different perspectives and extracting it into useful information. The performance of a learning model constructed from training samples depends largely on the quality of the dataset. With the emergence of new computer applications, such as social network clustering, gene expression array analysis, and combinatorial chemistry, datasets are growing ever larger. However, most of the features in huge datasets are irrelevant or redundant, which leads traditional mining and learning algorithms to low efficiency and over-fitting. One effective way to mitigate this problem is to reduce the dimensionality of the feature space with feature selection techniques. Feature selection, also known as variable selection, is one of the fundamental problems in machine learning, pattern recognition, and statistics. It aims at finding a good feature subset that produces higher classification accuracy, and it brings many benefits to machine learning algorithms: reducing measurement cost and storage requirements, coping with the degradation of classification performance due to the finiteness of training sample sets, reducing training and utilization time, and facilitating data visualization and understanding. Feature selection has attracted great attention, and many selection algorithms have been developed over the past years; previous reviews can be found in the literature. Generally, these algorithms fall into three categories: embedded, wrapper, and filter methods. Filter methods are independent of learning algorithms and assess the relevance of features by looking only at the intrinsic properties of the data. In practice, filter methods have much lower computational complexity than the others while achieving comparable classification accuracy for most classifiers, so they are popular for high-dimensional datasets. It is noteworthy that, among various feature selection algorithms, those based on information theory achieve excellent performance and have drawn more and more attention. However, most such selectors discard features that are highly correlated with the already selected ones even when they are relevant to the target class, and are therefore likely to ignore features that have strong discriminatory power as a group but are weak as individuals. The main reason for this disadvantage is that information-theoretic measurements disregard the intrinsic structure among features.
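To make the filter idea concrete, the following minimal Python sketch ranks features by their mutual information with the class label; the dataset and the use of scikit-learn's estimator are illustrative choices, not part of the thesis. Note how a purely univariate ranking of this kind ignores redundancy and interaction among features, which is exactly the weakness discussed above.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Score each feature by its mutual information I(f; C) with the class label.
mi_scores = mutual_info_classif(X, y, random_state=0)

# Rank by relevance alone; redundancy and feature interaction are ignored.
ranking = np.argsort(mi_scores)[::-1]
print("Top 10 features by I(f; C):", ranking[:10])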
     To untie this knot, this work focuses on how to select a feature subset with maximal relevance, strong interdependence, and minimal redundancy for machine learning. The thesis proposes two different kinds of feature selection algorithms and one optimization method for information-theoretic feature selection algorithms. It also introduces a gene selection algorithm for cancer diagnosis. The main contributions and innovations are as follows:
     (1) First, a comprehensive overview of the state of the art in feature selection algorithms is given, followed by an analysis and discussion of the problems faced by current filter selection algorithms. This discussion and analysis lay a solid foundation for the further research work.
     (2) This thesis designs a feature evaluation and selection framework based on the Banzhaf power index. The framework first introduces a cooperative game theory based scheme to evaluate the power of each feature, in order to overcome the disadvantage that traditional information-theoretic selectors ignore features which as a group have strong discriminatory power but are weak as individuals. A filter feature selection process with the mRMR criterion is then proposed to handle the feature selection problem. Experimental results show that the proposed method works well; its efficiency and effectiveness, compared with other algorithms across four classifiers, suggest that it is practical for feature selection on high-dimensional data.
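As a rough illustration of the mRMR criterion that the framework builds on, the sketch below greedily picks the candidate maximizing relevance minus mean redundancy. The Banzhaf power weighting of the proposed framework is not reproduced here, and the dataset and subset size are illustrative assumptions.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

X, y = load_breast_cancer(return_X_y=True)
k = 8                                                    # illustrative subset size

relevance = mutual_info_classif(X, y, random_state=0)    # I(f; C)
selected = []
candidates = list(range(X.shape[1]))

for _ in range(k):
    def mrmr_score(f):
        # Relevance to the class minus mean redundancy to selected features.
        if not selected:
            return relevance[f]
        red = np.mean([mutual_info_regression(X[:, [s]], X[:, f],
                                              random_state=0)[0]
                       for s in selected])
        return relevance[f] - red

    best = max(candidates, key=mrmr_score)
    selected.append(best)
    candidates.remove(best)

print("mRMR-selected feature indices:", selected)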
     (3) Considering the many outstanding feature selection algorithms already available, this thesis introduces an optimization method for feature selection algorithms based on cooperative game theory. A feature evaluation algorithm based on the Shapley value is proposed to weight each feature according to its influence on the intricate and intrinsic interrelations among features. Compared with the Banzhaf power index, the Shapley value favors smaller winning coalitions, which is very helpful for selecting small feature subsets. Moreover, approximate joint mutual information and joint conditional mutual information are introduced to evaluate the independence and redundancy among features. Experimental results suggest that the proposed framework is practical for optimizing feature selection algorithms.
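The exact Shapley value is exponential to compute, so the sketch below approximates each feature's value by Monte Carlo sampling of permutations, with k-NN accuracy standing in for the coalition payoff v(S); both the payoff and the dataset are illustrative assumptions, not the thesis's exact evaluation function.

import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
rng = np.random.default_rng(0)
d = X.shape[1]

def payoff(subset):
    # Coalition value v(S): accuracy of a 3-NN classifier using features S.
    if not subset:
        return 0.0
    clf = KNeighborsClassifier(3).fit(Xtr[:, subset], ytr)
    return clf.score(Xte[:, subset], yte)

# Average each feature's marginal contribution over random permutations.
shapley = np.zeros(d)
n_perm = 30
for _ in range(n_perm):
    perm = rng.permutation(d)
    prev, coalition = 0.0, []
    for f in perm:
        coalition.append(f)
        cur = payoff(coalition)
        shapley[f] += cur - prev
        prev = cur
shapley /= n_perm

print("Approximate Shapley weights:", np.round(shapley, 3))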
     (4) This thesis also presents a feature selection algorithm based on dynamic weights. It first introduces a new scheme for analyzing feature relevance, interdependence, and redundancy using information-theoretic criteria. A dynamic weighting-based feature selection algorithm is then presented, which not only selects the most relevant features and eliminates redundant ones, but also tries to retain useful intrinsic groups of interdependent features. The primary characteristic of the method is that each feature is weighted according to its interaction with the selected features, and the weights are dynamically updated after each candidate feature is selected. The experimental results indicate that the proposed method achieves promising improvements in feature selection and classification accuracy.
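A schematic sketch of the dynamic-weighting idea follows: after every pick, each remaining feature's weight is rescaled by how the newly selected feature changes its relevance, via the ratio I(f; C | s) / I(f; C). The specific update rule, the quartile discretization, and the dataset are illustrative assumptions rather than the thesis's exact formulation.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import mutual_info_score

X, y = load_breast_cancer(return_X_y=True)
# Discretize each feature into quartile bins so MI reduces to counting.
Xd = np.array([np.digitize(col, np.quantile(col, [0.25, 0.5, 0.75]))
               for col in X.T]).T
d, k = Xd.shape[1], 8

def cond_mi(f, c, s):
    # I(f; c | s) for discrete arrays, by stratifying on the values of s.
    total = 0.0
    for v in np.unique(s):
        m = s == v
        total += m.mean() * mutual_info_score(f[m], c[m])
    return total

weights = np.array([mutual_info_score(Xd[:, j], y) for j in range(d)])  # I(f; C)
selected, candidates = [], list(range(d))

for _ in range(k):
    j = max(candidates, key=lambda f: weights[f])   # pick highest weight
    selected.append(j)
    candidates.remove(j)
    for f in candidates:                            # dynamic update step
        base = mutual_info_score(Xd[:, f], y)
        if base > 0:
            weights[f] *= cond_mi(Xd[:, f], y, Xd[:, j]) / base

print("Dynamically-weighted selection:", selected)

Features made redundant by the last pick see their weight shrink (the ratio falls below 1), while complementary features gain weight, which mirrors the retain-interdependent-groups behavior described above.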
     (5) Microarray analysis is widely accepted for human cancer diagnosis and classification; however, the high dimensionality of microarray data poses a great challenge to classification. This thesis therefore introduces a new gene selection method for cancer diagnosis and classification that retains useful intrinsic groups of interdependent genes. Its primary characteristic is that the relevance between each gene and the target is dynamically updated whenever a new gene is selected. The effectiveness of the method is validated by experiments on six publicly available microarray datasets: excellent classification accuracies are achieved with the key genes selected by the proposed algorithm, and the gene subsets selected by the DRGS method are much more enriched in gene sets related to cancer.
     These studies not only promote the further development and application of feature selection algorithms, but also suggest a new point of view for improving classification performance by selecting interdependent feature subsets. They therefore have both theoretical significance and practical application value.
