Research on Multi-Model Consensus Methods for Data Modeling
Abstract
Modeling of analytical chemistry data is an important topic in chemometrics. According to the modeling task, such problems can be divided into regression (calibration) and pattern recognition. Traditional single-model approaches are sensitive to noise in the data and to the sample size: when complex chemical measurements are analyzed, noise or a limited number of samples can greatly degrade the generalization performance of the model. To remedy these shortcomings of single-model approaches, multi-model consensus modeling (also called ensemble modeling or consensus modeling) has attracted wide attention in recent years and has been studied and applied extensively in many fields. In this dissertation, multi-model consensus methods are applied to the modeling and classification of near-infrared (NIR) spectra and microarray data, and the basic theory and applications of consensus modeling are investigated. The main contents are as follows:
     1. The basic principles of and common methods for modeling analytical data are reviewed, with emphasis on the basic theory, common algorithms, and current applications of multi-model consensus modeling.
     2. Consensus regression by random resampling was studied, and a consensus algorithm based on partial least squares (PLS), cPLS, was proposed. Rather than predicting unknown samples with the single model that fits best, cPLS perturbs the training set by random resampling, builds a series of PLS models, and selects a subset of the better-performing ones to predict unknown samples jointly. Calibration of the NIR spectra of corn samples showed that cPLS predicts better than an ordinary PLS model: the consensus of multiple PLS models improves both the prediction accuracy and the generalization performance of PLS.
     3. Local modeling was combined with the consensus strategy, giving a dynamic local consensus algorithm, CDL-PLS. Unlike ordinary PLS and bagging- or boosting-based PLS algorithms, CDL-PLS trains its member PLS models by dynamic local modeling: the samples used to train each member model are not drawn randomly from the original training set but are selected according to the Euclidean distance, in principal component space, between the training samples and the unknown sample to be predicted. Calibration of the NIR spectra of tobacco lamina samples showed that dynamic local modeling improves the prediction accuracy and stability of PLS, and that the consensus of multiple dynamic local PLS models improves accuracy and generalization further.
     4. By combining feature selection with non-repetitive feature subsets, a multi-classifier consensus method, CAMCUN (consensus analysis of multiple classifiers using non-repetitive variables), was established. CAMCUN builds its member classifiers on non-repetitive feature subsets chosen according to the predictive power of the features, so that the members are as uncorrelated as possible and the diversity of the ensemble is increased. Analysis of gene expression data showed that the prediction accuracy and generalization of CAMCUN are considerably better than those of its member classifiers. The chance correlation and the prediction confidence of CAMCUN were also assessed; the consensus of multiple classifiers was found to lower the chance correlation and raise the prediction confidence.
     5. Feature selection for pattern recognition was studied, and a method combining disjoint principal component analysis (disjoint PCA) with a genetic algorithm (GA) was proposed and applied to identifying differentially expressed genes in gene expression profiles. Disjoint PCA evaluates how well different combinations of genes discriminate between two classes of samples; because it accounts for the joint effect of genes, it better reflects how genes actually act in combination to regulate biological processes. The GA optimizes the combinations of genes. In addition, a new statistical method was proposed to assess the chance correlation of the identified differentially expressed genes. Compared with the t-test and SAM (significance analysis of microarrays), two methods commonly used in the literature, the genes identified by the new method showed stronger discriminatory power.
Modeling of analytical data is a common task in chemometrics. Such problems fall into two types: regression (or calibration) and pattern recognition. A single model is inherently susceptible to difficulties associated with data quality and sample number, so its generalization performance can deteriorate markedly on complex measurements. In this dissertation, the consensus strategy was applied to the modeling of NIR spectroscopic and microarray data, and the theory and applications of consensus modeling were investigated, including the following work:
     1. The basic theory of and frequently used methods for the modeling of analytical data were reviewed, with emphasis on the basic theory, modeling methods, and applications of consensus modeling.
     2. Based on random resampling, a partial least squares-based consensus regression method, cPLS, was proposed. In cPLS, rather than selecting one PLS model on the basis of the best fit, several PLS models satisfying a predefined performance criterion are selected and combined into a single consensus predictor (see the first sketch after this list). The effectiveness of cPLS was demonstrated by comparing its predictions with those of regular PLS in the calibration of the NIR spectra of corn samples. The results suggest that combining multiple individual PLS models by cPLS improves not only the accuracy of prediction but also the robustness of the model.
     3. By combining local modeling with the consensus strategy, a consensus dynamic local partial least squares method, CDL-PLS, was proposed. Unlike regular PLS and the consensus methods reported in the literature that use bagging or boosting to generate constituent predictors, CDL-PLS generates its constituent models with a dynamic local modeling technique: the samples used to develop each constituent predictor are not selected randomly from the original training set but according to their Euclidean distances to the unknown sample being predicted (see the second sketch after this list). The effectiveness of CDL-PLS was demonstrated by comparing its predictions with those of a regular PLS in the calibration of the near-infrared (NIR) spectra of tobacco lamina samples. Dynamic local modeling was found to increase the prediction accuracy and stability of a predictor, and combining multiple dynamic local PLS models further improved the prediction accuracy and robustness.
     4. A new classification method, CAMCUN (consensus analysis of multiple classifiers using non-repetitive variables), was developed. The central idea of CAMCUN is to combine multiple heterogeneous classifiers, each derived from a distinct set of features selected according to discriminatory power (see the third sketch after this list). CAMCUN was applied to the analysis of microarray gene expression data; the analyses included the classification of cancer from gene expression profiles, the assessment of the chance correlation and prediction confidence of the classifiers, and the identification of biomarkers. CAMCUN gave much better prediction accuracy, with higher prediction confidence and lower chance correlation, than any of its constituent classifiers.
     5. By integrating disjoint principal component analysis with a genetic algorithm (GA), a new feature selection method for pattern recognition was developed and applied to the identification of differentially expressed genes from microarray gene expression profiles. In this method, the discriminatory power of a combination of genes is obtained from disjoint PCA, and the GA searches for the best combination of genes (see the fourth sketch after this list). The significance of the differential expression of each gene was assessed by a statistical method. The differentially expressed genes identified by this method showed stronger discriminatory power than those obtained from the t-test and SAM (significance analysis of microarrays).
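The following minimal Python sketches illustrate the four methods summarized above. They are reconstructions from this abstract alone, not the thesis code; every function name, parameter value, and selection rule in them is an illustrative assumption. First, the cPLS idea: perturb the training set by random resampling, fit a series of PLS models, retain the better-performing ones, and average their predictions. An out-of-bag RMSE ranking stands in here for the thesis's predefined selection criterion.

```python
# Minimal sketch of the cPLS idea, assuming numpy arrays and a bootstrap
# perturbation of the training set.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error

def cpls_predict(X_train, y_train, X_test, n_models=50, n_keep=10,
                 n_components=8, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y_train)
    scored = []
    for _ in range(n_models):
        idx = rng.choice(n, size=n, replace=True)     # perturb the training set
        oob = np.setdiff1d(np.arange(n), idx)         # held-out rows for scoring
        model = PLSRegression(n_components=n_components)
        model.fit(X_train[idx], y_train[idx])
        rmse = np.sqrt(mean_squared_error(
            y_train[oob], model.predict(X_train[oob]).ravel()))
        scored.append((rmse, model))
    scored.sort(key=lambda t: t[0])                   # best members first
    members = [m for _, m in scored[:n_keep]]
    # Consensus: average the predictions of the selected PLS models.
    return np.mean([m.predict(X_test).ravel() for m in members], axis=0)
```

A call such as `y_hat = cpls_predict(X_cal, y_cal, X_new)` would then return the consensus prediction for the new spectra.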
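Second, a sketch of the dynamic local modeling behind CDL-PLS: for each unknown sample, the training samples are ranked by their Euclidean distance to it in principal component space, local PLS models are fitted on the nearest neighbors, and their predictions are combined. How the thesis diversifies its member models is not specified in this abstract, so varying the neighborhood size is an assumption made purely for illustration.

```python
# Minimal sketch of CDL-PLS-style dynamic local consensus modeling; the
# neighborhood sizes and component counts are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression

def cdl_pls_predict(X_train, y_train, X_test, n_pcs=5,
                    neighborhood_sizes=(30, 40, 50), n_components=5):
    pca = PCA(n_components=n_pcs).fit(X_train)
    T_train, T_test = pca.transform(X_train), pca.transform(X_test)
    preds = []
    for t, x in zip(T_test, X_test):
        # Distance of every training sample to this unknown, in PC space.
        order = np.argsort(np.linalg.norm(T_train - t, axis=1))
        members = []
        for k in neighborhood_sizes:
            local = order[:k]                  # k nearest training samples
            model = PLSRegression(n_components=n_components)
            model.fit(X_train[local], y_train[local])
            members.append(model.predict(x.reshape(1, -1)).item())
        preds.append(np.mean(members))         # consensus of the local models
    return np.array(preds)
```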
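Third, a sketch of the CAMCUN idea: rank the features by discriminatory power, deal the top-ranked features into non-repetitive (disjoint) subsets, train one member classifier per subset, and combine the members by majority vote. Binary labels coded 0/1 are assumed, and the ANOVA F score and logistic-regression members are illustrative stand-ins for the thesis's choices.

```python
# Minimal sketch of the CAMCUN idea, assuming binary class labels coded 0/1.
import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.linear_model import LogisticRegression

def camcun_fit_predict(X_train, y_train, X_test, n_members=5, subset_size=10):
    scores, _ = f_classif(X_train, y_train)            # discriminatory power
    ranked = np.argsort(scores)[::-1][:n_members * subset_size]
    votes = []
    for i in range(n_members):
        # Deal the ranked features round-robin so each member gets a disjoint,
        # comparably strong ("non-repetitive") subset of variables.
        cols = ranked[i::n_members]
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X_train[:, cols], y_train)
        votes.append(clf.predict(X_test[:, cols]))
    votes = np.array(votes)
    # Majority vote; the vote margin can serve as a rough measure of
    # prediction confidence in the spirit of the thesis.
    return (votes.mean(axis=0) >= 0.5).astype(int)
```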
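Fourth, a sketch of coupling disjoint PCA with a GA: a separate PCA model is fitted per class on the candidate genes, the gene combination is scored by how much worse each sample is reconstructed by the wrong class's model than by its own, and a small GA searches for high-scoring combinations. The ratio-based score, the PCA rank, and all GA settings are assumptions; the thesis's exact fitness definition is not reproduced here. Binary labels 0/1 are again assumed.

```python
# Sketch of scoring a candidate gene subset by disjoint (class-wise) PCA and
# searching subsets with a small GA; all settings are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

def _residual(X, pca):
    # Squared reconstruction error of each row under a fitted PCA model.
    R = X - pca.inverse_transform(pca.transform(X))
    return (R ** 2).sum(axis=1)

def disjoint_pca_score(X, y, genes, n_components=2):
    Xg = X[:, genes]
    pcas = {c: PCA(n_components=n_components).fit(Xg[y == c]) for c in (0, 1)}
    # Discriminatory power of the gene combination: residual to the wrong-class
    # model divided by the residual to the own-class model (larger = better).
    own = np.concatenate([_residual(Xg[y == c], pcas[c]) for c in (0, 1)])
    other = np.concatenate([_residual(Xg[y == c], pcas[1 - c]) for c in (0, 1)])
    return float(np.mean(other / (own + 1e-12)))

def ga_select(X, y, n_genes, subset_size=5, pop=30, gens=40, seed=0):
    # Minimal GA over fixed-size gene subsets: tournament selection, one-point
    # crossover on the index lists, point mutation, and duplicate repair.
    rng = np.random.default_rng(seed)
    popn = [rng.choice(n_genes, subset_size, replace=False) for _ in range(pop)]
    for _ in range(gens):
        fit = np.array([disjoint_pca_score(X, y, g) for g in popn])
        new = []
        for _ in range(pop):
            a = rng.integers(pop, size=2)              # two 2-way tournaments
            b = rng.integers(pop, size=2)
            pa = popn[a[np.argmax(fit[a])]]
            pb = popn[b[np.argmax(fit[b])]]
            cut = int(rng.integers(1, subset_size))
            child = np.unique(np.concatenate([pa[:cut], pb[cut:]]))
            while len(child) < subset_size:            # repair lost duplicates
                child = np.unique(np.append(child, rng.integers(n_genes)))
            if rng.random() < 0.2:                     # point mutation
                g = int(rng.integers(n_genes))
                if g not in child:
                    child[int(rng.integers(subset_size))] = g
            new.append(child)
        popn = new
    fit = np.array([disjoint_pca_score(X, y, g) for g in popn])
    return popn[int(np.argmax(fit))]
```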
