Multivariate Data Analysis Methods Based on Feature Selection and Their Application in Spectroscopic Research
Abstract
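Feature selection is an important research topic in multivariate data analysis. It removes irrelevant and redundant information, reduces data dimensionality and algorithmic complexity, and improves the generalization ability and interpretability of models, and therefore plays an important role in data analysis.
     This dissertation takes proteomic mass spectrometry data and near-infrared spectroscopy data as its research objects and studies feature (variable) selection methods for high-dimensional data. The proteomic mass spectrometry data are analyzed to search for potential biomarkers and to perform pattern recognition of diseased and healthy samples; the near-infrared spectral data are studied to eliminate the influence of collinearity through variable screening and thereby build stable, efficient multivariate calibration models.
     The work in this dissertation mainly covers the following aspects: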
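     (1) A progressive feature selection method based on uncorrelated linear discriminant analysis is proposed. The method comprises four steps: data denoising and normalization; data binning and bin-variable screening; bin data processing; and uncorrelated linear discriminant analysis for feature screening and sample classification. Analysis of SELDI-TOF mass spectrometry data from ovarian cancer serum samples yielded potential biomarkers that can identify ovarian cancer samples, and the resulting classification model achieved 100% sensitivity and specificity.
     (2) A feature selection method combining independent component analysis with uncorrelated linear discriminant analysis is proposed. The method comprises three steps: 1) independent component decomposition; 2) nonparametric statistical testing to select discriminatory independent components; 3) uncorrelated linear discriminant analysis to screen potential biomarkers and build the classification model. The method was applied separately to a colorectal cancer dataset and an ovarian cancer dataset. The classification models built from the finally selected features achieved 100% sensitivity on both datasets, with specificities of 100% and 96.77%, respectively.
     (3) A feature selection method based on the F-score and partial least squares discriminant analysis (PLS-DA) is established. Peaks are first extracted from the mass spectral signals through preprocessing; the variables are then ranked by F-score according to their discriminating ability; finally, potential biomarkers are screened stepwise with replacement using PLS-DA. On the colorectal cancer and ovarian cancer datasets, the method achieved specificities of 100% and 96.77% and sensitivities of 95.24% and 100%, respectively.
     (4) A recursive partial least squares method based on Monte Carlo sampling is proposed. The method uses Monte Carlo sampling to build multiple data subsets, models each subset repeatedly with PLS, selects multiple good variable subsets using the regression coefficients as the screening criterion, and determines the final optimal variable set through statistical analysis. The method was applied to several different near-infrared spectral datasets and compared with other methods; the results show that it can effectively perform variable selection for near-infrared spectra.
     (5) A variable selection method based on spectral purity values is proposed for wavelength selection in quantitative near-infrared modeling. After the purity value of each spectral variable is calculated, the variables are sorted in descending order of purity, and the best variables are selected step by step by examining each variable's contribution to the model in turn using PLS cross-validation. The method was applied to variable selection on several different near-infrared spectral datasets; the results show that it is simple and effective.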
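The F-score ranking in step (3) above can be illustrated with a short sketch. It assumes the commonly used two-class definition of the F-score (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances); the abstract does not state which variant the dissertation uses. The function names and the toy data are hypothetical.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-score of one variable from its values in the two classes.

    Assumed definition: between-class separation of the means divided by
    the sum of the within-class sample variances (n - 1 denominators).
    A variable that is constant within both classes would divide by zero.
    """
    overall = mean(list(pos) + list(neg))
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    return between / within


def rank_by_f_score(X, y):
    """Return (column indices sorted by decreasing F-score, all scores).

    X is a list of samples (rows of peak intensities); y holds 1 for
    disease samples and 0 for healthy samples.
    """
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: column 1 separates the classes, column 0 does not.
X = [[1.0, 5.0], [2.0, 6.0], [1.5, 5.5], [1.2, 1.0], [1.8, 2.0], [1.6, 1.5]]
y = [1, 1, 1, 0, 0, 0]
order, scores = rank_by_f_score(X, y)
print(order)  # [1, 0]
```

The ranked list would then be handed to the PLS-DA selection step, which is not shown here.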
Feature selection is one of the most important aspects of multivariate data analysis. It eliminates redundant and irrelevant information and reduces data dimensionality, which simplifies computation. It also improves the generalization performance and interpretability of models. Feature selection therefore plays an important role in data analysis.
     This dissertation studies feature selection methods for high-dimensional data, taking proteomic mass spectrometric (MS) data and near-infrared spectroscopic (NIRS) data as the research objects. The main aims of the proteomic MS data analysis were the discovery of potential biomarkers and sample classification; the main aim of the NIR data analysis was wavelength selection to eliminate the effect of collinearity and enable effective modeling.
     The main work in this dissertation is as follows:
     (1) A feature selection method called ULDA-HFS (uncorrelated linear discriminant analysis based heuristic feature selection) was proposed, which mainly includes three steps: (a) dimensionality reduction and data normalization; (b) data binning and discriminant bin selection; (c) ULDA for feature selection and sample classification. An ovarian cancer serum SELDI-TOF (surface-enhanced laser desorption/ionization time-of-flight) MS dataset was analyzed with the proposed method, yielding several potential biomarkers that could discriminate ovarian cancer samples from healthy samples. The classification model built from these potential biomarkers achieved 100% specificity and sensitivity.
     (2) A strategy based on independent component analysis (ICA) and ULDA was proposed for proteomic profile analysis and potential biomarker discovery from proteomic mass spectra of cancer and control samples. The method mainly includes three steps: (a) ICA decomposition of the mass spectra; (b) selection of discriminatory independent components (ICs) using a nonparametric test; and (c) selection of specific peaks (m/z locations) as potential biomarkers and construction of classification models by ULDA. A colorectal cancer dataset and an ovarian cancer dataset were analyzed with the proposed method. The classification results yielded specificities of 100% and 96.77% on the colorectal and ovarian cancer datasets, respectively, and 100% sensitivity on both datasets.
     (3) A feature selection method based on the F-score and partial least squares discriminant analysis (PLS-DA) was presented. After preprocessing, the peaks contained in the signals were picked and the variables were sorted according to their F-scores; potential biomarkers were then selected by performing PLS-DA in a forward selection strategy. Classification with the potential biomarkers selected by the proposed method yielded 100% specificity and 95.24% sensitivity on a colorectal cancer dataset, and 96.77% specificity and 100% sensitivity on an ovarian cancer dataset.
     (4) A feature selection method named Monte Carlo Sampling-based Recursive Partial Least Squares (MCS-RPLS) was proposed. The method first creates a number of sub-datasets using the Monte Carlo sampling technique, then models each subset repeatedly with PLS and selects a feature subset on each, taking the regression coefficients as the criterion, and finally determines the optimum feature set through statistical analysis of the feature subsets. The method was applied to several NIR datasets and compared with several other methods; the results showed that it could effectively select useful features from NIR data for multivariate calibration.
     (5) A feature selection method based on the purity of spectral variables was proposed and used for wavelength selection from NIR datasets for quantitative modeling. After the purity of each spectral variable (i.e., wavelength) is calculated, the variables are sorted in descending order of purity and the optimum variables are selected step by step, with the contribution of each variable to the calibration model tested by PLS cross-validation. The method was applied to several NIR datasets, and the results indicated its simplicity and effectiveness.
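The general idea behind step (4) above (random sample subsets, a PLS model per subset, regression coefficients as the ranking criterion, and a final choice based on statistics over the subsets) can be sketched roughly as follows. This is not the MCS-RPLS algorithm itself, whose details the abstract does not give. It assumes a one-component PLS1 model, omits the recursive remodeling, and uses hypothetical parameters (`n_runs`, `frac`, `top_k`, `min_freq`) and toy data.

```python
import random


def pls1_coefficients(X, y):
    """Regression coefficients of a one-component PLS1 model.

    X and y are mean-centered here. With one latent variable the
    coefficient vector is b = w * (t'y / t't), where w = X'y / ||X'y||
    and t = Xw.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return [wj * q for wj in w]


def monte_carlo_selection(X, y, n_runs=200, frac=0.8, top_k=2,
                          min_freq=0.9, seed=0):
    """Keep variables that rank in the top_k by |coefficient| often enough.

    All thresholds are illustrative assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_runs):
        idx = rng.sample(range(n), int(frac * n))  # random subset of samples
        b = pls1_coefficients([X[i] for i in idx], [y[i] for i in idx])
        best = sorted(range(p), key=lambda j: abs(b[j]), reverse=True)[:top_k]
        for j in best:
            counts[j] += 1
    freq = [c / n_runs for c in counts]
    return [j for j in range(p) if freq[j] >= min_freq], freq


# Hypothetical toy data: columns 0 and 2 drive y; columns 1 and 3 are
# nearly constant, uninformative channels.
X = [[i % 2, 0.001 * (i % 5), (i // 2) % 2, 0.001 * (i % 3)] for i in range(40)]
y = [3.0 * row[0] - 2.0 * row[2] for row in X]
selected, freq = monte_carlo_selection(X, y)
print(selected)  # [0, 2]
```

Because the columns are not scaled, coefficient magnitudes depend on each variable's variance; whether and how the dissertation scales the spectra is not stated in the abstract.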
    [146]Hubert, M.; Rousseeuw, P.J.; Verboven, S. A fast method for robust principal components with applications to chemometrics [J]. Chemometrics and intelligent laboratory systems,2002,60(1-2):101-111.
    [147]Croux, C.; Ruiz-Gazen, A. High breakdown estimators for principal components:the projection-pursuit approach revisited [J]. Journal of Multivariate Analysis,2005, 95(1):206-226.
    [148]Croux, C.; Filzmoser, P.; Oliveira, M.R. Algorithms for projection-pursuit robust principal component analysis [J]. Chemometrics and intelligent laboratory systems, 2007,87(2):218-225.
    [149]Hubert, M.; Engelen, S. Robust PCA and classification in biosciences [J]. Bioinformatics,2004,20(11):1728.
    [150]Stanimirova, I.; Walczak, B.; Massart, DL; Simeonov, V. A comparison between two robust PCA algorithms [J]. Chemometrics and intelligent laboratory systems,2004, 71(1):83-95.
    [151]Brown, B.M.; Hall, P.; Young, G.A. On the Effect of Inliers on the Spatial Median [J]. Journal of Multivariate Analysis,1997,63(1):88-104.
    [152]Small, C.G. A survey of multidimensional medians [J]. International Statistical Review/Revue Internationale de Statistique,1990,58(3):263-277.
    [153]Blanchard, G.; Bousquet, O.; Zwald, L. Statistical properties of kernel principal component analysis [J]. Machine learning,2007,66(2):259-294.
    [154]Liu, X.; Kruger, U.; Littler, T.; Xie, L.; Wang, S. Moving window kernel PCA for adaptive monitoring of nonlinear processes [J]. Chemometrics and intelligent laboratory systems,2009,96(2):132-143.
    [155]Jeng, J.C. Adaptive process monitoring using efficient recursive PCA and moving window PCA algorithms [J]. Journal of the Taiwan Institute of Chemical Engineers, 2010,41(4):475-481.
    [156]de la Fuente, R.L.N.; Garcia-Mu oz, S.; Biegler, L.T. An efficient nonlinear programming strategy for PCA models with incomplete data sets [J]. Journal of Chemometrics,2010,24(6):301-311.
    [157]Robotti, E.; Demartini, M.; Gosetti, F.; Calabrese, G.; Marengo, E. Development of a classification and ranking method for the identification of possible biomarkers in two-dimensional gel-electrophoresis based on principal component analysis and variable selection procedures [J]. Mol. BioSyst.,2011,7(3):677-686.
    [158]Plumbley, M.D.; Oja, E. A nonnegative PCA algorithm for independent component analysis [J]. Neural Networks, IEEE Transactions on,2004,15(1):66-76.
    [159]Lipovetsky, S. PCA and SVD with nonnegative loadings [J]. Pattern recognition, 2009,42(1):68-76.
    [160]Han, H. Nonnegative principal component analysis for mass spectral serum profiles and biomarker discovery [J]. BMC bioinformatics,2010, 11(Suppl 1):S1.
    [161]Zou, H.; Hastie, T.; Tibshirani, R. Sparse principal component analysis [J]. Journal of computational and graphical statistics,2006,15(2):265-286.
    [162]Andersson, M. A comparison of nine PLS1 algorithms [J]. Journal of Chemometrics, 2009,23(10):518-529.
    [163]Wold, S.; Martens, H.; Wold, H. The multivariate calibration problem in chemistry solved by the PLS method [J]. Matrix Pencils,1982:286-293.
    [164]Martens, H.; Naes, T. Multivariate calibration [M]. New York:John Wiley & Sons Inc,1992.
    [165]Manne, R. Analysis of two partial-least-squares algorithms for multivariate calibration [J]. Chemometrics and intelligent laboratory systems,1987,2(1-3): 187-197.
    [166]Paige, C.C. A Bidiagonalization Algorithm for Sparse Linear Equations and Least-Squares Problems [J]. ACM Trans. Math. Softw,1982,8(1):43-71.
    [167]de Jong, S. SIMPLS:an alternative approach to partial least squares regression [J]. Chemometrics and intelligent laboratory systems,1993,18(3):251-263.
    [168]Dayal, B. Improved PLS algorithms [J]. Journal of Chemometrics,1997,11(1): 73-85.
    [169]Hoskuldsson, A. PLS regression methods [J]. Journal of Chemometrics,1988,2(3): 211-228.
    [170]Ye, J.; Janardan, R.; Li, Q.; Park, H. Feature reduction via generalized uncorrelated linear discriminant analysis [J]. IEEE Transactions on Knowledge and Data engineering,2006,18(10):1312-1322.
    [171]Yuan, D.; Liang, Y.; Yi, L.; Xu, Q.; Kvalheim, O.M. Uncorrelated linear discriminant analysis (ULDA):A powerful tool for exploration of metabolomics data [J]. Chemometrics and intelligent laboratory systems,2008,93(1):70-79.
    [172]Ye, J.; Janardan, R.; Park, C.H.; Park, H. An optimization criterion for generalized discriminant analysis on undersampled problems [J]. Pattern Analysis and Machine Intelligence, IEEE Transactions on,2004,26(8):982-994.
    [173]Ye, J.; Li, T.; Xiong, T.; Janardan, R. Using uncorrelated discriminant analysis for tissue classification with gene expression data [J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics,2004,1(4):181-190.
    [174]张小丹;吕建平.基于SVM的非相关线性判别分析算法研究[J].计算机工程与应用,2008,44(4):227-229.
    [175]Comon, P. Independent component analysis, a new concept? [J]. Signal processing, 1994,36(3):287-314.
    [176]Hyvarinen, A.; Oja, E. Independent component analysis:algorithms and applications [J]. Neural networks,2000,13(4-5):411-430.
    [177]De Lathauwer, L.; De Moor, B.; Vandewalle, J. An introduction to independent component analysis [J]. Journal of Chemometrics,2000,14(3):123-149.
    [178]Hyvarinen, A.; Karhunen, J.; Oja, E. Independent component analysis [M]. New York:Wiley & Sons. Inc,2001.
    [179]张红娟.扩展独立成分分析的若干算法及其应用研究[D].博士学位论文,大连理工大学:2008.
    [180]Hyvarinen, A. New approximation of differential entropy for independent component analysis and projection pursuit [C], Proceedings of the Conference on Advances in Neural Information Processing Systems,1998(10):273-279.
    [181]Roberts, S.; Everson, R. Independent component analysis:principles and practice [M]. Cambridge Univ Pr,2001.
    [182]Gaeta, M.; Lacoume, J.L. Source separation without a priori knowledge:the maximum likelihood solution [C], Proceedings of European Signal Processing Conference (EUSIPCO),1990:621-624.
    [183]Pham, D.T.; Garrat, P.; Jutten, C. Separation of a mixture of independent sources through a maximum likelihood approach [C], Proceedings of European Signal Processing Conference (EUSIPCO),1992:771-774.
    [184]Amari, S.; Cichocki, A.; Yang, H.H. A new learning algorithm for blind signal separation [J], Advances in neural information processing systems,1996,8(757-763.
    [185]Lee, M. A unifying information-theoretic framework for independent component analysis [J]. Computers & Mathematics with Applications,2000,39(11):1-21.
    [186]MacKay, D.J.C Maximum likelihood and covariant algorithms for independent component analysis [C],1996:available at ftp://wol.ra.phy cam.ac.uk/pub/mackay/ica.pa.gz.
    [187]Amari, S.I. Natural gradient works efficiently in learning [J]. Neural computation, 1998,10(2):251-276.
    [188]Bell, A.J.; Sejnowski, T.J. An information-maximization approach to blind separation and blind deconvolution [J]. Neural computation,1995,7(6):1129-1159.
    [189]Roberts, S.; Choudrey, R. Data decomposition using independent component analysis with prior constraints [J]. Pattern recognition,2003,36(8):1813-1825.
    [190]Plumbley, M. Conditions for nonnegative independent component analysis [J]. Signal Processing Letters, IEEE,2002,9(6):177-180.
    [191]Plumbley, M.D. Algorithms for nonnegative independent component analysis [J]. Neural Networks, IEEE Transactions on,2003,14(3):534-543.
    [192]Zhang, N.; Lu, J.; Yahagi, T. A method of independent component analysis based on radial basis function networks using noise estimation [J]. Electronics and Communications in Japan,2008,91(3):45-52.
    [193]Mehri Dehnavi, A.R.; Farahabadi, I.; Rabbani, H.; Farahabadi, A.; Parsa Mahjoob, M.; Rajabi Dehnavi, N. Detection and classification of cardiac ischemia using vectorcardiogram signal via neural network [J]. Journal of Research in Medical Sciences,2011,16(2):136-142.
    [194]Whitmer, D.; Worrell, G.; Stead, M.; Lee, I.K.; Makeig, S. Utility of Independent Component Analysis for Interpretation of Intracranial EEG [J]. Frontiers in Human Neuroscience,2010:doi:10.3389/fnhum.2010.00184.
    [195]Varoquaux, G.; Sadaghiani, S.; Pinel, P., Kleinschmidt, A.; Poline, J.B.; Thirion, B. A group model for stable multi-subject ICA on fMRI datasets [J]. Neuroimage,2010, 51(1):288-299.
    [196]Calhoun, V.D.; Liu, J.; AdalI, T. A review of group ICA for fMRI data and ICA for joint inference of imaging, genetic, and ERP data [J]. Neuroimage,2009,45(1): S163-S172.
    [197]Frigyesi, A.; Veerla, S.; Lindgren, D.; H glund, M. Independent component analysis reveals new and biologically significant structures in micro array data [J]. BMC bioinformatics,2006,7(1):290.
    [198]Engreitz, J.M.; Daigle Jr, B.J.; Marshall, J.J.; Altman, R.B. Independent component analysis:Mining microarray data for fundamental human gene expression modules [J]. Journal of biomedical informatics,2010,43(6):932-944.
    [199]Liu, K.H.; Li, B.; Wu, Q.Q.; Zhang, J.; Du, J.X.; Liu, G.Y. Microarray data classification based on ensemble independent component selection [J]. Computers in Biology and Medicine,2009,39(11):953-960.
    [200]Zheng, C.H.; Huang, D.S.; Kong, X.Z.; Zhao, X.M. Gene expression data classification using consensus independent component analysis [J]. Genomics, Proteomics & Bioinformatics,2008,6(2):74-82.
    [201]Chen, L.; Xuan, J.; Wang, C.; Shih, Ie-M.; Wang, Y.; Zhang, Z.; Clarke, R. Knowledge-guided multi-scale independent component analysis for biomarker identification [J]. BMC bioinformatics,2008,9:416.
    [202]Han, H.; Li, X.L. Multi-resolution independent component analysis for high-performance tumor classification and biomarker discovery [J]. BMC bioinformatics,2011,12(Suppl 1):S7.
    [203]Dowsey, A.W.; English, J.A.; Lisacek, F.; Morris, J.S.; Yang, G.Z.; Dunn, M.J. Image analysis tools and emerging algorithms for expression proteomics [J]. Proteomics,2010,10(23):4226-4257.
    [204]Safavi, H.; Correa, N.; Xiong, W.; Roy, A.; Adali, T.; Korostyshevskiy, V.R.; Whisnant, C.C.; Seillier-Moiseiwitsch, F. Independent component analysis of 2-D electrophoresis gels [J]. Electrophoresis,2008,29(19):4017-4026.
    [205]Mantini, D.; Petrucci, F.; Del Boccio, P.; Pieragostino, D.; Di Nicola, M.; Lugaresi, A.; Federici, G.; Sacchetta, P.; Di Ilio, C.; Urbani, A. Independent component analysis for the extraction of reliable protein signal profiles from MALDI-TOF mass spectra [J]. Bioinformatics,2008,24(1):63.
    [206]Want, E. Challenges in applying chemometrics to LC-MS-based global metabolite profile data [J]. Bioanalysis,2009,1(4):805-819.
    [207]Shao, X.; Wang, G.; Wang, S.; Su, Q. Extraction of mass spectra and chromatographic profiles from overlapping GC/MS signal with background [J]. Analytical chemistry,2004,76(17):5143-5148.
    [208]Wang, G., Cai, W.; Shao, X. A primary study on resolution of overlapping GC-MS signal using mean-field approach independent component analysis [J]. Chemometrics and intelligent laboratory systems,2006,82(1-2):137-144.
    [209]Yao, Z.; Zhang, K.; Liu, H.; Su, H. Eliminate indeterminacies of independent component analysis for chemometrics [J]. Progress in Natural Science,2008,18(8): 1009-1014.
    [210]Shao, X.; Wang, W.; Hou, Z.; Cai, W. A new regression method based on independent component analysis [J]. Talanta,2006,69(3):676-680.
    [211]Kaneko, H.; Arakawa, M.; Funatsu, K. Development of a new regression analysis method using independent component analysis [J]. Journal of chemical information and modeling,2008,48(3):534-541.
    [212]Pasadakis, N.; Kardamakis, A.A. Identifying constituents in commercial gasoline using Fourier transform-infrared spectroscopy and independent component analysis [J]. Analytica chimica acta,2006,578(2):250-255.
    [213]Kano, M.; Tanaka, S.; Hasebe, S.; Hashimoto, I.; Ohno, H. Monitoring independent components for fault detection [J]. AIChE Journal,2003,49(4):969-976.
    [214]Bouveresse, D.J.R.; Benabid, H.; Rutledge, D.N. Independent component analysis as a pretreatment method for parallel factor analysis to eliminate artefacts from multiway data [J]. Analytica chimica acta,2007,589(2):216-224.
    [215]Wang, G.; Ding, Q.; Hou, Z. Independent component analysis and its applications in signal processing for analytical chemistry [J]. TrAC Trends in Analytical Chemistry, 2008,27(4):368-376.
    [216]Wulfkuhle, J.D.; Liotta, L.A.; Petricoin, E.F. Proteomic applications for the early detection of cancer [J]. Nature reviews cancer,2003,3(4):267-275.
    [217]Idborg-Bjorkman, H.; Edlund, P.O.; Kvalheim, O.M.; Schuppe-Koistinen, I.; Jacobsson, S.P. Screening of biomarkers in rat urine using LC/electrospray ionization-MS and two-way data analysis [J]. Analytical chemistry,2003,75(18): 4784-4792.
    [218]Jiye, A.; Trygg, J.; Gullberg, J.; Johansson, A.I.; Jonsson, P.; Antti, H.; Marklund, S.L.; Moritz, T. Extraction and GC/MS analysis of the human blood plasma metabolome [J]. Analytical chemistry,2005,77(24):8086-8094.
    [219]Skytt,.; Thy sell, E.; Stattin, P.; Stenman, U.H.; Antti, H.; Wikstr m, P. SELDI-TOF MS versus prostate specific antigen analysis of prospective plasma samples in a nested case-control study of prostate cancer [J]. international Journal of Cancer,2007, 121(3):615-620.
    [220]Bruce, S.J.; Jonsson, P., Antti, H.; Cloarec, O.; Trygg, J.; Marklund, S.L.; Moritz, T. Evaluation of a protocol for metabolic profiling studies on human blood plasma by combined ultra-performance liquid chromatography/mass spectrometry:From extraction to data analysis [J]. Analytical biochemistry,2008,372(2):237-249.
    [221]Hendriks, M.M.W.B.; Smit, S.; Akkermans, W.L.M.W.; Reijmers, T.H.; Eilers, P.H.C.; Hoefsloot, H.C.J.; Rubingh, C.M.; de Koster, C.G.; Aerts, J.M.; Smilde, A.K. How to distinguish healthy from diseased? Classification strategy for mass spectrometry-based clinical proteomics [J]. Proteomics,2007,7(20):3672-3680.
    [222]Smit, S.; van Breemen, M.J.; Hoefsloot, H.C.J.; Smilde, A.K.; Aerts, J.M.F.G.; de Koster, C.G. Assessing the statistical validity of proteomics based biomarkers [J]. Analytica chimica acta,2007,592(2):210-217.
    [223]Berven, F.S.; Kroksveen, A.C.; Berle, M.; Rajalahti, T.; Flikka, K.; Arneberg, R.; Myhr, K.M.; Vedeler, C.; Kvalheim, O.M.; Ulvik, R.J. Pre-analytical influence on the low molecular weight cerebrospinal fluid proteome [J]. PROTEOMICS-Clinical Applications,2007,1(7):699-711.
    [224]Rajalahti, T.; Arneberg, R.; Kroksveen, A.C.; Berle, M.; Myhr, K.M.; Kvalheim, O.M. Discriminating variable test and selectivity ratio plot:quantitative tools for interpretation and variable (biomarker) selection in complex spectral or chromatographic profiles [J]. Analytical chemistry,2009,81(7):2581-2590.
    [225]Jeffries, N.O. Performance of a genetic algorithm for mass spectrometry proteomics [J]. BMC bioinformatics,2004,5(1):180.
    [226]Rajalahti, T.; Arneberg, R.; Berven, F.S.; Myhr, K.M.; Ulvik, R.J.; Kvalheim, O.M. Biomarker discovery in mass spectral profiles by means of selectivity ratio plot [J]. Chemometrics and intelligent laboratory systems,2009,95(1):35-48.
    [227]Rousseau, R.; Govaerts, B.; Verleysen, M.; Boulanger, B. Comparison of some chemometric tools for metabonomics biomarker identification [J]. Chemometrics and intelligent laboratory systems,2008,91(1):54-66.
    [228]Liu, H.; Setiono, R. Chi2: Feature selection and discretization of numeric attributes [C], Proc. IEEE 7th International Conference on Tools with Artificial Intelligence, 1995:388-391
    [229]Liu, H.; Setiono, R. Feature selection via discretization [J]. Knowledge and Data Engineering, IEEE Transactions on,1997,9(4):642-645.
    [230]Majumder, SK; Gupta, A.; Gupta, S.; Ghosh, N.; Gupta, PK. Multi-class classification algorithm for optical diagnosis of oral cancer [J]. Journal of Photochemistry and Photobiology B:Biology,2006,85(2):109-117.
    [231]Zhang, M.; Wang, W.; Du, Y. ULDA-based heuristic feature selection method for proteomic profile analysis and biomarker discovery [J]. Chemometrics and intelligent laboratory systems,2010,102(2):84-90.
    [232]Hyvarinen, A. Fast and robust fixed-point algorithms for independent component analysis [J]. Neural Networks, IEEE Transactions on,1999,10(3):626-634.
    [233]Hyvarinen, A.; Oja, E. A fast fixed-point algorithm for independent component analysis [J]. Neural computation,1997,9(7):1483-1492.
    [234]Kaiser, JF. Nonrecursive digital filter design using the 10-sinh window function [C], Proceedings of IEEE International Symposium on Circuit Theory,1974:20-23.
    [235]Mann, H.B.; Whitney, D.R. On a test of whether one of two random variables is stochastically larger than the other [J]. The Annals of Mathematical Statistics,1947, 18(1):50-60.
    [236]Currie, L.A. Detection and quantification limits:origins and historical overview [J]. Analytica chimica acta,1999,391(2):127-134.
    [237]Alexandrov, T.; Decker, J.; Mertens, B.; Deelder, A.M.; Tollenaar, R.A.E.M.; Maass, P.; Thiele, H. Biomarker discovery in MALDI-TOF serum protein profiles using discrete wavelet transformation [J]. Bioinformatics,2009,25(5):643.
    [238]Fushiki, T.; Fujisawa, H.; Eguchi, S. Identification of biomarkers from mass spectrometry data using a" common" peak approach [J]. BMC bioinformatics,2006, 7(1):358-366.
    [239]Abeel, T.; Helleputte, T.; Van de Peer, Y.; Dupont, P.; Saeys, Y. Robust biomarker identification for cancer diagnosis with ensemble feature selection methods [J]. Bioinformatics,2010,26(3):392-398.
    [240]Krishnapuram, B.; Carin, L.; Hartemink, A.J. Joint classifier and feature optimization for comprehensive cancer diagnosis using gene expression data [J]. Journal of Computational Biology,2004,11(2-3):227-242.
    [241]Guyon, I.; Elisseeff, A. An introduction to variable and feature selection [J]. The Journal of Machine Learning Research,2003,3:1157-1182.
    [242]谢娟英;王春霞;蒋帅;张琰;将传统F.基于改进的F-score与支持向量机的特征选择方法[J].计算机应用,2010,30(4):993-996
    [243]Chen, Y.W.; Lin, C.J. Combining SVMs with various feature selection strategies [J]. Feature Extraction,2006:315-324.
    [244]Heinzmann, S.S.; Brown, I.J.; Chan, Q.; Bictash, M.; Dumas, M.E.; Kochhar, S.; Stamler, J.; Holmes, E.; Elliott, P.; Nicholson, J.K. Metabolic profiling strategy for discovery of nutritional biomarkers:proline betaine as a marker of citrus consumption [J]. The American journal of clinical nutrition,2010,92(2):436.
    [245]Legido-Quigley, C.; Stella, C.; Perez-Jimenez, F.; Lopez-Miranda, J.; Ordovas, J.; Powell, J.; van-der-Ouderaa, F.; Ware, L.; Lindon, J.C.; Nicholson, J.K. Liquid chromatography-mass spectrometry methods for urinary biomarker detection in metabonomic studies with application to nutritional studies [J]. Biomedical Chromatography,2010,24(7):737-743.
    [246]Zeng, M.; Liang, Y.; Li, H.; Wang, B.; Chen, X. A metabolic profiling strategy for biomarker screening by GC-MS combined with multivariate resolution method and Monte Carlo PLS-DA [J]. Anal. Methods,2011,3(2):438-445.
    [247]蒋红卫;夏结来;李园;于莉莉.偏最小二乘判别分析在基因微阵列分型中的应用[J].中国卫生统计,2007,24(4):372-374.
    [248]Zhang, M.; Tong, P.; Wang, W.; Geng, J.; Du, Y. Proteomic profile analysis and biomarker discovery from mass spectra using independent component analysis combined with uncorrelated linear discriminant analysis [J]. Chemometrics and intelligent laboratory systems,2011,105(2):207-214.
    [249]Nystrom, J.; Dahlquist, E. Methods for determination of moisture content in woodchips for power plants--a review [J]. Fuel,2004,83(7-8):773-779.
    [250]Miller, C.E. Chemical principles of near infrared technology [J]. Near infrared technology in the agricultural and food industries,2001:19-37.
    [251]Connolly, C. NIR spectroscopy for foodstuff monitoring [J]. Sensor Review,2005, 25(3):192-194.
    [252]Moreda, GP; Ortiz-Ca avate, J.; Garcia-Ramos, FJ; Ruiz-Altisent, M. Non-destructive technologies for fruit and vegetable size determination-A review [J]. Journal of Food Engineering,2009,92(2):119-136.
    [253]Kasemsumran, S.; Du, Y.; Murayama, K.; Huehne, M.; Ozaki, Y. Simultaneous determination of human serum albumin, y-globulin, and glucose in a phosphate buffer solution by near-infrared spectroscopy with moving window partial least-squares regression [J]. Analyst,2003,128(12):1471-1477.
    [254]Du, Y.P.; Kasemsumran, S.; Maruo, K.; Nakagawa, T.; Ozaki, Y. Ascertainment of the number of samples in the validation set in Monte Carlo cross validation and the selection of model dimension with Monte Carlo cross validation [J]. Chemometrics and intelligent laboratory systems,2006,82(1-2):83-89.
    [255]李彦威;方慧文;梁素霞;王志忠.偏最小二乘紫外分光光度法同时测定丁烯二酸的顺反异构体[J].分析化学,2008,8(1):95-98.
    [256]褚小立;田高友;袁洪福;陆婉珍.小波变换结合多维偏最小二乘方法用于近红外光谱定量分析[J].分析化学,2006,34(特刊):S175-S178.
    [257]Spiegelman, C.H.; McShane, M.J.; Goetz, M.J.; Motamedi, M.; Yue, Q.L.; Cote, G.L. Theoretical justification of wavelength selection in PLS calibration:development of a new algorithm [J]. Analytical chemistry,1998,70(1):35-44.
    [258]褚小立;许育鹏;陆婉珍.用于近红外光谱分析的化学计量学方法研究与应用进展[J].分析化学,2008,8(5):702-709.
    [259]成忠;诸爱士.段式正交信号校正方法及在小麦近红外光谱数据分析中的应用[J].分析化学,2008,36(6):788-792.
    [260]李丽娜;李庆波;张广军.基于交互式自模型混合物分析的近红外光谱波长变量优选方法[J].分析化学,2009,9(6):823-827.
    [261]Windig, W.; Stephenson, DA. Self-modeling mixture analysis of second-derivative near-infrared spectral data using the SIMPLISMA approach [J]. Analytical chemistry, 1992,64(22):2735-2742.
