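Application of the Support Vector Machine Algorithm to Quantitative Analysis of Bioactive Mixture Systems and to Classification of Spectral Energy Levels of Heavy Elements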
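Abstract
Chemometrics aims to optimize the chemical measurement process and to extract as much useful chemical information as possible from chemical measurement data. The support vector machine (SVM) is an emerging chemometric method based on structural risk minimization. SVM can largely avoid the over-fitting problem that arises when back-propagation neural networks (BPN) are used. By choosing among different kernel functions, it can find an optimal hyperplane in the feature space, with the aim of avoiding information loss and obtaining more reliable and more accurate results. SVM is gradually being applied in research areas that include multivariate resolution and calibration and pattern classification, and it is expected to play a useful role in modern analytical science, where data processing and analysis tasks are increasingly heavy. This thesis applies SVM to two problems, the quantitative analysis of multicomponent mixtures and the chemical pattern classification of spectra, as follows:
     Quantitative analysis of multicomponent mixtures often requires considerable time and effort for pre-separation of the components, whereas chemometric methods make the simultaneous, direct determination of complex multicomponent systems comparatively simple. We applied SVM to the quantitative analysis of Raman spectra of several mixed amino acid systems and of differential pulse voltammograms of catecholamine mixtures. The results show that SVM extracts information from the measurement data of mixtures more effectively for quantitative analysis, and that its results are more accurate than those of the conventional BPN method.
     The electron configurations of atomic spectra are usually determined from measured data such as the energy levels and intensities of spectral lines, isotope shifts, and the Zeeman effect, or assigned by quantum-theoretical calculation. Because atomic spectra are complex, however, the electron configurations of some highly excited levels remain difficult to determine. Using SVM to classify the atomic spectra of heavy elements such as U II is therefore valuable for completing atomic spectral data. The calculations show that, compared with traditional chemical pattern recognition methods, SVM predicts the configuration assignments of unknown energy levels more completely and more accurately.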
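The classification step described above can be sketched in code. The following is a minimal illustration under stated assumptions, not the thesis's implementation: it trains a binary soft-margin SVM with a Gaussian (RBF) kernel by dual coordinate ascent and omits the bias term for brevity. The two features, the synthetic clusters, and the values of `C`, `gamma`, and `epochs` are hypothetical; real input would be measured quantities such as level energies and isotope shifts, and configuration assignment generally involves more than two classes.

```python
import math
import random


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def train_svm(X, y, C=1.0, gamma=1.0, epochs=50):
    """Dual coordinate ascent for a soft-margin kernel SVM without a bias term."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            # Decision value of sample i under the current multipliers.
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact one-variable maximizer of the dual, clipped to [0, C].
            step = (1.0 - y[i] * f_i) / K[i][i]
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha


def predict(x, X, y, alpha, gamma=1.0):
    """Return +1 or -1 from the sign of the kernel expansion."""
    score = sum(a * t * rbf_kernel(s, x, gamma) for a, t, s in zip(alpha, y, X))
    return 1 if score >= 0 else -1


# Synthetic stand-in data (assumption): two well-separated clusters, each
# point a hypothetical pair of scaled features for one energy level.
random.seed(0)
class_a = [(random.gauss(0.0, 0.3), random.gauss(0.0, 0.3)) for _ in range(20)]
class_b = [(random.gauss(3.0, 0.3), random.gauss(3.0, 0.3)) for _ in range(20)]
X = class_a + class_b
y = [-1] * 20 + [1] * 20

alpha = train_svm(X, y)
train_acc = sum(predict(x, X, y, alpha) == t for x, t in zip(X, y)) / len(X)
print("training accuracy:", train_acc)
print("label for an unassigned level at (2.8, 3.1):", predict((2.8, 3.1), X, y, alpha))
```

Samples whose multiplier in `alpha` ends up nonzero are the support vectors; only they contribute to the label predicted for a new level. In practice a tested library solver would normally replace this hand-written loop.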
Chemometrics aims to optimize the process of chemical measurement and to extract useful chemical information from measurement data. The support vector machine (SVM) has a solid theoretical foundation and can handle small data sets, nonlinear optimization, high-dimensional feature spaces, local minima, and other practical problems. As SVM has developed, several derived algorithms have been proposed, and its application has become an active research topic worldwide. SVM has been applied successfully to face recognition, voice identification, handwritten digit recognition, text classification, risk assessment, protein structure recognition, gene recognition, and other pattern recognition problems, with results equivalent or superior to those of other methods. Its ability to generalize an input-output mapping from a limited set of training examples is particularly attractive. In this thesis, we use SVM to determine the components of mixtures and to classify the spectra of a heavy element:
     SVM was used to determine mixtures of amino acids and of catecholamines (CATs) from their Raman spectra and differential pulse voltammetry (DPV) voltammograms, respectively, without pre-separation. The study shows that SVM handles such mixtures well and gives more accurate results than BPN.
     SVM was used to classify the unknown energy levels of the heavy element U II, which cannot be classified experimentally. Although traditional chemometric techniques have been used to predict unknown energy levels, some samples still could not be predicted. We therefore used SVM to assign these energy levels. The results show that SVM predicts more accurately and more completely than traditional chemometric methods.
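For the quantitative analysis of mixtures, the abstract does not state which SVM regression formulation was used. A common choice, shown here only as an assumed illustration, is ε-insensitive support vector regression. In this sketch $x_i$ is a measured spectrum or voltammogram, $y_i$ is the known concentration of one component in calibration sample $i$, $\phi$ is the feature map induced by the chosen kernel, and $C$ and $\varepsilon$ are user-chosen parameters:

$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \tfrac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\left(\xi_i+\xi_i^*\right) \\
\text{s.t.}\quad & y_i - w^\top\phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^\top\phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i,\ \xi_i^* \ge 0,\qquad i = 1,\dots,n.
\end{aligned}
$$

The first term limits model complexity and the second penalizes only those deviations larger than $\varepsilon$; balancing the two is the usual way an SVM trades training error against over-fitting. With a Gaussian kernel, $K(x, x') = \exp(-\gamma\lVert x - x'\rVert^2)$, the predicted concentration is a weighted sum of kernel values between the new measurement and the support vectors.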