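Development and Application of Near-Infrared Spectroscopy Models for Quantitative Prediction of Glycoside Content in Stevia rebaudiana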
Abstract
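Near-infrared spectroscopy (NIRS) is a physical measurement technique that developed rapidly from the late 1980s. It is fast, pollution-free, low-cost and non-destructive, and it can determine several components at once; it has been adopted in a number of international and industry standards and is widely used in many fields. Stevia diterpene glycosides are a class of high-intensity, non-caloric natural sweeteners with broad prospects in food, medicine, cosmetics and other fields. This study aimed to build NIRS prediction models for the major diterpene glycoside components in Stevia rebaudiana leaves and, through improved algorithms, to improve the robustness, applicability and predictive ability of the models, so as to provide a fast, simple and effective method for taste-quality testing and breeding selection in stevia.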
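     To develop and apply the NIRS prediction models for stevia diterpene glycoside content, 508 stevia leaf-powder samples were selected at random from single-plant selection material collected over several years, and the near-infrared spectrum of about 3 g of each sample was scanned. As the reference method, a modified liquid chromatography method was used to determine the percentage contents of stevioside, rebaudioside A and their sum. The raw spectra were pretreated with standard normal variate transformation, second-order derivative and Savitzky-Golay convolution smoothing. Monte Carlo uninformative variable elimination was combined with the successive projections algorithm to select spectral variables. This removed a large number of uninformative and redundant variables and reduced the collinearity among the spectral variables, which improved the accuracy and robustness of prediction and, to some extent, overcame overfitting.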
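The pretreatment chain named above (standard normal variate followed by a Savitzky-Golay second derivative) can be sketched in a few lines. This is a minimal NumPy illustration, not the thesis's own code: the 11-point window and quadratic polynomial are assumed values, since the abstract does not report them, and the function names `snv`, `savgol_coeffs` and `savgol_derivative` are hypothetical.

```python
import math
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def savgol_coeffs(window, polyorder, deriv):
    """Savitzky-Golay weights from a least-squares polynomial fit in the window."""
    half = window // 2
    x = np.arange(-half, half + 1, dtype=float)
    A = np.vander(x, polyorder + 1, increasing=True)  # columns 1, x, x^2, ...
    # Row `deriv` of pinv(A) yields the fitted coefficient of x^deriv;
    # multiplying by deriv! gives the derivative at the window centre.
    return np.linalg.pinv(A)[deriv] * math.factorial(deriv)

def savgol_derivative(spectra, window=11, polyorder=2, deriv=2, step=1.0):
    """Smoothed derivative of each row; window//2 points are trimmed at each edge."""
    spectra = np.asarray(spectra, dtype=float)
    c = savgol_coeffs(window, polyorder, deriv) / step ** deriv
    # np.convolve flips its kernel, so reverse c to apply it as a correlation.
    return np.array([np.convolve(row, c[::-1], mode="valid") for row in spectra])
```

A call such as `savgol_derivative(snv(raw), step=0.5)` would apply both steps to a matrix `raw` with one spectrum per row; `step` is the wavelength spacing, so the derivative is expressed per unit wavelength rather than per index.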
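     During modelling, an iteratively reweighted least squares support vector machine was used to remove outliers, and support vector regression was then applied to the outlier-free training set. Both simulated data and the total diterpene glycoside content data showed that this hybrid method predicted better than the other methods; for total diterpene glycoside content, the root mean square error of prediction (RMSEP), coefficient of determination and residual predictive deviation (RPD) were 0.843%, 0.907 and 3.256, respectively. In addition, extracting features from the selected variable space by partial least squares (PLS) improved computational efficiency, with an RMSEP, coefficient of determination and RPD of 0.845%, 0.906 and 3.249, respectively. For the individual components stevioside and rebaudioside A, the same modelling approach was used, and orthogonal signal correction was applied to the first PLS component to improve prediction accuracy. The prediction results indicate that it is feasible to use the NIRS prediction models built here to screen breeding material on a large scale.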
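For readers unfamiliar with the three figures of merit quoted above, the following sketch shows one common way to compute them. It is an illustration under stated assumptions: the abstract does not say which variants were used, so the coefficient of determination is taken here as 1 - SS_res/SS_tot and the RPD as the sample standard deviation of the reference values divided by the RMSEP. The function name `prediction_metrics` is hypothetical.

```python
import numpy as np

def prediction_metrics(y_true, y_pred):
    """Return (RMSEP, R^2, RPD) for reference values and model predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    # Root mean square error of prediction, in the units of y (here, % content).
    rmsep = np.sqrt(np.mean(resid ** 2))
    # Coefficient of determination (assumed definition: 1 - SS_res / SS_tot).
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # Residual predictive deviation (assumed definition: SD of reference / RMSEP).
    rpd = np.std(y_true, ddof=1) / rmsep
    return rmsep, r2, rpd
```

Under these definitions the RPD compares the prediction error with the natural spread of the trait, which is why it is often quoted when judging whether a calibration is good enough for screening.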
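     Using these models, 70 individual plants with high absolute rebaudioside A content and 63 with high relative rebaudioside A content were selected. In the 50 F1 lines derived from crosses among them, both the absolute and the relative rebaudioside A content increased significantly, and the taste quality of stevia was markedly improved.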
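     The NIRS technique developed in this study can also serve as a useful measurement tool for genetic studies of stevia, quantitative trait mapping, germplasm evaluation and on-site purchasing of stevia leaves.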
Near-infrared reflectance spectroscopy (NIRS) developed rapidly as a novel physical analysis technique from the late 1980s. Because NIRS is nondestructive, fast, cost-effective and environmentally safe, and allows several traits to be estimated simultaneously in a single measurement, it is widely used in many areas and has been adopted in international standards. The diterpene glycosides in Stevia rebaudiana leaves are regarded as a potential source of natural non-caloric sweeteners and are widely used in food, medicine, cosmetics and other fields. The present study aimed to assess the potential of NIRS for estimating stevioside, rebaudioside A and their total content in Stevia rebaudiana leaves, and to optimize the regression method and variable space in order to develop a robust and accurate regression model.
     A total of 508 samples were selected randomly from individual plants with good agronomic traits grown from 2008 to 2009. The percentage contents of stevioside and rebaudioside A in the leaf samples were determined by HPLC as the reference method. About 3 g of leaf powder from each sample was scanned from 400 nm to 2498 nm at intervals of 0.5 nm. The entire spectrum was pretreated with standard normal variate transformation, second derivative and Savitzky-Golay convolution smoothing. For the pretreated spectra of the 350 samples in the training set, Monte Carlo uninformative variable elimination and the successive projections algorithm were used to optimize the variable space, reduce collinearity and overcome overfitting.
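The idea behind Monte Carlo uninformative variable elimination can be illustrated as follows: fit a PLS model on many random subsets of the training samples, and rank each variable by the stability of its regression coefficient (mean divided by standard deviation across runs). The sketch below is a simplified NumPy version under stated assumptions, not the procedure used in the thesis: the number of PLS components, the number of runs, the subset fraction and the function names `pls1_coefficients` and `mc_uve_stability` are all illustrative choices, and the successive projections step is not shown.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression coefficients of a single-response PLS model (NIPALS)."""
    X = X - X.mean(axis=0)          # centred copies; the inputs are not modified
    y = y - y.mean()
    p = X.shape[1]
    W = np.zeros((p, n_components))
    P = np.zeros((p, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)      # weight vector
        t = X @ w                   # scores
        tt = t @ t
        P[:, a] = X.T @ t / tt      # X loadings
        q[a] = y @ t / tt           # y loading
        W[:, a] = w
        X = X - np.outer(t, P[:, a])  # deflate X and y
        y = y - q[a] * t
    return W @ np.linalg.solve(P.T @ W, q)

def mc_uve_stability(X, y, n_components=2, n_runs=100, frac=0.7, seed=0):
    """Stability (mean / std of PLS coefficients) over random sample subsets."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = int(frac * n)
    coefs = np.array([
        pls1_coefficients(X[idx], y[idx], n_components)
        for idx in (rng.choice(n, size=k, replace=False) for _ in range(n_runs))
    ])
    return coefs.mean(axis=0) / coefs.std(axis=0)
```

Variables whose absolute stability falls below a chosen cutoff would then be discarded as uninformative; how that cutoff was set in the study is not stated in the abstract.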
     Based on the optimized variable space, the prediction model was developed by support vector regression with an ε-insensitive loss function, after outliers had been removed using iteratively reweighted least squares support vector regression. Tests on simulated data and on the total glycoside content data confirmed that this hybrid method is superior to the other methods, with smaller prediction risk and better generalization. Furthermore, features extracted by partial least squares were used as inputs to construct the NIR calibration model. It is feasible to determine stevioside, rebaudioside A and their total content in Stevia leaves with a low root mean square error of prediction, a high determination coefficient and a satisfactory residual predictive deviation.
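The outlier-removal step can be pictured with a weighted least squares support vector machine that is refitted several times, each time down-weighting samples with large robustly scaled residuals. The sketch below follows the general weighted LS-SVM scheme of Suykens et al. (2002) and is only an assumption about how such a step might look; the thesis's exact algorithm, kernel and parameters are not given in the abstract. The RBF kernel, the values of `gamma`, `sigma`, `c1`, `c2` and the function names are hypothetical, and the final ε-insensitive SVR fit on the retained samples is not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_weighted_lssvm(K, y, gamma, v):
    """Solve the weighted LS-SVM linear system; return bias b and alphas."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = K + np.diag(1.0 / (gamma * v))
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def flag_outliers(X, y, gamma=10.0, sigma=1.0, n_iter=5, c1=2.5, c2=3.0, floor=1e-4):
    """Iteratively reweight samples; flag those whose weight ends at the floor."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    K = rbf_kernel(X, X, sigma)
    v = np.ones(len(y))
    for _ in range(n_iter):
        _, alpha = fit_weighted_lssvm(K, y, gamma, v)
        e = alpha / (gamma * v)                           # training residuals
        s = 1.483 * np.median(np.abs(e - np.median(e)))   # robust scale (MAD)
        r = np.abs(e / s)
        # Weight 1 for small residuals, linear decay between c1 and c2,
        # and a small floor weight beyond c2.
        v = np.where(r <= c1, 1.0, (c2 - r) / (c2 - c1))
        v = np.maximum(v, floor)
    return v <= floor
```

Samples flagged `True` would be dropped from the training set before the final regression model is fitted. The residual identity `e = alpha / (gamma * v)` comes from the LS-SVM optimality conditions, so no separate prediction pass is needed inside the loop.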
     Using the developed models to screen individual plants, 133 parental materials and 50 F1 lines with high absolute or relative rebaudioside A content were initially identified and subsequently verified by HPLC. In brief, the developed models can be used directly to predict the diterpene glycosides in Stevia leaves and performed well in the breeding project.
