Quantitative analysis of moisture in filling-stage maize grain with a small sample set based on near-infrared spectroscopy
  • English title: Moisture quantitative analysis with small sample set of maize grain in filling stage based on near infrared spectroscopy
  • Authors: Wang Xue (王雪); Ma Tiemin (马铁民); Yang Tao (杨涛); Song Ping (宋平); Xie Qiuju (谢秋菊); Chen Zhengguang (陈争光)
  • Affiliations: College of Information and Electrical Engineering, Shenyang Agricultural University; College of Electrical and Information, Heilongjiang Bayi Agricultural University; School of Computer Science and Engineering, Northeastern University
  • Keywords: near infrared spectroscopy; water; models; quantitative analysis; small sample set; maize grain in filling stage; bootstrap resample; sample optimized selection
  • Journal code: NYGU
  • Journal (English name): Transactions of the Chinese Society of Agricultural Engineering
  • Publication date: 2018-07-08
  • Publisher: 农业工程学报 (Transactions of the Chinese Society of Agricultural Engineering)
  • Year: 2018
  • Issue: v.34; No.340
  • Funding: National Natural Science Foundation of China, Youth Fund (31701318); Heilongjiang Bayi Agricultural University intramural project cultivation funding (XZR2016-09)
  • Language: Chinese
  • Page: NYGU201813024
  • Page count: 8
  • CN: 13
  • ISSN: 11-2047/S
  • Classification number: 211-218
Abstract
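Grain moisture content during the filling stage is an important indicator in maize seed testing and breeding. To save samples while measuring the moisture of filling-stage maize quickly and accurately, this paper applies near-infrared spectroscopy and proposes the Bootstrap-SPXY-PLS model: a partial least squares (PLS) model for quantitative moisture analysis whose sample optimization method, designed for small-sample conditions, combines the bootstrap algorithm (Bootstrap) with sample set partitioning based on joint x-y distances (SPXY). The results show that the model performs stably when the number of Bootstrap resamplings is 500 and the sample size is at least 10, and that the number of resamplings needed decreases as the sample size grows. For sample sizes of 10 and 50, the full-spectrum Bootstrap-SPXY-PLS model gives mean root-mean-square errors of prediction (RMSEP) of 0.38% and 0.40%, correlation coefficients of prediction of 0.9751 and 0.9685, and coefficients of determination R² of 0.9999 and 0.9936, respectively. After wavelength variable selection by competitive adaptive reweighted sampling (CARS), the CARS-Bootstrap-SPXY-PLS model gives mean RMSEP of 0.36% and 0.35%, correlation coefficients of prediction of 0.9736 and 0.9750, and R² of 0.9245 and 0.9180, respectively. Both the full-spectrum Bootstrap-SPXY-PLS model and the CARS-Bootstrap-SPXY-PLS model therefore have stable predictive ability and provide a stable, efficient method for measuring seed moisture at the filling stage in maize breeding.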
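The two prediction metrics reported in the abstract can be computed as in the minimal sketch below. It assumes the standard definitions: RMSEP as the square root of the mean squared prediction error, and the correlation coefficient of prediction as the Pearson correlation between reference and predicted values. The function names and the four moisture values are illustrative assumptions, not data from the paper.

```python
import math

def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical reference (oven-dried) and predicted moisture values, in percent.
reference = [30.0, 32.0, 34.0, 36.0]
predicted = [30.5, 31.5, 34.5, 35.5]

print(rmsep(reference, predicted))      # every error is 0.5, so RMSEP is 0.5
print(pearson_r(reference, predicted))
```

Because RMSEP is expressed in the same unit as the measured quantity, a value of 0.38% means a typical prediction error of 0.38 percentage points of moisture.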
        Near infrared spectroscopy (NIRS) and its analytical techniques are increasingly used for rapid quantitative and qualitative analysis in agriculture, food, industry, and other fields. In most studies the sample size is between 100 and 200. In maize breeding, however, the number of samples available for measuring grain moisture at the filling stage, and the cost of collecting them, are constrained by the planting area of new varieties, the number of maize plants per square meter, the number of effective experimental ears, and other conditions. Yet the filling period is a critical stage for changes in maize grain varieties and for breeding tests. The traditional drying method takes 150-250 grains per moisture measurement, which is a large number of samples. An efficient moisture measurement method that works with a small sample size is therefore urgently needed in maize breeding. In NIRS research, the size of the sample set is a key factor in the performance and prediction ability of an algorithm. In general, the smaller the sample set, the poorer the model performs, so in practical applications it is important to find the critical size for a small sample set. In recent years, Bootstrap-based data analysis methods for small sample sets have been proposed, and most of them are considered reliable for validating small-sample data. To reduce the sample size and to measure the moisture content of maize grain in the filling period quickly and accurately, a quantitative moisture analysis model is presented that is based on optimized sample set selection and the partial least squares (PLS) algorithm using NIRS. The sample set selection method combines a Bootstrap resampling strategy with sample set partitioning based on joint x-y distances (SPXY).
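As a rough illustration of the SPXY idea, the sketch below selects samples in Kennard-Stone fashion, using the sum of the spectral (x) distance and the reference-value (y) distance, each normalized by its maximum. It follows the commonly described form of SPXY and is not taken from the paper; the function name `spxy_select` and the toy data are assumptions.

```python
import numpy as np

def spxy_select(X, y, n_select):
    """Return indices of n_select samples chosen by joint x-y distance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    if not 2 <= n_select <= len(X):
        raise ValueError("n_select must be between 2 and the number of samples")

    # Pairwise Euclidean distances in spectral space and in y.
    dx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    dy = np.abs(y - y.T)
    # Joint distance; assumes the samples are not all identical in X or in y.
    d = dx / dx.max() + dy / dy.max()

    # Start from the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]

    # Repeatedly add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_select:
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

# Toy data: four one-variable "spectra" with matching reference values.
X = [[0.0], [1.0], [2.0], [10.0]]
y = [0.0, 1.0, 2.0, 10.0]
print(spxy_select(X, y, 3))  # -> [0, 3, 2]
```

The selected indices cover the extremes of both the spectra and the reference values first, which is why this kind of selection is attractive when only a handful of samples can be used for modeling.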
The models were evaluated by the correlation coefficient of prediction and the root-mean-square error of prediction (RMSEP) at different numbers of resamplings and different sample set sizes. First, the full spectrum and the wavelength-selected spectrum were each resampled 100-800 times with the Bootstrap algorithm at sample sizes of 5, 10, 20 and 50. Second, SPXY was applied to the resampled set to select an optimized modeling sample set. Third, the modeling sample set was divided into multiple subsets, a PLS sub-model was built on each subset, and regression with these sub-models yielded multiple predicted values. Finally, the predicted moisture of maize grain in the filling period was obtained as the weighted mean of these predicted values. The results show that the model performs stably when the number of Bootstrap resamplings is 500 and the sample size is at least 10, and that the number of resamplings needed decreases as the sample size increases. When the sample size is 10 and 50, the mean RMSEP values of the full-spectrum Bootstrap-SPXY-PLS model are 0.38% and 0.40%, the correlation coefficients of prediction are 0.9751 and 0.9685, and the determination coefficients (R²) of the calibration are 0.9999 and 0.9936, respectively; the mean RMSEP values of the CARS-Bootstrap-SPXY-PLS model are 0.36% and 0.35%, the correlation coefficients of prediction are 0.9736 and 0.9750, and the R² values are 0.9245 and 0.9180, respectively. Therefore, both the full-spectrum Bootstrap-SPXY-PLS model and the CARS-Bootstrap-SPXY-PLS model have good prediction ability and provide a new, stable and efficient method for determining maize grain moisture at the filling stage in the breeding process. This is helpful for maize breeding research and also offers a new approach to the quantitative analysis of NIR spectra with small sample sets.
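The resample, sub-model and weighted-mean steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the SPXY selection step is omitted, each Bootstrap resample gets one PLS sub-model, and the weights are taken as the inverse of each sub-model's out-of-bag RMSE, since the abstract does not specify the weighting. The PLS routine is a small NumPy-only PLS1 (NIPALS) written for this sketch, and the spectra are synthetic.

```python
import numpy as np

def pls1_fit(X, y, n_components=2):
    """Fit single-response PLS by NIPALS; return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c = (yr @ t) / tt             # y loading
        Xr = Xr - np.outer(t, p)      # deflate X and y
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

def bootstrap_pls_predict(X_cal, y_cal, X_new, n_boot=500, n_components=2, seed=0):
    """Weighted mean of predictions from PLS sub-models built on Bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(y_cal)
    preds, weights = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # draw with replacement
        oob = np.setdiff1d(np.arange(n), idx)      # samples left out of this draw
        if oob.size == 0 or np.unique(idx).size <= n_components:
            continue                               # skip degenerate resamples
        model = pls1_fit(X_cal[idx], y_cal[idx], n_components)
        oob_err = pls1_predict(model, X_cal[oob]) - y_cal[oob]
        oob_rmse = np.sqrt(np.mean(oob_err ** 2))
        preds.append(pls1_predict(model, X_new))
        weights.append(1.0 / (oob_rmse + 1e-6))    # assumed weighting scheme
    return np.average(np.array(preds), axis=0, weights=np.array(weights))

# Synthetic demo: one absorption band whose height scales with moisture (percent).
rng = np.random.default_rng(42)
band = np.exp(-0.5 * ((np.arange(50) - 25) / 8.0) ** 2)

def make_spectra(moisture):
    noise = rng.normal(0.0, 0.002, size=(moisture.size, band.size))
    return 0.3 + 0.02 * moisture[:, None] * band + noise

y_cal = rng.uniform(25.0, 40.0, size=20)
y_new = rng.uniform(25.0, 40.0, size=10)
X_cal, X_new = make_spectra(y_cal), make_spectra(y_new)

y_hat = bootstrap_pls_predict(X_cal, y_cal, X_new, n_boot=100)
rmsep_value = float(np.sqrt(np.mean((y_hat - y_new) ** 2)))
print(f"RMSEP on synthetic data: {rmsep_value:.3f}")
```

The default of 500 resamples mirrors the setting the abstract reports as stable; the demo uses 100 only to keep the run short. A library PLS implementation, such as scikit-learn's `PLSRegression`, could replace `pls1_fit` without changing the ensemble logic.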