Abstract
When the number of near-infrared (NIR) spectral variables far exceeds the number of samples, automatic variable selection and coefficient estimation for building a sparse linear model between spectra and trace-component content are both important and challenging. To address the difficulty of measuring o-cresol, a trace component in the production of polyphenylene ether, the adaptive Elastic Net variable selection method is used to build a quantitative calibration model between NIR spectra and o-cresol content, and its performance is compared with that of the Elastic Net method. When the number of variables far exceeds the sample size, the Elastic Net can still perform variable selection, but because its coefficient estimates lack the oracle property, the interpretability and prediction accuracy of the model suffer. The adaptive Elastic Net addresses this problem and improves model performance by applying adaptive weights to the L1 penalty. To assess model performance, the number of selected independent variables (NSIV) is used to evaluate model complexity, the multiple correlation coefficient R² to evaluate interpretability, and the mean relative prediction error (MRPE) and the prediction correlation coefficient (Rp) to evaluate prediction accuracy. The Elastic Net model gives NSIV=529, R²=0.96, MRPE=3.22%, Rp=0.97; the adaptive Elastic Net model gives NSIV=139, R²=0.99, MRPE=2.00%, Rp=0.99. The results show that the adaptive Elastic Net outperforms the Elastic Net on all four indicators, yielding a simpler sparse linear model with better interpretability and higher prediction accuracy.
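The two-stage idea described in the abstract (an Elastic Net fit, followed by a second fit whose L1 penalty is reweighted per variable) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the coordinate-descent solver, the weight formula w_j = |b_j|^(-gamma), the exact MRPE formula, the absence of an intercept (centred data are assumed), and the omission of any final rescaling of the coefficients are all assumptions made for illustration.

```python
# Illustrative sketch only: names, solver and formulas are assumptions,
# not the method as implemented in the paper.

def soft_threshold(z, t):
    """Soft-thresholding operator induced by the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_enet(X, y, lam1, lam2, weights, n_iter=200):
    """Coordinate descent (fixed number of sweeps, lam2 > 0 assumed) for
    (1/2n)*||y - Xb||^2 + lam1*sum_j w_j*|b_j| + (lam2/2)*||b||^2."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if weights[j] == float("inf"):
                continue  # infinite weight: coefficient stays at zero
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam1 * weights[j]) / (col_sq[j] + lam2)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def adaptive_enet(X, y, lam1, lam2, gamma=1.0, tol=1e-12):
    """Stage 1: plain Elastic Net. Stage 2: refit with per-variable weights
    w_j = |b_j|^(-gamma); variables zeroed in stage 1 get infinite weight."""
    p = len(X[0])
    init = weighted_enet(X, y, lam1, lam2, [1.0] * p)
    weights = [abs(b) ** (-gamma) if abs(b) > tol else float("inf")
               for b in init]
    return weighted_enet(X, y, lam1, lam2, weights)


def nsiv(beta, tol=0.0):
    """Number of selected independent variables (non-zero coefficients)."""
    return sum(1 for b in beta if abs(b) > tol)


def mrpe(y_true, y_pred):
    """Mean relative prediction error in percent (assumed definition)."""
    total = sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def pearson_r(a, b):
    """Pearson correlation, used here as a stand-in for Rp."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5
```

Calling `adaptive_enet(X, y, lam1, lam2)` on a list-of-rows matrix `X` and a response list `y` returns one coefficient per spectral variable. Because variables eliminated in the first stage receive an infinite weight, the second stage can only keep or shrink the selected set, which matches the lower NSIV the abstract reports for the adaptive method. In practice `lam1`, `lam2` and `gamma` would be tuned, for example by cross-validation.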