Abstract
Variable selection in comprehensive linear regression with multiple dependent variables has long attracted close attention in the academic community. This paper addresses misconceptions that many scholars currently hold about this problem. In particular, the mathematics behind the "partial least squares regression model" is demanding, so many scholars do not understand its principle or the conditions under which it is appropriate, and apply it blindly. Drawing on the properties of positive definite and positive semidefinite matrices and on the eigenvalue theory of matrices from linear algebra, the paper analyzes the principles of three conventional linear regression modeling methods and reveals the nature of the partial least squares regression model, affirming its advantages while pointing out the limitations of its application. It proposes several criteria for choosing a regression model sensibly in practice and establishes a "hyperplane regression model" that is easy to master, simple to operate, and can replace the OLS method. Finally, an example is used to compare and illustrate the performance of the several regression modeling methods.