Abstract
This paper compares the applicability conditions, relationships, and differences among commonly used correlation coefficients: the Pearson simple correlation coefficient, the Spearman and Kendall rank correlation coefficients, Cramér's V, the partial correlation coefficient, generalized measures of correlation, the multiple correlation coefficient, the canonical correlation coefficient, the generalized correlation coefficient, and the correlation coefficient among multiple vectors. It shows how to compute each of them in R. Using 2016 data for China's 31 provincial-level regions, it then carries out a correlation analysis of the economy, energy, and environmental pollution, computing the coefficients and interpreting their practical meaning. Finally, it summarizes the basic properties of these coefficients.
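As a rough illustration of how the first three coefficients in this list differ, the sketch below computes Pearson, Spearman, and Kendall coefficients on a small made-up sample in which y grows monotonically but nonlinearly with x. It is not the paper's R code: the paper works in R, where the built-in `cor(x, y, method = ...)` accepts `"pearson"`, `"spearman"`, and `"kendall"`. The function names, the toy data, and the choice of Kendall's tau-a (no tie correction) are assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical toy data (not from the paper): y = x squared.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print("Pearson :", round(pearson(x, y), 4))
print("Spearman:", round(spearman(x, y), 4))
print("Kendall :", round(kendall_tau_a(x, y), 4))
```

On this sample the two rank coefficients equal 1 because the relationship is perfectly monotone, while the Pearson coefficient falls slightly below 1 because the relationship is not linear. Which coefficient to report depends on whether the strength of a linear or of a monotone association is of interest.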