Abstract
Quantile regression is robust, characterizes the full conditional distribution of the response variable, and yields regression parameters with more practical meaning. For these reasons it has gradually become a powerful tool for statistical analysis in many fields. In many applications, however, one wants not only to explore the relationship between the response and the covariates at different levels (that is, at different quantiles), but also to find an optimal level, the optimal quantile, at which the regression result is most reliable and best reflects the population. This paper proposes a new regression method, Optimal Quantile Regression (OQR), to address this problem. The method is motivated mainly by the definition of the sparsity function. It can be shown that, compared with classical mean regression, OQR has three advantages: (1) robustness: it is not restricted by the error distribution; (2) efficiency: the regression results carry richer information; (3) flexibility: it applies to arbitrary models and data. Simulations are conducted to examine the performance of the proposed method, and the results strongly support all three properties. Finally, an analysis of food consumption data indicates that, when the relationship between food consumption and per capita income is considered, the consumption pattern of the lower-middle-income group is the mainstream pattern in society, which suggests that this group deserves more attention.