Bayesian Variable Selection in Quantile Regression
Abstract
Since the seminal paper of Koenker and Bassett [1] (1978), quantile regression has been widely applied to the analysis of linear and nonlinear response models. Compared with mean regression, quantile regression gives a more complete picture of the distribution of the response. In practice, when the predictor vector contains many variables, it is necessary to select the important ones in order to improve the precision of the estimator. Based on the stochastic search variable selection approach and a location-scale mixture representation of the asymmetric Laplace distribution, this paper first develops a simple and efficient Gibbs sampling algorithm for Bayesian variable selection in linear quantile regression. The approach is then extended to models with latent response variables, specifically binary and tobit quantile regression. Furthermore, based on the Dirichlet process mixture model, the paper considers Bayesian variable selection in the single-index quantile regression model, where the link function is modeled by truncated linear splines and the error distribution is modeled nonparametrically by a Dirichlet process mixture. Posterior inference is carried out with Gibbs sampling and the Metropolis-Hastings algorithm. Extensive simulation studies show that the proposed methods effectively identify the true model under the different model settings. Finally, the methods are applied to several real data sets.
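The abstract does not spell out the mixture representation it relies on. As a minimal sketch, assuming the normal-exponential form commonly used for Gibbs sampling in Bayesian quantile regression (compare reference [87]), an asymmetric Laplace error at quantile level p with unit scale can be written as theta*z + tau*sqrt(z)*u, with z standard exponential and u standard normal. The function names, the seed, and the choice p = 0.25 below are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def ald_mixture_constants(p):
    """Constants of the assumed normal-exponential mixture at quantile level p."""
    theta = (1.0 - 2.0 * p) / (p * (1.0 - p))
    tau = np.sqrt(2.0 / (p * (1.0 - p)))
    return theta, tau


def sample_ald_errors(p, size, rng):
    """Draw asymmetric Laplace errors (location 0, scale 1) via the mixture."""
    theta, tau = ald_mixture_constants(p)
    z = rng.exponential(scale=1.0, size=size)  # latent exponential weights
    u = rng.standard_normal(size)              # latent standard normal draws
    return theta * z + tau * np.sqrt(z) * u


rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
p = 0.25                        # illustrative quantile level
eps = sample_ald_errors(p, 200_000, rng)

# The p-th quantile of the error should be zero, so about a fraction p
# of the draws should fall at or below zero.
frac_below_zero = float(np.mean(eps <= 0.0))
theta, tau = ald_mixture_constants(p)
print(frac_below_zero, float(eps.mean()), theta)
```

Conditional on z, the error is normal, which is what makes conjugate Gibbs updates for the regression coefficients possible in this kind of sampler. The variable selection step itself is not shown here.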
References
[1] Koenker, R. and Bassett, G. Regression quantiles[J]. Econometrica (1978), 46:33-50.
[2] Walker, S. G. and Mallick, B. K. A Bayesian semiparametric accelerated failure time model[J]. Biometrics (1999), 55:477-483.
[3] Kottas, A. and Gelfand, A. E. Bayesian semiparametric median regression modeling[J]. Journal of the American Statistical Association (2001), 96:1458-1468.
[4] Yu, K. and Moyeed, R. A. Bayesian quantile regression[J]. Statistics and Probability Letters (2001), 54:437-447.
[5] Tsionas, E. Bayesian quantile inference[J]. Journal of Statistical Computation and Simulation (2003), 73:659-674.
[6] Hanson, T. and Johnson, W. O. Modeling regression error with a mixture of Polya trees[J]. Journal of the American Statistical Association (2002), 97:1020-1433.
[7] Hjort, N. L. and Walker, S. G. Quantile pyramids for Bayesian nonparametrics[J]. The Annals of Statistics (2009), 37:105-131.
[8] Kottas, A. and Krnjajic, M. Bayesian nonparametric modeling in quantile regression[J]. Scandinavian Journal of Statistics (2009), 36:297-319.
[9] Taddy, M. A. and Kottas, A. A Bayesian nonparametric approach to inference for quantile regression[J]. Journal of Business and Economic Statistics (2010), 28:357-369.
[10] Drovandi, C. C. and Pettitt, A. N. Likelihood-free Bayesian estimation of multivariate quantile distributions[J]. Computational Statistics and Data Analysis (2011), 55:2541-2556.
[11] Geraci, M. and Bottai, M. Quantile regression for longitudinal data using the asymmetric Laplace distribution[J]. Biostatistics (2007), 8:140-154.
[12] Reich, B. J. and Bondell, H. D. and Wang, H. X. Flexible Bayesian quantile regression for independent and clustered data[J]. Biostatistics (2010), 11:337-352.
[13] Koenker, R. and Machado, J. A. F. Goodness of fit and related inference processes for quantile regression[J]. Journal of the American Statistical Association (1999), 94:1296-1310.
[14] Komunjer, I. Quasi-maximum likelihood estimation for conditional quantiles[J]. Journal of Econometrics (2005), 128:137-164.
[15] Wu, Y. and Liu, Y. Variable selection in quantile regression[J]. Statistica Sinica (2009), 19:801-817.
[16] Li, Y. and Zhu, J. L1-norm quantile regression[J]. Journal of Computational and Graphical Statistics (2008), 17:163-185.
[17] Belloni, A. and Chernozhukov, V. L1-penalized quantile regression in high-dimensional sparse models[J]. The Annals of Statistics (2011), 39:82-130.
[18] Bang, S. W. and Jhun, M. Simultaneous estimation and factor selection in quantile regression via adaptive sup-norm regularization[J]. Computational Statistics and Data Analysis (2011), DOI:10.1016/j.csda.2011.01.026.
[19] Zou, H. and Yuan, M. Regularized simultaneous model selection in multiple quantiles regression[J]. Computational Statistics and Data Analysis (2008), 52:5296-5304.
[20] Park, T. and Casella, G. The Bayesian lasso[J]. Journal of the American Statistical Association (2008), 103:681-686.
[21] Hans, C. Bayesian lasso regression[J]. Biometrika (2009), 96:835-845.
[22] Li, Q. and Xi, R. and Lin, N. Bayesian regularized quantile regression[J]. Bayesian Analysis (2010), 5:533-556.
[23] George, E. I. and McCulloch, R. E. Variable selection via Gibbs sampling[J]. Journal of the American Statistical Association (1993), 88:881-889.
[24] Wang, X. and George, E. I. Adaptive Bayesian criteria in variable selection for generalized linear models[J]. Statistica Sinica (2007), 17:667-690.
[25] Smith, M. and Kohn, R. Nonparametric regression using Bayesian variable selection[J]. Journal of Econometrics (1996), 75:317-343.
[26] Yi, N. and George, V. and Allison, D. B. Stochastic search variable selection for identifying multiple quantitative trait loci[J]. Genetics (2003), 164:1129-1138.
[27] Farcomeni, A. Bayesian constrained variable selection[J]. Statistica Sinica (2010), 20:1043-1062.
[28] George, E. I. and McCulloch, R. E. Approaches for Bayesian variable selection[J]. Statistica Sinica (1997), 7:339-373.
[29] Kuo, L. and Mallick, B. Variable selection for regression models[J]. Sankhya, B (1998), 60:65-81.
[30] Antoniadis, A. and Gregoire, G. and McKeague, I. Bayesian estimation in single-index models[J]. Statistica Sinica (2004), 14:1147-1164.
[31] Reed, C. and Dunson, D. B. and Yu, K. Bayesian quantile regression model selection using stochastic search[J]. Brunel University (2010).
[32] Manski, C. F. Maximum score estimation of the stochastic utility model of choice[J]. Journal of Econometrics (1975), 3:205-228.
[33] Manski, C. F. Semiparametric analysis of discrete response: asymptotic properties of the maximum score estimator[J]. Journal of Econometrics (1985), 27:313-333.
[34] Das, S. A semiparametric structural analysis of the idling of cement kilns[J]. Journal of Econometrics (1991), 50:235-256.
[35] Bartik, T. J. and Butler, J. S. and Liu, J. Maximum score estimates of the determinants of residential mobility: implications for the value of residential attachment and neighborhood amenities[J]. Journal of Urban Economics (1992), 32:233-256.
[36] Horowitz, J. L. A smoothed maximum score estimator for the binary response model[J]. Econometrica (1992), 60:505-531.
[37] Horowitz, J. L. Semiparametric estimation of a work-trip mode choice model[J]. Journal of Econometrics (1993), 58:49-61.
[38] Goodwin, B. K. Semiparametric (distribution-free) testing of the expectations hypothesis in a parimutuel gambling market[J]. Journal of Business and Economic Statistics (1996), 14:487-496.
[39] Fernandez, A. I. and Rodriguez-Poo, J. M. Estimation and specification testing in female labor participation models: parametric and semiparametric methods[J]. Econometric Reviews (1997), 16:229-247.
[40] Kordas, G. Smoothed binary regression quantile[J]. Journal of Applied Econometrics (2006), 21:387-407.
[41] Powell, J. Censored regression quantiles[J]. Journal of Econometrics (1986), 32:143-155.
[42] Hahn, J. Bootstrapping quantile regression estimators[J]. Econometric Theory (1995), 11:105-121.
[43] Buchinsky, M. and Hahn, J. An alternative estimator for censored quantile regression[J]. Econometrica (1998), 66:653-671.
[44] Bilias, Y. and Chen, S. and Ying, Z. Simple resampling methods for censored regression quantile[J]. Journal of Econometrics (2000), 68:303-338.
[45] Fitzenberger, B. and Winker, P. Improving the computation of censored quantile regressions[J]. Computational Statistics and Data Analysis (2007), 52:88-108.
[46] Lin, G. and He, X. and Portnoy, S. Quantile regression with doubly censored data[J]. Computational Statistics and Data Analysis (2011), DOI:10.1016/j.csda.2011.03.009.
[47] Yu, K. and Stander, J. Bayesian analysis of a Tobit quantile regression model[J]. Communications in Statistics-Theory and Methods (2007), 137:260-276.
[48] Stoker, T. M. Consistent estimation of scaled coefficients[J]. Econometrica (1986), 54:1461-1481.
[49] Hardle, W. and Stoker, T. M. Investigating smooth multiple regression by the method of average derivatives[J]. Journal of the American Statistical Association (1989), 84:986-995.
[50] Powell, J. L. and Stock, J. H. and Stoker, T. M. Semiparametric estimation of index coefficients[J]. Econometrica (1989), 57:1403-1430.
[51] Hardle, W. and Marron, J. S. and Tsybakov, A. B. Bandwidth choice for average derivative estimation[J]. Journal of the American Statistical Association (1992), 87:218-226.
[52] Newey, W. K. and Stoker, T. M. Efficiency of weighted average derivative estimators and index models[J]. Econometrica (1993), 61:1199-1223.
[53] Samarov, A. M. Exploring regression structure using nonparametric functional estimation[J]. Journal of the American Statistical Association (1993), 88:836-847.
[54] Hardle, W. and Tsybakov, A. B. How sensitive are average derivatives?[J]. Journal of Econometrics (1993), 58:31-48.
[55] Hristache, M. A. and Juditsky, J. and Spokoiny, V. Direct estimation of the index coefficients in a single-index model[J]. Annals of Statistics (2001), 29:595-623.
[56] Hristache, M. A. and Juditsky, J. and Polzehl, J. and Spokoiny, V. Structure adaptive approach for dimension reduction a single-index model[J]. Annals of Statistics (2001), 29:1537-1566.
[57] Carroll, R. J. and Fan, J. and Gijbels, I. and Wand, M. P. Generalized partially linear single-index models[J]. Journal of the American Statistical Association (1997), 92:477-489.
[58] Delecroix, M. and Hristache, M. M-estimateurs semiparamétriques dans les modèles à direction révélatrice unique[J]. Bulletin of the Belgian Mathematical Society (1999), 6:161-185.
[59] Hardle, W. and Hall, P. and Ichimura, H. Optimal smoothing in single-index models[J]. Annals of Statistics (1993), 21:157-178.
[60] Horowitz, J. L. and Hardle, W. Direct semiparametric estimation of single-index models with discrete covariates[J]. Journal of the American Statistical Association (1996), 91:1632-1640.
[61] Ichimura, H. Semiparametric least-squares (SLS) and weighted SLS estimation of single-index models[J]. Journal of Econometrics (1993), 58:71-120.
[62] Klein, R. L. and Spady, R. H. An efficient semiparametric estimator for binary response models[J]. Econometrica (1993), 61:387-421.
[63] Yu, Y. and Ruppert, D. Penalized spline estimation for partially linear single-index model[J]. Journal of the American Statistical Association (2002), 97:1042-1054.
[64] Xia, Y. and Hardle, W. Semi-parametric estimation of partially linear single index models[J]. Journal of Multivariate Analysis (2006), 97:1162-1184.
[65] Xia, Y. and Tong, H. and Li, W. K. and Zhang, D. An adaptive estimation of dimension reduction space (with discussion)[J]. Journal of the Royal Statistical Society Series B (2002), 64:363-410.
[66] Li, K. Sliced inverse regression for dimension reduction[J]. Journal of the American Statistical Association (1991), 86:316-342.
[67] Delecroix, M. and Hall, P. and Vial-Roget, C. Test des modèles à direction révélatrice unique. Abstracts of the 34th meeting of the French Statistical Society (2001), 364-365.
[68] Xia, Y. and Li, W. K. and Tong, H. and Zhang, D. A goodness-of-fit test for single-index models (with discussion)[J]. Statistica Sinica (2004), 14:1-39.
[69] Wang, H. B. Bayesian estimation and variable selection for single index models[J]. Computational Statistics and Data Analysis (2009), 53:2617-2627.
[70] Karabatsos, G. Modeling heteroscedasticity in the single-index model with the Dirichlet process[J]. Advances and Applications in Statistical Sciences (2009), 1:83-104.
[71] Choi, T. and Shi, J. and Wang, B. A Gaussian process regression approach to a single index model[J]. Journal of Nonparametric Statistics (2011), 23:21-36.
[72] Gramacy, R. and Lian, H. Gaussian process single index models as emulators for computer experiments[J]. Technical Reports (2011).
[73] Naik, P. A. and Tsai, C. I. Single index model selection[J]. Biometrika (2001), 88:821-832.
[74] Kong, E. and Xia, Y. Variable selection for single index model[J]. Biometrika (2007), 94:217-229.
[75] Zhu, L. P. and Zhu, L. X. Nonconcave penalized inverse regression in single index models with high dimensional predictors[J]. Journal of Multivariate Analysis (2009), 100:862-875.
[76] Zhu, L. P. and Qian, L. Y. and Lin, J. G. Variable selection in a class of single index models[J]. Annals of the Institute of Statistical Mathematics (2011), DOI:10.1007/s10463-010-0287-4.
[77] Peng, H. and Huang, T. Penalized least squares for single index models[J]. Journal of Statistical Planning and Inference (2011), 141:1362-1379.
[78] Zeng, P. and He, T. H. and Zhu, Y. A Lasso-type approach for estimation and variable selection in single index models[J]. Journal of Computational and Graphical Statistics (2011), accepted.
[79] Bae, K. and Mallick, B. K. Gene selection using a two-level hierarchical Bayesian model[J]. Bioinformatics (2004), 20:3424-3430.
[80] Yuan, M. and Lin, Y. Efficient empirical Bayes variable selection and estimation in linear models[J]. Journal of the American Statistical Association (2005), 100:1215-1225.
[81] Albert, J. H. and Chib, S. Bayesian analysis of binary and polychotomous response data[J]. Journal of the American Statistical Association (1993), 88:669-679.
[82] Benoit, D. F. and Poel, D. V. D. Binary quantile regression: a Bayesian approach based on the asymmetric Laplace density[J]. Journal of Applied Econometrics (2010), published online, DOI:10.1002/jae.1216.
[83] Wu, T. Z. and Yu, K. M. and Yu, Y. Single-index quantile regression[J]. Journal of Multivariate Analysis (2010), 101:1607-1621.
[84] Kong, E. and Xia, Y. Quantile estimation of a general single-index model[J]. Working paper (2007).
[85] Hu, Y. and Gramacy, R. B. and Lian, H. Bayesian quantile regression for single index models[J]. Working paper (2011).
[86] Yu, K. and Zhang, J. A three-parameter asymmetric Laplace distribution and its extension[J]. Communications in Statistics-Theory and Methods (2005), 34:1867-1879.
[87] Kozumi, H. and Kobayashi, G. Gibbs sampling methods for Bayesian quantile regression[J]. Journal of Statistical Computation and Simulation (2011), 81:1565-1578.
[88] Ghosh, J. K. and Ramamoorthi, R. V. Bayesian nonparametrics[M]. Springer (2003).
[89] Ferguson, T. S. A Bayesian analysis of some nonparametric problems[J]. The Annals of Statistics (1973), 1:209-230.
[90] Blackwell, D. and MacQueen, J. B. Ferguson Distributions via Pólya Urn Schemes[J]. The Annals of Statistics (1973), 1:353-355.
[91] Sethuraman, J. A Constructive Definition of Dirichlet Priors[J]. Statistica Sinica (1994), 4:639-650.
[92] Martin, A. and Quinn, K. M. and Park, H. P. Markov chain Monte Carlo (MCMC)[J]. Package "MCMCpack", ver 1.0-10 (2011).
[93] Harrison, D. and Rubinfeld, D. L. Hedonic housing prices and the demand for clean air[J]. Journal of Environmental Economics and Management (1978), 5:81-102.
[94] Efron, B. and Hastie, T. and Johnstone, I. and Tibshirani, R. Least Angle Regression[J]. The Annals of Statistics (2004), 32:407-499.
[95] Trautmann, H. and Steuer, D. and Mersmann, O. Truncated normal distribution[J]. R Package "truncnorm", ver 1.0-4 (2010).
[96] Calcagno, V. and Mazancourt, C. D. glmulti: An R Package for Easy Automated Model Selection with (Generalized) Linear Models[J]. Journal of Statistical Software (2010), 34.
[97] Ripley, B. Pattern Recognition and Neural Networks[M]. Cambridge University Press (1996).
[98] Fair, R. C. A theory of extramarital affairs[J]. Journal of Political Economy (1978), 86:45-61.
[99] Chernozhukov, V. and Hong, H. Three-step censored quantile regression and extramarital affairs[J]. Journal of the American Statistical Association (2002), 97:872-882.
[100] Gelman, A. and Rubin, D. B. Inference from iterative simulation using multiple sequences[J]. Statistical Science (1992), 7:457-472.
[101] Brooks, S. and Gelman, A. General methods for monitoring convergence of iterative simulations[J]. Journal of Computational and Graphical Statistics (1998), 7:434-455.
[102] Andrews, D. F. and Mallows, C. L. Scale mixtures of normal distributions[J]. Journal of the Royal Statistical Society Series B (1974), 36:99-102.
[103] Kyung, M. A computational Bayesian method for estimating the number of knots in regression splines[J]. Bayesian Analysis (2011), 4:1-36.
[104] Dimatteo, I. and Genovese, C. R. and Kass, R. E. Bayesian curve fitting with free knot splines[J]. Biometrika (2001), 88:1055-1071.
[105] Saw, J. G. A family of distributions on the m-sphere and some hypothesis tests[J]. Biometrika (1978), 65:69-73.
[106] Escobar, M. D. and West, M. Density estimation and inference using mixture[J]. Journal of the American Statistical Association (1995), 90:577-588.
[107] Bush, C. A. and MacEachern, S. N. A semiparametric Bayesian model for randomised block designs[J]. Biometrika (1996), 83:275-285.
[108] MacEachern, S. N. Estimating mixture of Dirichlet process models[J]. Communications in Statistics B (1994), 23:727-741.
[109] Antoniak, C. E. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems[J]. Annals of Statistics (1974), 2:1152-1174.
[110] Penrose, K. W. and Nelson, A. G. and Fisher, A. G. Generalized body composition prediction equation for men using simple measurement techniques[J]. Medicine and Science in Sports and Exercise (1985), 17:189.
[111] Siri, W. E. Gross composition of the body[M]. Advances in Biological and Medical Physics, IV. Academic Press, Inc., New York (1956).
[112] Katch, F. and McArdle, W. Nutrition, Weight Control, and Exercise[M]. Houghton Mifflin Co., Boston (1977).
