Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data
  • Authors: Ke-Hai Yuan ; Wai Chan ; Yubin Tian
  • Keywords: Missing data ; Monte Carlo ; Robust means and dispersion matrix ; Sandwich-type covariance matrix
  • Journal: Annals of the Institute of Statistical Mathematics
  • Publication year: 2016
  • Publication date: April 2016
  • Volume: 68
  • Issue: 2
  • Pages: 329-351
  • Full-text size: 581 KB
  • Author affiliations: Ke-Hai Yuan (1)
    Wai Chan (2)
    Yubin Tian (3)

    1. Department of Psychology, University of Notre Dame, Notre Dame, IN, 46556, USA
    2. Department of Psychology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
    3. School of Mathematics, Beijing Institute of Technology, Haidian District, Beijing, 100081, China
  • Journal category: Mathematics and Statistics
  • Journal subjects: Statistics
    Statistics for Business, Economics, Mathematical Finance and Insurance
  • Publisher: Springer Netherlands
  • ISSN: 1572-9052
Abstract
Means and covariance/dispersion matrix are the building blocks for many statistical analyses. By naturally extending the score functions based on a multivariate \(t\)-distribution to estimating equations, this article defines a class of M-estimators of means and dispersion matrix for samples with missing data. An expectation-robust (ER) algorithm solving the estimating equations is obtained. The obtained relationship between the ER algorithm and the corresponding estimating equations allows us to obtain consistent standard errors when robust means and dispersion matrix are further analyzed. Estimating equations corresponding to existing ER algorithms for computing M- and S-estimators are also identified. Monte Carlo results show that robust methods outperform the normal-distribution-based maximum likelihood when the population distribution has heavy tails or when data are contaminated. Applications of the results to robust analysis of linear regression and growth curve models are discussed.
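
To make the idea of a \(t\)-based iteration for incomplete data concrete, the following is a minimal sketch, not the authors' ER algorithm. It assumes the familiar EM-type updates for a multivariate \(t\) model with missing values: each case is weighted by \(w_i = (\nu + p_i)/(\nu + d_i^2)\), where \(p_i\) is its number of observed variables and \(d_i^2\) its squared Mahalanobis distance on those variables; missing entries are replaced by conditional means; and the conditional covariance of the missing block is added back when updating the dispersion matrix. The function name `er_t_estimates`, the parameter `nu`, the starting values, and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np


def er_t_estimates(X, nu=5.0, max_iter=500, tol=1e-8):
    """Illustrative t-weighted EM-type estimates of means and dispersion.

    X is an (n, p) array with np.nan marking missing entries. Every row is
    assumed to have at least one observed value and every column at least
    one observed value with non-zero spread. Not the paper's algorithm.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    obs = ~np.isnan(X)

    # Starting values (an assumption): available-case means and variances.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))

    for _ in range(max_iter):
        filled = np.empty((n, p))   # data with conditional-mean imputations
        weights = np.empty(n)       # case weights from the t model
        cond_cov = np.zeros((p, p))  # summed conditional covariances

        for i in range(n):
            o = obs[i]
            m = ~o
            resid = X[i, o] - mu[o]
            s_oo = sigma[np.ix_(o, o)]
            sol = np.linalg.solve(s_oo, resid)
            d2 = resid @ sol                      # squared Mahalanobis distance
            weights[i] = (nu + o.sum()) / (nu + d2)

            xi = X[i].copy()
            if m.any():
                s_mo = sigma[np.ix_(m, o)]
                xi[m] = mu[m] + s_mo @ sol        # conditional mean of missing part
                cond_cov[np.ix_(m, m)] += (
                    sigma[np.ix_(m, m)] - s_mo @ np.linalg.solve(s_oo, s_mo.T)
                )
            filled[i] = xi

        # Weighted updates of the means and the dispersion matrix.
        mu_new = weights @ filled / weights.sum()
        dev = filled - mu_new
        sigma_new = ((weights[:, None] * dev).T @ dev + cond_cov) / n

        change = max(np.abs(mu_new - mu).max(), np.abs(sigma_new - sigma).max())
        mu, sigma = mu_new, sigma_new
        if change < tol:
            break

    return mu, sigma
```

Calling `er_t_estimates(X, nu=5.0)` on an array with `np.nan` entries returns a mean vector and a dispersion matrix. With small `nu`, cases far from the centre receive small weights, which is what gives the estimates their robustness; as `nu` grows, the weights approach 1 and, for complete data, the result approaches the sample mean and the normal-theory ML covariance matrix. Note that for finite `nu` the returned matrix is a scatter (dispersion) matrix and is generally proportional to, not equal to, the population covariance matrix.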
