Research on Several Problems of Regression Models with Censored Data
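Abstract
Regression analysis is a statistical method for studying the dependence between variables. In theory the regression function is generally unknown, and regression analysis estimates and draws inferences about it from the observed values of the covariates and the response variable; the methods used depend largely on the assumptions of the regression model. The linear regression model is the most widely applied and has the richest theory, but its assumptions often fail in practice, so nonparametric regression models are frequently needed. The emergence of nonparametric regression has in turn given rise to many new models, such as semiparametric models, additive models, and varying-coefficient models.
     Regression theory for complete data is fairly mature, yet censored data of various forms arise frequently in fields such as biological science, clinical trials, and quality control. This thesis studies estimation theory for linear regression models, nonparametric models, and semiparametric models with different types of random errors under censoring. The main results are as follows:
     1. Estimation of the linear regression model with an interval-censored covariate. Two estimation methods are proposed. First, based on the maximum likelihood estimator of the covariate distribution parameters, the conditional mean of the interval-censored covariate is constructed and substituted for the true value; the regression parameters are then estimated by least squares, and the asymptotic unbiasedness and strong consistency of the estimator are proved. Second, a two-step iterative estimation method is proposed: when the regression parameters are known, the covariate distribution parameters are estimated by maximizing the likelihood function of the interval-censored covariate; when the distribution parameters are known, the conditional expectation of the interval-censored covariate is constructed and the least squares estimator of the regression parameters follows. Numerical simulations indicate that this method performs well.
     2. A local linear estimator for the nonparametric regression model with an interval-censored response. The effect on the local linear estimate of the error between the censored observation and the true value is analyzed, the censored data are corrected accordingly, and the nonparametric regression function is then estimated by local linear fitting on the corrected values. Numerical simulations indicate that the estimator performs satisfactorily.
     3. Estimation of the semiparametric regression model with a randomly censored response. The nearest-neighbor estimator based on the empirical distribution function is applied to the nonparametric part of the semiparametric model, and estimators of the parametric and nonparametric parts are obtained with a two-stage estimation approach. Under relatively weak assumptions, the asymptotic normality of the parametric part and the strong consistency of the nonparametric part are proved.
     4. The semiparametric model under random censoring when the random errors form an NA (negatively associated) sequence. Using convergence properties of sums of NA sequences, an o(n^{-1/4}) convergence rate for the parameter estimator and the strong consistency of the nonparametric part are obtained, and the asymptotic normality of the parameter estimator is further proved.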
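The first method in item 1 (estimate the covariate distribution, replace each interval-censored covariate by its conditional mean, then apply least squares) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the covariate is assumed normal, the censoring intervals are unit-width bins, the maximum likelihood estimate is computed with EM iterations, and the function names (truncated_moments, fit_covariate_distribution, ols, simulate_and_fit) are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for its pdf and cdf


def truncated_moments(lo, hi, mu, sigma):
    """Return E[X] and E[X^2] for X ~ N(mu, sigma^2) given lo < X < hi."""
    a, b = (lo - mu) / sigma, (hi - mu) / sigma
    pa, pb = STD.pdf(a), STD.pdf(b)
    mass = max(STD.cdf(b) - STD.cdf(a), 1e-300)  # guard against division by zero
    ez = (pa - pb) / mass                  # mean of the truncated standard normal
    ez2 = 1.0 + (a * pa - b * pb) / mass   # second moment of the same variable
    ex = mu + sigma * ez
    ex2 = mu * mu + 2.0 * mu * sigma * ez + sigma * sigma * ez2
    return ex, ex2


def fit_covariate_distribution(intervals, n_iter=50):
    """EM iterations for the normal MLE when only intervals (lo, hi) are observed."""
    mids = [(lo + hi) / 2.0 for lo, hi in intervals]
    mu = sum(mids) / len(mids)
    sigma = math.sqrt(sum((m - mu) ** 2 for m in mids) / len(mids))
    for _ in range(n_iter):
        moments = [truncated_moments(lo, hi, mu, sigma) for lo, hi in intervals]
        mu_new = sum(m[0] for m in moments) / len(moments)
        var_new = sum(m[1] for m in moments) / len(moments) - mu_new ** 2
        mu, sigma = mu_new, math.sqrt(var_new)
    return mu, sigma


def ols(x, y):
    """Least squares intercept and slope for a single covariate."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def simulate_and_fit(n=1000, seed=0):
    """Simulate y = 1 + 2x + error, censor x into unit bins, and estimate."""
    rng = random.Random(seed)
    xs = [rng.gauss(2.0, 1.0) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    # Only the bin [floor(x), floor(x) + 1) containing x is observed.
    intervals = [(math.floor(x), math.floor(x) + 1.0) for x in xs]
    mu, sigma = fit_covariate_distribution(intervals)
    imputed = [truncated_moments(lo, hi, mu, sigma)[0] for lo, hi in intervals]
    intercept, slope = ols(imputed, ys)
    return mu, sigma, intercept, slope
```

Calling simulate_and_fit() returns the estimated covariate mean and standard deviation together with the regression intercept and slope, which can be compared with the simulated values 2, 1, 1 and 2. The two-step iterative method of item 1 would additionally alternate between updating the distribution parameters and the regression parameters; that refinement is not shown here.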
Regression analysis is a statistical method for handling the dependence between variables. In theory the regression function is usually unknown, and regression analysis estimates it from the observed values of the covariates and the response variable. The choice of regression method depends largely on the assumptions of the regression model. The linear regression model is the oldest and most widely applied, and it has the most mature theory. In practice, however, its assumptions are often not satisfied, so nonparametric regression models are needed. These in turn have led to many new models, such as semiparametric models, additive models, and varying-coefficient models.
     Regression theory for complete data is mature, but censored data often arise in fields such as biological science, clinical trials, and quality control. This thesis studies the estimation of linear regression models, nonparametric models, and semiparametric models with different kinds of errors under censoring. The main results and contributions are as follows:
     1. We propose two methods for estimating the linear regression parameters with an interval-censored covariate. First, based on the maximum likelihood estimator of the parameters of the covariate distribution, we construct the conditional mean of the interval-censored covariate and substitute it for the true value. The regression parameters are then estimated by least squares, and the asymptotic unbiasedness and strong consistency of the proposed estimators are proved. Second, we propose a two-step iterative algorithm: when the regression parameters are given, the parameters of the covariate distribution are obtained by maximizing the likelihood function of the interval-censored covariate; when the distribution parameters are given, the regression parameters are obtained by least squares based on the conditional mean of the covariate. Simulations show that the method performs well.
     2. We propose local linear estimators for the nonparametric regression model with an interval-censored response. We analyze how the deviation between the censored data and the true values affects the estimator and correct the censored data accordingly. The final local linear estimators are computed from the corrected data. Simulation examples illustrate the good performance of the proposed estimators.
     3. For the semiparametric regression model with a randomly censored response, we estimate the parametric and nonparametric parts with a two-stage approach that combines least squares with the nearest-neighbor method based on the empirical distribution function. The asymptotic normality of the parametric part and the strong consistency of the nonparametric part are proved under weak conditions.
     4. For the semiparametric regression model with randomly censored data and NA (negatively associated) errors, we propose estimators of the parametric and nonparametric parts. Using convergence properties of sums of NA sequences, we prove an o(n^{-1/4}) convergence rate for the parametric part and the strong consistency of the nonparametric part. We also establish the asymptotic normality of the parametric part.
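The local linear estimation step in item 2 can be sketched as follows. The abstract does not describe how the thesis corrects interval-censored responses, so the midpoint_correction function below uses interval midpoints purely as a placeholder assumption; the Gaussian kernel, the bandwidth values, and all function names are likewise hypothetical choices for illustration.

```python
import math


def local_linear(x0, xs, ys, h):
    """Local linear estimate of m(x0) with a Gaussian kernel and bandwidth h.

    Minimizes sum_i K((x_i - x0) / h) * (y_i - a - b * (x_i - x0))^2 over (a, b)
    and returns the fitted intercept a, which estimates m(x0).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("design is degenerate near x0; try a larger bandwidth")
    return (s2 * t0 - s1 * t1) / det


def midpoint_correction(intervals):
    """Placeholder correction (an assumption): replace each interval by its midpoint."""
    return [(lo + hi) / 2.0 for lo, hi in intervals]


def demo(h=0.05, width=0.2):
    """Fit m(x) = sin(2*pi*x) from responses observed only as bins of the given width."""
    xs = [i / 200.0 for i in range(201)]
    true_y = [math.sin(2.0 * math.pi * x) for x in xs]
    # Each response is known only to lie in [lo, lo + width).
    intervals = [(math.floor(y / width) * width,
                  math.floor(y / width) * width + width) for y in true_y]
    corrected = midpoint_correction(intervals)
    grid = [0.1, 0.25, 0.5, 0.75, 0.9]
    return [(x0, local_linear(x0, xs, corrected, h)) for x0 in grid]
```

A useful sanity check on the estimator is that a local linear fit reproduces an exactly linear regression function for any bandwidth, which is one reason it is often preferred to a local constant (kernel average) fit near the boundary of the design.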
