Diagnostic Checking for Regression Models in Time Series
Abstract
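A time series is a sequence of observations ordered in time. Time series arise in almost every field, from economics and finance to engineering, from astronomy and geography to meteorology, and from medicine to biology. Statistical analysis of and inference from such data is called time series analysis. In recent decades, financial time series analysis has attracted wide attention. In 1982, while analyzing UK inflation data, Engle proposed a modeling idea: the conditional variance of the errors of an autoregressive time series model need not be constant and may change over time. Based on this idea, Engle introduced the first conditional heteroscedasticity model, the well-known ARCH(p) model. Thanks to this pioneering work, conditional heteroscedasticity models for financial time series quickly attracted great attention in both academia and practice. Researchers have proposed a wide variety of conditional heteroscedasticity models to match the features of real economic and financial data, and have studied various parametric and nonparametric estimation methods. But is a proposed model reasonable? In other words, do the observed data really come from this model? This question often receives little attention. It is in fact the problem of model checking. In the well-known three-step Box-Jenkins procedure for time series modeling (model building, parameter estimation, and model checking), the three steps are in principle equally important. However, as noted in the monograph by Li, far more attention has been paid to the first two steps, while the third (model checking) is often neglected. For conditional heteroscedasticity models, which have been widely studied over the past two decades, model checking has likewise not received the attention it deserves, and related work is scarce.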
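     For classical regression models, the literature offers two main classes of model checking methods: local smoothing methods and global smoothing methods. Local smoothing methods estimate the mean function nonparametrically and may therefore suffer from the curse of dimensionality. To avoid this, a variety of global smoothing methods have been proposed; the resulting tests need no nonparametric smoothing but are insensitive to high-frequency alternatives. Each approach thus has its own strengths and weaknesses. Moreover, both are essentially designed for a univariate response. This thesis therefore proposes new methods for checking autoregressive time series models. In particular, the time series considered here may be univariate or multivariate, the regression function may take a very general form, and the regressors may include several lagged terms.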
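     We first study model checking for general autoregressive models of univariate time series, including the mean model and the variance model of conditional heteroscedasticity models. By taking weighted averages of the residuals or the standardized residuals, we construct a score-type test statistic. The test has several desirable properties: it is asymptotically chi-squared under the null model and therefore easy to use; it is sensitive to alternatives and can detect alternatives that converge to the null at the parametric rate; and its power can be improved through the choice of weight function. For a directional alternative, we derive the optimal (most powerful) score-type test. When the alternative approaches the null along several possible directions rather than a single one, we construct a maximin test, which is asymptotically distribution-free and has many desirable properties. For a completely unknown (fully saturated) alternative, we also outline a feasible scheme for constructing an omnibus test based on the score-type idea. It should be noted that this is the first work to study theoretically the power properties of tests for diagnostic checking of time series regression models. As a by-product of the power study, we obtain the asymptotic properties of the parameter estimator (the quasi-maximum likelihood estimator) when the model is misspecified.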
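     The construction of the score-type test involves plug-in estimation of the asymptotic variance, so the power of the test may be low when the sample size is very small. We therefore develop the nonparametric Monte Carlo test (NMCT) method for dependent data. This method avoids the problems caused by plug-in estimation and improves the power of the test in small samples. Simulations show that the NMCT has no clear advantage when the sample size is large or moderate, because the score-type test already performs well when the sample is not very small. When the sample size is very small, however, the NMCT shows a clear advantage: the power obtained with critical values determined by the NMCT differs considerably from that obtained with critical values from the asymptotic distribution.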
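     In addition, to avoid plug-in estimation of the asymptotic variance, we use the empirical likelihood method to construct a scale-invariant empirical likelihood ratio score-type test. On the one hand, the test inherits desirable properties of empirical likelihood, such as Wilks' theorem (or phenomenon) and Bartlett correctability. On the other hand, it retains desirable properties of score-type tests: it is asymptotically chi-squared under the null hypothesis and can detect directional alternatives that converge to the null at the parametric rate. Notably, we find that the naive empirical likelihood ratio, when used for model checking, does not exhibit Wilks' phenomenon and yields a test that is not scale-invariant, which is clearly undesirable. We therefore propose a bias-correction technique and obtain a bias-corrected empirical likelihood ratio score-type test statistic that has the Wilks property.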
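     In practice, it is often necessary and important to treat several time series jointly, that is, to study vector time series. Multivariate GARCH-type models were proposed and studied soon after Engle introduced the conditional heteroscedasticity model. Compared with the univariate case, however, multivariate GARCH-type models are harder to handle in both parameter estimation and model checking. A paper published in the Journal of Applied Econometrics in 2006 points out that developing model checking methods for multivariate GARCH-type models is an open problem whose solution would significantly advance both theory and practice. In general, a direct extension of an existing method to a multivariate response cannot yield a powerful test. Indeed, in both theory and practice, particular attention should be paid to the correlation among the components of the response. In this thesis we study model checking for multivariate time series models and multivariate GARCH-type models directly, using suitable transformations and techniques. Specifically, for vector autoregressive models, we build the test statistic from componentwise weighted averages of the vector residuals. To avoid plug-in estimation of the asymptotic variance, we also combine this with empirical likelihood and, through bias correction, obtain a scale-invariant empirical likelihood ratio score-type test. For multivariate GARCH-type models, we build the test statistic from a weighted average of a function of the standardized residuals. For all of these test statistics we study the power theoretically.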
     Finally, computer simulations and real data analyses illustrate the usefulness of our model checking methods.
A time series is a sequence of data observed over time; such data arise in almost all fields, including economics, finance, technology, astronomy, geography, meteorology, medicine, and biology. The statistical analysis of and inference from time series is called time series analysis. In recent decades, financial time series analysis has received a growing amount of attention. In studying the variance of UK inflation, Engle (1982) first argued that the conditional variance of the error may be time-varying, and proposed the well-known autoregressive conditional heteroscedasticity (ARCH(p)) model. Since this seminal work, time series with conditional heteroscedasticity have received a great deal of attention in both theory and applications. Various GARCH models have been proposed in the literature to fit real economic and financial data with different properties, and various parametric and nonparametric estimation methods have been suggested and investigated. However, is the model valid? That is, are the data really from the model? This question has not received the attention it deserves. It is in fact a problem of model checking. In the well-known three-step Box-Jenkins procedure for time series modeling (model building, parameter estimation, and model checking), the three steps should carry equal weight. As argued in the monograph by Li, however, much has been done for the first two steps, while the third (model checking) has not received the attention it deserves. For the GARCH models, which have received a great deal of attention over the last two decades, only a few papers focus on model checking.
    In classical regression, there are two major classes of techniques for model diagnostic checking: local smoothing and global smoothing methods. Tests of the former type require nonparametric estimation of the mean regression function and therefore often suffer from the curse of dimensionality. Many tests based on global smoothing have been proposed to avoid this problem. These tests need no nonparametric smoothing but are less sensitive to high-frequency alternatives. Each methodology thus has clear pros and cons. Moreover, both are used mainly when the response is univariate. This thesis therefore proposes new approaches to constructing tests for the adequacy of regression models in time series. It is worth pointing out that the time series considered in this thesis may be univariate or multivariate, the form of the regression function is general, and the regressors may be several lagged variables.
    We first study model checking for general autoregressive models (including the mean and variance regression models for time series with heteroscedasticity). By averaging weighted residuals, we construct a score-type test statistic. The tests have the following features: under the null model they are asymptotically chi-squared and hence tractable; they are sensitive to alternatives and can detect directional alternatives converging to the null at the parametric rate; and they involve weight functions, which give the flexibility to choose scores that enhance power, especially under directional alternatives. For a directional alternative, the optimal score-type test is investigated. For a class of alternatives, we construct an asymptotically distribution-free maximin test with many desirable properties. The possibility of constructing score-based omnibus tests when the alternative is saturated is also discussed. It is worth pointing out that this is the first work to study theoretically the power properties of diagnostic checks for GARCH-type models. As a by-product, we also study the asymptotic properties of the parameter estimator when the parametric model is not correctly specified.
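    The idea of a score-type test built from weighted residual averages can be illustrated with a minimal sketch. It is not the statistic of the thesis: it assumes a simple AR(1) null model fitted by least squares, a hypothetical departure term delta*|x_{t-1}|, and the weight function w(x) = |x|; the function names `simulate` and `score_test` are invented for this illustration. The weight is projected off the regressor so that estimating the parameter is accounted for, and the resulting statistic is compared with a chi-squared distribution with one degree of freedom.

```python
import math
import random


def simulate(n, theta, delta, seed):
    """Simulate x_t = theta*x_{t-1} + delta*|x_{t-1}| + e_t with N(0,1) errors.

    delta = 0 gives the null AR(1) model; delta != 0 is a departure from it.
    """
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        prev = x[-1]
        x.append(theta * prev + delta * abs(prev) + rng.gauss(0.0, 1.0))
    return x


def score_test(x, weight=abs):
    """Score-type check of the null model x_t = theta*x_{t-1} + e_t."""
    lagged, current = x[:-1], x[1:]
    n = len(current)
    sxx = sum(v * v for v in lagged)
    # Least-squares estimate of theta under the null model.
    theta_hat = sum(a * b for a, b in zip(lagged, current)) / sxx
    resid = [b - theta_hat * a for a, b in zip(lagged, current)]
    w = [weight(v) for v in lagged]
    # Project the weight off the regressor to account for estimating theta.
    c_hat = sum(a * b for a, b in zip(w, lagged)) / sxx
    w_adj = [wi - c_hat * v for wi, v in zip(w, lagged)]
    # Weighted residual average, scaled by sqrt(n).
    t_n = sum(e * a for e, a in zip(resid, w_adj)) / math.sqrt(n)
    # Plug-in estimate of the asymptotic variance of t_n.
    v_hat = sum((e * a) ** 2 for e, a in zip(resid, w_adj)) / n
    stat = t_n * t_n / v_hat
    # Upper tail of chi-squared(1): P(Z^2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    for delta in (0.0, 0.5):
        stat, p = score_test(simulate(2000, 0.3, delta, seed=1))
        print(f"delta={delta}: statistic={stat:.2f}, p-value={p:.4f}")
```

    Choosing a different `weight` changes which departures the test is sensitive to, which is the flexibility the weight functions are meant to provide.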
    Note that when the sample is small, the power of the tests may be low, as a result of the plug-in estimation used in constructing the score-type test. We therefore develop the nonparametric Monte Carlo test (NMCT) approach for dependent data. With the NMCT approach, the critical value can be determined without plug-in estimation, and the power of the test is enhanced when the sample is small. Simulation results show that when the sample is large or moderate, the NMCT approach is no better than the test with plug-in estimation, because the score-type test already performs well when the sample is not small. When the sample is very small, the NMCT shows its usefulness: the power of the tests with critical values determined by the NMCT approach is higher than that with critical values taken from the limiting distribution.
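    How a Monte Carlo reference distribution can replace a plug-in variance estimate may be sketched as follows. This is only a random-multiplier scheme in the spirit of the NMCT idea, not the algorithm developed in the thesis. It assumes an AR(1) null model, a hypothetical departure delta*|x_{t-1}|, the weight w(x) = |x|, and random sign multipliers; `simulate`, `score_summands` and `nmct_pvalue` are names invented here. The observed sum of weighted residual terms is compared with sums of the same terms under random sign flips, so no variance is estimated.

```python
import random


def simulate(n, theta, delta, seed):
    """Simulate x_t = theta*x_{t-1} + delta*|x_{t-1}| + e_t with N(0,1) errors."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        prev = x[-1]
        x.append(theta * prev + delta * abs(prev) + rng.gauss(0.0, 1.0))
    return x


def score_summands(x, weight=abs):
    """Return the terms e_t * (w_t - c*x_{t-1}) whose sum drives the test."""
    lagged, current = x[:-1], x[1:]
    sxx = sum(v * v for v in lagged)
    # Least-squares fit of the null model x_t = theta*x_{t-1} + e_t.
    theta_hat = sum(a * b for a, b in zip(lagged, current)) / sxx
    w = [weight(v) for v in lagged]
    # Project the weight off the regressor to account for estimating theta.
    c_hat = sum(a * b for a, b in zip(w, lagged)) / sxx
    return [(b - theta_hat * a) * (wi - c_hat * a)
            for a, b, wi in zip(lagged, current, w)]


def nmct_pvalue(summands, n_rep=299, seed=0):
    """Monte Carlo p-value from random sign flips of the summands."""
    rng = random.Random(seed)
    observed = abs(sum(summands))
    exceed = 0
    for _ in range(n_rep):
        # Multiply each term by an independent +1/-1 multiplier.
        replicate = abs(sum(s if rng.random() < 0.5 else -s for s in summands))
        if replicate >= observed:
            exceed += 1
    # Count the observed value itself so the p-value is never zero.
    return (exceed + 1) / (n_rep + 1)


if __name__ == "__main__":
    for delta in (0.0, 0.5):
        terms = score_summands(simulate(200, 0.3, delta, seed=1))
        print(f"delta={delta}: Monte Carlo p-value={nmct_pvalue(terms):.3f}")
```

    Because the observed and replicated sums share the same scale, any normalizing constant cancels, which is why the variance never has to be estimated in this sketch.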
    Moreover, to avoid plug-in estimation, we construct an empirical likelihood (EL) based score-type test that is scale-invariant. The test shares some desirable properties with parametric likelihood, such as Bartlett correctability and Wilks' theorem. At the same time, it shares many desirable features of score-type tests: it is asymptotically chi-squared under the null hypothesis and can detect alternatives converging to the null at the parametric rate. It is worth pointing out that the naive EL-based tests are found not to exhibit Wilks' phenomenon. A bias-correction technique is therefore proposed in the construction of the EL-based score-type tests, and the adjusted tests do exhibit Wilks' phenomenon.
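    The scale invariance that makes empirical likelihood attractive here can be seen in a minimal sketch of the basic empirical likelihood ratio for a zero-mean hypothesis. It covers only the textbook case of testing E[v] = 0 for a plain sample and does not include the bias correction of the thesis; the function name `el_statistic` and the bisection solver are choices made for this illustration.

```python
import math
import random


def el_statistic(values, iterations=200):
    """Empirical log-likelihood ratio statistic for H0: E[values] = 0."""
    lo_val, hi_val = min(values), max(values)
    if not lo_val < 0.0 < hi_val:
        raise ValueError("zero must lie strictly inside the range of the data")
    # The Lagrange multiplier lam solves sum(v / (1 + lam*v)) = 0 and must keep
    # every 1 + lam*v positive, which confines it to this open interval.
    lo, hi = -1.0 / hi_val, -1.0 / lo_val
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval can no longer be split in floating point
        g = sum(v / (1.0 + mid * v) for v in values)
        # g is strictly decreasing in lam, so bisection finds its root.
        if g > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    stat = 2.0 * sum(math.log1p(lam * v) for v in values)
    return max(stat, 0.0)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.2, 1.0) for _ in range(50)]
    stat = el_statistic(sample)
    rescaled = el_statistic([10.0 * v for v in sample])
    # Compare with chi-squared(1): P(Z^2 > s) = erfc(sqrt(s / 2)).
    print(f"statistic={stat:.4f}, rescaled={rescaled:.4f}, "
          f"p-value={math.erfc(math.sqrt(stat / 2.0)):.4f}")
```

    Multiplying every value by a constant rescales the multiplier by the inverse constant and leaves the statistic unchanged, so no variance estimate is needed to standardize it.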
    In applications, multivariate time series have proved increasingly useful. Univariate conditional heteroscedasticity models were extended to multivariate ones almost as soon as the original paper on ARCH was published. Both parameter estimation and model checking are more difficult in multivariate GARCH-type models than in univariate ones. A paper published in the Journal of Applied Econometrics (2006) reports that the further development of multivariate diagnostic tests is one of ten open issues in multivariate GARCH-type models, and that progress on this issue would greatly contribute to their theory and practice. A direct extension of existing methodologies usually cannot yield powerful tests. Indeed, particular attention should be paid to the correlation between the components of the vector variable, in both theory and applications. In this thesis, we study model checking for vector autoregressive models and multivariate GARCH-type models through various techniques. Specifically, in checking the adequacy of vector autoregressive models, we construct the test statistics by averaging each weighted component of the (vector) residuals. To avoid plug-in estimation, we develop an empirical likelihood based score-type test using bias correction, and the test is scale-invariant. In checking the adequacy of the variance model of multivariate GARCH-type models, we construct the test statistics by averaging a weighted function of the standardized residuals. We also study the power of all the above tests theoretically.
    Simulation studies are carried out and applications to real data sets are presented; they illustrate the usefulness of the methods developed in the thesis.
References
[1] Aerts, M., Claeskens, G., and Hart, J. D. (1999). Testing lack of fit in multiple regression. J. Am. Statist. Assoc., 94, 869-879.
    [2] Anderson, T. W. (1971). The Statistical Analysis of Time Series. John Wiley & Sons, INC., New York.
    [3] Bauwens, L., Laurent, S. and Rombouts, J. (2006). Multivariate GARCH models: A survey. J. Appl. Econ., 21, 79-109.
    [4] Behnen, K. and Neuhaus, G. (1989). Rank Tests with Estimated Scores and Their Application. B. G. Teubner, Stuttgart, Germany.
    [5] Bera, A. K. and Higgins, M. L. (1993). A survey of ARCH models: Properties, estimation and testing. J. Econ. Surveys, 7, 305-366.
    [6] Billingsley, P. (1961). The Lindeberg-Levy theorem for martingales. Proc. Am. Math. Soc., 12, 788-792.
    [7] Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
    [8] Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. J. Economet., 31, 307-327.
    [9] Bollerslev, T. (1990). Modelling the coherence in short-run nominal exchange rates: A multivariate generalized ARCH model. Review of Economics and Statistics, 72, 498-505.
    [10] Bollerslev, T., Chou, R. Y. and Kroner, K. F. (1992). ARCH modeling in finance: A review of the theory and empirical evidence. J. Economet., 52, 5-59.
    [11] Bollerslev, T., Engle, R. F. and Nelson, D. B. (1994). ARCH models. Handbook of Econometrics 4, Ed. R. F. Engle and D. L. McFadden, North-Holland, Amsterdam.
    [12] Bollerslev, T., Engle, R. F. and Wooldridge, J. M. (1988). A capital asset pricing model with time varying covariance. J. Polit. Econ., 96, 116-131.
    [13] Bollerslev, T. and Wooldridge, J. M. (1992). Maximum likelihood estimation and inference in dynamic models with time varying covariance. Economet. Rev., 11, 143-172.
    [14] Box, G. E. P. and Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control. 1st edition. Holden-Day, San Francisco.
    [15] Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. 2nd edition. Holden-Day, San Francisco.
    [16] Cai, Z., Fan, J. and Li, R. Z. (2000). Efficient estimation and inferences for varying-coefficient models. J. Amer. Statist. Ass., 95, 888-902.
    [17] Chen, S. X. (1996). Empirical likelihood for nonparametric density function. Biometrika, 83, 329-341.
    [18] Chen, S. X. and Hall, P. (1993). Smoothed empirical likelihood confidence intervals for quantiles. Ann. Statist., 21, 1166-1181.
    [19] Chen, S. X., Härdle, W. and Li, M. (2003). An empirical likelihood goodness-of-fit test for time series. J. Roy. Statist. Soc. B, 65, 663-678.
    [20] Cheng, Kuang Fu and Lin, Pi Erh (1981). Nonparametric estimation of a regression function. Z. Wahrsch. Verw. Gebiete, 57, 223-233.
    [21] Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall, New York.
    [22] Dawkins, B. (1989). Multivariate analysis of national track records. Am. Statist., 43, 110-112.
    [23] Dette, H. (1999). A consistent test for the functional form of a regression based on a difference of variance estimators. Ann. Statist., 27, 1012-1040.
    [24] Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of UK inflation. Econometrica, 50, 987-1008.
    [25] Engle, R. F. and Granger, C. W. J. (1987). Cointegration and error correction: Representation, estimation and testing. Econometrica, 55, 251-276.
    [26] Engle, R. F., Granger, C. W. J. and Kraft, D. (1984). Combining competing forecasts of inflation using a bivariate ARCH model. J. Econ. Dyn. Control, 8, 151-165.
    [27] Eubank, R. L. and Hart, J. D. (1993). Commonality of cusum, von Neumann and smoothing-based goodness-of-fit tests. Biometrika, 80, 89-98.
    [28] Fan, J. and Huang, L. (2001). Goodness-of-fit tests for parametric regression models. J. Amer. Statist. Ass., 96, 640-652.
    [29] Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York.
    [30] Fan, J. and Zhang, W. Y. (2000). Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scand. J. Statist., 27, 715-731.
    [31] Fuller, W. A. (1996). Introduction to Statistical Time Series (Second edition). John Wiley & Sons, INC., New York.
    [32] Granger, C. W. J. and Joyeux, R. (1980). An introduction to long-memory time series models and fractional differencing. J. Time Ser. Anal., 1, 15-29.
    [33] Hall, P. and La Scala, B. (1990). Methodology and algorithms of empirical likelihood. International Statistical Review, 58, 109-127.
    [34] Hannan, E.J. and Deistler, M. (1988). The Statistical Theory of Linear Systems. Wiley, New York.
    [35] Härdle, W. and Mammen, E. (1993). Comparing non-parametric versus parametric regression fits. Ann. Statist., 21, 1926-1947.
    [36] Hart, J. D. (1997). Nonparametric Smoothing and Lack-of-Fit Tests, New York: Springer-Verlag.
    [37] He, X. and Zhu, L. X. (2003). A lack-of-fit test for quantile regression. J. Am. Statist. Ass., 98(464), 1013-1022.
    [38] Hentschel, L. (1995). All in the family: Nesting symmetric and asymmetric GARCH models. J. Financ. Econ., 39, 71-104.
    [39] Higgins, M. L. and Bera, A. K. (1992). A class of nonlinear ARCH models. Int. Econ. Rev., 33, 137-158.
    [40] Horowitz, J. L. and Spokoiny, V. G. (2001). An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica, 69, 599-631.
    [41] Hosking, J.R.M. (1981). Fractional differencing. Biometrika, 68, 165-176.
    [42] Jeganathan, P. (1988). On the strong approximation of the distributions of estimators in linear stochastic models, Ⅰ and Ⅱ: stationary and explosive AR models. Ann. Statist., 16, 1283-1314.
    [43] Johnson, R. A. and Wichern, D. W. (1998). Applied Multivariate Statistical Analysis. Prentice-Hall International, USA.
    [44] Kraft, D. F. and Engle, R. F. (1983). Autoregressive conditional heteroscedasticity in multiple time series. Unpublished manuscript, Department of Economics, University of California at San Diego, CA.
    [45] Koul, H. L. and Stute, W. (1999). Nonparametric model checks for time series. Ann. Statist., 27, 204-236.
    [46] Li, G. D. and Li, W. K. (2005). Diagnostic checking time series models with conditional heteroscedasticity estimated by the LAD approach. Biometrika, 92, 691-701.
    [47] Li, W. K. (2004). Diagnostic Checks in Time Series. Chapman and Hall.
    [48] Li, W. K., Ling, S. and McAleer, M. (2002). Recent theoretical results for time series models with GARCH errors. J. Econ. Surveys, 16, 245-269.
    [49] Li, W. K. and Mak, T. K. (1994). On the squared residual autocorrelations in non-linear time series with conditional heteroscedasticity. J. Time Ser. Anal., 15, 627-636.
    [50] Ling, S. and Li, W. K. (1997). Diagnostic checking of nonlinear multivariate ARCH errors. J. Time Ser. Anal., 18, 447-464.
    [51] Moran, P. A. P. (1953). The statistical analysis of the Canadian Lynx cycle I. Structure and prediction. Aust. J. Zool., 1, 163-173.
    [52] Owen, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237-249.
    [53] Owen, A. (1990). Empirical likelihood ratio confidence regions. Ann. Statist., 18, 90-120.
    [54] Owen, A. (2001). Empirical Likelihood. Chapman and Hall, New York.
    [55] Qin, J. and Lawless, J. (1994). Empirical likelihood and general estimating functions. Ann. Statist., 22, 300-325.
    [56] Strasser, H. (1985). Mathematical Theory of Statistics. De Gruyter, Berlin.
    [57] Stute, W. (1997). Nonparametric model checks for regression. Ann. Statist., 25, 613-641.
    [58] Stute, W., Gonzalez Manteiga, W., and Presedo Quindimil, M. (1998). Bootstrap approximations in model checks for regression. J. Amer. Statist. Ass., 93, 141-149.
    [59] Stute, W., Thies, S. and Zhu, L. X. (1998). Model checks for regression: An innovation process approach. Ann. Statist., 26, 1916-1934.
    [60] Stute, W. and Zhu, L. X. (2002). Model checks for generalized linear models. Scand. J. Statist., 29, 535-546.
    [61] Stute, W. and Zhu, L. X. (2005). Nonparametric checks for single-index models. Ann. Statist., 33, 1048-1083.
    [62] Stute, W., Zhu, L. X. and Xu, W. L. (2005). Model Diagnosis for Parametric Regression in High Dimensional Spaces. Biometrika, Submitted.
    [63] Tsay, R. S. (2002). Analysis of Financial Time Series. Wiley, New York.
    [64] Tse, Y. K. (2002). Residual-based diagnostics for conditional heteroscedasticity models. Economet. J., 5, 358-373.
    [65] Tse, Y. K. and Tsui, A. K. C. (1999). A note on diagnosing multivariate conditional heteroscedasticity models. J. Time Ser. Anal., 20, 679-691.
    [66] Tse, Y. K. and Zuo, X. L. (1997). Testing for conditional heteroscedasticity: Some Monte Carlo results. J. Statist. Comput. Simul., 58, 237-253.
    [67] White, H. (1994). Estimation, Inference, and Specification Analysis. New York: Cambridge University Press.
    [68] Wu, Chien-Fu (1981). Asymptotic theory of nonlinear least squares estimation. Ann. Statist., 9, 501-513.
    [69] Yule, G. U. (1927). On a method of investigating periodicities in disturbed series with special reference to Wolfer's sunspot numbers. Phil. Trans. R. Soc. (London), A, 226, 267-298.
    [70] Zhu, L. X. (2003). Model checking of dimension-reduction type for regression. Statist. Sinica, 13, 283-296.
    [71] Zhu, L. X. (2005). Nonparametric Monte Carlo Tests and Their Applications. Springer, New York.
    [72] Zhu, L. X. and Cui, H. J. (2005). Testing the adequacy of a general linear error-in-variables model. Statist. Sinica, 15, 1049-1068.
    [73] Zhu, L. X. and Ng, K. W. (2003). Checking the adequacy of a partial linear model. Statist. Sinica, 13, 763-781.
    [74] Zhu, L. X., Qin, Y. S. and Xu, W. L. (2006). Empirical likelihood ratio tests for regression models. Science in China, Accepted.
