A Study of Dynamic Screening Systems
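Abstract
In daily life we sometimes need to monitor a quality characteristic or performance measure of an individual (an aircraft, a car battery, a patient) to check whether it is operating normally and to avoid adverse outcomes. This calls for a screening system that collects observations of the monitored individual sequentially and raises an alarm as quickly as possible once the individual's longitudinal pattern departs from the normal pattern, so that adjustments or interventions can be made in time. We call this the dynamic screening problem. Note that in this problem the distribution of the monitored quality characteristic is not constant but changes over time.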
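     Only two classes of methods in the current statistical literature are relevant to the dynamic screening problem. The first belongs to longitudinal data analysis (LDA). Based on observations from well-functioning individuals, it uses LDA techniques to construct confidence intervals for the mean of the characteristic variable and judges an individual abnormal when its observations fall outside the intervals. In other words, it compares the monitored individual with the well-functioning individuals at the current time point only. For the following reasons, this approach is often inefficient for dynamic screening. First, it does not use historical data during monitoring, so some abnormal individuals go undetected. For example, if a quality characteristic of the monitored individual stays slightly above the mean for well-functioning individuals over time, the individual should be judged abnormal, yet the LDA method is not sensitive to such a pattern. Second, in dynamic monitoring we must respond promptly once an abnormality is found in order to avoid adverse outcomes, so a dynamic decision scheme is essential. The confidence-interval method has no such capability, because it cannot monitor the individual's quality characteristic sequentially.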
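     The second class of methods belongs to statistical process control (SPC). A control chart monitors each individual sequentially and signals quickly once the quality characteristic shifts from the in-control state to the out-of-control state. Conventional control charts, however, cannot be applied directly to the dynamic screening problem either. First, a conventional chart involves a single monitored process, and its decision compares current data with that process's own history. The dynamic screening problem involves many monitored processes, and its decision compares the current process with a large number of well-functioning processes. Second, conventional monitoring assumes that the in-control distribution does not change, which rarely holds in the dynamic screening problem.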
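     To date we have found no effective method for the dynamic screening problem, which makes it all the more challenging. In addition, the problem requires proper handling of the association among observations at different time points, and for multivariate quality characteristics the correlation among the different characteristics must be considered as well. This paper therefore proposes a method for the problem, which we call the dynamic screening system.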
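     The proposed dynamic screening system has three main steps. First, estimate the longitudinal pattern of the quality characteristic (its mean function and variance/covariance function) from observations of well-functioning individuals. Second, for a new individual to be monitored, standardize its observations using the pattern obtained in the first step. Third, apply a control chart to the standardized data to carry out the monitoring. Because the second step is straightforward, this paper concentrates on how to obtain the longitudinal model in the first step and on the performance of control charts applied to standardized residuals in the third step.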
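The three steps can be illustrated with a minimal univariate sketch. It makes several assumptions that are not taken from the source: a simple Gaussian kernel estimator stands in for the pattern estimation, an upward CUSUM stands in for the control chart, and the function names, bandwidth, allowance, control limit, and toy data are all hypothetical.

```python
import math


def estimate_pattern(times, values, t, bandwidth):
    """Step 1: kernel-weighted mean and standard deviation of in-control data at time t."""
    weights = [math.exp(-0.5 * ((s - t) / bandwidth) ** 2) for s in times]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, values)) / total
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, values)) / total
    return mean, math.sqrt(var)


def standardize(new_times, new_values, times, values, bandwidth):
    """Step 2: standardize a new individual's observations with the estimated pattern."""
    residuals = []
    for t, y in zip(new_times, new_values):
        mean, sd = estimate_pattern(times, values, t, bandwidth)
        residuals.append((y - mean) / sd)
    return residuals


def cusum_signal_time(residuals, allowance=0.5, limit=4.0):
    """Step 3: upward CUSUM on standardized residuals; index of first signal, or None."""
    c = 0.0
    for i, e in enumerate(residuals):
        c = max(0.0, c + e - allowance)
        if c > limit:
            return i
    return None


def normal_mean(t):
    """Hypothetical time-varying in-control mean used to build the toy data."""
    return 10.0 + 0.5 * t


# Toy in-control data: three individuals observed at t = 0..19, each
# fluctuating around the drifting mean with a rotating -1, 0, +1 deviation.
obs_times = list(range(20))
ic_times, ic_values = [], []
for j in range(3):
    for t in obs_times:
        noise = (-1.0, 0.0, 1.0)[(t + j) % 3]
        ic_times.append(t)
        ic_values.append(normal_mean(t) + noise)

# A new individual that stays one unit above the normal pattern at every time point.
shifted = [normal_mean(t) + 1.0 for t in obs_times]
residuals = standardize(obs_times, shifted, ic_times, ic_values, bandwidth=0.5)
print("first signal at observation index:", cusum_signal_time(residuals))
```

Each standardized residual of the shifted individual is modest on its own, but the CUSUM accumulates them over time. This is the situation in which a pointwise confidence-interval check is described above as insensitive.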
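     The paper first discusses how to obtain the longitudinal pattern and then the performance of control charts applied to the standardized residuals. In many applications the longitudinal pattern cannot be described by a parametric model, so we concentrate on nonparametric longitudinal patterns. For the univariate dynamic screening system, nonparametric estimation of the longitudinal pattern in the first step has been studied extensively and appears in many textbooks and monographs. For the multivariate case it has not yet been studied, because a nonparametric method for multivariate longitudinal data must properly handle the association of the multivariate response across time points and across components. Motivated by data from the National Heart, Lung, and Blood Institute, Chapter 2 proposes a nonparametric modeling approach for multivariate longitudinal data. We impose only very general assumptions and study a nonparametric regression model for multivariate longitudinal data. The theoretical results show that the proposed method is efficient, and the numerical results agree with the theory. To broaden its applicability, we also extend the model to nonparametric regression with missing data.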
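     Chapters 3 and 4 then examine the performance of general control charts applied to standardized residuals when monitoring univariate and multivariate quality characteristics, respectively. Chapter 3 first considers the case where the univariate longitudinal pattern is known and derives the control limits of the monitoring method on that basis. In practice the pattern must be estimated from observations of well-functioning individuals, so Chapter 3 also studies chart performance with an estimated pattern in a range of settings (independent or correlated standardized residuals; abrupt shifts or gradual drifts in the monitored mean). All results indicate that the method is very effective. Chapter 4 treats the multivariate case. We first consider monitoring when the observations follow a multivariate normal distribution, and then discuss the monitoring statistic and how to obtain its control limits in very general settings (correlated standardized residuals, non-normal standardized residuals). The proposed method again performs well. The real-data examples in Chapters 2, 3, and 4 all use data collected by the National Heart, Lung, and Blood Institute from a study of myocardial infarction.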
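Control limits such as those mentioned above are commonly calibrated so that the chart's in-control average run length (ARL) reaches a chosen target. The sketch below shows one way to do this by simulation. It assumes independent N(0,1) standardized residuals and an upward CUSUM; the function names, the allowance of 0.5, and the target ARL of 50 are illustrative choices, not values from the source. Correlated or non-normal residuals would need a different calibration procedure.

```python
import random


def run_length(limit, allowance, rng, max_steps=5000):
    """Number of in-control N(0,1) residuals observed until an upward CUSUM signals."""
    c = 0.0
    for n in range(1, max_steps + 1):
        c = max(0.0, c + rng.gauss(0.0, 1.0) - allowance)
        if c > limit:
            return n
    return max_steps  # run censored at max_steps


def average_run_length(limit, allowance=0.5, reps=200, seed=0):
    """Monte Carlo estimate of the in-control ARL for a given control limit."""
    total = 0
    for r in range(reps):
        # One generator per replication, so every limit sees the same residual
        # sequences and the estimate is non-decreasing in the limit.
        rng = random.Random(seed + r)
        total += run_length(limit, allowance, rng)
    return total / reps


def find_limit(target_arl, allowance=0.5, lo=0.0, hi=6.0, iters=12):
    """Bisection search for the control limit whose simulated ARL matches the target."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if average_run_length(mid, allowance) < target_arl:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


h = find_limit(50.0)
print("control limit:", h, "simulated in-control ARL:", average_run_length(h))
```

Reusing the same seeded sequences for every candidate limit keeps the simulated ARL monotone in the limit, which is what makes the bisection search valid.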
In our daily life, we often need to identify individuals whose longitudinal behavior is different from the behavior of those well-functioning individuals, so that some unpleasant consequences can be avoided. In many such applications, observations of a given individual are obtained sequentially, and it is desirable to have a screening system to give a signal of irregular behavior as soon as possible after that individual's longitudinal behavior starts to deviate from the regular behavior, so that some adjustments or interventions can be made in a timely manner.
     In the statistical literature, there are two relevant methods. One is to construct confidence intervals for the mean of the performance variables using longitudinal data analysis (LDA); a subject's longitudinal pattern is identified as abnormal if its observations fall outside the intervals. The second is to monitor each subject sequentially using a statistical process control (SPC) chart, which gives a signal after it detects a significant shift in the subject's longitudinal pattern over time. Neither method handles the dynamic screening (DS) problem effectively, because the LDA method cannot monitor a subject sequentially, while the SPC method cannot compare the subject cross-sectionally with other subjects. In this paper, we propose a new method that handles the DS problem effectively by combining the major strengths of the LDA and SPC methods.
     For the LDA method, univariate longitudinal data have been widely studied in the literature. Multivariate longitudinal data are also common in medical, industrial, and social science research (e.g., the data from the SHARe Framingham Heart Study), so this paper focuses on multivariate longitudinal data. Statistical analysis of such data in the current literature is restricted to linear or parametric modeling, and these assumptions are often invalid in applications. It is therefore desirable to develop a nonparametric method for multivariate longitudinal data. When longitudinal data are multivariate, nonparametric modeling becomes challenging, as one needs to properly handle the association among the observed data across different time points and across different components of the multivariate response. Motivated by data from the National Heart, Lung, and Blood Institute, Chapter 2 of this paper proposes a nonparametric modeling approach for analyzing multivariate longitudinal data. Our method is based on multivariate local polynomial smoothing. Both theoretical and numerical results show that it is useful in various settings.
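As a rough illustration of local polynomial smoothing, the sketch below implements a univariate local linear kernel estimator and applies it to each component of a multivariate response separately. This is a simplification and not the method described above: it ignores the association across time points and across components. The function names, Gaussian kernel, bandwidth, and toy data are assumptions.

```python
import math


def local_linear(times, values, t, bandwidth):
    """Local linear kernel estimate of the mean function at time t."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for s, y in zip(times, values):
        d = s - t
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    # Closed-form intercept of the kernel-weighted least squares line at t.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


def mean_curves(times, responses, grid, bandwidth):
    """Apply the smoother to each component of a multivariate response."""
    n_components = len(responses[0])
    return [
        [local_linear(times, [r[k] for r in responses], t, bandwidth) for t in grid]
        for k in range(n_components)
    ]


# Toy pooled data: two response components observed at irregularly spaced times.
times = [0.0, 0.4, 1.1, 1.5, 2.3, 3.0, 3.2, 4.1, 4.8, 5.0]
responses = [(3.0 + 2.0 * s, 1.0 - 0.5 * s) for s in times]
curves = mean_curves(times, responses, grid=[0.0, 2.5, 5.0], bandwidth=1.0)
print(curves)
```

A local linear fit reproduces an exactly linear trend even at the ends of the time range, where a plain kernel average would be biased. That property is one common reason to prefer local polynomial smoothing.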
     In Chapter 3, we consider cases when the longitudinal behavior is univariate. Several different cases, including those with regularly spaced observation times, irregularly spaced observation times, and correlated observations, are discussed. Our proposed method is demonstrated using a real-data example from the SHARe Framingham Heart Study of the National Heart, Lung, and Blood Institute.
     In Chapter 4, we consider cases when the longitudinal behavior is multivariate. Our proposed multivariate dynamic screening system makes decisions about the longitudinal pattern of a subject by comparing it with other subjects cross-sectionally and by sequentially monitoring it as well. Related results show that it provides good performance in various cases.
