Abstract
An approach is proposed for analyzing the reliability and degradation process of web server cluster (WSC) systems. The reliability process is modeled as a non-homogeneous Markov process (NHMH) composed of several non-homogeneous Poisson processes (NHPPs). The arrival rate of each NHPP corresponds to the system software failure rate, which is expressed with Cox's proportional hazards model (PHM) in terms of both the cumulative and the instantaneous workload of the software. The cumulative workload is the software's cumulative execution time, and the instantaneous workload is the rate at which user requests arrive at a server. The analysis yields a time-varying description of reliability and degradation over the WSC lifetime. Finally, an evaluation experiment shows the effectiveness of the proposed approach.
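As a rough illustration of the two ingredients named in the abstract, the following Python sketch evaluates a proportional-hazards failure rate and the resulting NHPP survival probability. It is a minimal sketch under stated assumptions, not the paper's model: the function names `cox_hazard` and `nhpp_reliability`, the constant baseline hazard, the coefficient values, and the use of a fixed request rate are all hypothetical choices made for illustration. It relies only on the standard PHM form (baseline hazard times the exponential of a linear covariate term) and the standard NHPP result that the probability of no failure in [0, t] is the exponential of the negative integrated rate.

```python
import math


def cox_hazard(t, request_rate, beta_cum, beta_inst, baseline=lambda t: 1e-3):
    """Proportional-hazards failure rate at time t.

    Covariates (assumed for illustration):
      - cumulative load: the execution time t itself
      - instantaneous load: the user request arrival rate
    """
    return baseline(t) * math.exp(beta_cum * t + beta_inst * request_rate)


def nhpp_reliability(t_end, hazard, steps=1000):
    """P(no failure in [0, t_end]) for an NHPP with rate `hazard`.

    Integrates the rate with the trapezoidal rule and returns
    exp(-integral).
    """
    h = t_end / steps
    total = 0.5 * (hazard(0.0) + hazard(t_end))
    for i in range(1, steps):
        total += hazard(i * h)
    return math.exp(-total * h)


# Hypothetical coefficients: compare a lightly and a heavily loaded server.
light = nhpp_reliability(100.0, lambda t: cox_hazard(t, 10.0, 0.01, 0.05))
heavy = nhpp_reliability(100.0, lambda t: cox_hazard(t, 50.0, 0.01, 0.05))
```

With positive coefficients, the heavier request rate yields a lower survival probability over the same interval, which is the qualitative load dependence the abstract describes.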