Research on Fault-Tolerant Compilation Techniques for Register Soft Errors
Abstract
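Reliability has been one of the central concerns of computing research since computers first appeared. Although the reliability of modern computers has improved greatly, increasingly complex manufacturing processes and an ever-wider range of applications continue to pose new challenges. A soft error is a transient fault in a semiconductor circuit, typically induced by high-energy particle radiation from the external environment or by electrical noise such as voltage disturbances and electromagnetic interference. Soft errors such as single event upsets caused by cosmic-ray radiation have long been a major factor affecting the reliability of aerospace computers. As integrated-circuit manufacturing processes keep advancing, modern processors deliver much higher performance but also become increasingly sensitive to soft errors. After performance and power consumption, the trustworthiness of computation in the presence of soft errors has become an increasingly serious problem. In particular, because registers are accessed frequently yet are not well protected, soft errors occurring in registers are among the most critical factors affecting system reliability.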
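     Compared with hardware fault tolerance, software fault-tolerance techniques for soft errors have attracted wide attention because of their advantages in implementation cost and flexibility. Working at the level of program assembly code, this dissertation studies program analysis, error detection, and compiler optimization techniques for register soft errors from the perspective of program reliability. The main work falls into the following five parts: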
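     1. Quantitatively analyzing, from the perspective of the running program, how register soft errors affect reliability is the basis for designing and implementing efficient fault-tolerance algorithms. Based on program assembly code, this dissertation proposes ASER, a static method for analyzing program reliability under register soft errors. Building on an existing static analysis method, ASER first improves the precision of the results through an interprocedural analysis framework based on summary functions. Then, on top of register liveness analysis, it uses a graph-reachability traversal to extract the register live ranges that actually affect program execution. The ASER results show how program reliability under register soft errors relates to the program's own structure, and they quantify the relevant register live ranges. These results help identify the critical vulnerable points in a program and provide a theoretical basis for designing and implementing efficient detection and recovery techniques for register soft errors.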
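     2. ECC is an effective means of dealing with soft errors, but protecting every register with ECC is difficult in terms of power, chip area, and performance. This dissertation assumes that only some registers are ECC-protected and proposes a register reassignment method, RAPP. The method first builds an interference graph of register live ranges from the ASER results, and then uses a heuristic allocation algorithm based on hierarchical graph coloring to assign the ECC-protected registers, as far as possible, to the more critical live ranges. Compared with existing methods, RAPP gives the most pronounced improvement in program reliability while keeping power overhead in check.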
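     3. Software fault-tolerance techniques for data-flow errors usually rely on program replication, that is, executing the program several times and comparing the results to detect and recover from errors. Instruction-level replication has become a research focus because it detects errors well, is transparent to users, and is amenable to optimization. However, the consistency-comparison instructions in instruction replication are the most critical factor limiting the performance of the fault-tolerant program. To address this, the dissertation proposes COID, a checkpoint optimization method for instruction replication. Based on a data-flow analysis of error propagation and using system call instructions as boundaries, it gives a way to safely remove synchronizing comparison instructions while preserving the error detection rate. Fault injection and performance experiments show that COID improves the average performance of instruction-replicated programs by 12.78% without affecting the soft error detection rate.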
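     4. Soft errors can cause control-flow errors, and existing control-flow checking techniques fall short in performance overhead and detection rate. Starting from the program's control-flow graph, the dissertation first classifies basic blocks with a graph coloring algorithm, then proposes an effective control-flow checking method, ECCFS, based on formatted signatures of basic blocks, and gives extensions for checking control flow inside basic blocks and across procedures. Detection capability analysis and fault injection experiments show that ECCFS detects the great majority of control-flow errors. Compared with existing control-flow checking methods, ECCFS has some advantage in both error detection rate and performance overhead.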
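     5. At present, both hardware and software soft-error tolerance techniques incur some overhead in performance, power, and implementation cost. For register soft errors, this dissertation proposes SISER, a compiler optimization method for enhancing program reliability. Its basic idea is to use instruction scheduling to shorten the total live intervals of registers during program execution, that is, to reduce the effective region exposed to register soft errors, and thereby improve runtime reliability. Based on the ASER results and instruction dependence analysis, SISER gives a concrete basic-block scheduling algorithm in dynamic-programming form. Instruction scheduling experiments show that the reliability of the target programs improves by 4.41% on average with no noticeable overhead. Compared with traditional fault-tolerance techniques, the most distinctive feature of SISER is that it introduces no extra time or space overhead.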
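The register reassignment idea in item 2 (RAPP) can be illustrated with a small sketch. This is not the dissertation's hierarchical graph coloring algorithm, which the abstract does not spell out. It is a plain greedy priority coloring swapped in for illustration. The function name `assign_registers`, the criticality numbers, the interference edges, and the register names `r0` to `r2` are all hypothetical.

```python
def assign_registers(criticality, interference, protected, unprotected):
    """Greedy priority coloring (an assumption, not the RAPP algorithm).

    Live ranges are visited from most to least critical. Each takes the first
    register not already used by an interfering live range, trying the
    ECC-protected registers before the unprotected ones.
    """
    assignment = {}
    for live_range in sorted(criticality, key=criticality.get, reverse=True):
        # Registers already held by neighbours in the interference graph.
        taken = {assignment[n] for n in interference.get(live_range, ())
                 if n in assignment}
        for reg in list(protected) + list(unprotected):
            if reg not in taken:
                assignment[live_range] = reg
                break
        else:
            assignment[live_range] = None  # a real allocator would spill here
    return assignment


# Hypothetical input: four live ranges, one ECC-protected register (r0).
criticality = {"a": 0.9, "b": 0.5, "c": 0.7, "d": 0.1}
interference = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": set()}
result = assign_registers(criticality, interference, ["r0"], ["r1", "r2"])
print(result)
```

In this toy input the most critical live range `a` receives the protected register, while `b` and `c`, which interfere with it, fall back to unprotected ones. Live range `d` interferes with nothing, so it can share `r0`.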
Since the birth of the computer, reliability has been one of the most important concerns in computer science. Although the reliability of modern computers has improved significantly, increasingly complicated manufacturing technology and continuously expanding application areas mean that computer systems still face many new reliability challenges.
     Soft errors are transient faults in semiconductor circuits, caused by external radiation or electrical noise such as high-energy neutrons from cosmic rays, power glitches, and electromagnetic interference. Soft errors introduced by cosmic-ray radiation, e.g. single event upsets, have always affected the reliability of space computers. Moreover, as the scaling of VLSI technologies keeps raising performance, modern microprocessors are becoming more susceptible to soft errors. Following the performance and power walls, the dependability of computing in the presence of soft errors has emerged as a growing concern. Since register files (RFs) are accessed very frequently and cannot be well protected, soft errors occurring in them are one of the critical factors affecting program reliability.
     Compared with hardware-implemented fault tolerance for soft errors, software-implemented methods are attractive because of their advantages in cost and flexibility. To address soft errors occurring in RFs, this dissertation focuses on techniques for program analysis, error detection, and compiler optimization. The main work is divided into the following five parts:
     1. Analyzing the impact of soft errors occurring in RFs from the perspective of the program is valuable, as it is the foundation for implementing efficient fault tolerance technologies. Based on assembly code, this dissertation proposes a static approach, named ASER, which quantitatively analyzes the effect of soft errors on the reliability of a given program. Building on a previous static method, ASER calculates the probability that registers are live using an inter-procedural analysis framework of summary functions, which improves the accuracy of the final result. The concrete live ranges of registers are then extracted via a graph reachability method. Analytical experiments show that the reliability of a program is connected with its own structure. Moreover, the critical factors of all the live ranges involved are presented, which identify the vulnerabilities of a program under soft errors in RFs. These contributions support the implementation of efficient algorithms for tolerating soft errors.
     2. ECC coding is one of the most powerful and popular architectural error protection mechanisms for mitigating soft errors. However, it is difficult to fully protect RFs with ECC because of the significant penalty in power, area, and possibly performance. This dissertation assumes that the register file is only partially protected by ECC and presents a register reassignment method, named RAPP. First, the register interference graph is constructed from ASER's analysis of register live ranges. Then, through a hierarchical graph coloring algorithm, the ECC-protected registers are assigned to the most critical live ranges. Experimental results show that, compared with other available partial protection methodologies, RAPP improves program reliability significantly while taking power overhead into account.
     3. To address data flow errors caused by soft errors, instruction-level duplication techniques have been widely used because of their flexible and general implementation and their strong error detection capability. However, the large number of consistency check instructions is the fundamental limitation on program performance. This dissertation presents a checkpoint optimization method for instruction duplication, named COID. Based on a data flow analysis of error propagation, the method removes redundant comparison instructions within the boundaries set by system call instructions, without affecting the error detection rate. To illustrate its effectiveness, we perform several fault injection experiments and performance evaluations on a set of simple benchmark programs. Experimental results indicate that COID improves the average performance of instruction duplication by 12.78% without degrading the error detection rate.
     4. Control flow errors are a major effect of soft errors. Currently available control flow checking methods are deficient in performance overhead and checking capability. Using the control flow graph of the program, basic blocks are first categorized by a graph coloring algorithm. Then an effective control flow checking method, named ECCFS, is presented based on formatted signatures of basic blocks. Extended solutions are also proposed for intra-block and inter-procedural control flow checking, respectively. The analysis of checking capability and the results of fault injection experiments indicate that ECCFS can detect most control flow errors. Compared with typical control flow checking methods, ECCFS has advantages in error detection rate and performance overhead.
     5. A variety of methodologies have been proposed to address the effects of soft errors. Unfortunately, these techniques incur performance penalties, storage overhead, and economic costs to varying degrees. To enhance the runtime reliability of programs without extra cost, the dissertation presents a compiler optimization method, named SISER. Its basic idea is to decrease the total susceptible intervals that may be affected by soft errors during execution by rearranging the order in which code executes. Based on the analytical results of ASER, a detailed basic block scheduling algorithm is presented in the form of dynamic programming. Experimental results indicate that the average reliability of programs is improved by about 4.41%. SISER does not incur appreciable extra overhead, which is its outstanding characteristic compared with other traditional fault tolerance methodologies.
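The scheduling idea in item 5 (SISER) can be made concrete with a toy basic block. The sketch below is only an illustration under stated assumptions. It uses exhaustive search over legal instruction orders rather than the dissertation's dynamic-programming algorithm. It measures exposure simply as the number of instruction slots between a register's definition and its last use. It also assumes every register is written exactly once, so that only read-after-write dependences matter. The block contents and all function names are hypothetical.

```python
from itertools import permutations

# Each instruction is (destination register, source registers).
# Assumption: every register is written once, so only RAW dependences apply.
BLOCK = [
    ("a", ()),          # a = load ...
    ("b", ()),          # b = load ...
    ("c", ()),          # c = load ...
    ("d", ("a",)),      # d = f(a)
    ("e", ("b", "c")),  # e = b op c
    ("f", ("d", "e")),  # f = d op e
]


def total_live_length(order, block=BLOCK):
    """Sum over registers of (position of last use - position of definition)."""
    pos = {block[i][0]: p for p, i in enumerate(order)}
    last_use = dict(pos)  # a value that is never read has a zero-length interval
    for p, i in enumerate(order):
        for src in block[i][1]:
            last_use[src] = max(last_use[src], p)
    return sum(last_use[r] - pos[r] for r in pos)


def respects_dependences(order, block=BLOCK):
    """True if every source register is defined before it is read."""
    defined = set()
    for i in order:
        dest, srcs = block[i]
        if any(s not in defined for s in srcs):
            return False
        defined.add(dest)
    return True


def best_schedule(block=BLOCK):
    """Exhaustive search (not dynamic programming) over legal orders."""
    legal = (o for o in permutations(range(len(block)))
             if respects_dependences(o, block))
    return min(legal, key=lambda o: total_live_length(o, block))


original = tuple(range(len(BLOCK)))
best = best_schedule()
print(total_live_length(original), total_live_length(best))
```

For this block the original order keeps `a`, `b`, and `c` live across several unrelated instructions. Moving each consumer next to its producers shortens the summed intervals without adding instructions, which is the sense in which such a reordering costs no extra time or space.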
