Research on Key Techniques of Soft Error Tolerance Design for Multi-Core Microprocessors
Abstract
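Microprocessors exposed to harsh environments, such as high-energy particle strikes or noise interference, suffer transient faults. These transient faults may cause soft errors or even failures, which considerably affect microprocessor reliability. As integrated circuit manufacturing processes advance, the number of transistors that can be integrated on a single chip grows exponentially, so microprocessors face an increasingly severe soft error threat. Meanwhile, multi-core microprocessors have gradually become the market mainstream. Soft error tolerance techniques generally require some degree of redundancy, and the naturally redundant resources in multi-core microprocessors offer a new approach to soft error tolerance design. How to use these redundant resources effectively to strengthen a microprocessor's soft error tolerance, and thereby improve its reliability, has become a pressing problem; studying it in depth has significant theoretical and practical value.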
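     This thesis addresses a series of key techniques in soft error tolerance design for multi-core microprocessors. First, it studies soft error tolerant execution models for multi-core microprocessors. The execution model determines how a program runs efficiently, correctly, and reliably on a multi-core microprocessor, and it is the key to exploiting redundant multi-core resources for soft error tolerance. Second, it studies specific soft error hardening techniques. Any soft error tolerant microprocessor must apply hardening techniques at different levels to mask, detect, or recover from soft errors; this thesis focuses on gate-level redundancy and architecture-level control flow checking. Finally, it studies reliability evaluation models for microprocessors, so that reliability can be evaluated quantitatively early in the design flow and the results can effectively guide design choices and optimization.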
     The main contributions of this thesis are as follows:
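     (I) Two soft error tolerant execution models for multi-core microprocessors are proposed. (1) DCR, a dual-core redundant execution model based on context saving and recovery. In this model, two identical copies of a thread execute redundantly on two cores that support context saving and recovery. By enhancing the cores' functionality, the model recovers from soft errors effectively while keeping both the bandwidth demand on the dedicated fault-tolerance inter-core queue and the implementation complexity low. (2) TCR, a reconfigurable triple-core redundant execution model. This model increases core redundancy by executing three identical copies of a thread on three different cores, and it can be reconfigured dynamically once a soft error is detected. Soft errors are thus masked effectively with low bandwidth demand on the dedicated fault-tolerance inter-core queue and high execution performance.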
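     (II) Two gate-level redundancy structures based on asynchronous circuit techniques are proposed. (1) DMR, a dual modular redundancy structure based on the asynchronous C-element. This structure uses an asynchronous C-element to mask the outputs of the dual redundant units, which effectively reduces hardware redundancy: it masks SEU (Single Event Upset) faults while effectively reducing chip area overhead. (2) TSTMR, a temporal-spatial triple modular redundancy structure based on the asynchronous dual clock triggered register. Drawing on the explicitly separated master and slave latches of de-synchronized circuits in asynchronous design, the thesis proposes the dual clock triggered register (DCTREG). By using DCTREG, TSTMR applies temporal redundancy at the gate level and thus masks both SEU and SET (Single Event Transient) faults.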
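     (III) An enhanced control flow checking technique, ECFC, is proposed. It consists of a checking method and an implementation method. (1) A signature checking method based on nodes and edges. By assigning signatures to both the nodes and the edges of the control flow graph, the method checks control flow more strictly than the classical node-based signature checking method, and it rules out the misjudged illegal transfers and adjusting-signature conflicts that can occur in the classical method. (2) A combined hardware/software implementation method. The compiler inserts signature data into the program, and during execution a hardware check is triggered automatically after each control flow transfer instruction completes. This implementation offers small binary code size, high performance, and timely error detection.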
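     (IV) A reliability evaluation model that jointly accounts for chip area and performance overhead is proposed. The model uses a new quantitative evaluation metric to assess microprocessor reliability. With it, the reliability of microprocessors that adopt different soft error tolerance techniques can be evaluated accurately and quantitatively during the design flow, which helps guide design choices and optimization. The execution models, gate-level redundancy structures, and architecture-level control flow checking technique described above are also evaluated under this model.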
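     Through its study of soft error tolerant execution models, soft error hardening techniques, and reliability evaluation models, this thesis explores the design and implementation of soft error tolerant multi-core microprocessors. The implementation, verification, and evaluation results show that the above techniques are effective and can be applied to the design and implementation of soft error tolerant multi-core microprocessors.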
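     The triple-core redundant execution model (TCR) in item (I) can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions: the class name `TripleCoreGroup`, the majority vote, and the choice to drop a disagreeing core from the group are hypothetical and are not taken from the thesis, whose abstract does not say how reconfiguration is actually carried out.

```python
from collections import Counter


class TripleCoreGroup:
    """Toy model: redundant cores run the same thread and vote on each result.

    Hypothetical illustration only; not the thesis's TCR implementation.
    """

    def __init__(self, core_ids=(0, 1, 2)):
        self.active = list(core_ids)  # cores currently taking part in the vote

    def commit(self, outputs):
        """Vote on one result.

        `outputs` maps core id -> value produced for the same piece of work.
        Returns the majority value. A core that disagrees is removed from the
        group, which stands in for "reconfiguration" in this sketch (assumption).
        """
        values = [outputs[c] for c in self.active]
        winner, votes = Counter(values).most_common(1)[0]
        if votes * 2 <= len(values):
            # No strict majority: the error can be detected but not masked.
            raise RuntimeError("no majority among active cores")
        self.active = [c for c in self.active if outputs[c] == winner]
        return winner


group = TripleCoreGroup()
result = group.commit({0: 7, 1: 7, 2: 9})  # core 2 suffered a soft error
```

     After the call, `result` holds the majority value and `group.active` no longer contains the disagreeing core, so later votes run in a two-core configuration that can still detect, but no longer mask, a mismatch.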
One of the most critical challenges in modern microprocessor design is the transient fault caused by high-energy particles or random noise. Transient faults may cause soft errors, or even failures, that degrade microprocessor reliability. As integrated circuit technology advances, the transient fault rate of a single microprocessor keeps increasing exponentially along with the number of transistors per chip. Multi-core microprocessors have become mainstream in the last few years. Soft error tolerance design generally needs some form of redundancy, so the redundant cores in a multi-core microprocessor provide a potential solution. How to use these redundant resources efficiently to enhance reliability has become a research focus in recent years.
     This thesis details our research on key techniques for soft error tolerance design in multi-core microprocessors. First, we study the soft error tolerant execution model, which is the key to exploiting the redundant resources of multi-core microprocessors for soft error tolerance. Second, we study hardening techniques at the gate and architecture levels, which provide soft error masking, detection, or recovery. Finally, we study reliability evaluation models for microprocessors, so that the evaluation results can guide the design of highly reliable microprocessors.
     The primary contributions of this thesis are listed as follows.
     (I) Two soft error tolerant execution models for multi-core microprocessors are proposed. (1) The dual core redundancy execution model based on context saving and recovery (DCR) executes two copies of a given program on different cores. By enhancing the cores with context saving and recovery, it recovers from soft errors with low inter-core FIFO bandwidth demand and low implementation complexity. (2) The reconfigurable triple core redundancy execution model (TCR) executes three copies of a thread on different cores. Once a soft error is detected, the execution model can be reconfigured to mask the failed core. Soft errors are thus masked with low inter-core FIFO bandwidth demand and high execution performance.
     (II) Two gate-level redundancy structures based on asynchronous circuits are proposed. (1) The dual modular redundancy structure based on the C-element (DMR) uses an asynchronous C-element to mask corrupted values in the dual redundant units. It reduces die area overhead while providing tolerance to SEU (single event upset) faults. (2) The temporal-spatial triple modular redundancy structure based on the dual clock triggered register (TSTMR) masks both SEU and SET (single event transient) faults. Using the same explicitly separated master and slave latch structure as a de-synchronized pipeline, the dual clock triggered register (DCTREG) uses one clock as the sample enable and another as the output enable, so temporal redundancy can be implemented at the gate level.
     (III) An enhanced control flow checking technique (ECFC) is proposed. It comprises a checking method and an implementation method. (1) The checking method assigns signatures to both the nodes and the edges of the control flow graph. It checks control flow more strictly than the typical node-based method and eliminates that method's misjudgment of illegal branches and its adjusting-signature conflicts. (2) The implementation method combines compiler-inserted signatures with hardware checking: the compiler inserts signature data into the code, and control flow transfer instructions then trigger the hardware checking operation. This implementation offers small binary code size, high performance, and real-time checking.
     (IV) A reliability evaluation model based on die area and performance overheads is proposed. The model uses a novel reliability metric to evaluate microprocessor reliability accurately and quantitatively. An evaluation framework is also proposed so that the reliability of different soft error tolerant techniques can be evaluated during the design flow and the results can guide the designer in choosing among hardening methods. The aforementioned soft error tolerant execution models, gate-level redundancy techniques, and architecture-level control flow checking technique have been evaluated using this model.
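     The C-element masking behind the DMR structure in item (II) can be illustrated in a few lines. This is a behavioral sketch only, assuming the textbook Muller C-element (the output follows the inputs when they agree and holds its previous value otherwise); the class name `CElement` is hypothetical and the code does not model the thesis's actual circuit.

```python
class CElement:
    """Behavioral model of a Muller C-element (textbook definition).

    The output changes only when both inputs agree; otherwise it holds.
    """

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        # Inputs agree: pass the value through. Inputs differ: keep old output.
        if a == b:
            self.out = a
        return self.out


# Two redundant copies of a storage element feed the C-element.
c = CElement(initial=0)
c.update(1, 1)           # both copies hold 1, so the output becomes 1
masked = c.update(0, 1)  # an upset flips one copy; the output stays at 1
```

     Because a single upset makes the two copies disagree, the C-element simply holds the last agreed value, which is why two copies plus a C-element can mask an upset that would otherwise need three copies and a voter.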
     This thesis explores soft error tolerance design for multi-core microprocessors by studying soft error tolerant execution models, hardening techniques at the gate and architecture levels, and reliability evaluation models. The experimental results demonstrate that these models and techniques are effective and can be used in the design and implementation of soft error tolerant multi-core microprocessors.
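     The idea of signing edges as well as nodes of the control flow graph, as in item (III), can be sketched as follows. This is a hypothetical illustration, not the ECFC algorithm: the class `EdgeSignatureChecker`, its methods, and the signature assignment are assumptions made for this example, and basic-block names stand in for node signatures.

```python
class EdgeSignatureChecker:
    """Toy control flow checker with one unique signature per legal CFG edge.

    Hypothetical scheme for illustration; not the thesis's ECFC design.
    """

    def __init__(self, edges):
        # "Compile time": give every legal edge (src, dst) a unique signature.
        self.edge_sig = {edge: i + 1 for i, edge in enumerate(sorted(edges))}
        # For each node, the set of signatures of its legal incoming edges.
        self.incoming = {}
        for (src, dst), sig in self.edge_sig.items():
            self.incoming.setdefault(dst, set()).add(sig)
        self.register = None  # runtime signature register

    def leave(self, src, intended_dst):
        """Run with the branch: record the signature of the intended edge."""
        self.register = self.edge_sig[(src, intended_dst)]

    def arrive(self, actual_dst):
        """Check after the transfer: is the recorded edge a legal way in?"""
        return self.register in self.incoming.get(actual_dst, set())


cfg = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
checker = EdgeSignatureChecker(cfg)
checker.leave("A", "B")
ok = checker.arrive("B")       # legal transfer A -> B
checker.leave("A", "B")
caught = not checker.arrive("C")  # a fault diverted the branch to C
```

     In this sketch a diverted branch is caught even though A -> C is itself a legal edge, because the recorded signature belongs to A -> B. That is the kind of case a purely node-based check may have more trouble distinguishing.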
