A Hybrid Fault-Tolerant Architecture for Highly Reliable Processing Cores
  • Authors: I. Wali ; Arnaud Virazel ; A. Bosio ; P. Girard…
  • Keywords: Fault tolerance ; Microprocessor ; Single event transient ; Permanent fault ; Delay fault ; Power consumption ; High dependability ; Fault injection
  • Journal: Journal of Electronic Testing
  • Publication year: 2016
  • Publication date: April 2016
  • Volume: 32
  • Issue: 2
  • Pages: 147-161
  • Full-text size: 1,542 KB
  • Author affiliations: I. Wali (1)
    Arnaud Virazel (1)
    A. Bosio (1)
    P. Girard (1)
    S. Pravossoudovitch (1)
    M. Sonza Reorda (2)

    1. Laboratoire d’Informatique de Robotique et de Microélectronique de Montpellier, University of Montpellier / CNRS, 161 rue Ada, 34392, Cedex 5, Montpellier, France
    2. Politecnico di Torino, Torino, Italy
  • Journal category: Engineering
  • Journal subjects: Circuits and Systems
    Electronic and Computer Engineering
    Computer-Aided Engineering and Design
  • Publisher: Springer Netherlands
  • ISSN: 1573-0727
Abstract
The increasing vulnerability of transistors and interconnects due to scaling continuously challenges the reliability of future microprocessors. Lifetime reliability is gaining attention over performance as a design factor, even for lower-end commodity applications. In this work, we present a low-power hybrid fault-tolerant architecture that improves the reliability of pipelined microprocessors by protecting their combinational logic. The architecture handles a broad spectrum of faults with little impact on performance by combining different types of redundancy. Moreover, it addresses the problems of error propagation in nonlinear pipelines and error detection in pipeline stages with memory interfaces. Our case-study implementation of a fault-tolerant MIPS microprocessor highlights four main advantages of the proposed solution over triple modular redundancy (TMR) architectures: (i) an 11.6% power saving, (ii) improved transient error detection capability, (iii) improved lifetime reliability, and (iv) more effective handling of fault accumulation effects. We also present a gate-level fault-injection framework that models physical defects and transient faults with high fidelity.
