Research on the Software Fault-Tolerance Performance of COTS Microprocessors
Abstract
The US ARGOS satellite carried out in-orbit experiments on the radiation tolerance of commercial off-the-shelf (COTS) components protected by software-implemented hardware fault tolerance (SIHFT). The results showed that software techniques alone, without dedicated hardware, can meet the reliability requirements of space applications. This finding strongly supports the use of COTS components in spacecraft and broadens their prospects there. Selecting suitable devices from the many capable COTS components available has become one of the main research directions in space system development. The microprocessor is an indispensable core component of a spacecraft. It is therefore necessary to study the architectures of the mainstream COTS microprocessors used in space, to compare the fault-tolerance effectiveness and performance overhead that software fault-tolerance techniques produce on them, and to identify microprocessor architectures better suited to software fault tolerance. Such work has practical significance for the future selection of space microprocessors.
     This dissertation first studies the basic ideas and algorithms of two key SIHFT techniques: control-flow checking by software signatures (CFCSS) and error detection by duplicated instructions (EDDI). It then studies the architectures and instruction sets of two microprocessors, ARM7 and SPARC V8. On that basis, it compares and theoretically analyzes the fault-tolerance effectiveness and performance overhead of SIHFT on the two architectures, and proposes the conditions a microprocessor should satisfy to be suitable for this kind of software fault tolerance. It also analyzes the key problems in implementing SIHFT on the SPARC V8 platform and gives algorithms and technical schemes that address them. Finally, simulation experiments measure the error-detection coverage of the software error-detection techniques suited to the target platform. Results from the TSIM simulator show that, at an average performance overhead of 133.2%, detection coverage is 95.3% for faults injected into registers and 79.8% for faults injected into memory, which indicates that SIHFT is effective and feasible. Because the experimental data for ARM7 are not yet complete, the two platforms cannot currently be compared quantitatively. The theoretical analysis nevertheless suggests that SPARC V8 performs markedly better than ARM7 when implementing SIHFT. This conclusion remains to be confirmed by further experiments.
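As a rough illustration of the signature-based control-flow checking named in the abstract, the Python sketch below simulates the general CFCSS idea: each basic block gets a unique signature, a runtime register G is updated by XOR with a precomputed difference on entry to each block, and a mismatch between G and the block's signature flags an illegal jump. This is only a sketch under stated assumptions. The four-block control-flow graph, the signature values, and the names `instrument` and `run` are invented for this example and are not taken from the dissertation, which targets the SPARC V8 platform. The sketch also assumes each block precedes at most one fan-in block.

```python
# Minimal CFCSS-style simulation (illustrative assumptions throughout).

def instrument(preds, sig):
    """Precompute signature differences and run-time adjusting values.

    preds: block -> list of legal predecessor blocks (hypothetical CFG)
    sig:   block -> unique signature (hypothetical values)
    """
    diff, adjust, fan_in = {}, {}, set()
    for node, ps in preds.items():
        if not ps:                      # entry block has no predecessor
            diff[node] = 0
            continue
        base = ps[0]                    # first predecessor is the reference
        diff[node] = sig[base] ^ sig[node]
        if len(ps) > 1:                 # fan-in block needs the adjusting value D
            fan_in.add(node)
            for p in ps:
                adjust[p] = sig[base] ^ sig[p]
    return diff, adjust, fan_in


def run(path, sig, diff, adjust, fan_in):
    """Replay a block sequence; return the index where an error is detected, else None."""
    g = sig[path[0]]                    # G starts at the entry signature
    d_reg = adjust.get(path[0], 0)      # D register
    for i, node in enumerate(path[1:], start=1):
        g ^= diff[node]
        if node in fan_in:
            g ^= d_reg
        if g != sig[node]:              # signature mismatch: illegal control flow
            return i
        d_reg = adjust.get(node, d_reg)
    return None


# Diamond-shaped CFG: 0 -> 1 -> 3 and 0 -> 2 -> 3
PREDS = {0: [], 1: [0], 2: [0], 3: [1, 2]}
SIG = {0: 0b0001, 1: 0b0010, 2: 0b0011, 3: 0b0100}
DIFF, ADJUST, FAN_IN = instrument(PREDS, SIG)

legal_a = run([0, 1, 3], SIG, DIFF, ADJUST, FAN_IN)
legal_b = run([0, 2, 3], SIG, DIFF, ADJUST, FAN_IN)
skipped = run([0, 3], SIG, DIFF, ADJUST, FAN_IN)        # fault jumps past a block
sideways = run([0, 1, 2], SIG, DIFF, ADJUST, FAN_IN)    # fault jumps between branches
```

Both legal paths pass every check, while the two faulty sequences are flagged at the first block reached by an illegal transfer. In a real deployment the XOR and compare would be inserted into each basic block as instructions, which is the source of the kind of run-time overhead the abstract reports.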
