Design and Implementation of an Availability Evaluation Tool Based on SCSI Fault Injection
Abstract
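Information technology is moving from a computing era centered on computing devices and a network era centered on switches into a storage era centered on storage. For large-capacity storage devices and storage systems, accurately evaluating application-level performance and availability is a key open problem. Fault injection accelerates system failure by deliberately introducing faults and, by observing how the system behaves after a fault occurs, can comprehensively reflect the system's overall performance. Because mass storage systems have high I/O bandwidth, high concurrency, and large capacity, existing fault injection tools cannot effectively evaluate their computing availability.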
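     This thesis targets mass storage disk systems and defines a SCSI disk I/O fault model comprising four fault types: read/write failure, bus busy, disk hang, and command queue overflow. It designs and implements a fault injection tool based on the SCSI protocol. The tool uses interface functions provided by the SCSI mid-layer to intercept commands from the SCSI upper layer and then modifies them, which allows it to inject a variety of transient and permanent faults and to simulate effectively the faults a storage system may encounter. Comparing the system's application-level performance before a fault and during a fault yields a computing availability metric for the storage system.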
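The fault model and the interception step described above can be sketched in user-space Python. This is only an illustration under stated assumptions, not the thesis's kernel implementation: the names `FaultType`, `Fault`, `Command`, and `intercept` are hypothetical, the mapping from each fault type to a SCSI status byte is an assumption, and the sketch returns a status directly rather than rewriting the command before handing it to the lower layer. The status and opcode values themselves are the standard SCSI ones.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

# Standard SCSI status byte values.
GOOD = 0x00
CHECK_CONDITION = 0x02
BUSY = 0x08
TASK_SET_FULL = 0x28

# Standard SCSI opcodes for READ(10) and WRITE(10).
READ_10 = 0x28
WRITE_10 = 0x2A


class FaultType(Enum):
    RW_FAILURE = "read/write failure"
    BUS_BUSY = "bus busy"
    DISK_HANG = "disk hang"
    QUEUE_OVERFLOW = "command queue overflow"


# Assumed mapping from fault type to the status the upper layer observes.
# None stands for "the command never completes" (a hang).
FAULT_STATUS = {
    FaultType.RW_FAILURE: CHECK_CONDITION,
    FaultType.BUS_BUSY: BUSY,
    FaultType.DISK_HANG: None,
    FaultType.QUEUE_OVERFLOW: TASK_SET_FULL,
}


@dataclass
class Fault:
    kind: FaultType
    # None means a permanent fault; an integer is the number of
    # commands a transient fault will still affect.
    remaining: Optional[int] = None

    def active(self) -> bool:
        return self.remaining is None or self.remaining > 0

    def consume(self) -> None:
        if self.remaining is not None:
            self.remaining -= 1


@dataclass
class Command:
    opcode: int
    lba: int


def intercept(cmd: Command, fault: Optional[Fault],
              lower_layer: Callable[[Command], int]) -> Optional[int]:
    """Sit between the upper and lower layers and apply the fault, if any."""
    if fault is None or not fault.active():
        return lower_layer(cmd)
    # Read/write failures only affect read and write commands.
    if fault.kind is FaultType.RW_FAILURE and cmd.opcode not in (READ_10, WRITE_10):
        return lower_layer(cmd)
    fault.consume()
    return FAULT_STATUS[fault.kind]
```

A transient fault is created with a count, for example `Fault(FaultType.BUS_BUSY, remaining=2)`, and stops affecting commands once the count reaches zero; a permanent fault omits the count.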
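     The evaluation tool was used to assess the availability of a mass storage system built on a SAN architecture. The fault model and injection parameters were chosen according to the characteristics of the target system. After injection finished, the injection results were collected and analyzed offline. The computed injection success rate validated the effectiveness and practicality of the tool, and IOZONE measurements of the target system's file system read/write performance before and after fault injection were used to evaluate the target system's availability.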
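The two quantities mentioned above can be illustrated with a short Python sketch. The abstract does not give the exact formulas, so both definitions here are assumptions: the success rate is taken as the fraction of injections observed to take effect, and the performance comparison is taken as the ratio of throughput under fault to baseline throughput. The function names and the sample numbers are hypothetical, not results from the thesis.

```python
from typing import Sequence


def injection_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of injections that were observed to take effect.

    Each entry is True if the corresponding injected fault was
    observed in the collected results, False otherwise.
    """
    if not outcomes:
        raise ValueError("no injection records to analyze")
    return sum(outcomes) / len(outcomes)


def performance_ratio(baseline_kbps: float, faulted_kbps: float) -> float:
    """Throughput under fault relative to fault-free throughput.

    Assumed stand-in for the thesis's computing availability metric:
    1.0 means no degradation, 0.0 means no useful work was done.
    """
    if baseline_kbps <= 0:
        raise ValueError("baseline throughput must be positive")
    return faulted_kbps / baseline_kbps


# Made-up sample values, for illustration only.
rate = injection_success_rate([True, True, False, True])
ratio = performance_ratio(baseline_kbps=200000.0, faulted_kbps=150000.0)
```

In practice the throughput figures would come from IOZONE runs taken before and during injection, with one ratio computed per workload (for example read and write separately).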
With the development of information technology, the computing-centered and network-centered eras are giving way to a storage-centered era. While massive storage systems bring clear benefits, evaluating their performance and availability in practical use remains a core problem. Fault injection is a promising approach to this problem: it evaluates a system's overall performance by accelerating the occurrence of faults. Existing fault injection methods are not suited to massive storage systems because of their high I/O bandwidth, high concurrency, and huge capacity.
     In this thesis, the author defines a fault model for mass storage systems and designs and implements a fault injection tool based on SCSI. The tool intercepts SCSI commands from the upper layer, modifies them, and passes them to the lower layer. By modifying commands, it can inject many kinds of faults, both transient and permanent. Comparing performance before and after fault injection reveals the computing availability of the target system.
     The fault injection tool is applied to a massive storage system based on a SAN architecture, with the IOZONE tool collecting performance indices of the target system. The experimental results validate the effectiveness of the tool and give the computing availability of the system.
