Automatic detection of serious storage system errors via model checking.

详细信息

作者：Yang ; Junfeng.
学历：Doctor
年：2008
导师：Engler, Dawson
毕业院校：Stanford University
专业：Computer Science.
ISBN：9780549357254
CBH：3292433
Country：USA
语种：English
FileSize：8420437
Pages：145

文摘

Storage systems such as file systems, databases, and RAID (Redundant Array of Inexpensive Disks) systems have a simple, basic contract: you give them data, they do not lose or corrupt it. Often they store the only copy of data, making its irrevocable loss intolerable. Unfortunately, storage system code is exceptionally hard to get right, in part because it must correctly recover from any crash at any program point, no matter how their state was smeared across volatile and persistent memory. This dissertation presents FiSC and EXPLODE, two systems that are tailored to effectively, systematically check real storage systems for errors. FiSC is designed to check file systems. It adapts ideas from model checking, a comprehensive, heavyweight formal verification technique to make its checking systematic. It comes with a set of generic and file-system-specific checks that are applicable to many file systems. We applied FiSC to four widely used, heavily tested Linux file systems: ext3, JFS, ReiserFS and XFS. We found serious bugs in each of them, 33 in total, most of which led to Linux kernel patches within a few days of diagnosis. For each file system, FiSC found demonstrable events leading to the unrecoverable destruction of entire directory trees, including the file system root directory "/" (i.e. all data in the file system). FiSC shows that model checking can practically, effectively find interesting errors in real file systems.;While being more lightweight than a traditional model checking approach, FiSC still requires a heavily intrusive port of the checked Linux system to run inside the model checker FiSC uses, making it difficult to check other storage systems; in the best case, checking a new file system may take a week. Improving upon FiSC, EXPLODE uses a novel in-situ checking architecture that checks live storage systems running in their native environments, by interlacing the control it needs for systematic checking throughout the checked system (instead of "shoehorning" the checked system to run inside the model checker). This architecture gets most of FiSC's model checking benefits and drastically reduces the checking overhead: in the best case, clients write a few tens of C++ code to check a new storage system. This reduction enables EXPLODE to check many more storage systems than FiSC. Even for the same systems checked by both FiSC and EXPLODE, setting them up for checking in EXPLODE often takes just a few minutes, several orders of magnitude reduction compared to a week's effort in FiSC. Since the in-situ architecture simplifies checking of new storage systems, E XPLODE provides a flexible checking interface that allows developers to easily write new checkers and check new storage systems or compose complicated storage stacks from existing components for checking. We applied EXPLODE to a broader range of real-world storage systems (without requiring source code): three version control systems, Berkeley DB, the Linux NFS V3 implementation, ten Linux file systems, a RAID system, and the popular VMware GSX virtual machine. As with FiSC, we found serious bugs in every system checked, 36 bugs in total, typically with little effort. One interesting result is that EXPLODE detects that the checked version of VMware GSX server on Linux makes it impossible for storage systems to enjoy correct crash recovery however GSX is configured.

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700