Rethinking the design and implementation of the I/O software stack for high-performance computing.

Details

Author: Zhang, Xuechen
Degree: Doctor
Year: 2012
Advisor: Jiang, Song (advisor); Xu, Cheng-Zhong (committee member); Sarhan, Nabil J. (committee member); Shi, Weisong (committee member)
Institution: Wayne State University
Department: Computer Engineering
ISBN: 9781267750389
CBH: 3544437
Country: USA
Language: English
FileSize: 3877594
Pages: 163

Abstract

The current I/O stack for high-performance computing is composed of multiple software layers that shield users from the complexity of I/O performance optimization. However, each layer is usually designed and implemented separately, with limited consideration of its impact on other layers. This can result in suboptimal I/O performance because data access locality is weakened, if not lost, on hard disks, a storage medium widely used in high-end storage systems. In this dissertation, we experimentally demonstrate such issues in four layers: the operating system process management layer and the MPI-IO middleware layer on the compute server side, and the parallel file system layer and the disk I/O scheduling layer on the data server side. This dissertation makes four contributions, one addressing each of these issues. First, we propose a data-driven execution model for DualPar to explore opportunities for effective I/O scheduling that alleviates the I/O bottleneck through cooperation between the I/O and process schedulers. Its novelty lies in its ability to supply the I/O scheduler with a pool of pre-sorted requests in its data-driven execution mode by using process pre-execution and prefetching techniques. Second, realizing that the well-formed locality that collective I/O gives an MPI program can be seriously compromised by non-determinism in process scheduling, we propose Resonant I/O, which matches the data request pattern to the pattern of file striping over multiple data servers to improve disk efficiency. Third, since the conventional practice of achieving I/O parallelism through file striping may compromise on-disk data access locality, we propose the IOrchestrator scheduling framework, implemented in the PVFS2 parallel file system, which improves the I/O performance of multi-node storage systems by orchestrating I/O services among programs when such inter-data-server coordination is dynamically determined to be cost-effective. Fourth, we develop iTransformer, a scheme that employs a small SSD to schedule requests for data on disk. Because an SSD is less space-constrained than more expensive DRAM, iTransformer can buffer larger amounts of dirty data before writing it back to disk, or prefetch a larger volume of data into the SSD in a batch. In both cases, high disk efficiency can be maintained under highly concurrent requests.
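
The abstract's recurring argument is that a larger pool of buffered requests lets the scheduler serve them in an order that preserves on-disk locality. The following is a minimal sketch of that general idea only, not code from the dissertation: it assumes a toy one-dimensional disk where cost is the distance the head travels between offsets, and the names `head_travel` and `serve_in_batches` are hypothetical.

```python
def head_travel(offsets, start=0):
    """Total distance the head moves when serving offsets in the given order."""
    travel, pos = 0, start
    for off in offsets:
        travel += abs(off - pos)
        pos = off
    return travel


def serve_in_batches(offsets, batch_size):
    """Buffer up to batch_size requests, then serve each batch in ascending offset order."""
    order = []
    for i in range(0, len(offsets), batch_size):
        order.extend(sorted(offsets[i:i + batch_size]))
    return order


# Interleaved requests from two hypothetical programs touching distant disk regions.
arrival = [900, 10, 880, 30, 860, 50, 840, 70]

fifo = head_travel(arrival)                        # served in arrival order
small = head_travel(serve_in_batches(arrival, 2))  # small buffer, as with scarce DRAM
large = head_travel(serve_in_batches(arrival, 8))  # larger buffer, as with an SSD

print(fifo, small, large)
```

In this toy model the larger buffer yields a much shorter head path than either arrival order or small batches. Real disks, schedulers, and the systems named in the abstract involve many more factors (rotational delay, request deadlines, write-back policy) that the sketch ignores.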
