Abstract
A fundamental building block for an IaaS (Infrastructure-as-a-Service) cloud service such as Amazon's EC2 is a storage virtualization system that provides block-level storage services to individual virtual machines over the network. This dissertation addresses four major problems in such a block-level cloud storage system, in the context of an end-to-end IaaS solution called ITRI Cloud OS. First, to effectively eliminate redundancies in stored data blocks, we propose a scalable block-level deduplication engine called Sungem, which uses both sampling and prefetching to minimize the performance overhead of fingerprint accesses, and features a storage block garbage collection algorithm whose run-time overhead is proportional only to the size of the delta between consecutive backup operations. Second, to efficiently flush metadata updates associated with large-scale block-level storage management, we developed a novel storage system architecture called BOSC (Batching mOdifications with Sequential Commit), which commits updates to disk using largely sequential writes and is thus able to sustain high-throughput, low-latency metadata updates even though the updates themselves are largely random. Third, as part of the BOSC architecture, we invented a high-throughput, low-latency disk logging system called Beluga, which uses a carefully tuned disk write pipeline to deliver, on an array of three commodity 7200 RPM SATA disks, close to 5 million fine-grained (64-byte) disk logging operations per second, a rate close to the maximum possible bandwidth on a commodity disk, while keeping the latency of each logging operation under 1 ms. Finally, we devised a set of techniques for supporting software-defined storage service on a distributed and replicated storage architecture. Specifically, we developed a distributed storage QoS guarantee system called Cheetah, which provides a bandwidth guarantee to each virtual disk attached to a virtual machine, while ensuring that the loads on the distributed storage nodes are balanced and that the locality of the access stream associated with each virtual disk is preserved as much as possible.