REDU: reducing redundancy and duplication for multi-failure recovery in erasure-coded storages
Abstract
Data reliability is a significant issue in large-scale storage systems. Erasure codes provide high data reliability through data recovery, which, however, generates a large amount of network traffic. The bandwidth cost of this recovery traffic significantly degrades the performance of the cluster in which the data is located. Existing work treats single failure as the most common failure pattern and focuses mainly on reducing the data transmission cost of single-failure recovery; as a result, it does not support multi-failure recovery efficiently. In this work, we first present the Mean Time To Multi-Failure metric, based on a Markov model, to characterize the frequency and pattern of multi-failures in erasure-coded storage. We then propose REDU to reduce duplication and redundancy in multi-failure recovery. REDU comprises merging-based de-duplication, which reduces duplicated data transmission; aggregating-based de-redundancy, which reduces redundant information transmission; and cooperative routing, which applies the two schemes efficiently according to the practical cluster topology. Analysis and experimental results demonstrate the importance of the multi-failure recovery problem and the efficiency of REDU.
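
The abstract does not spell out how REDU's aggregation works, so the following is only a generic illustration of the linearity property that aggregation-style recovery schemes commonly exploit, not the paper's algorithm. It assumes a simple XOR parity code and a hypothetical two-rack layout; the names `xor_blocks`, `rack_a`, and `rack_b` are invented for this sketch.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Hypothetical stripe: 4 data blocks plus one XOR parity block.
data = [bytes([i] * 4) for i in (1, 2, 3, 4)]
parity = xor_blocks(data)

lost = data[0]                       # the block to be recovered
rack_a = [data[1]]                   # survivors in the replacement node's rack
rack_b = [data[2], data[3], parity]  # survivors in a remote rack

# Without aggregation, every remote block crosses the rack boundary.
naive_cross_rack = len(rack_b)

# With aggregation, the remote rack XORs its blocks locally and
# ships a single partial result instead.
partial_b = xor_blocks(rack_b)
aggregated_cross_rack = 1

# The replacement node combines local blocks with the partial result.
recovered = xor_blocks(rack_a + [partial_b])
```

Because XOR is associative and commutative, combining blocks inside the remote rack first gives the same recovered block while sending one block across racks instead of three. Codes such as Reed-Solomon are also linear, so the same idea applies there with finite-field coefficients in place of plain XOR.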
