Abstract
Cluster platforms play an important role in high performance computing (HPC). They run cloud computing, data-intensive computing and data center applications, which rely on distributed file systems. Data redundancy in these file systems provides high availability and fault tolerance. This work proposes an implementation of redundant data storage that uses the storage already included in the cluster nodes, instead of more expensive approaches based on dedicated storage and network, and that uses multi-multicast transfers, instead of unicast transfers, to perform the multiple simultaneous data diffusions required for redundant data storage. The proposal applies a recently proposed congestion control scheme that adjusts the sender injection rate according to control information from the receiver nodes and to the storage technology available on the cluster nodes. The implementation takes full advantage of the switch diffusion hardware and of the IGMP snooping capability of current switches, which forwards a multicast packet only to the output links that have receivers joined to the multicast group. It is implemented at user level, directly on the UDP interface. Evaluation tests with multiple simultaneous storage accesses were performed on a CentOS cluster. The results show a more efficient use of the cluster storage: global bandwidth improves because the storage-related hardware (network and storage devices) is used more efficiently.
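As an illustration of what a user-level multicast transfer over UDP can look like, the following minimal Python sketch joins a multicast group (which triggers the IGMP membership report that a snooping switch relies on) and splits a data block into sequence-numbered datagrams. It is not the implementation evaluated in this work: the group address, port, chunk size, 4-byte header and all function names are assumptions made for the example, and the congestion control scheme and any loss recovery are omitted.

```python
import socket
import struct

GROUP = "239.1.2.3"  # hypothetical administratively scoped multicast group
PORT = 5007          # hypothetical UDP port


def membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq value expected by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def make_datagrams(data: bytes, chunk: int = 1400) -> list:
    """Split a block into datagrams, each prefixed by a 4-byte sequence number."""
    return [
        struct.pack("!I", seq) + data[off:off + chunk]
        for seq, off in enumerate(range(0, len(data), chunk))
    ]


def open_receiver(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Open a UDP socket and join the group on the default interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining makes the kernel send an IGMP membership report; a switch with
    # IGMP snooping can then forward the group's traffic to this port only.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_block(data: bytes, group: str = GROUP, port: int = PORT) -> int:
    """Multicast one block to every receiver joined to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # TTL 1 keeps the traffic inside the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", 1))
    datagrams = make_datagrams(data)
    for datagram in datagrams:
        sock.sendto(datagram, (group, port))
    sock.close()
    return len(datagrams)
```

In this sketch a single `send_block` call reaches all replicas at once, whereas a unicast design would repeat the transfer once per replica node; a real sender would additionally pace its datagrams according to receiver feedback.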