MapReduce short jobs optimization based on resource reuse
Abstract
Hadoop is an open-source implementation of MapReduce for processing large datasets in a massively parallel manner. It was designed to execute large-scale jobs across a large number of nodes that provide both computing and storage. However, Hadoop is also frequently used to process short jobs, which in practice suffer from poor response times and run inefficiently. To address this gap, this paper analyzes the job execution process and identifies the reasons why short jobs run inefficiently in Hadoop. Based on the observation that tasks execute in multiple waves when the cluster is overloaded, we develop a resource-reuse mechanism to optimize short-job execution. This mechanism reduces the frequency of resource allocation and recovery. Experimental results suggest that the mechanism improves resource utilization and can significantly reduce the runtime of short jobs.
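The effect of resource reuse on allocation frequency can be illustrated with a toy counting model. This is only a sketch under simplifying assumptions and not the paper's mechanism: the function names `count_waves` and `count_allocations` are hypothetical, all tasks are treated as identical, and a "slot" stands in for whatever unit of resource the scheduler hands out. Without reuse, every task triggers one allocation and one recovery; with reuse, a slot is allocated once and kept across waves.

```python
def count_waves(num_tasks: int, num_slots: int) -> int:
    """Number of waves needed to run num_tasks tasks on num_slots slots."""
    if num_tasks < 0 or num_slots <= 0:
        raise ValueError("num_tasks must be >= 0 and num_slots must be > 0")
    # Integer ceiling division: each wave runs at most num_slots tasks.
    return (num_tasks + num_slots - 1) // num_slots


def count_allocations(num_tasks: int, num_slots: int, reuse: bool) -> int:
    """Count slot allocations in a toy model (assumption, not Hadoop's accounting).

    reuse=False: each task gets a freshly allocated slot that is recovered
                 when the task finishes, so allocations equal num_tasks.
    reuse=True:  a slot is allocated once and handed to the next pending
                 task in later waves, so allocations equal the number of
                 slots actually used.
    """
    if num_tasks < 0 or num_slots <= 0:
        raise ValueError("num_tasks must be >= 0 and num_slots must be > 0")
    if reuse:
        return min(num_tasks, num_slots)
    return num_tasks


if __name__ == "__main__":
    tasks, slots = 100, 10
    print("waves:", count_waves(tasks, slots))
    print("allocations without reuse:", count_allocations(tasks, slots, reuse=False))
    print("allocations with reuse:", count_allocations(tasks, slots, reuse=True))
```

In this model the saving grows with the number of waves, which matches the abstract's focus on multi-wave execution under cluster overload; it says nothing about the actual runtime gains, which depend on the per-allocation cost measured in the paper.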
