Scaling up MapReduce-based Big Data Processing on Multi-GPU systems
Details
  • Authors: Hai Jiang (1)
    Yi Chen (1)
    Zhi Qiao (1)
    Tien-Hsiung Weng (2)
    Kuan-Ching Li (2)

    1. Department of Computer Science, Arkansas State University, Jonesboro, AR, USA
    2. Department of Computer Science and Information Engineering, Providence University, Taichung, 43301, Taiwan
  • Keywords: GPU; Multi-GPU; MapReduce; Pipeline; Big Data; Parallel processing
  • Journal: Cluster Computing
  • Year: 2015
  • Published: March 2015
  • Volume: 18
  • Issue: 1
  • Pages: 369-383
  • Full-text size: 1,692 KB
  • References: 1. Jiang, H, Chen, Y, Qiao, Z, Li, K-C, Ro, W, Gaudiot, J-C (2013) Accelerating MapReduce framework on multi-GPU systems. Cluster Computing. Springer, Berlin, pp. 1-9
    2. Cubieboards: an Open ARM Mini PC, http://www.cubieboard.org 2014
    3. CUDA Programming Guide 6.0, NVIDIA, 2014
    4. Dean, Jeffrey, Ghemawat, Sanjay (2008) MapReduce: simplified data processing on large clusters. Commun. ACM 51: pp. 107-113
    5. Chen, Y, Qiao, Z, Jiang, H, Li, K-C, Ro, WW (2013) MGMR: multi-GPU based MapReduce. Grid and Pervasive Computing. Springer, Berlin, pp. 433-442
    6. Bollier, D, Firestone, CM (2010) The Promise and Peril of Big Data. Aspen Institute, Washington, DC
    7. Jinno, R., Seki, K., Uehara, K.: Parallel distributed trajectory pattern mining using MapReduce. In: Proceedings of IEEE 4th International Conference on Cloud Computing Technology and Science, pp. 269-273, 2012
    8. Lee, D, Dinov, I, Dong, B, Gutman, B, Yanovsky, I, Toga, AW (2012) CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms. Comput. Methods Programs Biomed. 106: pp. 175
    9. Raina, R., Madhavan, A., Ng, A.Y.: Large-scale deep unsupervised learning using graphics processors. In: Proceedings of the 26th International Conference on Machine Learning, Canada, 2009
    10. Fadika, Z., Dede, E., Hartog, J., Govindaraju, M.: MARLA: MapReduce for heterogeneous clusters. In: Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 49-56, 2012
    11. Stuart, J.A., Owens, J.D.: Multi-GPU MapReduce on GPU clusters. In: Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium, pp. 1068-1079, 2011
    12. Foster, I., Kesselman, C.: The Grid 2: blueprint for a new computing infrastructure, Morgan Kaufmann, 2003
    13. Czajkowski, K., Fitzgerald, S., Foster, I., Kesselman, C.: Grid information services for distributed resource sharing. In: Proceedings of 10th IEEE International Symposium on High Performance Distributed Computing, pp. 181-194, 2001
    14. White, T (2012) Hadoop: The Definitive Guide. O'Reilly Media, Sebastopol
    15. Chen, L., Huo, X., Agrawal, G.: Accelerating MapReduce on a coupled CPU-GPU architecture. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, 2012
    16. Nakada, H., Ogawa, H., Kudoh, T.: Stream processing with big data: SSS-MapReduce. In: Proceedings of 2012 IEEE 4th International Conference on Cloud Computing Technology and Science, pp. 618-621, 2012
    17. Ji, F., Ma, X.: Using shared memory to accelerate MapReduce on graphics processing units. In: Proceedings of the IEEE International Parallel & Distributed Processing Symposium, pp. 805-816, 2011
    18. Chen, L., Agrawal, G.: Optimizing MapReduce for GPUs with effective shared memory usage. In: Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, pp. 199-210, 2012
    19. Shainer, G, Ayoub, A, Lui, P, Liu, T, Kagan, M, Trott, CR, Scantlen, G, Crozier, PS (2011) The development of Mellanox/NVIDIA GPU Direct over InfiniBand - a new model for GPU to GPU communications. Computer Science-Research and Development. Springer, Berlin, pp. 267-273
    20. Fang, Wenbin, He, Bingsheng, Luo, Qiong, Govindaraju, Naga K (2011) Mars: Accelerating MapReduce with Graphics Processors. IEEE Trans. Parallel Distrib. Syst. 22: pp. 608-620
    21. Elteir, M., Lin, H., Feng, W.C., Scogland, T.R.W.: StreamMR: an optimized MapReduce framework for AMD GPUs. In: IEEE 17th International Conference on Parallel and Distributed Systems, pp. 364-371, 2011
    22. Tuning CUDA Applications for Kepler, http://docs.nvidia.com/cuda/kepler-tuning-guide/
    23. Bell, N., Hoberock, J.: Thrust: a productivity-oriented library for CUDA. In: GPU Computing Gems: Jade Edition, Morgan Kaufmann, pp. 359-371, 2011
    24. Li, X, Lu, P, Schaeffer, J, Shillington, J, Wong, PS, Shi, H (1993) On the versatility of parallel sorting by regular sampling. Parallel Comput. 19: pp. 1079-1103
    25. Przydatek, B (2002) A fast approximation algorithm for the subset-sum problem. Int. Trans. Oper. Res. 9: pp. 437-459
    26. Fermi Compute Architecture White Paper, NVIDIA
    27. Yu, S, Tranchevent, L-C, De Moor, B, Moreau, Y (2012) Optimized data fusion for kernel k-means clustering. IEEE Trans. Pattern Anal. Mach. Intell. 34: pp. 1031-1039
  • Journal category: Computer Science
  • Journal subjects: Processor Architectures
    Operating Systems
    Computer Communication Networks
  • Publisher: Springer Netherlands
  • ISSN: 1573-7543
Abstract
MapReduce is a popular data-parallel processing model that, together with recent advances in computing technology, has been widely exploited for large-scale data analysis. The high demand for MapReduce has stimulated the investigation of MapReduce implementations on different architectural models and computing paradigms, such as multi-core clusters, Clouds, Cubieboards and GPUs. In particular, current GPU-based MapReduce approaches mainly focus on single-GPU algorithms and cannot handle large data sets, due to the limited GPU memory capacity. Building on the previous multi-GPU MapReduce version MGMR, this paper proposes an upgraded version, MGMR++, to eliminate the GPU memory limitation, and a pipelined version, PMGMR, to handle the Big Data challenge through both CPU memory and hard disks. MGMR++ extends MGMR with flexible C++ templates and CPU memory utilization, while PMGMR fine-tunes performance through the latest GPU features, such as streams and Hyper-Q, as well as hard disk utilization. Compared to MGMR (Jiang et al., Cluster Computing 2013), the proposed schemes achieve about a 2.5-fold performance improvement, increase system scalability, and allow programmers to write straightforward MapReduce code for Big Data.
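The pipelining the abstract attributes to PMGMR rests on standard CUDA streams and Kepler's Hyper-Q. The following minimal sketch is illustrative only and is not code from the paper: it stages input in pinned CPU memory and overlaps host-to-device copies, a placeholder "map" kernel, and device-to-host copies across several streams; the stream count, chunk size, and the doubling map function are assumptions made for the example.

// Illustrative CUDA sketch (assumed names and sizes; not the authors' code):
// overlap transfers and a placeholder map kernel across streams, the mechanism
// that Hyper-Q makes effective on Kepler-class GPUs.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void mapKernel(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * in[i];           // stand-in for a user-defined map function
}

int main() {
    const int    nStreams = 4;               // assumed stream count
    const int    chunk    = 1 << 20;         // elements per pipeline slice (assumed)
    const size_t bytes    = chunk * sizeof(int);

    // Pinned host buffers are required for asynchronous copies to overlap with kernels.
    int *hIn, *hOut;
    cudaHostAlloc((void**)&hIn,  nStreams * bytes, cudaHostAllocDefault);
    cudaHostAlloc((void**)&hOut, nStreams * bytes, cudaHostAllocDefault);
    for (int i = 0; i < nStreams * chunk; ++i) hIn[i] = i;

    int *dIn[nStreams], *dOut[nStreams];
    cudaStream_t streams[nStreams];
    for (int s = 0; s < nStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&dIn[s],  bytes);
        cudaMalloc(&dOut[s], bytes);
    }

    // Each stream runs copy-in -> map -> copy-out on its own slice; the slices overlap,
    // so data larger than GPU memory can be streamed through from CPU memory (or disk).
    for (int s = 0; s < nStreams; ++s) {
        cudaMemcpyAsync(dIn[s], hIn + s * chunk, bytes, cudaMemcpyHostToDevice, streams[s]);
        mapKernel<<<(chunk + 255) / 256, 256, 0, streams[s]>>>(dIn[s], dOut[s], chunk);
        cudaMemcpyAsync(hOut + s * chunk, dOut[s], bytes, cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();
    printf("hOut[1] = %d\n", hOut[1]);       // expect 2 with the doubling map above

    for (int s = 0; s < nStreams; ++s) {
        cudaStreamDestroy(streams[s]);
        cudaFree(dIn[s]);
        cudaFree(dOut[s]);
    }
    cudaFreeHost(hIn);
    cudaFreeHost(hOut);
    return 0;
}

The sketch shows only the GPU-stream layer; per the abstract, PMGMR additionally manages which slices reside in GPU memory, CPU memory, or on hard disk.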
