A Structured Parallel Programming Framework for Heterogeneous Computing
  • English title: A heterogeneous computing oriented structural parallel programming framework
  • Authors: LI An-min (李安民); JI Wei-xing (计卫星); LIAO Xin-yi (廖心怡); GAO Jian-hua (高建花); TAN Zhao-nian (谈兆年); WANG Yi-zhuo (王一拙); SHI Feng (石峰)
  • Affiliation: School of Computer Science & Technology, Beijing Institute of Technology
  • Keywords: heterogeneous computing; parallel computing; programming framework; structural programming
  • Journal: Computer Engineering & Science (journal code: JSJK)
  • Publication date: 2019-03-15
  • Publisher: Computer Engineering & Science (计算机工程与科学)
  • Year: 2019
  • Issue: v.41; No.291
  • Funding: National Natural Science Foundation of China (61300010)
  • Language: Chinese
  • Page: JSJK201903006
  • Page count: 9
  • CN: 03
  • ISSN: 43-1258/TP
  • Classification number: 44-52
Abstract
With the advent of the artificial intelligence era, heterogeneous computing plays an increasingly important role in fields such as deep learning and scientific computing. One bottleneck limiting the adoption of heterogeneous computing systems is the lack of an efficient software development framework. Existing programming frameworks such as OpenCL and CUDA, which support GPUs, DSPs, and FPGAs, are based on C/C++ and traditional parallel programming methods. As a result, software development efficiency is low, programs are hard to reason about and debug, and cooperation and scheduling among computing devices are difficult to handle flexibly. We propose a script-based structured parallel programming framework for heterogeneous computing platforms. It provides a structured parallel programming interface, supports the mapping of computing tasks to heterogeneous computing devices, and facilitates reasoning about and verifying parallel programs. We also design and implement a structured scheduling algorithm based on a genetic algorithm, which makes full use of the computing capability of heterogeneous systems and improves software development efficiency. Experimental results show that the proposed framework achieves a speedup of 1.5× to 2.5× over a single processor on a CPU+GPU platform.
References
[1] Zhang Jun. Research on MapReduce programming model for heterogeneous computing platforms[D].Jinan:Shandong University,2016.(in Chinese)
    [2] Sanders J,Kandrot E.CUDA by example:An introduction to general-purpose GPU programming[M].Boston:Addison-Wesley Professional,2010.
    [3] Gaster B R. Heterogeneous computing with OpenCL[M].Amsterdam:Elsevier,2012.
    [4] Wienke S,Springer P,Terboven C,et al.OpenACC:First experiences with real-world applications[C]//Proc of European Conference on Parallel Processing, 2012:859-870.
    [5] Frigo M,Leiserson C E,Randall K H.The implementation of the Cilk-5 multithreaded language [C]//Proc of ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation,1998:212-223.
    [6] Olszewski M,Ansel J,Amarasinghe S.Kendo:Efficient deterministic multithreading in software [C]//Proc of International Conference on Architectural Support for Programming Languages and Operating Systems,2009:97-108.
    [7] Ji W X,Lu L,Scott M L.TARDIS:Task-level access race detection by intersecting sets [C]//Proc of Workshop on Determinism and Correctness in Parallel Programming (WoDet),2013:1.
    [8] Bocchino R L,Adve V S,Dig D,et al.A type and effect system for deterministic parallel Java [J].ACM SIGPLAN Notices,2009,44(10):97-116.
    [9] Wang Lei, Cui Hui-min, Chen Li, et al. Research on task parallel programming model[J]. Journal of Software, 2013,24(1):77-90.(in Chinese)
    [10] Berger E D,Yang T,Liu T,et al.Grace:Safe multithreaded programming for C/C++[C]//Proc of ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications,2009:81-96.
    [11] McCool M D.Structured parallel programming with deterministic patterns [C]//Proc of the 2nd USENIX Conference on Hot Topics in Parallelism,2010:5.
    [12] Vandierendonck H, Pratikakis P, Nikolopoulos D S. Parallel programming of general-purpose programs using task-based programming models[C]//Proc of the 3rd USENIX Workshop on Hot Topics in Parallelism, 2011:1.
    [13] Pinto N, Lee Y, Catanzaro B, et al. PyCUDA and PyOpenCL:A scripting-based approach to GPU run-time code generation [J].Parallel Computing,2012,38(3):157-174.
    [14] Tristram W B, Bradshaw K L. Hydra:A Python framework for parallel computing[C]//Proc of Communication Process Architectures Conference, 2009:311-324.
    [15] Parr T J, Quong R W. ANTLR:A predicated-LL(k) parser generator[J]. Software Practice and Experience, 1995,25(7):789-810.
    [16] Tellez E S, Chávez E, Contreras-Castillo J. SPyRO:Simple Python remote objects[C]//Proc of LA-WEB Congress, 2006:39-46.
    [17] Bjørndalen J M, Vinter B,Anshus O J. PyCSP-communicating sequential processes for Python[C]//Proc of Communication Process Architectures Conference, 2007:229-248.
    [18] Akagi K,Suzuki T,Stephens R M,et al. RTCGD:Retroviral tagged cancer gene database[J].Nucleic Acids Research, 2004,32(Database-Issue):523-527.
    [19] Klöckner A,Pinto N,Lee Y,et al.PyCUDA:GPU run-time code generation for high-performance computing[J].Parallel Computing, 2009,38(3):157-174.
