Engine design and resource scheduling of a scientific workflow application platform in a supercomputing environment
  • Authors: Li Yufeng (李于锋); Mo Zeyao (莫则尧); Xiao Yonghao (肖永浩); Zhao Shicao (赵士操); Duan Bowen (段博文)
  • Affiliations: Institute of Computer Application, Chinese Academy of Engineering Physics; Institute of Applied Physics & Computational Mathematics, Beijing
  • Keywords: scientific workflow; high performance computing; resource scheduling; workflow engine
  • Journal code: JSYJ
  • Journal: Application Research of Computers (计算机应用研究)
  • Publication date: 2018-07-09 15:12
  • Publisher: Application Research of Computers
  • Year: 2019
  • Issue: v.36; No.332
  • Funding: National Key Research and Development Program of China (2016YFB0201504, 2018YFB0703903)
  • Language: Chinese
  • Page: JSYJ201906027
  • Page count: 5
  • CN: 06
  • ISSN: 51-1196/TP
  • Classification number: 129-132+142
Abstract
Scientific experiments and engineering applications often require many software packages to cooperate toward a single goal, and supercomputing resources need to be easy to use. To meet these needs, this paper presents a scientific workflow application platform for supercomputing (HPC) environments. The platform includes an asynchronous, highly concurrent workflow execution engine. The scheduler, planner, and engine are decoupled so that the three components can evolve independently: the scheduler implements the scheduling algorithms, the planner collects scheduling information, and the engine drives workflow execution. The scheduler and planner use a local resource pooling technique and an advance resource reservation algorithm to improve workflow execution performance. The paper also implements five commonly used scheduling algorithms, compares their performance on a set of test graphs, and gives recommendations for algorithm selection. Practical application shows that the engine supports flexible execution strategies for complex workflows and that the resource scheduling scheme enables efficient workflow execution in supercomputing environments.
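The separation between scheduling policy and execution engine described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `run_workflow` and `name_order_policy`, the dictionary-based DAG format, and the use of a semaphore as a stand-in for a fixed pool of compute slots are all assumptions made for illustration.

```python
import asyncio
from typing import Callable, Dict, List, Set

# Hypothetical policy type: takes the ready task names, returns them in dispatch order.
Policy = Callable[[List[str]], List[str]]


def name_order_policy(ready: List[str]) -> List[str]:
    """Placeholder policy (an assumption): dispatch ready tasks in name order."""
    return sorted(ready)


async def run_workflow(deps: Dict[str, Set[str]], policy: Policy, slots: int = 2) -> List[str]:
    """Run a DAG asynchronously; `deps` maps each task to its prerequisites."""
    pool = asyncio.Semaphore(slots)  # stands in for a fixed pool of compute slots
    done: Set[str] = set()
    started: Set[str] = set()
    finished_order: List[str] = []
    running: Set[asyncio.Task] = set()

    async def execute(name: str) -> str:
        async with pool:            # hold one slot while the task runs
            await asyncio.sleep(0)  # placeholder for the real computation
        return name

    while len(done) < len(deps):
        # A task is ready once all of its prerequisites have completed.
        ready = [t for t in deps if t not in started and deps[t] <= done]
        for name in policy(ready):  # the policy only decides ordering
            started.add(name)
            running.add(asyncio.create_task(execute(name)))
        if not running:
            raise ValueError("dependency cycle or undefined prerequisite")
        finished, running = await asyncio.wait(
            running, return_when=asyncio.FIRST_COMPLETED
        )
        for task in finished:
            done.add(task.result())
            finished_order.append(task.result())
    return finished_order


# Illustrative four-step workflow; the task names are made up.
deps = {"mesh": set(), "solve": {"mesh"}, "post": {"solve"}, "viz": {"solve"}}
order = asyncio.run(run_workflow(deps, name_order_policy))
```

Because the engine only calls `policy(ready)`, a different ordering rule can be swapped in without touching the execution loop, which is the kind of independence the decoupled design aims for.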
References
[1] Top500.org. Top 500 list[EB/OL]. (2017-11-30)[2018-05-08]. https://www.top500.org/lists/2017/11/.
[2] Obama. NSCI executive order 13702[EB/OL]. (2017-07-20)[2018-05-08]. https://obamawhitehouse.archives.gov/the-press-office/2015/07/29/executive-order-creating-national-strategic-computing-initiative.
[3] Xiao Fei, Zhang Weihua, Wang Donghui. Overview of workflow technology in scientific process[J]. Application Research of Computers, 2011, 28(11): 4013-4019. (in Chinese)
[4] Ludascher B, Altintas I, Berkley C, et al. Scientific workflow management and the Kepler system[J]. Concurrency and Computation: Practice & Experience, 2006, 18(10): 1039-1065.
[5] Deelman E, Singh G, Su M H, et al. Pegasus: a framework for mapping complex scientific workflows onto distributed systems[J]. Scientific Programming, 2005, 13(3): 219-237.
[6] Wolstencroft K, Haines R, Fellows D, et al. The Taverna workflow suite: designing and executing workflows of Web services on the desktop, Web or in the cloud[J]. Nucleic Acids Research, 2013, 1(1): 1-5.
[7] Wang Hongxia. Design and realization of grid workflow engine[J]. Computer Engineering and Design, 2011, 32(2): 430-433. (in Chinese)
[8] Shen Yu, Li Juan, Chang Biao, et al. Design and implementation of the uniform resource management system of HPC[J]. Computing Technology and Automation, 2014, 33(1): 83-90. (in Chinese)
[9] Topcuoglu H, Hariri S, Wu Minyou. Performance effective and low complexity task scheduling for heterogeneous computing[J]. IEEE Trans on Parallel and Distributed Systems, 2002, 13(3): 260-274.
[10] Shi Zhiao, Dongarra J J. Scheduling workflow applications on processors with different capabilities[J]. Future Generation Computer Systems, 2006, 22(6): 665-675.
[11] Kwok Y K, Ahmad I. Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors[J]. IEEE Trans on Parallel and Distributed Systems, 1996, 7(5): 506-521.
[12] Rahman M, Venugopal S, Buyya R. A dynamic critical path algorithm for scheduling scientific workflow applications on global grids[C]//Proc of the 3rd IEEE International Conference on e-Science and Grid Computing. Piscataway, NJ: IEEE Press, 2007: 35-42.
[13] Rahman M, Hassan R, Ranjan R, et al. Adaptive workflow scheduling for dynamic grid and cloud computing environment[J]. Concurrency and Computation: Practice & Experience, 2013, 25(13): 1816-1842.
[14] Chan W Y, Li C K. Heterogeneous dominant sequence cluster (HDSC): a low complexity heterogeneous scheduling algorithm[C]//Proc of IEEE Pacific Rim Conference on Communications, Computers and Signal Processing. Piscataway, NJ: IEEE Press, 1997: 956-959.
[15] Amalarethinam G, Selvi F K M. A minimum makespan grid workflow scheduling algorithms[C]//Proc of Conference on Computer Communication and Informatics. Piscataway, NJ: IEEE Press, 2012: 1-6.
[16] Patil V A, Chaudhary V. Rack aware scheduling in HPC data centers: an energy conservation strategy[J]. Cluster Computing, 2013, 16(3): 559-573.
[17] Chen Wei, Lee Y C, Fekete A, et al. Adaptive multiple-workflow scheduling with task rearrangement[J]. The Journal of Supercomputing, 2015, 71(4): 1297-1317.
[18] Wu Fuhui, Wu Qingbo, Tan Yusong. Workflow scheduling in cloud: a survey[J]. The Journal of Supercomputing, 2015, 71(9): 1-46.
[19] Suter F. DAGGEN[EB/OL]. (2017-07-26)[2018-05-08]. https://github.com/frs69wq/daggen.
[20] Casanova H, Giersch A, Legrand A, et al. Versatile, scalable, and accurate simulation of distributed applications and platforms[J]. Journal of Parallel and Distributed Computing, 2014, 74(10): 2899-2917.
