Research and Implementation of a Cost-Effective Resource Scheduling Mechanism in Cloud Computing

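English title: The Research and Implementation of Cost-Effective Resource Scheduling Mechanism in Cloud Computing Environment
Author: Hu Rui (胡睿)
Degree level: Master's
Discipline: Computer Science and Technology
Chinese keywords: Cloud Computing; IaaS Provider; Service Provider; Virtual Machine; Resource Scheduling; CloudSim
English keywords: Cloud Computing; Resource Scheduling; DAG; CloudSim
Degree year: 2013
Advisor: Su Sen (苏森)
Discipline code: 0812
Degree-granting institution: Beijing University of Posts and Telecommunications
Thesis submission date: 2012-12-27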

Abstract

With the rapid development of the Internet and Web 2.0, the volume of user requests that applications must handle keeps growing, putting heavy pressure on the local datacenters of service providers. To keep up, service providers must continually increase hardware investment in those datacenters. However, local datacenters suffer from high hardware procurement costs, difficult maintenance, and low resource utilization. Against this background, cloud computing has emerged as a new computing paradigm that offers high reliability, high scalability, and pay-as-you-go pricing.

To reduce cost and guarantee quality of service, more and more service providers choose to build their computing environments on resources rented from Infrastructure-as-a-Service (IaaS) providers. IaaS providers offer virtual machines (VMs) with different processing capacities and pricing models, and charge by lease duration. Different VM rental plans can lead to very different rental costs, which service providers must take into account. In addition, requests from parallel applications, which are generally abstracted as workflows, have become an important part of the requests that service providers handle. When handling such workflow requests, the main problem a service provider faces is how to design a VM scheduling strategy around the characteristics of workflow applications so that VM rental cost is reduced while the Service Level Agreement (SLA) is still met.

To address this problem, this thesis builds a VM resource scheduling model for the cloud computing environment and designs a scheduling algorithm for parallel application requests on behalf of service providers. First, it proposes a VM scheduling algorithm for a single workflow and, building on it, a VM resource scheduling algorithm for multiple workflows. Second, a VM scheduling policy module is developed on the CloudSim simulation platform to simulate the proposed algorithms and evaluate their performance. Finally, the algorithms are tested with real parallel application requests; analysis of the results shows that they effectively reduce VM rental cost while satisfying the SLA.

