Research on Host Load and Task Execution Time Prediction in Grid Environments
Abstract
Grid computing is currently one of the research hotspots in the communications field, both in China and abroad. A grid connects geographically dispersed computing resources into a collection of cooperating resources, yet the performance characteristics of those resources, such as load, task execution time, and network bandwidth, change constantly. As grids have developed, predicting resource performance metrics has therefore become increasingly important; it is one of the key technologies for scheduling grid tasks efficiently.
     However, research on resource performance prediction in existing grid resource performance monitoring systems still has shortcomings. Most monitoring systems monitor passively: their work concentrates on real-time data collection, statistical analysis, and after-the-fact decision-making, so they have little grasp of where grid resource performance is heading. Even in the few monitoring systems that do offer prediction, the mathematical prediction models are relatively simple, and the prediction accuracy of the performance metrics leaves room for improvement.
     Starting from the practical needs of grid resource performance prediction and taking the characteristics of grid resources into account, this thesis selects suitable prediction methods for resource performance. First, we give an overview of the architecture of a grid resource performance prediction system and summarize the properties such a system must satisfy. We also analyze in depth the two performance metrics this thesis focuses on: host load and task execution time.
     Second, building on the statistical characteristics of host load in grid environments, in particular its self-similarity and long-range dependence, we use fractal interpolation to predict future host load. On that basis, and exploiting the linear relationship between host load and task execution time, we study and analyze an algorithm that predicts task execution time from host load.
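     As a rough illustration of the second step, and not the algorithm developed in the thesis, the sketch below assumes that execution time grows linearly with host load (time ≈ a + b × load), fits a and b by ordinary least squares on past (load, execution time) pairs, and then converts a predicted load into a predicted execution time. The function names, the sample values, and the choice of least squares are all assumptions made for this example; the predicted load would come from a separate load predictor such as the fractal interpolation method.

```python
from statistics import mean


def fit_linear(loads, times):
    """Least-squares fit of times ~ a + b * loads; returns (a, b)."""
    if len(loads) != len(times) or len(loads) < 2:
        raise ValueError("need at least two paired samples")
    load_mean, time_mean = mean(loads), mean(times)
    sxx = sum((x - load_mean) ** 2 for x in loads)
    if sxx == 0:
        raise ValueError("loads must not all be equal")
    sxy = sum((x - load_mean) * (y - time_mean) for x, y in zip(loads, times))
    b = sxy / sxx
    a = time_mean - b * load_mean
    return a, b


def predict_execution_time(predicted_load, a, b):
    """Map a predicted host load to a predicted task execution time."""
    return a + b * predicted_load


# Hypothetical history: average host load during each run, and its runtime in seconds.
history_loads = [0.0, 1.0, 2.0, 3.0]
history_times = [10.0, 20.0, 30.0, 40.0]

a, b = fit_linear(history_loads, history_times)
estimate = predict_execution_time(1.5, a, b)  # 1.5 stands in for a predicted load
print(f"a={a:.2f}, b={b:.2f}, estimated time={estimate:.2f} s")
```

     Under this assumed model, any error in the load prediction carries over to the execution time estimate scaled by the slope b, which is why the quality of the load predictor matters for the second step.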
     Finally, we implement the host load and task execution time prediction algorithms in simulation and evaluate their predictive performance with metrics such as relative error, average relative error, and coverage rate. For host load prediction, we compare the prediction accuracy of the fractal interpolation algorithm against AR(16) and the load pattern method ("Patterns method"). For task execution time prediction, we experimentally examine the reliability of predicting task execution time from host load and report the resulting prediction accuracy.
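     The evaluation metrics named above can be sketched as follows. Relative error is taken here as |predicted − actual| / |actual| and average relative error as its mean over all samples. The abstract does not define coverage rate, so the version below, the fraction of predictions whose relative error stays within a chosen tolerance, is only an assumed definition for illustration. The function names and sample values are likewise hypothetical.

```python
def relative_errors(actual, predicted):
    """Per-sample relative error |predicted - actual| / |actual|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("relative error is undefined when an actual value is 0")
    return [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]


def average_relative_error(actual, predicted):
    """Mean of the per-sample relative errors."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def coverage_rate(actual, predicted, tolerance):
    """ASSUMED definition: share of predictions with relative error <= tolerance."""
    errors = relative_errors(actual, predicted)
    return sum(1 for e in errors if e <= tolerance) / len(errors)


# Hypothetical measured and predicted host loads.
actual_load = [1.0, 2.0, 4.0]
predicted_load = [1.1, 1.8, 5.0]

avg_error = average_relative_error(actual_load, predicted_load)
coverage = coverage_rate(actual_load, predicted_load, tolerance=0.2)
print(f"average relative error={avg_error:.3f}, coverage={coverage:.3f}")
```

     Because host load can be zero on an idle machine, a real evaluation would need a rule for such samples; this sketch simply rejects them.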
     In addition, we summarize the results of this thesis and point out directions for future work.
