Research on Key Techniques for System-Level Dynamic Thermal Management

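English title: System-Level Dynamic Thermal Management Key Techniques Research
Author: Shu Longhao (舒龙昊)
Degree level: Master's
Discipline: Computer Architecture
Keywords (Chinese): temperature-aware task scheduling (温度感知任务调度) ; workload characterization (工作负载刻画) ; cache miss distribution (高速缓存缺失分布) ; online learning (在线学习) ; time-slice scaling (时间片缩放) ; interval scheduling (间隔调度)
Keywords (English): temperature-aware scheduling ; workload characterization ; cache miss distribution ; online learning ; time-slice scaling ; alternative scheduling
Degree year: 2011
Advisor: Li Xi (李曦)
Discipline code: 081201
Degree-granting institution: University of Science and Technology of China (中国科学技术大学)
Thesis submission date: 2011-04-25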

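Abstract

Rapid advances in processor manufacturing technology have concentrated more and more computing resources on a very small chip. The growth in computing resources per unit area has sharply raised processor power density and local temperature, creating thermal hotspots. Uneven on-chip temperature poses many challenges for the design of processors and their cooling systems, and it can cause logic errors at run time or even permanent physical damage. Moreover, cooling systems are designed for the thermal state at the processor's peak workload, yet the processor is not heavily loaded most of the time, which means the design cost of the cooling system is effectively inflated. Low-power, low-energy, and thermal management techniques can control system temperature online, which reduces the design cost of the cooling system to some extent. Designing effective dynamic temperature control techniques for processors is therefore increasingly important. Building on prior work, this thesis surveys the strengths and weaknesses of the temperature control mechanisms supported by the architecture, such as dynamic voltage and frequency scaling (DVFS) and dynamic power management (DPM), and of operating-system-level resource management mechanisms, such as task scheduling and memory allocation. It ultimately adopts system-level dynamic thermal control, chiefly task scheduling, to control processor temperature, because this approach is more flexible and has a broader scope of control.
     The main research work in this thesis includes:
     1. Analyze the severe challenges that have emerged in the development of computers, and of processors in particular, and explain why controlling processor temperature is important and urgent. Survey the research results and the state of the art in processor temperature control in China and abroad.
     2. Analyze how workloads affect processor temperature, characterize the hot and cool behavior of workloads, and identify opportunities for online temperature control. Propose a workload characterization method that combines dynamic and static parameters, and use it to characterize tasks as hot or cool.
     3. Analyze the architecture-level run-time behavior of applications, propose a workload characterization method that combines the cache miss distribution with the average cycles per instruction (CPI), and use it to characterize tasks as hot or cool.
     4. Propose a temperature-aware task scheduling method based on heuristic matching rules, and also formalize temperature-aware task scheduling as an online learning model. Online learning is a branch of machine learning theory. The method uses all past system states as a basis for the current decision and makes the final choice by evaluating the loss of each candidate decision.
     5. Design the Time-Slice Scaling and Alternative Scheduling mechanisms to further lower the processor's peak temperature at run time and to shorten the time spent at peak temperature.
     6. Design and implement the above online control techniques in a real Linux operating system.
     The new contributions of this thesis are as follows:
     1. It applies machine learning theory to the online control of processor temperature and designs a complete online learning method, which gives the temperature control technique a degree of theoretical guarantee.
     2. It combines the cache miss distribution with the average cycles per instruction to characterize workloads, and on that basis characterizes how hot or cool each task is.
     3. It proposes Time-Slice Scaling and Alternative Scheduling to further lower the processor's peak temperature at run time and to shorten the time spent at peak temperature.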
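
The abstract names Time-Slice Scaling and Alternative Scheduling but does not define them. The following is a minimal sketch under one possible reading: a task's time slice shrinks as the task gets hotter, and hot tasks are interleaved with cool ones so the chip can cool down between hot runs. The function names, the `heat` score in [0, 1], and the `min_ratio` parameter are all hypothetical and are not taken from the thesis.

```python
from itertools import zip_longest

def scale_time_slice(base_ms, heat, min_ratio=0.25):
    """Shrink the time slice linearly as heat rises (0.0 = cool, 1.0 = hot).

    Assumption: a linear rule with a floor of min_ratio * base_ms.
    """
    heat = min(max(heat, 0.0), 1.0)          # clamp the score to [0, 1]
    ratio = 1.0 - (1.0 - min_ratio) * heat   # 1.0 for cool, min_ratio for hot
    return base_ms * ratio

def interleave(hot, cool):
    """Alternate hot and cool tasks so hot tasks are separated while cool tasks remain."""
    order = []
    for h, c in zip_longest(hot, cool):
        if h is not None:
            order.append(h)
        if c is not None:
            order.append(c)
    return order

# Hypothetical task names, for illustration only.
schedule = interleave(["h1", "h2"], ["c1", "c2", "c3"])
hot_slice = scale_time_slice(100, 1.0)
```

With these inputs, the two hot tasks are separated by a cool task, and the hottest task receives a quarter of the base time slice. A real scheduler would derive `heat` from measured or estimated temperature rather than from a fixed score.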

English Abstract

With the rapid development of processor manufacturing technology, more computing resources are packed into a small chip area, which makes on-chip power density and local temperature rise sharply and leads to temperature hotspots. Uneven on-chip temperature brings many challenges to the design of processors and their cooling systems. It may also cause logic errors while the processor is running, or even permanent physical damage. In addition, cooling systems are designed for the thermal state under the highest workload, yet in most cases the processor is not heavily loaded, which means that the design cost of the cooling system is effectively inflated. Low-power, low-energy, and thermal management technologies can control system temperature online, which reduces the cost of cooling system design to some extent. It is therefore increasingly important to design effective dynamic techniques for processor temperature control.
     In this thesis, I investigate the advantages and disadvantages of several architecture-supported thermal control methods, such as DVFS and DPM, and of OS-level resource management approaches, such as task scheduling and memory allocation. I adopt the system-level approach, mainly task scheduling, to control processor temperature, because it is more flexible and has a wider scope of control.
     The main research work in this thesis includes:
     1. Summarize the current state of computer system development and analyze the emerging challenges faced by processor designers. Explain the importance and urgency of processor temperature control. Introduce the research status and achievements in temperature control at home and abroad.
     2. Analyze the effect of workloads on processor temperature. Characterize the hot-cool features of workloads and find opportunities for online temperature control. Propose a workload characterization approach that combines dynamic and static parameters, and use it to characterize the hot-cool features of tasks.
     3. Analyze the run-time features of applications at the architectural level, propose a workload characterization approach that combines the cache miss distribution and CPI, and use it to characterize the hot-cool features of tasks.
     4. Propose a temperature-aware task scheduling approach based on heuristic matching rules. Also formulate the temperature-aware task scheduling problem as an online learning model. Online learning is a branch of machine learning. The method takes all past system states into consideration when making a decision, and each decision is based on an evaluation of the loss of the candidate decisions.
     5. Design the Time-Slice Scaling and Alternative Scheduling schemes to further reduce the peak chip temperature at run time and shorten the duration of the peak temperature.
     6. Design and implement the above online control techniques on a real Linux platform.
     The contributions and innovations of this work include:
     1. Applying machine learning theory to online thermal control and designing a complete online learning framework, which gives the temperature control technique a degree of theoretical guarantee.
     2. Combining the CPI and the cache miss distribution to characterize workloads, and on that basis characterizing the hot-cool features of tasks.
     3. Proposing the novel Time-Slice Scaling and Alternative Scheduling schemes to further reduce the peak chip temperature and shorten the duration of the peak temperature.
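
Item 4 above describes a decision rule that weighs past system states and evaluates the loss of each candidate decision. One standard online learning scheme of this kind is the Hedge (multiplicative weights) algorithm of Freund and Schapire. The abstract does not say which algorithm the thesis uses, so the sketch below is only an illustration under that assumption. The expert names, the loss values, and the learning rate `eta` are hypothetical.

```python
import math

def hedge_update(weights, losses, eta=0.5):
    """Multiplicative-weights update: w_i <- w_i * exp(-eta * loss_i), then normalize."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]

def pick_expert(weights):
    """Deterministic simplification: follow the expert with the largest weight.

    Hedge proper samples an expert with probability proportional to its weight.
    """
    return max(range(len(weights)), key=lambda i: weights[i])

# Hypothetical experts: candidate policies for choosing the next task to run.
experts = ["run_coolest_task", "round_robin", "run_hottest_task"]
weights = [1.0 / len(experts)] * len(experts)

# Hypothetical per-round losses in [0, 1], for example a normalized
# temperature overshoot observed after following each policy.
loss_history = [
    [0.1, 0.4, 0.9],
    [0.2, 0.5, 0.8],
    [0.1, 0.3, 0.7],
]
for losses in loss_history:
    weights = hedge_update(weights, losses)

best = experts[pick_expert(weights)]
```

Because every past loss contributes to the current weights, the choice reflects the whole history rather than only the latest temperature reading. The known regret bounds for Hedge are one way such a scheme can offer a theoretical guarantee.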

