Data Prefetching Algorithm for Globally Optimizing Heterogeneous Memory System
  • Title: Data Prefetching Algorithm for Globally Optimizing Heterogeneous Memory System
  • Authors: PEI Songwen; ZHAO Mengyi; JI Yanfei
  • Affiliations: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology; School of Management, Fudan University
  • Keywords: heterogeneous memory system; data prefetching; simulated annealing algorithm; global optimum
  • Journal code: HDGY
  • Journal: Journal of University of Shanghai for Science and Technology
  • Publication date: 2019-02-15
  • Publisher: Journal of University of Shanghai for Science and Technology
  • Year: 2019
  • Issue: v.41; No.188
  • Funding: China Postdoctoral Science Foundation (2017M610230); National Natural Science Foundation of China (61775139, 61332009); Natural Science Foundation of Shanghai (15ZR1428600); Shanghai Pujiang Talent Program (PJ1407600)
  • Language: Chinese
  • Page: HDGY201901004
  • Page count: 8
  • CN: 01
  • ISSN: 31-1739/T
  • Classification number: 26-33
Abstract
Existing data prefetching algorithms cannot meet the access-efficiency requirements that high-efficiency heterogeneous computing systems place on new heterogeneous memory combining dynamic random access memory (DRAM) with non-volatile memory (NVM). To address this, a simulated annealing data prefetching algorithm with global optimization (SADPA) is proposed. Building on the heuristic-search simulated annealing algorithm, it introduces a random factor to avoid local optima and thereby determines a globally optimized threshold for prefetching an effective number of NVM pages. Experimental results show that, compared with the static threshold adjustment algorithm, SADPA reduces the average access latency by 4% and increases the average instructions per cycle (IPC) by 10.1%. For the cactusADM application, SADPA reduces system energy consumption by 3.4% compared with the cooperative hardware/software dynamic threshold adjustment algorithm.
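The abstract gives no pseudocode, cost function, or cooling schedule for SADPA, so the following is only a minimal sketch of generic simulated annealing applied to a one-dimensional threshold search, not the authors' implementation. The function names, the parameter values, and the toy cost curve are all hypothetical; in a real system the cost would presumably be a measured metric such as average access latency at a given prefetch threshold.

```python
import math
import random


def anneal_threshold(cost, lo, hi, t_start=10.0, t_end=0.01, alpha=0.9,
                     steps_per_temp=20, seed=0):
    """Search integer thresholds in [lo, hi] for a low-cost value.

    `cost` is a hypothetical callable standing in for a measured metric;
    the cooling parameters are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    current = rng.randint(lo, hi)
    current_cost = cost(current)
    best, best_cost = current, current_cost
    temp = t_start
    while temp > t_end:
        for _ in range(steps_per_temp):
            # Propose a nearby threshold, clamped to the valid range.
            step = rng.choice((-2, -1, 1, 2))
            candidate = min(hi, max(lo, current + step))
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept a worse candidate with
            # probability exp(-delta / temp). This random acceptance is
            # what lets the search climb out of a local optimum.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temp *= alpha  # geometric cooling
    return best, best_cost


def toy_cost(threshold):
    """Made-up curve: local minimum at 6 (cost 5), global minimum at 20 (cost 0)."""
    return min((threshold - 6) ** 2 + 5, (threshold - 20) ** 2)
```

Calling `anneal_threshold(toy_cost, 0, 32)` returns the best threshold visited and its cost. A purely greedy search started near 6 would stay in that local minimum, whereas the probabilistic acceptance at high temperature gives the search a chance to cross into the basin around 20.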
