Hardware Prefetching Mechanism Based on Double Step Data Stream (基于双倍步长数据流的硬件预取机制)
  • English title: Hardware Prefetching Mechanism Based on Double Step Data Stream
  • Authors: WANG Jinhan; LI Jun; LU Dongdong; ZHANG Hailong; ZHU Ying
  • Keywords: hardware prefetching; double step; stream prefetching; SPEC2006 test set; Cache Miss rate
  • Journal: Computer Engineering (计算机工程; abbreviated JSJC)
  • Institution: Shanghai High Performance IC Design Center
  • Publication date: 2019-06-15
  • Publisher: Computer Engineering (计算机工程)
  • Year: 2019
  • Volume/issue: v.45; No.501
  • Funding: National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (核高基), "Supercomputer Processor Development" (20172X01028101-001)
  • Language: Chinese
  • Record ID: JSJC201906018
  • Page count: 5
  • Issue: 06
  • CN: 31-1289/TP
  • Pages: 121-124+132
Abstract
Hardware data prefetching can effectively improve a processor's memory access performance, but traditional stream prefetching strategies suffer from untimely prefetches. To address this, a double step stream prefetching strategy is proposed, and a corresponding prefetch unit is designed. The prefetch unit automatically detects the fixed step size (stride) of a data stream and doubles it to compute the prefetch address. Experimental results show that, with the prefetch unit added, processor performance improves by up to 45% and 57% on the integer and floating-point applications of the SPEC2006 test set, respectively. For applications with a high Cache Miss rate, the prefetch unit effectively hides memory access latency.
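The address calculation described in the abstract can be illustrated with a minimal software sketch. This is not the authors' hardware design: it models a single data stream only, and the class name `DoubleStepPrefetcher`, the `confirm` threshold, and the 64-byte access spacing in the usage line are all assumptions made for illustration. The only behavior taken from the abstract is that a fixed step is detected and then doubled to form the prefetch address.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DoubleStepPrefetcher:
    """Toy single-stream model (hypothetical, not the paper's structure)."""
    confirm: int = 2                  # assumed: repeated steps needed before prefetching
    last_addr: Optional[int] = None   # address of the previous demand access
    step: int = 0                     # most recently observed step between accesses
    hits: int = 0                     # how many times in a row the step has repeated

    def access(self, addr: int) -> Optional[int]:
        """Record a demand access; return a prefetch address, or None."""
        prefetch = None
        if self.last_addr is not None:
            delta = addr - self.last_addr
            if delta != 0 and delta == self.step:
                self.hits += 1        # the same non-zero step was seen again
            else:
                self.step = delta     # new candidate step, restart confirmation
                self.hits = 0
            if self.hits >= self.confirm:
                # Double the detected step so the prefetch runs further ahead
                # of the demand stream than a single-step prefetch would.
                prefetch = addr + 2 * self.step
        self.last_addr = addr
        return prefetch


p = DoubleStepPrefetcher()
trace = [p.access(a) for a in range(0, 320, 64)]
print(trace)  # [None, None, None, 320, 384]
```

In this sketch, the access to address 192 confirms the 64-byte step and triggers a prefetch of 192 + 2 × 64 = 320, one step further ahead than a conventional stream prefetch of 256. That extra distance is what gives the prefetched data more time to arrive before the demand access reaches it.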
