Selective dynamic serialization for reducing energy consumption in hardware transactional memory systems
  • Authors: Epifanio Gaona (1)
    J. Rubén Titos-Gil (2)
    Juan Fernández (3)
    Manuel E. Acacio (1)
  • Keywords: Many-core CMPs; Hardware transactional memory; Transactions; Run-time serialization; Energy consumption; Execution time
  • Journal: The Journal of Supercomputing
  • Publication year: 2014
  • Publication date: May 2014
  • Volume: 68
  • Issue: 2
  • Pages: 914-934
  • Affiliations:
    1. Universidad de Murcia, Murcia, Spain
    2. Chalmers University of Technology, Göteborg, Sweden
    3. Intel Barcelona Research Center, Barcelona, Spain
  • ISSN: 1573-0484
Abstract
In the search for new paradigms to simplify multithreaded programming, Transactional Memory (TM) is being advocated as a promising alternative to deadlock-prone lock-based synchronization, so future many-core CMP architectures may need to provide hardware support for TM. At the same time, power dissipation is a first-class consideration in multicore processor design. In this work, we propose Selective Dynamic Serialization (SDS), a new technique that reduces energy consumption without degrading performance in applications with conflicting transactions by avoiding the work wasted on aborted transactions. Our proposal, implemented on top of a hardware transactional memory (HTM) system with an eager conflict management policy, detects and serializes conflicting transactions dynamically (at run time). In its simplest form, when a conflict occurs, one transaction is allowed to continue while the rest are completely stalled. Once the executing transaction has finished, it wakes up several of the stalled transactions. More elaborate implementations of SDS delay this behavior until serializing transactions is profitable, aiming for the best trade-off between performance, energy savings, and network traffic. SDS implementations differ from each other in the condition that triggers the serialization mode. We have evaluated several SDS schemes using GEMS, a full-system simulator implementing the LogTM-SE Eager–Eager HTM system, and several benchmarks from the STAMP suite. Results for a 16-core CMP show that SDS reduces energy consumption by 6% on average (more than 20% in high-contention scenarios) across a wide range of benchmarks without affecting, on average, execution time. Network traffic is also reduced by 22%.
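
The simplest form of SDS described in the abstract (one transaction continues, the others stall, and the finishing transaction wakes several of them) can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' hardware mechanism: the class name `SdsToyModel`, the `wake_count` parameter, its default of 2, and the FIFO wake-up order are all hypothetical choices made for illustration.

```python
from collections import deque


class SdsToyModel:
    """Toy software model of the simplest SDS form.

    Hypothetical illustration only: the real proposal is a hardware
    mechanism, and wake_count / FIFO ordering are assumptions.
    """

    def __init__(self, wake_count=2):
        self.wake_count = wake_count  # how many stalled transactions to wake
        self.running = set()          # transactions currently executing
        self.stalled = deque()        # transactions stalled after a conflict

    def begin(self, tx):
        """Start executing transaction `tx`."""
        self.running.add(tx)

    def conflict(self, winner, losers):
        """Let `winner` continue and stall every running transaction in `losers`."""
        for tx in losers:
            if tx in self.running and tx != winner:
                self.running.remove(tx)
                self.stalled.append(tx)

    def commit(self, tx):
        """Finish `tx`, then wake up to `wake_count` stalled transactions."""
        self.running.discard(tx)
        woken = []
        while self.stalled and len(woken) < self.wake_count:
            nxt = self.stalled.popleft()
            self.running.add(nxt)
            woken.append(nxt)
        return woken


# Four transactions conflict; T0 continues and the other three stall.
model = SdsToyModel(wake_count=2)
for name in ("T0", "T1", "T2", "T3"):
    model.begin(name)
model.conflict(winner="T0", losers=["T1", "T2", "T3"])
woken = model.commit("T0")  # T0 finishes and wakes some stalled transactions
```

In this model no stalled transaction does speculative work that could later be aborted, which is the source of the energy saving the abstract describes. The more elaborate SDS variants would add a condition deciding when to enter this serialized mode; that condition is not specified in the abstract and is left out here.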