Research on Fault Tolerance Techniques for High-Performance Computing Based on Non-Volatile Memory
Abstract
In recent years, the performance of high-performance computing (HPC) systems has grown rapidly and system scale has kept expanding; high-performance computing is expected to enter the exascale (10^18 Flops) era around 2020. However, as system scale grows, the reliability problem of high-performance computers becomes increasingly serious, forcing HPC systems to rely on fault tolerance techniques to guarantee that user applications complete correctly. At the same time, the growing system scale not only reduces system reliability but also increases the overhead of fault tolerance. Studies based on existing fault tolerance techniques show that, by the time HPC reaches exascale, fault tolerance overhead will consume all of the system's running time, reducing the effective utilization of the system to zero. Existing fault tolerance techniques therefore cannot cope with the reliability challenges of future high-performance computing, and new fault tolerance techniques must be investigated.
     Emerging Non-Volatile Random-Access Memory (NVRAM) devices combine the fast random access of DRAM with the non-volatility of disk, and their power consumption is very low. NVRAM technology has advanced rapidly in recent years and is expected to enter practical use after 2015. At that point, NVRAM may replace DRAM as main memory, form a new storage layer that combines the characteristics of memory and disk, or replace the disk as a new fast storage medium; each of these possibilities offers new opportunities for fault tolerance. The focus of this thesis is therefore how to exploit NVRAM to design efficient fault tolerance techniques. For each storage layer in which NVRAM may be deployed, the thesis carries out the following studies:
     1. Algorithm-based fault tolerance
     For the case in which NVRAM forms a new storage layer combining the characteristics of memory and disk, we study algorithm-based fault tolerance (ABFT). The idea behind ABFT is to achieve fault tolerance by encoding application data and recovery data together. Existing ABFT schemes, however, are designed for matrix-related algorithms and cannot be applied to other kinds of algorithms. Exploiting the properties of NVRAM, this thesis proposes a new ABFT approach that extends algorithm-based fault tolerance to a wider range of algorithms. Our approach guarantees that each loop iteration of the algorithm executes atomically, so that after a failure the application can resume from the unfinished iteration. To validate the approach, we designed and implemented fault-tolerant versions of the Barnes-Hut and K-means algorithms; experiments show that, compared with the original algorithms, our fault-tolerant versions achieve fault tolerance with less than 10% overhead.
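     To make the atomic-iteration idea concrete, the following is a minimal C sketch of double-buffered, transaction-style iterations on NVRAM. The helpers nv_alloc() and nv_persist(), the packed commit word, and the fixed buffer size are illustrative assumptions, not the interface actually used in the thesis; the point is only that a single persisted commit record publishes each completed iteration, so a crash always leaves either the previous or the new iteration state intact.

```c
/* Sketch: atomic loop iterations on NVRAM via double buffering.
 * nv_alloc()/nv_persist() are assumed helpers, not a real API. */
#include <stdint.h>
#include <stddef.h>

#define N 1024

typedef struct {
    uint64_t commit;     /* packed (iteration << 1) | current-buffer bit,
                            updated with one persisted 8-byte store       */
    double   buf[2][N];  /* double-buffered application state in NVRAM    */
} nv_region_t;

extern nv_region_t *nv_alloc(size_t size);        /* assumed: map NVRAM   */
extern void nv_persist(const void *p, size_t n);  /* assumed: flush to NVM*/
extern void compute_iteration(const double *in, double *out, size_t n);

void run(int total_iters)
{
    nv_region_t *r   = nv_alloc(sizeof *r);
    int          cur = (int)(r->commit & 1);      /* survives a crash      */
    int          it  = (int)(r->commit >> 1);

    for (; it < total_iters; it++) {
        int next = 1 - cur;

        /* 1. Write the new state into the non-current buffer only. */
        compute_iteration(r->buf[cur], r->buf[next], N);
        nv_persist(r->buf[next], sizeof r->buf[next]);

        /* 2. Commit: one persisted word flips the current buffer and
         *    advances the iteration count together.                   */
        r->commit = ((uint64_t)(it + 1) << 1) | (uint64_t)next;
        nv_persist(&r->commit, sizeof r->commit);
        cur = next;
    }
}
```

     On restart, run() simply reads the commit word and continues from the last committed iteration, which mirrors how a fault-tolerant Barnes-Hut or K-means variant can resume from the unfinished loop.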
     2. Fault-tolerant process model
     For the case in which NVRAM replaces DRAM as main memory, we study a fault-tolerant process model. In the traditional process model, processes are tightly coupled with the operating system, so even if a process runs in NVRAM, a system reboot destroys its data and the process cannot survive failures. To address this problem, we design and implement a fault-tolerant process model named NV-process, which lets a process continue from its previous state after a failure, giving processes native fault tolerance. Through an independent physical address space and self-contained process management, NV-process decouples processes from the operating system, so that a process can exist independently of the OS. In addition, NV-process provides a transactional execution mode that keeps a process's state consistent during execution. Finally, NV-process provides in-place restart, allowing a process to recover quickly. Evaluation shows that, compared with the traditional process model, NV-process achieves fault tolerance with very little execution overhead.
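     As an illustration of the transactional execution mode described above, the sketch below keeps an undo log in NVRAM so that an update to persistent process state either completes or is rolled back on restart. The names nv_txlog_t, nv_persist(), and tx_begin/tx_write/tx_commit/tx_recover are hypothetical; NV-process's real mechanism may differ.

```c
/* Sketch: undo-log transaction over 64-bit words in NVRAM. */
#include <stdint.h>
#include <stddef.h>

#define LOG_CAP 4096

typedef struct {
    uint64_t active;      /* nonzero while a transaction is open          */
    size_t   nentries;
    struct { uint64_t *addr; uint64_t old; } log[LOG_CAP];
} nv_txlog_t;

extern void nv_persist(const void *p, size_t n);  /* assumed flush to NVM */

void tx_begin(nv_txlog_t *t)
{
    t->nentries = 0;
    t->active   = 1;
    nv_persist(t, sizeof *t);
}

/* Log the old value before overwriting a persistent word. */
void tx_write(nv_txlog_t *t, uint64_t *addr, uint64_t val)
{
    t->log[t->nentries].addr = addr;
    t->log[t->nentries].old  = *addr;
    nv_persist(&t->log[t->nentries], sizeof t->log[0]);
    t->nentries++;
    nv_persist(&t->nentries, sizeof t->nentries);

    *addr = val;
    nv_persist(addr, sizeof *addr);
}

void tx_commit(nv_txlog_t *t)
{
    t->active = 0;                      /* durable commit record          */
    nv_persist(&t->active, sizeof t->active);
}

/* Called during restart: roll back any transaction that never committed. */
void tx_recover(nv_txlog_t *t)
{
    if (!t->active)
        return;
    for (size_t i = t->nentries; i-- > 0; ) {
        *t->log[i].addr = t->log[i].old;
        nv_persist(t->log[i].addr, sizeof(uint64_t));
    }
    tx_commit(t);
}
```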
     3. Any-granularity incremental checkpointing
     For the case in which NVRAM serves as a fast storage medium, we study incremental checkpointing with arbitrary granularity. The overhead of incremental checkpointing comes mainly from detecting and saving dirty data. Because of the limited bandwidth and block-access nature of disks, most incremental checkpoint schemes detect dirty data at the granularity of a memory page (typically 4096 bytes). This reduces detection overhead but increases saving overhead. Our measurements show that a large portion of the data in each dirty page remains unchanged between two consecutive checkpoints, which means that traditional page-granularity incremental checkpointing repeatedly saves a great deal of unmodified data. To reduce checkpoint overhead, we exploit the byte addressability of NVRAM to design and implement an incremental checkpoint framework that supports arbitrary granularity. Based on measured memory-access behavior of applications, we build a model relating checkpoint granularity to cost and use it to derive the optimal granularity. Evaluation shows that, with the optimal granularity, our approach significantly reduces incremental checkpoint overhead, achieving speedups of up to 1.3x.
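     The abstract does not give the exact form of the granularity-cost model, so the following is an illustrative formulation under simple assumptions: detection cost is proportional to the number of blocks examined, and saving cost is proportional to the bytes written at granularity g.

```latex
% Illustrative cost model (not the thesis's exact formulation): checkpoint
% overhead at block granularity g, for an interval in which dirty pages hold
% D bytes in total, of which B(g) bytes fall into g-sized blocks containing
% at least one modified byte.
\[
  T_{\mathrm{ckpt}}(g) \;=\;
     \underbrace{c_{d}\,\frac{D}{g}}_{\text{dirty-block detection}}
   + \underbrace{\frac{B(g)}{W_{\mathrm{nvram}}}}_{\text{data saving}},
  \qquad
  g^{*} \;=\; \arg\min_{g} T_{\mathrm{ckpt}}(g).
\]
% c_d: per-block detection cost; W_nvram: NVRAM write bandwidth.
% Coarser blocks shrink the detection term but inflate B(g) with unmodified
% bytes, so T_ckpt(g) has an interior minimum g*, the optimal granularity.
```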
In recent years, high-performance computing technology has developed very rapidly. It is forecast that high-performance computing will enter the exascale (10^18 Flops) era around the year 2020. However, as the system scale increases, the reliability of high-performance computers decreases sharply, so high-performance computing systems must rely on fault tolerance techniques to complete computing tasks correctly. Moreover, the growing system scale not only decreases system reliability but also increases the cost of fault tolerance. Research shows that this high fault tolerance cost would reduce the effective utilization of exascale systems to zero. Current fault tolerance techniques therefore cannot satisfy the requirements of future high-performance computing, and new techniques must be developed to address this challenge.
     The emerging Non-Volatile Random-Access Memory (NVRAM) technologies promise large, fine-grained, fast, and non-volatile memory devices for computer designers. NVRAM technologies have been developing rapidly and are expected to become available after 2015. NVRAM may then replace DRAM as main memory, add a new memory level between DRAM and the disk, or replace the disk as a fast storage device. However NVRAM is integrated into the memory hierarchy, it will offer new opportunities for fault tolerance research. In this thesis, we focus on how to leverage NVRAM technologies to improve the performance of fault tolerance techniques, and carry out the following work:
     1. Algorithm-based fault tolerance
     Algorithm-based fault tolerance (ABFT) is a very cost-effective way to incorporate fault tolerance into applications. ABFT schemes adapt algorithms and apply appropriate mathematical operations to both the original data and the recovery data; once a failure occurs, they can recover the application data set with very low overhead. Currently, ABFT approaches are mainly used for matrix operations and are not suitable for general data structures. To fill this gap, we propose an approach that enhances ABFT with NVRAM technologies and makes ABFT applicable to algorithms operating on link-based data structures. Our approach ensures data consistency by maintaining the atomicity of each iteration. We demonstrate the practicality of the approach by applying it to the Barnes-Hut and K-means algorithms. The experimental results show that our approach is able to survive failures with a performance overhead below 10%.
     2. Fault-tolerant process model
     In the traditional process model, the OS and processes are tightly coupled, and re-initialization of the OS destroys process data; even if a process executes in NVRAM, it cannot be restored after the OS reboots. To address this challenge, we propose a fault-tolerant process abstraction based on NVRAM, called NV-process, which supports fault tolerance natively. First, NV-process decouples processes from the OS: processes are stand-alone instances running in a self-contained way in NVRAM. Second, NV-process provides a transactional execution model that makes a process persistent efficiently. Third, NV-process provides an in-place restart technique that restores a process very quickly. When the system powers off, whether intentionally or not, NV-process instances remain in NVRAM and can continue running where they left off after the OS reboots. The experimental results show that NV-process accomplishes fault tolerance with low performance overhead.
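     For illustration, a minimal sketch of what in-place restart can look like is shown below. It assumes a loader that remaps an NV-process's NVRAM region at its original virtual address (with the process stack and heap inside that region), plus hypothetical helpers nv_map() and nv_persist(); the CPU context is captured with the standard getcontext()/setcontext() calls. This is only a sketch of the idea, not NV-process's actual implementation.

```c
/* Sketch: resume a persistent process at its last commit point. */
#include <ucontext.h>
#include <stddef.h>

typedef struct {
    int        valid;   /* set once a consistent context has been saved    */
    ucontext_t ctx;     /* CPU context captured at the last commit point   */
    /* ... application stack and heap live in the same NVRAM region ...    */
} nv_proc_t;

extern nv_proc_t *nv_map(const char *name);        /* assumed: map region  */
extern void nv_persist(const void *p, size_t n);   /* assumed: flush to NVM*/

void nv_commit_point(nv_proc_t *p)
{
    /* Capture register state; after a restart, execution resumes here. */
    getcontext(&p->ctx);
    nv_persist(&p->ctx, sizeof p->ctx);
    p->valid = 1;
    nv_persist(&p->valid, sizeof p->valid);
}

void nv_restart(const char *name)
{
    nv_proc_t *p = nv_map(name);   /* region is mapped at its old address  */
    if (p->valid)
        setcontext(&p->ctx);       /* jump back to the last commit point   */
}
```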
     3. Any-granularity incremental checkpointing
     The cost of incremental checkpointing comes mainly from detecting and saving dirty data. Because of the limited bandwidth and block-access nature of disks, most state-of-the-art dirty data detection is coarse-grained and implemented at page granularity. Although this coarse-grained approach reduces the detection cost, it increases the data-saving cost. Our experiments show that dirty pages contain a large amount of unmodified data within a checkpoint interval; in other words, a page-granularity incremental checkpoint mechanism stores a great deal of redundant data in the checkpoint file. To address this issue, we design and implement a new incremental checkpoint scheme based on NVRAM, named AG-ckpt (Any-Granularity checkpoint). We also formulate the relationship between checkpoint performance and granularity in a mathematical model and derive the optimal granularity from it. The model is general and can be used to optimize the granularity parameter of other checkpoint systems. The experimental results show that our approach achieves a speedup of 1.2x-1.3x in checkpoint efficiency.
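     As a rough illustration of saving at sub-page granularity, the sketch below compares a dirty page against its image from the previous checkpoint and writes out only the g-byte blocks that changed. The helper nv_append() and the reference-copy scheme are assumptions for the example, not necessarily how AG-ckpt detects fine-grained dirty data.

```c
/* Sketch: write only the modified g-byte blocks of a dirty page to NVRAM. */
#include <stdint.h>
#include <string.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Assumed: appends one block (address + payload) to the checkpoint in NVRAM. */
extern void nv_append(uintptr_t addr, const void *data, size_t len);

/* 'cur' is a page flagged dirty by the usual write-protection trap;
 * 'prev' is its image from the previous checkpoint. Returns bytes saved. */
size_t checkpoint_dirty_page(uintptr_t base, const uint8_t *cur,
                             uint8_t *prev, size_t g)
{
    size_t saved = 0;
    for (size_t off = 0; off < PAGE_SIZE; off += g) {
        if (memcmp(cur + off, prev + off, g) != 0) {
            nv_append(base + off, cur + off, g);  /* byte-addressable NVRAM */
            memcpy(prev + off, cur + off, g);     /* refresh reference copy */
            saved += g;
        }
    }
    return saved;   /* versus PAGE_SIZE under page-granularity saving */
}
```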
