A Survey of Failure Detectors in Distributed Systems (分布式系统中失效检测器综述)
  • English title: Summarization on failure detector in distributed system
  • Authors: 刘家希; 吴智博; 董剑; 温东新
  • Authors (English): LIU Jiaxi; WU Zhibo; DONG Jian; WEN Dongxin (School of Computer Science and Technology, Harbin Institute of Technology)
  • Keywords: distributed system; failure detector; Quality of Service
  • Journal name (Chinese): DLXZ
  • Journal name (English): Intelligent Computer and Applications
  • Institution: School of Computer Science and Technology, Harbin Institute of Technology
  • Publication date: 2019-02-18
  • Publisher: Intelligent Computer and Applications (智能计算机与应用)
  • Year: 2019
  • Issue: v.9
  • Funding: National Natural Science Foundation of China (61100029, 61370085)
  • Language: Chinese
  • Page: DLXZ201902051
  • Page count: 6
  • CN: 02
  • ISSN: 23-1573/TN
  • Classification number: 223-228
Abstract
The failure detector is one of the fundamental components for building highly available distributed systems, allowing such systems to provide continuous, reliable service. Its goal is fast and accurate failure detection at the lowest possible detection overhead. Current research on failure detectors centers on adaptive failure detection and on mechanisms for sharing detection results, with the aim of steadily improving the quality of service of failure detection in terms of detection time, detection accuracy, and detection overhead.
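References
[1]CHANG Guanghui.Research on large-scale distributed trusted monitoring systems[D].Chongqing:Chongqing University,2011.(in Chinese)
[2]ZHANG Jialin.Research on the consensus problem in distributed computing[D].Beijing:Tsinghua University,2010.(in Chinese)
[3]LI Lei.Research on performance optimization of fault-tolerance mechanisms in distributed systems[D].Changsha:National University of Defense Technology,2007.(in Chinese)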
[4]CHANDRA T D,TOUEG S.Unreliable failure detectors for reliable distributed systems[J].Journal of the ACM(JACM),1996,43(2):225-267.
[5]LARREA M,ANTA F A,ARÉVALO S.Optimal implementation of the weakest failure detector for solving consensus[C]//The 19th IEEE Symposium on Reliable Distributed Systems(SRDS).Nürnberg,Germany:IEEE Computer Society Press,2000:52-59.
[6]CHEN Wei,TOUEG S,AGUILERA M K.On the quality of service of failure detectors[J].IEEE Transactions on Computers,2002,51(1):13-32.
[7]SOTOMA I,MADEIRA E R M.Adaptation-algorithms to adaptive fault monitoring and their implementation on CORBA[C]//The 3rd International Symposium on Distributed Objects and Applications.Rome,Italy:IEEE,2001:219-228.
[8]FETZER C,RAYNAL M,TRONEL F.An adaptive failure detection protocol[C]//Pacific Rim International Symposium on Dependable Computing.Seoul,Korea:IEEE,2001:146-153.
[9]FALAI L,BONDAVALLI A.Experimental evaluation of the QoS of failure detectors on wide area network[C]//Proceedings of International Conference on Dependable Systems and Networks.Yokohama,Japan:IEEE Press,2005:624-633.
[10]BERTIER M,MARIN O,SENS P.Implementation and performance evaluation of an adaptable failure detector[C]//DSN'02:Proceedings of the 2002 International Conference on Dependable Systems and Networks.Washington,DC,USA:IEEE Computer Society,2002:354-363.
[11]TOMSIC A,SENS P,GARCIA J,et al.2W-FD:A failure detector algorithm with QoS[C]//International Parallel and Distributed Processing Symposium(IPDPS 2015).Hyderabad,India:IEEE Press,2015:885-893.
[12]DÉFAGO X,URBÁN P,HAYASHIBARA N,et al.Definition and specification of accrual failure detectors[C]//International Conference on Dependable Systems and Networks.Yokohama,Japan:IEEE Computer Society,2005:206-215.
[13]HAYASHIBARA N,DÉFAGO X,YARED R,et al.The Φ accrual failure detector[C]//Proceedings of the 23rd IEEE International Symposium on Reliable Distributed Systems.Florianopolis,Brazil:IEEE Press,2004:66-78.
[14]XIONG N,DÉFAGO X.ED FD:Improving the Phi accrual failure detector[R].Japan:JAIST,2007.
[15]LAKSHMAN A,MALIK P.Cassandra:A decentralized structured storage system[J].ACM SIGOPS Operating Systems Review,2010,44(2):35-40.
[16]SATZGER B,PIETZOWSKI A,TRUMLER W,et al.A new adaptive accrual failure detector for dependable distributed systems[C]//Proceedings of ACM Symposium on Applied Computing(SAC'07).Seoul,Korea:ACM Press,2007:551-555.
[17]HE Yanzhang,JIANG Xiaohong,DAI Changbo,et al.Self-adaptive failure detector for peer-to-peer distributed system considering the link faults[M]//DOU Y,LIN H,SUN G,et al.Advanced parallel processing technologies.APPT 2017.Lecture Notes in Computer Science,Cham:Springer,2017,10561:64-75.
[18]FELBER P,DÉFAGO X,GUERRAOUI R,et al.Failure detectors as first class objects[C]//Proceedings of the International Symposium on Distributed Objects and Applications.Edinburgh,United Kingdom:IEEE Computer Society,1999:132-141.
[19]STELLING P,FOSTER I,KESSELMAN C,et al.A fault detection service for wide area distributed computations[C]//Proceedings of The Seventh International Symposium on High Performance Distributed Computing.Chicago,USA:IEEE Computer Society,1998:268-278.
[20]BERTIER M,MARIN O,SENS P.Performance analysis of a hierarchical failure detector[C]//Proceedings of 2003 International Conference on Dependable Systems and Networks.San Francisco,CA,USA:IEEE Computer Society Press,2003:635-644.
[21]LIN Mengjiang,MARZULLO K,MASINI S.Gossip versus deterministic flooding:Low message overhead and high reliability for broadcasting on small networks[R].CA,USA:University of California,1999.
[22]EUGSTER P T,GUERRAOUI R.Probabilistic multicast[C]//Proceedings of International Conference on Dependable Systems and Networks.Washington,DC,USA:IEEE Computer Society,2002:313-322.
[23]VAN RENESSE R,MINSKY Y,HAYDEN M.A gossip-style failure detection service[C]//Proceedings of IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing Middleware.The Lake District,UK:ACM,1998:55-70.
[24]GUPTA I,CHANDRA T D,GOLDSZMIDT A S.On scalable and efficient distributed failure detectors[C]//Twentieth ACM Symposium on Principles of Distributed Computing.Newport,Rhode Island:ACM Press,2001:170-179.
[25]SNYDER S,CARNS P,JENKINS J,et al.A case for epidemic fault detection and group membership in HPC storage systems[M]//JARVIS S,WRIGHT S,HAMMOND S.High performance computing systems. Performance modeling,benchmarking,and simulation.PMBS 2014.Lecture Notes in Computer Science,Cham:Springer,2014,8966:237-248.
[26]HORITA Y,TAURA K,CHIKAYAMA T.A scalable and efficient self-organizing failure detector for grid applications[C]//The 6th IEEE/ACM International Workshop on Grid Computing.Seattle,WA,USA:IEEE Computer Society Press,2005:202-210.
[27]WARD J S,BARKER A.Monitoring large-scale cloud systems with layered gossip protocols[J].arXiv preprint arXiv:1305.7403,2013.
