Bloom Filter和Weighted Bloom Filte的比较和研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
随着Internet技术和WWW服务的发展,Web网络流量的增加和网页访问的延迟日益引起人们的关注,这两个问题影响了Internet的持续发展。网络缓存技术是解决这两个问题的一种至关重要的技术,在国际上已经形成一个独立的主流研究领域,并取得了一些研究成果。网络缓存技术是一个复杂的课题,它需要解决替换策略、一致性维护、缓存共享和性能评价等诸多问题。虽然目前已经在这些方面做了很多工作,但许多问题并没有得到圆满解决,影响了网络缓存技术在WWW服务上的应用。本文的内容属于缓存共享领域。利用Bloom filter表示共享信息的内容,大大地降低了用于存储索引的空间消耗,减少了访问延迟。Bloom filter是一个简明的空间效率极高的随机的数据结构,用于判别一个元素是否属于某个集合。用Bloom filter表示cache内容,可以高效地实现cache协作。因为在代理之间只需传输Bloom filter而不是完整的cache目录表。
     本文首先介绍了Bloom filter的研究和应用现状,然后,从数学角度对Bloom filter和Weighted Bloom filter进行比较。结果证明Weighted Bloom filter有较低的错误预测。但是,模拟结果显示,Bloom filter有较低的错误预测,比Weighted Bloom filter好。主要原因是Weighted Bloom filter需要很强的条件,而这些条件在现实中不能被满足。
    
     太原理工大学工学硕士学位论文
     本文最后指出Bloom ilter应用中存在问题和进一步研究的方向和措
     施。
With the development of Internet technologies and WWW services, more and more attention is paid to the Web traffic and page access delay , which are the key issues affecting the continuous growing of Internet . Proxy Cache,an effective technique to solve these issues, is a hot research area . Some research works have been conducted on this area and result in some fruits . Research on Proxy Cache is quiet complicated because there are a lot of problems such as replacement strategy, cache consistency, cache sharing and performance analysis. Though studying of Proxy Cache has been made of years, few of these problems have been solved successfully. This thesis, concentrates on topic of cache. The contents of shared message are represented by Bloom filter. This technique greatly reduces the space needed for page indexing and decrease the access delay. A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to determine a certain element in the set . Weighted Bloom filt
    ers and counting Bloom filters have been suggested as means for sharing Web cache information . Bloom filters are transmitted among shared proxies instead of sending the full list of cache contents .
    In this thesis, a summary about the current research and application on
    
    
    
    Bloom filter is presented, and then a mathematical comparison between the Bloom filter and weighted Bloom filter is given. It is proved that weighted Bloom filter has lower false prediction rate than Bloom filter. But the simulation results shown that Bloom filter has lower prediction rate and it is better than weighted Bloom Filter. The major reason is weighted Bloom filter needs very strong conditions, which cannot be satisfied in the real world.
    Finally, the problems in Bloom filter and the future research direction and measure are pointed out in this paper.
引文
1 B.Bloom. Space/time tradeoffs in hash coding with allowable errors.Communications of the ACM,13(7) :422-426,July 1970.
    2 L. Fan, P. Cao, J. Almeida, and A. Broder. Summary cache: A scalable wide-area Wed cache sharing protocol.In Proceedings of SIGCOMM98,1998. Extended version available as technical Report 1361,Computer sciences Department, Univ. of Wisconsin-Madison, February 1999.
    3 D. wessels. SQUID Frequently asked Questions. At http://www.squid-cache.org.
    4 A. Pousskov and D. Wessels. Cache digests. Computer Networks and ISDN Systems, 30(2-23) :2155-2168,1998.
    5 S. Czerwinski, B Zhao,T Hodes, A. Joseph, and R. Katz. An architecture for a secure service discovery service. In Proceedings of the Fifth Annual International Conference on Mobile Computing and Networks(MobiCOM99) , 1999.
    6 J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao. OceanStore: architecture for global-scale persistent storage. In Proceedings of ASPLOS 20000,2000.
    7 A. C. Snoeren, C. Partridge, L. A. Sanchez, C. E. Jones, F. Tchakounito, S.
    
    T. Kent, and W. T. Strayer. Hash-Based IP Traceback. To appear in SIGCOMM 2001.
    Z. Liang. Transparent Web Caching with Load Balancing, Master Thesis in Queens University, 2001.
    C. Faloutsos and S.Christodoulakis, design of a signature file method that accounts for xnon-uniform occurrence and query frequencies, 11th International Conf. On VLDB, Stockholm, Sweden, August 1985, 165
    10 刘济波,朱培栋.改善WWW响应性能的文档预送技术.计算技术与自动化,1998,17(3):36-38
    11 周晖,侯滨.Proxy Server中Cache的管理和使用.现代计算机,2000,83:31-33
    12 陈兵,王立松.基于哈希链表和时间链表的HTTP代理缓存机制的实现.南京航空航天大学学报,2002,34(1):50-54
    13 沈庆伟.微型计算机系统中Cache的结构及性能分析.安徽建筑工业学院学报(自然科学版),2001,9(3):66-69
    14 王裕如,赵静.Cache技术及应用.大连大学学报,2001,22(4):73-77
    15 张晨曦,刘依.虚拟多体Cache:一种高效实现高相联度Cache的方案.计算机工程与应用,2001,23:37-40
    16 李妍,杨诚之.浅谈高速缓存的应用.电站系统工程,2002,18(1)
    17 黄震春,李三立.填补存储器间距的一种方法—前瞻性Cache.小型微型计算机系统,2002,23(6):641-645
    18 焦锋,赵英杰,王丽平.构建基于Cache网络数据缓存技术模型.河北建筑科技学院学报,2002,23(6):19(1):53-55
    
    
    19 许小刚,夏勤,顾冠群.WAP网关中Cache的实现研究.数据通信,2001,2:59-61
    20 陈良洲,鲍璐,张根度.公共Cache在WWW中的应用.计算机工程与应用,1997:15-18
    21 缪军海,朱兰娟,吴智铭.Raid中Cache的设计与实现.微型电脑应用,2001,17(4):29-31
    22 严蔚敏,吴伟民.数据结构(C语言版).北京:清华大学出版社,1997:751-262
    23 张晨曦等.计算机体系结构.北京:高等教育出版社.2000:180-190
    24 沈进,顾其威.代理服务器的研究与实现[J].南京航空航天大学学报,32(6):620,624
    25 曾朋,李建军.Internet访问管理与代理服务器[m].北京:人民邮电出版社,1999:3-6
    26 庄伟强.集群服务器代理缓存技术研究.[清华大学博士论文].1999:36-39
    27 马兆丰.一种Hash-DFS算法与数字签名方法的研究及应用.[兰州大学硕士论文].2001:10-12
    28 Henny Bekker, Ton Verschuren, Ingrid Melve Requirements and Recommendations of Web caching services DESIRE Technical Report D4.1, 1996
    29 Cunha, Bestravos and Crovella Characteristics of WWW Client-based Traces, BU-CS-95-010 (Boston University Technical Report), 1995
    30 Gihan V. Dias, Graham Cope, Ravi Wijayaratne A Smart Internet Caching
    
    System, Proceedings INET'96, June 1996
    31 Duane Wessels, Kim Claffy, Internet cache Protocol (ICP), version 2 Work in progress
    32 Donald Neal The Harvest Object Cache in New Zealand Computer Networks and ISDN Systems, volume 28, p 1415-1430,1996
    33 Roy Fielding, Jim Gettys, Jeff Mogul, Henrik Frystyk, Tim Berners-Lees, Hypertext Transfer Protocol-HTTP/1. 1, RFC 2068(Proposed Standard),1996
    34 The web Consortium Problem Statement On Propagation, Replication and Caching 1995
    35 Allison Woodruff et al An Investigation of Documents from the World Wide Web Computer Networks and ISDN Systems, volume 28,p 963-980
    36 HENsA,the UK national caching service,http://www.hensa.ac.uk/cache/
    37 NLANR Caching Project http://www.nlanr.net/Cache/
    38 Netscapes Automatic Proxy Configuration, http://home.netscape.com/eng/mozilla/2. 0/relnotes/demo/proxy-live.html
    39 Squidhttp://squid.nlanr.net/
    40 HarvestCached-2*http://www.netcache.com/
    41 Netscape Proxy server2. 0, http://www.netscape.com/comprod/server_central/product/proxy/
    42 P.Cao and S.Irani, Cost-aware WWW Proxy Caching Algorithms. In Proceedings of the 1997 USENIX Symposium on Internet Technology and Systems, pp193-206, December1997.
    
    
    43 World Wide Web: Whence Whither, What Next ?Henning Schulzrinne, IEEE Network10(2) , 10-17, Mar / Apr96
    44 Web Traffic Characterization:an Assessment of the Impact of Caching Documents from NCSA's Web Server Hans-werner Braun and Kimberly C.Claffy,Computer Networks and ISDN Systems28(1&2) ,37-51,Dec95
    45 A Case for Caching File Objects Inside Internetworks Peter Danzig, Richard S. Hall and Michael F.Schwartz, Proc. SIGCOMM '93 (Computer and Communications Reviews 23(4) ,239-248,Oct93)
    46 Web Cataloguing Through Cache Exploitation and Steps Toward Consistency Maintenance Chris Dodge, Beate Marx and Hans Pfeifienberger, Computer Networks and ISDN Systems27(6) , 1003-1008, Apr95
    47 Main Memory Caching of Web Documents Evangelos P. Markatos, Proc. Fifth International World Wide Web Conference, 6-10 May 1996, Paris, France (Computer Networks and ISDN Systems28(7-11) ,893-905,May96)
    48 Web Cache Coherence Adam Dingle and Tomas Parti, Proc. Fifth International World Wide Web Conference, 6-10 May 1996, Paris, France (Computer Networks and ISDN Systems28(7-11) ,907-920,May96)
    49 Introducing Application-Level Replication and Naming Into Today's Web Michael Baentsch, Georg Molter and Peter Sturm, Proc. Fifth International World Wide Web Conference, 6-10 May 1996, Paris, France (Computer Networks and ISDN Systems28(7-11) , 921-930,
    
    May96)
    50 Using Prefetching to Improve Word Wide Web Latency Venkata N. Padmanabhan and Jeffrey C. Mogul, Proc. SIGCOMM'96(Computer and Communication Reviews 26(3) , 22-36, Jul96)
    51 Performance Engineering of the World Wide Web: Application to Dimensionning and Cache Design Jean-Chrysostome Bolot and Andres Vega-Garcia, IEEE Infocom'96, San Francisco, CA, Mar96

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700