Design and Implementation of an HTTP Cache System
Abstract
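With the sharp rise in Internet users and the growing variety of Internet applications, heavy data traffic consumes network bandwidth, causing more frequent congestion and slower data transfer. Network caching has therefore become an active research area for many network applications. This thesis classifies and analyzes the mainstream caching techniques and finds that well-known WWW caching proxy servers such as Squid and Apache listen by intercepting and forwarding traffic, which adds extra access latency for every user request that misses the cache.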
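     To address this problem, the thesis proposes a caching technique based on bypass mirrored monitoring. The technique monitors users' Internet traffic through a bypass mirrored port and, according to users' access tendencies, caches frequently accessed Web resources locally. When the cache system observes a request for a Web resource that is already cached, it uses session hijacking to direct the user to the intranet cache server for that resource, so the user no longer needs to connect to the remote Web server. The technique thus reduces traffic at the network egress, saves bandwidth costs, and speeds up access and transfer, while also avoiding the latency penalty that interception-and-forwarding caches impose.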
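     The thesis designs and implements a bypass mirrored HTTP cache system on the Windows platform. The system uses WinPcap to capture raw packets from the mirrored traffic, then parses and filters the network protocols to extract users' resource requests, which provides the mirrored monitoring function. For Web resources that users access frequently, the system uses socket programming to download them from the external network and cache them on disk. An intranet HTTP server built on IIS publishes and manages the cached resources on disk. To direct users to cached resources, the system constructs a response packet containing the cache location and, impersonating the Web server, induces the user to fetch the resource from the intranet. Microsoft SQL Server backs a log that displays the resource requests of intranet users.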
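     In addition, to make disk cache lookups more efficient, the thesis implements a hash table that stores and organizes information about the resources users request, and uses hash lookup to shorten the system's processing delay. The system also applies cache replacement and expiry detection to improve the hit rate and the consistency of cached resources.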
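     Finally, the thesis tests the functionality and performance of the HTTP cache system. The results show that the bypass mirrored cache system achieves its goals: monitoring users' Internet traffic through mirroring, hijacking and redirecting user requests, accelerating access through the intranet cache, reducing user access latency, and recording and displaying intranet users' resource requests in a SQL Server database. This confirms the practicality and feasibility of an HTTP cache system based on bypass mirrored monitoring.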
In recent years, the number of Internet users and the demand for Internet applications have increased quickly, and the resulting data traffic consumes network bandwidth, causing more network congestion and slower data transmission. Web caching has therefore become a hot research field in network applications. After classifying and analyzing the current mainstream cache technologies, the thesis finds that when a well-known caching proxy server such as Squid or Apache is running and a user's request misses the cache, its interception-based monitoring adds extra access delay for the user.
     To solve this problem, the thesis puts forward a cache technology based on bypass mirrored monitoring. The technology uses a bypass mirrored port to monitor users' Internet traffic. According to the tendency of users' requests, the system downloads and caches the Web resources that are frequently requested. When the cache system observes a user's request and the requested Web resource has already been cached, the system uses HTTP session hijacking to guide the user to the cache system to obtain the resource, so the user no longer needs to connect to a remote Web server. The technology therefore not only reduces network egress traffic, saves bandwidth charges, and accelerates users' access and transmission speed, but also solves the problem that interception-based cache technology increases user access delay.
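
The decision flow described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `PopularityTracker`, the action names, and the request-count `threshold` are hypothetical, since the abstract does not say how "frequently requested" is measured.

```python
from collections import Counter


class PopularityTracker:
    """Counts mirrored requests per URL and decides what the monitor does."""

    def __init__(self, threshold=3):
        # threshold is a hypothetical tuning knob, not a value from the thesis
        self.threshold = threshold
        self.hits = Counter()
        self.cached = set()

    def observe(self, url):
        """Record one observed request and return the action to take."""
        if url in self.cached:
            return "redirect"      # already cached: steer the user to the cache
        self.hits[url] += 1
        if self.hits[url] >= self.threshold:
            self.cached.add(url)   # a real system would download to disk here
            return "fetch"
        return "pass"              # leave the traffic alone


tracker = PopularityTracker(threshold=2)
actions = [tracker.observe("http://www.example.com/a.zip") for _ in range(3)]
print(actions)
```

On a "pass" the monitor does nothing to the live connection, which is the property the thesis relies on: a cache miss costs the user no extra delay because the mirrored copy of the traffic is only observed, never forwarded.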
     Our HTTP cache system based on bypass mirrored monitoring is designed and implemented on the Windows platform. To achieve mirrored monitoring, the system uses WinPcap to capture raw network packets from the mirrored traffic and, through network protocol analysis and filtering, obtains the information about the Web resources users request. For Web resources that users frequently request, the system uses socket programming to download them from the external network and cache them on disk. To manage and publish the disk cache, the system uses IIS to set up an intranet HTTP server. To guide users to the cached resources, the system packages a response packet containing the cache location and acts as the Web server, so that users obtain the resources from the intranet HTTP server. The system uses Microsoft SQL Server to log and display the Web resource requests of intranet users.
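
Two of the steps in this paragraph, extracting the requested resource from a captured payload and building a response that carries the cache location, can be illustrated at the HTTP level. The sketch below is in Python rather than the WinPcap/C++ environment the thesis used, and it makes assumptions the abstract does not confirm: that the response is an HTTP 302 with a `Location` header, and that only plain `GET` requests are handled. The function names and the cache address are hypothetical. Packet capture and the TCP-level details of injecting the response are out of scope here.

```python
def parse_http_get(payload: bytes):
    """Return the URL requested by an HTTP GET in a TCP payload, or None."""
    head = payload.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    # Request line must look like: GET <target> HTTP/<version>
    if len(parts) != 3 or parts[0] != "GET" or not parts[2].startswith("HTTP/"):
        return None
    host = None
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip()
            break
    if host is None:
        return None
    return "http://" + host + parts[1]


def build_redirect(cache_url: str) -> bytes:
    """Build an HTTP response that points the client at the cached copy."""
    # Assumption: a 302 redirect; the thesis only says the response
    # "contains the cache location".
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {cache_url}\r\n"
        "Content-Length: 0\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


request = (b"GET /img/logo.png HTTP/1.1\r\n"
           b"Host: www.example.com\r\n"
           b"User-Agent: demo\r\n\r\n")
url = parse_http_get(request)
# 10.0.0.5 stands in for the intranet IIS server (hypothetical address)
response = build_redirect("http://10.0.0.5/cache/logo.png")
```

Payloads that are not plain-text GET requests, such as TLS handshakes or POST bodies, make `parse_http_get` return `None`, so the monitor would simply ignore them.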
     At the same time, to make disk cache lookups more efficient, the thesis implements a hash table to store and organize the information about requested resources, and uses a hash lookup algorithm to shorten the system's processing delay. The system also uses cache replacement and expiry detection to improve the cache hit rate and the consistency of cached resources.
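
A hash-indexed cache with replacement and expiry checks might look like the following Python sketch. The abstract does not name the replacement policy or the expiry rule, so least-recently-used eviction and a fixed time-to-live are assumptions made for illustration, as are the names `CacheIndex`, `capacity`, and `ttl`.

```python
import time
from collections import OrderedDict


class CacheIndex:
    """Hash-based index from URL to disk path with LRU eviction and a TTL."""

    def __init__(self, capacity=1000, ttl=3600.0, clock=time.time):
        self.capacity = capacity       # max number of entries (assumed limit)
        self.ttl = ttl                 # seconds before an entry is stale
        self.clock = clock             # injectable for testing
        self.entries = OrderedDict()   # url -> (disk_path, stored_at)

    def put(self, url, disk_path):
        if url in self.entries:
            del self.entries[url]              # refresh an existing entry
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        self.entries[url] = (disk_path, self.clock())

    def get(self, url):
        """Return the disk path for url, or None on a miss or expired entry."""
        item = self.entries.get(url)           # average O(1) hash lookup
        if item is None:
            return None
        disk_path, stored_at = item
        if self.clock() - stored_at > self.ttl:
            del self.entries[url]              # expired: drop and report a miss
            return None
        self.entries.move_to_end(url)          # mark as most recently used
        return disk_path
```

The lookup cost stays roughly constant as the cache grows, which is the reason the thesis gives for choosing a hash table. Treating an expired entry as a miss lets the origin server answer instead, which is one simple way to keep cached resources consistent.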
     Finally, functional and performance tests of the HTTP cache system show that the designed system can monitor users' Internet traffic through mirroring, hijack and redirect users' requests, accelerate access through the intranet cache to reduce user access delay, and display users' resource request status from the SQL Server database. The HTTP cache system based on bypass mirrored monitoring is thus practical and feasible.
