基于正则表达式的深度包检测研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
随着专门针对应用层攻击现象的增多,传统的状态检测防火墙有效性越来越低。防火墙的功能重心从网络层发展到了应用层,因而诞生了深度包检测技术。深度包检测技术不仅检测网络层和传输层数据包头部,而且深入到应用层数据包的有效载荷所封装的内容中,搜寻合法或非法的内容以决定是否允许数据包通过。
     随着深度包检测技术的发展,传统上用于过滤数据包内容的模式集合(包含模式的匹配串)逐渐被正则表达式集合所代替。例如Linux的应用协议分类器L7-filter(Linux Application Protocol Classifier),通过基于正则表达式的模式集合识别应用层的数据包;Snort、Bro等入侵检测系统也已将正则表达式应用于它的规则集当中。
     然而,虽然正则表达式在模式匹配时比字符串表现得更优异,但在现有的网络应用中,一个典型的模式集合往往由上百个正则表达式和数以万计的状态数组成,将模式集合构造成一个有限自动机,所需的内存可达几百兆,甚至几G,结果导致了基于正则表达式的深度包检测的响应时间过长,极大地影响了检测效率。目前,如何提高基于正则表达式的深度包检测技术的效率,在国内外都尚处于探索阶段。本文所进行的研究正是在该背景下展开的。
     本文首先在分析传统防火墙工作原理的基础上,介绍了采用深度包检测技术的新一代智能防火墙。然后在详细说明数据包过滤技术和入侵防护检测技术的基础上,阐述了深度包检测技术的工作原理。通过对常见模式匹配算法的优缺点的分析,本文提出了一种新的基于正则表达式的匹配算法。在深入分析了DFA(DFA,Deterministic Finite Automaton)状态数对算法性能影响的基础上,本文进一步提出了构造最优DFA状态数的算法,该算法保证在任意有限的系统资源下算法的时间复杂度最小。作者已经在Linux环境下实现了该算法,并对基于L7-filter模式集合的网络数据包进行了大量检测实验。实验数据表明,与已有算法相比该算法的时间复杂度最小。
Traditional stateful firewall can't provide enough protection against application-level attacks. The function of firewall moved from the network layer to the application layer and DPI (Deep Packet Inspection) technology was developed. DPI technology examines not only the header but also the contents of packets from the application level.
    Traditional string set based DPI technology is being replaced by regular expression set based technology. For example, in Linux Application Protocol Classifier (L7-filter), all protocol identifiers are expressed as regular expressions. Similarly, Snort and Bro intrusion detection systems also use regular expressions as pattern language.
    Although regular expression is effective and flexible, in current network application, a typical set of regular expressions contains hundreds of regular expressions and tens of thousands of DFA (Deterministic Finite Automaton) states which result in a storage requirement of hundreds of megabytes, even more than gigabytes. Thus the response time of regular expression based DPI algorithm increases and its performance degrades dramatically. Nowadays, how to improve the efficiency of regular expression based DPI technology is still under development all over the world.
    Based on the analysis of traditional firewall, the author introduced DPI technology base firewall. The operating principle of DPI was described from the viewpoints of packet filtering and intrusion detection. By analyzing the merits and demerits of the classical pattern matching algorithms, a new pattern matching algorithm based regular expression which was proposed in this paper. Based on the analysis of the impact of number of DFA states to the algorithm performance, further improvement to the algorithm was made by introducing a DFA state number optimization algorithm. The propose algorithm has been implemented in Linux environment and lots of experiments have been done. Experimental results show that the performance of the proposed algorithm is much better than others.
引文
[1].http://www.tj.xinhuanet.com/2006-08/25/content_7869753_2.html 2006年全国信息网络安全状况调查分析 2006.08
    [2].柳岸等.基于包过滤技术的网络安全的研究.计算机应用 2006(09)
    [3].王钢等.应用网关防火墙——网络的“中间检查站”.计算机安全 2004(04)
    [4].刘更楼等.基于状态检测的防火墙系统研究.航空计算技术 2004(01)
    [5].王栋.防火墙深度包检测技术研究.西安电子科技大学 2005
    [6] A. V. Aho and M. J. Corasick, "Efficient string matching: An aid to bibliographic search," Comm. of the ACM, 18(6): 333-340, 1975.
    [7] B. Commentz-Walter, "A string matching algorithm fast on the average," Proc. of ICALP, pages 118-132, July 1979.
    [8] S. Wu, U. Manber, "A fast algorithm for multi-pattern searching," Tech. R. TR-94-17, Dept. of Comp. Science, Univ of Arizona, 1994.
    [9] N. Tuck, T. Sherwood, B. Calder, and G. Varghese, "Deterministic memory-efficient string matching algorithms for intrusion detection," IEEE Infocom 2004, pp. 333—340.
    [10] L. Tan, and T. Sherwood, "A High Throughput String Matching Architecture for Intrusion Detection and Prevention," ISCA 2005.
    [11] S. Yusuf and W. Luk, "Bitwise Optimised CAM for Network Intrusion Detection Systems," IEEE FPL 2005.
    [12] N.J. Larsson, "Structures of string matching and data compression," PhD thesis, Dept. of Computer Science, Lund University, 1999.
    [13] S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. Lockwood, "Deep Packet Inspection using Parallel Bloom Filters," IEEE Hot Interconnects 12, August 2003. IEEE Computer Society Press.
    [14] Z. K. Baker, V. K. Prasanna, "Automatic Synthesis of Efficient Intrusion Detection Systems on FPGAs," in Field Prog. Logic and Applications, Aug. 2004, pp. 311-321.
    [15] Y. H. Cho, W. H. Mangione-Smith, "Deep Packet Filter with Dedicated Logic and Read Only Memories," Field Prog. Logic and Applications, Aug. 2004, pp. 125-134.
    [16] R. Sommer, V. Paxson, "Enhancing Byte-Level Network Intrusion Detection Signatures with Context," ACM conf. on Computer and Communication Security, 2003, pp. 262—271.
    [17] J. E. Hopcroft and J. D. Ullman, "Introduction to Automata Theory, Languages, and Computation," Addison Wesley, 1979.
    [18] J. Hopcroft, "An nlogn algorithm for minimizing states in a finite automaton," in Theory of Machines and Computation, J. Kohavi, Ed. New York: Academic, 1971, pp. 189—196.
    [19] R. Sidhu and V. K. Prasanna, "Fast regular expression matching using FPGAs," In IEEE Symposium on Field-Programmable Custom Computing Machines, Rohnert Park, CA, USA, April 2001.
    [20] C. R. Clark and D. E. Schimmel, "Efficient reconfigurable logic circuit for matching complex network intrusion Conference on Field Program.
    [21] J. Moscola, et. al, "Implementation of a content-scanning module for an internet firewall," IEEE Workshop on FPGAs for Custom Comp. Machines, Napa, USA, April 2003.
    [22]. http://www.nwfusion.com/reviews/2004/0216ips.html.Network Intrusion-Prevention Systems 2004
    [23].南湘浩,陈钟.网络安全技术概论.北京:国防工业出版社.2003.325-327.
    [24]. D.Denning, "An Intrusion-Detection Model," IEEE Trans Software Eng. 2003, Vol. 13: 222-223.
    [25]. Cisco lOS IPS Deployment Guide, www.cisco.com
    [26]. Jean-Paul Tremblay, Paul G. Sorenson, "The theory and practice of compiler writing", McGraw-Hill, 1985
    [27]. K. Thompson, "Regular expression search algorithm", Communications of the ACM, 11: 419-422, 1968.
    [28]. Alfred V. Aho, Ravi Sethi, Jeffrey D.Ullman, "Compilers, principles, techniques, and tools", Addison-Wesley, 1986
    [29].卢开澄,“计算机算法导引”,清华大学出版社,1996
    [30]. G.A. Stephen. String searching algorithms. World Scientific. Singapore u. a, 1994
    [31]. Knuth, D.E., Morris, J.H., and Pratt, V.R., "Fast pattern matching in strings" SIAM J. Computing 6(2) pp. 323-350, 1977.
    [32]. Boyer R. S, Moore J S. A fast string searching algorithm. Communication of ACM, 1997,10.20
    [33]. Aho A, Corasick M.EfFicient string matching, an aid to bibliographic search[J]. Comm ACM, 1975,(18):33-40.
    [34]. Coit C J, Staniford S, Mc Alerney J. Towards faster pattern matching for intrusion detection or exceeding the speed of snort[C]. In: DARPA Information Survivability Conference and Exposition, 2001.
    [35]. J. Levandoski, E. Sommer, and M. Strait, "Application Layer Packet Classifier for Linux." http://17-filter.sourceforge.net/.
    [36]. V. Paxson et al.,"Flex:A fast scanner generator." http://www.gnu.org/software/flex/.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700