用户名: 密码: 验证码:
一种针对DFA状态爆炸的正则表达式匹配方法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:A Regular Expression Matching Method for DFA State Explosion
  • 作者:王翔 ; 卢毓海 ; 马伟 ; 刘燕兵
  • 英文作者:WANG Xiang;LU Yuhai;MA Wei;LIU Yanbing;School of Cyber Security,University of Chinese Academy of Sciences;Institute of Information Engineering,Chinese Academy of Sciences;National Engineering Laboratory for Information Security Technologies;
  • 关键词:正则表达式 ; 确定有限自动机 ; 状态爆炸 ; 子串抽取 ; 匹配引擎
  • 英文关键词:regular expression;;Deterministic Finite Automaton(DFA);;state explosion;;substring extraction;;matching engine
  • 中文刊名:JSJC
  • 英文刊名:Computer Engineering
  • 机构:中国科学院大学网络空间安全学院;中国科学院信息工程研究所;信息内容安全技术国家工程实验室;
  • 出版日期:2018-03-13 17:10
  • 出版单位:计算机工程
  • 年:2019
  • 期:v.45;No.499
  • 基金:国家重点研发计划(2016YFB0800303);; 中国科学院信息工程研究所基础前沿项目(Y7Z0351101)
  • 语种:中文;
  • 页:JSJC201904026
  • 页数:9
  • CN:04
  • ISSN:31-1289/TP
  • 分类号:154-162
摘要
针对基于确定有限状态自动机的匹配引擎在大规模、复杂规则下会出现状态爆炸的问题,提出正则表达式子串抽取算法。通过将子串抽取算法应用于DFA状态爆炸场景,设计基于子串抽取的正则匹配引擎。实验结果表明,该算法在单个规则上运行时间可达10 ms量级,抽取率高达99%,同时匹配引擎具有较好的稳定性和可拓展性,且匹配速度优于相关开源匹配引擎。
        Aiming at the problem that the Deterministic Finite Automation(DFA)-based matching engine will explode under large-scale and complex rules,a regular expression substring extraction algorithm is proposed.A substring extraction based regular matching engine is designed by applying the substring extraction algorithm to the DFA state explosion scenario.Experimental results show that the algorithm can run on a single rule up to 10 ms,and the decimation rate is as high as 99%.At the same time,the matching engine has better stability and scalability,and the matching speed is better than the related open source matching engine.
引文
[1] 褚衍杰,李云照,魏强.一种改进的多模式匹配算法[J].西安电子科技大学学报(自然科学版),2014(6):174-180.
    [2] 赵毅,朱鹏,迟学斌,等.浅析高性能计算应用的需求与发展[J].计算机研究与发展,2007,44(10):1640-1646.
    [3] XU C,CHEN S,SU J,et al.A survey on regular expression matching for deep packet inspection:applications,algorithms,and hardware platforms[J].IEEE Communications Surveys and Tutorials,2016,18(4):2991-3029.
    [4] 刘燕兵.高性能正则表达式匹配技术研究[D].北京:中国科学院大学,2012.
    [5] BECCHI M,CADAMBI S.Memory-efficient regular expression search using state merging[C]//Proceedings of the 26th IEEE International Conference on Computer Communications.Washington D.C.,USA:IEEE Press,2007:1064-1072.
    [6] FICARA D,GIORDANO S,PROCISSI G,et al.An improved DFA for fast regular expression matching[J].ACM SIGCOMM Computer Communication Review,2008,38(5):29-40.
    [7] FICARA D,PIETRO A D,GIORDANO S,et al.Differential encoding of DFAs for fast regular expression matching[J].IEEE/ACM Transactions on Networking,2011,19(3):683-694.
    [8] KONG S,SMITH R,ESTAN C.Efficient signature matching with multiple alphabet compression tables[C]//Proceedings of the 4th International Conference on Security and Privacy in Communication Networks.New York,USA:ACM Press,2008.
    [9] LIU Y,GUO L,LIU P,et al.Compressing regular expressions’ DFA table by matrix decomposition[C]//Proceedings of the 15th International Conference on Implementation and Application of Automata.Berlin,Germany:Springer,2010:282-289.
    [10] SMITH R,ESTAN C,JHA S.XFA:faster signature matching with extended automata[C]//Proceedings of IEEE Symposium on Security and Privacy.Washington D.C.,USA:IEEE Computer Society,2008:187-201.
    [11] SMITH R D.Toward robust network payload inspection[D].Madison,USA:University of Wisconsin at Madison,2009.
    [12] YANG Y E,PRASANNA V K.Space-time tradeoff in regular expression matching with semi-deterministic finite automata[C]//Proceedings of IEEE International Conference on Computer Communications.Washington D.C.,USA:IEEE Press,2011:1853-1861.
    [13] MAJUMDER A,RASTOGI R,VANAMA S.Scalable regular expression matching on data streams[C]//Proceedings of ACM SIGMOD International Conference on Management of Data.New York,USA:ACM Press,2008:161-172.
    [14] ROHRER J,ATASU K,LUNTEREN J,et al.Memory-efficient distribution of regular expressions for fast deep packet inspection[C]//Proceedings of the 7th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis.New York,USA:ACM Press,2009:147-154.
    [15] WATSON B W.A new regular grammar pattern matching algorithm[J].Theoretical Computer Science,1996,299(1/2/3):364-377.
    [16] NAVARRO G.NR-grep:a fast and flexible pattern-matching tool[J].Practice and Experience,2001,31(13):1265-1312.
    [17] YANG X,QIU T,WANG B,et al.Negative factor:improving regular-expression matching in strings[J].ACM Transactions on Database Systems,2016,40(4).

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700