网络安全态势感知中Trie树关键词高速匹配算法研究
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Research on Trie Tree Keyword Fast Matching Algorithm in Network Security Situational Awareness
  • 作者:徐国天 ; 张铭
  • 英文作者:XU Guotian;ZHANG Ming;Cyber Crime Investigation Department, Criminal Investigation Police University of China;The Second Af?liated Hospital of Harbin Medical University;
  • 关键词:态势感知 ; 双数组 ; Trie树 ; 压缩 ; 信息检索
  • 英文关键词:situational awareness;;double-array;;Trie-tree;;compression;;information retrieval
  • 中文刊名:XXAQ
  • 英文刊名:Netinfo Security
  • 机构:中国刑事警察学院网络犯罪侦查系;哈尔滨医科大学附属第二医院;
  • 出版日期:2019-04-10
  • 出版单位:信息网络安全
  • 年:2019
  • 期:No.220
  • 基金:辽宁省自然科学基金[20180550841,2015020091];; 公安部理论及软科学研究计划[2016LLYJXJXY013];公安部技术研究计划[2016JSYJB06];; 辽宁省经济社会发展研究重大课题[2018LSLKTZD-028]
  • 语种:中文;
  • 页:XXAQ201904008
  • 页数:8
  • CN:04
  • ISSN:31-1859/TN
  • 分类号:61-68
摘要
海量数据中关键词高速检索对增强网络安全态势感知系统反应速度,提高系统整体效率和安全性具有重要意义。基于双数组Trie树的网络信息检索算法具有较高的查找效率,但其插入时间复杂度较高,同时叶子结点占用了大量存储空间。为此,文章提出一种基于叶子结点压缩存储的双数组Trie树构造方法,按层次遍历Trie树,将分枝结点存储在基本双数组中,对叶子结点进行压缩后以位图形式存储于压缩数组中。该方法在保留双数组Trie树查询性能的同时,一定程度上提高了插入效率,改善了存储空间利用效率。
        High-speed keyword retrieval in massive data is of great significance for enhancing the response speed of network security situational awareness system and improving the overall efficiency and security of the system.The network information retrieval algorithm based on the double-array Trie-tree has higher search efficiency, but its insertion time complexity is higher, and the leaf nodes consume a lot of storage space.For this reason, this paper proposes a double-array Trie-tree construction method based on leaf node compression storage, traverses the tree hierarchically, stores the branch nodes in the basic double array, compresses the leaf nodes, and stores them in the compressed array.This method not only preserves the query performance of double-array Trie-tree, but also improves the insertion efficiency and storage space utilization efficiency.
引文
[1]TAO Yuan,HUANG Tao,ZHANG Mehan,et al.Research and Development Trend Analysis on Key Technologies of Network Security Situational Awareness[J].Netinfo Security,2018,18(8):79-85.陶源,黄涛,张墨涵,等.网络安全态势感知关键技术研究及发展趋势分析[J].信息网络安全,2018,18(8):79-85.
    [2]WANG Sili,ZHANG Huaping,WANG Bin.Optimization of Dual Array Trie Tree Algorithm and Its Application[J].Journal of Chinese Information Science,2006,20(5):24-30.王思力,张华平,王斌.双数组Trie树算法优化及其应用研究[J].中文信息学报,2006,20(5):24-30.
    [3]LI Jiangbo,ZHOU Qiang,ZHOU Zushun.Research on Fast Query Algorithms for Chinese Dictionaries[J].Journal of Chinese Information Science,2006,20(5):33-39.李江波,周强,陈祖舜.汉语词典快速查询算法研究[J].中文信息学报,2006,20(5):33-39.
    [4]WANG Shikun,LI Shaozi,KE Xiao.Double-array Trie Based on Genetic Algorithm and Idea of Sherwood[J].Computer Engineering and Applications,2009,45(29):128-139.王世坤,李昭滋,柯逍.基于遗传算法和舍伍德思想的双数组Trie树改进[J].计算机工程应用,2009,45(29):128-139.
    [5]YANG Wenchuan,LIU Jian,YU Miao.Optimization of Chinese Word Segmentation Dictionary Algorithms Based on Double Array Trie Tree[J].Computer Engineering and Science,2013,35(9):127-131.杨文川,刘健,于淼.基于双数组Trie树的中文分词词典算法优化研究[J].计算机工程与科学,2013,35(9):127-131.
    [6]LINUX.An Implementation of Double-array Trie[EB/OL].http://linux.thai.net/~thep/datrie/datrie.html,1999-5-15.
    [7]AOE J.An Efficient Digital Search Algorithm by Using a Double-array Structure[J].IEEE Transactions on Software Engineering,1989,15(9):1066-1077.
    [8]THEPPITAK K.An Implementation of Double-array Trie[EB/OL].http://linux.thai.net/~thep/da Trie/da Trie.html,2006-5-15.
    [9]YATA S,OONO M.A Compact Static Double-array Keeping Character Codes[J].Information Processing Letters,2007,35(10):120-126.
    [10]DAI Gengyi,SHE Jingtao.Dictionary Improvement and Implementation Based on Double Array Trie Tree Algorithms[J].Software Guide,2012,11(7):17-19.戴耿毅,佘静涛.基于双数组Trie树算法的字典改进和实现[J].软件导刊,2012,11(7):17-19.
    [11]ZHAO Huan,ZHU Hongquan.Research on Chinese Word Segmentation Based on Double Array Trie Tree[J].Journal of Hunan University,2009,131(5):22-26.赵欢,朱红权.基于双数组Trie树中文分词研究[J].湖南大学学报,2009,131(5):22-26.
    [12]YANG Wenfeng,CHEN Guangying,LI Xing.Automatic Chinese Word Segmentation Dictionary Mechanism Based on PATRICIA Tree[J].Journal of Chinese Information Science,2001,15(3):44-49.杨文峰,陈光英,李星.基于PATRICIA Tree的汉语自动分词词典机制[J].中文信息学报,2001,15(3):44-49.
    [13]WEN Tao,ZHU Qiaoming.A Fast Chinese Word Segmentation Algorithm[J].Computer Engineering,2004,30(19):119-182.温滔,朱巧明.一种快速汉语分词算法[J].计算机工程,2004,30(19):119-182.
    [14]LI Qinghu,CHEN Yujian,SUN Jiaguang.A New Mechanism of Chinese Word Segmentation Dictionary-Double-character Hash Mechanism[J].Journal of Chinese Information Science,2003,17(4):13-18.李庆虎,陈玉健,孙家广.一种中文分词词典新机制一双字哈希机制[J].中文信息学报,2003,17(4):13-18.
    [15]LIAO Min,CHU Yingna,SONG Jihua.Research on Operability of Double Array Trie Tree Index[J].Application of Computer System,2009,18(10):52-56.廖敏,褚颖娜,宋继华.双数组Trie树索引的可操作性研究[J].计算机系统应用,2009,18(10):52-56.
    [16]WANG Ziniu,CAO Lingfei,WANG Yan.Keyword Preprocessing Technology Based on Double Array Trie Tree Method and Its Application in CNC Syntax Verification[J].Journal of Guizhou University,2010,27(1):49-52.王子牛,曹凌菲,王岩.基于双数组Trie树法的关键字预处理技术及其在CNC语法检验中的应用[J].贵州大学学报,2010,27(1):49-52.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700