Improved Method of Computer Virus Signature Automatic Extraction Based on N-Gram (基于N-Gram的计算机病毒特征码自动提取的改进方法)
  • English title: Improved Method of Computer Virus Signature Automatic Extraction Based on N-Gram
  • Authors: 杨燕 (YANG Yan); 蒋国平 (JIANG Guo-ping)
  • Affiliations: School of Computer Science and Technology, Nanjing University of Posts and Telecommunications; School of Automation, Nanjing University of Posts and Telecommunications
  • Keywords: N-Gram; virus signature; signature concentration; data dictionary
  • Journal code: JSJA
  • Journal: Computer Science (计算机科学)
  • Publication date: 2017-11-15
  • Publisher: Computer Science (计算机科学)
  • Year: 2017
  • Issue: v.44
  • Language: Chinese
  • Pages: JSJA2017S2072
  • Page count: 5
  • CN: S2
  • ISSN: 50-1075/TP
  • Classification number: 348-351+371
Abstract
With the development and spread of computer technology, the harm caused by computer viruses has grown increasingly serious. The traditional N-Gram algorithm has difficulty extracting features of different lengths, which leads to missing effective features and produces a huge feature set that wastes storage space. To address these problems, an improved N-Gram-based method for automatic signature extraction is proposed. Building on the original N-Gram feature extraction algorithm, the method introduces variable-length N-Gram features, extracts effective features of different lengths, and generates variable-length virus signatures. Taking the correlation of feature frequencies into account, it uses signature concentration to filter the N-Gram features in a directed way and generates a data dictionary, saving storage space. Experimental results show that, compared with algorithms that use only fixed-length N-Grams, the proposed method effectively reduces the false positive rate of automatic signature extraction.
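The pipeline outlined in the abstract (collect byte N-Grams of several lengths, score them, keep the high-scoring ones in a dictionary) can be illustrated with a minimal sketch. The abstract does not define signature concentration, so the score used here is an assumption made only for illustration: the fraction of malware samples containing an N-Gram minus the fraction of benign samples containing it. The function names, the `n_values` lengths, and the `threshold` parameter are likewise hypothetical and are not taken from the paper.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int):
    """Yield every contiguous n-byte window of data."""
    for i in range(len(data) - n + 1):
        yield data[i:i + n]


def sample_counts(samples, n_values):
    """Count, for each n-gram, how many samples contain it at least once."""
    counts = Counter()
    for data in samples:
        seen = set()
        for n in n_values:  # variable-length: several n values per sample
            seen.update(byte_ngrams(data, n))
        counts.update(seen)
    return counts


def build_dictionary(malware, benign, n_values=(2, 3, 4), threshold=0.5):
    """Keep n-grams whose assumed 'concentration' score meets the threshold.

    ASSUMPTION: the score below is a stand-in for the paper's signature
    concentration, which the abstract does not define.
    """
    mal = sample_counts(malware, n_values)
    ben = sample_counts(benign, n_values)
    n_mal = max(len(malware), 1)  # avoid division by zero on empty input
    n_ben = max(len(benign), 1)
    dictionary = {}
    for gram, hits in mal.items():
        score = hits / n_mal - ben.get(gram, 0) / n_ben
        if score >= threshold:
            dictionary[gram] = score
    return dictionary
```

Counting each N-Gram once per sample, rather than once per occurrence, keeps a single sample with long repeated padding from dominating the score. A real extractor would also need to merge overlapping N-Grams into longer signatures, which this sketch leaves out.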
