Malicious PDF Document Detection Based on Mixed Features
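  • English title: Malicious PDF document detection based on mixed feature
  • Authors: DU Xuehui; LIN Yangdong; SUN Yi
  • Authors (English): DU Xuehui; LIN Yangdong; SUN Yi; Henan Provincial Key Laboratory of Information Security, Information Engineering University
  • Keywords: malicious PDF document; mixed feature; machine learning; detection
  • Keywords (English): malicious PDF document; mixed feature; machine learning; detection
  • Journal name (Chinese): TXXB
  • Journal name (English): Journal on Communications
  • Institution: Henan Provincial Key Laboratory of Information Security, PLA Strategic Support Force Information Engineering University
  • Publication date: 2019-02-25
  • Publisher: Journal on Communications (通信学报)
  • Year: 2019
  • Issue: v.40; No.382
  • Funding: National High Technology Research and Development Program of China ("863" Program) (No.2015AA016006); National Natural Science Foundation of China (No.61702550)
  • Language: Chinese
  • Pages: TXXB201902014
  • Page count: 11
  • CN: 02
  • ISSN: 11-2102/TN
  • Classification number: 122-132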
Abstract
To address the poor feature robustness and susceptibility to evasion of existing malicious PDF document detection schemes, a detection method based on mixed features is proposed. The method uses combined dynamic and static analysis to extract regular information, structural information, and API call information from a document. A feature extraction method based on the K-means clustering algorithm then aggregates the core mixed features that characterize document security, which improves the robustness of the features. On this basis, a classifier is built with the random forest algorithm, and experiments are designed to examine the scheme's detection performance and its ability to resist mimicry attacks.
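The abstract does not say exactly how K-means is applied to the extracted features. As a minimal sketch of one plausible reading, the Python below clusters feature columns that behave alike across documents and replaces each cluster with its per-document mean. The function names, the farthest-point initialisation, the mean-based aggregation, and the toy data are all assumptions made for illustration; they are not taken from the paper, and the random forest step is omitted.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(values) / len(vectors) for values in zip(*vectors)]


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's k-means; returns one cluster index per point.

    Assumption: centroids are seeded deterministically with a
    farthest-point rule so the sketch is reproducible.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels


def aggregate_features(rows, k):
    """Cluster feature columns, then average each cluster per document.

    rows: one list of numeric feature values per document.
    Returns (aggregated_rows, column_labels).
    """
    columns = [list(col) for col in zip(*rows)]
    labels = kmeans(columns, k)
    aggregated = []
    for row in rows:
        merged = []
        for c in range(k):
            values = [v for v, label in zip(row, labels) if label == c]
            if values:
                merged.append(sum(values) / len(values))
        aggregated.append(merged)
    return aggregated, labels


# Hypothetical toy data: 4 documents x 4 features, where features 0-1
# and features 2-3 behave alike across documents.
rows = [[9, 8, 0, 1], [8, 9, 1, 0], [0, 1, 9, 8], [1, 0, 8, 9]]
aggregated, labels = aggregate_features(rows, k=2)
print(labels)
print(aggregated)
```

The aggregated rows could then be passed to any classifier. Merging columns that move together is one way to make a feature set less sensitive to changes in a single raw feature, though whether this matches the paper's design is not confirmed by the abstract.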
