WebShell Detection Method Based on Random Forest (基于随机森林的WebShell检测方法)
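  • English title: Webshell Detection Method Based on Random Forest
  • Author: 秦英 (Qin Ying)
  • English author: QIN Ying; Wuhan Research Institute of Posts and Telecommunications; FiberHome Communications Science & Technology Development Co., Ltd.
  • Keywords: WebShell; random forest; combined features; feature selection
  • English keywords: WebShell; random forest; combination features; feature selection
  • Chinese journal name: XTYY
  • English journal name: Computer Systems & Applications
  • Institution: Wuhan Research Institute of Posts and Telecommunications; 南京烽火星空通信发展有限公司
  • Publication date: 2019-02-15
  • Publisher: 计算机系统应用 (Computer Systems & Applications)
  • Year: 2019
  • Issue: v.28
  • Language: Chinese
  • Page: XTYY201902037
  • Page count: 6
  • CN: 02
  • ISSN: 11-2854/TP
  • Classification number: 242-247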
Abstract
WebShells fall into several types according to function and size, and each type has distinctive features beyond those shared by all WebShells. Most existing WebShell detection methods extract features from a single layer, so they cannot cover the features of every type; they are biased toward particular types, perform poorly when applied to all types without distinction, and generalize weakly. To address this, a WebShell detection method based on random forest is proposed. In the data preprocessing stage, the method extracts statistical features from the text layer, along with sequence features from both the text-layer source code and the bytecode (opcode) at the compilation-result layer, and combines them into a more comprehensive feature set. Fisher feature selection then keeps an appropriate proportion of the most important features, reducing the feature dimension and forming the sample feature set. Finally, a random forest classifier is trained on the samples to obtain the detection model. Experiments show that the method detects WebShells effectively and outperforms single-layer WebShell detection models in accuracy, recall, and false alarm rate.
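The Fisher feature selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which Fisher score variant or selection ratio the paper uses, so the code below assumes one common formulation (class-size-weighted between-class scatter of the class means divided by class-size-weighted within-class variance). The function names `fisher_scores` and `select_top_features`, the `ratio` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column.

    Assumed formulation: between-class scatter of the class means divided
    by the within-class variance, both weighted by class size.
    """
    n_features = len(samples[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, y in zip(column, labels) if y == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature that is constant inside every class has zero
        # within-class variance; rank it first if the class means differ.
        if within == 0.0:
            scores.append(float("inf") if between > 0.0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def select_top_features(samples, labels, ratio):
    """Keep the highest-scoring fraction `ratio` of feature columns."""
    scores = fisher_scores(samples, labels)
    keep = max(1, round(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original column order
    return kept, [[row[j] for j in kept] for row in samples]


# Toy data (made up for illustration): three hypothetical feature columns
# for four files; label 1 = WebShell, label 0 = benign.
X = [
    [0.9, 10.0, 3.0],
    [0.8, 12.0, 1.0],
    [0.1, 11.0, 2.0],
    [0.2, 9.0, 2.0],
]
y = [1, 1, 0, 0]
kept_columns, X_reduced = select_top_features(X, y, ratio=0.34)
```

On this toy data only the first column separates the two classes clearly, so it is the one retained. In a full pipeline along the lines the abstract describes, the reduced matrix would then be passed to a random forest classifier for training.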
