A Multi-Level Intrusion Detection Method Based on KNN Outlier Detection and Random Forests

Original title (Chinese): 基于KNN离群点检测和随机森林的多层入侵检测方法
Authors: Ren Jiadong (任家东); Liu Xinqian (刘新倩); Wang Qian (王倩); He Haitao (何海涛); Zhao Xiaolin (赵小林)
Affiliations: School of Information Science and Engineering, Yanshan University; Hebei Key Laboratory of Software Engineering (Yanshan University); School of Software, Beijing Institute of Technology; Beijing Key Laboratory of Software Security Engineering Technology (Beijing Institute of Technology)
Keywords: network security; intrusion detection system; KNN outlier detection; random forests model; multi-level
Journal code: JFYZ
Journal: Journal of Computer Research and Development (计算机研究与发展)
Publication date: 2019-03-15
Publisher: Journal of Computer Research and Development (计算机研究与发展)
Year: 2019
Volume: 56
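Funding: National Key Research and Development Program of China (2016YFB0800700); National Natural Science Foundation of China (61472341, 61772449, 61572420); Natural Science Foundation of Hebei Province (F2016203330, F2015203326); Yanshan University Postdoctoral Merit-Based Research Funding Project (B2017003005); Yanshan University Doctoral Foundation (B1036)
Language: Chinese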
Record ID: JFYZ201903012
Page count: 10
Issue: 03
CN: 11-1777/TP
Pages: 116-125

Abstract

Intrusion detection systems can effectively detect abnormal attack behavior in a network and are essential to network security. Many current intrusion detection methods achieve low detection rates on the attack types Probe (probing), U2R (user to root), and R2L (remote to local). To address this weakness, a new hybrid multi-level intrusion detection method, called KNN-RF, is proposed to classify network behavior as normal or abnormal. It combines a KNN (K nearest neighbors) outlier detection algorithm with a multi-level random forests (RF) model. First, the KNN outlier detection algorithm detects and deletes outliers in each category, yielding a small, high-quality training dataset. Next, based on the similarity of network traffic, a method for dividing the data categories for detection is proposed; this division avoids mutual interference among abnormal behaviors during detection, particularly when detecting low-traffic attacks. Based on this division, a multi-level random forests model is constructed to detect abnormal network behavior, improving the detection of known and unknown attacks. The widely used KDD (knowledge discovery and data mining) Cup 1999 dataset is used to evaluate the proposed method. Compared with other algorithms, the proposed method achieves markedly higher accuracy and detection rates and effectively detects the Probe, U2R, and R2L attack types.
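The per-category outlier removal step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring rule, the value of k, or the removal threshold, so the choices below are assumptions made only for illustration: each sample is scored by its mean Euclidean distance to its k nearest neighbours within the same class, and a fixed fraction of the highest-scoring samples in each class is dropped. The function names `knn_outlier_scores` and `remove_outliers_per_class`, and the parameters `k` and `drop_fraction`, are hypothetical and do not come from the paper.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by its mean distance to its k nearest neighbours.

    Assumption: mean k-NN distance is used as the outlier score; the paper's
    exact scoring rule is not stated in the abstract.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def remove_outliers_per_class(samples, labels, k=3, drop_fraction=0.1):
    """Within each class, drop the drop_fraction of samples with the largest scores.

    Assumption: a fixed per-class fraction serves as the threshold.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    keep = []
    for idxs in by_class.values():
        if len(idxs) <= k:
            keep.extend(idxs)  # too few samples to score reliably; keep all
            continue
        scores = knn_outlier_scores([samples[i] for i in idxs], k)
        n_drop = int(len(idxs) * drop_fraction)
        ranked = sorted(zip(scores, idxs))  # ascending outlier score
        keep.extend(i for _, i in ranked[:len(idxs) - n_drop])

    keep.sort()  # preserve the original sample order
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data: two classes, each with one point far away from the rest.
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10),
           (5, 5), (5, 6), (6, 5), (6, 6), (-20, 5)]
labels = ["normal"] * 5 + ["probe"] * 5

clean_x, clean_y = remove_outliers_per_class(samples, labels, k=2, drop_fraction=0.2)
print(len(clean_x), clean_y.count("normal"), clean_y.count("probe"))
```

Scoring inside each class, rather than over the whole dataset, keeps rare attack categories from being discarded simply because they lie far from the bulk of normal traffic. The filtered training set would then be passed to the multi-level random forests stage, which is not sketched here because the abstract does not specify how the categories are divided across levels.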

References

[1] Lee W, Stolfo S J, Mok K W. A data mining framework for building intrusion detection models[C]//Proc of the 20th IEEE Symp on Security & Privacy. Piscataway, NJ: IEEE, 1999: 120-132
[2] Roesch M. Snort-lightweight intrusion detection for networks[C]//Proc of the 13th USENIX Conf on System Administration. Berkeley, CA: USENIX Association, 1999: 229-238
[3] Om H, Kundu A. A hybrid system for reducing the false alarm rate of anomaly intrusion detection system[C]//Proc of the 1st Int Conf on Recent Advances in Information Technology. Piscataway, NJ: IEEE, 2012: 131-136
[4] Raman M R G, Somu N, Kirthivasan K, et al. An efficient intrusion detection system based on hypergraph-genetic algorithm for parameter optimization and feature selection in support vector machine[J]. Knowledge-Based Systems, 2017, 134: 1-12
[5] Khammassi C, Krichen S. A GA-LR wrapper approach for feature selection in network intrusion detection[J]. Computers & Security, 2017, 70: 255-277
[6] Aljawarneh S, Aldwairi M, Yassein M B. Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model[J]. Journal of Computational Science, 2018, 25: 152-160
[7] George A. Anomaly detection based on machine learning dimensionality reduction using PCA and classification using SVM[J]. International Journal of Computer Applications, 2012, 47(21): 5-8
[8] Hashem S H. Efficiency of SVM and PCA to enhance intrusion detection system[J]. Journal of Asian Scientific Research, 2013, 3(4): 381-395
[9] Cheng Xiaoxu, Yu Haitao, Li Zi. Improved K-means network intrusion detection algorithm[J]. Intelligent Computer & Applications, 2012, 2(2): 21-23 (in Chinese) (程晓旭, 于海涛, 李梓. 改进的K-means网络入侵检测算法[J]. 智能计算机与应用, 2012, 2(2): 21-23)
[10] Alyaseen W L, Othman Z A, Nazri M Z A. Hybrid modified K-means with C4.5 for intrusion detection systems in multiagent systems[J]. The Scientific World Journal, 2015, 2015(2): 294761
[11] Alyaseen W L, Othman Z A, Nazri M Z A. Intrusion detection system based on modified K-means and multi-level support vector machines[C]//Proc of the 1st Int Conf on Soft Computing in Data Science. Berlin: Springer, 2015: 265-274
[12] Alyaseen W L, Othman Z A, Nazri M Z A. Multi-level hybrid support vector machine and extreme learning machine based on modified K-means for intrusion detection system[J]. Expert Systems with Applications, 2017, 67: 296-303
[13] Roshan S, Miche Y, Akusok A, et al. Adaptive and online network intrusion detection system using clustering and extreme learning machines[J]. Journal of the Franklin Institute, 2018, 355(4): 1752-1779
[14] Enamul K, Hu Jiankun, Wang Hua, et al. A novel statistical technique for intrusion detection systems[J/OL]. Future Generation Computer Systems. 2017[2017-11-08]. https://www.researchgate.net/publication/313034553_A_novel_statistical_technique_for_intrusion_detection_systems
[15] Breiman L. Random forests[J]. Machine Learning, 2001, 45(1): 5-32
[16] Gogoi P, Bhattacharyya D K, Borah B, et al. MLH-IDS: A multi-level hybrid intrusion detection method[J]. Computer Journal, 2014, 57(4): 602-623
[17] Guo Yutong, Wang Yan, Qin Mengyuan, et al. DPI & DFI: A malicious behavior detection method combining deep packet inspection and deep flow inspection[J]. Procedia Engineering, 2017, 174: 1309-1314
[18] Hoque M S, Mukit M A, Bikas M A N. An implementation of intrusion detection system using genetic algorithm[J]. International Journal of Network Security & Its Applications, 2012, 4(2): 109-120
[19] Ambwani T. Multi class support vector machine implementation to intrusion detection[C]//Proc of the 14th Int Joint Conf on Neural Networks. Piscataway, NJ: IEEE, 2003: 2300-2305
