Intrusion Detection Technology Based on Autoencoder Networks and Clustering

English title: Intrusion Detection Technology Based on Self-coded Networks and Clustering
Authors: ZHOU Kang; WAN Liang
Keywords: fuzzy C-means; genetic algorithm; restricted Boltzmann machine; autoencoder network; feature dimensionality reduction; bidirectional mapping
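Chinese journal name (code): WJFZ
English journal name: Computer Technology and Development
Affiliations: School of Computer Science and Technology, Guizhou University; Institute of Software and Theory, Guizhou University
Publication date: 2018-12-21 17:51
Publisher: Computer Technology and Development (计算机技术与发展)
Year: 2019
Volume and issue: v.29; No.265
Funding: Science Foundation of Guizhou Province (黔科合J字[2011](2328), 黔科合LH字[2014](7634))
Language: Chinese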
Record ID: WJFZ201905023
Page count: 5
Issue: 05
CN: 61-1450/TP
Pages: 113-117

Abstract

Intrusion detection methods based on the fuzzy C-means (FCM) clustering algorithm tend to become trapped in local optima, are constrained by time and space complexity, have a low detection rate, and are prone to the "curse of dimensionality" when applied to the original data set. To address these problems, a clustering model (AN-GA-FCM) is proposed that combines feature dimensionality reduction by an autoencoder network (AN) with an FCM algorithm optimized by a genetic algorithm (GA). The model uses multiple layers of restricted Boltzmann machines (RBMs) to map high-dimensional, nonlinear data bidirectionally to a low-dimensional space, building an autoencoder network from the high-dimensional space to the low-dimensional space; the network weights are then fine-tuned so that the low-dimensional data reconstruct the optimal high-dimensional representation. The genetic algorithm optimizes the initial cluster centers of FCM, which keeps the objective function from becoming trapped in a local optimum. The dimensionality-reduced data set is classified with GA-FCM and evaluated on the KDD'99 data set. Experimental comparison with traditional algorithms such as PCA, SVM, and Softmax shows that the model achieves higher intrusion detection accuracy and shorter classification and detection time.
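
The abstract does not give the algorithmic details, so the following is only a minimal NumPy sketch of the GA-FCM stage under stated assumptions. The autoencoder dimensionality reduction is assumed to have been done already and is stood in for by synthetic 2-D data. The function names, the GA operators (truncation selection, uniform crossover over whole centers, Gaussian mutation), and all parameter values are illustrative choices, not the authors' settings.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Standard FCM membership update: u[i, k] is the degree to which sample i belongs to cluster k."""
    # Squared Euclidean distances, shape (n_samples, n_clusters); the floor avoids division by zero.
    d2 = np.maximum(((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), 1e-12)
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_objective(X, centers, m=2.0):
    """FCM objective J = sum_i sum_k u_ik^m * ||x_i - c_k||^2, using the memberships implied by the centers."""
    u = fcm_memberships(X, centers, m)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(((u ** m) * d2).sum())

def fcm(X, centers, m=2.0, max_iter=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    for _ in range(max_iter):
        um = fcm_memberships(X, centers, m) ** m
        new_centers = (um.T @ X) / um.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(X, centers, m)

def ga_initial_centers(X, n_clusters, pop_size=20, generations=30,
                       mutation_scale=0.1, m=2.0, seed=0):
    """Hypothetical GA that searches for initial centers with a low FCM objective.

    Assumes pop_size >= 4 so that at least two elite parents exist.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Each individual is one candidate set of centers, drawn from the data points.
    pop = np.stack([X[rng.choice(n, n_clusters, replace=False)] for _ in range(pop_size)])
    sigma = mutation_scale * X.std(axis=0)  # per-feature mutation step
    for _ in range(generations):
        cost = np.array([fcm_objective(X, ind, m) for ind in pop])
        elite = pop[np.argsort(cost)[: pop_size // 2]]  # truncation selection: keep the better half
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.choice(len(elite), 2, replace=False)]
            mask = rng.random(n_clusters) < 0.5  # uniform crossover: take each center from a or b
            child = np.where(mask[:, None], a, b)
            child = child + rng.normal(0.0, sigma, size=child.shape)  # Gaussian mutation
            children.append(child)
        pop = np.concatenate([elite, np.stack(children)])
    cost = np.array([fcm_objective(X, ind, m) for ind in pop])
    return pop[np.argmin(cost)]

# Synthetic stand-in for the dimensionality-reduced features: two well-separated blobs.
data_rng = np.random.default_rng(1)
X = np.concatenate([data_rng.normal(0.0, 0.3, (50, 2)),
                    data_rng.normal(3.0, 0.3, (50, 2))])

init = ga_initial_centers(X, n_clusters=2)  # GA picks the starting centers
centers, u = fcm(X, init)                   # FCM refines them
labels = u.argmax(axis=1)                   # hard assignment from the fuzzy memberships
```

Seeding FCM from the best individual found by the GA, rather than from a single random draw, is one way to reduce the sensitivity to initialization that the abstract describes; on real traffic data the population size, number of generations, and fuzzifier `m` would need tuning.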

