Research on Intrusion Detection Based on Feature Selection and Clustering
Abstract
With the development of computer and communication technology, computers are used ever more widely, and network security problems have become correspondingly prominent. Traditional protective measures such as firewalls and data encryption can no longer fully meet the needs of network security. Intrusion detection is a newer security technology. Unlike traditional measures, it is based on active defense: it can detect intrusions and anomalies before the network system is harmed and respond accordingly. The key to intrusion detection is collecting data effectively and analyzing the observed behavior. However, the steady growth in attacks and the sheer volume of network data make intrusion detection difficult. Data mining offers a useful tool for this task. Earlier intrusion detection methods based on data mining require that the training data be labeled and that the samples be "clean". Clustering is an unsupervised learning method that can build a detection model or find anomalous data on an unlabeled data set, which overcomes this limitation of traditional data mining methods.
     Against this background, this paper studies intrusion detection based on clustering. It first introduces and analyzes intrusion detection technology and clustering, and discusses the application of clustering algorithms to intrusion detection. The traditional fuzzy C-means clustering algorithm has known problems when applied to intrusion detection, such as sensitivity to initial values and a tendency to become trapped in local optima. To address them, the paper introduces a particle swarm optimization algorithm with a crossover operation to optimize it, yielding an improved fuzzy C-means algorithm. Experiments on data from the KDD CUP 1999 data set show that the improved algorithm achieves good intrusion detection performance.
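The abstract does not give the details of the improved algorithm, so the following is only a minimal sketch of the standard fuzzy C-means updates that it starts from, not the author's method. The function names, the toy data, and the defaults (`m=2.0`, the stopping tolerance) are assumptions made for illustration. The random choice of starting centers in `fcm` is the step that makes plain fuzzy C-means sensitive to initialization; an optimizer such as particle swarm optimization could plausibly use `objective` as its fitness function, although the abstract does not say so.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(points, centers, m=2.0):
    """Standard FCM membership update; u[i][j] is point i's degree in cluster j."""
    eps = 1e-12  # avoids division by zero when a point coincides with a center
    u = []
    for p in points:
        d = [max(sq_dist(p, c), eps) for c in centers]
        # u_ij = 1 / sum_k (d_ij / d_ik)^(1/(m-1)), where d holds squared distances
        u.append([1.0 / sum((dj / dk) ** (1.0 / (m - 1.0)) for dk in d) for dj in d])
    return u

def update_centers(points, u, m=2.0):
    """Each center is the mean of all points, weighted by u_ij^m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append([sum(wi * p[k] for wi, p in zip(w, points)) / total
                        for k in range(dim)])
    return centers

def objective(points, centers, u, m=2.0):
    """FCM cost J = sum_i sum_j u_ij^m * ||x_i - c_j||^2 (lower is better)."""
    return sum(u[i][j] ** m * sq_dist(p, c)
               for i, p in enumerate(points)
               for j, c in enumerate(centers))

def fcm(points, n_clusters, m=2.0, iters=100, tol=1e-6, seed=0):
    """Alternate the two updates until the cost stops changing."""
    rng = random.Random(seed)
    # Random start: the result of plain FCM can depend on this choice.
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    prev = float("inf")
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
        cost = objective(points, centers, u, m)
        if abs(prev - cost) < tol:
            break
        prev = cost
    return centers, memberships(points, centers, m)

# Toy data (hypothetical): two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_u = fcm(toy_points, n_clusters=2)
toy_labels = [max(range(2), key=lambda j: row[j]) for row in toy_u]
```

In an intrusion detection setting, each point would be a preprocessed connection record, and the membership degrees could be thresholded to flag records that fit no cluster well; that use is an assumption here, not something the abstract specifies.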
     Feature selection is widely used to reduce dimensionality and remove irrelevant features. It usually serves as a preprocessing step for classification: by eliminating irrelevant and redundant features, it avoids the curse of dimensionality, speeds up computation, and lowers computational cost. Intrusion detection data are high-dimensional and have complex features, so feature selection is well worth applying here. This paper proposes a feature selection method based on clustering and particle swarm optimization. Experiments on the KDD CUP 1999 data set show that the method speeds up feature selection and that the selected feature subset yields good classification results.
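The abstract does not describe how clustering and particle swarm optimization are combined, so the sketch below shows only a generic binary particle swarm search over feature masks, not the proposed method. Everything named here is an assumption for illustration: the function `binary_pso`, its parameter defaults, the inertia weight `w`, and especially `toy_fitness`, which stands in for whatever subset-quality measure (for example one derived from clustering or classifier accuracy) a real system would use.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=20, iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best mask seen by each particle
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # best mask seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp to keep the sigmoid unsaturated
                vel[i][d] = v
                # sigmoid(v) is the probability that this bit becomes 1
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Hypothetical stand-in fitness: pretend features 0, 3 and 5 are informative,
# reward selecting them, and penalize every extra feature in the subset.
RELEVANT = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    extras = sum(mask) - hits
    return hits - 0.5 * extras

best_mask, best_fit = binary_pso(toy_fitness, n_features=8)
selected = [i for i, bit in enumerate(best_mask) if bit]
```

With the KDD CUP 1999 data, `n_features` would be the number of record attributes and `fitness` would score a candidate subset on held-out data; the penalty on subset size in `toy_fitness` is one simple way to favor smaller subsets.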