Research on an Incremental Association Rule Mining Algorithm and Its Application to Mobile Phone Virus Detection
Abstract
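Mobile phone viruses are spreading on a large scale, and mobile communication networks urgently need active defenses against them. In response, this thesis explores and implements the application of incremental association rule mining to mobile phone virus detection. The topic comes from the enterprise-commissioned project "Mobile Phone Virus Detection System" (《手机病毒检测系统》). The thesis covers the research and implementation of the project's association rule mining module and provides the project with a solution for mobile phone virus detection.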
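     The research work of the thesis covers the following aspects. 1. It summarizes the definition and characteristics of mobile phone viruses and surveys the various kinds of harm they cause, along with the main mobile phone virus prevention techniques currently used on the network side. 2. It summarizes data mining technology, in particular the basic concepts of association rule mining and its general mining steps; classifies association rule mining algorithms by whether they generate candidate itemsets and compares the execution characteristics, advantages, and disadvantages of the two approaches; analyzes the characteristics of existing algorithms for mining multi-valued attribute association rules, which this project involves, and how they differ from ordinary association rule mining algorithms; and reviews the existing objective measures for association rules, analyzing the characteristics, advantages, and disadvantages of each. 3. Based on an analysis of the Apriori and FUP algorithms, and in view of their shortcomings and the characteristics of the project's data, it adopts a new database operation method and an incremental update technique, proposes and describes improvements to both algorithms, and verifies the effect of the improvements experimentally. 4. Based on a study of the functional characteristics of the association rule mining module within the mobile phone virus detection system, it designs and implements all related submodules of the association rule module and analyzes the experimental test results.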
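For readers unfamiliar with candidate-generating mining, the following is a minimal sketch of textbook Apriori (join, prune, count). It is not the thesis's improved algorithm, whose details the abstract does not give. The function name `apriori`, the `min_count` parameter, and the traffic attributes in `example` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for itemsets seen in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items with one scan.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one scan of the data per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


# Hypothetical records: each set lists attributes observed in one traffic sample.
example = [
    {"sms_burst", "mms_attach", "night_traffic"},
    {"sms_burst", "mms_attach"},
    {"sms_burst", "night_traffic"},
    {"mms_attach", "night_traffic"},
    {"sms_burst", "mms_attach", "night_traffic"},
]
frequent_itemsets = apriori(example, min_count=3)
```

Each level costs one full scan of the data. That per-level scan is the usual reason given for optimizing support counting or for avoiding candidate generation altogether.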
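     The main contributions of the thesis are as follows. First, it proposes an improved association rule algorithm based on support counting over pre-arranged attributes, and an improved association rule update algorithm that uses candidate frequent itemsets; these two improvements raise the efficiency of association rule mining and incremental updating to a certain extent. The improved algorithms can be widely applied in the association rule mining module of mobile phone virus detection systems and have important application prospects in active defense for mobile communication networks. After experimental testing, deployment of the mobile phone virus detection system confirmed that the improved algorithms significantly improve association rule mining performance and that the association rule mining module plays an important role in mobile phone virus detection. The experimental results also show that the association rule module detects multiple kinds of viruses with an accuracy above 90%.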
Smartphone viruses are becoming more destructive and spreading faster, which strains limited wireless network resources and threatens information privacy. An antivirus mechanism tailored to mobile communication networks is therefore of significant importance. This thesis first explores applying an incremental association rule mining algorithm to smartphone virus detection, and then presents an effective method to implement it. The work is supported by the enterprise-commissioned project Smart Phone Viruses Detection System, and its main task is to study and implement the Association Rule Mining (ARM) module in order to provide a cellphone virus detection solution.
     The main work of this thesis is as follows. First, it presents the definition and key features of smartphone viruses and classifies the viruses by the damage they cause; it also surveys the mainstream anti-phone-virus techniques used on the network side. Second, it presents the basic theory of ARM algorithms and their general steps, sorts existing algorithms into two categories according to whether they generate candidate itemsets, and compares their execution characteristics, advantages, and disadvantages. It also analyzes the multi-valued attributes that arise in the supporting project, as well as the existing mining algorithms for such attributes. ARM algorithms aimed at multi-valued attributes are found to differ from common ARM algorithms in several basic respects, which shows the importance of a new measure of association rules. To this end, several commonly used measures are analyzed and the features of each are presented. Third, improved versions of the Apriori and FUP algorithms are put forward, based on a new database operation technique, an incremental update technique, and the features of the data collected from the supporting project. Tests show that the improved algorithms bring a significant improvement in execution efficiency. Finally, the design and implementation framework of the ARM module in the smartphone virus detection system is presented, and all the test results are elaborated.
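The abstract does not name the rule measures it analyzes. As an illustration only, the sketch below computes three widely used objective measures (support, confidence, and lift) for a rule A → C. The function name `rule_measures` and the sample attributes are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    transactions = [frozenset(t) for t in transactions]
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)

    n_a = sum(1 for t in transactions if antecedent <= t)    # contain A
    n_c = sum(1 for t in transactions if consequent <= t)    # contain C
    n_ac = sum(1 for t in transactions
               if antecedent <= t and consequent <= t)       # contain A and C

    support = n_ac / n if n else 0.0           # P(A and C)
    confidence = n_ac / n_a if n_a else 0.0    # P(C | A)
    # Lift compares P(C | A) with the baseline P(C); 1.0 means independence.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical records, one attribute set per traffic sample.
samples = [
    {"sms_burst", "mms_attach", "night_traffic"},
    {"sms_burst", "mms_attach"},
    {"sms_burst", "night_traffic"},
    {"mms_attach", "night_traffic"},
    {"sms_burst", "mms_attach", "night_traffic"},
]
support, confidence, lift = rule_measures(samples, {"sms_burst"}, {"mms_attach"})
```

In this sample the rule has fairly high support and confidence but a lift slightly below 1, which is the kind of case where support and confidence alone can overstate a rule's interest.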
     The contribution of this work is two algorithms that bring a great improvement in association rule mining performance and incremental update performance: an improved ARM algorithm based on support statistics over prearranged attributes, and an improved ARM algorithm based on candidate frequent itemsets. These two improved algorithms can be used in the ARM module of a cellphone virus detection system and show broad application prospects in active defense mechanisms for mobile communication networks.
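The incremental update idea behind FUP can be sketched as follows. This is a simplified illustration of the basic pruning rule, not the thesis's improved update algorithm. It assumes `old_frequent` holds the complete set of frequent itemsets of the old database, with their counts, mined at the same `min_support`. All names and the sample data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def count_in(transactions, itemsets):
    """Count how many transactions contain each itemset (one scan)."""
    counts = {s: 0 for s in itemsets}
    for t in transactions:
        for s in itemsets:
            if s <= t:
                counts[s] += 1
    return counts


def fup_update(old_db, old_frequent, increment, min_support):
    """Update frequent itemsets after `increment` is appended to `old_db`."""
    old_db = [frozenset(t) for t in old_db]
    increment = [frozenset(t) for t in increment]
    min_support = Fraction(min_support)  # exact arithmetic for thresholds
    total_min = min_support * (len(old_db) + len(increment))
    inc_min = min_support * len(increment)

    result, k = {}, 1
    candidates = ({frozenset([i]) for t in increment for i in t}
                  | {s for s in old_frequent if len(s) == 1})
    while candidates:
        inc_counts = count_in(increment, candidates)
        level, rescan = {}, set()
        for s in candidates:
            if s in old_frequent:
                # Old winner: its old count is known, so only the
                # increment needs to be scanned.
                total = old_frequent[s] + inc_counts[s]
                if total >= total_min:
                    level[s] = total
            elif inc_counts[s] >= inc_min:
                # Old loser: it can be frequent overall only if it is
                # frequent within the increment, so rescan just these.
                rescan.add(s)
        if rescan:
            old_counts = count_in(old_db, rescan)
            for s in rescan:
                total = old_counts[s] + inc_counts[s]
                if total >= total_min:
                    level[s] = total
        result.update(level)
        k += 1
        # Apriori-style join and prune over the updated frequent itemsets.
        candidates = {
            a | b for a, b in combinations(list(level), 2)
            if len(a | b) == k
            and all(frozenset(sub) in level
                    for sub in combinations(a | b, k - 1))
        }
    return result


# Hypothetical data: old records, their known frequent itemsets, new records.
old_records = [{"sms", "mms"}, {"sms", "mms"}, {"sms", "bt"}, {"mms"}]
known = {frozenset({"sms"}): 3, frozenset({"mms"}): 3,
         frozenset({"sms", "mms"}): 2}
new_records = [{"sms", "bt"}, {"mms", "bt"}]
updated = fup_update(old_records, known, new_records, Fraction(1, 2))
```

The old database is rescanned only for itemsets that were not frequent before but are frequent within the increment. When the increment is small relative to the old data, this keeps the update much cheaper than mining the combined data from scratch.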
