Application of Artificial Immune Algorithms in Anti-Spam Technology
Abstract
Artificial immune systems (AIS) are an emerging research focus in computational intelligence, with applications in industrial control, pattern recognition, economic management, computer network security, and anti-spam filtering. The biological immune system combines the properties that a mail filter needs, and artificial immune systems inherit and develop these properties more readily than other intelligent systems do, which makes artificial immune principles a natural fit for anti-spam technology. Applying artificial immune principles and algorithms to mail filtering can provide greater diversity, pattern recognition capability, adaptability, and noise tolerance than traditional methods. As a result, artificial immune algorithms have gradually become an active topic in anti-spam technology.
     This thesis takes the principles and algorithms of artificial immune systems as its subject and explores the application of artificial immune algorithms to anti-spam technology. It first summarizes anti-spam technology and the principles of artificial immune systems, and briefly introduces the basic theory of e-mail, the biological background of artificial immunity, and traditional anti-spam techniques. Using a client-side spam filtering system as the platform, it designs a spam filter based on artificial immune principles and studies in detail how artificial immune algorithms and the IP-ISF algorithm are applied in it. It then examines the stability of the IP-ISF algorithm's detection efficiency, proposes an improvement, and verifies the effectiveness of that improvement by implementing the improved algorithm and analyzing the test results. Finally, it proposes a design for an Outlook client spam filtering plug-in based on artificial immune algorithms. The applied results offer a reference for using artificial immune algorithms in other areas, such as filtering spam text messages (SMS).
     The research focuses on artificial immune algorithms and their application to spam filtering. It presents the design and implementation mechanism of a mail filter based on artificial immune principles, studies the application of the IP-ISF immune algorithm, and analyzes in detail the application and implementation of algorithm modules such as the mail community gene library and the feature detectors. It also explores the design and algorithmic flow of the filter and gives a comparative analysis of the experimental results. The main difficulty is that the correct detection rate of the IP-ISF spam classification algorithm is unstable in a dynamic data environment: the filter must improve first-time detection efficiency for new feature types while preserving the advantageous genes already acquired for existing features. The thesis therefore proposes the evolution of immunodominant sites to raise the detection efficiency of newly generated feature detectors, make the evolution more directed, and improve the stability of accurate identification. Tests on a public corpus show that the improved algorithm effectively increases the stability of the correct detection rate in a dynamic data environment, which supports the effectiveness and feasibility of the work.
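The abstract names a gene library and feature detectors but does not give the details of IP-ISF or of the proposed improvement. As a rough illustration only, the following is a minimal sketch of a generic negative-selection filter in Python, not the thesis's algorithm. All function names, parameters, and the token-overlap matching rule are assumptions made for this example.

```python
import random
import re


def tokenize(text):
    """Lowercase word tokens; a stand-in for real feature extraction."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def build_gene_library(spam_messages):
    """Collect candidate 'genes' (here: tokens) seen in known spam."""
    library = set()
    for message in spam_messages:
        library |= tokenize(message)
    return sorted(library)  # sorted list so sampling is reproducible


def generate_detectors(gene_library, ham_messages, n_detectors,
                       genes_per_detector, threshold, rng):
    """Negative selection: build random detectors from the gene library
    and keep only those that match no legitimate (self) message."""
    ham_tokens = [tokenize(m) for m in ham_messages]
    detectors = []
    attempts = 0
    # Bound the number of attempts so the loop always terminates.
    while len(detectors) < n_detectors and attempts < 50 * n_detectors:
        attempts += 1
        candidate = frozenset(rng.sample(gene_library, genes_per_detector))
        if any(len(candidate & tokens) >= threshold for tokens in ham_tokens):
            continue  # self-reactive detector: discard it
        if candidate not in detectors:
            detectors.append(candidate)
    return detectors


def is_spam(message, detectors, threshold):
    """Flag a message when any detector shares enough genes with it."""
    tokens = tokenize(message)
    return any(len(d & tokens) >= threshold for d in detectors)
```

A caller would build the library from labelled spam, generate detectors against labelled legitimate mail, and then pass incoming messages to `is_spam` with the same `threshold`. By construction, no detector matches a legitimate training message, which is the property negative selection is meant to guarantee. Coverage of unseen spam depends on how many detectors are generated, and the thesis's concern with detection stability under changing data is not modelled here.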
