Phishing E-mail Detection Method Based on Density and Distance (基于密度与距离的钓鱼邮件检测方法)
  • English title: Phishing E-mail Detection Method Based on Density and Distance
  • Authors: 王秀娟 (WANG Xiujuan); 张晨曦 (ZHANG Chenxi); 唐昊阳 (TANG Haoyang); 陶元睿 (TAO Yuanrui)
  • Affiliation: Faculty of Information Technology, Beijing University of Technology (北京工业大学信息学部)
  • Keywords: machine learning; phishing E-mail; feature extraction; dimensionality reduction; support vector machine (SVM)
  • Journal code: BJGD
  • Journal: Journal of Beijing University of Technology (北京工业大学学报)
  • Publication date: 2019-04-02 13:16
  • Publisher: Journal of Beijing University of Technology (北京工业大学学报)
  • Year: 2019
  • Issue: v.45
  • Funding: National Key Research and Development Program of China (2017YFB0802703); National Natural Science Foundation of China (61602052)
  • Language: Chinese
  • Page: BJGD201906004
  • Page count: 8
  • CN: 06
  • ISSN: 11-2286/T
  • Classification number: 36-43
Abstract
As phishing e-mail detection work extracts ever more features, detection performance shows no clear improvement while the time cost keeps rising. To address this problem, a phishing e-mail detection method based on density and distance is proposed. The method converts the 42 original e-mail features into 2 new features, a density-based feature and a distance-based feature, and then applies a machine learning classification algorithm to detect phishing e-mails. The detection accuracy reaches up to 99.74%, and classification takes only 3.39 s, which is 1/20 of the time required by the traditional algorithm. The experimental results show that the method achieves good detection performance and reduces the time cost.
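The abstract does not say how the two new features are computed, so the following is only a minimal sketch of the general idea of collapsing a 42-dimensional feature vector into a (density, distance) pair. Everything specific here is an assumption made for illustration and not the authors' definition: the density feature is taken to be the number of known phishing samples within a fixed radius, the distance feature is taken to be the Euclidean distance to the centroid of the known phishing samples, and the data are synthetic Gaussian vectors. The function names and the `radius` parameter are hypothetical.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def density_feature(x, reference, radius):
    """Assumed definition: count of reference vectors within `radius` of x."""
    return sum(1 for r in reference if math.dist(x, r) <= radius)


def distance_feature(x, center):
    """Assumed definition: Euclidean distance from x to a class centroid."""
    return math.dist(x, center)


def reduce_features(x, reference, center, radius):
    """Map one original feature vector to a (density, distance) pair."""
    return (density_feature(x, reference, radius), distance_feature(x, center))


# Synthetic stand-in data: 42 features per e-mail, two well-separated classes.
# Real e-mail features would replace these lists.
DIM = 42
rng = random.Random(0)
legit = [[rng.gauss(0.0, 0.1) for _ in range(DIM)] for _ in range(20)]
phish = [[rng.gauss(1.0, 0.1) for _ in range(DIM)] for _ in range(20)]

# Reference set and centroid are built from the known phishing samples.
phish_center = centroid(phish)
RADIUS = 2.0  # hypothetical neighbourhood radius

reduced_phish = [reduce_features(x, phish, phish_center, RADIUS) for x in phish]
reduced_legit = [reduce_features(x, phish, phish_center, RADIUS) for x in legit]

for label, rows in (("phish", reduced_phish), ("legit", reduced_legit)):
    density, distance = rows[0]
    print(f"{label}: density={density}, distance={distance:.2f}")
```

Under these assumptions each e-mail is described by two numbers instead of 42, so any downstream classifier (the keywords mention a support vector machine) works on a much smaller input, which is the kind of time saving the abstract reports.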
