Outlier Detection Based on Grid Ridge Points
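  • English title: An Outlier Detection Based on Grid Ridge
  • Authors: 戴楠; 严悍; 卓勤政; 马玲玲
  • Authors (English): DAI Nan; YAN Han; ZHUO Qinzheng; MA Lingling; School of Computer Science and Technology, Nanjing University of Science and Technology
  • Keywords: ridge point; LOF; grid
  • Chinese journal title: JSSG
  • English journal title: Computer & Digital Engineering
  • Institution: School of Computer Science and Technology, Nanjing University of Science and Technology
  • Publication date: 2019-05-20
  • Publisher: 计算机与数字工程 (Computer & Digital Engineering)
  • Year: 2019
  • Issue: v.47; No.355
  • Language: Chinese
  • Pages: JSSG201905031
  • Page count: 4
  • CN: 05
  • ISSN: 42-1372/TP
  • Classification number: 166-169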
Abstract
LOF (Local Outlier Factor) is one of the more practical and effective outlier detection algorithms in current use, but it tends to consume a great deal of time and memory on large data sets. Existing grid-based outlier detection algorithms reduce this cost to some extent, yet their time and space consumption remains considerable. This paper therefore proposes an outlier detection algorithm based on grid ridge points. The algorithm first partitions the data space into grid cells according to the data distribution, then computes the height of each grid ridge point and selects the regions whose ridge points are low. Finally, LOF detection is run only on those low-ridge regions. Experimental results show that the algorithm runs significantly more efficiently than current grid-based outlier detection algorithms.
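The three stages named in the abstract (grid partitioning, selecting low-height regions, LOF on the selected points only) can be illustrated with a minimal Python sketch. The abstract does not define how a ridge point's height is computed, so this sketch assumes, purely for illustration, that a cell's height is the number of points it contains. The names `grid_cells`, `low_height_candidates`, `lof`, and `detect_outliers`, and the parameters `cell_size`, `min_height`, `k`, and `threshold`, are hypothetical and do not come from the paper. The LOF routine is the standard textbook formulation with ties at the k-th distance ignored.

```python
import math
from collections import defaultdict


def grid_cells(points, cell_size):
    """Map each grid-cell index tuple to the indices of the points inside it."""
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(int(math.floor(c / cell_size)) for c in p)].append(i)
    return cells


def low_height_candidates(points, cell_size, min_height):
    """Indices of points lying in cells whose height is below min_height.

    ASSUMPTION: "height" is taken to be the cell's point count; the paper's
    actual ridge-height definition is not given in the abstract.
    """
    cells = grid_cells(points, cell_size)
    return sorted(i for idx in cells.values() if len(idx) < min_height for i in idx)


def lof(points, i, k):
    """Standard LOF score of points[i], computed against the full data set."""

    def knn(a):
        # The k nearest neighbours of point a as (distance, index) pairs.
        dists = sorted(
            (math.dist(points[a], points[b]), b)
            for b in range(len(points)) if b != a
        )
        return dists[:k]

    def k_dist(a):
        return knn(a)[-1][0]

    def lrd(a):
        # Local reachability density: inverse of the mean reachability distance.
        reach = [max(k_dist(b), d) for d, b in knn(a)]
        return len(reach) / max(sum(reach), 1e-12)

    neigh = knn(i)
    return sum(lrd(b) for _, b in neigh) / (len(neigh) * lrd(i))


def detect_outliers(points, cell_size, min_height, k, threshold):
    """Run LOF only on points from low-height cells; return flagged indices."""
    candidates = low_height_candidates(points, cell_size, min_height)
    return [i for i in candidates if lof(points, i, k) > threshold]
```

The saving in this sketch comes from calling `lof` only on the candidate points rather than on every point. Neighbour search still scans the whole data set, which a practical implementation would presumably restrict to nearby cells.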