Application of Data Mining to Alarm Correlation Analysis in Telecommunication Networks
Abstract
As network technology has developed, network resources have become heterogeneous and dynamic, and network management functions have grown increasingly complex. Traditional network management techniques can no longer meet the needs of managing large, complex networks. Telecommunication networks, with their complex structure and enormous scale, generate a large number of alarms every day. An alarm is an abnormal, harmful event, usually an automatically detected fault, that gives network managers certain information. Because a single fault can trigger a whole series of alarms, not every alarm indicates the cause of the fault, which makes it very difficult for a fault management system to locate network faults accurately. Traditional network management systems and network administrators can rely only on their own limited experience and on the limited functions of the management system to diagnose, locate, and recover from faults. As networks keep expanding and evolving rapidly, this knowledge is no longer sufficient.
Researchers in China and abroad have studied network alarms extensively, and a variety of methods have been applied to the correlation analysis of network fault alarms. Among them, data mining methods have been widely studied and applied.
This thesis first gives an overview of the concept, functions, and basic process of data mining. It then introduces the basic structure of telecommunication networks and the characteristics of telecommunication network alarm data. Finally, it presents the application of data mining techniques to alarm correlation analysis in telecommunication networks, describing association rule mining algorithms and sequential pattern mining algorithms in detail.
Association rule mining is a commonly used approach. Apriori is one of the most classic algorithms in the field and one of the most influential algorithms for mining frequent itemsets for Boolean association rules. The core idea of the association rule algorithm is a recursive method based on frequent-set theory, carried out as a level-wise iterative search. Apriori generates a large number of candidate sets and scans the database many times. To address these problems, Jiawei Han et al. proposed another classic algorithm, FP-growth, in 2000. Based on the FP-tree (frequent pattern tree), it adopts a divide-and-conquer strategy and mines all frequent itemsets without generating candidate itemsets. The thesis compares the two algorithms and analyzes experimental results obtained on alarm data from a network management center of Wuhan Telecom.
Sequential pattern mining is an extension of association rule mining. A sequence is a data collection made up of many ordered events, and sequential pattern mining is an important branch of data mining, used to extract frequent subsets of an ordered collection in one-dimensional space. If the network alarm database is viewed as an ordered collection arranged by time, sequential pattern mining can be used to discover frequently occurring alarm sequence patterns, from which alarm association rules can be derived. This thesis adopts FSPMFP (Frequent Sequential Pattern Mining based on Frequent Tree), an FP-tree-based algorithm for mining frequent alarm sequence patterns. Its basic idea is to improve the FP-tree so that the alarm data are compressed into a single frequent pattern tree, to search that tree bottom-up for frequent alarm itemsets, and finally to mine the temporal relationships among the alarms.
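As a rough illustration of the level-wise search that the abstract attributes to Apriori, the following Python sketch groups a time-ordered alarm log into fixed windows and mines frequent alarm sets from them. It is not the thesis's implementation and does not implement FP-growth or FSPMFP. The alarm names, the 60-second window, the support threshold, and the helper names `windows_to_transactions` and `apriori` are all assumptions made for illustration.

```python
from itertools import combinations


def windows_to_transactions(alarms, window):
    """Group (timestamp, alarm_type) events into fixed-width time windows."""
    buckets = {}
    for ts, alarm in alarms:
        buckets.setdefault(ts // window, set()).add(alarm)
    return [frozenset(items) for _, items in sorted(buckets.items())]


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in >= min_support windows."""
    # Level 1: count single alarms with one scan of the data.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates,
        # keeping only those whose (k-1)-subsets are all frequent (pruning).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count step: one more scan of the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical alarm log: (seconds since start, alarm type).
alarms = [
    (0, "LINK_DOWN"), (2, "LOS"), (5, "AIS"),
    (61, "LINK_DOWN"), (63, "LOS"),
    (120, "LINK_DOWN"), (125, "LOS"), (130, "AIS"),
    (185, "HIGH_TEMP"),
]
transactions = windows_to_transactions(alarms, window=60)
frequent = apriori(transactions, min_support=2)

# Confidence of the candidate rule LINK_DOWN -> LOS.
pair = frozenset({"LINK_DOWN", "LOS"})
confidence = frequent[pair] / frequent[frozenset({"LINK_DOWN"})]
```

Each pass of the `while` loop builds a new candidate set and rescans every window, which is the cost the abstract notes as Apriori's weakness and the motivation for tree-based methods such as FP-growth.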
