Research on Association Rules for University Emergencies and Key Prediction Technologies Based on GEP and Complex Networks
Abstract
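In recent years, emergencies at Chinese universities have occurred frequently, with serious consequences for the institutions themselves and for society as a whole. Mining the associations between the predisposing factors and the result factors of these emergencies is therefore important for preventing them effectively.
     Traditional association rule mining algorithms, however, must scan the database repeatedly during mining and generate large numbers of candidate sets, so they are relatively inefficient. How to build an efficient association rule mining algorithm for university emergency data sets is thus a problem in urgent need of a solution.
     To address these shortcomings, this thesis introduces GEP (gene expression programming) and complex networks into association rule mining. A complex network is used to build a model of university emergencies, and GEP's simple gene encoding and its ability to search globally for optimal solutions are used to optimize the raw data for association rule mining. The aim is to reduce mining time and thereby compensate for the weaknesses of traditional algorithms. The main work is as follows:
     First, based on the characteristics of university emergencies, the attributes of each event are extracted and divided into predisposing factors and result factors; the data are then cleaned and integrated into a single database.
     Second, based on the characteristics of university emergencies, the nodes and edges of the complex network are determined and a network model of university emergencies is constructed.
     Third, on this basis, the gene expression programming optimization algorithm is introduced to search globally for optimal solutions, and the GEP-ECD (Event Network Community Division Algorithm Based on the GEP) community division algorithm is proposed. It divides the complex network emergency model into communities and outputs community tables.
     Fourth, a new algorithm, GCAR (Association Rules Algorithm Based on the GEP and Complex Network), is proposed. It mines local association rules in each community table, using confidence and support as the measures of strong association rules, to obtain valid rules that provide a basis for predicting university emergencies.
     Fifth, these rules are applied to a worked case to build a prediction model for university emergencies; experiments show that the model achieves good results.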
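The second step above does not say how nodes and edges are chosen. As a minimal sketch, assume each factor label is a node and that two factors are linked by an edge whose weight counts the events in which they appear together. The records, the factor names, and the `build_network` helper are all hypothetical and serve only to illustrate the idea; they are not the thesis's model.

```python
from collections import Counter
from itertools import combinations

# Hypothetical event records: each event is a set of factor labels.
events = [
    {"exam_stress", "dispute", "injury"},
    {"exam_stress", "dispute"},
    {"food_hygiene", "illness"},
    {"food_hygiene", "illness", "class_suspension"},
]


def build_network(records):
    """Return (nodes, weighted_edges) for a factor co-occurrence network."""
    nodes = set()
    edges = Counter()
    for record in records:
        nodes.update(record)
        # Each unordered pair of factors in one event adds 1 to that edge.
        for a, b in combinations(sorted(record), 2):
            edges[(a, b)] += 1
    return nodes, edges


nodes, edges = build_network(events)
```

Under this assumption, factors that repeatedly occur in the same events end up joined by heavy edges, which is the kind of structure a community division step could then exploit.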
University emergencies have occurred frequently in recent years, seriously affecting both the schools themselves and society as a whole. Mining the potential associations between the predisposing factors and the result factors of university emergencies is of great value for preventing campus emergencies effectively.
     However, traditional association rule mining algorithms need to scan the database several times during mining and produce a large number of candidate sets along the way, which makes them inefficient. How to build an efficient association rule mining algorithm for campus emergency data sets is therefore a pressing problem.
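The cost described above can be seen in a simplified Apriori-style level-wise search. The abstract does not name a specific traditional algorithm, so Apriori is used here only as a familiar representative, and the pruning step is omitted for brevity. The function name and the `scans` counter are illustrative assumptions.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Level-wise search; returns (frequent itemsets with counts, counting passes)."""
    n = len(transactions)
    frequent = {}
    scans = 0
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        scans += 1  # one full counting pass over the data per level
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
    return frequent, scans


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
frequent, scans = apriori_frequent_itemsets(transactions, min_support=0.5)
```

Even on four toy transactions the search makes one pass per itemset size and generates a candidate that turns out to be infrequent, which is the overhead the thesis sets out to reduce.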
     In response to the shortcomings of traditional association rule mining algorithms, this thesis introduces GEP (Gene Expression Programming) and complex networks into association rule mining. A complex network is used to construct a model of university emergencies, and GEP's simple gene encoding and its ability to search globally for optimal solutions are combined to optimize the original data for association rule mining, in order to reduce mining time and so make up for the weaknesses of traditional algorithms. The main work of this thesis is as follows:
     Firstly, based on the features of campus emergencies, the attributes of each emergency are extracted and divided into predisposing factors and result factors; the data are then cleaned, and data from different sources are integrated into one database.
     Secondly, according to the characteristics of campus emergencies, the nodes and edges of the complex network are determined, and a complex network model of university emergencies is constructed.
     Thirdly, on this basis, the gene expression programming algorithm is introduced to search globally for optimal solutions, and the GEP-ECD community partitioning algorithm is proposed; it divides the complex network model of emergencies into communities and outputs a table for each community.
     Fourthly, the GCAR algorithm is proposed for mining local association rules in each community table. Confidence and support are taken as the evaluation criteria for mining effective rules, and these rules provide the basis for predicting campus emergencies.
     Fifthly, these rules are applied to a worked instance to establish a prediction model for university emergencies, and the experiments show that the model achieves good results.
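The fourth step above can be sketched as follows, assuming a community table is simply the list of event records assigned to one community and that each rule links a single predisposing factor to a single result factor. The GEP-based community division itself is not reproduced; the table, the factor names, the thresholds, and the `mine_rules` helper are hypothetical.

```python
def mine_rules(records, causes, results, min_support, min_confidence):
    """Return [(cause, result, support, confidence)] for one community table."""
    n = len(records)
    rules = []
    for cause in sorted(causes):
        cause_count = sum(1 for r in records if cause in r)
        if cause_count == 0:
            continue  # confidence is undefined when the cause never occurs
        for result in sorted(results):
            both = sum(1 for r in records if cause in r and result in r)
            support = both / n               # share of events with cause and result
            confidence = both / cause_count  # share of cause events with the result
            if support >= min_support and confidence >= min_confidence:
                rules.append((cause, result, support, confidence))
    return rules


# Hypothetical community table produced by some earlier division step.
community_table = [
    {"exam_stress", "injury"},
    {"exam_stress", "injury"},
    {"exam_stress"},
    {"dispute", "injury"},
]
rules = mine_rules(
    community_table,
    causes={"exam_stress", "dispute"},
    results={"injury"},
    min_support=0.3,
    min_confidence=0.6,
)
```

Because each call looks only at one community's records, the counting work is confined to a small local table, which matches the stated aim of reducing mining time.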
