A Dependency-Relevancy-Based Method for Filtering Noisy Logs in Business Processes
  • Official English title: Dependency relevancy driven noise filtering in business process
  • Authors: SUN Xiaoxiao (孙笑笑); ZHANG Lei (张蕾); YU Dongjin (俞东进); PAN Jianliang (潘建梁); HOU Wenjie (侯文杰); WANG Huanqiang (王焕强)
  • Keywords: process mining; model quality; noisy log filtering method; dependency relevancy; business process management
  • Journal code: JSJJ
  • Journal: Computer Integrated Manufacturing Systems
  • Affiliations: School of Computer Science and Technology, Hangzhou Dianzi University; Key Laboratory of Complex Systems Modeling and Simulation, Ministry of Education
  • Publication date: 2019-04-15
  • Publisher: Computer Integrated Manufacturing Systems
  • Year: 2019
  • Issue: v.25; No.252
  • Funding: National Natural Science Foundation of China (61472112); Key Research and Development Project of Zhejiang Province (2017C01010, 2016F50014, 2015C01040)
  • Language: Chinese
  • Page: JSJJ201904020
  • Page count: 9
  • CN: 04
  • ISSN: 11-5946/TP
  • Classification number: 183-191
Abstract
Low-frequency behavior in an event log gives rise to unnecessary structures in the mined process model, and these structures lower the model's fitness and precision. To limit their effect on model quality, a noise log filtering method based on dependency relevancy, named Dependency Relevancy driven noise Filtering (DRF), is proposed. Using the statistical frequencies of events and of the dependency relations between them, the method first defines a local relevancy and a global relevancy for each dependency relation and normalizes the two into a mixed relevancy, which is used to identify noisy log entries. It then removes the noise through trace reachability analysis, so that the other behavior recorded in each trace is preserved as far as possible. Unlike traditional noise filtering algorithms, which discard every trace that contains noise, DRF removes the noisy entries while retaining as much of the remaining non-noisy log as possible.
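The two steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration and not the authors' DRF definitions: the local score (edge count relative to the busier of its two endpoints), the global score (edge count relative to the most frequent edge), the weight `alpha`, the `threshold`, and all function names. The repair step, which skips only the event entered through a noisy edge, is a simplified stand-in for the trace reachability analysis.

```python
from collections import Counter


def edge_counts(log):
    """Count directly-follows pairs (a, b) over all traces in the log."""
    counts = Counter()
    for trace in log:
        counts.update(zip(trace, trace[1:]))
    return counts


def mixed_relevancy(counts, alpha=0.5):
    """Score each edge; the local/global definitions here are assumptions."""
    out_total, in_total = Counter(), Counter()
    for (a, b), n in counts.items():
        out_total[a] += n   # how often a is followed by anything
        in_total[b] += n    # how often b is preceded by anything
    max_count = max(counts.values(), default=1)
    scores = {}
    for (a, b), n in counts.items():
        local = n / max(out_total[a], in_total[b])  # assumed local score
        global_ = n / max_count                     # assumed global score
        scores[(a, b)] = alpha * local + (1 - alpha) * global_
    return scores


def filter_log(log, threshold=0.2, alpha=0.5):
    """Drop events reached through low-scoring edges, keeping the rest of each trace."""
    scores = mixed_relevancy(edge_counts(log), alpha)
    noisy = {edge for edge, score in scores.items() if score < threshold}
    cleaned = []
    for trace in log:
        kept = list(trace[:1])
        for event in trace[1:]:
            if (kept[-1], event) in noisy:
                continue  # skip only this event, not the whole trace
            kept.append(event)
        cleaned.append(kept)
    return cleaned


# Nine regular traces plus one trace with an infrequent extra event "x".
log = [["a", "b", "c"]] * 9 + [["a", "b", "x", "c"]]
cleaned = filter_log(log)
```

In this toy log the edges into and out of `x` score 0.1 and fall below the threshold, so the tenth trace is repaired to `["a", "b", "c"]` and all ten traces are kept, whereas a whole-trace filter would have discarded it.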
