Mining High-Utility Association Rules
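Abstract
Association rule mining aims to discover associations or correlations among itemsets in large volumes of data. It is one of the core topics of data mining research and has been widely applied in scientific research, telecommunication networks, market and risk management, customer relationship management (CRM), inventory control, the military, and other fields. However, traditional association rules use support to measure the importance of an itemset, so they miss rules whose support is low but whose utility is high, rules that users may well find interesting. The high-utility association rules studied in this paper compensate for the inability of traditional association rules to express itemset utility; they reflect user preferences and better serve decision making. The paper focuses on algorithms for mining high-utility association rules in large high-dimensional datasets, addressing the inability of existing utility-based algorithms to handle such data effectively. Combining the characteristics of utility and support, it also formulates the problem of utility and support based association rule mining, together with algorithms for it, which can discover more rules of interest to users. The main contributions are:
     (1) Proposes Inter-transaction, a new algorithm for mining high-utility long itemsets in large high-dimensional datasets. The algorithm is based on row enumeration: it obtains long itemsets directly by intersecting long transactions, without growing them step by step from short itemsets. In high-dimensional datasets, long transactions share few items, so transactions shrink quickly under intersection and this row-enumeration approach converges well. Inter-transaction also brings the partition method into utility mining and scans the database only twice, which suits large high-dimensional data. In addition, a new pruning strategy avoids generating and testing a large number of candidates.
     (2) Proposes a hybrid algorithm that searches for high-utility itemsets in two directions. Existing utility-based association rule mining algorithms use an Apriori-like search strategy and must scan the database many times. When patterns are long and the dataset is large, the I/O burden becomes too heavy. The hybrid algorithm searches from both the top and the bottom. It decomposes the task of finding all high-utility itemsets into two easier subproblems, finding high-utility long itemsets and finding high-utility short itemsets, and then applies a different algorithm to each, avoiding the lengthy process of extending short itemsets step by step into long ones.
     (3) Proposes a method for optimizing the intersection of long transactions. The algorithm for mining high-utility long itemsets stores each transaction in two formats, a horizontal item-vector (HIV) and a horizontal item-list (HIL), and uses the information in the HIL data to reduce the number of bit-level logical AND operations. The number of AND operations then equals the length of the HIL data and is independent of the length of the bit vector (the HIV format). This space-for-time trade-off solves the problem that transaction intersection slows down as bit vectors grow longer, and it maintains high performance in high-dimensional settings. The same optimization can also improve the efficiency of vertical mining algorithms when mining long frequent patterns.
     (4) Formulates the problem of utility and support based association rule mining. Support and utility reflect the statistical and semantic properties of an itemset, respectively. However, how interesting something is to people (or how important it is to them) depends not only on its objective factors (such as the support of an itemset) but also on people's subjective factors (such as their differing understandings of utility). To overcome the shortcomings of a single measure (support or utility), the paper proposes a new measure of itemset importance: motivation. The motivation of an itemset is defined as the product of its support and its utility; it reflects the likelihood that a user obtains a given utility, or how much utility can be obtained with a given likelihood. In utility and support based association rule mining, mining high-motivation itemsets avoids losing itemsets with low support but fairly high utility, or low utility but fairly high support, and so discovers more rules of interest to users.
     (5) Proves that motivation has two important mathematical properties: the upper bound property and the transaction-weighted motivation downward closure property. Based on these two properties, two algorithms for mining high-utility frequent itemsets are designed, HM-Miner and HM-Two-Phase-Miner. Both use an Apriori-like bottom-up search and are suited to datasets with short patterns. HM-Miner prunes using the upper bound property, whereas HM-Two-Phase-Miner prunes using the transaction-weighted motivation downward closure property.
     (6) Presents an application system for high-utility association rule mining and applies it to market basket analysis. The system outputs the support, utility, and motivation of each association rule (itemset), so that support-based and high-utility association rule mining can be compared. Results on real data show that high-utility association rule mining discovers interesting patterns that support-based association rules cannot, helping retailers identify high-utility product combinations and promote sales of high-profit products. With suitable data transformation, the system can also be applied in other domains. In web page analysis, for example, the number of visits to a page and the time spent browsing it can serve as measures of its popularity, turning the web mining problem into a high-utility itemset mining problem.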
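The intersection optimization in contribution (3) can be illustrated with a small sketch. This is only one plausible reading of the idea, not the author's implementation: each transaction keeps both an item-list (HIL) and a bit vector (HIV), and an intersection probes the other transaction's bit vector once per entry of the shorter item-list. The names `Transaction`, `make_transaction`, and `intersect`, and the byte-array layout of the bit vector, are assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Transaction(NamedTuple):
    hil: Tuple[int, ...]  # horizontal item-list: sorted indices of the items present
    hiv: bytes            # horizontal item-vector: bit i is set iff item i is present


def make_transaction(items: Iterable[int], n_items: int) -> Transaction:
    """Build both representations of one transaction over n_items possible items."""
    hil = tuple(sorted(set(items)))
    bits = bytearray((n_items + 7) // 8)
    for i in hil:
        bits[i >> 3] |= 1 << (i & 7)
    return Transaction(hil, bytes(bits))


def intersect(a: Transaction, b: Transaction) -> List[int]:
    """Return the items common to a and b.

    One bit test (a single AND) is made per entry of the shorter item-list,
    so the work does not depend on the length of the bit vectors.
    """
    if len(b.hil) < len(a.hil):
        a, b = b, a
    return [i for i in a.hil if b.hiv[i >> 3] & (1 << (i & 7))]


t1 = make_transaction([1, 5, 9, 200], n_items=256)
t2 = make_transaction([3, 5, 17, 80, 200], n_items=256)
print(intersect(t1, t2))  # [5, 200]
```

Storing both formats costs extra memory per transaction, which is the space-for-time trade-off the abstract refers to.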
Association rule mining (ARM) aims to extract interesting correlations between items and is one of the most important tasks in data mining. ARM has been widely used in fields such as scientific research, telecommunication networks, market and risk management, customer relationship management, inventory control, and the military. However, traditional ARM, which uses support to measure the significance of a rule, cannot discover rules that have low support but high utility, and these rules may be very important to users. High-utility association rule mining (HUARM), the main subject of this paper, overcomes the shortcomings of traditional ARM in reflecting the semantic importance of an itemset. It reflects users' preferences and provides better support for decision making. In this paper, we study utility-based association rule mining in large high-dimensional data; our approach addresses the problem that existing utility mining algorithms cannot handle large high-dimensional datasets. By integrating the advantages of support and utility, we also propose a new task, utility and support based association rule mining (USBARM), which can find more interesting rules. The main contributions of the paper are:
     (1) Proposed a novel algorithm, Inter-transaction, to mine high-utility long itemsets from large high-dimensional data. The algorithm is based on row enumeration, which finds high-utility long itemsets directly by intersecting long transactions, without extending short itemsets step by step. In high-dimensional data, long transactions share few common items, and the intersection of multiple long transactions usually yields a very short itemset (an intersection transaction), so this kind of row enumeration converges well. By adopting the partition method, the algorithm needs to scan the database only twice and works well on large high-dimensional data. In addition, new pruning strategies filter out a large number of candidates.
     (2) Proposed a hybrid algorithm that searches for high-utility itemsets from two directions. Existing utility mining algorithms adopt an Apriori-like bottom-up search and need to scan the database multiple times. If the data is very large and contains many long patterns, the I/O burden becomes too high. The hybrid algorithm searches for high-utility itemsets from both the top and the bottom. It first decomposes the mining task into two easier parts (mining short high-utility itemsets and mining long high-utility itemsets) and then chooses a different algorithm for each subtask, avoiding the expensive process of extending short itemsets step by step to find long itemsets.
     (3) Proposed a method to optimize the intersection of long transactions. Our algorithm stores each transaction in both HIV (horizontal item-vector) format and HIL (horizontal item-list) format, and uses the information contained in the HIL format to reduce the number of bit-wise AND operations. In this way, the number of AND operations on the HIV format equals the number of entries in the HIL format and is independent of the length of the data in HIV format. This optimization solves the problem that the performance of intersecting long transactions degrades rapidly as bit vectors grow longer, and it guarantees high performance on high-dimensional data. This space-for-time trade-off can also improve the efficiency of vertical algorithms for frequent itemset mining.
     (4) Proposed the task of utility and support based association rule mining (USBARM). Support and utility reflect the statistical importance and the semantic importance of an itemset, respectively. However, whether an itemset is interesting to a person depends not only on objective factors, such as the support of the itemset, but also on subjective factors, such as the preferences of the user. To overcome the shortcomings of an individual measure (support or utility), we proposed a new measure of itemset importance, motivation. Motivation is defined as the product of utility and support; it expresses the likelihood that a user obtains a certain utility, or the utility a user can obtain with a certain likelihood. In USBARM, all high-motivation itemsets can be found, without missing itemsets whose motivation is high but whose support (or utility) is below a user-defined threshold. USBARM can therefore find more interesting patterns.
     (5) Proved that motivation has two important properties: the upper bound property and the transaction-weighted downward closure property. Based on these properties, we designed two algorithms, HM-Miner and HM-Two-Phase-Miner, to discover all high-utility frequent itemsets that satisfy the support threshold, the utility threshold, and the motivation threshold, respectively. Both algorithms adopt an Apriori-like bottom-up search and are suitable for short-pattern datasets. HM-Miner uses the upper bound property to reduce the search space, whereas HM-Two-Phase-Miner uses the transaction-weighted downward closure property to achieve the same goal.
     (6) Developed a real application system to mine high-utility association rules. The system outputs the support, utility, and motivation of every association rule (itemset) discovered, so that users can compare support-based association rule mining with high-utility association rule mining. We applied the system to shopping basket data, and the mining results show that high-utility ARM can discover interesting patterns missed by traditional ARM and can help retailers find combinations of high-utility commodities, promoting the sales of high-margin products. With proper data preprocessing, the system can also be used in other fields. For example, in web page analysis, the frequency of visits to a page and the time spent on the page can be used to measure the popularity of the page. If we view the time as the utility of a page, we can convert the web mining task into a high-utility itemset mining task.
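The motivation measure from contribution (4) can be worked through on a toy example. The abstract only states that motivation is the product of support and utility; the sketch below additionally assumes the conventional definitions from the utility mining literature (support as the fraction of transactions containing the itemset, utility as quantity times unit profit summed over those transactions). The database, the profit table, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical toy data: each transaction maps item -> purchased quantity.
db: List[Dict[str, int]] = [
    {"A": 1, "B": 2},
    {"A": 2, "C": 1},
    {"B": 1, "C": 3},
    {"A": 1, "B": 1, "C": 1},
]
profit: Dict[str, int] = {"A": 5, "B": 2, "C": 10}  # assumed unit profit per item


def support(itemset: Iterable[str], db: List[Dict[str, int]]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in db if items.issubset(t)) / len(db)


def utility(itemset: Iterable[str], db: List[Dict[str, int]],
            profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    items = set(itemset)
    return sum(t[i] * profit[i] for t in db if items.issubset(t) for i in items)


def motivation(itemset: Iterable[str], db: List[Dict[str, int]],
               profit: Dict[str, int]) -> float:
    """Motivation = support * utility, as defined in the abstract."""
    items = set(itemset)
    return support(items, db) * utility(items, db, profit)


for s in ({"A", "B"}, {"C"}):
    print(sorted(s), support(s, db), utility(s, db, profit), motivation(s, db, profit))
# ['A', 'B'] 0.5 16 8.0
# ['C'] 0.75 50 37.5
```

Ranking by motivation weighs how often an itemset occurs against how much it is worth, so an itemset can rank highly even when it falls short on one of the two measures alone.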
