An Improved Algorithm for Association Rules with Multiple Minimum Supports
  • Original title (Chinese): 多最小支持度关联规则改进算法
  • Authors: LIANG Yang; QIAN Xiao-dong
  • Affiliation: School of Electronic and Information Engineering, Lanzhou Jiaotong University
  • Keywords: big data; frequent itemset; association rule; multiple minimum supports
  • Journal: Journal of Southwest University (Natural Science Edition) (西南大学学报(自然科学版)); journal code XNND
  • Publisher: Journal of Southwest University (Natural Science Edition)
  • Publication date: 2019-07-20
  • Year: 2019
  • Volume/number: v.41; No.295
  • Issue: 07
  • Funding: National Natural Science Foundation of China (71461017)
  • Language: Chinese
  • Article ID: XNND201907019
  • Pages: 137-147
  • Page count: 11
  • CN: 50-1189/N
Abstract
Because big data is highly diverse, mining it with a single minimum support threshold produces many redundant rules and lowers mining efficiency. This paper proposes an improved association rule algorithm based on multiple minimum supports. Each item is assigned its own support threshold and a multiple-minimum-support pattern tree is built, with the minimum frequent item used as the node-screening criterion so that redundant nodes are deleted. During frequent itemset mining, the sorted downward closure property is used to delete redundant candidate itemsets and to stop downward mining automatically, so that all frequent itemsets are obtained quickly and directly without scanning the database multiple times. Experimental results show that the improved algorithm raises mining efficiency and reduces computation time.
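The idea of per-item thresholds can be illustrated with a minimal sketch. This is not the paper's pattern-tree algorithm; it is a plain level-wise search that assumes the usual multiple-minimum-support definition, under which an itemset is frequent when its support count reaches the lowest threshold among its items. The function name `mine_frequent_itemsets`, the transactions, and the threshold values are all hypothetical.

```python
def mine_frequent_itemsets(transactions, mis):
    """Level-wise mining with one minimum support count per item.

    transactions: iterable of item collections (hypothetical sample data).
    mis: dict mapping each item to its own minimum support count.
    Returns a dict {itemset as tuple sorted by ascending threshold: support}.
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        wanted = set(itemset)
        return sum(1 for t in transactions if wanted <= t)

    # An item whose support is below the lowest threshold of all items
    # can never appear in a frequent itemset, so it is dropped up front.
    lowest_mis = min(mis.values())
    items = sorted(
        (i for i in mis if support((i,)) >= lowest_mis),
        key=lambda i: (mis[i], i),
    )
    rank = {item: r for r, item in enumerate(items)}

    frequent = {}
    level = [(i,) for i in items if support((i,)) >= mis[i]]
    # With items ordered by ascending threshold, removing the last item of a
    # frequent itemset leaves a frequent itemset, so an empty level means
    # no larger frequent itemset exists and the search can stop.
    while level:
        next_level = []
        for prefix in level:
            frequent[prefix] = support(prefix)
            for item in items[rank[prefix[-1]] + 1:]:
                candidate = prefix + (item,)
                # The first item carries the lowest threshold in the candidate.
                if support(candidate) >= mis[candidate[0]]:
                    next_level.append(candidate)
        level = next_level
    return frequent


# Hypothetical data: "caviar" is rare, so it gets a lower threshold.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "caviar"},
    {"bread"},
    {"milk", "salt"},
    {"bread", "milk"},
    {"caviar", "bread"},
]
mis = {"bread": 3, "milk": 3, "caviar": 2, "salt": 2}
result = mine_frequent_itemsets(transactions, mis)
```

In this toy data, the pair of caviar and bread occurs only twice yet counts as frequent because the threshold of caviar is 2, whereas salt occurs once and is discarded before any candidate is generated. This sketch rescans the transaction list for every candidate, which is exactly the cost that the tree-based approach described in the abstract aims to avoid.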