Research on the Apriori Algorithm Based on a Clustered Boolean Matrix
  • Published English title: The Research of Apriori Algorithm Based on Cluster Boolean Matrix
  • Authors: TIAN Lei; CUI Guangcai; HE Xu; CHEN Jianxin
  • Affiliation: School of Computer Science and Technology, Changchun University of Science and Technology
  • Keywords: CBM_Apriori algorithm; CBM_Eclat algorithm; Boolean matrix; K-medoids algorithm; Tidset
  • Chinese journal name: CGJM
  • English journal name: Journal of Changchun University of Science and Technology (Natural Science Edition)
  • Publication date: 2017-10-15
  • Publisher: Journal of Changchun University of Science and Technology (Natural Science Edition)
  • Year: 2017
  • Issue: v.40
  • Language: Chinese
  • Page: CGJM201705024
  • Page count: 6
  • CN: 05
  • ISSN: 22-1364/TH
  • Classification number: 113-118
Abstract
To address the shortcomings of CBM_Apriori, the Apriori algorithm based on a clustered Boolean matrix, this paper proposes CBM_Eclat, an Eclat algorithm based on a clustered Boolean matrix. The algorithm first applies the K-medoids algorithm to the Boolean matrix to obtain weights and a clustered Boolean matrix. It then converts the clustered Boolean matrix into Tidsets and applies logical intersection ("AND") operations, which effectively reduces both the storage needed for the clustered Boolean matrix and the number of candidate itemsets generated, and so improves the algorithm's execution efficiency. A worked example and the algorithm's execution results both demonstrate that CBM_Eclat is feasible and effective.
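The Tidset step described in the abstract can be illustrated with a minimal sketch. It is not the authors' CBM_Eclat: it assumes a plain Eclat search over an unweighted Boolean matrix and omits the K-medoids clustering step and the resulting weights. The function names `matrix_to_tidsets` and `eclat`, the sample matrix, and the support threshold are all hypothetical and not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Tuple


def matrix_to_tidsets(matrix: List[List[int]], items: List[str]) -> Dict[str, FrozenSet[int]]:
    """Map each item (column) to the set of row indices (transaction IDs) holding a 1."""
    return {
        item: frozenset(tid for tid, row in enumerate(matrix) if row[col])
        for col, item in enumerate(items)
    }


def eclat(tidsets: Dict[str, FrozenSet[int]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return {itemset: support} for every itemset with support >= min_support."""
    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(prefix, candidates):
        # Each candidate is (item, tidset of prefix + item) and is already frequent.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect only with later items so each itemset is generated once.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # the "AND" (intersection) step
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = [(item, tids) for item, tids in sorted(tidsets.items())
             if len(tids) >= min_support]
    extend((), start)
    return frequent


# Illustrative data: 5 transactions (rows) over 4 items (columns).
items = ["A", "B", "C", "D"]
matrix = [
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
result = eclat(matrix_to_tidsets(matrix, items), min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

In this sketch support is counted by the size of the intersected transaction-ID set, so the matrix is scanned only once, when the Tidsets are built. A weighted variant along the lines the abstract mentions would presumably sum per-row weights instead of counting rows, but that detail is not given in this record.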
