Research on Mining Algorithms of Maximal Frequent Item Sets

Original title (Chinese): 最大频繁项集挖掘算法的研究
Author: Yan Yuejin (颜跃进)
Degree level: Doctoral
Discipline: Computer Science and Technology
Keywords: Data Mining; Association Rule; Frequent Item Set; Maximal Frequent Item Set; Lookahead Pruning; Superset Checking; Frequent Pattern Tree (FP-Tree); Maximal Frequent Item Set Tree (MFI-Tree); Combined Frequent Pattern Tree (CFP-Tree)
Degree year: 2005
Advisors: Chen Huowang (陈火旺); Li Zhoujun (李舟军)
Discipline code: 081202
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Submission date: 2005-04-01

Abstract

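With the rapid development of information technology, and of network technology in particular, the ability to collect, store, and transmit data keeps improving, which has led to explosive growth in data. In sharp contrast, knowledge that is valuable for decision making remains scarce. Knowledge discovery and data mining emerged as a new discipline against this background.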
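     Association rules are one of the main patterns currently studied in data mining. They are used to identify relationships among different fields or attributes in a data set and to find valuable dependencies among multiple fields. Frequent item set mining is the key step in generating association rules, and its efficiency is both a major difficulty and an active research topic in association rule mining. Frequent item set mining falls into three classes: mining of all (complete) frequent item sets, mining of frequent closed item sets, and mining of maximal frequent item sets. Building on different representation structures for data sets and for maximal frequent item sets, this dissertation analyzes the mining of maximal frequent item sets in depth from the perspectives of pruning strategies, item ordering strategies for tail item sets, and superset checking methods.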
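The three classes of frequent item sets named above can be illustrated on a toy data set. The following is a minimal brute-force sketch, not an algorithm from the dissertation; the function names, the five sample transactions, and the absolute support threshold of 2 are assumptions made purely for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration; only sensible for tiny toy inputs."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Frequent item sets with no proper superset of equal support."""
    return {x: s for x, s in frequent.items()
            if not any(x < y and s == sy for y, sy in frequent.items())}


def maximal_itemsets(frequent):
    """Frequent item sets with no frequent proper superset."""
    return {x: s for x, s in frequent.items()
            if not any(x < y for y in frequent)}


# Hypothetical toy data: each string is one transaction, each letter one item.
transactions = [frozenset(t) for t in ("abc", "abc", "ab", "ac", "d")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)
```

On this toy input the maximal frequent item sets form a subset of the closed ones, which in turn form a subset of all frequent item sets. That nesting is why the maximal sets give the most compact summary of the three.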
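     A bitmap is an effective representation structure for data sets and item sets. Based on bitmaps, the dissertation proposes the depth-first mining algorithm DFMfi. DFMfi takes full advantage of the byte-level properties of bitmaps to optimize the matching and merging operations on item sets and, for the first time in this setting, introduces a superset checking method based on local maximal frequent item sets. The dissertation proves the correctness of DFMfi and shows experimentally that its running time is lower than that of comparable algorithms.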
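To make the bitmap idea concrete, here is a small sketch of how bitmaps can represent both a data set and individual item sets. It is not DFMfi itself, whose details the abstract does not give. It assumes a vertical layout (one bitmap per item, one bit per transaction), uses Python integers as bitmaps, and all function names are hypothetical.

```python
def build_vertical_bitmaps(transactions):
    """Map each item to an int whose bit `tid` is set when transaction `tid` contains it."""
    bitmaps = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(itemset, bitmaps, n_transactions):
    """Support = number of set bits in the AND of the item bitmaps."""
    covered = (1 << n_transactions) - 1  # start with every transaction
    for item in itemset:
        covered &= bitmaps.get(item, 0)
    return bin(covered).count("1")


def encode(itemset, item_index):
    """Represent an item set as a bitmap with one bit per item."""
    mask = 0
    for item in itemset:
        mask |= 1 << item_index[item]
    return mask


def is_subset(mask_a, mask_b):
    """Matching: every item of A also occurs in B."""
    return (mask_a & mask_b) == mask_a


# Hypothetical toy data: each string is one transaction, each letter one item.
transactions = ["abc", "abc", "ab", "ac", "d"]
bitmaps = build_vertical_bitmaps(transactions)
item_index = {item: pos for pos, item in enumerate(sorted(bitmaps))}

ab = encode("ab", item_index)
abc = encode("abc", item_index)
merged = ab | encode("c", item_index)  # merging two item sets is a bitwise OR
```

With this encoding, support counting, subset matching, and merging each reduce to a few bitwise operations on machine words, which is the kind of operation a bitmap-based miner can optimize.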
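     In recent years, another compressed representation of data sets, the FP-Tree, has attracted growing interest among researchers. The second part of the dissertation studies maximal frequent item set mining based on the FP-Tree, using FP-Trees to represent the data set and its projections and MFI-Trees to store the maximal frequent item sets found so far. Analysis and experiments show that superset checking is a time-consuming operation in existing algorithms. To address this, the dissertation proposes a new superset checking method based on projections of maximal frequent item sets under the single MFI-Tree representation, and proves that a simple superset checking method exists under the multiple MFI-Tree representation; both effectively reduce the time cost of superset checking. Corresponding to the two superset checking methods, the dissertation proposes the algorithms FPMFI and FIMFI, respectively. For FIMFI, the dissertation analyzes how the item ordering strategy for tail item sets affects the compression of the search space, and proposes an efficient ordering strategy based on information from the FP-Tree and the MFI-Tree. With a new lookahead pruning method, FIMFI widens the scope of lookahead pruning, raises the likelihood that lookahead pruning succeeds, and compresses the search space as far as possible. In addition, the non-redundant subtree structure in FPMFI is an attempt at finding an efficient compressed structure for data sets. Experiments show that on dense data sets both algorithms have some advantage over comparable algorithms; in particular, FIMFI is on average 30% to 40% faster than FPMax*, the best-performing of the comparable algorithms.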
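The roles of superset checking, lookahead pruning, and tail ordering can be seen in a generic depth-first search for maximal frequent item sets. The sketch below is not FPMFI or FIMFI. It assumes plain Python sets instead of FP-Trees, a linear scan over a list instead of an MFI-Tree, and ordering by increasing support as a stand-in for the dissertation's ordering strategy; the function name and sample data are hypothetical.

```python
def mine_maximal(transactions, min_support):
    """Generic depth-first search for maximal frequent item sets (MFIs)."""
    transactions = [frozenset(t) for t in transactions]
    mfis = []  # MFIs found so far; a real miner might keep them in an MFI-Tree

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def has_superset(itemset):
        # Superset checking: a linear scan here, which is exactly the
        # costly step that specialised structures try to speed up.
        return any(itemset <= m for m in mfis)

    def expand(head, tail):
        # Keep only tail items that extend `head` to a frequent item set,
        # ordered by increasing support (an assumed heuristic).
        scored = sorted((support(head | {item}), item) for item in tail)
        tail = [item for count, item in scored if count >= min_support]
        if not tail:
            if head and not has_superset(head):
                mfis.append(head)
            return
        full = head | frozenset(tail)
        # Lookahead pruning, case 1: head plus tail is covered by a known MFI.
        if has_superset(full):
            return
        # Lookahead pruning, case 2: head plus tail is itself frequent.
        if support(full) >= min_support:
            mfis.append(full)
            return
        for pos, item in enumerate(tail):
            expand(head | {item}, tail[pos + 1:])

    all_items = sorted(set().union(*transactions))
    expand(frozenset(), all_items)
    return mfis


# Hypothetical toy data: each string is one transaction, each letter one item.
result = mine_maximal(["abc", "abc", "cde", "cde", "ad", "ad"], min_support=2)
```

In this sketch every leaf and every lookahead test calls `has_superset`, so its cost grows with the number of maximal frequent item sets already found. This matches the motivation stated above for seeking cheaper superset checks.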
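     Finally, the dissertation proposes a new data structure, the CFP-Tree, which can represent both the data set and the maximal frequent item sets in compressed form at the same time. Based on the CFP-Tree, it defines maximized subsets and proposes the CfpMfi algorithm. Through its […] with FPMax*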

Abstract (author's English version)

With the development of information technology, especially the emergence of network technology, our ability to collect, store, and transfer data has improved dramatically. In contrast to the explosive growth of data, the need for decision-relevant knowledge is not yet satisfied. Knowledge discovery and data mining technology is an important approach to addressing this problem.
    As one of the main patterns in the field of data mining, association rules are used to determine the relationships among attributes or objects and to find valuable dependencies among fields. The efficiency of mining frequent item sets is the key problem in generating association rules. Frequent item sets can be divided into three types: complete, closed, and maximal. This dissertation focuses on the mining of maximal frequent item sets. Pruning strategies, the ordering policy for tail item sets, and superset checking are studied thoroughly, based on the representations of data sets and of the sets of maximal frequent item sets.
    A bitmap is an efficient representation structure for data sets and for the sets of maximal frequent item sets. A bitmap-based depth-first mining algorithm, DFMfi, is proposed in the dissertation. By exploiting the byte characteristics of bitmaps, DFMfi optimizes the matching and merging operations on item sets. Moreover, for the first time, a bitmap-based method that uses local maximal frequent item sets for fast superset checking is employed. The correctness of DFMfi is proved, and experimental comparison with previous work indicates that DFMfi clearly accelerates the generation of maximal frequent item sets.
    After nearly ten years of research and development, researchers have begun to pay more attention to another compressed representation structure for data sets, namely the FP-Tree. The second part of the dissertation therefore concentrates on mining maximal frequent item sets based on the FP-Tree. Analysis and experimental results show that the superset checking operation is time-consuming and is used frequently in the mining process. Based on these observations, a new superset checking method based on projections of maximal frequent item sets is presented for the single MFI-Tree, and a simple method for fast superset checking is proposed for multiple MFI-Trees. Two algorithms, FPMFI and FIMFI, based on single and multiple MFI-Trees respectively, are proposed. In FIMFI, the item ordering policy for tail item sets is discussed and a new efficient ordering policy based on the

