Research and Application of Fuzzy Association Rule Mining Algorithms
Abstract
As the new revolution in military affairs deepens, information technology is driving rapid growth in data volume, and a military transformation powered by information processing technologies such as data mining is becoming imperative. Data mining is an effective tool for gaining information superiority and holding the initiative in future information-based warfare. Within the equipment support system, maintenance materials are the material basis of equipment maintenance support and have a significant effect on equipment readiness and combat capability. As equipment grows more complex, determining and optimizing maintenance materials becomes an increasingly prominent problem. Discovering the patterns in how maintenance materials are consumed is important for optimizing support decisions and can improve support efficiency. Association rule mining, one of the most important tasks in data mining, aims to discover relationships among the attributes in a database, and therefore offers an effective approach to optimizing maintenance materials support.
     Building on a study of association rule mining algorithms and taking the characteristics of the problem into account, this thesis focuses on one extension of association rules: fuzzy association rules. To address the shortcomings of existing fuzzy association rule mining algorithms, and drawing on features of the classic Apriori and FP-tree algorithms, it proposes a fuzzy association rule mining algorithm based on a linear linked list. The algorithm scans the transaction database only once and records only the transaction information that contributes to computing the support of frequent itemsets, which reduces storage overhead and improves computational efficiency. A time complexity analysis demonstrates the algorithm's efficiency, and experiments on a dataset from the UCI repository verify its accuracy and effectiveness. Finally, the algorithm is applied to the problem of optimizing equipment maintenance materials support.
     This thesis applies fuzzy association rule mining to the optimization of equipment maintenance materials support for the first time, proposes a fuzzy association rule mining algorithm that performs well, and models the mining process with the CRISP-DM data mining methodology. It provides a useful technical reference for optimizing our army's equipment maintenance materials support.
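The abstract does not give the details of the linked-list structure, so the following is only a minimal sketch of the general idea it describes: scan the transactions once, keep only the entries that contribute to support, and compute fuzzy support from those entries. It is not the thesis's algorithm. The per-item lists of `(tid, membership)` pairs, the triangular membership functions, the use of `min` to combine memberships, and all item names, regions, and quantities are assumptions made for illustration.

```python
from collections import defaultdict


def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def build_lists(transactions, regions):
    """Single scan: for each fuzzy item keep only (tid, membership) pairs
    whose membership is non-zero, i.e. the entries that can add to support."""
    lists = defaultdict(list)
    for tid, record in enumerate(transactions):
        for item, qty in record.items():
            for label, (a, b, c) in regions.items():
                mu = triangular(qty, a, b, c)
                if mu > 0:
                    lists[(item, label)].append((tid, mu))
    return lists


def fuzzy_support(lists, itemset, n):
    """Fuzzy support of an itemset: sum over transactions of the minimum
    membership among its fuzzy items, divided by the transaction count n."""
    per_item = [dict(lists[fuzzy_item]) for fuzzy_item in itemset]
    common = set(per_item[0]).intersection(*per_item[1:])
    return sum(min(d[tid] for d in per_item) for tid in common) / n


# Hypothetical consumption records: quantity of each part used per repair.
TRANSACTIONS = [
    {"filter": 2, "gasket": 6},
    {"filter": 2, "gasket": 6},
    {"filter": 6, "gasket": 2},
    {"filter": 3, "gasket": 4.5},
]
# Hypothetical fuzzy regions shared by all items: (left foot, peak, right foot).
REGIONS = {"low": (0, 2, 4), "high": (4, 6, 8)}

lists = build_lists(TRANSACTIONS, REGIONS)
antecedent = [("filter", "low")]
rule_items = [("filter", "low"), ("gasket", "high")]
n = len(TRANSACTIONS)

support = fuzzy_support(lists, rule_items, n)
confidence = support / fuzzy_support(lists, antecedent, n)
print(support, confidence)
```

Because the lists hold only non-zero memberships, the support of any candidate itemset can be computed by intersecting the stored entries, without returning to the original transactions. A rule such as "filter is low implies gasket is high" would then be kept only if its support and confidence exceed user-chosen thresholds.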
