Classification Rule Reduction Algorithms Based on Predicate Logic and Including Sets
Abstract
Classification is an important task in data mining. The rule sets produced by existing classification rule mining algorithms contain many redundant rules, which seriously degrades both the classification efficiency and the understandability of the rules. Reducing a mined classification rule set is therefore of theoretical significance and practical value. In this paper, the post-processing of classification rule sets is studied using predicate logic and including sets. The main results are as follows:
     First, a classification rule reduction algorithm, RMCRPL, based on predicate logic is presented. The classification rules are first described as predicate formulas, which converts the rule set into a set of predicate formulas. The rule set is then reduced through logical inference in predicate logic, which eliminates the redundant rules. Finally, experiments on stellar spectral data show that the method is effective and feasible.
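As a rough illustration of the reduction step described above, the following Python sketch treats each rule as an implication whose antecedent is a conjunction of attribute-value atoms, and discards any rule that is logically entailed by a more general rule with the same conclusion. It is only a minimal sketch under that assumption, not the RMCRPL algorithm itself, whose details the abstract does not give. The `Rule` type, the helper names, and the attribute names `temp` and `line` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

# (attribute, value), read as the atom attribute(x) = value. Hypothetical encoding.
Condition = Tuple[str, str]


class Rule(NamedTuple):
    """Assumed rule form: for all x, (c1(x) and c2(x) and ...) -> label(x)."""
    conditions: FrozenSet[Condition]
    label: str


def implies(general: Rule, specific: Rule) -> bool:
    """Return True if `general` logically entails `specific`.

    With the same conclusion, a rule whose antecedent is a subset of another
    rule's antecedent entails it: anything satisfying the larger conjunction
    also satisfies the smaller one.
    """
    return general.label == specific.label and general.conditions <= specific.conditions


def reduce_rules(rules: List[Rule]) -> List[Rule]:
    """Drop every rule entailed by a kept rule (this also drops exact duplicates)."""
    kept: List[Rule] = []
    # Visit shorter antecedents first so a general rule is kept
    # before the more specific rules it entails are examined.
    for rule in sorted(rules, key=lambda r: len(r.conditions)):
        if not any(implies(k, rule) for k in kept):
            kept.append(rule)
    return kept


def make_rule(label: str, **conditions: str) -> Rule:
    """Convenience constructor for the example below."""
    return Rule(frozenset(conditions.items()), label)


# Made-up rules; the attribute names are illustrative only.
rules = [
    make_rule("A", temp="hot", line="weak"),  # entailed by the next rule
    make_rule("A", temp="hot"),
    make_rule("B", temp="cool"),
    make_rule("A", temp="hot"),               # exact duplicate
]
reduced = reduce_rules(rules)
print(reduced)
```

Only subset-based entailment between conjunctive rules is checked here; a fuller treatment of inference over predicate formulas would need a more general reasoning procedure.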
     Second, two classification rule reduction algorithms, MMIS and RRPABIRS, based on including sets are presented. Using the classification relation between rules and data, including sets whose elements are individual classification rules are first extracted. The rule set is then reduced by processing the including sets one at a time, which yields the MMIS and RRPABIRS algorithms. Finally, experiments on stellar spectral data show that the methods are effective and feasible.
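The abstract does not define the including set precisely, so the following Python sketch rests on an assumed definition: the including set of a rule is the set of rules whose correctly classified records are a subset of that rule's correctly classified records. Under that assumption, a rule that appears in another rule's including set adds no coverage and can be dropped. This is not the MMIS or RRPABIRS algorithm; all function names, the rule encoding, and the sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, str]
Rule = Tuple[Dict[str, str], str]  # (conditions, predicted label), hypothetical encoding


def covered(rule: Rule, data: List[Record], labels: List[str]) -> Set[int]:
    """Indices of the records that the rule matches and classifies correctly."""
    conditions, label = rule
    return {
        i
        for i, (record, actual) in enumerate(zip(data, labels))
        if actual == label and all(record.get(a) == v for a, v in conditions.items())
    }


def including_sets(rules: List[Rule], data: List[Record], labels: List[str]) -> Dict[int, Set[int]]:
    """Assumed definition: rule j is in the including set of rule i when the
    records covered by j are a subset of the records covered by i."""
    cover = [covered(r, data, labels) for r in rules]
    n = len(rules)
    return {i: {j for j in range(n) if cover[j] <= cover[i]} for i in range(n)}


def reduce_by_including_sets(rules: List[Rule], data: List[Record], labels: List[str]) -> List[Rule]:
    """Process the including sets one at a time, keeping one rule per set."""
    inc = including_sets(rules, data, labels)
    removed: Set[int] = set()
    # Larger including sets first, so the most general rules are kept.
    for i in sorted(inc, key=lambda k: -len(inc[k])):
        if i in removed:
            continue
        removed |= inc[i] - {i}
    return [r for i, r in enumerate(rules) if i not in removed]


# Made-up records and rules; the attribute names are illustrative only.
data = [
    {"temp": "hot", "line": "weak"},
    {"temp": "hot", "line": "strong"},
    {"temp": "cool", "line": "strong"},
]
labels = ["A", "A", "B"]
rules = [
    ({"temp": "hot"}, "A"),                  # covers records 0 and 1
    ({"temp": "hot", "line": "weak"}, "A"),  # covers record 0 only
    ({"temp": "cool"}, "B"),                 # covers record 2
    ({"line": "strong"}, "B"),               # correctly covers record 2 only
]
reduced = reduce_by_including_sets(rules, data, labels)
print(reduced)
```

The sketch looks only at correctly classified records; it does not penalize a rule for the records it misclassifies, which a practical reduction method would likely also take into account.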
