Complexity of Rule Sets Induced from Data Sets with Many Lost and Attribute-Concept Values

Keywords: Incomplete data; Lost values; Attribute-concept values; Probabilistic approximations; MLEM2 rule induction algorithm
Journal title: Lecture Notes in Computer Science
Publication year: 2016
Volume: 9693
Issue: 1
Pages: 27-36
Full-text size: 2,312 KB
Author affiliations: Patrick G. Clark (19)
Cheng Gao (19)
Jerzy W. Grzymala-Busse (19) (20)

19. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, 66045, USA
20. Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, 35-225, Rzeszow, Poland
Series title: Artificial Intelligence and Soft Computing
ISBN: 978-3-319-39384-1
Publication category: Computer Science
Publication subjects: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349
Volume order: 9693

Abstract

In this paper we present experimental results on rule sets induced from 12 data sets with many missing attribute values. We use two interpretations of missing attribute values: lost values and attribute-concept values. Our main objective is to determine which interpretation is better from the viewpoint of the complexity of rule sets induced from data sets with many missing attribute values. The better interpretation is the attribute-concept value. Our secondary objective is to test which of the three probabilistic approximations used in the experiments (singleton, subset, or concept) provides the simplest rule sets. The subset probabilistic approximation is the best, at the 5% significance level.
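
The abstract names two interpretations of missing values and three probabilistic approximations without defining them. The following is a minimal sketch of how they are commonly formulated in the rough-set literature, not code from the paper. The symbols "?" and "-", the function names, the toy table, and the threshold values are all assumptions chosen for illustration. A lost value places no constraint on a case, an attribute-concept value stands for any value seen in cases of the same concept, and each approximation keeps the cases (or characteristic sets) whose conditional probability of belonging to the concept is at least alpha.

```python
from fractions import Fraction

LOST = "?"          # assumed symbol: the original value is unknown
ATTR_CONCEPT = "-"  # assumed symbol: any value typical for the case's concept


def typical_values(cases, decisions, x, a):
    """Specified values of attribute a among cases sharing x's decision."""
    return {row[a] for row, d in zip(cases, decisions)
            if d == decisions[x] and row[a] not in (LOST, ATTR_CONCEPT)}


def block(cases, decisions, a, v):
    """Cases matching (a, v). Lost values never match; attribute-concept
    values match when v is typical for that case's concept."""
    members = set()
    for y, row in enumerate(cases):
        if row[a] == v:
            members.add(y)
        elif row[a] == ATTR_CONCEPT and v in typical_values(cases, decisions, y, a):
            members.add(y)
    return members


def characteristic_sets(cases, decisions):
    """K(x) for every case x: the intersection, over all attributes, of the
    blocks compatible with x's (possibly missing) value."""
    universe = set(range(len(cases)))
    result = []
    for x, row in enumerate(cases):
        k = set(universe)
        for a, v in enumerate(row):
            if v == LOST:
                continue  # no constraint from a lost value
            if v == ATTR_CONCEPT:
                values = typical_values(cases, decisions, x, a)
                if not values:
                    continue  # nothing known for this concept: no constraint
            else:
                values = {v}
            k &= set().union(*(block(cases, decisions, a, w) for w in values))
        result.append(k)
    return result


def approximations(concept, k_sets, alpha):
    """Return (singleton, subset, concept) probabilistic approximations."""
    def pr(x):
        return Fraction(len(concept & k_sets[x]), len(k_sets[x]))

    passing = [x for x in range(len(k_sets)) if pr(x) >= alpha]
    singleton = set(passing)
    subset = set().union(*(k_sets[x] for x in passing))
    concept_appr = set().union(*(k_sets[x] for x in passing if x in concept))
    return singleton, subset, concept_appr


# Hypothetical toy table: (Temperature, Headache) with decision Flu.
cases = [("high", "yes"), ("?", "yes"), ("-", "no"),
         ("normal", "no"), ("high", "yes")]
decisions = ["yes", "yes", "no", "no", "no"]

k_sets = characteristic_sets(cases, decisions)
flu = {x for x, d in enumerate(decisions) if d == "yes"}
singleton, subset, concept_appr = approximations(flu, k_sets, Fraction(3, 5))
print(k_sets)
print(singleton, subset, concept_appr)
```

On this toy table with alpha = 3/5 the singleton approximation contains only case 1, while the subset and concept approximations both contain cases 0, 1, and 4. The sketch stops at the approximations; inducing rules from them with MLEM2 is outside its scope.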
