文摘
In this paper we present experimental results on rule sets induced from 12 data sets with many missing attribute values. We use two interpretations of missing attribute values: lost values and attribute-concept values. Our main objective is to check which interpretation of missing attribute values is better from the view point of complexity of rule sets induced from the data sets with many missing attribute values. The better interpretation is the attribute-value. Our secondary objective is to test which of the three probabilistic approximations used for the experiments provide the simplest rule sets: singleton, subset or concept. The subset probabilistic approximation is the best, with 5 % significance level.