Research on Data Mining Techniques for Chemistry and Chemical Engineering
Abstract
Chemistry and chemical engineering are strongly practice-oriented disciplines, and with the development of computer technology they have accumulated large volumes of data. The development of data mining provides powerful tools for extracting useful knowledge from these data, but the effectiveness of a mining method is always tied closely to the characteristics of the data in each domain. Because data in chemistry and chemical engineering are typically high-dimensional, multicollinear, and noisy, this thesis applies neural networks, rough set methods, fuzzy systems, and statistical methods to attribute selection, discretization of continuous attributes, rule extraction, chemical pattern classification modeling, and chemical process modeling. It also reviews the basic theory and methods of data mining and rough sets, and the open problems of data mining in chemistry and chemical engineering. The main contributions are as follows:
1. A feature selection method based on regularization networks and a genetic algorithm is proposed. Drawing on the regularization and sensitivity-analysis approaches to neural network pruning, the network is trained with Bayesian regularization; a selection operator is then designed from the properties of the neural network classifier, and a genetic algorithm prunes the network's input units, thereby performing attribute selection. On attribute selection for high-dimensional spearmint-essence patterns, the method is shown to outperform the alternatives.
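The genetic search over input units can be illustrated with a deliberately simplified sketch. Everything below is an assumption for illustration: the thesis prunes a Bayesian-regularized network by sensitivity analysis, whereas this toy replaces the network with a nearest-centroid scorer so the search over binary attribute masks stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: 8 attributes, only the first two carry class information.
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def fitness(mask):
    """Score a binary attribute mask. A nearest-centroid classifier's
    accuracy stands in for the thesis's regularization-network sensitivity
    criterion; a small penalty favours smaller attribute sets."""
    if mask.sum() == 0:
        return 0.0
    Xm = X[:, mask.astype(bool)]
    c0, c1 = Xm[y == 0].mean(axis=0), Xm[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xm - c1, axis=1)
            < np.linalg.norm(Xm - c0, axis=1)).astype(int)
    return (pred == y).mean() - 0.01 * mask.sum()

def ga_select(n_attr=8, pop=30, gens=40):
    """Genetic search over attribute masks: truncation selection,
    one-point crossover, bit-flip mutation."""
    P = rng.integers(0, 2, size=(pop, n_attr))
    for _ in range(gens):
        scores = np.array([fitness(m) for m in P])
        parents = P[np.argsort(scores)[::-1][:pop // 2]]
        kids = parents.copy()
        for k, cut in enumerate(rng.integers(1, n_attr, size=len(kids))):
            kids[k, cut:] = parents[(k + 1) % len(parents), cut:]
        kids[rng.random(kids.shape) < 0.05] ^= 1   # mutation
        P = np.vstack([parents, kids])
    scores = np.array([fitness(m) for m in P])
    return P[scores.argmax()]

best = ga_select()   # binary mask over the 8 attributes
```

With the parsimony penalty in the fitness, the search tends toward the smallest mask that still classifies well, which is the point of using a GA rather than ranking attributes one at a time.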
2. Since rough set methods can handle only discrete data, a discretization method based on the χ² statistic, RSE-Chi2, is proposed. It is a merging (bottom-up) discretization method: the χ² statistic decides whether adjacent intervals are merged, an uncertainty measure of the decision system serves as the stopping criterion, and a background-knowledge-based feature merit measure orders the attributes for discretization. Its advantage is that it couples discretization of continuous attributes organically with feature selection and determines an appropriate degree of discretization automatically.
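A minimal sketch of the merging (ChiMerge-style) scheme that such χ²-based discretizers build on, assuming a fixed interval count in place of the thesis's rough-set uncertainty stopping criterion; the function names and toy data are illustrative, not the thesis's.

```python
import numpy as np

def chi2_adjacent(counts, i):
    """Chi-square statistic for adjacent intervals i and i+1.
    counts: (n_intervals, n_classes) class-frequency table."""
    A = counts[i:i + 2].astype(float)
    R = A.sum(axis=1, keepdims=True)      # row (interval) totals
    C = A.sum(axis=0, keepdims=True)      # column (class) totals
    E = R @ C / A.sum()                   # expected frequencies
    E[E == 0] = 0.1                       # ChiMerge's small-expected fix
    return ((A - E) ** 2 / E).sum()

def chimerge(values, labels, max_intervals=3):
    """Bottom-up discretization: start with one interval per distinct
    value and repeatedly merge the adjacent pair with the smallest
    chi-square, i.e. the pair whose class distributions are most alike."""
    classes = sorted(set(labels))
    v, l = np.asarray(values), np.asarray(labels)
    order = np.argsort(v)
    v, l = v[order], l[order]
    cuts = sorted(set(v.tolist()))        # left edges, one per value
    counts = np.array([[int(np.sum((v == c) & (l == k))) for k in classes]
                       for c in cuts])
    while len(cuts) > max_intervals:
        i = int(np.argmin([chi2_adjacent(counts, j)
                           for j in range(len(cuts) - 1)]))
        counts[i] += counts[i + 1]
        counts = np.delete(counts, i + 1, axis=0)
        del cuts[i + 1]
    return cuts

# Three class-pure runs of values collapse to three intervals.
cuts = chimerge([1, 2, 3, 10, 11, 12, 20, 21, 22],
                [0, 0, 0, 1, 1, 1, 0, 0, 0])
```

Replacing the fixed `max_intervals` with a test on the decision system's uncertainty measure is what lets RSE-Chi2 stop without manually chosen significance levels.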
3. In rough-set-based extraction of classification rules, to obtain rules with good generalization and rule-based classification models that extrapolate well, the following methods are proposed: the RSE-Chi2 method combines discretization of the decision system's continuous attributes with attribute reduction, eliminating redundant cut points so that the resulting reducts generalize well; on the basis of the discernibility matrix, a greedy algorithm selects at each step the attribute value with the strongest classification ability, yielding a satisfactory value reduct; unknown samples are then classified from the statistical parameters of the rules and their degree of match with the sample's condition-attribute values. Applied to rule extraction and classification modeling for olive oil, the results are easy to interpret, require no prior knowledge, and predict with good accuracy.
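The "pick the strongest attribute each step" idea can be sketched as a greedy positive-region reduct search, a standard rough-set heuristic; the decision-table representation and names below are assumptions for illustration, not the thesis's code.

```python
from collections import defaultdict

def positive_region_size(rows, attrs):
    """Count objects whose equivalence class under `attrs` is consistent,
    i.e. all objects in the class share one decision value "d"."""
    decisions, sizes = defaultdict(set), defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        decisions[key].add(row["d"])
        sizes[key] += 1
    return sum(n for key, n in sizes.items() if len(decisions[key]) == 1)

def greedy_reduct(rows, attrs):
    """Greedily add the attribute whose inclusion most enlarges the
    positive region, until the table is as consistent as it is under
    the full attribute set."""
    chosen, target = [], positive_region_size(rows, attrs)
    while positive_region_size(rows, chosen) < target:
        best = max((a for a in attrs if a not in chosen),
                   key=lambda a: positive_region_size(rows, chosen + [a]))
        chosen.append(best)
    return chosen

# Toy decision table: d = a AND b; attribute c is constant (redundant).
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 0},
    {"a": 1, "b": 1, "c": 0, "d": 1},
]
reduct = greedy_reduct(rows, ["a", "b", "c"])
```

The greedy search trades the exponential cost of exact reduct computation for a satisfactory (not necessarily minimal) reduct, which is the compromise the thesis makes for value reduction as well.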
    
Zhejiang University Doctoral Dissertation
4. Because the knowledge obtained after discretizing continuous attributes is fuzzy, rough set methods are combined with fuzzy methods, with the relevant parameters tuned on neural-network principles. Two methods are proposed: a fuzzy-neural network system for classification is built from the rules produced by the rough set method, its parameters initialized from the rules' statistical parameters and the discretization result, and a training procedure is given; and a rough-set-based regression analysis method is proposed that yields fuzzy rules for regression modeling, from which a fuzzy-neural network system for regression is built, again with initialization and training procedures. Applied to chemical pattern classification modeling and chemical process modeling respectively, the two methods train quickly, have simple network structures, are easy to interpret, generalize well, and outperform statistical methods and feedforward neural networks.
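A rough sketch of the forward pass of such a rule-derived fuzzy classifier, assuming Gaussian memberships centred on the discretization intervals and product firing strength; the rule table below is hypothetical, and the BP fine-tuning of centres and widths is omitted.

```python
import math

def gauss(x, c, s):
    """Gaussian membership with centre c and width s."""
    return math.exp(-((x - c) ** 2) / (2 * s ** 2))

# Hypothetical rule table of the form derived from rough-set rules:
# each discretization interval becomes a fuzzy region whose membership
# is centred on that interval.
RULES = [
    {"centres": (0.0, 0.0), "widths": (1.0, 1.0), "cls": 0},
    {"centres": (3.0, 3.0), "widths": (1.0, 1.0), "cls": 1},
]

def classify(x):
    """Forward pass of the fuzzy classifier before any BP fine-tuning:
    fire each rule with the product of its antecedent memberships and
    return the class of the strongest rule."""
    firing = [(math.prod(gauss(xi, c, s)
                         for xi, c, s in zip(x, r["centres"], r["widths"])),
               r["cls"]) for r in RULES]
    return max(firing)[1]
```

Because every quantity in the forward pass is differentiable in the centres and widths, the same network can then be trained by backpropagation, which is what makes the rough-set initialization fast to refine.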
Keywords: data mining; rough set methods; attribute selection; discretization; reduction of decision tables; chemical pattern classification modeling; chemical process modeling
Data in chemistry and chemical engineering grow steadily. Data mining is a powerful tool for extracting hidden information from large amounts of data, but mining methods must suit the characteristics of the data in each field. For the high-dimensional, noisy, and multicollinear data of chemistry and chemical engineering, this work uses neural networks, rough sets, fuzzy sets, and statistics to address feature selection, discretization, rule generation, chemical pattern modeling, and chemical process modeling. The main contributions of this dissertation are as follows.

(1) A feature selection method based on regularization networks and a genetic algorithm is presented. Bayesian regularization is used to obtain a well-generalized neural network, and a heuristic genetic algorithm prunes the regularization network by sensitivity analysis, so that a minimal, near-optimal attribute set characterizing the classification can be selected from high-dimensional patterns. The method is validated on attribute selection and pattern classification of spearmint essence, where it clearly outperforms the other methods.

(2) Discretization based on the chi-square statistic normally requires a significance level or inconsistency rate to be set manually. Rough set data analysis uses no prior knowledge about the data; the information entropy of rough sets measures the uncertainty of knowledge well and reveals the classification structure in the data, so it is taken as the evaluation function for discretization, determined by the inherent character of the data rather than by external knowledge. Moreover, since the order in which multiple attributes are discretized affects the result, the attributes are ordered by a feature merit measure. The resulting entropy-based algorithm, RSE-Chi2, needs no manually set parameters; applications show that it overcomes the drawbacks of the Chi2 algorithm, and it can also be used to generate attribute reducts.

(3) To obtain well-generalized rules and rule-based classifiers with good predictive power, three steps are taken. First, redundant discretization cut points are eliminated by integrating attribute reduction into RSE-Chi2 discretization, so that the resulting reducts generalize well. Second, a greedy algorithm that repeatedly selects the attribute value with the best classification quality generates a satisfactory value reduct. Finally, prediction is based on the rules' statistical parameters and their degree of match with a sample. Applied to chemical pattern rule generation and classifier modeling, and compared with statistical methods and neural networks, the models are readily interpretable in the chemical domain and predict well.

(4) When a continuous attribute is discretized into intervals, each interval can be regarded as a fuzzy region and each discretized value as a linguistic value in fuzzy theory, so rough set methods can be integrated with fuzzy set methods: a fuzzy inference system is built from the rules generated by rough sets, with parameters trained by the BP algorithm; we call it a fuzzy-neural network system. For classification, a fuzzy-neural network is presented whose structure is decided by the fuzzy rules generated by the rough set method and whose parameters are initialized from the rules' statistical parameters and the discretization result. For regression, discretizing the decision attribute turns regression into classification; the rough set method then generates Sugeno fuzzy rules by post-processing the pseudo-class rules, from which a fuzzy-neural network for regression modeling is built, with initialization and training methods given. Applied to chemical pattern classification and chemical process modeling, both methods train quickly, have simple structures, are easy to interpret, generalize well, and outperform statistical and feedforward neural network methods.
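The entropy-based evaluation function in (2) amounts to the conditional entropy of the decision given the discretized attributes; a minimal sketch under that assumption (the table layout and names are illustrative):

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(rows, attrs, decision="d"):
    """H(decision | attrs) over a decision table: zero means the
    discretized attributes still determine the decision exactly, so
    interval merging can continue; growth past a tolerance stops it."""
    n = len(rows)
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    h = 0.0
    for ds in blocks.values():
        for cnt in Counter(ds).values():
            p = cnt / len(ds)
            h -= (len(ds) / n) * p * math.log2(p)
    return h

consistent = [{"a": 0, "d": 0}, {"a": 0, "d": 0}, {"a": 1, "d": 1}]
mixed = [{"a": 0, "d": 0}, {"a": 0, "d": 1}]
```

Because the measure is computed from the data alone, it plays the role that a hand-picked significance level plays in the original Chi2 algorithm.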
