Application of Bayesian Methods to Soft-Sensor Modeling in Chemical Processes
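Abstract
Bayesian learning theory represents knowledge and uncertainty of all kinds as probabilities and carries out learning and inference through the rules of probability, which makes it a powerful tool for handling uncertain information. Building on a study of the theory and applications of Bayesian methods, this thesis examines in detail their use in data classification and in soft-sensor modeling for chemical processes.
     The main results of this work are as follows:
     (1) In soft-sensor modeling, the original data set is usually partitioned into classes so that several sub-models can be built, which improves the estimation accuracy of the model. This thesis exploits the simplicity and efficiency of the naive Bayes classifier: the continuous class variable is first divided into class ranges, and the continuous attribute variables are then discretized with the "3σ" rule from probability theory. To remove the influence of disturbed data in the training samples, a genetic algorithm selects a preferred subset of the training set. Discretization of the continuous variables and selection of the samples together form the data preprocessing, and the Bayesian classifier is finally built from the preprocessed training samples. Simulation experiments on UCI data sets and on an on-line monitoring data set from a bisphenol A production process show that the "3σ"-rule naive Bayes classification method with samples selected by the genetic algorithm achieves higher classification accuracy than the other methods it was compared with.
     (2) Bayesian networks are applied to chemical soft-sensor modeling. The network model is constructed from domain expert knowledge, with full consideration of the process mechanism; a weighted joint Gaussian distribution function approximates the joint probability distribution of the Bayesian network model, and an estimation formula for the Bayesian network is given. A model built from data collected on-line at an enterprise's bisphenol A production unit gave good results in off-line estimation. Compared with the support vector machine method, it reaches comparable estimation accuracy while avoiding the estimation of many process parameters, and is therefore also an effective soft-sensor modeling method.
     (3) To improve the estimation accuracy of soft-sensor models, a multi-model soft-sensor modeling method based on a Bayesian classification algorithm and the relevance vector machine is proposed. A Bayesian classifier partitions the sample data set into classes, a relevance vector regression sub-model is built for the input data of each class, and the sub-models are combined by a "switch" scheme to give the final soft-sensor output. Applied to soft-sensor modeling of a quality index of the bisphenol A production process, the method estimates more accurately in simulation than a single-model support vector machine and has practical value.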
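The preprocessing and classification steps of result (1) can be illustrated with a minimal Python sketch. It assumes one plausible reading of the "3σ" rule, in which each continuous attribute is cut at the mean plus or minus one, two and three standard deviations, giving eight bins; the abstract does not state the exact cut points. The names `sigma_bin` and `DiscreteNaiveBayes`, the Laplace smoothing, and the toy data are illustrative assumptions, and the genetic-algorithm sample selection is omitted.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev


def sigma_bin(x, mu, sigma):
    """Map x to a bin index 0..7 using cut points mu + k*sigma, k = -3..3.

    Assumption: this is one possible form of "3-sigma" discretization,
    not necessarily the rule used in the thesis.
    """
    if sigma == 0:
        return 0
    k = math.floor((x - mu) / sigma)   # band offset relative to the mean
    return max(-4, min(3, k)) + 4      # clamp the tails beyond +/-3 sigma


def discretize(stats, row):
    """Discretize one sample given per-attribute (mean, std) pairs."""
    return [sigma_bin(x, mu, sd) for x, (mu, sd) in zip(row, stats)]


class DiscreteNaiveBayes:
    """Naive Bayes over binned attributes with Laplace smoothing (hypothetical helper)."""

    def fit(self, rows, labels, n_bins=8):
        self.n_bins = n_bins
        self.total = len(labels)
        self.class_counts = Counter(labels)
        # counts[(attribute index, class)][bin] = number of occurrences
        self.counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            for j, b in enumerate(row):
                self.counts[(j, y)][b] += 1
        return self

    def predict(self, row):
        best, best_lp = None, -math.inf
        for y, cy in self.class_counts.items():
            # log prior plus the sum of log conditional probabilities
            lp = math.log(cy / self.total)
            for j, b in enumerate(row):
                lp += math.log((self.counts[(j, y)][b] + 1) / (cy + self.n_bins))
            if lp > best_lp:
                best, best_lp = y, lp
        return best


# Toy data (made up): two continuous attributes, two class ranges.
X = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [3.0, 20.0], [3.2, 21.0], [2.9, 19.5]]
y = ["low", "low", "low", "high", "high", "high"]
stats = [(mean(col), pstdev(col)) for col in zip(*X)]
model = DiscreteNaiveBayes().fit([discretize(stats, r) for r in X], y)
```

A new sample is classified with `model.predict(discretize(stats, sample))`. Binning by standard-deviation bands keeps the conditional probability tables small, which is what lets the classifier work directly on continuous process measurements.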
Bayesian learning theory represents knowledge and uncertainty with probabilities and realizes learning and inference through probabilistic rules. It is therefore a powerful tool for dealing with uncertain information. This thesis studies the fundamentals and applications of Bayesian learning theory, focusing on data classification and on soft-sensor modeling based on Bayesian methods.
     This dissertation concentrates on the work listed below and reports the following results.
     (1) Constructing sub-models can increase estimation accuracy in soft-sensor modeling, and building multiple models depends on classifying the original data set. Among data classification methods, the naive Bayesian classifier is widely applied because of its simplicity and efficiency. The continuous class variable is first divided into several categories, and the "3σ" rule from probability theory is then used to discretize the attributes. To eliminate interference in the training samples, a genetic algorithm selects an optimal subset of the training set. Finally, the preprocessed training samples are used to build the Bayesian classifier. Experiments on UCI data sets and on on-line monitoring data from a bisphenol A (BPA) production process show that data discretization and sample selection, used as preprocessing, reliably improve the naive Bayesian classifier.
     (2) An approach that applies Bayesian networks to chemical soft sensing is proposed. The network model is built from the knowledge of domain experts and the process mechanism; a weighted combination of several normal distribution functions approximates the joint probability distribution in the Bayesian network, and an estimation formula for the Bayesian network is given. The model parameters are estimated from on-line data from a bisphenol A production plant, and the Bayesian network model shows good results. Compared with the support vector machine, the Bayesian network avoids estimating many process parameters while achieving comparable accuracy. It is an effective method for soft-sensor modeling.
     (3) To improve the estimation accuracy of the soft-sensor model, a nonlinear multi-model method based on a Bayesian classification algorithm and the relevance vector machine is proposed. The algorithm classifies the inputs with a Bayesian classifier, trains a separate relevance vector regression machine for each class, and obtains the final output by switching between the sub-models. The proposed algorithm is applied to a soft-sensor model of the bisphenol A production process. The experimental results indicate that it is more accurate than a single SVM model and has practical application value.
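The "switch" combination in result (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: a fixed threshold stands in for the Bayesian classifier, and an ordinary least-squares line stands in for each relevance vector regression sub-model, since the Python standard library provides neither. `SwitchedModel`, `fit_line`, and the toy data are hypothetical names and values.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single input variable.

    Stand-in for a relevance vector regression sub-model (assumption).
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


class SwitchedModel:
    """Route each input to the sub-model of its predicted class."""

    def __init__(self, classify):
        self.classify = classify   # callable: input -> class label
        self.submodels = {}        # class label -> (slope, intercept)

    def fit(self, xs, ys):
        groups = {}
        for x, y in zip(xs, ys):
            gx, gy = groups.setdefault(self.classify(x), ([], []))
            gx.append(x)
            gy.append(y)
        # One sub-model per class, trained only on that class's samples.
        self.submodels = {c: fit_line(gx, gy) for c, (gx, gy) in groups.items()}
        return self

    def predict(self, x):
        # The "switch": exactly one sub-model produces the output.
        a, b = self.submodels[self.classify(x)]
        return a * x + b


# Toy piecewise-linear process (made up): two operating regimes.
xs = [float(v) for v in range(10)]
ys = [2.0 * x + 1.0 if x < 5 else 30.0 - x for x in xs]

# Placeholder classifier: a threshold instead of a trained Bayesian classifier.
soft_sensor = SwitchedModel(lambda x: "A" if x < 5 else "B").fit(xs, ys)
```

Calling `soft_sensor.predict(x)` returns the estimate from the sub-model of the regime that `x` falls in. A hard switch keeps each sub-model simple, at the cost of a possible discontinuity in the output at class boundaries.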
