Application of the Deterministic Annealing Technique in Data Mining
Abstract
Data mining has attracted increasing attention in the information industry in recent years, mainly because large amounts of data are available and there is a pressing need to turn these data into useful information and knowledge. The principal data mining tasks include classification, clustering, estimation, and prediction; this thesis focuses on classification, clustering, and prediction. The power of data mining depends on the effectiveness of the mining tools. At present there are three main families of methods for such machine learning problems: fuzzy-rule learning, neural networks, and genetic (evolutionary) algorithms. Neural networks have distinctive advantages, such as nonlinear mapping and fault tolerance, and therefore hold greater promise for data mining than the other approaches. They also have well-known weaknesses, however: when the learning algorithm is chosen poorly, the system easily becomes trapped in a local optimum, and there is a conflict between network complexity and generalization ability. To overcome these limitations to some extent, this thesis applies a nature-inspired computation method, the deterministic annealing technique, to data mining.
    Following the physical annealing process, the deterministic annealing technique recasts the search for the optimum of an optimization problem as the minimization of the free energy function of a corresponding physical system over a sequence of decreasing temperatures, which allows the algorithm to avoid local minima and reach the global minimum. The algorithm exploits the maximum entropy principle of information theory, so a good optimum can be obtained at a comparatively small computational scale. On the one hand, deterministic annealing is an annealing process and can therefore be used to seek the globally optimal solution; on the other hand, unlike simulated annealing, which simulates the equilibrium state of the system with the Metropolis sampling rule, it simulates the equilibrium state at each fixed temperature by minimizing the free energy with a deterministic optimization method. It can be shown in theory that, when the free energy function satisfies appropriate conditions, the global optimum depends continuously on the temperature. Consequently, deterministic annealing is faster than simulated annealing. However, because the physical system associated with each application is different, the free energy function must be chosen differently for different problems, and it is very difficult to give a general method for selecting it.
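    To make the preceding description concrete, the following is a minimal sketch of the standard deterministic annealing formulation for clustering, in the spirit of Rose's work on which this approach builds; the symbols (data points x_i, cluster centers y_j, distortion d, temperature T) are generic notation introduced here rather than taken from the thesis. At temperature T the soft associations follow the maximum entropy (Gibbs) distribution, and the quantity minimized at each fixed T is the free energy:

\[
p(y_j \mid x_i) = \frac{\exp\bigl(-d(x_i, y_j)/T\bigr)}{\sum_k \exp\bigl(-d(x_i, y_k)/T\bigr)},
\qquad
F = -T \sum_i \log \sum_j \exp\bigl(-d(x_i, y_j)/T\bigr).
\]

    With the squared Euclidean distortion \(d(x_i, y_j) = \lVert x_i - y_j \rVert^2\), the stationarity condition \(\partial F / \partial y_j = 0\) gives the probability-weighted centroid update

\[
y_j = \frac{\sum_i p(y_j \mid x_i)\, x_i}{\sum_i p(y_j \mid x_i)},
\]

    and gradually lowering T drives the soft assignments toward hard ones while tracking the global structure of the problem rather than a single local minimum.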
    
    
    This thesis therefore combines the advantages of the two approaches and couples the deterministic annealing technique with RBF neural networks, whose learning speed and computation speed are superior to those of other commonly used neural networks. Deterministic annealing is used for clustering and for optimizing the centers and widths of the RBF network, which makes the method insensitive to the choice of initial values and, because it works with association probabilities, largely avoids the dead-node problem. The combination thus preserves the strengths of each algorithm while gaining new properties. Extensive simulation results show that the method is highly practical in the clustering model, especially for large-scale data processing, and that it makes the results of the prediction model more accurate.
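    As an illustration of how deterministic annealing clustering can feed an RBF network in the way described above, the sketch below (Python/NumPy, written for this summary and not taken from the thesis) runs a simplified fixed-size deterministic annealing clustering and uses the resulting centers and probability-weighted spreads to initialize the RBF centers and widths. The function names, the geometric cooling schedule, and the width heuristic are assumptions; the full method additionally tracks the cluster-splitting phase transitions that occur as the temperature decreases.

import numpy as np

def deterministic_annealing(X, k, T0=10.0, Tmin=1e-3, alpha=0.9, n_iter=30):
    """Simplified deterministic annealing clustering: soft assignments from a
    Gibbs distribution, centers updated by minimizing the free energy at each T."""
    n, d = X.shape
    # start all centers near the data mean (DA is insensitive to this choice)
    Y = X.mean(axis=0) + 1e-3 * np.random.randn(k, d)
    T = T0
    while T > Tmin:
        for _ in range(n_iter):
            # squared distances between every point and every center, shape (n, k)
            D = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
            # Gibbs association probabilities p(y_j | x_i) at temperature T
            G = np.exp(-(D - D.min(axis=1, keepdims=True)) / T)
            P = G / G.sum(axis=1, keepdims=True)
            # probability-weighted centroid update (free-energy minimizer at fixed T)
            Y = (P.T @ X) / P.sum(axis=0)[:, None]
        T *= alpha  # geometric cooling schedule
    return Y, P

def init_rbf_from_da(X, k):
    """Use deterministic annealing clustering to set RBF centers and widths."""
    centers, P = deterministic_annealing(X, k)
    D = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # width of each basis function: probability-weighted spread of its cluster
    widths = np.sqrt((P * D).sum(axis=0) / P.sum(axis=0)) + 1e-8
    return centers, widths

def rbf_design_matrix(X, centers, widths):
    """Gaussian RBF hidden-layer outputs; the linear output weights can then be
    fitted by ordinary least squares against the training targets."""
    D = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-D / (2.0 * widths[None, :] ** 2))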
