Research on Neural Network Ensembles and Their Application to Earthquake Prediction
Abstract
Generalization ability, learning efficiency, and ease of use are three key challenges facing machine learning and its applications. Neural network ensemble learning, which trains multiple neural networks and combines their outputs, significantly improves the generalization ability of a learning system and has become an important research direction in machine learning in recent years. Building on an analysis of the current state of neural network ensemble research, and supported by experimental design, rough set theory, feature weighting, and parallel computing techniques, this thesis investigates the ease of use, generalization ability, and learning efficiency of neural network ensemble methods, proposes more effective ensemble methods, and applies them to earthquake prediction.
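The core mechanism described above, training several networks for the same task and then fusing their outputs, reduces to a majority vote in the classification case. A minimal sketch, assuming each individual model exposes a hypothetical `predict(x)` method (this interface is illustrative, not from the thesis):

```python
from collections import Counter

def ensemble_predict(networks, x):
    # Each individual network casts one vote; the ensemble returns the
    # most common label. For regression ensembles, the fusion step would
    # average the individual outputs instead of voting.
    votes = Counter(net.predict(x) for net in networks)
    return votes.most_common(1)[0][0]
```

Weighted voting or output averaging fits the same shape; only the fusion line changes.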
     The architecture of an ensemble (the number of individual networks and their structures) and the training parameters of the individual networks (such as the number of training epochs and the learning rate) determine both the performance of the ensemble and how easily it can be used. This thesis first studies the application of experimental design to neural network ensembles and proposes a simple, principled method for determining the ensemble architecture and the training parameters of the individual networks. With a small number of experiments, a user can analyze the factors that influence the generalization ability of the ensemble and determine which combination of factor levels yields the best generalization. At the same time, a nearest-neighbor clustering algorithm automatically determines the hidden nodes of each individual network, producing heterogeneous individuals with large diversity and thereby improving the generalization ability of the ensemble. Second, the thesis studies the combination of constructive and selective algorithms and proposes a constructive selective neural network ensemble method, which automatically determines the number of individual networks, the number of hidden nodes in each, and the number of training epochs, and which uses a multi-objective selection method that guarantees both the accuracy of the individual networks and the diversity among them. A user only needs to specify initial values for a few parameters to construct the ensemble, which improves its ease of use.
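The nearest-neighbor clustering step can be sketched as follows; the single distance threshold `radius` and the first-fit assignment rule are illustrative assumptions rather than the thesis's exact procedure. Each resulting cluster center becomes one hidden node, so varying `radius` across individuals yields heterogeneous hidden-layer sizes:

```python
from math import dist

def nn_cluster_centers(samples, radius):
    # First-fit nearest-neighbor clustering: a sample joins the first
    # existing center lying within `radius`, otherwise it founds a new
    # cluster. The number of centers fixes the number of hidden nodes.
    centers = []
    for x in samples:
        if not any(dist(x, c) <= radius for c in centers):
            centers.append(tuple(x))
    return centers
```

A smaller `radius` produces more clusters and hence a larger hidden layer, which is one simple way to obtain structurally diverse individual networks.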
     Generalization ability is one of the fundamental concerns of machine learning. Ensemble feature selection uses feature selection techniques to produce highly diverse individuals and thus improves the generalization ability of the ensemble; its core problem is how to generate attribute subsets effectively. This thesis addresses that problem and proposes a neural network ensemble method based on rough set reducts. By fully accounting for the dependencies and correlations among attributes, the method uses a discernibility-matrix-based rough set reduction algorithm to generate attribute subsets effectively, constructing individuals with higher accuracy and diversity and thereby improving the generalization ability of the ensemble.
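The discernibility-matrix idea can be illustrated on a small decision table of discrete attribute values. This is a simplified sketch: a plain greedy set cover stands in for the thesis's optimized exact reduct computation.

```python
from itertools import combinations

def discernibility_matrix(table, decisions):
    # For every pair of rows with different decision labels, record the
    # set of attribute indices on which the two rows differ.
    entries = []
    for i, j in combinations(range(len(table)), 2):
        if decisions[i] != decisions[j]:
            diff = {k for k in range(len(table[i])) if table[i][k] != table[j][k]}
            if diff:
                entries.append(diff)
    return entries

def greedy_reduct(entries):
    # Greedily pick the attribute that discerns the most remaining pairs
    # until every entry is covered -- a heuristic approximation of a reduct.
    reduct, remaining = set(), list(entries)
    while remaining:
        counts = {}
        for e in remaining:
            for a in e:
                counts[a] = counts.get(a, 0) + 1
        best = max(counts, key=counts.get)
        reduct.add(best)
        remaining = [e for e in remaining if best not in e]
    return reduct
```

Each reduct (attribute subset) then defines the input projection on which one individual network is trained, which is what yields diverse ensemble members.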
     Feature weighting distinguishes, at a fine granularity, how much each feature influences the result, and has become one of the popular ways to improve the predictive accuracy of a learner. This thesis focuses on applying feature weighting to improve the generalization ability of neural network ensembles and proposes a feature-weighting-based neural network ensemble method. The method uses the survival-of-the-fittest mechanism of an adaptive genetic algorithm to determine a weight for each input attribute, improving the accuracy and diversity of the individual networks in the ensemble and thereby its generalization ability.
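The evolutionary search for feature weights can be sketched as below. The population size, averaging crossover, and Gaussian mutation are illustrative choices: the thesis's self-adaptive genetic algorithm additionally adapts its own operator rates, and its real fitness function would train and score an individual network on the weighted inputs.

```python
import random

def evolve_weights(fitness, n_features, pop_size=20, generations=30, seed=0):
    # Tiny genetic algorithm over feature-weight vectors in [0, 1].
    # `fitness(weights)` is assumed to score, e.g., a network trained on
    # inputs scaled by `weights`; any callable works for illustration.
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half (elitism), breed the rest by averaging
        # crossover plus small Gaussian mutation, clipped back to [0, 1].
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [(x + y) / 2 + rng.gauss(0, 0.05) for x, y in zip(a, b)]
            children.append([min(1.0, max(0.0, w)) for w in child])
        pop = survivors + children
    return max(pop, key=fitness)
```

Because each individual network can be evolved with a different weight vector, the same mechanism raises both accuracy and diversity across the ensemble.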
     Improving learning efficiency is a perennial goal of machine learning. Combining up-to-date parallel programming techniques, this thesis proposes a parallel implementation of the neural network ensemble method that significantly improves the learning efficiency of the ensemble.
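Because each individual network trains independently of the others, ensemble construction is embarrassingly parallel: scatter one training task per worker, then gather the trained models at the master. The sketch below shows only that scatter/gather shape, using a thread pool and a toy training function; the thesis's implementation distributes real network training over MPI ranks, which threads in Python could not accelerate for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def train_one(seed):
    # Stand-in for training one individual network; a real implementation
    # would build and fit a network here and return the trained model.
    return seed * seed  # toy "model"

def train_ensemble_parallel(seeds, workers=4):
    # Dispatch independent training tasks to the worker pool and gather
    # results in order -- the same pattern an MPI scatter/gather realizes
    # across processes or machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_one, seeds))
```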
Generalization ability, efficiency, and convenience are three major challenges in machine learning and its applications. A neural network ensemble is a learning paradigm in which a collection of a finite number of neural networks is trained for the same task; it has recently become a hot topic in the machine learning community because of its high generalization ability. In this thesis, several novel neural network ensemble methods are proposed and applied to earthquake prediction, building on design of experiments, rough set reducts, feature weighting, and parallel computing.
     The architecture of the ensemble and the training parameters of the individual neural networks are closely related to the performance of the ensemble and to how easily it can be created. This thesis first employs design of experiments to guide users with little experience of neural networks in designing the ensemble architecture and tuning the training parameters of the individual networks. At the same time, a nearest-neighbor clustering algorithm creates heterogeneous individuals with different numbers of hidden nodes. Second, constructive and selective algorithms are combined so that users without any neural network experience can conveniently and automatically construct the ensemble, improving the convenience of ensemble creation.
     Generalization ability is a principal issue in machine learning. Feature selection for ensembles has been shown to be an effective strategy for improving the generalization ability of an ensemble. This thesis focuses on how to select appropriate feature subsets and employs an optimized approach, based on the discernibility matrix, for determining rough set reducts; an ensemble with high generalization ability is then built on the resulting projections of the training set. Feature weighting is the general case of feature selection and has the potential to perform better than (or at least as well as) feature selection. A new self-adaptive genetic algorithm is used to search for the weight vector that optimizes the classification accuracy of the individual neural networks, improving the generalization ability of the ensemble.
     Efficiency is another principal issue in machine learning. This thesis proposes a parallelization strategy for the neural network ensemble using MPI (Message Passing Interface) to reduce the complexity and improve the efficiency of ensemble learning.
