Application of Combination of Multi-features and Support Vector Machine Ensemble to Image Classification

Original title (Chinese): 多特征结合与支持向量机集成在图像分类中的应用
Author: Xian Yanming (鲜艳明)
Degree level: Master's
Discipline: Computer Application Technology
Keywords: Multi-feature Combination; Support Vector Machine Ensemble; Principal Component Analysis; Gaussian Normalization; Image Classification
Degree year: 2011
Advisor: Fu Yan (付燕)
Discipline code: 081203
Degree-granting institution: Xi'an University of Science and Technology (西安科技大学)
Thesis submission date: 2011-06-01

Abstract

In recent years, with the explosive growth of image data, image classification has become a key task in many fields, so research on image classification methods is of considerable value. This thesis studies image classification from two angles: the effective extraction of image features, and the design of classifiers suited to image classification. It also develops a prototype content-based image classification system. The main contributions are as follows:
     A single feature describes only part of an image's properties, so it gives a one-sided description of the image content and too little discriminating information, which lowers classification accuracy. To address this, an image classification method based on multi-feature combination and the support vector machine (SVM) is proposed. The method first extracts four features from each image: the annular color histogram (ACH), the gray level co-occurrence matrix (GLCM), the tree-structured wavelet transform (TWT), and the edge direction histogram (EDH). These single features are then merged into a comprehensive feature that describes the image content more completely, and the comprehensive feature is normalized with Gaussian normalization. Finally, an SVM classifier classifies the images. Experimental results show that the average classification accuracy over the image classes is higher with this method than with methods based on a single feature.
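The merge-and-normalize step can be illustrated with a minimal sketch. The abstract does not specify the normalization formula, so this assumes the common 3-sigma form of Gaussian normalization, in which each dimension is centered, divided by three standard deviations, clipped, and shifted to [0, 1]. The function names `combine_features` and `gaussian_normalize` and the toy vectors are hypothetical; the real ACH, GLCM, TWT, and EDH descriptors and the SVM training step are not shown.

```python
from statistics import mean, pstdev


def combine_features(*feature_groups):
    """Concatenate, image by image, the vectors produced by several descriptors."""
    combined = []
    for per_image in zip(*feature_groups):
        vector = []
        for part in per_image:
            vector.extend(part)
        combined.append(vector)
    return combined


def gaussian_normalize(vectors):
    """Normalize every dimension with the 3-sigma rule and map it to [0, 1].

    Assumption: this is the 3-sigma variant; the thesis may use another form.
    """
    dims = len(vectors[0])
    columns = [[v[d] for v in vectors] for d in range(dims)]
    stats = [(mean(col), pstdev(col)) for col in columns]
    normalized = []
    for v in vectors:
        row = []
        for x, (mu, sigma) in zip(v, stats):
            if sigma == 0:
                row.append(0.5)  # a constant dimension carries no information
                continue
            z = (x - mu) / (3 * sigma)   # most values now lie in [-1, 1]
            z = max(-1.0, min(1.0, z))   # clip the remaining outliers
            row.append((z + 1) / 2)      # shift from [-1, 1] to [0, 1]
        normalized.append(row)
    return normalized


# Toy stand-ins for two descriptors computed on three images (made-up values).
color = [[0.2, 0.8], [0.6, 0.4], [0.4, 0.6]]
texture = [[120.0], [80.0], [100.0]]

combined = combine_features(color, texture)
features = gaussian_normalize(combined)
```

Normalizing after concatenation keeps descriptors with large numeric ranges, such as the toy texture values here, from dominating the distance computations inside a kernel classifier.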
     Extracted image features usually contain a considerable amount of redundant information, which severely harms the generalization ability of the learner and again lowers classification accuracy. To address this, an image classification method based on multi-feature combination and PCA-RBaggSVM (principal component analysis RBaggSVM) is proposed. The method first extracts the comprehensive feature that describes the image content more completely, then reduces its dimensionality with PCA and applies Gaussian normalization to the reduced features. Finally, an SVM ensemble classifier is built with a double-perturbation scheme that perturbs both the training set and the SVM model parameters, and this ensemble classifies the images. Experimental results show that, compared with a BP neural network, C4.5, and RBaggSVM, this method achieves higher average classification accuracy over the image classes and needs less total time for training and classification.
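The double-perturbation idea can be sketched as follows: each ensemble member is trained on a bootstrap replicate of the training set (the first perturbation) with model parameters drawn at random from a candidate set (the second perturbation), and the members vote. This is only an illustration under stated assumptions, not the thesis's PCA-RBaggSVM implementation. The names `build_ensemble`, `predict`, and `train_kernel_voter` are hypothetical, the base learner is a simple RBF-weighted voter standing in for an SVM, the `gamma` candidates are made up, and the PCA step is omitted.

```python
import math
import random
from collections import Counter


def bootstrap(samples, labels, rng):
    """Draw a replicate of the same size by sampling with replacement."""
    idx = [rng.randrange(len(samples)) for _ in samples]
    return [samples[i] for i in idx], [labels[i] for i in idx]


def build_ensemble(samples, labels, train_fn, param_grid, n_members, seed=0):
    """Train n_members base learners, perturbing both data and parameters."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        xs, ys = bootstrap(samples, labels, rng)  # perturb the training set
        params = {name: rng.choice(values)        # perturb the model parameters
                  for name, values in param_grid.items()}
        members.append(train_fn(xs, ys, **params))
    return members


def predict(members, x):
    """Combine the members by majority vote."""
    votes = Counter(member(x) for member in members)
    return votes.most_common(1)[0][0]


def train_kernel_voter(xs, ys, gamma):
    """Stand-in base learner (NOT an SVM): picks the class with the largest
    sum of RBF similarities exp(-gamma * ||x - xi||^2) to the query."""
    def classify(x):
        scores = {}
        for xi, yi in zip(xs, ys):
            dist2 = sum((a - b) ** 2 for a, b in zip(x, xi))
            scores[yi] = scores.get(yi, 0.0) + math.exp(-gamma * dist2)
        return max(scores, key=scores.get)
    return classify


# Made-up two-class toy data in two dimensions.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 5.0], [5.1, 5.2]]
labels = ["a", "a", "a", "b", "b", "b"]

ensemble = build_ensemble(samples, labels, train_kernel_voter,
                          param_grid={"gamma": [0.1, 1.0, 10.0]}, n_members=11)
label = predict(ensemble, [0.1, 0.1])
```

In a real pipeline, `train_fn` would wrap an SVM trainer, and the parameter grid would hold candidate values for the kernel width and the penalty parameter.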
     Based on these results, a prototype content-based image classification system is designed and implemented. Test results show that the prototype system runs correctly.

