Research on Key Techniques of Image Classification for Semantic Extraction

Original title: 面向语义提取的图像分类关键技术研究
Author: Zeng Pu (曾璞)
Degree level: Doctoral
Discipline: Control Science and Engineering
Keywords: Regional Latent Semantic; Multi-Feature Fusion; Multiple Kernel Learning; Multi-Class Classifier; Model Subspace; Inter-Class Similarity Measure
Degree year: 2009
Advisor: Wu Lingda (吴玲达)
Discipline code: 081101
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Submission date: 2009-10-01

Abstract

With the rapid development of digital imaging technology, the number of digital images is growing quickly. Such abundant image resources make it hard for users to find the images they actually need in a vast body of image data, so fast and efficient image organization and retrieval has become a valuable research topic. In recent years, extracting the semantic content of images through image classification has attracted wide attention. However, image classification for semantic extraction still faces many challenges, and how to build an effective image classification method remains an open problem.
     This dissertation studies image classification in the context of semantic extraction and focuses on three key techniques: feature extraction, multi-feature fusion, and multi-class classifier design. The main work and contributions are as follows:
     1. A regional latent semantic distribution feature is proposed for scene classification. The core of the feature extraction method is obtaining the regional latent semantics. First, spatial pyramid subdivision is applied to the training images to generate a collection of image block regions. Then Probabilistic Latent Semantic Analysis (pLSA) is applied to this collection to mine the regional latent semantics automatically. Finally, the occurrence probabilities of the regional latent semantics in all blocks of an image are combined to form the regional latent semantic distribution feature. Compared with other intermediate semantic features, this feature exploits the spatial distribution of regional semantics for scene classification without requiring manual annotation. Experiments on a set of 13 scene categories verify its effectiveness.
     2. A multi-feature fusion classification model based on kernel combination, together with its optimization method, is proposed. Features are fused through a linearly weighted (convex) combination of kernel functions, one kernel per image feature, which turns the optimal multi-feature fusion problem into the problem of optimizing the combined kernel. Multiple kernel learning can solve this problem, but existing multiple kernel learning methods only maximize the between-class margin. Following the Fisher criterion, a multiple kernel learning algorithm that both maximizes the between-class margin and minimizes the within-class scatter is proposed and used to optimize the fusion model. Experiments show that the model optimized in this way performs feature selection and feature fusion at the same time and achieves better classification performance.
     3. A multi-class classification method that cascades a minimum distance classifier in model subspace is proposed. A minimum distance classifier first selects a small candidate set of classes, and a multi-class SVM classifier then makes the final decision within this set. To ensure both speed and accuracy, the minimum distance classifier is applied in a model subspace, and a distance metric learning method with a sparsity constraint on the weights is constructed so that the optimal model subspace and the corresponding distance metric are obtained together. Experiments on the Caltech256 dataset show that the method greatly increases classification speed without loss of classification accuracy.
     4. A hierarchical classifier generation algorithm that fuses feature distribution similarity and class semantic similarity is proposed. For the similarity based on feature distribution, the training data are first clustered, the probability distribution over clusters is used to describe the feature distribution of each class, and the distance between these distributions gives the inter-class similarity. The semantic similarity is computed from the semantic relatedness of the class words in WordNet. The two measures are fused linearly into the final inter-class similarity, and spectral clustering is then used to build the class hierarchy. The method exploits the complementarity between feature distribution information and semantic information, and it achieves better classification performance than methods that use only one kind of information.
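The kernel-combination idea in contribution 2 can be illustrated with a small sketch. It builds one Gaussian kernel per feature and fuses them with a convex combination. This is not the dissertation's algorithm: the feature names, sizes, and the hand-picked weights are assumptions for illustration, whereas the dissertation learns the weights with its Fisher-criterion multiple kernel learning method.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def combine_kernels(kernels, weights):
    """Convex combination sum_m d_m * K_m with d_m >= 0 and sum_m d_m = 1."""
    d = np.asarray(weights, dtype=float)
    if np.any(d < 0) or not np.isclose(d.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(d, kernels))

rng = np.random.default_rng(0)
n = 6
# Hypothetical features for the same 6 images (names and sizes are illustrative).
features = {
    "color": rng.normal(size=(n, 8)),
    "texture": rng.normal(size=(n, 16)),
    "shape": rng.normal(size=(n, 4)),
}
kernels = [rbf_kernel(F, F, gamma=1.0 / F.shape[1]) for F in features.values()]

# Hand-picked weights (an assumption); a zero weight drops that feature,
# which is how a learned weight vector can act as feature selection.
weights = [0.7, 0.3, 0.0]
K = combine_kernels(kernels, weights)
```

Because a non-negative combination of positive semidefinite kernels is itself positive semidefinite, `K` can be handed to any kernel classifier that accepts a precomputed Gram matrix.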

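The cascade in contribution 3 can be sketched in the same spirit. A minimum distance classifier shortlists the k nearest classes, and a second-stage classifier scores only those candidates. Several parts are assumptions made to keep the sketch self-contained: the first stage uses plain Euclidean distance to class means rather than the learned model subspace and distance metric, and a least-squares linear scorer stands in for the multi-class SVM.

```python
import numpy as np

def class_means(X, y, n_classes):
    """Mean feature vector of each class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def candidate_classes(x, means, k):
    """Stage 1: indices of the k classes whose means are closest to x."""
    dist = np.linalg.norm(means - x, axis=1)
    return np.argsort(dist)[:k]

def fit_linear_ovr(X, y, n_classes):
    """Stand-in second stage (not an SVM): least-squares one-vs-rest scorer."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    T = -np.ones((len(X), n_classes))           # targets: +1 own class, -1 others
    T[np.arange(len(X)), y] = 1.0
    W, *_ = np.linalg.lstsq(Xb, T, rcond=None)
    return W

def cascade_predict(x, means, W, k):
    """Stage 2: score only the shortlisted classes and return the best one."""
    cand = candidate_classes(x, means, k)
    scores = np.append(x, 1.0) @ W[:, cand]
    return cand[np.argmax(scores)]

# Synthetic data (illustrative only): 5 classes in 10 dimensions.
rng = np.random.default_rng(0)
n_classes, dim, per_class = 5, 10, 40
centers = rng.normal(scale=5.0, size=(n_classes, dim))
y = np.repeat(np.arange(n_classes), per_class)
X = centers[y] + rng.normal(size=(len(y), dim))

means = class_means(X, y, n_classes)
W = fit_linear_ovr(X, y, n_classes)
pred = np.array([cascade_predict(x, means, W, k=2) for x in X])
```

The speed gain comes from the second stage evaluating only k class scores per image instead of one per class, which matters when the number of classes is large, as in Caltech256.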
    [161] A. Torralba.Contextual priming for object detection[J].International Journal of Computer Vision,2003,53(2):169~191.
    [162] J. Ponce,T. L. Berg,M. Everingham,D. A. Forsyth,M. Hebert,S. Lazebnik,M. Marszalek,C. Schmid,B. C. Russell,A. Torralba,C. K. I. Williams,J. Zhang and A. Zisserman.Dataset Issues in Object Recognition[M].In Toward Category-Level Object Recognition,Springer-Verlag Lecture Notes in Computer Science,2006.
    [163] G. Griffin, A. Holub, and P. Perona. Caltech 256 object category dataset[R]. Technical Report UCB/CSD-04-1366, California Institute of Technology, 2007.
    [164] Hao Zhang.Adapting Learning Techniques for Visual Recognition[D].PhD thesis,University of California,Berkeley,2007.
    [165] Jim Mutch,David G. Lowe.Multiclass Object Recognition with Sparse, Localized Features[C] . Proceedings of the 2006 IEEE Computer SocietyConference on Computer Vision and Pattern Recognition,Washington,DC,USA,June 17-22,2006:11~18.
    [166] A. Holub,M. Welling and P. Perona.Combining generative models and fisher kernels for object recognition[C].In Proceedings IEEE International Conference on Computer Vision,October 2005,Beijing,China,Vol. 1:136~143.
    [167] A. Bosch,A. Zisserman and X. Munoz.Image classification using random forests and ferns[C].In Proceedings of the 11th International Conference on Computer Vision,Rio de Janeiro,Brazil,2007:1~8.
    [168]徐磊,肖柏华,戴汝为,王春恒.一种面向大类别集的快速分类方法[J].计算机研究与发展, 2008, 45 (04): 588-595.
    [169] Weston J , Watkins C . Support vector machines for multi-Class pattern recognition[C].In: Proceedings of the European Symposium on Artificial Neural Networks,Bruges,1999:219~224.
    [170] Krebel U.Pairwise classification and support vector machines[C].In Advances in Kernel Methods-support Vector Learning,Cambridge,MA,MIT Press,1999:255~268.
    [171] Cortes C,Vapnik V.Support-vector networks[J].Machine Learning,1995,20(3):273~297.
    [172] Platt J C,Cristianini N,Shawe-Taylor J..Large margin DAG's for multiclass classification[C].Advances in Neural Information Processing Systems,MIT Press,Cambridge,MA,2000:547~553.
    [173] Schwenker F . Hierarchical Support Machines for Multi-class Pattern Recognition[C].In Proceedings of the Fourth International Conference on Knowledge-based Intelligent Engineering System & Allied Technologies,Chennai,2000:561~565.
    [174] Takahashi F , Abe S . Decision-Tree-Based Multiclass Support Vector Machines[C].In Proceedings of the Ninth International Conference on Neural Information Processing,Singapore,2002:1418~1422.
    [175] Hsu C,Lin C J.A comparison of methods for mufti-class support vector machines[J].IEEE Transactions on Neural Networks,2002,13(2):415~425.
    [176]任靖,李春平.最小距离分类器的改进算法——加权最小距离分类器[J].计算机应用,2005,25(5):992~994.
    [177] J. R. Smith,M. Naphade,A. Natsev.Multimedia semantic indexing using model vectors[C].Proceedings of the 2003 International Conference on Multimedia andExpo,July 06-09,2003:445~448.
    [178] Apostol Natsev,Milind R. Naphade and John R. Smith.Semantic Representation, Search and Mining of Multimedia Content[C].Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining,2004,Seattle,WA,USA:641～646.
    [179] Vassilis Athitsos,Alexandra Stefan,Quan Yuan,Stan Sclaroff.ClassMap: Efficient Multiclass Recognition via Embeddings[C].In Proceedings of the 11th International Conference on Computer Vision,Rio de Janeiro,Brazil,2007:1~8.
    [180] L. Yang.Distance metric learning: A comprehensive survey[R].Technical report,Michigan State University,May 19,2006.
    [181] D. Donoho.For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution[J].Communications on Pure and Applied Mathematics,2006,59(7):907~934.
    [182] John Wright,Arvind Ganesh,Allen Yang and Yi Ma.Robust Face Recognition via Sparse Representation[R] , University of Illinois , Tech Report UILU-ENG-07-2207 DC-229,June 2007.
    [183] Marcin Marszalek, Cordelia Schmid.Constructing Category Hierarchies for Visual Recognition[C].Proceedings of European Conference on Computer Vision,2008,Volume 4: 479~491.
    [184] G.Griffin, P. Perona . Learning and Using Taxonomies for Fast Visual Categorization[C] . IEEE Intl' Conf. on Computer Vision and Pattern Recognition,v.1,2008:1~8.
    [185] M. Marszalek and C. Schmid . Semantic hierarchies for visual object recognition[C].In IEEE Conference on Computer Vision & Pattern Recognition,Jun 2007.
    [186]刘志刚,李德仁,秦前清,史文中.基于特征空间中类间可分性的层次型多类支撑向量机[J].武汉大学学报,2004,29(4):324~328.
    [187]赵晖,荣莉莉,李晓.一种设计层次支持向量机多类分类器的新方法[J].计算机应用研究,2006,6:34~37.
    [188]厉小润,赵光宙,赵辽英.决策树支持向量机多分类器设计的向量投影法[J].控制与决策,2008,23(7):745~750.
    [189] Lei Wu,Xian-Sheng Hua,Nenghai Yu,Wei-Ying Ma,Shipeng Li.Flickr distance[C].ACM Multimedia ,2008:31~40.
    [190] C Fellbaum. WordNet:an Electronic Lexical Database[M].MIT Press ,Cambridge,Massachusetts,1998.
    [191] R.Richardson,A.F.Smeaton,J.Murphy.Using WordNet as a Knowledge Base for Measuring Semantic Similarity between Words[C].In proceedings of Artificial Intelligence and Cognitive Science Conference,Trinity College,Dublin,1994.
    [192]黄果,周竹荣,周亭.基于领域本体的语义相似度计算研究[J].计算机工程与科学,2007,29(5):112~117.
    [193] Dekang Lin.Using syntactic dependency as local context to resolve word sense ambiguity[C]. Proceedings of the 35th annual meeting on Association for Computational Linguistics,Madrid,Spain,July 07-12,1997:64~71.
    [194]蔡晓妍,戴冠中,杨黎斌.谱聚类算法综述[J].计算机科学,2008,35(7):14~18.
    [195]陈增照,杨扬,何秀玲,喻莹,董才林.基于核聚类的SVM多类分类方法[J].计算机应用,2007,27(1):47~49.
