Research on Image Classification Algorithms Based on the Bag-of-Words Model

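English title: Research on Image Categorization Based on Bag-of-words Model
Author: Wu Lina (吴丽娜)
Thesis level: Doctoral
Discipline: Computer Application Technology
Chinese keywords: 图像分类 (image classification) ; 词袋模型 (bag-of-words model) ; 视觉单词 (visual word) ; 视觉短语 (visual phrase) ; 迁移学习 (transfer learning)
English keywords: Image categorization ; The bag-of-visual words model ; Visual word ; Visual phrase ; Transfer learning
Degree year: 2013
Advisors: Luo Siwei (罗四维) ; Huang Yaping (黄雅平)
Discipline code: 081203
Degree-granting institution: Beijing Jiaotong University (北京交通大学)
Thesis submission date: 2013-12-01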

Abstract

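With the rapid development of the Internet, digital images have become pervasive in everyday life, and both their number and the number of image categories have grown enormously. Image classification helps people organize and manage images effectively and has therefore attracted increasing attention. Among image classification methods, the bag-of-words model, which is based on local features, achieves good classification performance and has been widely studied and applied.
     One important research topic for the bag-of-words model is how to build and optimize the visual vocabulary (the set of visual words) so that images are represented more effectively and classification performance improves. Another is how to use transfer learning to improve classification performance on new image categories. Transfer learning for the bag-of-words model not only avoids relearning the model from scratch for every new image category, but also suits classification tasks with only a few samples.
     With the goal of building a visual vocabulary suited to transfer learning, this thesis studies methods for optimizing and improving the visual vocabulary and proposes combining several visual words into visual phrases by means of local spatial information. Such visual phrases capture and represent features shared across different images more effectively, remove the "semantic ambiguity" of visual words, and can be transferred to the visual vocabulary of a new image category. The research has two parts. First, it studies how to obtain effective, discriminative visual words and visual phrases that carry spatial information, which supply the information needed for image classification (the appearance information and the spatial information of features). Second, for learning a new image category, especially when only a few image samples are available, it studies how to exploit knowledge from already learned categories, transferring visual phrases to speed up learning of the new category and improve classification performance. The main work and contributions of the thesis are the following three: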
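     First, a weighted minimal-redundancy-maximal-relevance (WMR-MR) criterion is proposed. Taking an information-theoretic view, the WMR-MR criterion uses the correlation between visual words and image categories and the correlation among visual words to jointly assess the relevance and redundancy of the visual vocabulary in classification. The vocabulary is optimized by deleting words that are weakly related to the category and redundant with other words in the vocabulary, which keeps the discriminative visual words while reducing the vocabulary size. With this criterion, a relatively small visual vocabulary can describe the image set while maintaining classification performance, which addresses the high computational complexity and inter-word redundancy caused by an overly large vocabulary. Such a small vocabulary also lays the foundation for building visual phrases and for transfer learning of visual phrases.
     Second, a method for building visual phrases that contain local spatial information is proposed. The spatial positions of local features are recorded while the features are extracted, and visual phrases are built from stable neighborhood relations among local features, yielding a visual phrase model that represents local spatial information. Compared with global spatial information, these visual phrases handle intra-class variation more flexibly and are more robust. Visual phrases also help remove the ambiguity that can arise when any one of their constituent words is used on its own, making the image description more reliable. Visual words, which describe the appearance of local image features, and visual phrases, which describe local spatial information, together form two cues for image classification. Because the degree of spatial structure differs across image categories, the algorithm balances the two cues with weights, so that it can be applied to classification tasks on different image categories.
     Third, a transfer learning algorithm based on visual phrases is proposed. Visual phrases are used to describe features shared across image categories, so that existing knowledge is fully exploited to help learn new categories. Experiments show that transferring visual phrases improves the classification performance of the bag-of-words model more effectively than transferring visual words directly. While learning a new image class, the algorithm iteratively adjusts the transferred visual phrases and keeps those that benefit classification of the new images, so that the classifier also performs well on the new class. Compared with classification algorithms that relearn the visual vocabulary, this transfer algorithm makes effective use of existing knowledge and achieves good classification performance even when the new category has few training samples.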
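
As an illustration of the first contribution, the following is a minimal sketch of greedy relevance-versus-redundancy selection of visual words. It is not the WMR-MR implementation of the thesis: the abstract does not state how the criterion is weighted, so the trade-off parameter `alpha`, the binary word-presence encoding, and the function names `mutual_information` and `select_words` are all assumptions made here for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_words(presence, labels, k, alpha=0.5):
    """Greedily pick k visual words (hypothetical weighting, not the thesis's).

    presence[w] holds one 0/1 entry per image: does word w occur in it?
    Each step picks the word maximising
        alpha * relevance - (1 - alpha) * mean redundancy with chosen words.
    """
    relevance = [mutual_information(col, labels) for col in presence]
    selected = []
    remaining = set(range(len(presence)))

    def score(w):
        if selected:
            redundancy = sum(
                mutual_information(presence[w], presence[s]) for s in selected
            ) / len(selected)
        else:
            redundancy = 0.0
        return alpha * relevance[w] - (1 - alpha) * redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: 8 images, 4 classes; word 1 duplicates word 0, word 2 is complementary.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
presence = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
print(select_words(presence, labels, k=2))  # prints [0, 2]
```

In this toy vocabulary all three words are equally relevant to the class label, but the duplicate word is skipped because it is fully redundant with the word already chosen, which is the behaviour the abstract attributes to pruning the vocabulary.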

Abstract (author's English version)

With the rapid development of the Internet, digital images have become pervasive in daily life, and both their number and the number of categories have increased massively. Image categorization has gained more and more attention because it helps people organize and manage images effectively. The bag-of-visual-words (BOV) model, which categorizes images based on local features, has been shown to yield state-of-the-art results.
     An important research topic for the BOV model is how to create and improve the vocabulary so that images are represented effectively and the performance of BOV improves. Another is transfer learning for BOV, which avoids learning the BOV model from scratch for each category. Transfer learning can retain good performance when the learning task has only a few images.
     This thesis analyzes each step of the BOV model (feature extraction, feature description, vector quantization, and classifier learning) and improves the vocabulary to fit transfer learning.
     This thesis studies methods for optimizing and improving the visual vocabulary, with the aim of creating a visual vocabulary for transfer learning. It proposes creating visual phrases by composing several visual words using spatial information. The visual phrases find and represent local spatial information common to different image categories, avoid semantic ambiguity, and can be transferred to the visual vocabulary of a novel image category. The research has two parts: the first is how to obtain a discriminative vocabulary and a set of phrases with spatial information, which provide the necessary knowledge (appearance information and spatial information); the second is how to use learned knowledge to speed up the learning of a new category and improve its performance, especially when there are only a few training images. The main contributions of the thesis are summarized as follows:
     1. A weighted minimal-redundancy-maximal-relevance (WMR-MR) criterion is defined. The WMR-MR criterion considers both the redundancy between one word and another and the relevance between a word and the category. The algorithm improves a vocabulary by eliminating redundant words that have little relevance to the category. The result is a discriminative vocabulary of relatively small size, which addresses the heavy computation and redundant words that a large vocabulary entails. The vocabulary obtained by this algorithm provides a basis for creating visual phrases and for transfer learning of phrases.
     2. An algorithm for creating visual phrases with local spatial information is proposed. Position information is obtained when local features are extracted, and stable neighbor relations between visual words are then modeled by visual phrases. Compared with global spatial information, the local spatial information of the visual phrases copes with intra-class variation and is robust. Moreover, visual phrases help eliminate the ambiguity that arises when a visual word is used for image categorization on its own, so visual phrases are more reliable than words. Visual words, which represent the appearance information of local features, and visual phrases, which represent local spatial information, are integrated as two sources of information for categorization. The algorithm balances these two sources by adjusting the weights for various image categories, so it can be applied to different image categorization tasks.
     3. A transfer learning algorithm based on visual phrases is proposed. The algorithm describes the common features of various image categories by visual phrases, with the aim of using learned knowledge to help learn a novel image category. While learning a novel category, the algorithm iteratively adjusts the visual phrases to be transferred and retains those that are helpful for image categorization. The retained visual phrases give the classifier for the novel category good performance after transfer learning. Compared with relearning the vocabulary, this transfer learning algorithm uses learned knowledge effectively and achieves good performance, especially when the novel category has only a few images.
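
To make the second contribution more concrete, here is a minimal sketch of one way visual phrases could be formed from neighboring visual words and then weighed against the plain word histogram. The abstract does not specify how neighbor relations, stability, or similarity are defined, so the distance threshold `radius`, the support threshold `min_images`, the restriction to word pairs, the histogram-intersection similarity, and all function names are assumptions for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def image_phrases(keypoints, radius):
    """Count unordered word pairs whose keypoints lie within `radius`.

    keypoints is a list of (x, y, word_id) tuples for one image.
    """
    phrases = Counter()
    for (x1, y1, w1), (x2, y2, w2) in combinations(keypoints, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            phrases[tuple(sorted((w1, w2)))] += 1
    return phrases


def stable_phrases(images, radius, min_images):
    """Keep pairs that are neighbors in at least `min_images` images."""
    support = Counter()
    for keypoints in images:
        support.update(set(image_phrases(keypoints, radius)))
    return {p for p, n in support.items() if n >= min_images}


def histogram_intersection(a, b):
    """Similarity in [0, 1] of two count histograms after L1 normalisation."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    return sum(min(a[k] / total_a, b[k] / total_b) for k in a.keys() & b.keys())


def combined_similarity(kps_a, kps_b, phrases, radius, weight):
    """Blend the word cue and the phrase cue; weight=1 uses words only."""
    words_a = Counter(w for _, _, w in kps_a)
    words_b = Counter(w for _, _, w in kps_b)
    phr_a = Counter({p: c for p, c in image_phrases(kps_a, radius).items() if p in phrases})
    phr_b = Counter({p: c for p, c in image_phrases(kps_b, radius).items() if p in phrases})
    return (weight * histogram_intersection(words_a, words_b)
            + (1 - weight) * histogram_intersection(phr_a, phr_b))


# Toy images: words 1 and 2 are neighbors in the first two images only.
images = [
    [(0, 0, 1), (1, 0, 2), (10, 10, 3)],
    [(5, 5, 2), (5, 6, 1), (20, 20, 4)],
    [(0, 0, 1), (9, 9, 2)],
]
phrases = stable_phrases(images, radius=2, min_images=2)
print(phrases)
print(combined_similarity(images[0], images[1], phrases, radius=2, weight=0.5))
```

Setting `weight` closer to 1 favors the appearance cue, which would suit categories with little spatial structure; values closer to 0 favor the phrase cue.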

