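Research on Cross-modal Semantic Information Acquisition Methods in Image Retrieval

English title: Cross-modal Semantic Information Acquisition for Image Retrieval
Author: He Ning (何宁)
Thesis level: Doctoral
Discipline: Computer Software and Theory
Chinese keywords: 用户标注 ; 对象语义 ; 语义获取 ; 跨模语义 ; 特征描述子
English keywords: User tagging ; Object semantics ; Semantic acquisition ; Cross-mode semantics ; Feature descriptor
Degree year: 2013
Advisor: Cao Jiaheng (曹加恒)
Subject code: 081202
Degree-granting institution: Wuhan University (武汉大学)
Thesis submission date: 2013-04-01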

Abstract

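With the development of image acquisition and sharing technologies, the volume of image data that people face has grown substantially. How to retrieve the images a user is interested in, efficiently and accurately, has become a pressing problem. Although Content-Based Image Retrieval (CBIR) has made considerable progress in recent years, it still cannot meet users' needs, for three main reasons. First, the semantic gap between low-level visual features and high-level semantic concepts keeps the precision of CBIR below what is required. Second, the image feature vectors used by CBIR are usually very long, so processing is slow. Third, the query input of CBIR is not user friendly, because users often have difficulty finding a query example that resembles the image they want. Text-Based Image Retrieval (TBIR) uses only textual information to index and search images. Compared with visual information, text inherently describes image content with low-dimensional, simple concepts that are easier for humans to understand. However, TBIR usually requires manual semantic annotation and is therefore suitable only for small, specialized image collections. The recent growth of social networks has made semantic annotation of large volumes of image data feasible, but this semantic information is highly arbitrary, contains a great deal of noise, and is incomplete.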
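     To address these problems in retrieving image data from Internet image databases, this thesis combines the respective strengths of CBIR and TBIR and studies methods for cross-modal semantic information acquisition. The main research work is as follows: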
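     1. We study semantic acquisition techniques for image retrieval that span text and visual content, and propose a Cross-modal Semantic Information Acquisition (CSIA) model. The framework takes semantic objects as its core: it automatically acquires object semantics from the low-level features of an image and, together with a content-based similarity algorithm, jointly models the user tag text and the object semantics obtained from low-level features in order to acquire high-level semantics. CSIA bridges low-level image features and high-level semantics. It avoids the limited variety of semantics obtained from content alone and improves the reliability of the semantic information in user tags, making it more effective than semantic extraction based purely on text or purely on content.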
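     2. We study content-based automatic extraction of image semantics and propose a feature descriptor based on object contour shape, the Scale Space Histogram of Oriented Gradient (SSHOG). It adopts a multi-granularity strategy to describe objects at multiple scales and is applied to the automatic acquisition of object semantics in images. The histogram of oriented gradients (HOG) is the most effective feature descriptor in the field of object detection, but it captures the semantic features of objects at only one fixed scale, which limits the recognition rate. Object features are multi-scale in nature: recognizing some parts requires fine-grained detail features, other parts require coarse-grained global features, and still others require the two combined. Experiments on the INRIA Person Dataset, a pedestrian detection benchmark, comparing SSHOG with the widely used HOG descriptor show that SSHOG improves the recognition accuracy for objects in images.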
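     3. We study the use of image similarity measures in image retrieval and propose a new spatiogram distance measure, applied to mapping low-level image features to high-level semantics. Building on a systematic analysis of classical color and spatial image features, and using tools from Lie group theory, we apply the Lie Group Spatiogram Similarity (LGSS) measure to image semantic acquisition. As an extension of the color histogram, the spatiogram (spatial histogram) effectively compensates for the color histogram's loss of spatial distribution information. However, because a spatiogram is no longer a simple vector but a set of Gaussian distributions (that is, Gaussian functions), measuring the similarity between spatiograms is difficult. The notion of similarity is tied to the structure of the topological space in which the objects live (for example, Euclidean space or a manifold); that is, similarity reflects how near or far an object is from other objects in that space. We therefore exploit the Lie group structure of the space of Gaussian functions and compare images using a spatiogram similarity measure based on geodesic distances between Lie group elements. Experiments on the Corel dataset, an image retrieval benchmark, show that LGSS-based retrieval outperforms retrieval methods based on other spatiogram similarity measures.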
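     4. We study methods for image semantic fusion and cleaning, and propose a method for fusing the textual semantics and content semantics of images. The method uses both image content and the image's annotation text for semantic fusion, and can effectively obtain the semantic information in an image that matches the user's retrieval intent. On the one hand, object semantics are extracted automatically from image content (that is, automatic annotation) to supplement user annotations. On the other hand, user annotations are cleaned using a content-based similarity measure: erroneous tags are filtered out, and missing tags are added automatically according to the tag correlation of similar images. The semantic information finally extracted retains the richness of user-annotated semantics while avoiding the large amount of noise in user annotations. Experiments on the NUS-WIDE dataset, a standard benchmark for multimodal image retrieval, show that both automatic semantic extraction and content-similarity-based cleaning of user annotations improve final retrieval performance.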
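The tag cleaning described in item 4 (filter tags that visually similar images do not support, add tags that they agree on) can be illustrated with a small neighbor-voting sketch. This is an assumption-laden illustration, not the thesis's method: the function name `refine_tags`, the parameters `k`, `keep_min` and `add_min`, the Euclidean distance, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def refine_tags(features, tags, k=3, keep_min=1, add_min=2):
    """Clean and complete user tags by voting among the k visually nearest images.

    features: dict image_id -> feature vector (list of floats)
    tags:     dict image_id -> set of user tags
    A tag is kept if at least keep_min neighbours carry it (hypothetical rule),
    and a tag is added if at least add_min neighbours carry it (hypothetical rule).
    """
    refined = {}
    for img, feat in features.items():
        others = sorted((i for i in features if i != img),
                        key=lambda i: euclidean(feat, features[i]))
        neighbours = others[:k]
        votes = Counter(t for n in neighbours for t in tags[n])
        kept = {t for t in tags[img] if votes[t] >= keep_min}
        added = {t for t, c in votes.items() if c >= add_min}
        refined[img] = kept | added
    return refined

# Toy data: two visually distinct groups; "cat" on image "c" is a noisy tag.
features = {
    "a": [0.0, 0.0], "b": [0.1, 0.0], "c": [0.0, 0.1],
    "d": [5.0, 5.0], "e": [5.1, 5.0], "f": [5.0, 5.1],
}
tags = {
    "a": {"beach", "sea"}, "b": {"beach", "sea"}, "c": {"beach", "cat"},
    "d": {"city"}, "e": {"city", "night"}, "f": {"city", "night"},
}
refined = refine_tags(features, tags, k=2)
print(refined["c"])  # the unsupported tag "cat" is dropped, "sea" is added
print(refined["d"])  # "night" is added from the two nearest neighbours
```

Simple voting treats every neighbor equally and uses fixed thresholds; a formulation that weights neighbors by similarity or optimizes all tags jointly would behave differently on real data.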
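     This research responds to the new characteristics of Internet image databases. It combines the respective strengths of two modalities, visual content and text, so that each compensates for the other's weaknesses, and extracts image semantic information in the service of image retrieval systems. This is in line with the direction of technical development and is of significant value to the advancement of image retrieval technology.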
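To make the spatiogram idea from item 3 concrete, the sketch below builds a second-order spatiogram (per-bin pixel count, mean position, and position covariance) for a tiny grayscale image and shows that two images with identical histograms can still have different spatiograms. It does not implement the LGSS measure itself; the function names, the two-bin quantization, and the test images are assumptions chosen for illustration.

```python
from collections import defaultdict

def spatiogram(image, n_bins=2, max_value=255):
    """Second-order spatiogram of a grayscale image given as a list of rows.

    Returns {bin: (count, (mean_x, mean_y), covariance)} where the mean and
    the 2x2 covariance summarise the coordinates of the pixels in each bin.
    """
    coords = defaultdict(list)
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            b = min(value * n_bins // (max_value + 1), n_bins - 1)
            coords[b].append((x, y))
    result = {}
    for b, pts in coords.items():
        n = len(pts)
        mx = sum(p[0] for p in pts) / n
        my = sum(p[1] for p in pts) / n
        sxx = sum((p[0] - mx) ** 2 for p in pts) / n
        syy = sum((p[1] - my) ** 2 for p in pts) / n
        sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
        result[b] = (n, (mx, my), ((sxx, sxy), (sxy, syy)))
    return result

def histogram(image, n_bins=2, max_value=255):
    """Plain histogram: the spatiogram with the spatial statistics discarded."""
    return {b: n for b, (n, _, _) in spatiogram(image, n_bins, max_value).items()}

# Two 4x4 images with the same number of dark and bright pixels,
# arranged differently: dark on the left versus dark on top.
left_dark = [[0, 0, 255, 255] for _ in range(4)]
top_dark = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]

print(histogram(left_dark) == histogram(top_dark))  # True: histograms cannot tell them apart
print(spatiogram(left_dark)[0][1])                  # (0.5, 1.5): dark pixels centred on the left
print(spatiogram(top_dark)[0][1])                   # (1.5, 0.5): dark pixels centred at the top
```

Each bin's mean and covariance define a Gaussian over pixel positions, which is why comparing two spatiograms amounts to comparing sets of Gaussians rather than plain vectors.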
As producing and sharing images becomes ever easier, the image databases we use grow ever larger. How to retrieve the images we are interested in, both effectively and efficiently, has therefore become an important and urgent question. Although Content-Based Image Retrieval (CBIR) has been studied extensively for more than a decade, three limitations restrict its practicality. First, the precision of CBIR is usually unsatisfactory because of the semantic gap between low-level visual features and high-level semantic concepts. Second, the efficiency of CBIR is usually low because visual features are high dimensional. Third, the query form of CBIR is unnatural for image search, since an appropriate example image may not be available. In contrast, Text-Based Image Retrieval (TBIR) relies solely on text to index and search images. Compared with visual information, text represents image content in terms of human concepts, so it is low dimensional and much easier to process. TBIR is therefore a straightforward way to overcome the disadvantages of CBIR, but manually annotating a large-scale image database is impractical. Recently, as social networks have become popular, more and more users share and annotate images on the web. However, such user-supplied tags are noisy and incomplete.
     In this thesis, we combine the text and visual modalities to extract semantic information from images. Our major contributions are:
     1. We study cross-modal semantic acquisition for image retrieval and propose a framework for Cross-modal Semantic Information Acquisition (CSIA). Based on this framework, we implement cross-modal semantic acquisition: semantics are extracted from both text and visual content and then fused. Compared with single-modal semantic acquisition, our framework is more effective for image retrieval.
     2. We investigate the automatic image annotation problem and propose a new feature descriptor, the Scale Space Histogram of Oriented Gradient (SSHOG), for content-based image semantic acquisition. SSHOG describes images in a multi-scale way based on scale space theory. Since objects in the real world have multi-scale properties, SSHOG is more effective than single-scale features. We test our SSHOG-based image semantic acquisition method on the INRIA Person Dataset, and the experimental results show the effectiveness of our method.
     3. We investigate image distance measures for image retrieval and propose an image retrieval approach based on a Lie group spatiogram similarity measure. The spatiogram is an extension of the ordinary histogram. Whereas a histogram captures only color information and discards all spatial information, a spatiogram captures both the distribution of color and the distribution of pixel locations, the latter modeled with Gaussian distributions. However, Gaussian functions are not vectors; they form a Lie group. We therefore adopt the Lie group spatiogram similarity, which is based on Lie group analysis of the Gaussian function space, for image retrieval. We test our retrieval method on the Corel dataset. Experimental results indicate that our method is more effective than other spatiogram-based methods.
     4. We address the semantic fusion problem and propose a method to extract and fuse semantics from text and visual content. On the one hand, we automatically annotate images based on visual content and combine the resulting annotations with user annotations. On the other hand, we refine the annotations based on semantic consistency and content similarity. We model semantic fusion as a constrained optimization problem whose constraints include semantic consistency, content consistency, and error sparsity, among others. We test our method on the NUS-WIDE and MIRFlickr-25K datasets. Experimental results show the effectiveness of our method.
     Because cross-modal semantic information acquisition avoids the disadvantages of relying on a single modality, our work is useful for image retrieval.
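The multi-scale idea behind SSHOG in item 2 can be sketched as a gradient orientation histogram computed at several resolutions and concatenated into one descriptor. This is a deliberately simplified illustration under stated assumptions, not the SSHOG descriptor: it uses one histogram per scale instead of HOG's cells and normalized blocks, and 2x2 averaging stands in for Gaussian scale-space smoothing. The function names and parameters are hypothetical.

```python
import math

def blur_and_halve(image):
    """Average each 2x2 block: a crude stand-in for smoothing plus subsampling."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
              image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def orientation_histogram(image, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0 to 180 degrees)."""
    hist = [0.0] * n_bins
    h, w = len(image), len(image[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]   # central difference in x
            gy = image[y + 1][x] - image[y - 1][x]   # central difference in y
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / (180.0 / n_bins)), n_bins - 1)] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def multi_scale_hog(image, n_scales=3, n_bins=9):
    """Concatenate orientation histograms from progressively coarser scales."""
    features = []
    for _ in range(n_scales):
        features.extend(orientation_histogram(image, n_bins))
        image = blur_and_halve(image)
    return features

# 16x16 test image with a single vertical edge in the middle.
img = [[0.0] * 8 + [255.0] * 8 for _ in range(16)]
desc = multi_scale_hog(img)
print(len(desc))  # 27: three scales times nine orientation bins
```

For this synthetic edge every scale puts all of its weight in the first orientation bin; on natural images the fine scales respond to detail and the coarse scales to overall shape, which is the motivation the abstract gives for a multi-scale descriptor.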
