Research on Automatic Annotation in Image Retrieval

Original title: 图像检索中自动标注技术的研究
Author: Zhao Yufeng (赵玉凤)
Degree level: Doctoral
Discipline: Signal and Information Processing
Keywords: Image Retrieval ; Automatic Image Annotation ; Semi-supervised Learning ; Multiple-instance Learning ; Pseudo Relevance Feedback ; Hidden Markov Model ; Transductive Learning
Degree year: 2009
Advisor: Zhao Yao (赵耀)
Discipline code: 081002
Degree-granting institution: Beijing Jiaotong University
Submission date: 2009-06-01
Defense committee chair: Yuan Baozong (袁保宗)

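Abstract

With the development of multimedia and computer network technology, the volume of image data that people encounter is growing rapidly. Faced with massive image collections, content-based image retrieval (CBIR) can effectively analyze, organize, and manage image data, and has therefore become a research focus in the multimedia field. However, CBIR is constrained by the "semantic gap": low-level visual features (such as color, texture, and shape) cannot fully reflect or match the user's query intent, which poses a major challenge for CBIR. Automatic image annotation, developed in recent years, aims to build a bridge between high-level semantics and low-level features, and is one effective way to address the semantic gap.
     To address the problems and shortcomings of current automatic image annotation techniques, this dissertation explores mining the semantic concepts of image content from four perspectives: semi-supervised learning, small-sample learning, pseudo relevance feedback, and multi-view semantic correlation analysis. The aim is to strengthen the semantic understanding of image content and improve annotation performance. The main results and contributions are as follows:
     (1) Automatic image annotation in a semi-supervised setting
     The dissertation first examines the characteristics of the annotation problem itself: because an image is annotated with several keywords and also contains several regions, annotation is a multi-class, multiple-instance learning problem. On this basis, it proposes performing automatic image annotation in a semi-supervised setting. By analyzing each semantic keyword independently within a multiple-instance learning framework, the multi-class classification problem is converted into binary classification problems under semi-supervision, yielding a hierarchical description of semantic granularity, with the aim of effectively mining the intrinsic semantic concepts of images. Experimental results confirm the effectiveness of this annotation framework.
     (2) The small-sample learning problem in automatic image annotation
     Although image annotation has made considerable progress, the diversity of keyword semantic categories means that training images are relatively scarce (the small-sample learning problem), so annotation results remain unsatisfactory. To address this problem, the dissertation studies a multiple-instance learning strategy under the minimum reference set (MRS) framework. Representing the semantics of a keyword by the set of representative instances with the minimum MRS improves the robustness of multiple-instance learning, so that annotation performance improves markedly when training samples are insufficient.
     (3) Automatic image annotation in a pseudo relevance feedback framework
     From a data-mining perspective, image retrieval and image annotation are to some extent consistent and complementary. To address shortcomings of existing search-based image annotation, such as the low precision of the relevant image set and the heavy burden on users, the dissertation integrates a pseudo relevance feedback mechanism and builds a pseudo-relevance conditional probability annotation model. Iterative search runs automatically without human intervention, with the aim of obtaining a more reliable set of relevant images. In addition, text analysis techniques are used to obtain the semantic correlations among keywords, which better serves the annotation task.
     (4) Multi-view semantic correlation analysis
     How to mine semantics-based multi-view relevance models is an important and pressing research topic in automatic image annotation. From the perspective of probabilistic relevance models, the dissertation analyzes the feasibility of using hidden Markov models for automatic image annotation. Within a transductive support vector machine framework, it establishes the correspondence between images and keywords. By fusing keyword co-occurrence with a semantic lexicon, it efficiently obtains the semantic correlations between keywords. The resulting multi-view relevance model, covering image-keyword and keyword-keyword relations, helps solve the automatic image annotation task.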
With the development of multimedia technology and computer networks, content-based image retrieval (CBIR) has become increasingly important for organizing, indexing, and retrieving massive image collections in many application scenarios, and has emerged as an active research topic in recent years. However, progress in CBIR is hindered by the well-known semantic gap between low-level visual features (e.g. color, texture, shape) and high-level semantic concepts. Automatic image annotation (AIA) is a feasible way to narrow the semantic gap, since it attempts to build a bridge between low-level visual features and high-level semantic concepts.
     To address the problems and difficulties in AIA, this dissertation mines the semantic concepts of images from four perspectives: semi-supervised learning, small-sample learning, pseudo relevance feedback, and multi-view semantic relationship analysis. Strengthening the semantic understanding of image content along these four lines is intended to improve the performance of AIA. The main contributions of the dissertation are as follows:
     (1) Automatic image annotation with semi-supervised learning
     The dissertation first analyzes the nature of AIA: one image is annotated with several keywords and is segmented into many regions, so AIA is both a multi-class classification problem and a multiple-instance learning (MIL) problem. On this basis, the dissertation proposes solving AIA in a semi-supervised manner. By analyzing each keyword independently within the MIL framework, the multi-class problem is transformed into binary classification problems, which gives a hierarchical description of semantic granularity and aims to mine the intrinsic semantic concepts effectively. The experimental results verify the effectiveness of the proposed framework.
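
The per-keyword decomposition described above can be sketched as follows. This is only an illustration under assumed data structures, not the dissertation's implementation: each image is treated as a bag, each keyword gets its own binary labeling of the bags, and images without annotations are marked as unlabeled so that a semi-supervised learner could use them. The function name `binary_bag_labels` and the toy data are hypothetical.

```python
from typing import Dict, Optional, Sequence


def binary_bag_labels(
    annotations: Dict[str, Optional[Sequence[str]]],
    vocabulary: Sequence[str],
) -> Dict[str, Dict[str, int]]:
    """Turn multi-keyword annotations into one binary task per keyword.

    For each keyword, an image (bag) is labeled +1 if annotated with the
    keyword, -1 if annotated without it, and 0 if it has no annotation
    at all (unlabeled data for the semi-supervised setting).
    """
    labels: Dict[str, Dict[str, int]] = {}
    for word in vocabulary:
        labels[word] = {
            image: 0 if words is None else (1 if word in words else -1)
            for image, words in annotations.items()
        }
    return labels


# Hypothetical toy data: image id -> keywords, None means unlabeled.
annotations = {
    "img1": ["sky", "sea"],
    "img2": ["tiger", "grass"],
    "img3": None,
}
vocabulary = ["grass", "sea", "sky", "tiger"]
labels = binary_bag_labels(annotations, vocabulary)
print(labels["sky"])
```

Each entry of `labels` is then an independent binary multiple-instance problem; the region features inside each bag are omitted here for brevity.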
     (2) Small-sample learning in automatic image annotation
     Although recent research has brought many improvements, the small-sample problem remains prominent in AIA: because keyword semantic categories are diverse, training images are relatively scarce, which greatly degrades annotation performance. To address this problem, the dissertation investigates an MIL strategy based on the minimum reference set (MRS). The representative instance set with the smallest MRS is used to characterize the semantic content of each keyword. Because this improves the robustness of MIL, annotation quality improves considerably when training samples are insufficient.
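
To give a feel for the reference-set idea, the sketch below builds a small subset of samples that still classifies every training sample correctly under the nearest-neighbor rule. It uses a greedy condensing procedure (in the style of condensed nearest neighbor), which yields a consistent subset but not necessarily the minimum one, and it is an assumption for illustration rather than the MRS-based MIL procedure of the dissertation. All names and data are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


def nearest_label(x: Sequence[float], store: List[Sample]) -> int:
    """Label of the stored sample closest to x (1-nearest-neighbor rule)."""
    return min(store, key=lambda s: dist(x, s[0]))[1]


def condensed_reference_set(samples: List[Sample]) -> List[Sample]:
    """Greedily grow a reference set until 1-NN on it fits all samples.

    A sample is added whenever the current reference set misclassifies it.
    The result is a consistent subset, only an approximation of a true
    minimum reference set.
    """
    store: List[Sample] = [samples[0]]
    changed = True
    while changed:
        changed = False
        for sample in samples:
            x, y = sample
            if sample not in store and nearest_label(x, store) != y:
                store.append(sample)
                changed = True
    return store


# Hypothetical toy data: two well-separated classes in 2-D.
samples = [
    ((0.0, 0.0), 0), ((0.1, 0.0), 0), ((0.0, 0.1), 0),
    ((1.0, 1.0), 1), ((0.9, 1.0), 1),
]
reference = condensed_reference_set(samples)
print(len(reference), "of", len(samples), "samples kept")
```

A smaller reference set suggests that the chosen representation separates the classes well, which is the intuition behind using the reference-set size as a selection criterion when training data is scarce.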
     (3) Automatic image annotation based on pseudo relevance feedback
     From a data-mining perspective, image annotation and image search are to some extent consistent and complementary. To overcome the difficulties of search-based image annotation, e.g. the low precision of the relevant image set and the heavy burden on users, the dissertation integrates pseudo relevance feedback into AIA and builds a pseudo-relevance conditional probability annotation model. Iterative search then proceeds without human intervention, with the aim of obtaining a more reliable set of relevant images, and the semantic correlations among keywords are mined by text analysis, which leads to better annotation performance.
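
A generic pseudo relevance feedback loop for search-based annotation might look like the sketch below. It assumes the top-ranked images are relevant without asking the user, refines the query toward their centroid, and finally lets the retrieved images vote on keywords. The centroid update and the voting step are simplifying assumptions for illustration; they stand in for, and are not, the conditional probability model built in the dissertation. The function name and toy database are hypothetical.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Item = Tuple[Sequence[float], List[str]]  # (feature vector, keywords)


def pseudo_feedback_annotate(
    query: Sequence[float],
    database: List[Item],
    k: int = 2,
    rounds: int = 3,
    n_words: int = 2,
) -> List[str]:
    """Annotate a query image by iterative search without user feedback."""
    q = list(query)
    for _ in range(rounds):
        # Treat the k nearest images as pseudo-relevant.
        top = sorted(database, key=lambda item: dist(q, item[0]))[:k]
        # Move the query to the centroid of the pseudo-relevant features.
        q = [sum(column) / len(top) for column in zip(*(f for f, _ in top))]
    top = sorted(database, key=lambda item: dist(q, item[0]))[:k]
    # Keywords that occur most often among the retrieved images win.
    votes = Counter(word for _, words in top for word in words)
    return [word for word, _ in votes.most_common(n_words)]


# Hypothetical toy database of annotated images.
database = [
    ((0.0, 0.0), ["sky", "sea"]),
    ((0.1, 0.1), ["sky", "cloud"]),
    ((1.0, 1.0), ["tiger", "grass"]),
    ((0.9, 1.1), ["tiger", "forest"]),
]
predicted = pseudo_feedback_annotate((0.2, 0.0), database)
print(predicted)
```

Because no user judges the results, errors in the first retrieval round can propagate, which is why the reliability of the pseudo-relevant set matters.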
     (4) Semantic relationship analysis from multiple views
     How to build semantics-based relevance models from multiple views is an important and pressing research topic in AIA. From the perspective of probabilistic relevance models, the dissertation analyzes the feasibility of using the Hidden Markov Model (HMM) for AIA. Within the framework of the transductive support vector machine, the correspondence between images and keywords is constructed. Moreover, the semantic relations between keywords are mined by combining keyword co-occurrence with WordNet. The resulting multi-view relevance model, covering image-keyword and keyword-keyword relations, helps improve the quality of AIA.
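
The keyword-keyword side of such a model can be illustrated with co-occurrence statistics alone. The sketch below estimates, from training annotations, how often one keyword appears given that another is present. It is a minimal illustration under assumed inputs; the WordNet component and the way the dissertation fuses the two sources are not reproduced here. The function name and toy annotations are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Sequence, Tuple


def cooccurrence_probabilities(
    annotation_sets: Iterable[Sequence[str]],
) -> Dict[Tuple[str, str], float]:
    """Estimate P(b | a) for keyword pairs from image annotations.

    The value stored under key (a, b) is the fraction of images annotated
    with keyword a that are also annotated with keyword b.
    """
    word_count: Counter = Counter()
    pair_count: Counter = Counter()
    for words in annotation_sets:
        unique = sorted(set(words))
        word_count.update(unique)
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] += 1
            pair_count[(b, a)] += 1
    return {(a, b): n / word_count[a] for (a, b), n in pair_count.items()}


# Hypothetical toy training annotations.
training = [
    ["sky", "sea"],
    ["sky", "cloud"],
    ["tiger", "grass"],
    ["sky", "sea", "beach"],
]
probs = cooccurrence_probabilities(training)
print(probs[("sea", "sky")], probs[("sky", "sea")])
```

Such estimates are asymmetric: in the toy data every image with "sea" also has "sky", but not the reverse, which is one reason co-occurrence is often complemented by a lexical resource.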

