Research on Image Annotation and Image Annotation Refinement

Original title: 图像的语义标注及其改善问题研究
Author: Liu Zheng (刘峥)
Degree level: Doctoral
Discipline: Computer System Architecture
Keywords: Image Retrieval; Image Annotation; Web Image Annotation; Image Annotation Refinement
Degree year: 2011
Supervisor: Ma Jun (马军)
Discipline code: 081201
Degree-granting institution: Shandong University (山东大学)
Submission date: 2011-04-20

Abstract

With the rapid spread of digital cameras and camera-equipped mobile phones, the number of digital images is growing explosively, and the fast development of the Internet lets more and more people use these images conveniently, quickly, and cheaply. The problem today is no longer a shortage of image resources but how to find the needed information in a vast sea of image data, so fast and efficient retrieval over large-scale image collections has become an urgent problem. Existing image retrieval systems mainly rely on the semantic annotations of images to carry out semantic-based retrieval, but as the number of images surges, manual annotation is clearly impractical. Automatic semantic annotation has therefore become an important issue in image retrieval and is attracting growing attention from both academia and industry. Because the accuracy of existing annotation methods is still not satisfactory, optimizing and refining the annotations of already-annotated images has become one of the important problems in this research area.
     In this dissertation, we propose a series of targeted methods for semantic annotation and annotation refinement for different kinds of images. The main contributions and innovations fall into the following five aspects.
     (1) We present an image annotation method based on the LDA topic model. First, a bag-of-visual-words model is built from a training set, and the LDA model is used to compute the relevance between the unlabeled image and each word in the annotation dictionary, which yields the initial annotations. Next, we propose a search-based annotation expansion method: the initial annotations are submitted to an image search engine, images similar to the unlabeled image are selected from the search results, and extended annotations are extracted from the surrounding text of those similar images. Finally, the initial and extended annotation sets are merged to form the final annotations.
     (2) We present an image annotation method for social image-sharing communities. Because such sites let users supply tags when uploading images, we exploit these user-supplied tags for annotation. First, the user-supplied tags are filtered to obtain the initial tags. The segmented regions of the image to be annotated serve as queries, the visual features corresponding to the initial tags serve as the data points to be ranked, and a manifold-ranking algorithm ranks the initial tags by their relevance to the queries. Next, using the Flickr API and a weighted voting scheme, the top-ranked initial tags are expanded into a set of extended tags. Finally, the top-ranked initial tags and the extended tags are combined to form the final annotations.
     (3) We propose an image annotation approach for personal photo collections in image-sharing communities. First, locality-sensitive hashing maps the SIFT descriptors of an image into hash buckets. Treating each bucket as one bin, the image is converted into a histogram, and the visual similarity of two images is computed from the distance between their histograms. Two images in the same personal collection are considered near-duplicates when their similarity exceeds a predefined threshold, and near-duplicates are removed. Then image visual features and GPS coordinates are used to construct a tripartite graph, and the images of the collection are clustered by partitioning this graph. Next, the Corel5K dataset is used as the training set to build a bag-of-visual-words model and to obtain a visual word vector for each word in its annotation dictionary. For each photo cluster, a visual word vector is computed with the same model, and all words in the training vocabulary are ranked by the distance between their visual word vectors and that of the cluster. Finally, the highest-ranked words are kept as the annotations of the cluster.
     (4) We propose a hierarchical Web image annotation approach based on a bipartite graph reinforcement algorithm and concept ontology reasoning. Given a Web image, initial annotations are extracted from the surrounding text and other textual information of the hosting Web page. A concept ontology is then applied to perform hierarchical probabilistic concept reasoning, which produces a set of extended annotations at multiple levels. The initial annotations and the extended annotations form the two disjoint vertex sets of a bipartite graph, on which the reinforcement algorithm re-ranks both sets. Finally, an annotation selection policy picks the final annotations from the ranked initial and extended sets, keeping those with the highest ranking scores.
     (5) We present an algorithm for the image annotation refinement (IAR) problem based on graph partitioning and an image search engine. The algorithm prunes noisy words from the candidate annotation set to improve annotation accuracy. Its core idea is to treat the candidate annotations as graph vertices and the relevance between two candidate annotations as the basis of the edge weight, which converts annotation refinement into a weighted graph partitioning problem. Each edge weight is the annotation relevance weighted by two parameters. The first is the relevance between a candidate annotation and the visual features of the image, computed from the results returned by an image search engine. The second is the importance of the candidate annotation in the hosting Web page, and it applies only to Web images. A heuristic algorithm then computes a Max-Cut of the graph, and one of the two resulting vertex sets is chosen as the final annotations.
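     The graph-partitioning step in (5) can be illustrated with a minimal sketch. The abstract does not say which Max-Cut heuristic is used, so the code below substitutes a simple 1-flip local search. The annotation words and edge weights are made-up values, and the sketch assumes that a larger weight means two words are more likely to belong on opposite sides of the cut.

```python
def cut_weight(weights, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for (u, v), w in weights.items() if side[u] != side[v])


def local_search_max_cut(vertices, weights):
    """1-flip local search for Max-Cut (a stand-in heuristic, not the
    dissertation's algorithm). A vertex is moved across the cut whenever
    that strictly increases the cut weight, so the loop terminates."""
    side = {v: 0 for v in vertices}  # start with every vertex on side 0
    improved = True
    while improved:
        improved = False
        for v in vertices:
            same = other = 0.0
            for (a, b), w in weights.items():
                if v not in (a, b):
                    continue
                u = b if a == v else a  # the neighbour on this edge
                if side[u] == side[v]:
                    same += w
                else:
                    other += w
            # Flipping v turns "same" edges into cut edges and vice versa.
            if same > other:
                side[v] = 1 - side[v]
                improved = True
    return side


# Hypothetical candidate annotations and edge weights (illustrative only).
vertices = ["sky", "cloud", "car", "road"]
weights = {
    ("sky", "cloud"): 0.1,
    ("car", "road"): 0.1,
    ("sky", "car"): 0.9,
    ("sky", "road"): 0.8,
    ("cloud", "car"): 0.7,
    ("cloud", "road"): 0.9,
}
side = local_search_max_cut(vertices, weights)
group_a = [v for v in vertices if side[v] == 0]
group_b = [v for v in vertices if side[v] == 1]
print(group_a, group_b, cut_weight(weights, side))
```

     Choosing which of the two groups becomes the final annotation set needs a separate criterion, which the abstract does not specify and this sketch leaves out.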
     In short, research on image annotation and its refinement helps in understanding the semantic concepts contained in images and improves the performance of image retrieval systems. It is also of considerable significance to multimedia research.
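     As an illustration of the near-duplicate test described in (3), the sketch below hashes descriptors into buckets with a p-stable (Gaussian) locality-sensitive hash, counts descriptors per bucket to form a histogram, and compares two histograms. The histogram intersection measure, the bucket width, the number of hash functions, the 0.8 threshold, and the toy 4-dimensional descriptors are all assumptions made for the example. Real SIFT descriptors are 128-dimensional, and the abstract does not state which histogram distance is used.

```python
import math
import random
from collections import Counter


def make_hash(dim, width, seed):
    """One p-stable LSH function h(v) = floor((a . v + b) / width)."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(vec):
        return math.floor((sum(x * y for x, y in zip(a, vec)) + b) / width)

    return h


def bucket_histogram(descriptors, hashes):
    """Count descriptors per bucket; a bucket is the tuple of hash values."""
    return Counter(tuple(h(d) for h in hashes) for d in descriptors)


def histogram_similarity(h1, h2):
    """Histogram intersection, normalised to [0, 1] by the larger total."""
    total = max(sum(h1.values()), sum(h2.values()))
    if total == 0:
        return 0.0
    return sum(min(c, h2[k]) for k, c in h1.items()) / total


# Toy data: 4-dimensional stand-ins for SIFT descriptors (hypothetical).
rng = random.Random(0)
image_a = [[rng.random() for _ in range(4)] for _ in range(50)]
image_b = [list(d) for d in image_a]  # an exact copy of image_a
image_c = [[rng.random() + 5.0 for _ in range(4)] for _ in range(50)]

hashes = [make_hash(dim=4, width=0.5, seed=s) for s in range(3)]
hist_a = bucket_histogram(image_a, hashes)
hist_b = bucket_histogram(image_b, hashes)
hist_c = bucket_histogram(image_c, hashes)

THRESHOLD = 0.8  # hypothetical near-duplicate threshold
sim_ab = histogram_similarity(hist_a, hist_b)
sim_ac = histogram_similarity(hist_a, hist_c)
print(sim_ab >= THRESHOLD, sim_ac >= THRESHOLD)
```

     An exact copy always scores 1.0 under this measure, because every descriptor falls into the same bucket in both histograms. How far unrelated images fall below the threshold depends on the bucket width and the number of hash functions.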

