Research on Real-World Automatic Image Annotations (真实世界环境下的自动图像标注方法研究)


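English title: Research on Real-World Automatic Image Annotations
Author: Rui Xiaoguang (芮晓光)
Degree level: Doctoral
Discipline: Signal and Information Processing
Chinese keywords: 图像标注 ; 大规模学习算法 ; 图像标注改善 ; 图像检索
English keywords: image annotation ; large scale learning ; image annotation refinement ; image retrieval
Degree year: 2010
Advisor: Yu Nenghai (俞能海)
Discipline code: 081002
Degree-granting institution: University of Science and Technology of China (中国科学技术大学)
Submission date: 2010-04-01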

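Abstract

With the rapid development of multimedia imaging and storage technologies, the volume of image data on the Internet is growing explosively. Compared with text, visual images are more vivid and easier to understand. Digital images are used widely, for example in business, news media, medicine, and education. Helping users find the images they need quickly and accurately has therefore become one of the most active topics in multimedia research in recent years, and the most important technology for addressing it is automatic image annotation.
     However, traditional research on automatic image annotation has mostly been carried out in restricted settings, for example on small, manually collected image databases, with little consideration of image annotation in real-world environments. As a result, some traditional methods run into many problems in practice, such as low annotation performance, a poor user experience, and an inability to handle a large number of semantic concepts. It is therefore important both to extend traditional automatic image annotation methods to real-world environments and to develop new real-world methods that address the shortcomings of the traditional ones.
     This dissertation studies key problems of automatic image annotation in real-world environments. It investigates in depth large-scale learning algorithms for image annotation, web image annotation, image annotation in multilingual settings, and image annotation refinement. In addition, we design an image retrieval demonstration system based on the proposed real-world annotation algorithms and study image representation and image ranking, achieving fast and effective retrieval over a large-scale real-world image database. The main results and contributions are as follows:
     1. An automatic image annotation method based on large-scale distance metric learning. First, a discriminative distance metric learning algorithm is proposed. It learns a Mahalanobis distance metric by preserving the local nonlinear structure of the data set and exploiting the discriminative information in the data, which improves the performance of K-nearest-neighbor-based automatic image annotation. Then, an aggregated distance metric learning algorithm is proposed that allows the discriminative algorithm to be trained efficiently in a parallel or online fashion, so that it can handle large-scale data. Experiments show that the aggregated algorithm not only improves annotation performance but also greatly reduces the time needed to learn the annotation model.
     2. A large-scale support vector machine (SVM) algorithm based on the aggregation idea, used for automatic image annotation. SVMs are a common approach to automatic image annotation. The large-scale algorithm is obtained by first learning separately on subsets of the data and then aggregating the results, which greatly improves the scalability of the original SVM algorithm. Experiments show that, compared with common SVM algorithms, the aggregated SVM can process training data on the scale of millions of samples in a relatively short time with essentially no loss of performance.
     3. A web image annotation algorithm based on a bipartite graph reinforcement model. The key to web image annotation is how to use the text already associated with a web image to help annotate it. The proposed algorithm extracts a number of words from the existing text of a web image as candidate annotations, expands them into more annotations using large-scale image data, and models all annotations as a bipartite graph. A reinforcement algorithm on the bipartite graph then re-ranks the existing annotations. Experimental results show that the algorithm greatly improves the quality of the original annotations of web images.
     4. An image annotation method based on a statistical model. By clustering and statistically modeling a large-scale web image data set, it annotates both personal and web images quickly and effectively. Experiments show that, compared with existing algorithms, it improves annotation performance and greatly increases annotation speed, reaching 20 images per second.
     5. A cross-language automatic image annotation framework. The framework can use a large-scale multilingual web image data set as its training set and automatically provides multilingual annotation results according to the user's native language. It includes a multilingual annotation fusion algorithm, MAF, that ranks and translates annotations simultaneously. MAF models the candidate annotations as an n-partite graph and then uses an iterative algorithm to improve the performance of the multilingual annotations and the quality of their translation. Experimental results show that the framework improves annotation performance and provides users with multilingual annotation results.
     6. An optimization-based image annotation refinement algorithm, together with a unified image annotation framework built on it. The algorithm uses both prior knowledge about annotations and local semantic correlations between annotations, and formulates annotation refinement as a 0-1 integer programming problem, yielding parameter-free refinement. It can be solved quickly through semidefinite optimization. Unlike previous methods, it determines the final annotations directly, without any empirically set threshold. Experimental results demonstrate its effectiveness.
     7. An image visual representation method based on spatial relations and a static image ranking algorithm that accounts for image quality and importance. Combined with the proposed automatic annotation algorithms, a real-time image retrieval demonstration system over a large-scale database is designed and implemented.
     In summary, the dissertation's study of automatic image annotation in real-world environments helps in understanding the deep connection between images and concepts and in realizing a unified representation model for visual information. It is of considerable significance for multimedia research and offers a useful reference for exploring and developing large-scale learning theory.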
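The role of a learned Mahalanobis metric in KNN-based annotation (contribution 1) can be illustrated with a minimal sketch. It does not implement the dissertation's metric learning algorithm: the matrix `learned` is simply assumed, and the function names, toy features, and tags are all hypothetical. The sketch only shows how swapping the metric changes which neighbors vote on the tags.

```python
from collections import Counter
from math import sqrt


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sqrt(sum(d[i] * Md[i] for i in range(len(d))))


def knn_annotate(query, train, M, k=3, n_tags=2):
    """Vote tags from the k training images nearest to `query` under metric M.

    `train` is a list of (feature_vector, tags) pairs.
    """
    nearest = sorted(train, key=lambda item: mahalanobis(query, item[0], M))[:k]
    votes = Counter(tag for _, tags in nearest for tag in tags)
    # Sort by descending vote count, then alphabetically for a stable result.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:n_tags]]


# Toy data (hypothetical): the first feature separates the concepts,
# the second feature is noise with a large range.
train = [
    ([0.0, 6.0], ["sky", "cloud"]),
    ([0.1, -5.0], ["sky", "sea"]),
    ([0.2, 7.0], ["sky", "cloud"]),
    ([2.0, 2.0], ["tiger", "grass"]),
    ([2.1, 1.8], ["tiger", "forest"]),
    ([1.9, 2.2], ["tiger", "grass"]),
]
identity = [[1.0, 0.0], [0.0, 1.0]]   # plain Euclidean distance
learned = [[1.0, 0.0], [0.0, 0.01]]   # assumed metric that down-weights the noisy feature

query = [0.0, 2.0]
tags_euclid = knn_annotate(query, train, identity)   # neighbors are the "tiger" images
tags_learned = knn_annotate(query, train, learned)   # neighbors are the "sky" images
```

Under the Euclidean metric the noisy second feature dominates and the query is tagged from the wrong cluster. Under the assumed metric the neighbors come from the cluster that matches on the informative feature.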
With the prevalence of digital imaging and storage equipment, more and more images are available on the Internet. Compared with text, visual images are more vivid and easier to understand. Digital images are widely used in business, education, science, and technology. Designing efficient and effective image retrieval technologies has therefore become an important research direction in academia. A key solution to this problem is automatic image annotation.
     However, most automatic image annotation approaches have been studied under restricted conditions, e.g. designed only for small, manually collected image databases, without considering the real-world image annotation problem. As a result, when existing methods are applied in practice they encounter many problems, such as low annotation performance, a poor user experience, and an inability to handle a large number of semantic concepts. It is therefore important both to extend current methods to real-world situations and to develop new real-world methods that solve the problems of existing ones.
     This dissertation studies key problems of real-world automatic image annotation, including large-scale learning algorithms for image annotation, web image annotation, image annotation in multilingual settings, and image annotation refinement. Additionally, we design an image retrieval demo system based on the proposed image annotation approaches. We also study other key problems of image retrieval, such as image representation and image ranking. The main contributions of this dissertation are as follows:
     1. An automatic image annotation method based on large-scale distance metric learning. First, we propose a discriminative distance metric learning (DDML) algorithm, which improves KNN-based image annotation methods. Then we propose an aggregated distance metric learning (ADML) method, which can train DDML in a parallel or online fashion and can therefore handle large-scale problems. The experimental results show that the proposed method improves both the effectiveness and the efficiency of image annotation.
     2. A large-scale support vector machine algorithm (ASVM) for automatic image annotation. Instead of learning from the entire data set, our method divides the training set into subsets. A series of sub-models is learned from these subsets by SVM, followed by a simple global aggregation. ASVM greatly improves the scalability of the original SVM solvers and can train on millions of samples in a short time.
     3. A bipartite graph reinforcement model (BGRM) for web image annotation. The key to web image annotation is how to use the text already associated with a web image to help tag it. The proposed model extracts candidate annotations from the surrounding text and other textual information of an image. These are then extended with more potentially relevant annotations by searching and mining a large-scale image database. All candidates are modeled as a bipartite graph, and a reinforcement algorithm is run on the graph to re-rank them. Only those with the highest ranking scores are retained as the final annotations. The experimental results show that BGRM greatly improves annotation performance.
     4. A statistical-model-based approach to real-world image annotation (SRIA). SRIA leverages a large-scale training data set to annotate both personal and web images efficiently in a unified framework. The experimental results show that SRIA not only improves annotation performance but also speeds up the annotation process.
     5. A cross-language image annotation framework. The framework can use large-scale multilingual web image data as its training set and provides multilingual annotations according to the user's native language. Following the idea that "two languages are more informative than one", we propose a multilingual annotation fusion algorithm (MAF) for ranking and translating candidate annotations. The experimental results show that the framework performs well.
     6. An optimization-based image annotation refinement algorithm (OptTag), on which we build a unified image annotation framework. OptTag performs parameter-free image annotation refinement based on a 0-1 integer optimization model that uses the prior and joint local probabilities. It can be solved efficiently as a semidefinite optimization problem. Additionally, it determines the final tags directly, whereas many previous approaches rely on predefined thresholds to discard unrelated words. The experiments demonstrate the effectiveness of OptTag.
     7. An image representation based on a spatial visual topic model, and a static image ranking method called SocialRank that reflects the importance and quality of images. Combined with the proposed image annotation methods, these are used to build a real-time image retrieval demo system on a large-scale image database.
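The divide-then-aggregate idea behind the large-scale SVM in contribution 2 can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's algorithm: the abstract does not specify the sub-model solver or the aggregation rule, so the sketch assumes a Pegasos-style linear SVM for each subset and plain averaging of the weight vectors. All function names and the toy data are hypothetical.

```python
import random


def train_linear_svm(samples, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic sub-gradient training of a linear SVM.

    `samples` is a list of (features, label) pairs with label in {-1, +1}.
    A constant 1.0 is appended to each feature vector to act as a bias term.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(samples[0][0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            t += 1
            x, y = samples[i]
            x = list(x) + [1.0]
            eta = 1.0 / (lam * t)
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Shrink step from the L2 regulariser, then a hinge-loss step
            # if the sample violated the margin.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def train_aggregated_svm(samples, n_subsets=4):
    """Train one linear SVM per disjoint subset, then average the weights.

    Averaging is an assumed aggregation rule chosen for simplicity.
    """
    subsets = [samples[i::n_subsets] for i in range(n_subsets)]
    models = [train_linear_svm(s, seed=i) for i, s in enumerate(subsets)]
    return [sum(m[j] for m in models) / len(models) for j in range(len(models[0]))]


def predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1


# Toy, well-separated two-class data (hypothetical).
rng = random.Random(42)
data = []
for _ in range(100):
    data.append(([rng.uniform(1.5, 3.0), rng.uniform(1.5, 3.0)], 1))
    data.append(([rng.uniform(-3.0, -1.5), rng.uniform(-3.0, -1.5)], -1))

w = train_aggregated_svm(data, n_subsets=4)
accuracy = sum(predict(w, x) == y for x, y in data) / len(data)
```

Because each sub-model sees only its own subset, the calls to `train_linear_svm` are independent and could run in parallel, which is the property that makes this kind of scheme scale.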
     In summary, research on real-world automatic image annotation helps in understanding the deep relation between images and concepts and in achieving a unified representation model of visual information. It is of great significance not only to multimedia research but also to large-scale learning theory.
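To make the 0-1 formulation of annotation refinement in contribution 6 concrete, the sketch below selects a tag subset by maximizing a quadratic objective over binary variables. The objective, the scores, and the function name `refine_tags` are assumptions made for illustration, and the abstract does not give the actual model. The sketch also swaps in exhaustive search, which is feasible only for a handful of tags, in place of the semidefinite optimization mentioned above.

```python
from itertools import product


def refine_tags(tags, unary, pairwise):
    """Pick the 0-1 vector z maximising sum_i u_i z_i + sum_{i<j} c_ij z_i z_j.

    `unary[i]` is an assumed per-tag score and `pairwise[(i, j)]` (i < j) an
    assumed correlation score. Exhaustive search over all 2^n assignments.
    """
    n = len(tags)
    best_score, best_z = float("-inf"), None
    for z in product((0, 1), repeat=n):
        score = sum(unary[i] * z[i] for i in range(n))
        score += sum(
            pairwise.get((i, j), 0.0) * z[i] * z[j]
            for i in range(n)
            for j in range(i + 1, n)
        )
        if score > best_score:
            best_score, best_z = score, z
    return [t for t, keep in zip(tags, best_z) if keep], best_score


# Hypothetical candidate tags and scores for one image.
tags = ["tiger", "grass", "car", "forest"]
unary = [1.0, 0.2, 0.3, -0.1]
pairwise = {
    (0, 1): 0.5,   # tiger-grass co-occur
    (0, 2): -1.0,  # tiger-car conflict
    (0, 3): 0.4,   # tiger-forest co-occur
    (1, 2): -0.5,  # grass-car conflict
    (1, 3): 0.3,   # grass-forest co-occur
    (2, 3): -0.6,  # car-forest conflict
}

selected, score = refine_tags(tags, unary, pairwise)
```

In this toy instance "forest" is kept despite its negative individual score because it correlates with the other kept tags, and "car" is dropped despite a positive score. The final tag set comes straight out of the optimization, with no score threshold to tune.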

