Research on Automatic Image Semantic Annotation Based on Multi-Label Learning

Author: 王梅
Thesis level: Doctoral
Discipline: Computer Software and Theory
Keywords: automatic image annotation; multi-label learning; statistical learning; generative model; discriminative classification method; semantic hierarchy; hyperplane tree; Web image annotation; noisy training set
Degree year: 2008
Advisor: 施伯乐
Discipline code: 081202
Degree-granting institution: Fudan University
Submission date: 2008-04-15

Abstract

With the development and spread of multimedia digitization, falling storage costs, and growing network bandwidth, multimedia data such as images and video are expanding rapidly, are becoming a mainstream form of information, and have an important influence on daily life and social development. "Explicit semantics" is an important prerequisite for managing multimedia data at large scale. Research on automatically obtaining the semantic content of multimedia objects by means of information technology therefore has considerable theoretical and practical significance, and has attracted close attention from both academia and industry.
     Images are the basis of video and occupy an important position in multimedia data management, so automatic image annotation (AIA) is a hot research topic in the related fields. AIA is in essence a "learning" problem: inferring the semantic labels of an image from its visual content. A wide range of machine learning and statistical inference techniques have therefore been applied to it, and this work continues to deepen and advance. However, because of the "semantic gap" and the "multi-label" problem, the annotation performance of existing methods still needs further improvement.
     This dissertation focuses on the multi-label nature of image annotation. By concentrating on the correlation among multiple labels, it studies the data overlap and data imbalance that multi-labeling brings, as well as Web image annotation. It makes new attempts in generative-model-based multi-label propagation, in image annotation that combines generative models with discriminative classification, and in Web image annotation based on noisy training sets, and it proposes several image annotation methods with good performance.
     The main contributions are as follows:
     1. An image annotation method based on an extended generative model. To exploit the correlation among labels, the original generative model is extended to annotate multiple labels simultaneously, that is, to estimate the probability that a set of keywords is the caption of an image, and a heuristic iterative algorithm is proposed to solve the resulting problem. Within this method, a topic-image-region multi-granularity hierarchical feature estimation model is proposed and the correlation among semantic keywords is analyzed; the iterative algorithm combines the two so that they jointly improve annotation performance. Experiments on a real-world benchmark show that the method clearly improves annotation accuracy over the traditional generative model.
     2. An image annotation method based on a discriminative hyperplane tree. A local latent topic hierarchy is built from the high-generative-probability neighborhood of the image to be annotated, and a discriminative hyperplane tree is constructed on top of it. This introduces the discriminative power of classifiers while retaining the advantages of probabilistic image annotation, so that generative models and discriminative classification are combined. Experiments on a real-world benchmark show that the method clearly improves annotation accuracy over both generative-model-based and discriminative-model-based annotation methods.
     3. An image annotation method based on local multi-label classification. This offers another way of combining generative models with discriminative classification for image annotation. It looks more closely at, and distinguishes between, the different semantic patterns hidden behind similar visual features, and it considers the multi-label semantic feature space and the classification margin in the visual feature space together, so that the generated latent topics are well separated both semantically and visually. Experiments on a real-world benchmark show that the method clearly improves annotation accuracy over both generative-model-based and discriminative-model-based annotation methods.
     4. A Web image annotation method based on a noisy training set. A complete solution for Web image annotation is presented. First, a "lightweight" method is proposed for generating the Web image annotation training set automatically. Then, to cope with the noisy data in that training set, an annotation method based on mixture-model local Fisher discriminant analysis is designed. Experiments on a real-world Web image data set show that, in the presence of noisy training data, the method achieves better annotation results than traditional annotation methods.
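     The idea behind the first method above, scoring a set of keywords jointly instead of ranking each keyword on its own, can be illustrated with a small sketch. This is not the dissertation's algorithm: the abstract gives no formulas, so the greedy selection rule, the function names `cooccurrence` and `annotate`, the weight `lam`, and the toy captions and scores below are all assumptions made purely for illustration.

```python
import math
from itertools import combinations


def cooccurrence(captions, vocab, smooth=1.0):
    """Smoothed estimate of how often v appears in captions that contain u."""
    count = {w: 0 for w in vocab}
    pair = {(u, v): 0 for u in vocab for v in vocab if u != v}
    for cap in captions:
        for w in cap:
            count[w] += 1
        for u, v in combinations(sorted(cap), 2):
            pair[(u, v)] += 1
            pair[(v, u)] += 1
    n = len(vocab)
    return {(u, v): (c + smooth) / (count[u] + smooth * n)
            for (u, v), c in pair.items()}


def annotate(visual_score, cond, k=3, lam=1.0):
    """Greedily build a label set of size k.

    Starts from the visually most likely keyword, then repeatedly adds the
    keyword w that maximizes
        log score(w) + lam * mean over chosen u of log cond[(u, w)].
    With lam = 0 this reduces to independent top-k annotation.
    """
    chosen = [max(visual_score, key=visual_score.get)]
    while len(chosen) < k:
        def gain(w):
            corr = sum(math.log(cond[(u, w)]) for u in chosen) / len(chosen)
            return math.log(visual_score[w]) + lam * corr
        rest = [w for w in visual_score if w not in chosen]
        chosen.append(max(rest, key=gain))
    return chosen


# Toy data, invented for illustration only.
vocab = ["sky", "sea", "beach", "tiger", "grass"]
captions = [{"sky", "sea", "beach"}, {"sky", "sea"}, {"sea", "beach"},
            {"tiger", "grass"}, {"tiger", "grass", "sky"}]
cond = cooccurrence(captions, vocab)

# Hypothetical per-keyword scores P(w | image) from some generative model.
scores = {"sky": 0.40, "sea": 0.25, "tiger": 0.18, "beach": 0.15, "grass": 0.02}

print(annotate(scores, cond, lam=0.0))  # ['sky', 'sea', 'tiger']
print(annotate(scores, cond, lam=1.0))  # ['sky', 'sea', 'beach']
```

     In this toy case the visually plausible but contextually unlikely keyword "tiger" is replaced by "beach" once co-occurrence with the already chosen labels is taken into account, which is the kind of effect that label correlation is meant to provide.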

