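Research on Automatic Annotation of Image Semantics

Original title: 图像语义的自动标注方法研究
Author: Zhang Yizhuan (张熠转)
Degree level: Master's
Discipline: Computer Science and Technology
Keywords: Image Semantics; Automatic Annotation; Feature Extraction
Degree year: 2007
Advisor: Ma Peijun (马培军)
Discipline code: 081203
Degree-granting institution: Harbin Institute of Technology
Thesis submission date: 2007-07-01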

Abstract

With the development of network, multimedia, and database technologies and the steady spread of the Internet, images are used ever more widely, and demand for multimedia data such as graphics and images keeps growing. Semantics-based image retrieval is both convenient for users and able to capture their intent accurately, which makes it the natural direction for image retrieval. Annotation words express the semantic content of an image well and narrow the gap between high-level semantics and low-level visual content, so automatic annotation of image semantics is attracting increasing attention.
     This thesis analyzes the techniques related to automatic annotation of image semantics and studies methods for extracting visual features from images. For color, the HSV (Hue, Saturation, Value) color space is quantized at non-uniform intervals, the quantized components are combined into a one-dimensional feature vector, and a cumulative histogram represents the color feature. For texture, two methods are implemented to suit different texture characteristics: statistical texture analysis based on the co-occurrence matrix and spectral texture analysis based on the wavelet transform. Image retrieval that fuses multiple features is built on these extracted features.
     For images with different characteristics, different image features are fused and different similarity measures are applied, which improves retrieval accuracy. On the basis of the retrieval results, an example-based method annotates image semantics automatically. The basic idea is to treat the annotated training set as annotation experience: after the visual features of a query image are extracted, visually similar images are retrieved from this experience base, and the query image is annotated by imitating the annotations of the retrieved examples.
     Extensive simulations show that the method converts the visual features of an image into annotation words, can be applied effectively to semantics-based image retrieval, and avoids the time and labor that manual annotation requires, which makes retrieval considerably more convenient for users.
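
The pipeline summarized in the abstract (non-uniform HSV quantization, a one-dimensional cumulative color histogram, then annotation by imitating visually similar training examples) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The bin edges, the 8 x 3 x 3 layout with index 9H + 3S + V, the L1 distance, the majority vote, and the function names `quantize_hsv`, `cumulative_histogram`, and `annotate` are all assumptions made for illustration; the abstract does not specify them, and texture features are left out.

```python
import colorsys
from collections import Counter

import numpy as np

# Assumed non-uniform bin edges; the abstract does not give the actual intervals.
H_EDGES = np.array([20, 40, 75, 155, 190, 270, 295, 316])  # hue, in degrees
SV_EDGES = np.array([0.2, 0.7])                            # saturation and value


def quantize_hsv(h_deg, s, v):
    """Map one HSV pixel to a single index in [0, 72) (assumed 8 x 3 x 3 layout)."""
    # searchsorted yields 0..8; hues of 316 degrees and above wrap to the red bin 0.
    h_bin = int(np.searchsorted(H_EDGES, h_deg, side="right")) % 8
    s_bin = int(np.searchsorted(SV_EDGES, s, side="right"))
    v_bin = int(np.searchsorted(SV_EDGES, v, side="right"))
    return 9 * h_bin + 3 * s_bin + v_bin


def cumulative_histogram(rgb_pixels):
    """Return the 72-bin cumulative color histogram of (r, g, b) pixels in 0..255."""
    hist = np.zeros(72)
    count = 0
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[quantize_hsv(h * 360.0, s, v)] += 1
        count += 1
    if count == 0:
        raise ValueError("at least one pixel is required")
    # Normalize to frequencies, then accumulate so the last bin equals 1.
    return np.cumsum(hist / count)


def annotate(query_feature, training_set, k=3, n_words=2):
    """Annotate a query by voting over the words of its k nearest training images.

    training_set is a list of (feature_vector, list_of_words) pairs.
    The L1 distance used here is an assumed similarity measure.
    """
    ranked = sorted(
        training_set,
        key=lambda item: float(np.abs(item[0] - query_feature).sum()),
    )
    votes = Counter(word for _, words in ranked[:k] for word in words)
    return [word for word, _ in votes.most_common(n_words)]
```

In this sketch the training set plays the role of the "annotation experience": each entry pairs a feature vector with the words a person assigned to that image, and a new image inherits the words that occur most often among its nearest neighbors.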

