A Real-Time Content-Based Image Retrieval System

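Author: Deng Xiaofei (邓小飞)
Thesis level: Master's
Discipline: Computer Software and Theory
Keywords: image retrieval; image clustering; GPU acceleration
Degree year: 2010
Advisor: Zhang Yi (章毅)
Discipline code: 081202
Degree-granting institution: University of Electronic Science and Technology of China (电子科技大学)
Thesis submission date: 2010-03-01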

Abstract

With the rapid development of computer and network technology, the speed and scale of information collection and dissemination have reached unprecedented levels. In particular, as digital consumer devices have become widespread, the number of new images produced each day has grown enormously: the images uploaded by users to popular sites such as Facebook and Tencent can reach tens of millions per day. How to manage and query image data effectively, and how to share it more effectively with other users, has become a pressing problem, and image retrieval has consequently become an active research topic in China and abroad in recent years. The approach in common use today is text-based image retrieval (TBIR), as used by search engines such as Google and Baidu. However, a picture is worth a thousand words: not everything can be described in text, and not everything has been described in text. TBIR therefore cannot solve every problem, whereas content-based image retrieval (CBIR) offers functions and applications that TBIR lacks.
     This thesis studies how to extract effective visual features from images, how to measure the similarity between images, and how to accelerate content-based image retrieval so that it runs in real time. The main contents are as follows:
     1. Different methods for extracting image features are studied and analyzed, covering texture, wavelet, contour, color, interest-point, and salient-point features.
     2. Building on a study of classic CBIR systems, an image retrieval framework is established that comprises feature extraction, feature storage, image search, and result display. The framework has replaceable modules and flexible interfaces and is easy to maintain and extend. It also provides a foundation for implementing other multimedia mining algorithms.
     3. Image search is accelerated with clustering and the GPU (Graphics Processing Unit), making full use of the GPU's computing power. A speed-up ratio of up to 20 has been achieved (NVIDIA 9500GT, measured against a single thread on an Intel E2200).
     4. A video retrieval prototype is implemented on top of the CBIR system, including video shot segmentation and key-frame extraction. Given an input image, it can find the corresponding or similar content in a video.
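
As an illustration of the clustering-based acceleration mentioned in item 3, the following is a minimal Python sketch of cluster-pruned search: database features are grouped once, and a query is compared only against the images in the nearest cluster or clusters. The abstract does not state which feature, distance, or clustering algorithm the thesis uses, so the color histogram, squared Euclidean distance, and k-means with farthest-point initialization are all assumptions made here for illustration, and every function name is hypothetical. The sketch runs on the CPU; the per-image distance evaluations are independent of one another, which is the kind of work that could be mapped to GPU threads.

```python
def color_histogram(pixels, bins=4):
    """Normalized RGB histogram; pixels is a list of (r, g, b) tuples in 0..255."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Quantize each channel to `bins` levels and flatten to one index.
        idx = (r * bins // 256) * bins * bins + (g * bins // 256) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = float(len(pixels)) or 1.0
    return [v / total for v in hist]


def sq_distance(a, b):
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_distance(v, c) for c in centroids))
        centroids.append(list(far))

    def assign():
        return [min(range(k), key=lambda c: sq_distance(v, centroids[c])) for v in vectors]

    for _ in range(iterations):
        assignment = assign()
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign()


def build_index(features, k):
    """Cluster the database once; each cluster stores the ids of its images."""
    centroids, assignment = kmeans(features, k)
    clusters = [[] for _ in range(k)]
    for image_id, c in enumerate(assignment):
        clusters[c].append(image_id)
    return centroids, clusters


def search(query, features, centroids, clusters, top_k=3, n_probe=1):
    """Rank only the images in the n_probe clusters closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_distance(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in clusters[c]]
    candidates.sort(key=lambda i: sq_distance(query, features[i]))
    return candidates[:top_k]


def make_image(n_red, n_blue):
    """Hypothetical test helper: a tiny synthetic image of red and blue pixels."""
    return [(255, 0, 0)] * n_red + [(0, 0, 255)] * n_blue


if __name__ == "__main__":
    database = [make_image(r, 16 - r) for r in (16, 13, 11, 0, 3, 5)]
    features = [color_histogram(p) for p in database]
    centroids, clusters = build_index(features, k=2)
    query = color_histogram(make_image(14, 2))
    print(search(query, features, centroids, clusters, top_k=2, n_probe=1))
```

In this sketch, `n_probe` trades accuracy for speed: probing a single cluster skips most of the database, while setting `n_probe` to the number of clusters falls back to an exhaustive scan.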

