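Analysis and Comparison of Image Salient Region Extraction Algorithms Based on Attention Mechanism (基于注意力机制的图像显著区域提取算法分析与比较)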

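English title: Analysis and Comparison of Image Salient Region Extraction Algorithms Based on Attention Mechanism
Author: Li Minxue (李敏学)
Degree level: Master's
Discipline: Computer Science and Technology
Chinese keywords: 视觉感知 ; 显著图 ; 显著区域 ; 算法评测
English keywords: Visual Perception ; Saliency Map ; Salient Region ; Algorithm Evaluation
Degree year: 2011
Advisor: Song Zehai (宋泽海)
Discipline code: 081202
Degree-granting institution: Beijing Jiaotong University
Thesis submission date: 2011-06-01
Chair of the defense committee: Xu De (须德)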

Abstract

Image saliency analysis based on biological perception emerged in the late 1990s and has gradually become a focus of research in biological visual perception. Drawing on human psychology and physiology, the approach builds on the human visual attention mechanism and simulates the function of the human eye to construct models for extracting image saliency. As an independent technique, visual saliency feature extraction helps us analyze and understand digital images. Research on image saliency extraction combines image analysis, feature extraction, and the exploration of human visual characteristics, and it matters for a wide range of applications based on image analysis and understanding. This thesis analyzes several saliency map generation models, implements saliency map generation algorithms based on the attention mechanism, develops a system for implementing and comparing the algorithms, and analyzes the results.
     The thesis analyzes five saliency map generation algorithms. The Itti model is a typical space-based attention model: it extracts features such as color, intensity, and orientation from an input image, forms an attention map in each feature dimension, and finally fuses these attention maps into a saliency map. The SMG (Saliency Map Generation) algorithm is a feature-based attention model that uses center-surround contrast within each scale and interpolation-based fusion across scales. The HC (Histogram-based Contrast) algorithm generates the saliency map from histogram contrast, defining the saliency value of each pixel from the color statistics of the input image. The RC (Region-based Contrast) algorithm generates the saliency map from region contrast: it partitions the image into regions with a segmentation algorithm and then computes saliency values at the region level from color histograms, integrating spatial relationships into the contrast computation. The FT (Frequency-tuned) algorithm is a frequency-tuned saliency map generation algorithm with edge detection added.
     Building on this work, the author reviews the theory of attention-based saliency feature extraction and re-implements the Itti and SMG algorithms. The algorithms are then compared and evaluated by the precision with which they extract salient regions, using the salient object image database provided by Microsoft Research Asia (MSRA) as the reference standard. The comparison shows that RC achieves the highest precision but requires considerable storage and computation; SMG is efficient and has low system overhead, which helps meet the real-time requirements of image retrieval; Itti must downsample the original image, so its saliency map is only 1/256 the size of the original; FT is computationally efficient and outputs a full-resolution saliency map; and LC avoids image segmentation, so it works very well on images that are difficult to segment automatically.
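Two steps named in the abstract can be made concrete with a short sketch: computing a histogram-contrast saliency map, and scoring the extracted salient region against a ground-truth mask. The Python below is a minimal illustration under stated assumptions, not the thesis's implementation. It works on grayscale intensities rather than the color histogram used by HC, and the function names `histogram_contrast_saliency` and `precision_recall`, as well as the 0.5 threshold, are hypothetical choices made for this example.

```python
import numpy as np


def histogram_contrast_saliency(gray):
    """Grayscale simplification of histogram-based contrast (assumption:
    intensities stand in for the color histogram of the HC algorithm).

    gray: 2-D uint8 array. Returns a float saliency map scaled to [0, 1].
    """
    # Relative frequency of each of the 256 intensity levels.
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    freq = hist / hist.sum()

    # dist[v, u] = |v - u|: contrast between intensity levels v and u.
    levels = np.arange(256, dtype=np.float64)
    dist = np.abs(levels[:, None] - levels[None, :])

    # Saliency of level v = sum over u of freq[u] * |v - u|.
    level_saliency = dist @ freq

    # Look up the saliency of every pixel by its intensity level.
    sal = level_saliency[gray]

    # Scale to [0, 1]; a constant image has no contrast at all.
    span = sal.max() - sal.min()
    if span == 0:
        return np.zeros_like(sal)
    return (sal - sal.min()) / span


def precision_recall(saliency, truth_mask, threshold=0.5):
    """Binarize the saliency map (hypothetical fixed threshold) and compare
    the predicted salient region with a ground-truth mask."""
    predicted = saliency >= threshold
    truth = truth_mask.astype(bool)
    true_positive = np.logical_and(predicted, truth).sum()
    precision = true_positive / predicted.sum() if predicted.sum() else 0.0
    recall = true_positive / truth.sum() if truth.sum() else 0.0
    return float(precision), float(recall)


# Synthetic test image: dark background with a small bright square.
image = np.full((8, 8), 10, dtype=np.uint8)
image[3:5, 3:5] = 200
mask = image == 200  # ground truth: the bright square is the salient object

saliency_map = histogram_contrast_saliency(image)
print(precision_recall(saliency_map, mask))
```

On this synthetic image the rare bright intensity contrasts with most pixels, so the square receives the highest saliency. A real evaluation would use annotated masks from a salient-object dataset in place of the synthetic mask.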
