Visual Attention Models and Their Application in Image Classification
Abstract
The visual selective attention mechanism is a key component of the human visual system: it lets us locate regions of interest quickly in a complex scene and thus react to the environment in time. It comprises two processes: a fast, physiologically driven bottom-up mechanism and a slow, task-driven top-down mechanism. An image typically contains one or a few salient regions, while the remaining regions form the background, which the human eye finds uninteresting and which can strongly interfere with image processing. Through visual attention, the eye enhances the salient regions and suppresses the non-salient ones, reducing the interference of cluttered backgrounds; later processing stages can then work with only part of the image instead of treating the whole image indiscriminately.
     Existing visual selective attention models concentrate on describing the bottom-up mechanism. A study of these computational models shows that they have clear shortcomings in describing image content: in some cases their results disagree with human judgment. Moreover, a salient feature may fail to attract attention in certain situations, and the eye is more likely to notice features that are rare within an image. To address this, this thesis improves on the existing models and proposes a visual attention model that combines a selective attention model with global rarity, and applies it to image classification, i.e., grouping images automatically by content. The image features produced by the proposed algorithm do not involve the concrete semantics of the image; instead, they rely on semantically related image primitives and use the contrast of spatial features to filter out candidate salient objects. This compensates for the way purely low-level features drift away from the visual objects when describing image content, and it also avoids the high complexity and redundancy of whole-image processing based on region segmentation. Experimental results show that the image features extracted with the proposed visual selective attention model reach an overall accuracy of 97.74% in multi-class object classification, a very good result.
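The bottom-up mechanism the abstract refers to is commonly modeled with center-surround differences across Gaussian scales (in the style of Itti-type saliency models). The following is a minimal NumPy sketch of that idea, not the thesis's actual implementation; the scale pairs are illustrative assumptions:

```python
import numpy as np

def gaussian_blur(img, sigma):
    # Separable Gaussian filter built from scratch (no external dependencies).
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    # Convolve rows, then columns, with edge padding to preserve shape.
    pad = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def center_surround_saliency(intensity, scales=((1, 4), (1, 8), (2, 8))):
    # Center-surround contrast: a fine "center" scale minus a coarse
    # "surround" scale, accumulated over several scale pairs.
    sal = np.zeros_like(intensity, dtype=float)
    for center, surround in scales:
        sal += np.abs(gaussian_blur(intensity, center) -
                      gaussian_blur(intensity, surround))
    sal -= sal.min()
    if sal.max() > 0:
        sal /= sal.max()  # normalize to [0, 1]
    return sal
```

On a uniform background with one bright patch, the map peaks around the patch, which is the behavior the bottom-up mechanism is meant to capture.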
Visual attention is one of the most important mechanisms of the human visual system. It enables us to locate regions of interest in a complex scene and to act effectively in our environment. Visual selection mainly involves two stages: a rapid, data-driven bottom-up mechanism and a slow, task-driven top-down mechanism. More often than not, an image contains a few salient regions, while the remaining regions can be viewed as background, which contributes nothing to understanding the image and may add complexity to its processing. Through the human selective mechanism, we can locate the salient regions without having to process the information of the whole scene.
     Existing visual selection models focus mainly on the bottom-up mechanism. However, these models are not very effective at locating salient points in some situations. Meanwhile, an image cannot be fully described through a selection model alone, because a salient feature can become less salient in certain contexts; humans may instead be attracted by features that are rare. This paper proposes a way of combining a visual selection model with global rarity to group images. The attention features extracted through this method focus on describing objects that are likely to attract human attention. Experimental results show that the proposed approach works well for image classification, with an average accuracy of 97.74%.
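The "global rarity" idea can be sketched as weighting feature statistics by their self-information, so that rare feature values dominate the descriptor. The histogram feature, the rarity weighting, and the nearest-centroid classifier below are illustrative assumptions, not the thesis's method (the abstract does not specify the classifier):

```python
import numpy as np

def rarity_weighted_feature(img, bins=16):
    # Quantize pixel values (assumed in [0, 1]) and build a histogram.
    q = np.clip((img * bins).astype(int), 0, bins - 1)
    hist = np.bincount(q.ravel(), minlength=bins).astype(float)
    p = hist / hist.sum()
    # Global rarity as self-information: rare bins get large weights,
    # common (background-like) bins get weights near zero.
    rarity = -np.log(np.maximum(p, 1e-12))
    feat = hist * rarity
    return feat / (np.linalg.norm(feat) + 1e-12)

def nearest_centroid(train_feats, train_labels, test_feat):
    # A simple stand-in classifier: assign the label of the closest
    # per-class mean feature vector.
    labels = sorted(set(train_labels))
    centroids = {l: np.mean([f for f, y in zip(train_feats, train_labels)
                             if y == l], axis=0)
                 for l in labels}
    return min(labels, key=lambda l: np.linalg.norm(test_feat - centroids[l]))
```

A dark image with a few bright outlier pixels and a bright image with a few dark ones produce clearly separated descriptors, since the rare bins carry most of the weight.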
