Research on Visual Attention Models Based on Topological Properties and Their Applications
Abstract
Vision is the most important of the human perceptual modalities, and attention selection is a key property of visual perception. Humans can easily detect and recognize different objects in images and videos, yet this remains difficult for traditional machine vision. Psychological research shows that in human visual perception the global topological properties of objects are perceived first, followed by local features such as brightness, color, and motion, which are fed in parallel to visual neurons and processed synchronously. Attention selection plays a key role in visual perception, helping us focus on and extract the regions and targets of interest in a scene. Existing attention selection mechanisms fall into two classes: bottom-up attention and top-down attention.
In this thesis we study the mechanisms of visual attention. Building on the topological perception theory from psychology, we apply topological properties to a bottom-up saliency detection method, propose a new objective evaluation criterion for saliency maps, and apply saliency information to image segmentation. The main work and contributions of the thesis are as follows:
1. Topological properties are applied to an attention selection model: we propose a bottom-up saliency detection method based on topological perception. The method extracts the topological connectivity, color, brightness, motion, and other features of the visual input and feeds them in parallel into a quaternion model. A hypercomplex Fourier transform yields the phase spectrum of the original image; after the inverse transform back to the spatial domain and filtering, the model's attention saliency map is obtained. The method accounts for the important role of topological properties in visual perception and is grounded in a well-established psychological theory. Its output reflects the distribution of attentional saliency well.
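The phase-spectrum pipeline above can be illustrated on a single feature channel. The full model stacks four channels (topological connectivity, color, brightness, motion) into a quaternion image and applies a hypercomplex Fourier transform; the sketch below is a simplified, hedged version of the same phase-only idea using an ordinary 2-D FFT on one real-valued channel, with a Gaussian filter standing in for the final smoothing step (function name and parameters are illustrative, not the thesis's).

```python
import numpy as np

def phase_saliency(channel, sigma=3):
    """Simplified phase-spectrum saliency for one feature channel.

    Keeps only the phase of the Fourier spectrum, transforms back to
    the spatial domain, squares the magnitude, and smooths the result
    with a separable Gaussian to produce a saliency map.
    """
    F = np.fft.fft2(channel)
    phase = F / (np.abs(F) + 1e-12)          # discard amplitude, keep phase
    recon = np.abs(np.fft.ifft2(phase)) ** 2  # back to the spatial domain
    # separable Gaussian smoothing (illustrative stand-in for the filter)
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    g /= g.sum()
    sal = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0, recon)
    sal = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, sal)
    return sal / sal.max()

# a small bright square on a flat background should pop out
img = np.zeros((64, 64))
img[30:34, 30:34] = 1.0
s = phase_saliency(img)
```

Discarding the amplitude spectrum suppresses repeated background structure, so the phase-only reconstruction concentrates energy at compact, distinctive regions such as the square.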
2. We propose a new evaluation criterion for saliency maps. The criterion requires no human participation and is fully objective. In bottom-up attention selection models, results are typically presented as saliency maps, and how to evaluate their quality effectively has long been an important problem in this area. Almost all existing evaluation methods require human participation, which inevitably introduces subjective factors and limits their reliability and persuasiveness. Our criterion is based on measuring the contribution of each model channel and needs no human input, making it an objective metric. Using the proposed criterion, we identified some problems in the model from the first part and improved it: we adjusted the weight of the topological channel, moderately reducing its influence on the final result. The improved model reflects the attention distribution more objectively and faithfully.
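One way to read "contribution of each model channel" is a leave-one-out measure: recompute the combined saliency map with one channel removed and score that channel by how much the map changes. The sketch below is an illustrative interpretation under that assumption; the thesis's exact formula may differ, and `combine` here is a hypothetical stand-in for the model's channel-fusion step.

```python
import numpy as np

def channel_contributions(channels, combine):
    """Leave-one-out channel contribution (illustrative interpretation).

    `combine` maps a list of feature channels to a saliency map.
    The contribution of channel k is the normalized L1 change in the
    combined map when channel k is removed.
    """
    full = combine(channels)
    contrib = []
    for k in range(len(channels)):
        rest = channels[:k] + channels[k + 1:]
        partial = combine(rest)
        contrib.append(np.abs(full - partial).sum()
                       / (np.abs(full).sum() + 1e-12))
    return np.array(contrib)

# toy fusion: equally weighted average of channels
avg = lambda chs: np.mean(chs, axis=0)
a = np.zeros((8, 8))   # an uninformative channel
b = np.ones((8, 8))    # a uniformly active channel
c = channel_contributions([a, b], avg)
```

Because no human-annotated fixation data enters the computation, a score like this stays fully objective; reweighting a channel (as done for the topological channel in the improved model) directly changes its measured contribution.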
3. Attention saliency information is applied to image segmentation to achieve automatic segmentation of color images. Most existing color image segmentation methods rely heavily on manual marking or parameter tuning. We extract the image's saliency information and replace the manual marking of object and background regions with automatic marking from the saliency map. Combining mean shift with maximal-similarity-based region merging, we propose a new automatic color image segmentation method. Its segmentation quality is somewhat improved, but the most important advance is that it is a fully automatic color image segmentation method requiring no human participation.
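The key automation step above is turning a saliency map into the object/background seed markers that the maximal-similarity region-merging segmenter would otherwise get from manual strokes. A minimal sketch of that step, assuming a simple two-threshold rule (the thresholds `hi` and `lo` are illustrative values, not the thesis's parameters):

```python
import numpy as np

def auto_markers(saliency, hi=0.7, lo=0.3):
    """Derive object/background seed markers from a saliency map.

    Pixels with normalized saliency above `hi` seed the object,
    pixels below `lo` seed the background, and everything in between
    stays unlabeled for the region-merging stage to resolve.
    """
    rng = saliency.max() - saliency.min()
    s = (saliency - saliency.min()) / (rng + 1e-12)
    markers = np.zeros(s.shape, dtype=np.int8)  # 0 = unlabeled
    markers[s >= hi] = 1    # object seeds
    markers[s <= lo] = -1   # background seeds
    return markers

# a salient blob becomes the object seed; the rest seeds the background
sal = np.zeros((16, 16))
sal[6:10, 6:10] = 1.0
m = auto_markers(sal)
```

These markers would then label the mean-shift regions, after which region merging grows the object from the seeds exactly as it would from manual strokes, removing the human from the loop.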
