Abstract
To address the interference of complex backgrounds with text detection in natural scenes, this paper proposes a scene text detection and localization method based on the human visual perception mechanism. Human visual perception is commonly divided into a fast, parallel pre-attention step and a slow, serial attention step. Following this division, the proposed method first carries out the pre-attention step with two visual saliency methods and then implements the attention step using stroke features and the relationships between characters. The method achieves competitive results on both the ICDAR 2013 dataset and a scene Chinese-character dataset, and the experiments indicate that it is well suited to detecting English and Chinese text in natural scenes with complex backgrounds.