A method for text line detection in natural images


Authors: Jie Yuan (1)
Baogang Wei (2)
Yonghuai Liu (3)
Yin Zhang (2)
Lidong Wang (4)

1. Jiangsu Electric Power Information Technology Co. Ltd. ; Nanjing ; 210029 ; China
2. College of Computer Science and Technology ; Zhejiang University ; Hangzhou ; 310027 ; China
3. Department of Computer Science ; Aberystwyth University ; Wales ; UK ; SY23 3DB
4. Qianjiang College ; Hangzhou Normal University ; Hangzhou ; 310027 ; China
Keywords: Text detection ; Text line ; Maximal stable extremal regions ; Sparse classifier
Journal: Multimedia Tools and Applications
Publication year: 2015
Publication date: February 2015
Volume: 74
Issue: 3
Pages: 859-884
Full-text size: 2,364 KB
Journal category: Computer Science
Journal subjects: Multimedia Information Systems
Computer Communication Networks
Data Structures, Cryptology and Information Theory
Special Purpose and Application-Based Systems
Publisher: Springer Netherlands
ISSN: 1573-7721

Abstract

Text in natural images is important for cross-media retrieval, indexing and understanding. However, detecting it is challenging because of varying backgrounds, low contrast between text and non-text regions, perspective distortion and other disturbing factors. In this paper, we propose a novel text line detection method that detects text lines aligned along a straight line in any direction. The method has three steps. In the first step, we use the maximally stable extremal region (MSER) detector with a dam line constraint to detect candidate text regions, and then define a similarity measurement between two regions that combines sizes, absolute distance, relative distance, contextual information and color histograms. In the second step, we propose a text line identification algorithm based on this similarity measurement: it first searches for three regions to serve as the seeds of a line, and then expands the line to collect all of its regions. In the last step, we develop a filter to remove non-text lines. The filter uses a sparse classifier based on two dictionaries learned from feature vectors extracted from the morphological skeletons of the candidate text lines. A comparative study on two datasets shows that the proposed method accurately detects text lines with horizontal or arbitrary consistent orientation.
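
The seed-and-expand idea in the second step can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the similarity formula, so the sketch assumes each candidate region is reduced to its centroid and replaces the similarity measurement with two purely geometric tests (near-collinearity within `tol` pixels and a neighbor gap of at most `max_gap` pixels). The function names and both parameters are hypothetical.

```python
from itertools import combinations
from math import hypot


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = hypot(bx - ax, by - ay)
    if length == 0:
        # a and b coincide: fall back to the plain point-to-point distance
        return hypot(px - ax, py - ay)
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length


def find_text_line(centroids, tol=2.0, max_gap=50.0):
    """Return sorted indices of one candidate line, or [] if no seed is found.

    Assumption: geometric tests stand in for the paper's similarity measure.
    """
    def dist(p, q):
        return hypot(p[0] - q[0], p[1] - q[1])

    n = len(centroids)
    for i, j, k in combinations(range(n), 3):
        a, b, c = centroids[i], centroids[j], centroids[k]
        if a == b:
            continue  # two identical centroids do not define a direction
        # Seed test 1: the third region lies close to the line through the first two.
        if point_line_distance(c, a, b) > tol:
            continue
        # Seed test 2: the three regions form a chain with gaps <= max_gap
        # (the two smallest pairwise distances must both be small enough).
        gaps = sorted([dist(a, b), dist(b, c), dist(a, c)])
        if gaps[1] > max_gap:
            continue
        # Expansion: keep adding regions that sit on the seed line and are
        # close to a region already in the line, until nothing changes.
        line = {i, j, k}
        grown = True
        while grown:
            grown = False
            for m in range(n):
                if m in line:
                    continue
                p = centroids[m]
                on_line = point_line_distance(p, a, b) <= tol
                near = any(dist(p, centroids[q]) <= max_gap for q in line)
                if on_line and near:
                    line.add(m)
                    grown = True
        return sorted(line)
    return []


if __name__ == "__main__":
    # Four regions on a diagonal, one off-line region, one distant on-line region.
    points = [(0, 0), (10, 10), (20, 20), (30, 30), (5, 40), (100, 100)]
    print(find_text_line(points, tol=2.0, max_gap=15.0))
```

Because the seed fixes a direction from the regions themselves rather than assuming a horizontal baseline, the same procedure applies to a line at any angle, which matches the "any direction" claim in the abstract. A fuller version would refit the line as regions are added and would compare sizes and colors as the abstract describes.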
