Research on Video Text Information Extraction Based on Feature Fusion
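Abstract
Video text provides important semantic information for video retrieval and video summarization. It is generally divided into two types: superimposed text and scene text. Superimposed text mainly comprises the titles and captions in a video and supplies important auxiliary information about video semantics. Scene text is text that occurs in the natural scene and can be used to infer scene information. Extracting video text therefore plays an important role in video semantic analysis.
     Video text information extraction consists of four main steps: text detection, localization, extraction, and recognition. This thesis studies three aspects of the problem (superimposed text detection and localization, scene text detection and localization, and text extraction) and proposes an algorithm for each. The specific research questions are: how to detect and locate superimposed text against complex video backgrounds; how to detect and locate scene text under illumination changes and irregular text alignment; and how to extract the text completely from the detected text regions. The main contributions are as follows:
     (1) A superimposed text detection and localization algorithm based on the motion perception field is proposed, providing an effective method for detecting and locating superimposed text against complex backgrounds. Because the same superimposed text keeps its position across a multi-frame video sequence, we define the motion perception field as the motion field obtained by fusing the motion vectors of multiple frames, and use it to capture the motion pattern of superimposed text. Building on shot segmentation, we take 30 frames from a single shot and fuse them into one integrated frame, on which the detection and localization algorithm based on the motion perception field is run. Finally, multi-frame verification based on the motion perception field confirms the text regions.
     (2) A scene text detection and localization algorithm based on the stroke map is proposed, providing an effective method for scene text with illumination changes and multiple text alignments. Scene text is easily affected by illumination changes and text alignment, so we define a stroke map that fuses character stroke features and detect scene text on it. First, text lines are detected on the stroke map obtained from a video frame using statistics of texture coarseness, and text regions are located with corner detection and morphological operations. Finally, for texture regions that resemble text regions, wavelet moment features, Laws texture features, and wavelet co-occurrence matrix features are extracted as feature vectors to train an SVM classifier, which then separates text regions from non-text regions among the candidates.
     (3) A superimposed text extraction algorithm based on the edge map and color clustering analysis and a scene text extraction algorithm based on an improved Niblack method are proposed, providing effective methods for text extraction against complex backgrounds. For superimposed text, we use the color gradient map algorithm to fuse image gradient information into an edge map of the text line. The extraction algorithm combines character segmentation on the edge map with K-means color clustering. It first splits the text line into single characters using the vertical projection of the text-line edge map. It then uses K-means clustering to divide each character image into several cluster images, which are processed with improved dam point labeling and inward filling. For scene text, we propose an extraction algorithm based on an improved Niblack method.
     (4) Design and implementation of a video text information extraction system. To validate the algorithms in this thesis, we designed and implemented a verification system for video text information extraction. Extensive experiments on the system show that the proposed methods extract video text accurately and that the extracted text can be applied to video retrieval and scene understanding.
     This thesis studies video text information extraction algorithms and proposes effective solutions for extracting superimposed text and scene text. The proposed detection and extraction algorithms have practical value for video content understanding, and their effectiveness is confirmed by practical use of the system.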
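As a rough illustration of the multi-frame fusion in contribution (1), the sketch below averages a list of grayscale frames pixel by pixel and marks pixels whose intensity barely changes across the frames. It is not the motion perception field of the thesis, which is built from motion vectors; the per-pixel mean, the intensity-range test, and the names `integrate_frames`, `stationary_mask`, and `tol` are all assumptions chosen only to show why stationary overlay text survives fusion while a moving background does not.

```python
def integrate_frames(frames):
    """Average grayscale frames pixel by pixel.

    frames is a list of 2D lists with identical shape (an assumed layout).
    """
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def stationary_mask(frames, tol=10):
    """Mark pixels whose intensity range across all frames is at most tol.

    tol is a hypothetical tolerance, not a value taken from the thesis.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    mask = []
    for r in range(rows):
        row = []
        for c in range(cols):
            values = [f[r][c] for f in frames]
            row.append(max(values) - min(values) <= tol)
        mask.append(row)
    return mask


# Three 1x2 frames: the left pixel is stable (overlay-like), the right one varies.
frames = [[[200, 10]], [[202, 90]], [[198, 50]]]
fused = integrate_frames(frames)
mask = stationary_mask(frames)
```

In a real pipeline the frames would come from one shot after shot segmentation, and the stable pixels would only be candidates that still need the verification step described above.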
Video text provides important semantic clues for video indexing and summarization. There are two kinds of textual information in video: superimposed text and scene text. Superimposed text (e.g., captions in broadcast news programs) is added by video editors and can normally be used to infer the semantic content of the video. Scene text is text that is inherently part of the scene captured by the camera and can be used to infer scene information. Therefore, video text information extraction is important for video semantic analysis.
     Extraction of text information involves detection, localization, extraction, and recognition of the text in video. This thesis focuses on three aspects: superimposed text detection and localization, scene text detection and localization, and text extraction. We discuss important problems in these areas and propose solutions. The problems are as follows: how to detect and locate superimposed text on complex backgrounds, how to detect and locate scene text under uneven illumination and various text alignments, and how to extract the text completely from the detected text regions. Our major contributions are as follows:
     (1) We propose a superimposed text detection and localization algorithm based on the motion perception field (MPF), which provides an effective method for detecting and locating superimposed text on complex backgrounds. The same superimposed text keeps the same position across consecutive frames. We define the MPF, a motion field that fuses the motion vectors of multiple frames, to capture this text motion pattern. First, building on shot segmentation, we extract the MPF from 30 consecutive frames of a single shot. Then we perform multi-frame integration to obtain a synthesized frame, and we detect and locate candidate text regions on the synthesized frame based on the MPF. Finally, multi-frame verification based on the MPF filters the candidate text regions.
     (2) We propose a scene text detection and localization algorithm based on the stroke map, which provides an effective method for scene text detection and localization under uneven illumination and various text alignments. Scene text detection in video presents many difficulties because of uneven illumination and various text alignments. We define the stroke map, which integrates character stroke features in certain orientations, and propose a scene text detection method based on it. First, we produce a stroke map using 2D Log-Gabor filters. Second, we compute a texture feature on every line of the stroke map to detect text lines. Then we perform Harris corner detection and morphological operations to locate the text regions. Finally, a trained SVM verifies the candidate text regions.
     (3) We propose a superimposed text extraction algorithm based on the edge map and color clustering, and a scene text extraction algorithm based on an improved Niblack method, both of which provide effective text extraction on complicated backgrounds. For superimposed text, we use the color gradient method to integrate the gradient information into an edge map of the text row. The extraction algorithm combines character segmentation on the edge map with color clustering. First, we produce the edge map using the gradient amplitude and orientation. Second, we segment the text row into single characters based on the vertical projection of the edge map. Third, we use K-means to cluster each single-character image into several cluster images. Then we use dam point labeling and inward filling to extract the binary character image. For scene text, we propose a text extraction approach based on an improved Niblack method.
     (4) Video text information extraction system. To verify the effectiveness of our methods, we design and implement a video text information extraction system. The experimental results demonstrate that the proposed methods can accurately detect, locate, and extract text, which can then be applied to video search and scene understanding.
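The character segmentation step in contribution (3) relies on the vertical projection of the text-row edge map. A minimal sketch of that idea follows, assuming a binary edge map stored as a list of rows; the function names and the `min_count` parameter are hypothetical, and the thesis may use a different splitting rule.

```python
def vertical_projection(edge_map):
    """Count edge pixels in each column of a binary edge map (list of rows)."""
    return [sum(column) for column in zip(*edge_map)]


def segment_characters(edge_map, min_count=1):
    """Return (start, end) column ranges, end exclusive, separated by columns
    whose projection falls below min_count (a hypothetical gap criterion)."""
    profile = vertical_projection(edge_map)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_count and start is None:
            start = x                      # a character run begins
        elif count < min_count and start is not None:
            segments.append((start, x))    # a gap closes the run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Two character-like blobs separated by an empty column.
edge_map = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
]
profile = vertical_projection(edge_map)
segments = segment_characters(edge_map)
```

Each returned range would then be cropped out and passed on to the clustering step.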
     This thesis focuses on video text information extraction. We propose effective methods for both superimposed text and scene text. The proposed video text detection and extraction algorithms have practical significance for video content understanding. Experimental results show that our approaches are robust and can be applied effectively to real video.
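For readers unfamiliar with the Niblack method named in contribution (3), the sketch below shows the standard, unmodified form: each pixel is compared with a local threshold T = m + k * s, where m and s are the mean and standard deviation of the surrounding window. The abstract does not say what the improvement consists of, so nothing here reflects it; the window size, the value k = -0.2, and the function name are assumptions for illustration.

```python
import math


def niblack_binarize(image, window=3, k=-0.2):
    """Standard Niblack thresholding on a grayscale image (list of rows).

    Returns 1 where the pixel is brighter than its local threshold
    T = mean + k * std, and 0 otherwise. Windows are clipped at the borders.
    """
    half = window // 2
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [image[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            mean = sum(values) / len(values)
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            threshold = mean + k * std
            out_row.append(1 if image[r][c] > threshold else 0)
        result.append(out_row)
    return result


# A dark pixel on a bright background falls below its local threshold.
image = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = niblack_binarize(image)
```

A known weakness of the plain method is that in flat regions the standard deviation is close to zero, so small noise flips pixels around the threshold; this is the kind of behavior that modified variants usually target.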
    [Yi07]Jian Yi, Yuxin Peng, Jianguo Xiao. Color-based Clustering for Text Detection and Extraction in Image, ACM MultiMedia 07, Page(s):847-850.
    [Zhong00]Yu Zhong and Anil K. Jain. Object Localization using Color, Texture, and Shape, Pattern Recognition 33 (2000), Page(s):671-684.
    [Fu06]Hui Fu, Xiabi Liu, Yunde Jia, Hongbin Deng. Gaussian Mixture Modeling of Neighbor Characters for Multilingual Text Extraction in Images, Proceedings of IEEE International Conference on Image Processing,2006, Page(s):3321-3324.
    [Graham03]L. Graham, Y. Chen, T. Kalyan, J. H. N. Tan and M. Li. Comparison of Some Thresholding Algorithms for Text /Background Segmentation in Difficult Document Images, Proceedings of International Conference on Document Analysis and Recognition,2003, Vol 2, Page(s):859-865.
    [Kim04]K. C. Kim, H. R. Byun, Y. J. Song, Y. W. Choi, S. Y. Chi, K. K. Kim, Y. K. Chung. Scene Text Extraction in Natural Scene Images Using Hierarchical Feature Combining and Verification, Proceedings of International Conference on Pattern Recognition,2004, Vol.2, Page(s):679-682.
    [Kim09]Wonjun Kim and Changick Kim. A New Approach for Overlay Text Detection and Extraction From Complex Video Scene, IEEE Transactions on Image Processing, Volume 18, Issue 2,2009, Page(s):401-411.
    [Kumar07]Kumar, S., Gupta, R., Khanna, N., Chaudhury, S., Joshi, S.D. Text Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model, IEEE Transactions on Image Processing, Volume 16, Issue 8, Aug.2007, Page(s): 2117-2128.
    [Lin05]Lin Lin, Chew Lim Tan. Text extraction from name cards with complex design, Proceedings of International Conference on Document Analysis and Recognition, 2005, Vol.2, Page(s):977-980.
    [Loo03]Poh-Kok Loo, Chew-Lim Tan. Using irregular pyramid for text segmentation and binarization of gray scale images, Proceedings of International Conference on Document Analysis and Recognition,2003, vol.1, Page(s):594-598.
    [Lowell00]Lowell L. Winger, John A. Robinson, M. Ed Jernigan. Low-complexity character extraction in low-contrast scene images, in International Journal of Pattern Recognition and Artificial Intelligence, Vol 14, No 2, March 2000. Page(s):113-135.
    [Lyu05]Michael R. Lyu, Jiqiang Song, Min Cai. A Comprehensive Method for Multilingual Video Text Detection, Localization, and Extraction, IEEE Transactions on Circuits and Systems for Video Technology,2005,15(2), Page(s): 243-255.
    [Mancas05]Mancas-Thillou, C.; Gosselin, B. Color Text Extraction from Camera-based Images, the Impact of the Choice of the Clustering Distance, Proceedings of International Conference on Document Analysis and Recognition,2005, Vol.1, Page(s):312-316.
    [Otsu79]N. Otsu, A Threshold Selection Method from Gray-level Histograms, IEEE Transaction on System, Man, Cybernet, vol. SMC-9, no.1, Jan.1979, Page(s): 62-66.
    [Shim98]J. C. Shim, C. Dorai, and R. Bolle. Automatic Text Extraction from Video for Content-based Annotation and Retrieval, Proceeding of International Conference on Pattern Recognition, Vol.1,1998, Page(s):618-620.
    [Tang02]Xiaoau Tang, Bo Luo, Xinbo Gao, Pissaloux, E., Hongjiang Zhang. Video text extraction using temporal feature vectors, Proceedings of IEEE International Conference on Multimedia and Expo,2002, Volume 1, Page(s):85-88.
    [Wang04]Bin Wang, Xiangfeng Li, Feng Liu, Fuqiao Hu. Color Text Image Binarization Based on Binary Texture Analysis, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing,2004, Page(s):III: 585-588.
    [Wu03]S. Wu and A. Amin. Automatic Thresholding of Gray-level Using Multi-stage Approach, Proceedings of International Conference on Document Analysis and Recognition,2003, Page(s):493-497.
    [Yi07]Jian Yi, Yuxin Peng, Jianguo Xiao. Color-based Clustering for Text Detection and Extraction in Image, ACM MultiMedia 07, Page(s):847-850.
    [Zenzo86]S. Di Zenzo. A Note on the Gradient of a Multi-image. Computer Vision, Graphics, and Image Processing,33(1),1986, Page(s):116-125.
