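Text Extraction in Video (视频文本的提取)

English title: Text Extraction in Video
Author: Zhang Dongping (章东平)
Thesis level: Doctoral
Discipline: Communication and Information Systems
Keywords (Chinese): 视频索引; 文本定位; 连通区域分析; 文本跟踪; 文本增强; 文本分割
Keywords (English): video indexing; text localization; connected component analysis; text tracking; text enhancement; text segmentation
Degree year: 2006
Supervisor: Liu Jilin (刘济林)
Discipline code: 081001
Degree-granting institution: Zhejiang University (浙江大学)
Submission date: 2006-05-01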

Abstract

Text in digital video can provide important supplemental information for retrieval and indexing. In some cases the text in a clip carries information found nowhere else, such as movie credits; in others it is a concise and important supplement, such as sports scores or stock prices. If text in video can be extracted and recognized robustly, many high-level applications, such as video summarization, can be realized more effectively.

This dissertation studies several aspects of text extraction in digital video: text localization, tracking, enhancement, and segmentation. Compared with document images, text in video is challenging to extract because of low resolution, complex backgrounds, lighting variation, and unconstrained position, shape, and color.

A text line localization method that combines the compressed domain and the spatial domain is presented. Text regions are detected directly in the DCT domain using the texture energy of each DCT block, and text lines are extracted from the horizontal projection profile of the differential image of each text region.

A text tracking method based on template matching with an M-estimator is proposed. The matching template is obtained by coarsely segmenting the text region with the logical level technique (LLT). The location of the search window is estimated from the motion vectors in the MPEG-2 bitstream, and a multi-resolution method based on the winner-update strategy speeds up the template matching.

An enhancement method based on multi-frame integration is used to increase the contrast between text and background. Depending on the temporal intensity distribution of each pixel in the text region, the method applies multi-frame averaging, multi-frame minimum, or multi-frame maximum.

A text segmentation algorithm based on a color stroke model is proposed. The color stroke model describes the local topographical features of characters in color space. The algorithm consists of two parts: binarization of the text region and connected component analysis.
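
The text line extraction step mentioned in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation. It assumes the differential image is the absolute difference between horizontally adjacent pixels and that a row belongs to a text line when its projection exceeds a fraction of the profile's peak; neither rule is stated in the abstract. The names `differential_image`, `horizontal_profile`, `extract_text_lines` and the parameter `ratio` are hypothetical.

```python
def differential_image(gray):
    """Absolute difference between horizontally adjacent pixels (assumed operator)."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)] for row in gray]


def horizontal_profile(diff):
    """Sum each row of the differential image: one profile value per image row."""
    return [sum(row) for row in diff]


def extract_text_lines(gray, ratio=0.3):
    """Return inclusive (top, bottom) row ranges whose profile exceeds ratio * peak.

    `gray` is a list of rows of pixel intensities for one detected text region.
    The thresholding rule is an assumption made for this sketch.
    """
    profile = horizontal_profile(differential_image(gray))
    peak = max(profile, default=0)
    if peak == 0:
        return []  # no horizontal intensity changes, so no text-like rows
    threshold = ratio * peak
    lines, start = [], None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y  # a run of high-energy rows begins
        elif value <= threshold and start is not None:
            lines.append((start, y - 1))  # the run ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))  # run reaches the last row
    return lines
```

Rows crossed by character strokes contain many sharp intensity transitions, so they show up as peaks in the profile, while the gaps between text lines show up as valleys.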
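
The tracking step can be sketched in the same spirit. The abstract does not say which M-estimator is used, so the Huber function below is an assumption, as are the names `huber`, `match_cost`, `track` and the parameters `k` and `radius`. The sketch searches the window exhaustively and omits the winner-update multi-resolution acceleration. The `predicted` position stands in for the estimate that the abstract derives from MPEG-2 motion vectors.

```python
def huber(residual, k=30.0):
    """Huber rho: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    return 0.5 * a * a if a <= k else k * (a - 0.5 * k)


def match_cost(frame, template, top, left, k=30.0):
    """Sum of Huber costs between the template and the frame patch at (top, left)."""
    return sum(
        huber(frame[top + y][left + x] - value, k)
        for y, row in enumerate(template)
        for x, value in enumerate(row)
    )


def track(frame, template, predicted, radius=2, k=30.0):
    """Search a window of +/- radius pixels around `predicted` (top, left).

    Returns (best_position, best_cost). Candidate positions are clipped so
    the template always lies fully inside the frame.
    """
    t_h, t_w = len(template), len(template[0])
    f_h, f_w = len(frame), len(frame[0])
    best_pos, best_cost = None, float("inf")
    top_range = range(max(0, predicted[0] - radius),
                      min(f_h - t_h, predicted[0] + radius) + 1)
    left_range = range(max(0, predicted[1] - radius),
                       min(f_w - t_w, predicted[1] + radius) + 1)
    for top in top_range:
        for left in left_range:
            cost = match_cost(frame, template, top, left, k)
            if cost < best_cost:
                best_pos, best_cost = (top, left), cost
    return best_pos, best_cost
```

Because the Huber cost grows only linearly for large residuals, a few occluded or corrupted pixels inside the text region add a bounded penalty instead of dominating the match, which is the usual motivation for robust estimators over a plain sum of squared differences.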
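
The multi-frame enhancement idea can be sketched as follows. The abstract says only that the choice between averaging and minimum/maximum depends on each pixel's intensity distribution over time; the concrete rule used here (compare the temporal range against `spread_threshold`, then pick minimum for bright text or maximum for dark text) is an assumption, and `integrate_frames`, `bright_text` and `spread_threshold` are hypothetical names. The frames are assumed to be already aligned by tracking.

```python
def integrate_frames(frames, bright_text=True, spread_threshold=20):
    """Fuse aligned grayscale text regions into a single enhanced region.

    `frames` is a list of equal-sized images, each a list of rows of
    intensities. The per-pixel decision rule is an assumption of this sketch.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    height = len(frames[0])
    width = len(frames[0][0]) if height else 0
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            samples = [frame[y][x] for frame in frames]
            lo, hi = min(samples), max(samples)
            if hi - lo <= spread_threshold:
                # Stable over time: averaging suppresses noise.
                row.append(sum(samples) / len(samples))
            elif bright_text:
                # Changing pixel with bright text: the minimum darkens
                # the moving background.
                row.append(lo)
            else:
                # Changing pixel with dark text: the maximum brightens
                # the moving background.
                row.append(hi)
        fused.append(row)
    return fused
```

Static caption pixels keep roughly the same intensity in every frame, while background pixels behind them change as the scene moves, so taking the temporal minimum (or maximum) at changing pixels pushes the background away from the text intensity.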

引文

[Agnihotri, 1999] Agnihotri L and Dimitrova N. Text detection for video analysis. Proceedings of IEEE Workshop on Content-Based Access of Image and Video Libraries, 1999, pp. 109-113.
    [Antani, 2000] Antani S, Crandall D and Kasturi R. Robust extraction of text in video. Proceedings of 15th International Conference on Pattern Recognition, Vol. 1, 2000, pp. 831-834.
    [Antani, 2001] Antani S. Reliable extraction of text from video. Ph.D. Thesis, Pennsylvania State University, August 2001.
    [Bourbakis, 1996] Bourbakis N G.A methodology of separating images from text using an OCR approach. IEEE International Joint Symposia on Intelligence and Systems, 1996, pp. 311-317.
    [蔡,2003] 蔡波,周洞汝,胡宏斌.数字视频中字幕检测及提取的研究和实现.计算机辅助设计与图形学学报,Vol.15,N0.7,2003,pp.898-903.
    [Cai, 2002] Cai M, Song J Q and Lyu, M R. A new approach for video text detection. Proceedings of 2002 International Conference on Image Processing, Vol. 1, 2002, pp.Ⅰ-117-Ⅰ-120.
    [then, 2001a] Chen D T, Shearer K and gourlard H. Text enhancement with asymmetric filter for video OCR. Proceedings of 11th International Conference on Image Analysis and Processing, 2001, pp. 192-197.
    [Chen, 2001b] then Y S, Hung Y P and Fuh C S. Fast block matching algorithm based on the winner-update strategy. IEEE Transactions on Image Processing. Vol. 10, No. 8, 2001, pp. 1212-1222.
    [Chert, 2002a] Chert D T, Olobez J M and Bourlard H. Text segmentation and recognition in complex background based on Markov random field. Proceedings of 16th International Conference on Pattern Recognition, Vol. 4, 2002, pp. 227-230.
    [Chen, 2002b] Chen X L, Yang J, ghang J and Waibel A. Automatic detection of signs with affine transformation. Proceedings of Sixth IEEE Workshop on Applications of Computer Vision, 2002, pp. 32-36.
    [Chen, 2003] Chen D T and Olobez J M. Sequential Monte Carlo video text segmentation. Proceedings of 2003 International Conference on Image Processing, Vol. 3, 2003, pp. Ⅲ-21-4.
    [Chen, 2004a] Chen D T, Odobez J M and Thiran J P. a localization/verification scheme for finding text in images and video frames based on contrast independent feature and machine learning methods. Signal Processing: Image Communication, Vol. 19, 2004, pp. 205-217.
    [Chen, 2004b] Chen D T, Odobez J M and Bourlard H. Text detection and recognition in images and video fremes. Pattern Recognition, Vol. 37, 2004, pp. 595-608.
    [Chen, 2004c] Chen T B, Ghosh D and Ranganath S. Video-text extraction and recognition. IEEE Region 10 Conference, Vol. A, 2004, pp. 319-322.
    [Chen, 2004d] Chen X L, Yang J, Zhang J and Waibel A. Automatic detection and recognition of signs from natural scenes. IEEE Transactions on Image Processing, Vol. 13, 2004, pp. 87-99.
    [Chen, 2004e] Chen X R and Yuille A L. Detecting and reading text in natural scenes. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 2004, pp. Ⅱ-366-Ⅱ-373.
    [Chen, 2004f] Chen Y L, Chiu C C and Wu B F. Complex document image segmentation using localized histogram analysis with multi-layer matching and clustering. IEEE International Conference on Systems, Man and Cybernetics, Vol. 4, 2004, pp. 3063-3070.
    [Cheng, 2004a] Cheng Z G and Liu Y C. An automatic system for text location and extraction in digital video based using SVM. Proceedings of 7th International Conference on Signal Processing, Vol. 3, pp. 2592-2595.
    [Cheng, 2004b] Cheng Z G and Liu Y C. Caption locationand extraction in digital video based on SVM. Proceedings of 2004 International Conference on Machine Learning and Cybernetics, Vol. 6, 2004, pp. 3515-3519.
    [Chun, 1999a] Chun B T, Bae Y and Kim T Y. Automatic text extraction in digital videos using FFT and neural network. IEEE International Fuzzy Systems Conference Proceedings, Vol. 2, 1999, pp. 1112-1115.
    [Chun, 1999b] Chun B T, Bae Y and Kim T Y. Text extraction in videos using topographical features of characters. IEEE International Fuzzy Systems Conference Proceedings, Vol. 2, 1999, pp. 1126-1130.
    [Crandall, 2001] Crandall D and Kasturi, R. Robust detection of stylized text events in digital video. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 865-869.
    [Datta, 2003] Datta S, Choudhuri B R and Ganguli A. Text extraction system. Proceedings of the Sixth International Conference of Information Fusion, Vol. 2, 2003, pp. 1441-1448.
    [Dimov, 2001] Dimov D T. Using an exact performance of Hough transform for image text segmentation. Proceedings of 2001 International Conference on Image Processing, Vol. 1, 2001, pp. 778-781.
    [Djeziri, 1998] Djeziri S, Nouboud F and Pla~ondon R. Extraction of signatures fron check background based on a filiformity criterion. IEEE Transactions on Image Processing, Vol. 7, No. 10, 1998, pp. 1425-1438.
    [Dobez, 2002] Dobez J M and Chen D T. Robust video text segmentation and recognition with multiple hypotheses. 2002. Proceedings of 2002 International Conference on Image Processing, Vol. 2, 2002, pp. Ⅱ-433-Ⅰⅰ-436.
    [Du, 2002] Du E Y, Chang C and rhouin P D. rhresholding video images for text detection. Proceedings of 16th International Conference on Pattern Recognition, Vol. 3, 2002, pp. 919-922.
    [Du, 2003] Du E Y, Chang C and rhouin P D. An unsupervised approach to color video thresholding. Proceedings of 2003 International Conference on Multimedia and Expo, Vol. 3, 2003, pp. Ⅲ-337-40.
    [Ezaki, 2004] Ezaki N, Bulacu M and Schomaker L. Text detection from natural scene images: towards a system for visually impaired persons. Proceedings of the 17th International Conference on Pattern Recognition, Vol. 2, 2004, pp. 683-686.
    [冯,2002] 冯慧君.基于小波变换和神经网络的视频文字检测.九江师专学报(自然科学版),No.6,2002,pp.4-10.
    [Gandhi, 2000] Gandhi T, Kasturi R and Antani, S. Application of planar motion segmentation for scene text extraction. Proceedings of 15th International Conference on Pattern Recognition, Vol. 1, 2000, pp. 445-449.
    [Gao, 2001] Gao J and Yang ]. An adaptive algorithm for text detection from natural scenes. 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 2001, pp. Ⅱ-84-Ⅱ-89.
    [Garcia, 2000] Garcia C and Apostolidis X. Text detection and segmentation in complex color images. Proceedings of 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 6, 2000, pp. 2326-2329.
    [Gargi, 1999] Gargi U, Crandall D, Antani S, Gandhi T, Keener R and Kasturi, R. A system for automatic text detection in video. Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999, pp. 29-32.
    [Gllavata, 2003a] Gllavat J, Ewerth R and Freislebe B. A robust algorithm for text detection in images. Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, Vol. 2, 2003, pp. 611-616.
    [Gllavata, 2003b] Gllavat J, Ewerth R and Freislebe B. Finding text in images via local thresholding. Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology, 2003, pp. 539-542.
    [Gllavata, 2004a] Gllavat J, Ewerth R and Freislebe B. A text detection, localization and segmentation system for OCR in images. 2004. Proceedings of IEEE Sixth International Symposium on Multimedia Software Engineering, 2004, pp. 310-317.
    [Gllavata, 2004b] Gllavat J, Ewerth R and Freislebe B. Text detection in images based on unsupervised classification of high-frequency wavelet coefficients. Proceedings of the 17th International Conference on Pattern Recognition, Vol. 1, 2004, pp. 425-428.
    [郭,2004] 郭丽,孙兴华,黄元元,杨静宇.视频文本的自动提取方法.小型微型计算机系统,Vol.25,No.6,2004,pp.1086-1088.
    [Haritaoglu, 2001] Haritaoglu I. Scene text extraction and translation for handheld devices. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 2001, pp. Ⅱ-408-Ⅱ-413.
    [Hasan, 2000] Hasan Y M Y and KaramL. Morphological text extraction from images. IEEE Transactions on Image Processing, Vol. 9, 2000, pp. 1978-1983.
    [Hase, 2001] Hase H, Shinokawa T, Yoneda M and Suen C Y. Character string extraction from color documents. Pattern Recognition, Vol. 34, 2001, pp. 1349-1365.
    [He, 2004] He J Y and Li S F. Hybrid Chinese/English text identification in Web images. Proceedings of Third International Conference on Image and Graphics, 2004, pp. 361-364.
    [何,2005] 何家颖,黎绍发.一种复杂背景图像文字分割算法.模式识别与人工智能,Vol.18,No.2,2005,pp.148-153.
    [Hirata, 2000] Hirata N S T, Barbera J and Terada R. Text segmentation by automatically designed morphological operators. Proceedings ⅩⅢ Brazilian Symposium on Computer Graphics and Image Processing, 2000, pp. 284-291.
    [Hontani, 2001] Hontani H and Koga T. Character extraction method without prior knowledge on size and position information. Proceedings of the IEEE International Vehicle Electronics Conference, 2001, pp. 67-72.
    [Hori, 1999] Hori, O. A video text extraction method for character recognition. 1999. Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999, pp. 25-28.
    [胡,2001a] 胡宏斌,徐骏,周洞汝.基于COM技术的视频流文字检测.计算机工程,Vol.27,No.6,2001,pp.95-97.
    [胡,2001b] 胡建明,吴立德.一种改进的文字/图形图像的快速分割算法.模式识别与人工智能,Vol.14,No.2,2001,pp.201-205.
    [Hu, 2005] Hu S Y and Chen M Y. Adaptive Fre/spl acute/chet kernel based support vector machine for text detection. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 5, 2005, pp.v/365-v/368.
    [Hua, 2001] Hua X S, Liu W Y and Zhang H J. Automatic performance evaluation for video text detection. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 545-550.
    [Hua, 2002] Hua X S, Yin P and Zhang H J. Efficient video text recognition using multiple frame integration. Proceedings of 2002 International Conference on Image Processing, Vol. 2, 2002, pp. Ⅱ-397-Ⅱ-400.
    [Hua, 2004] Hua X S, Liu W Y and Zhang H J. An automatic performance evaluation protocol for video text detection algorithms. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, 2004, pp. 498-507.
    [Hwang, 1992] Hwang C M, Shu S Y, Chen W Y, Chen Y W and Wen K P. A PC-based car licence plate reader. Proceedings of SPIE, Vol. 1823, 1992, pp. 272-283.
    [黄,2002] 黄祥林,沈兰荪.基于DCT压缩域的图象字符定位.中国图象图形学报,Vol.7(A),No.1,pp.22-26.
    [黄,2003] 黄晓东,周源华.用小波变换及颜色聚类提取的视频图像内中文字幕.计算机工程,Vol.29,NO.1,2003,pp.43-44.
    [Jain, 1996] Jain A K and Zhong Y. Page segmentation using texture analysis. Pattern Recognition, Vol. 29, 1996, pp. 743-770.
    [Jain, 1998] Jain A K and Yu B. Automatic text location in images and video frames. Pattern Recognition, Vol. 31, 1998, pp. 2055-2076.
    [Jung, 2001] Jung K. Neural network-based text location in color images. Pattern Recognition Letter, Vol. 22, 2001, pp. 1503-1515.
    [Jung, 2002a] Jung K, Kim K I and Han J H. Text extraction in real scene images on planar planes. Proceedings of 16th International Conference on Pattern Recognition, Vol. 3, 2002, pp. 469-472.
    [Jung, 2002b] Jung K, Kim K I, Kurata T, Kourogi M and HanJ H. Text scanner with text detection technology on image sequences. Proceedings of 16th International Conference on Pattern Recognition, Vol. 3, 2002, pp. 473-476.
    [Jung, 2004a] Jung K and Han J H. Hybrid approach to efficient text extraction in complex color images. Pattern Recognition Letter, Vol. 25, 2004, pp. 679-699.
    [Jung, 2004b] Jung K, Kim K I and Jain A K. Text information extraction in images and videos: a survey. Pattern Recognition, Vol. 37, 2004, pp. 977-997.
    [Kamel, 1993] Kamel M and Zhao A. Extraction of binary character/graphics images from grayscale document images. Graphical Models and Images Processing, Vol. 55, 1993, pp. 203-217.
    [Karatzas, 2003] Karatzas D and Antonacopoulos A. Two approaches for text segmentation in Web images. Proceedings of Seventh International Conference on Document Analysis and Recognition, Vol. 1,2003, pp. 131-136.
    [Karatzas, 2004] Karatzas D and Antonacopoulos A. Text extraction from Web images based on a split-and-merge segmentation method using colour perception. Proceedings of the 17th International Conference on Pattern Recognition, Vol. 2, 2004, pp. 634-637.
    [Kim, 1999] Kim P K. Automatic text location in complex color images using local color quantization. Proceedings of the IEEE Region 10 Conference, Vol. 1, 1999, pp. 629-632.
    [Kim, 2000] Kim E Y, Kim K I, Jung K and Kim H J. A video indexing system using character recognition. International Conference on Consumer Electronics, 2000, pp. 358-359.
    [Kim, 2001] Kim K I, Jung K, Park S H and Kim H J. Support vector machine-based text detection in digital video. Pattern Recognition, Vol. 34, 2001, pp. 527-529.
    [Kim, 2003] Kim K I, Jung K and Kim J H. Texture-based approach for text detection in images using support vector machines and continuously adaptive mean shift algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, 2003, pp. 1631-1639.
    [Kim, 2004] Kim K C, Byun H R, Song Y J, Choi Y W, Chi S Y, Kim K K and Chung Y K. Scene text extraction in natural scene images using hierarchical feature combining and verification. Proceedings of the 17th International Conference on Pattern Recognition, Vol. 2, 2004, pp. 679-682.
    [Kurakake, 1997] Kurakake S, Kuwano H and Odaka K. Recognition and visual feature matching of text region in video for conceptual indexing. SPIE, Vol. 3022, pp. 368-379.
    [Kwak, 2000] Kwak S, Choi Y and Chung K. Video caption image enhancement for an efficient character recognition. Proceedings of 15th International Conference on Pattern Recognition, Vol. 2, 2000, pp. 606-609.
    [Kuwano, 2000] Kuwano H, Taniguchi Y, Arai H, Mori M, Kurakake S and Kojima H. Telop-on-demand: video structuring and retrieval based on text recognition. IEEE International Conference on Multimedia and Expo, Vol. 2, 2000, pp. 759-762.
    [Lee, 2003] Lee C W, Jung K and Kim H J. Automatic text detection and removal in video sequences. Pattern Recognition Letter, Vol. 24, 2003, pp. 2607-2623.
    [Li, 1999a] Li H, Kia O and Doermann D. Text enhancement in digital video. Proceedings of SPIE on Document Recognition ;Ⅳ, 1999, pp.1-8.
    [Ll, 1999b] Li L Y, Nagy G, Samal A, Seth S and Xu Y H. Cooperative text and line-art extraction from a topographic map. Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999, pp. 467-470.
    [Li, 1999c] Li B Q and Li B X. Building pattern classifiers using convolutional neural networks. International Joint Conference on Neural Networks, Vol. 5, 1999, pp. 3081-3085.
    [Li, 2000a] Li H P and Doermann D. A video text detection system based on automated training. Proceedings of 15th International Conference on Pattern Recognition, Vol. 2, 2000, pp. 223-226.
    [Li, 2000b] Li H P. Automatic processing and analysis of text in digital video. Ph.D. Thesis, University of Maryland, 2000.
    [Li, 2000c] Li H P, Doermann D and Kia O. Automat ic text detection and tracking in digital video. IEER Transactions on Image Processing, Vol. 9, 2000, pp. 147-156.
    [Li, 2001] Li C, Ding X Q and Wu Y S. Automatic text location in natural scene images. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001 pp. 1069-1073.
    [Li, 2002] Li H P and Doermann D. Video indexing and retrieval based on recognized text. 2002 IEEE Workshop on Multimedia Signal Processing, 2002, pp. 245-248.
    [Li, 2004] Li S T and Kwok J T. Text extraction using edge detection and morphological dilation. Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004, pp. 330-333.
    [李,1993] 李水源,陈维南.一种强鲁棒的完全确定型的快速阈值化方法.模式识别与人工智能,Vol.6,No.3,1993,pp.235-241.
    [李,2002] 李朝晖,王秀峰.影视字幕文字识别的研究.计算机工程,Vol.28,No.3,2002, pp.175-176.
    [李,2001] 李朝晖,余英林,张为,邹艳碧.小波一神经网络在视频文本自动检测中的应用.Vol.15,2001,pp.36-39.
    [李,2003a] 李朝晖,余英林.基于小波形态学的文本自动检测.计算机工程与应用,No.14, 2003,pp.119-120.
    [李,2003b] 李洁,焦李成.视频图像中字幕区域自动提取算法.系统工程与电子技术,Vol.25, NO.9,2003,pp.1147-1150.
    [李,2004] 李朝晖,余英林.基于边缘信息和LH的视频文本自动检测.计算机应用研究,2004, pp.166-167.
    [李,2005] 李朝晖,余英林.一种视频文本自动定位、跟踪和识别的方法.中国图象图形学报,Vol.10,NO.4,2005,pp.457-462.
    [Lienhart, 1996] Lienhart R and Stuber F. Automatic text recognition in digital videos. Proceedings of SPIE, Vol. 2666, 1996, pp. 180-188.
    [Lienhart, 2002] Lienhart R and Wernicke A. Localizing and segmenting text in images and videos. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, 2002, pp. 256-268.
    [Lim, 2000] Lim Y K, Choi S H and Lee S W. Text extraction in MPEG compressed video for content-based indexing. Proceedings of 15th International Conference on Pattern Recognition, Vol. 4, 2000, pp. 409-412.
    [Liu, 2003] Liu H Y, Zhou D R. NewsBR: a content-based news video browsing and retrieval system. Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, Vol. 2, 2003, pp. 793-798.
    [Liu, 2004a] Liu H Y. Content-Based TV Sports Video Retrieval Based on Audio-Visual Features and Text Information. Proceedings of IEEE/WIC/ACM International Conference on Web Intelligence, 2004, pp. 481-484.
    [Liu, 2004b] Liu Y, Lu H, Xue X Y and Tan Y P. Effective video text detection using line features. Control, Automation, Robotics and Vision Conference, Vol. 2, 2004, pp. 1528-1532.
    [刘,2003] 刘骏伟,吴飞,庄越挺.基于SVM和ICA的视频帧字幕自动定位与提取.中国图象图形学报,Vol.8(A),No.11,2003,pp.1335-1340.
    [刘,2005] 刘洋,薛向阳,路红,郭跃飞.一种基于边缘检测和线条特征的视频字符检测算法.计算机学报,Vol.28,NO.3,2005,pp.428-433.
    [Loo, 2003] Loo P K and Tan C L. Using irregular pyramid for text segmentation and binarization of gray scale images. Proceedings of Seventh International Conference on Document Analysis and Recognition, vol. 1 2003, pp. 594-598.
    [Luo, 2003] Luo B, Tang X O, Liu J Z and Zhang H j. Video caption detection and extraction using temporal information. Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, Vol. 3, 2003, pp. 1723-1728.
    [Lyu, 2005] Lyu M R, Song J Q and Cai M. A comprehensive method for multilingual video text detection, localization, and extraction. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, 2005, pp. 243-255.
    [Malik, 1999] Malik R and Chin S A. Extraction of text in images. Information 1999. Proceedings of International Conference on Intelligence and Systems, 1999, pp. 534-537.
    [Mao, 2002] Mao W G, Chung F L, Lam K K M and Sun W C. Hybrid Chinese/English text detection in images and video frames. Proceedings of 16th International Conference on Pattern Recognition, Vol. 3, 2002, pp. 1015-1018.
    [Mariano, 2003] Mariano V Y and Kasturi R. Detection of text marks on moving vehicles. Proceedings of Seventh International Conference on Document Analysis and Recognition, 2003, pp. 393-397.
    [Messelodi, 1999] Messelodi S and Modena C M. Automatic identification and skew estimation of text lines in real scene images. Pattern Recognition, Vol. 32, 1999, pp. 791-810.
    [Mestetskii, 2002] Nestetskii L N, Reyer I A and Sederberg T W. Continuous approach to segmentation of handwritten text. Proceedings of Eighth International Workshop on Frontiers in Handwriting Recognition, 2002, pp. 440-445.
    [Mital, 1995] Mital D P and Leng G W. Text segmentation for automatic document processing. Proceedings of International Conference on Consumer Electronics, 1995, pp. 132-133.
    [Mital, 1996] Mital D P and Leng G W. Text segmentation for automatic document processing. Proceedings of IEEE Conference on Emerging Technologies and Factory Automation, Vol. 2, 1996, pp. 642-648.
    [Moalla, 2002] Moalla I, Elbaati A, Alimi A A and Benhamadou A. Extraction of Arabic text from multilingual documents. IEEE International Conference on Systems, Man and Cybernetics, Vol. 4, 2002, pp. 5-8.
    [Nakajima, 1998] Nakajima Y, Yoneyama A, Yanagihara H and Sugano M. Moving object detection from MPEG coded data. Proceedings of SPIE Visual Communications and Image Processing, Vol. 3309, 1998, pp. 988-996.
    [Nawaz, 2003] Nawaz, Sarfraz S N, M, Zidouri, A and AI-Khatib W G. An approach to offline Arabic character recognition using neural networks. Proceedings of the 2003 10th IEEE International Conference on Electronics, Circuits and Systems, Vol. 3, 2003, pp. 1328-1331.
    [Negi, 2003] Negi A and Kasinadhuni N. Localization and extraction of text in Telugu document images. Conference on Convergent Technologies for Asia-Pacific Region Vol. 2, 2003, pp. 749-752.
    [Ohya, 1994] Ohya J, Shio A and Akamatsu S. Recognition characters in scene images. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, 1994, pp. 214-220.
    [Okun, 2002] Okun O, Yan Y and Pietikainen M. Robust text detection from binarized document images. Proceedings of 16th International Conference on Pattern Recognition, Vol. 3, 2002, pp. 61-64.
    [欧,2002] 欧国斌,张利,谢攀.视频信号中实时字幕信息的提取方法.清华大学学报,Vol.42, No.7,2002,pp.869-872.
    [欧,2004] 欧文武,朱军民,刘昌平.视频文本定位.计算机工程与应用,No.30,2004, pp.65-67.
    [Parodi, 1996] Parodi P and Piccioli G. A fast and flexible statistical method for text extraction in document pages. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1996, pp. 619-624.
    [Pei, 2004] Pei S C and Chuang Y T. Automatic text detection using multi-layer color quantization in complex color images. 2004 IEEE International Conference on Multimedia and Expo, Vol. 1, 2004, pp. 619-622.
    [Perroud, 2001] Perroud T, Sobottka K and Bunke H. Text extraction from color documents-clustering approaches in three and four dimensions. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 937-941.
    [Pietikainen, 2001] Pietikainen M and Okun O. Edge-based method for text detection from complex document images. 2001. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 286-291.
    [Pilu, 1998] Pilu M. On using raw MPEG motion vectors to determine global camera motion. Proceedings of SPIE Visual Communications and Image Processing, Vol. 3309, 1998, pp. 448-459.
    [Poirier, 1997] Poirier B and Dagenais M. An interactive system to extract structured text from a geometrical representation. Proceedings of the Fourth International Conference on Document Analysis and Recognition, Vol. 1, 1997, pp. 342-346.
    [Sarfraz, 2003] Sarfraz M, Nawaz S N and Al-Khuraidly A. Offline Arabic text recognition system. Proceedings of 2003 International Conference on Geometric Modeling and Graphics, 2003, pp. 30-35.
    [Sabari, 2004] Sabari R S, Pati P Band Ramakrishnan A G. Gabor filter based block energy analysis for text extraction from digital document images. Proceedings of First International Workshop on Document Image Analysis for Libraries, 2004, pp. 233-243.
    [Schaar-Mitrea, 1998] Schaar-Mitrea M v d and With P. Compression of Mixed Video and Graphics Images for TV Systems. Proceedings of SPIE Visual Communication and Image Processing, 1998, pp. 213-221.
    [Shahraray, 1995] Shahraray B and Gibbon D C. Automatic generation of pictoriai transcripts of video programs. Proceedings of SPIE Conference on Multimedia Computing and Networking, Vol. 2417, 1995.
    [史,2004] 史迎春,王韬,周献中.一种基于时空分布特征的新闻字幕检测新算法.系统仿真学报,Vol.16,No.11,2004,pp.2483-2485.
    [Shiku, 2004] Shiku O, Xiao Y and Yah H. Extraction of character patterns in different styles and orientations from natural scene images. 2004. Proceedings of International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004, pp. 719-722.
    [Shim, 1998] Shim J C, Dorai C and Bolle R. Automatic text extraction from video for content-based annotation and retrieval. Proceedings of Fourteenth International Conference on Pattern Recognition, Vol. 1, 1998, pp. 618-620.
    [Shin, 2000] Shin C S, Kin K I, Park M H and Kim H J. Support vector machine-based text detection in digital video. Proceedings of IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing, Vol. 2, 2000, pp. 634-641.
    [Sobottka, 1999] Sobottka K, Bunke Hand Kronenberg H. Identification of text on colored book and journal covers. Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999, pp. 57-62.
    [Song, 2003a] Song J Q, Cai M and Lyu M R. A robust statistic method for classifying color polarity of video text. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 3, 2003, pp. Ⅲ-581-4.
    [Song, 2003b] Song J Q, Cai M and Lyu M R. A robust statistic method for classifying color polarity of video text. Proceedings of International Conference on Multimedia and Expo, Vol. 2, 2003, pp. Ⅱ-385-8.
    [Strouthopoulos, 2001] Strouthopoulos C, Papamarkos N, Atsalakis A and Chamzas C. Locating text in color documents. 2001. Proceedings of International Conference on Image Processing, Vol. 1, 2001, pp. 1066-1069.
    [Strouthpoulos, 2002] Strouthpoulos C, Papamarkos N and Atsalakis A E. Text extraction in complex color document. Pattern Recognition, Vol. 35, 2002, pp. 1743-1758.
    [Strouthopoulos, 2003] Strouthopoulos C, Papamarkos N, Atsalakis A and Chamzas C. Text identification in color documents. Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, Vol. 2, 2003, pp. 702-705.
    [Takahashi, 2005] Takahashi H and Nakajima M. Region Graph Based Text Extraction from Outdoor Images. Third International Conference on Information Technology and Applications, Vol. 1, 2005, pp. 680-685.
    [Tang, 1996] Tang Y Y, Lee S W and Suen C Y. Automatic document processing: a survey. Pattern Recognition, Vol. 29, 1996, pp. 1931-1952.
    [Tang, 2002] Tang X A, Luo B, XGao X B, Pissaloux E and Zhan H J. Video text extraction using temporal feature vectors. Proceedings of IEEE International Conference on Multimedia and Expo, Vol. 1, 2002, pp. 85-88.
    [Taylor, 2004] Taylor G W and Wolf C. Reinforcement learning for parameter control of text detection in images from video sequences. Proceedings of 2004 International Conference on Information and Communication Technologies: From Theory to Applications, 2004, pp. 517-518.
    [Tsai, 2000] Tsai C H. Detection of text strings from mixed text/graphics images. Ph.D. Thesis, Case Western Reserve University, May 2000.
    [Wang, 2001a] Wang X W, Ding X Q and Liu C S. Character extraction and recognition in natural scene images. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 1084-1088.
    [Wang, 2001b] Automatic character location and segmentation in color scene images. Wang H. Proceedings of llth International Conference on Image Analysis and Processing, 2001, pp. 2-7.
    [Wang, 2001c] Wang H and Kangas J. Character-like region verification for extracting text in scene images. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 957-962.
    [Wang, 2003] Wang K Q and Kangs J A. Character location in scene images from digital camera. Pattern Recognition, Vol. 36, 2003, pp. 2287-2299.
    [Wang, 2004a] Wang R R, Jin W J and Wu L D. A novel video caption detection approach using multi-frame integration. Proceedings of the 17th International Conference on Pattern Recognition, Vol. 1, 2004, pp. 449-452.
    [Wang, 2004b] Wang C, Wang Y, Liu H Y and He Y X. Automatic story segmentation of news video based on audio-visual features and text information. International Conference on Machine Learning and Cybernetics, Vol. 5, 2003, pp. 3008-3011.
    [汪,2003] 汪斌,胡福乔.基于图理论聚类的彩色图像文本提取.微电子学与计算机,No.8, 2003,pp.89-93.
    [王,2002] 王辰,老松杨,胡晓峰.视频中的文字探测.小型微型计算机系统,Vol.23,No.4, 2002,pp.478-481.
    [王,2001] 王伟强,高文.一种压缩域上的快速标题文字探测算法及其应用.计算机学报,Vol. 24,NO.6,2001,pp.620-626.
    [王,2004a] 王勇,郑辉,胡德文.图像和视频中的文字获取技术.中国图象图形学报,Vol.9, NO.5,2004,pp.532-538.
    [王,2004b] 王勇,李小春,郑辉,胡德文.一种视频字幕检测定位新方法.计算机工程与应用,No.23,2004,pp.40-42.
    [王,2004c] 王建,周源华.一种基于纹理能量的JPEG图像文本定位算法.上海交通大学学报,Vol.38,NO.9,2004,pp.1492-1495.
    [Wernicke, 2000] Wernicke A and Lienhart R. On the segmentation of text in videos. 2000. IEEE International Conference on Multimedia and Expo, Vol. 3, 2000, pp. 1511-1514.
    [Wong, 2000] Wong E K and Chen M Y. A robust algorithm for text extraction in color video. IEEE International Conference on Multimedia and Expo, Vol. 2, 2000, pp. 797-800.
    [Wong, 2003] Wong E K and Chen M Y. A new robust algorithm for video text extraction. Pattern Recognition, Vol. 36, 2003, pp. 1397-1406.
    [Wu, 1999] Wu V, Manmatha R and Riseman E M. Textfinder: an automatic system to detect and recognize text in images. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, 1999, pp. 1224-1229.
    [Wu, 2002] Wu J, Qu S L, Zhuo Q and Wang W Y. Automatic text detection in complex color image. Proceedings of 2002 International Conference on Machine Learning and Cybernetics, Vol. 3, 2002, pp. 1167-1171.
    [Xi, 2001] Xi J, Hua X S, Chen X R, Liu W Y and Zhang H J. A video text detection and recognition system. IEEE International Conference on Multimedia and Expo, 2001, pp. 873-876.
    [谢,2004] 谢毓湘,栾悉道,吴玲达,老松杨.新闻视频中的字幕探测.计算机工程,Vol.30,No.20,2004,pp.167-168.
    [Yan, 2001] Yan H. Detection of curved text path based on the fuzzy curve-tracing (FCT) algorithm. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 266-270.
    [Yan, 1997] Yan S and Leedham G. Mathematical properties of the native integral ratio handwriting and text extraction technique. Proceedings of the Fourth International Conference on Document Analysis and Recognition, Vol. 2, 1997, pp. 1102-1106.
    [Yang, 2002] Yang J, Chen X L, Zhang J, Zhang Y and Waibel A. Automatic detection and translation of text from natural scenes. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, 2002, pp. 2101-2104.
    [Yang, 2004a] Yang X, Takahashi H and Nakajima M. Investigation of robust color model for edge detection on text extraction from scenery images. IEEE Region 10 Conference, Vol. B, 2004, pp. 85-88.
    [Yang, 2004b] Yang C C and Li K W. Error analysis of Chinese text segmentation using statistical approach. Proceedings of the 2004 Joint ACM/IEEE Conference on Digital Libraries, 2004, pp. 256-257.
    [杨,2000] 杨友庆,高隽,鲍捷,杨学东.基于视频的字幕检索与提取.计算机应用,Vol.20, No.10,2000,pp.33-36.
    [Ye, 2001] Ye X Y, Cheriet M and Suen C Y. Stroke-model-based character extraction from gray-level document images. IEEE Transactions on Image Processing, Vol. 10, No. 8, 2001, pp. 1152-1161.
    [Ye, 2003] Ye Q X, Gao W, Wang W Q and Zeng W. A robust text detection algorithm in images and video frames. Proceedings of the Joint Conference of the Fourth International Conference on Information, Communications and Signal Processing, Vol. 2, 2003, pp. 802-806.
    [Ye, 2004] Ye Q X, Gao W and Huang Q M. Automatic text segmentation from complex background. International Conference on Image Processing, Vol. 5, 2004, pp. 2905-2908.
    [Yeo, 1996] Yeo B L and Liu B. Visual content highlighting via automatic extraction of embedded captions on MPEG compressed video. Proceedings of SPIE Vol. 2668, 1996, pp. 38-47.
    [Yuan, 2001] Yuan Q and Tan C L. Text extraction from gray scale document images using edge information. Proceedings of Sixth International Conference on Document Analysis and Recognition, 2001, pp. 302-306.
    [Zhang, 2002a] Zhang J, Chen X L, Yang J and Waibel A. A PDA-based sign translator. Proceedings of Fourth IEEE International Conference on Multimodal Interfaces, 2002, pp. 217-222.
    [Zhang, 2002b] Zhang D Q, Rajendran R K and Chang S F. General and domain-specific techniques for detecting and recognizing superimposed text in video. Proceedings of International Conference on Image Processing, Vol. 1, 2002, pp. I-593-I-596.
    [Zhang, 2003] Zhang D Q, Tseng B L and Chang S F. Accurate overlay text extraction for digital video analysis. Proceedings of International Conference on Information Technology: Research and Education, 2003, pp. 233-237.
    [Zhang, 2004] Zhang D Q and Chang S F. Learning to Detect Scene Text Using a Higher-Order MRF with Belief Propagation. Conference on Computer Vision and Pattern Recognition Workshop, 2004, pp. 101-101.
    [章,2002] 章东平,刘济林,罗义军.一种基于t图的多尺度边缘检测方法.电路与系统学报,Vol.7,No.3,2002,pp.17-20.
    [章,2003a] 章东平,刘济林,祝金标.一种基于形态学的汽车牌照提取方法.电子技术应用,Vol.29,No.1,2003,pp.44-46.
    [章,2003b] 章东平,刘济林,罗义军.车辆牌照字符的提取.电路与系统学报,Vol.8,No.4,2003,pp.73-76.
    [章,2005] 章东平,祝金标,刘济林.自动定位彩色图像中的文本.浙江大学学报(工学版),Vol.39,No.2,2005,pp.229-233.
    [章,2001] 章毓晋.图象分割.北京:科学出版社,2001.
    [张,1994] 张桂林,陈益新,曹伟煊,李强.基于跑长码的连通区域标记算法.华中理工大学学报,Vol.22,1994,pp.11-14.
    [张,2002] 张引,潘云鹤.面向彩色图像和视频的文本提取新方法.计算机辅助设计与图形学学报,Vol.14,No.1,2002,pp.36-40.
    [张,2003] 张佑生,彭青松,汪荣贵,偶春生.基于子图像VCH的文本检测与定位方法研究.武汉大学学报(信息科学版),Vol.28,No.3,2003,pp.354-358.
    [张,2004] 张佑生,彭青松,汪荣贵.一种基于变异直方图的视频字幕检测定位方法.电子学报,Vol.32,No.2,2004,pp.314-317.
    [Zheng, 2003] Zheng Y F, Li H P and Doermann D. A model-based line detection algorithm in documents. Proceedings of Seventh International Conference on Document Analysis and Recognition, 2003, pp. 44-48.
    [Zheng, 2005] Zheng Y F, Li H P and Doermann D. A parallel-line detection algorithm based on HMM decoding. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, 2005, pp. 777-792.
    [Zhong, 1995] Zhong Y, Karu K and Jain A K. Locating text in complex color images. Pattern Recognition, Vol. 28, 1995, pp. 1523-1535.
    [Zhong, 2000] Zhong Y, Zhang H J and Jain A K. Automatic Caption Localization in Compressed Video. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, 2000, pp. 385-392.
    [Zhou, 1997a] Zhou J, Lopresti D and Lei Z. OCR for world wide web images. Proceedings of SPIE on Document Recognition Ⅳ, Vol. 3027, 1997, pp. 58-66.
    [Zhou, 1997b] Zhou J Y and Lopresti D. Extracting text from WWW images. Proceedings of the Fourth International Conference on Document Analysis and Recognition, Vol. 1, 1997, pp. 248-252.
    [Zhou, 1998] Zhou J, Lopresti D and Tasdizen T. Finding text in color images. Proceedings of SPIE on Document Recognition V, 1998, pp. 130-140.
    [周,2002] 周军,徐奕,周源华.基于局部能量特征的视频字幕分割.中国图象图形学报,Vol.7(A),No.11,2002,pp.1134-1138.
    [Zhou, 2001] Zhou L X. Research of segmentation of Chinese texts in Chinese search engine. IEEE International Conference on Systems, Man, and Cybernetics, Vol. 4, 2001, pp. 2627-2631.
    [Zhu, 2003] Zhu X and Lin X G. Automatic date imprint extraction from natural images. Proceedings of the Joint Conference of the Fourth International Conference on Information, Communications and Signal Processing, Vol. 1, 2003, pp. 518-522.
    [庄,1999] 庄越挺,潘云鹤.基于内容的图像检索综述.模式识别与人工智能,Vol.12,1999, pp.170-177.
    [庄,2002] 庄越挺,刘骏伟,吴飞,潘云鹤,张引.基于支持向量机的视频字幕自动定位与提取.计算机辅助设计与图形学学报,Vol.14,No.8,2002,pp.749-753.
