Research on Text Segmentation in Digital Video

Original title: 数字视频中的文本分割的研究
Author: Xu Jianfeng (许剑峰)
Degree level: Doctoral
Discipline: Computer Applications
Keywords: video text segmentation ; shot segmentation ; text tracking ; text enhancement ; car license plate recognition
Degree year: 2005
Advisor: Li Shaofa (黎绍发)
Discipline code: 081203
Degree-granting institution: South China University of Technology (华南理工大学)
Submission date: 2005-04-25

Abstract

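Multimedia information is now used ever more widely. Libraries once held almost nothing but printed text; today there are multimedia libraries whose collections include images, video, and audio. An important step in building a multimedia library is indexing its massive holdings so that users can search them efficiently.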
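    With major technical advances in the production, storage, and distribution of multimedia data, digital video is used more and more widely in every field and has become a routine part of most people's daily lives, so being able to find the desired information in large volumes of video has become a pressing need. Digital images and video are also core content of digital library programs. Building a digital library requires digitizing all kinds of information so that it can be stored, retrieved, and manipulated. How to manage and retrieve massive video data has been one of the challenging hot topics in academia and industry worldwide for nearly 10 years. In recent years there has been some research on building video retrieval systems. Some systems are based on low-level features, such as object shape, region brightness, color, texture, descriptions of human motion, and audio features; others are based on high-level features, such as face detection, speaker identification, and text recognition. Among these, extracting text from video has attracted particular attention and is an important source for building indexes.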
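    Text is important content information in video, and detecting and recognizing it plays a large role in video analysis. Text can serve as a content label and index for a video segment: for example, the news summary shown in a news video describes that news item and can be used to retrieve news footage. Text can serve as a basis for video segmentation: for example, the point where an anchor's name or a cast list appears can mark the start of a news item or the end of a film. Text can serve as a basis for judging the importance of video content: for example, frames with prominent text can be extracted as representative frames for the corresponding segment, or, when a video summary is generated, the portions with prominent text can be clipped out as part of the summary. The analysis and processing of text is therefore an important part of video analysis. Detecting the appearance of text in video and its exact position, and segmenting the text from complex, changing backgrounds, is the foundation of video text analysis and processing.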
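    Extracting and recognizing text in video has many applications. Text extracted from a video can serve as its index and annotation. For a video of a basketball game, for example, the jersey numbers, player names, and team names on the players' uniforms can be extracted as annotations and indexes. Compared with indexing video on other content, such as object shape, this is far cheaper computationally. In business, likewise, registering multimedia documents by hand consumes a great deal of labor. If specific text information in commercial multimedia archives could be read automatically, considerable human resources could be saved.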
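    Detecting and recognizing text in video requires different methods from detecting and recognizing text in scanned document images. The latter generally have a single text color and a single background color, so a simple threshold suffices to separate text from background. Video images, by contrast, often contain many kinds of noise, the background behind the text is mostly in motion, the colors of text and background are often not uniform, and the resolution is relatively low,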
Information is becoming increasingly enriched by multimedia components. Libraries that were originally pure text are continuously adding images, videos, and audio clips to their repositories, and large digital image and video libraries are emerging as well. They all need an automatic means to efficiently index and retrieve multimedia components.
    Most of the information available today is either on paper or in the form of still photographs and videos. The rapid growth of video data leads to an urgent demand for efficient and true content-based browsing and retrieving systems. To construct such systems, both low-level features such as object shape, region intensity, color, texture, motion descriptors, and audio measurements, and high-level techniques such as human face detection, speaker identification, and character recognition have been studied for indexing and retrieving image and video information in recent years. Among these techniques, video caption based methods have attracted particular attention due to the rich content information contained in caption text. Caption text routinely provides such valuable indexing information as scene locations, speaker names, program introductions, sports scores, special announcements, dates, and times. Compared to other video features, information in caption text is highly compact and structured, and is thus more suitable for efficient video indexing.
    Text detection and recognition in videos can help a lot in video content analysis and understanding, since text can provide a concise and direct description of the stories presented in the videos. In digital news videos, the superimposed captions usually present the involved person’s name and a summary of the news event. Hence, the recognized text can become part of the index in a video retrieval system.
    Systems that automatically extract and recognize text from images with general backgrounds are also useful in many situations. Text found in images or videos can be used to annotate and index those materials. For example, video sequences of events such as a basketball game can be annotated and indexed by extracting a player’s number, name, and team name as they appear on the player’s uniform. In contrast, image indexing based on image content such as the shape of an object is difficult and computationally expensive. Systems that automatically register stock certificates and other financial documents by reading specific text information in the documents are also in demand, because manual registration of the large volume of documents generated by daily trading requires tremendous manpower.
    Current OCR technology is largely restricted to finding text printed against clean backgrounds and cannot handle text printed against shaded or textured backgrounds and/or embedded in images. More sophisticated text reading systems usually employ document analysis (page segmentation) schemes to identify text regions before applying OCR, so that the OCR engine does not spend time trying to interpret non-text items. However, most such schemes require clean binary input; some assume specific document layouts such as newspapers and technical journals; others utilize domain-specific knowledge such as mail address blocks or configurations of chess games.
    However, extracting captions embedded in video frames is not a trivial task. In

