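Research on Text Extraction Methods in Digital Video

English title: Research on Text Extraction in Digital Video
Author: 王振 (Wang Zhen)
Degree level: Doctoral
Discipline: Physical Oceanography
Chinese keywords: Text localization; Video text tracking; Point pattern matching; Story unit segmentation; Vehicle navigation
English keywords: Text Localization; Video text tracking; Corner feature matching; Story segmentation; Vehicle navigation
Degree year: 2011
Advisor: 魏志强 (Wei Zhiqiang)
Discipline code: 070701
Degree-granting institution: 中国海洋大学 (Ocean University of China)
Thesis submission date: 2011-04-15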

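Abstract

The analysis and retrieval of video content has become a hot topic in video information research. Because text appearing in video is closely tied to the video's content and offers important cues for content understanding and retrieval, extracting that text quickly and accurately has become a valuable research direction. In addition, combined with mobile digital devices (digital camcorders, digital cameras, PDAs, mobile phones, and so on), video text extraction is playing a growing role in automatic translation, navigation for the blind, robot vision, and intelligent transportation, and is attracting increasing attention from researchers.
     Extracting text from video is not easy. Text in video frames often sits against a complex background, and a single frame may contain text in different fonts, colors, sizes, and layouts, so detecting, localizing, and segmenting video text is very difficult.
     This dissertation studies several key problems in the video text extraction framework: text localization, tracking, and enhancement, together with practical applications (automatic news story segmentation and a text recognition system for road traffic signs). The main contributions are as follows:
     A text localization method is proposed that combines gray-scale morphology with wavelet multi-scale decomposition and reconstruction. First, the complementary strengths of morphology and wavelet analysis in edge detection are combined to extract edge pixels from the video frame; a "density-based" region growing algorithm then merges the edge pixels into candidate text regions. Finally, the candidate regions are verified by a classifier in which a binary particle swarm optimization (BPSO) algorithm performs feature selection and SVM parameter optimization simultaneously. This overcomes the drawbacks of optimizing only the features or only the classifier parameters, and yields good classification results.
     A tracking algorithm for static and linearly moving text is proposed, using edge corner points with an improved Hausdorff distance as the matching criterion. The binary image produced by an edge operator is first denoised and thinned; the extracted edge corners then serve as the feature point set, the improved Hausdorff distance serves as the matching criterion, and point pattern matching tracks the position of the text region across adjacent video frames. Experimental results show that tracking by point pattern matching is more accurate than matching on all image pixels. Because the algorithm does not need to localize text in every frame, system efficiency improves considerably. Building on text tracking, a foreground/background recognition algorithm based on multi-frame fusion extracts the text strokes, which are then passed to OCR.
     A news story unit segmentation method is proposed that fuses caption text in the video with audio, visual, and other multimodal information, and a prototype system for news story segmentation, browsing, and retrieval is implemented. News caption text is first localized, tracked, and segmented using the algorithms of Chapters 2 and 3; then, on top of shot segmentation, announcer and non-announcer audio shots are identified using a Gaussian mixture model (GMM) and the KL divergence; finally, knowledge of the particular structure of news programs is used to segment the program into story units automatically.
     An application of video text extraction to driver assistance is presented: extracting text from road signs provides the driver with navigation information on the road, such as current location, direction, and speed limits. The algorithm first uses color information to localize road signs of specific colors, then applies an affine transformation, and finally uses a seeded region growing algorithm based on a stroke operator to localize, segment, and extract the sign text.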
The rapid growth of video resources has created an urgent demand for efficient video information classification and retrieval systems that help users find videos or video clips of interest in huge amounts of unstructured video data. Among the relevant techniques, text extraction has become a valuable research topic because the text in video frames is closely related to the video content. In addition, many mobile devices are now equipped with high-performance cameras, so images and videos containing text can be captured easily whenever needed. If this text can be detected automatically, many practical applications (e.g. translation, services for blind people, machine vision, and intelligent traffic systems) can be offered to users.
     However, text embedded in video frames varies in size, style, direction, and arrangement, and often appears with low contrast against complex backgrounds, which makes text extraction very difficult.
     This dissertation focuses on crucial problems in video text extraction: text localization in a single video frame, multi-frame text tracking, text enhancement, and applications of video text segmentation (news video story segmentation and a system for detecting text on road signs).
     The main contributions of this dissertation are as follows:
     An edge detection approach combining gray-scale mathematical morphology with the wavelet transform is first proposed for coarse filtering. It fuses the edge information obtained by the two methods, combining their advantages to suppress noise effectively while keeping edges continuous and clear. Next, a density-based region growing method joins the edge pixels into candidate text regions. Finally, an algorithm based on binary particle swarm optimization (BPSO) simultaneously optimizes feature selection and the SVM parameters, and the resulting classifier identifies true text among the candidates. Experimental results show that this approach detects text lines quickly and robustly under various conditions.
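The density-based region growing step can be illustrated with a minimal sketch. The abstract does not give the actual growing rule, so everything here is an assumption made for illustration: the edge map is split into square blocks, a block counts as "dense" when its fraction of edge pixels reaches `min_density`, and 4-connected dense blocks are grown into one candidate region. The function name `dense_regions` and both parameters are hypothetical.

```python
from collections import deque

def dense_regions(edge_map, block=4, min_density=0.3):
    """Group dense blocks of a binary edge map into candidate boxes.

    Returns (top, left, bottom, right) pixel boxes, bottom/right exclusive.
    Illustrative stand-in, not the dissertation's algorithm.
    """
    rows, cols = len(edge_map) // block, len(edge_map[0]) // block

    def density(r, c):
        # Fraction of edge pixels inside block (r, c).
        total = sum(edge_map[r * block + i][c * block + j]
                    for i in range(block) for j in range(block))
        return total / (block * block)

    dense = [[density(r, c) >= min_density for c in range(cols)]
             for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not dense[r][c] or seen[r][c]:
                continue
            # Grow a region from this seed block over 4-connected dense blocks.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0, r1, c1 = min(r0, y), min(c0, x), max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and dense[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0 * block, c0 * block,
                          (r1 + 1) * block, (c1 + 1) * block))
    return boxes

# Toy 8x16 edge map: a striped "text-like" patch plus one isolated noise pixel.
edge_map = [[0] * 16 for _ in range(8)]
for y in range(0, 4):
    for x in range(4, 12, 2):
        edge_map[y][x] = 1
edge_map[6][14] = 1
candidates = dense_regions(edge_map)  # the noise pixel's block is too sparse to seed a region
```

In a full pipeline each candidate box would then go to the verification classifier; the block size and density threshold would need tuning on real frames.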
     A method for video text tracking and text extraction against complex backgrounds is proposed. Based on corner detection using a curvature function, a point matching method tracks text objects, with a modified Hausdorff distance used to find and register the corresponding text block across video frames. The algorithm avoids detecting text in every video frame, which considerably improves system efficiency. Next, a multi-frame foreground/background recognition algorithm is proposed to extract text strokes for optical character recognition. The efficiency and robustness of the point matching method for video text tracking and of the text extraction algorithm are demonstrated by experiments on TV series and movies.
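To make the point matching idea concrete, here is a minimal sketch that tracks a set of corner points between two frames by minimizing a modified Hausdorff distance. The abstract does not specify which modification is used; this sketch assumes the widely used mean-of-nearest-distances form (the larger of the two directed mean distances), and it assumes pure integer translation within a small search window. The function names and the `search` parameter are hypothetical.

```python
import math

def directed_mean_distance(a, b):
    """Mean, over points of a, of the distance to the nearest point of b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def modified_hausdorff(a, b):
    """Symmetric score: the larger of the two directed mean distances."""
    return max(directed_mean_distance(a, b), directed_mean_distance(b, a))

def best_translation(template, frame_points, search=3):
    """Try every integer shift in [-search, search]^2; keep the lowest score."""
    best = None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            shifted = [(x + dx, y + dy) for x, y in template]
            score = modified_hausdorff(shifted, frame_points)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best

# Corners of a text block in the previous frame, and the same corners in the
# next frame after a shift of (2, -1), plus one spurious corner.
corners_prev = [(10, 10), (14, 10), (10, 16), (20, 12)]
corners_next = [(x + 2, y - 1) for x, y in corners_prev] + [(40, 40)]
score, dx, dy = best_translation(corners_prev, corners_next)
# The recovered shift is (2, -1) despite the spurious corner.
```

Averaging nearest-point distances, instead of taking their maximum as the classical Hausdorff distance does, is what keeps a single spurious corner from dominating the match.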
     A novel automatic news story segmentation scheme based on video, audio, and text information is proposed. First, the shot boundaries of the news video are detected; then the topic-caption frames are identified to obtain segmentation cues, using the text detection and tracking algorithms of the previous chapters. Next, using a Gaussian mixture model (GMM) and the KL divergence, every video shot is classified from its audio as an announcer or non-announcer shot. Finally, the news program is segmented into story units using knowledge of the special structure of news programs.
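The role of the KL divergence in separating announcer from non-announcer shots can be sketched as follows. This is a deliberate simplification, not the dissertation's method: each shot's audio features are modeled by a single diagonal-covariance Gaussian (the KL divergence between two such Gaussians has a closed form, whereas for full mixtures it must be approximated), and the symmetric divergence to a reference announcer model serves as a dissimilarity score. The feature values and all names are made up for illustration.

```python
import math
from statistics import fmean, pvariance

def fit_diag_gaussian(frames):
    """Per-dimension mean and variance of a list of feature vectors."""
    dims = list(zip(*frames))
    means = [fmean(d) for d in dims]
    variances = [max(pvariance(d), 1e-6) for d in dims]  # floor avoids zero variance
    return means, variances

def kl_diag(p, q):
    """KL(p || q) for two diagonal Gaussians given as (means, variances)."""
    (mp, vp), (mq, vq) = p, q
    return 0.5 * sum(math.log(b / a) + (a + (m - n) ** 2) / b - 1.0
                     for m, a, n, b in zip(mp, vp, mq, vq))

def symmetric_kl(p, q):
    """Symmetrized divergence, usable as a dissimilarity score."""
    return kl_diag(p, q) + kl_diag(q, p)

# Toy 2-D audio features: a reference announcer segment and two shots.
reference = fit_diag_gaussian([(1.0, 5.0), (1.2, 5.5), (0.8, 4.5), (1.1, 5.2)])
shot_a = fit_diag_gaussian([(1.1, 5.1), (0.9, 4.8), (1.3, 5.4), (1.0, 4.9)])
shot_b = fit_diag_gaussian([(3.0, 1.0), (3.5, 1.4), (2.8, 0.7), (3.2, 1.1)])

d_a = symmetric_kl(reference, shot_a)  # small: statistics close to the reference
d_b = symmetric_kl(reference, shot_b)  # large: statistics far from the reference
```

A shot would be labeled as an announcer shot when its divergence falls below a threshold chosen on training data; the threshold itself is not something the abstract specifies.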
     A fast and robust approach for extracting text on road signs based on color and stroke information is proposed. First, a novel color model derived from the Karhunen-Loeve (KL) transform is applied to find all possible road sign candidates. Then, an affine transformation rectifies each road sign so that it appears perpendicular to the camera's optical axis, which improves the accuracy of detecting the text embedded in the sign. Finally, mathematical morphology and region growing algorithms are used to obtain a cleaner binary image, which is sent to OCR software. Experimental results demonstrate the robustness and efficiency of the proposed algorithm.
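The rectification step can be illustrated with a small sketch that estimates an affine transform from three point correspondences, for example three detected corners of a sign mapped onto an upright rectangle. How the dissertation obtains its correspondences is not stated in the abstract, so the corner coordinates, the 40x40 target size, and all function names below are assumptions for illustration only.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve the 3x3 linear system m * x = rhs by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear")
    solution = []
    for k in range(3):
        mk = [list(row) for row in m]
        for r in range(3):
            mk[r][k] = rhs[r]  # replace column k with the right-hand side
        solution.append(det3(mk) / d)
    return solution

def affine_from_points(src, dst):
    """Rows (a, b, c) and (d, e, f) with x' = a*x + b*y + c, y' = d*x + e*y + f."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [u for u, _ in dst])
    row_y = solve3(m, [v for _, v in dst])
    return row_x, row_y

def apply_affine(t, p):
    (a, b, c), (d, e, f) = t
    x, y = p
    return (a * x + b * y + c, d * x + e * y + f)

# Three corners of a skewed sign (top-left, top-right, bottom-left) mapped
# onto an upright 40x40 square.
src = [(10.0, 20.0), (50.0, 30.0), (12.0, 60.0)]
dst = [(0.0, 0.0), (40.0, 0.0), (0.0, 40.0)]
transform = affine_from_points(src, dst)
fourth = apply_affine(transform, (52.0, 70.0))  # the parallelogram's fourth corner
```

An affine map preserves parallelism, so it can undo skew and rotation but not true perspective foreshortening; for strongly oblique views a projective transform estimated from four corners would be the usual alternative.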

