Research on Caption Location Algorithms for Caption-Based News Video Retrieval

English title: The Research of Caption Location Algorithm in News Video Retrieval Based on Caption
Author: 蓝照华
Degree level: Master's
Discipline: Computer Application Technology
Keywords: caption-based news video retrieval; caption location algorithm; C-means clustering segmentation algorithm; vertical differential operation; median filter
Degree year: 2008
Advisor: 赵进创
Discipline code: 081203
Degree-granting institution: Guangxi University
Thesis submission date: 2008-06-01
Defense committee chair: 陈华

Abstract

Caption-based video retrieval is an important branch of content-based video retrieval. Using news captions for retrieval can greatly reduce the complexity of video retrieval and improve its speed and accuracy, so research on caption-based video retrieval has both theoretical significance and practical value. News caption location is one of the core technologies of caption-based video retrieval and has become an active research topic in the field. Different application backgrounds call for different news caption location algorithms.
     This thesis studies caption location algorithms for a class of images and videos with relatively complex backgrounds. Building on previous work, its main contributions are as follows:
     (1) Caption-based news video retrieval is studied, with emphasis on caption location algorithms in news video retrieval and their application characteristics.
     (2) A news caption location algorithm that combines C-means clustering with edge detection is proposed for a class of images and videos with relatively complex backgrounds. Experimental results show that the algorithm locates captions well.
     (3) For caption segmentation with the C-means clustering segmentation algorithm, a fast C-means clustering segmentation algorithm with fixed initial cluster centers is proposed for the caption bars commonly used in news video, namely dark text on a white background and white text on a dark background, in order to speed up caption location.
     (4) Methods for weakening noise interference are explored. A vertical differential operation combined with a median filter using the custom template (1,1,1,1,1) is used to weaken background noise in video frames, with fairly satisfactory experimental results.
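
As an illustration of items (3) and (4), the following is a minimal sketch only, not the thesis's implementation. It assumes gray-level pixels in the range 0 to 255, two clusters whose fixed initial centers are 0 and 255, a vertical differential taken as the absolute difference between adjacent rows, and the template (1,1,1,1,1) read as an unweighted horizontal window of five samples. All function names and parameter values are hypothetical.

```python
from statistics import median


def fixed_center_c_means(pixels, centers=(0.0, 255.0), max_iter=20):
    """Cluster gray levels with C-means, starting from fixed centers.

    Assumption: starting at the dark and bright extremes suits caption
    bars with dark-on-white or white-on-dark text, so no random
    initialization or restarts are needed.
    """
    centers = list(centers)
    labels = [0] * len(pixels)
    for _ in range(max_iter):
        # Assignment step: each pixel joins the nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in pixels
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            new_centers.append(
                sum(members) / len(members) if members else centers[k]
            )
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers


def vertical_difference(image):
    """Absolute difference between each row and the row above it."""
    return [
        [abs(below - above) for above, below in zip(row_above, row_below)]
        for row_above, row_below in zip(image, image[1:])
    ]


def median_filter_1x5(row):
    """Median over a 5-sample horizontal window with replicated borders."""
    padded = [row[0]] * 2 + list(row) + [row[-1]] * 2
    return [median(padded[i:i + 5]) for i in range(len(row))]


if __name__ == "__main__":
    labels, centers = fixed_center_c_means([10, 20, 240, 250, 15, 245])
    print(labels, centers)
    print(vertical_difference([[0, 0], [10, 5], [10, 5]]))
    print(median_filter_1x5([0, 0, 100, 0, 0]))
```

With well-separated dark and bright pixels, the fixed centers converge in very few iterations, which is the intuition behind the speed-up claimed in item (3). The five-sample median removes isolated spikes in the difference image while preserving longer runs such as text strokes.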

