Research on Topic Tracking in News Video Broadcasts (新闻视频主题追踪技术研究)

Author: Bai Zhijie (白志杰)
Degree level: Master's
Discipline: Military Intelligence Studies
Keywords: topic tracking ; keyframe ; salient point (keypoint) ; near-duplicate ; similarity matching ; Bayesian information criterion (BIC) ; query corpus expansion ; multimodal fusion
Degree year: 2009
Supervisor: Li Bicheng (李弼程)
Discipline code: 110504
Degree-granting institution: PLA Information Engineering University (解放军信息工程大学)
Thesis submission date: 2009-04-15

Abstract

News video topic tracking can be used to monitor the news broadcasts of different television stations, detect reports on the same topic, follow how an event develops, and present its causes, consequences, and full course. The technology raises the practical value of video information, meets the needs of organizations and individuals, and has broad application prospects in fields such as information security and digital libraries. This thesis studies news video topic tracking and makes three main contributions:
     (1) Video structure analysis is studied, and a keyframe synthesis method based on global statistics is proposed. Unlike traditional keyframe extraction, the method synthesizes a keyframe from the probability with which each pixel value occurs at the same position across all frames of a shot. Experimental results show that the method takes full account of the spatial and temporal information of every frame in the shot and represents the shot better than conventional extraction methods.
     (2) Near-duplicate frame detection is studied, and a method for news video based on the Bayesian Information Criterion (BIC) is proposed. Keypoints (salient points) are first detected in each frame, and features of the keypoint regions are extracted to form a feature sequence; BIC is then applied to the feature sequences of two frames to decide whether the frames are near-duplicates. Experimental results show that the method, which needs neither a threshold nor machine learning, achieves better recall and precision than conventional methods.
     (3) Topic tracking is studied, and a method based on the fusion of multimodal feature matching is proposed. Global (low-level) features and local (near-duplicate) features are each used to match frames by similarity, the matching results are fused by linear weighting, and stories are ranked by frame similarity score; the query (example) corpus and the story corpus are then expanded to complete the tracking. Experimental results show that the method performs considerably better than using global features or local features alone.
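
To make contribution (1) concrete, the following is a minimal sketch of synthesizing a keyframe from per-pixel value statistics. The abstract does not state how the probabilities are turned into a pixel value, so this sketch assumes the simplest reading: each output pixel takes the gray level that occurs most often at that position across the shot. The function name `synthesize_keyframe` and the 8-bit grayscale input are illustrative assumptions, not details from the thesis.

```python
import numpy as np


def synthesize_keyframe(frames: np.ndarray) -> np.ndarray:
    """Build one frame from a shot by taking, at every pixel position,
    the most frequent gray level over all frames (assumed rule).

    frames: uint8 array of shape (T, H, W), one grayscale frame per row.
    Ties are resolved in favor of the lower gray level (argmax behavior).
    """
    if frames.ndim != 3 or frames.dtype != np.uint8:
        raise ValueError("expected a uint8 array of shape (T, H, W)")
    t, h, w = frames.shape
    flat = frames.reshape(t, h * w)

    # counts[v, p] = number of frames whose pixel p has gray level v.
    # This costs 256 * H * W integers, fine for a sketch on small frames.
    counts = np.zeros((256, h * w), dtype=np.int32)
    positions = np.arange(h * w)
    for frame in flat:
        counts[frame, positions] += 1

    # The empirical probability of value v at pixel p is counts[v, p] / T,
    # so the most probable value is simply the argmax over v.
    return counts.argmax(axis=0).astype(np.uint8).reshape(h, w)
```

Because every frame in the shot contributes to the counts, a brief occlusion or flash that affects only a few frames does not appear in the synthesized result, which is one plausible reason a synthesized frame could represent a shot better than any single extracted frame.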
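
The BIC decision in contribution (2) can be illustrated with the Gaussian delta-BIC test commonly used for speaker change detection. This is a sketch under assumptions: the abstract does not give the feature type or the statistical model, so the code assumes each frame is described by a set of d-dimensional feature vectors modeled as a single full-covariance Gaussian. The names `delta_bic`, `is_near_duplicate`, and `penalty_weight` are hypothetical.

```python
import numpy as np


def _logdet_cov(x: np.ndarray, eps: float = 1e-6) -> float:
    """Log-determinant of the maximum-likelihood covariance of the rows of x.
    A small ridge (eps) keeps the matrix non-singular."""
    d = x.shape[1]
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True)) + eps * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return float(logdet)


def delta_bic(x: np.ndarray, y: np.ndarray, penalty_weight: float = 1.0) -> float:
    """BIC of 'two separate Gaussians' minus BIC of 'one shared Gaussian'.

    x, y: arrays of shape (n1, d) and (n2, d) holding the feature vectors
    of the two frames. A positive value favors two different sources.
    """
    n1, d = x.shape
    n2 = y.shape[0]
    n = n1 + n2
    pooled = np.vstack([x, y])

    # Log-likelihood gain from modeling the two sets separately.
    gain = 0.5 * (n * _logdet_cov(pooled)
                  - n1 * _logdet_cov(x)
                  - n2 * _logdet_cov(y))

    # Extra parameters of the second Gaussian: mean plus covariance.
    n_params = d + d * (d + 1) / 2
    penalty = penalty_weight * 0.5 * n_params * np.log(n)
    return gain - penalty


def is_near_duplicate(x: np.ndarray, y: np.ndarray) -> bool:
    """Frames are treated as near-duplicates when one shared model is preferred."""
    return delta_bic(x, y) <= 0.0
```

The decision depends only on the sign of the delta-BIC value, which is consistent with the abstract's remark that no tuned threshold or trained classifier is needed; the model-complexity penalty plays that role instead.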
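
The linear weighted fusion and story ranking in contribution (3) can be sketched as follows. The abstract gives neither the weights nor the score ranges, so this example assumes that both matchers return a similarity in [0, 1] per story and that a single weight balances them. The function name `fuse_and_rank`, the `weight` parameter, and the story identifiers are illustrative only.

```python
from typing import Dict, List, Tuple


def fuse_and_rank(
    global_scores: Dict[str, float],
    local_scores: Dict[str, float],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Fuse two per-story similarity scores linearly and rank the stories.

    fused = weight * global + (1 - weight) * local
    A story missing from one matcher is given a score of 0.0 there.
    Returns (story_id, fused_score) pairs, best match first.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    stories = set(global_scores) | set(local_scores)
    fused = {
        story: weight * global_scores.get(story, 0.0)
        + (1.0 - weight) * local_scores.get(story, 0.0)
        for story in stories
    }
    # Sort by descending score; break ties by story id for a stable order.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

In a tracking loop of the kind the abstract describes, the top-ranked stories would be the candidates added to the query corpus before the next matching round; how many to add and when to stop are design choices the abstract leaves open.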

