基于内容的视频检索关键技术研究

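English title: The Research on Key Techniques of Content-Based Video Retrieval
Author: 刘洋 (Liu Yang)
Thesis level: Master's
Discipline: Circuits and Systems
Chinese keywords: 视频检索; 镜头分割; 关键帧提取; 镜头聚类; 互信息量
English keywords: Video retrieval; Shot segmentation; Key frame extraction; Shot clustering; Mutual information
Degree year: 2008
Supervisor: 毛建旭 (Mao Jianxu)
Discipline code: 080902
Degree-granting institution: 湖南大学 (Hunan University)
Thesis submission date: 2008-04-16
Chair of the defense committee: 戴瑜兴 (Dai Yuxing)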

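Abstract

With the rapid development of multimedia and network technology, digital video has become ever easier to acquire and distribute, and it has gradually become one of the main carriers of human information. Now that the volume of video information is expanding so quickly, the resulting problem is how to retrieve and browse massive video collections efficiently. Traditional video retrieval relies on text identifiers added to the video by hand. This approach involves an enormous workload, is very inefficient, and is affected by subjective factors, so it cannot meet practical needs. Content-based video retrieval uses the computer to process, analyze, and understand video from the low level to the high level, obtains its content, and retrieves according to that content. It overcomes the shortcomings of traditional text-based retrieval and has become a research hotspot in multimedia information retrieval.
     This thesis first analyzes and summarizes the theoretical framework and research status of video retrieval, and then studies in depth the key techniques of the field: video shot segmentation, key frame extraction, and shot clustering. Shot segmentation is the first step of video processing. Building on a survey of existing shot segmentation methods, the thesis studies shot segmentation based on mutual information. It designs and implements a cut detection algorithm based on dual sliding windows: the algorithm uses the mutual information between video frames as the measure of similarity between two frames, and uses the dual-sliding-window method to find local extrema of the mutual information between adjacent frames, which determine the boundaries of cut shots. To counter the interference of motion and flashes with shot detection, the thesis proposes a cut detection algorithm based on mutual information over image blocks. The algorithm takes mutual information as the criterion for inter-frame difference: it partitions each frame into blocks, computes the mutual information between corresponding sub-image blocks of two adjacent frames, applies an inverse-proportional transformation and sums the results, and then uses an adaptive threshold to find local extrema of the inter-frame difference and hence the boundaries of cut shots. The thesis also studies and implements a gradual transition detection algorithm based on mutual information, which detects gradual shot boundaries from the differences in mutual information between non-adjacent frames at different frame distances. Experimental results show that the proposed shot segmentation methods use well-defined measures, are algorithmically simple, and achieve high recall and accuracy for both cuts and gradual transitions.
     Key frame extraction is one of the key steps of content-based video retrieval. The thesis first studies the principles and main methods of key frame extraction, and then introduces mutual information into it, proposing a key frame extraction algorithm based on mutual information. Targeting the variation of mutual information within a shot, the algorithm computes the standard deviation of the inter-frame differences to judge the similarity of consecutive frames in the shot, and extracts one frame as the key frame from each run of highly similar consecutive frames. Experimental results show that the key frames extracted by this algorithm accurately reflect the shot content and are key frames in the true sense.
     As a bridge from the low-level features of video content to high-level abstraction, shot clustering plays a key role in content-based video retrieval systems. The thesis first studies the principles and characteristics of cluster analysis and briefly analyzes the main algorithms in the field, and then proposes a video shot clustering algorithm based on the color features of key frames. The algorithm uses the color features of the key frames as the basis for clustering, clusters the shots with an improved K-means algorithm, and optimizes the clustering results. Experimental results show that the proposed algorithm achieves high accuracy and efficiency, improves the stability of the clustering results, and gives satisfactory results.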

English abstract

With the rapid development of multimedia and network technology, digital video has become much easier to acquire and distribute, and video has become one of the main carriers of information. As the volume of video information grows, the accompanying problem is how to retrieve and browse massive video collections efficiently. Traditional video retrieval searches text identifiers that were added to the video by hand. This approach involves a heavy workload and low efficiency, and its results are affected by subjective factors, so it cannot meet practical needs. Content-based video retrieval, in which the computer processes, analyzes, and understands the video from the low level to the high level to obtain its content and then searches by that content, has become one of the hot topics in multimedia information retrieval.
     This thesis first summarizes and analyzes the theoretical framework and current state of research in video retrieval, and then studies in depth several key techniques in the field: shot segmentation, key frame extraction, and shot clustering. Shot segmentation is the first step of video retrieval. After reviewing existing algorithms, the thesis studies shot segmentation based on mutual information. It designs and implements a cut detection algorithm based on a dual sliding window, which computes the mutual information between frames to measure the similarity of two frames and uses the dual sliding window to find local extrema of the mutual information between adjacent frames, thereby locating the boundaries of cut shots. To counter the interference of motion and flashes in cut detection, the thesis proposes a block-based mutual information cut detection algorithm, in which mutual information serves as the criterion for the difference between frames. The algorithm first partitions each frame into blocks and computes the mutual information between corresponding sub-images of adjacent frames. It then applies an inverse-proportional transformation to the mutual information values and accumulates them over the whole image. Finally, it uses an adaptive threshold to identify local extrema of the inter-frame difference and so find the boundaries of cut shots. The thesis also studies and implements a gradual transition detection algorithm, which uses differences in mutual information between non-adjacent frames at different frame distances to detect the boundaries of gradual transitions. Experimental results show that the algorithms are simple, use well-defined measures, and achieve high recall and accuracy.
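
The mutual-information measure and the local-extremum search described above can be illustrated with a short Python sketch. It is not the thesis's implementation: the abstract gives no formulas, window sizes, or thresholds, so the histogram bin count (`bins`), the single symmetric window (`half_window`), and the `ratio` test are assumptions made here for illustration. The dual-window rule, the block-based variant, and the gradual transition detector are not reproduced, and the function names `mutual_information` and `detect_cuts` are hypothetical.

```python
import numpy as np


def mutual_information(frame_a, frame_b, bins=32):
    """Mutual information (in bits) between two equally sized grayscale frames.

    Assumes 8-bit intensities in [0, 255]; `bins` is an illustrative choice.
    """
    joint, _, _ = np.histogram2d(
        frame_a.ravel(), frame_b.ravel(),
        bins=bins, range=[[0, 256], [0, 256]],
    )
    p_ab = joint / joint.sum()                 # joint intensity distribution
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of frame_a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of frame_b, shape (1, bins)
    nz = p_ab > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])))


def detect_cuts(mi_series, half_window=3, ratio=0.5):
    """Return indices i where the MI between frames i and i+1 suggests a cut.

    Assumed rule (not taken from the thesis): the value must be strictly below
    every neighbour in the window and below `ratio` times the neighbours' mean.
    """
    mi = np.asarray(mi_series, dtype=float)
    cuts = []
    for i in range(len(mi)):
        lo = max(0, i - half_window)
        hi = min(len(mi), i + half_window + 1)
        neighbours = np.delete(mi[lo:hi], i - lo)
        if (neighbours.size
                and mi[i] < neighbours.min()
                and mi[i] < ratio * neighbours.mean()):
            cuts.append(i)
    return cuts
```

Because mutual information measures similarity, a cut between frames i and i+1 appears as a sharp dip in the series, which is why this sketch searches for local minima. A caller would compute `mutual_information` for every pair of adjacent frames and pass the resulting list to `detect_cuts`.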
     Key frame extraction is a key step of content-based video retrieval. The thesis first studies the principles and main methods of key frame extraction, then introduces mutual information into key frame extraction and proposes a mutual-information-based key frame extraction algorithm. Targeting the variation of mutual information within a shot, the algorithm computes the standard deviation of the inter-frame differences to determine the similarity of consecutive frames, and extracts one key frame from each group of similar consecutive frames. Experimental results show that the extracted key frames accurately reflect the content of the shots and are key frames in the true sense.
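
One possible reading of the standard-deviation criterion is sketched below in Python. The abstract does not say how runs of similar frames are delimited or which frame of a run is chosen, so everything specific here is an assumption: a run is grown frame by frame while the standard deviation of its inter-frame differences stays at or below `max_std`, and the middle frame of each run is taken as the key frame. The function name `extract_key_frames` and the threshold value are hypothetical.

```python
import numpy as np


def extract_key_frames(diffs, max_std=0.1):
    """Pick one key-frame index per run of mutually similar frames in one shot.

    diffs[i] is the difference between frames i and i+1 (for example, derived
    from their mutual information). The run-growing rule and the choice of the
    middle frame are illustrative assumptions, not the thesis's method.
    """
    diffs = np.asarray(diffs, dtype=float)
    n_frames = len(diffs) + 1
    key_frames = []
    start = 0                                   # first frame of the current run
    for end in range(1, n_frames):
        # diffs[start:end] covers the frame pairs (start, start+1) ... (end-1, end)
        if np.std(diffs[start:end]) > max_std:
            # Frame `end` breaks the run: close frames start..end-1.
            key_frames.append((start + end - 1) // 2)
            start = end
    key_frames.append((start + n_frames - 1) // 2)  # close the final run
    return key_frames
```

A shot with a single frame has an empty `diffs` list and yields frame 0 as its only key frame.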
     As a bridge from the low-level features of video content to its high-level abstraction, shot clustering plays a very important role in content-based video retrieval systems. The thesis first studies the principles and characteristics of cluster analysis and briefly analyzes the major algorithms in the area. It then proposes a shot clustering algorithm based on the color features of key frames. The algorithm clusters shots with an improved K-means algorithm according to the color features of the key frames and optimizes the clustering results. Experimental results show that the method achieves high accuracy and efficiency, improves the stability of the clustering results, and gives satisfactory results.
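
Clustering shots by the color features of their key frames can be sketched in Python as follows. The abstract does not describe the color feature or the improvement made to K-means, so this sketch substitutes a per-channel color histogram and plain K-means with a deterministic farthest-point initialization. Both are assumptions chosen for illustration and are not the thesis's improved algorithm. The function names `color_histogram` and `kmeans` are hypothetical.

```python
import numpy as np


def color_histogram(frame_rgb, bins=8):
    """Concatenated, normalised per-channel histogram of an H x W x 3 frame.

    Assumes 8-bit channel values in [0, 255]; `bins` is an illustrative choice.
    """
    hists = [np.histogram(frame_rgb[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()


def kmeans(features, k, n_iter=50):
    """Plain K-means; returns (labels, centers).

    Farthest-point initialisation is used only to make the sketch deterministic.
    """
    x = np.asarray(features, dtype=float)
    centers = [x[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen center.
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dist, axis=1)        # assign each point to nearest center
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):   # stop once centers are stable
            break
        centers = new_centers
    return labels, centers
```

Each shot would be represented by the `color_histogram` of its key frame, and the resulting feature vectors passed to `kmeans`. The number of clusters `k` has to be supplied by the caller in this sketch.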
