Content Analysis and Understanding of Video Commercials
Abstract
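Video commercials have become one of the most popular commercial media in today's society. They bring indispensable commercial information to modern life and constantly, if subtly, shape how people work and live. Every year, enterprises around the world spend hundreds of millions of US dollars producing and placing tens of thousands of video commercials, which are broadcast repeatedly on television stations in every country. While introducing the public to all kinds of novel goods and services, these commercials also drive the rapid growth of related industries.
     Meanwhile, with the advance of digitization, people can now record massive numbers of video commercials by various means in order to obtain important commercial information at any time. However, for lack of effective techniques for automatically analyzing commercial content, the explosive growth in recorded commercials has created urgent demand among different user groups for automatic commercial filtering, capturing, and indexing. How to develop, for their differing needs, a set of effective techniques for video commercial content analysis and understanding, so that the content, broadcast time, and quality of video commercials can be monitored, analyzed, stored, and queried quickly and effectively, has become a hot topic in multimedia content analysis.
     To address the shortcomings of current techniques for video commercial content analysis and understanding, this work starts from an analysis of the latent semantic characteristics of video commercials. Drawing on computer vision, machine learning, and multimedia processing, it mines the semantic concepts present in video commercials across media, constructs mid-level descriptors, fuses information interactively across media modalities, and proposes practical solutions. The main results and contributions are as follows: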
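     1) A coarse-to-fine matching strategy for video commercial recognition
     To improve the efficiency of video commercial recognition, a coarse-to-fine matching strategy is proposed that combines locality-sensitive hashing (Locality Sensitive Hash, LSH) with fine granularity successive elimination (Fine Granularity Successive Elimination, FGSE). In the coarse matching stage, LSH speeds up the initial retrieval and filters out large amounts of irrelevant content, yielding globally approximate query results. In the fine matching stage, FGSE resolves the collisions left by coarse matching: by decomposing the matching features level by level, it quickly locates local differences and obtains exact matches, enabling fast recognition of video commercials.
     2) Video commercial text detection based on co-training
     Text in video commercials carries important semantic information. To localize this kind of complex text effectively, a co-training based method for video commercial text detection is proposed. Text detection is treated as the classification of a special texture, and a co-training mechanism is introduced in which two relatively independent views strengthen the description of text region characteristics. Because co-training tends to introduce noisy samples, an improved co-training algorithm incorporating the Bootstrap idea is proposed. It selects representative samples interactively between the two relatively independent views and improves the generalization ability of the classifier.
     3) Video commercial block detection fusing the visual, audio, and textual modalities
     A commercial block detection method based on the interactive fusion of the visual, audio, and textual modalities is proposed. The method exploits the intrinsic broadcast characteristics of each modality. For the first time in the textual modality of video commercials, a comprehensive text descriptor is proposed that draws on the random spatio-temporal variation of video text regions; together with the audio and visual features of commercials, it forms a complete description space. In addition, because existing fusion schemes simply stack the information from each modality, an interactive ensemble learning algorithm, Tri-AdaBoost, is proposed. It interactively mines the complementary information carried by the mid-level descriptors of each modality and fuses the modalities, thereby improving classifier performance.
     4) Cross-media characteristic analysis and fusion for video commercial block segmentation
     By fusing the visual, audio, and textual modalities of commercials, an effective method for video commercial block segmentation is proposed. The frame marked with product information (Frame Marked with Product Information, FMPI) is a descriptor that plays an important role in commercial segmentation. To make FMPI detection more robust, the textual modality and several important visual characteristics are introduced into FMPI construction for the first time and combined with audio descriptors to form a complete description space for commercial boundary characteristics. In addition, the temporal context among descriptors from different modalities is used to fuse the modalities effectively and segment commercial blocks automatically.
     5) Commercial semantic classification based on a sparse visual bag-of-words representation
     To improve the descriptive power of the traditional visual bag of words, sparse learning, which better matches the way humans understand images, is used to build a commercial semantic classification method based on a sparse visual bag-of-words representation. Based on an analysis of the co-occurrence patterns of visual semantic units across a large number of commercials, the distinctive semantic units that appear in different categories of commercials are mapped onto an overcomplete visual dictionary, and the latent semantics of a commercial are described by sparse linear combinations of the dictionary's basic elements. This establishes a latent mapping between the semantic information in different categories of commercials and the sparse visual bag-of-words representation, enabling classification of commercial semantic content.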
As one of the most popular means of promoting products, video commercials have become an inescapable part of modern life, significantly influencing how we work and live. Because of their importance, tens of thousands of commercials are produced and broadcast on many TV channels to promote a variety of new commodities and services, at a cost of billions of dollars.
     Meanwhile, thanks to the rapid development of digital technologies, people can conveniently record ever more commercials to acquire commercial information. However, the explosive growth of recorded commercials creates a pressing demand among different user groups for a smart commercial content analysis and understanding (CCAU) scheme that supports practical applications such as commercial filtering, capturing, and indexing. An effective CCAU scheme is needed to help these users monitor, browse, and index commercials that are updated daily. This line of research has become an intense focus in multimedia analysis.
     To address the challenges of CCAU, this work explores several of its key issues using state-of-the-art computer vision, machine learning, and multimedia processing techniques. Specifically, we propose a variety of mid-level descriptors that describe the intrinsic semantics of commercials in different modalities. In addition, to exploit these cross-media characteristics collaboratively, we design several techniques that boost the performance of the proposed CCAU methods. The main contributions of this work are as follows:
     1) Video commercial recognition using a coarse-to-fine matching strategy
     To improve the efficiency of video commercial recognition, a coarse-to-fine matching strategy is proposed that combines locality-sensitive hashing (LSH) with fine granularity successive elimination (FGSE). Specifically, LSH accelerates the initial coarse retrieval, and FGSE is adapted to rapidly eliminate the irrelevant candidates that pass the coarse matching stage.
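     The two-stage idea can be illustrated with a minimal sketch. It is not the thesis implementation: the hash family (random-hyperplane sign hashing), the fixed-length feature vectors, the L1 distance, and all function names are assumptions made for illustration. The fine stage is a simplified multi-level successive elimination that halves segments at each level, whereas FGSE proper refines the partition more gradually. The elimination test relies on the fact that the sum of absolute differences of segment sums never exceeds the true L1 distance, so a candidate can be discarded as soon as that lower bound exceeds the threshold.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Random hyperplanes for sign-based LSH (an assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(database, planes):
    """Coarse stage, offline: bucket every stored commercial by its LSH key."""
    index = {}
    for name, vec in database.items():
        index.setdefault(lsh_key(vec, planes), []).append(name)
    return index


def partial_sums(vec, level):
    """Sums over 2**level equal contiguous segments of the vector."""
    n_seg = 2 ** level
    size = len(vec) // n_seg
    return [sum(vec[i * size:(i + 1) * size]) for i in range(n_seg)]


def passes_successive_elimination(query, cand, threshold):
    """Fine stage: reject as soon as a lower bound on the L1 distance
    exceeds the threshold. Assumes the vector length is a power of two;
    at the last level the bound equals the exact L1 distance."""
    levels = len(query).bit_length() - 1
    for level in range(levels + 1):
        bound = sum(
            abs(a - b)
            for a, b in zip(partial_sums(query, level), partial_sums(cand, level))
        )
        if bound > threshold:
            return False
    return True


def recognize(query, database, index, planes, threshold):
    """Coarse lookup in the LSH bucket, then fine elimination of collisions."""
    candidates = index.get(lsh_key(query, planes), [])
    return [
        name for name in candidates
        if passes_successive_elimination(query, database[name], threshold)
    ]


# Toy usage with hypothetical 8-dimensional features.
planes = make_hyperplanes(dim=8, n_bits=4, seed=1)
database = {
    "ad_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "ad_b": [8, 7, 6, 5, 4, 3, 2, 1],
}
index = build_index(database, planes)
matches = recognize([1, 2, 3, 4, 5, 6, 7, 8], database, index, planes, threshold=2)
```

     In this toy example the two stored vectors have the same total sum, so the coarsest bound cannot separate them; the next level, which compares half-vector sums, already rejects the mismatch without computing the full distance.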
     2) Enhanced co-training based video commercial text detection
     To pave the way for using the textual characteristics of commercials, we present an enhanced co-training based commercial text detection approach that interactively exploits the intrinsic correlation between multiple texture representation spaces. Specifically, to alleviate the problem of noisy samples in the co-training process, an enhanced co-training strategy that incorporates Bootstrap is proposed to improve the generalization ability of the classifier.
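     For readers unfamiliar with co-training, the following is a minimal sketch of a plain two-view co-training loop in the style of Blum and Mitchell. It does not reproduce the Bootstrap-based enhancement described above, whose details are not given here. The nearest-centroid classifier, the margin used as a confidence score, and all names are assumptions chosen to keep the example self-contained.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(labeled, view):
    """Nearest-centroid model for one view: one centroid per class (0 or 1).
    Each labeled sample is a tuple (view0_features, view1_features, label)."""
    return {
        c: centroid([s[view] for s in labeled if s[2] == c])
        for c in (0, 1)
    }


def predict(model, x):
    """Return (label, confidence); confidence is the distance margin."""
    d0, d1 = sq_dist(x, model[0]), sq_dist(x, model[1])
    return (0 if d0 <= d1 else 1), abs(d0 - d1)


def co_train(labeled, unlabeled, rounds=5):
    """Each round, the classifier of each view labels the unlabeled sample
    it is most confident about and adds it to the shared labeled pool."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = train(labeled, view)
            best = max(pool, key=lambda s: predict(model, s[view])[1])
            label, _ = predict(model, best[view])
            pool.remove(best)
            labeled.append((best[0], best[1], label))
    return train(labeled, 0), train(labeled, 1)


# Toy usage: two hypothetical one-dimensional "texture" views per sample.
seed_samples = [([0.0], [0.0], 0), ([10.0], [10.0], 1)]
unlabeled = [([1.0], [0.5]), ([9.0], [9.5]), ([2.0], [1.0]), ([8.0], [9.0])]
model_view0, model_view1 = co_train(seed_samples, unlabeled)
```

     Because each view only adds the sample it is most confident about, a wrong early label can still propagate, which is the noise problem the enhanced strategy above is meant to address.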
     3) Collaboratively exploiting visual-audio-textual characteristics for video commercial block detection
     We detect commercial blocks by collaboratively exploiting the visual, audio, and textual characteristics embedded in commercials. Rather than relying exclusively on visual and audio characteristics, as most previous work does, we analyze the spatio-temporal properties of overlay text in commercials to exploit intrinsic textual characteristics that are common in commercials but rare in general programs. Additionally, Tri-AdaBoost, an interactive ensemble learning method, is proposed to form a consolidated semantic fusion across visual, audio, and textual characteristics.
     4) Video commercial block segmentation based on the collaborative fusion of visual-audio-textual descriptors
     An effective commercial block segmentation method is proposed that collaboratively fuses visual, audio, and textual descriptors. Additional informative descriptors, including textual characteristics, are introduced to make the detection of frames marked with product information (FMPI) more robust. Together with the audio characteristics, FMPI provides a complementary representation for modeling intra-commercial similarity and inter-commercial dissimilarity. In addition, the temporal relations among these multi-modal descriptors are used collaboratively to segment a commercial block into individual commercials.
     5) Video commercial categorization using a sparse coding based visual bag-of-words representation
     To boost the discriminative ability of the traditional visual bag of words (VBoW) in commercial categorization, a more suitable representation, sparse coding based VBoW, is presented to describe the co-occurrence of semantic units in different kinds of commercials. These semantic units are mapped into an overcomplete dictionary, and each commercial is then represented by a sparse linear combination of the atoms in the dictionary.
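     A minimal sketch of the representation step follows, under stated assumptions: the sparse solver is plain matching pursuit over unit-norm atoms, the per-commercial vector is obtained by max pooling the absolute code values, and the tiny two-dimensional dictionary is purely illustrative. The thesis does not specify its solver or pooling here, so none of these choices should be read as the author's method.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse code: repeatedly pick the unit-norm atom most
    correlated with the residual and subtract its contribution."""
    residual = list(signal)
    code = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        scores = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[k]) < 1e-12:  # residual is already explained
            break
        code[k] += scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return code


def sparse_bag_of_words(local_features, atoms, n_nonzero=2):
    """Encode each local feature sparsely, then max-pool the absolute
    coefficients per atom to get one vector per commercial."""
    codes = [matching_pursuit(f, atoms, n_nonzero) for f in local_features]
    return [max(abs(c[i]) for c in codes) for i in range(len(atoms))]


# Overcomplete toy dictionary: three atoms in a two-dimensional space.
atoms = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
code = matching_pursuit([2.0, 2.0], atoms, n_nonzero=1)
pooled = sparse_bag_of_words([[2.0, 2.0], [3.0, 0.0]], atoms)
```

     In the toy dictionary the feature `[2.0, 2.0]` is explained by the single diagonal atom rather than by two axis atoms, which is the kind of compact description an overcomplete dictionary makes possible. The pooled vector could then be passed to any standard classifier.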
引文
[1]Television advertisement. [Online]. http://en.wikipedia.org/wiki/TV_commercial
    [2]电视广告, 网络资源:http://zh.wikipedia.org/wiki/%E7%94%B5%E8%A7%86%E50/oB9%BF%E5%91%8A
    [3]电视广告,网络资源:http://baike.baidu.com/view/88986.htm
    [4]《国家中长期科学和技术发展规划纲要(2006—-2020年)》,网络资源:http://www.gov.cn/jrzg/2006-02/09/content 183787.htm
    [5]《北京市中长期科学和技术发展规划纲要(2008—2020年)》,网络资源:http://www.bjkw.gov.cn/n1143/n1240/n1375/n1705/6138204.html
    [6]http://tech.sina.com.cn/it/2003-11-24/0753259431.shtml
    [7]《广播电视广告播出管理办法》,http://www.gov.cn/flfg/2009-09/10/content_1414069.htm
    [8]《广电总局办公厅关于重申广播电视广告播放管理有关规定的通知》http://wwav.sarft.gov.cn/articles/2008/01/30/20080130091955950432.html
    [9]《广电总局关于进一步加强广播电视广告审查和监管工作的通知》http://www.sarft.gov.cn/articles/2010/02/20/20100220111647570862.html
    [10]《关于进一步加强广播电视医疗和药品广告监管工作的通知》http://www.sarft.gov.cn/articles/2009/02/16/20090216173158460883.html
    [11]《国家广电总局严令禁播的八类涉性药品、医疗、保健品广告及有关医疗资讯、电视购物节目》,http://www.ahgd.gov.cn/dt2111111237.asp?docid=2111121834
    [12]T. Koga, Y. Yamamoto, R. Ohtsuki, and S. Matsumoto. Commercial detection apparatus and video playback apparatus. Invention patent,8010363, Aug.30,2011.
    [13]R.Glasberg, T. Sikora, and C. Tas. Method for detecting a commercial in a video data stream by evaluating descriptor information. Invention patent,7761491, Jul.20,2010.
    [14]L. Winger. Compressed domain commercial detect/skip. Application 20070030584, Feb.8, 2007.
    [15]M. Harville. Iterative, maximally probable, batch-mode commercial detection for audiovisual content. Patent 7778519, Aug.17,2010.
    [16]E. Linzer. Commercial detection suppressor with inactive video modification. Patent 7853968, Dec.14,2010.
    [17]S. Taro, T. Takao, O. Masashi, A. Toshiya, M. Noboru, A. Naohisa, and T. Masami. Commercial detection which detects a scene change in a video signal and the time interval of scene change points. Patent 6285818, Sep.4,2001.
    [18]山口宇唯,关本信博,亲松昌幸。节目录像装置及广告检测方法,发明专利,200710147817,2007年8月30日。
    [19]赵丹,王向东,钱跃良,刘群,林守勋。一种广告检测识别方法及系统,发明专利,200810057162,2008年1月30日。
    [20]邱安德。在视频信号中进行高效能广告检测的方法与相关系统,发明专利,200410055809,2004年08月04日。
    [21]S·古特塔,L·阿格尼霍特里。通过结合视频和音频签名增强的广告检测,发明专利,CN03822923.4,2005年10月19日。
    [22]陈向文,朱明。一种电视广告检测方法及装置,发明专利,200910167249,2009年08月 27日。
    [23]田永鸿,李甲,段凌宇,黄铁军,高文。基于视觉显著度的视频广告关联方法与系统,发明专利,200910076782.6,2009年01月21日
    [24]朱振峰,刘楠,赵耀。一种基于分层匹配的视频广告识别方法,发明专利,ZL.200710177523.3,2007年10月17日。
    [25]赵耀,刘楠,朱振峰。一种基于分层匹配的快速音频广告识别方法,发明专利,ZL.200710177517.8,2007年10月17日。
    [26]赵耀,杨厚德,朱振峰,刘楠。一种基于显式共享子空间的视频广告检测系统,发明专利,2011年9月30日。
    [27]李春亮,广告视频探测技术研究[学位论文],长沙,国防科技大学,2004。
    [28]朱明,视频流中广告检测的研究和实现[学位论文],合肥,中国科技大学,2004。
    [29]张亮,鲁棒的视频广告监测技术研究[学位论文],北京,北京交通大学,2006。
    [30]杨厚德,视频广告的自动识别与检测[学位论文],北京,北京交通大学,2011。
    [31]张鸿飞,电视节目广告片段检测系统,[学位论文],北京,北京邮电大学,2008。
    [32]曾宪帼,电视广告监管处理系统的研究与应用[学位论文],北京,北京工业大学,2007。
    [33]于立洋,面向图像的电视广告自动监播系统的研究与实现[学位论文],哈尔滨,哈尔滨理工大学,2007。
    [34]王金桥,广播视频的结构分析和语义检索[学位论文],北京,中国科学院自动化研究所,2008.
    [35]J. M. Sanchez, X. Binefa, J. Vitria, and P. Radeva. Local color analysis for scene break detection applied to TV commercials recognition. In Proc. of the 3rd International Conference on Visual Information and Information Systems (VISUAL 1999), Lecture Notes in Computer Science 1614, Amsterdam, Netherland,1999, Berlin, Springer,1999, pp.:237-244.
    [36]J. M. Sanchez, X. Binefa, and J. Vitria. Shot partitioning based recognition of TV commercials. Multimedia Tools and Applications,2002, vol.18(3), pp.:233-247.
    [37]T. Kurozumi, K. Kashino, and H. Murase. A method for robust and quick video searching using probabilistic dither-voting. In Proc. of the 2001 International Conference on Image Processing (ICIP 2001), Thessaloniki, Greece,2001, USA, IEEE,2001, Vol.2, pp.:653-656.
    [38]J. S. Yuan, L. Y. Duan, Q. Tian, and C. S. Xu. Fast and robust short video clip search using an index structure. In Proc. of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR 2004), New York, USA,2004, USA, ACM,2004, pp.:61-68.
    [39]Y. Li, J. S. Jin and X. Zhou. Matching commercial clips from TV streams using a unique, robust and compact signature. In Proc. of the 2005 Digital Image Computing: Techniques and Applications (DICTA 2005), Cairns, Australia,2005, USA, IEEE Computer Society,2005, pp.:266-272.
    [40]X. Naturel and P. Gros. A fast shot matching strategy for detecting duplicate sequences in a television stream. In Proc. of the 2nd International Workshop on Computer Vision Meets Databases (CVDB 2005), Baltimore, USA,2005, USA, ACM,2005, pp.:21-27.
    [41]S. H. Lee, W. Y. Yoo, and Y. S. Yoon. A visual feature based video identifying system for the TV commercial's monitoring. In Proc. of the 8th International Conference on Advanced Communication Technology (ICACT 2006), Phoenix Park,2006, USA, IEEE,2006, pp.:880-883.
    [42]S. H. Lee, W. Y. Yoo, and Y. S. Yoon. Real-time monitoring system for TV commercials using video features. In Proc. of the 5th International Conference on Entertainment Computing (ICEC 2006), Lecture Notes in Computer Science 4161, Cambridge, UK,2006, Berlin, Springer,2006, pp.:81-89.
    [43]A. Shivadas, J. M. Gauch. Real-time commercial recognition using color moments and hashing. In Proc. of the 4th Canadian Conference on Computer and Robot Vision (CRV 2007), Montreal, Canada,2007, USA, IEEE,2007, pp.:465-472.
    [44]H. T. Shen, X. F. Zhou, Z. Huang, and J. Shao et al. UQLIPS: A real-time near-duplicate video clip detection system. In Proc. of the 33rd International Conference on Very Large Data Bases (VLDB 2007), Vienna, Austria,2007, USA, ACM,2007, pp.:1374-1377.
    [45]Y. J. Li, D. Q. Zhang, X. M. Zhou, and J. S. Jin. A confidence based recognition system for TV commercial extraction. In Proc. of the 19th Australasian Database Conference on Database Technologies 2008 (ADC 2008), Wollongong, Australia,2008, Australia, Australian Computer Society 2008, pp.:57-64.
    [46]Z. C. Zhao and X. D. Liu. A segment-based advertisement search method from TV stream. In Proc. of the 2nd International Conference on Future Computer and Communication (ICFCC 2010), Wuhan, China,2010, USA, IEEE,2010, vol.2, pp.:690-693.
    [47]K. Schoeffmann and L. Boeszoermenyi. Video sequence identification in TV broadcasts. In Proc. of the 17th International Conference on Multimedia Modeling Conference (MMM 2011), Advances in Multimedia Modeling, Taipei, China,2011, Berlin, Springer,2011, pp.:1-8.
    [48]N. Liu, Y. Zhao, and Z. F. Zhu. Commercial recognition in TV streams using coarse-to-fine matching strategy. In Proc. of the 11th Pacific Rim conference on Multimedia (PCM 2010), Advances in multimedia information processing, Shanghai, China,2010, Berlin, Springer,2010, Part 1 pp.:296-307.
    [49]H. D. Yang, N. Liu, Y. Zhao, and Z. F. Zhu. Pruned multi-level successive elimination algorithm for TV commercial recognition. In Proc. of the 3rd ACM International Conference on Internet Multimedia Computing and Service (ICIMCS 2011), Chengdu, China,2011, USA,ACM,2011, pp.:1-4.
    [50]J.G. Lourens. Detection and logging advertisements using its sound. IEEE Transactions on Broadcasting,1990, vol.36(3), pp.:231-233.
    [51]J.G. Lourens. Detection and logging advertisements using its sound. In Proc. of the IEEE 1990 South African Symposium on Communications and Signal Processing (COMSIG 1990), South Africa,1990, USA, IEEE,1990, pp.:209-212.
    [52]B. Oliveira, A. Crivellaro, and R. M. Cesar, Jr. Audio-based radio and TV broadcast monitoring. In Proc. of the 11th Brazilian Symposium on Multimedia and the Web (WebMedia 2005), Brazil, 2005, USA, ACM,2005, pp.:1-3.
    [53]D. Zhao, X. D. Wang, Y. L. Qian, and Q. Liu et al. Fast commercial detection based on audio retrieval. In Proc. of the 2008 IEEE International Conference on Multimedia and Expo (ICME 2008), Hannover, Germany,2008, USA, IEEE Computer Society,2008, pp.:1185-1188.
    [54]M. Covell, S. Baluja, and M. Fink. Detecting ads in video streams using acoustic and visual cues. Journal of Computer,2006, vol.39(12), pp.:135-137.
    [55]M. Covell, S. Baluja, and M. Fink. Advertisement detection and replacement using acoustic and visual repetition. In Proc. of the 8th IEEE Workshop on Multimedia Signal Processing (MMSP 2006), Victoria, Canada 2006, USA, IEEE,2006, pp.:461-466.
    [56]G. B. Zheng and J. Q. Han. Real-time audio retrieval method and automatic commercial detecting system. Computer Science,2006, vol.2 (3), pp.:297-302.
    [57]S. J. Lee and J. S. Seo. A TV commercial monitoring system using audio fingerprinting. In Proc. of the 6th International Conference on Entertainment Computing (ICEC 2007), Lecture Notes in Computer Science 4740, Shanghai, China,2007, Berlin, Springer 2007, pp.:454-457.
    [58]H. Duxans, D. Conejero, and X. Anguera. Audio-based automatic management of TV commercials. In Proc. of the 2009 IEEE International Conference on Acoustics, Speech and Signal (ICASSP 2009), Taipei, China,2009, USA, IEEE,2009, pp.:1305-1308.
    [59]N. Liu, Y. Zhao, and Z. F. Zhu. Coarse-to-fine based matching for audio commercial recognition. In Proc. of the 2008 International Conference on Neural Networks and Signal Processing (ICNNSP 2008), Zhenjiang, China,2008, USA, IEEE,2008, pp.:87-90.
    [60]H. Rehatschek, R. Sorschag, B. Rettenbacher, H. Zeiner, J. Nioche, F. Dejong, and D. V. Leeuwen. Mediacampaign: a multimodal semantic analysis system for advertisement campaign detection. In Proc. of the 2008 International Workshop on Content-Based Multimedia Indexing (CBMI 2008), London, UK,2008, USA, IEEE,2008, pp.:85-92.
    [61]R. Lienhart, C. Kuhmiinch, and W. Effelsberg. On the detection and recognition of television commercials. Technical Report, University of Mannheim,1996, http://madoc.bib.uni-mannheim.de/madoc/volltexte/2004/800/pdf/TR-96-016.pdf.
    [62]R. Lienhart, C. Kuhmiinch, and W. Effelsberg. On the detection and recognition of television commercials. In Proc. of the 1997 IEEE International Conference on Multimedia Computing and Systems, (ICMCS 1997), Ottawa, Canada,1997, USA, IEEE,1997, pp.:509-516.
    [63]A. G. Haupmann and M. J. Witbrock. Story segmentation and detection of commercials in broadcast news video. In Proc. of the 1998 IEEE International Forum on Research and Technology Advances in Digital Libraries (ADL 1998), Santa Barbara, USA,1998, USA, IEEE Computer Society,1998, pp.:22-24.
    [64]D. A. Sadlier, S. Marlow, N. O'Connor, and N. Murphy. Automatic TV advertisement detection from MPEG bitstream. In Proc. of the 1st International Workshop on Pattern Recognition in Information Systems (PRIS 2001), Setubal, Portugal,2001, Portugal, ICEIS,2001, pp.:14-25.
    [65]D. A. Sadlier, S. Marlow, N. O'Connor, and N. Murphy. Automatic TV advertisement detection from MPEG bitstream. Pattern Recognition,2002, vol.35(12), pp.:2719-2726.
    [66]N. Dimitrova, S. Jeannin, J. Nesvadba, T. McGee, L. Agnihotri, and G. Mekenkamp. Real time commercial detection using MPEG features. In Proc. of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems (IPMU 2002), Annecey, France,2002, Berlin, Springer,2002, pp.:481-486.
    [67]A. Albiol, M.J.C. Fulla, A. Albiol, and L. Torres. Commercial detection using HMMs. In Proc. of the 5th Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2004), Portugal,2004, USA, IEEE Computer Society,2004, pp.:1-4.
    [68]A. Albiol, M. J. C., Fulla, A. Albiol, and L. Torres. Detection of TV commercials. In Proc. of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Montreal, Canada,2004, USA, IEEE,2004, vol.3, pp.:541-544.
    [69]J. H. Yeh, J. C. Chen, J. H. Kuo, and J. L. Wu. TV commercial detection in news program videos. In Proc. of the 2005 IEEE International Symposium on Circuits and Systems (ISCAS 2005), Kobe, Japan,2005, USA, IEEE,2005, vol.5, pp.:4594-4597.
    [70]J.C. Chen, J. H. Yeh, W. T. Chu, J. H. Kuo, and J. L. Wu. Improvement of commercial boundary detection using audiovisual features. In Proc. of the 6th Pacific Rim Conference on Multimedia, Advances in Multimedia Information Processing (PCM 2005), Jeju Island, Korea,2005, Berlin, Springer 2005, Part I, pp.:776-786.
    [71]S. J. Li, Y. F. Guo, and H. Li. A temporal and visual analysis-based approach to commercial detection in news video. In Proc. of the 9th International Conference on Visual Information and Information Systems, Advances in Visual Information Systems (VISUAL 2007), Shanghai, China 2007, Berlin, Springer 2007, pp.:117-125.
    [72]S. J. Li, H. Li, and Z. J. Wang. A novel approach to commercial detection in news video. In Proc. of the 8th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007), Qingdao, China,2007, USA, IEEE Computer Society,2007, vol.2, pp.:86-90.
    [73]Y. P. Huang, L. W. Hsu, and F. E. Sandnes. An intelligent subtitle detection model for locating television commercials. IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics, 2007, vol.37, pp.:485-492.
    [74]D. Conejero, X. Anguera, and R. Telefonica.TV advertisements detection and clustering based on acoustic information. In Proc. of the 2008 International Conference on Computational Intelligence for Modelling Control & Automation (CIMCA 2008), Vienna, Austria,2008, USA, IEEE,2008, pp.:452-457.
    [75]L. Meng, Y. Cai, M. Wang, and Y. X. Li, TV commercial detection based on shot change and text extraction. In Proc. of the 2nd International Congress on Image and Signal Processing (CISP 2009), Tianjin, China,2009, USA, IEEE,2009, pp.:1-5.
    [76]K. Schoffmann, M. Lux, and L. Boszormenyi. A novel approach for fast and accurate commercial detection in H.264AVC bit streams based on logo identification. In Proc. of the 15th International Multimedia Modeling Conference on Advances in Multimedia Modeling (MMM 2009), Sophia Antipolis, France,2009, Berlin, Springer,2009, pp.:1-9.
    [77]N. Venkatesh, B. Rajeev, and M. Girish Chandra. Novel TV commercial detection in cookery program videos. In Proc. of the 2009 World Congress on Engineering and Computer Science (WCECS 2009), San Francisco, USA,2009, USA, IAENG,2009, pp.:1-5.
    [78]F. Ge and P. Shi. The detection of TV commercial based on multi-feature fusion. In Proc. of the 2010 International Conference on Multimedia Technology (ICMT 2010), Ningbo, China,2010, USA, IEEE,2010, pp.:1-4.
    [79]N. Venkatesh, M. Girish Chandra, and B. Rajeev. A novel commercial break detection and automatic annotation of TV programs for content based retrieval. In Proc. of the 2nd International Conference on Signal Processing Systems (ICSPS 2010), Dalian, China,2010, USA, IEEE,2010, vol.2, pp.:577-581.
    [80]A. Tanwer and P.S. Reel. Effects of threshold of hard cut based technique for advertisement detection in TV video streams. In Proc. of the 2010 IEEE Students'Technology Symposium (TechSym 2010), Kharagpur, India,2010, USA, IEEE,2010, pp.:211-216.
    [81]R. Arjun. Detection and Removal of advertisements in Broadcast videos [Dissertation]. India, Supercomputer Education and Research Centre,2008.
    [82]MythTv [Online]. Available:http://www.mythpvr.com/mythtv/taxonomy/term/4/.
    [83]Comskip [Online]. Available:http://www.kaashoek.com/comskip/.
    [84]Tivo [Online]. Available:http://www.tivo.com/.
    [85]VideoReDo [Online]. Available:http://www.videoredo.com/en/index.htm.
    [86]Replay TV [Online]. Available:http://www.digitalnetworksna.com/replaytv/.
    [87]J. M. Gauch and A. Shivadas. Identification of new commercials using repeated video sequence detection. In Proc. of the 2005 International Conference on Image Processing (ICIP 2005), Genoa, Italy,2005, USA, IEEE,2005, vol.3, pp.:1252-1255.
    [88]J. M. Gauch and A. Shivadas, Finding and identifying unknown commercials using repeated video sequence detection. Computer Vision and Image Understanding,2006, vol.103(1), pp.:80-88.
    [89]Z. Zeng, S.W. Zhang, H. B. Zheng, and W. Y. Yang. Program segmentation in a television stream using acoustic cues. In Proc. of the International Conference on Audio, Language and Image Processing 2008 (ICALIP 2008), Shanghai, China,2008, USA, IEEE,2008, pp.:748-752.
    [90]S.A. Berrani, G. Manson, and P. Lechat. A non-supervised approach for repeated sequence detection in TV broadcast streams. Image Communication,2008, vol.23 (7), pp.:525-537.
    [91]J. Wedin. Commercial detection based on audio repetition [Dissertation]. Sweden, Lulea University of Technology,2008.
    [92]X. Naturel and P. Gros. Detecting repeats for video structuring. Multimedia Tools and Applications,2008, vol.38(2), pp.:233-252.
    [93]I. Dohring and R. Lienhart. Mining TV broadcasts for recurring video sequences. In Proc. of the 8th ACM International Conference on Image and Video Retrieval (CIVR 2009), Santorini Island, Greece,2009, USA, ACM,2009, pp.:1-8.
    [94]X. Naturel, and S. A. Berrani. Content-based TV stream analysis techniques toward building a catch-up TV service. In Proc. of the 11th IEEE International Symposium on Multimedia (ISM 2009), San Diego, USA,2009, USA, IEEE Computer Society,2009, pp.:412-417
    [95]G. Wei, L. Agnihotri, and N. Dimitrova. TV program classification based on face and text processing. In Proc. of the 2000 IEEE International Conference on Multimedia and Expo (ICME 2000), New York, USA,2000, USA, IEEE Computer Society,2000, vol.3, pp.:1345-1348.
    [96]L. Agnihotri, N. Dimitrova, and T. McGee, S. Jeannin, D. Schaffer, and J. Nesvadba. Evolvable visual commercial detector. In Proc. of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), Madison, USA,2003, USA, IEEE,2003, vol.2, pp.: 79-84.
    [97]P. Duygulu, M. Y. Chen, and A. Hauptmann. Comparison and combination of two novel commercial detection methods. In Proc. of the 2004 IEEE International Conference on Multimedia and Expo (ICME 2004), Taipei, China,2004, USA, IEEE Computer Society,2004, vol.2, pp.: 1267-1270.
    [98]X. S. Hua, L. Lu, and H. J. Zhang. Robust learning-based TV commercial detection. In Proc. of the 2005 IEEE International Conference on Multimedia and Expo (ICME 2005), Amsterdam, Netherlands,2005, USA, IEEE Computer Society,2005, pp.:149-152.
    [99]M. Mizutani, S. Ebadollahi, and S. F. Chang. Commercial detection in heterogeneous video streams using fused multi-modal and temporal features. In Proc. of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, USA,2005, USA, IEEE,2005, vol.2, pp.:157-160.
    [100]T. Y. Liu, T. Qin, and H. J. Zhang. Time-constraint boost for TV commercials detection. In Proc. of the 2004 International Conference on Image Processing (ICIP 2004), Singapore,2004, USA, IEEE,2004, vol.3, pp.:1617-1620.
    [101]M. Montagnuolo and A. Messina. TV genre classification using multimodal information and multilayer perceptrons. In Proc. of the 10th Congress of the Italian Association for Artificial Intelligence and Human-Oriented Computing, Italy,2007, Berlin, Springer,2007, pp.:730-741.
    [102]L. Zhang, Z. F. Zhu, and Y. Zhao. Robust commercial detection system. In Proc. of the 2007 IEEE International Conference on Multimedia and Expo (ICME 2007), Beijing, China,2007. USA, IEEE Computer Society,2007, pp.:587-590.
    [103]X. Wang and Z. M. Guo. A novel real-time commercial detection scheme. In Proc. of the 3rd International Conference on Innovative Computing Information and Control (ICICIC 2008), Dalian, China,2008, USA, IEEE Computer Society,2008, pp.:536-539.
    [104]Z. Liu. Commercial detection in program videos. In Proc. of the 2009 International Forum on Computer Science-Technology and Applications (IFCSTA 2009), Chongqing, China,2009, USA, IEEE,2009, pp.:107-110.
    [105]M. Montagnuolo and A. Messina. Parallel neural networks for multimodal video genre classification. Multimedia Tools and Applications,2009, vol.41(1), pp.:125-159.
    [106]B. Satterwhite and O. Marques. Automatic detection of TV commercials. IEEE Potentials, 2004, vol.23(2), pp.:9-12.
    [107]S. A. Berrani, P. Lechat, and G. Manson. TV broadcast macro-segmentation: metadata-based vs. content-based approaches. In Proc. of the 6th ACM International Conference on Image and Video Retrieval (CIVR 2007), Amsterdam, Netherlands, USA, ACM,2007, pp.:325-332.
    [108]N. Liu, Y. Zhao, and Z. F. Zhu, and R. R. Ni. Commercial shot classification based on multiple features combination. IEICE Transaction on Information and System,2010, Vol.E93-D (9), pp.:2651-2655.
    [109]N. Liu, Y. Zhao, and Z. F. Zhu, and H. Q. Lu. Multi-modal characteristics analysis and fusion for TV commercial detection. In Proc. of the 2010 IEEE International Conference on Multimedia and Expo (ICME 2010), Singapore,2010, USA, IEEE Computer Society,2010, pp.: 831-836.
    [110]N. Liu, Y. Zhao, and Z. F. Zhu, and H. Q. Lu. Exploiting visual-audio-textual characteristics for automatic TV commercial block detection and segmentation. IEEE Transactions on Multimedia, to be appeared.
    [111]L. Y. Duan, J. Q. Wang, Y. T. Zheng, J.S. Jin, H. Q. Lu, and C. S. Xu. Segmentation, categorization, and identification of commercial clips from TV streams using multimodal analysis. In Proc. of the 14th ACM International Conference on Multimedia (ACM Multimedia 2006), Santa Barbara, USA 2006, New York, ACM,2006, pp.:201-210.
    [112]L. Y. Duan, Y. T. Zheng, J. Q. Wang, H. Q. Lu, and J. S. Jin. Digesting commercial clips from TV streams. IEEE Multimedia,2008, vol.15(1), pp.:28-41.
    [113]J. Q. Wang, L. Y. Duan, and Q. S. Liu, H. Q. Lu, and J. S. Jin. A multimodal scheme for program segmentation and representation in broadcast video streams. IEEE Transaction on Multimedia,2008, vol.10(3), pp.:393-408.
    [114]J. Q. Wang, L. Y. Duan, H. Q. Lu, and J. S. Jin. A semantic image category for structuring TV broadcast video streams. In Proc. of the 7th Pacific Rim Conference on Multimedia (PCM 2006), Advances in Multimedia Information Processing, Hangzhou, China,2006, Berlin, Springer,2006, pp.:279-286.
    [115]J. Q. Wang, L. Y. Duan, H. Q. Lu, J.S. Jin, and C. S. Xu. A mid-level scene change representation via audiovisual alignment. In Proc. of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Toulouse,2006, USA, IEEE,2006, pp.: 1-4.
    [116]Y. T. Zheng, L. Y. Duan, Q. Tian, and J. S. Jin. TV commercial classification by using multi-modal textual information. In Proc. of the 2006 IEEE International Conference on Multimedia and Expo (ICME 2006), Toronto, Canada,2006, USA, IEEE Computer Society,2006, pp.:497-500.
    [117]J. Q. Wang, L. Y. Duan, L. Xu, H. Q. Lu, and J. S. Jin. TV ad video categorization with probabilistic latent concept learning. In Proc. of 9th ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR 2007), Augsburg, Germany,2007, USA, ACM,2007, pp.. 217-226.
    [118]J. Q. Wang, H. Q. Lu, L. Y. Duan, and J. S. Jin. Commercial video retrieval with video-based bag of words, In Proc. of the 5th International Conference on Intelligent Multimedia Computing and Networking (IMMCN 2007), Salt Lake City, USA,2007, USA, WorldSciBook, 2007, pp.:1496-1502.
    [119]J. Q. Wang, L. Y. Duan, Q. S. Liu, H. Q. Lu, J. S. Jin. Robust commercial retrieval in video streams. In Proc. of the 2007 IEEE International Conference on Multimedia and Expo (1CME 2007). Beijing, China,2007, USA, IEEE Computer Society,2007, pp.:260-263.
    [120]C. Colombo, A. D. Bimbo, and P. Pala. Retrieval of commercials by semantic content:the semiotic perspective. Multimedia Tools and Applications,2001, vol.13, pp.:93-118.
    [121]M. Caliani, A. D. Bimbo, P. Pala, and C. Colombo. Commercial video retrieval by induced semantics. In Proc. of the 1998 International Workshop on Content-Based Access of Image and Video Databases (CAVID 1998), Bombay, India,1998, USA, IEEE,1998, pp.:72-80.
    [122]Y. Zhong, K. Karu, and A.K. Jain. Locating text in complex color images. Pattern Recognition,1995, vol.28(10), pp.:1523-1535.
    [123]A.K. Jain, and B. Yu. Automatic text location in images and video frames. Pattern Recognition,1998, vol.31(12), pp.:2055-2076.
    [124]M. R. Lyu, J. Q. Song, and M. Cai. A comprehensive method for multilingual video text detection, localization, and extraction. IEEE Transactions on Circuits and Systems for Video Technology,2005, vol.15(2), pp.:243-255.
    [125]C. K., Jung, Q. F. Liu, and J. K. Kim. A stroke filter and its application to text localization. Pattern Recognition Letters,2009, vol.30 (2), pp.:114-122.
    [126]J. Xi, X. S. Hua, X. R. Chen, W.Y. Liu, and H. J. Zhang. A video text detection and recognition system. In Proc. In Proc. of the 2001 IEEE International Conference on Multimedia and Expo (ICME 2001), Tokyo, Japan,2000, USA, IEEE Computer Society,2001, pp.:1080-1083.
    [127]J. Zhang, D. Goldof, and R. Kasturi. A new edge-based text verification approach for video. In Proc. of the 19th International Conference on Pattern Recognition (ICPR 2008), Tampa, USA, 2008, USA IEEE,2008, pp.:1945-1948.
    [128]K. G. Kim, K.C. Jung, and J. H. Kim. Texture-based approach for text detection in images using support vector machines and continuously adaptive mean shift algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence,2003, vol.25(12), pp.:1631-1639.
    [129]H. P. Li, D. Doermann, and O. Kia. Automatic text detection and tracking in digital video. IEEE Transactions on Image Processing,2000, vol.9(1), pp.:147-156.
    [130]H. P. Li and D. Doermann. A video text detection system based on automated training. In Proc. of the 15th International Conference on Pattern Recognition (ICPR 2000), Barcelona, Spain, 2000, USA, IEEE,2000, pp.:223-226.
    [131]D. T. Chen, J. Odobez, and H. Bourlard. Text detection and recognition in images and video frames. Pattern Recognition,2004, vol.37 (3), pp.:595-608.
    [132]W. Wu, D. T. Chen, and J. Yang. Integrating co-training and recognition for text detection. In Proc. of the 2005 IEEE International Conference on Multimedia and Expo (ICME 2005), Amsterdam, Netherlands,2005, USA, IEEE Computer Society,2005, pp.:1166-1169.
    [133]C. Jung, Q. F. Liu, and J. K. Kim. Accurate text localization in images based on SVM output scores. Image and Vision Computing,2009, vol.27 (9), pp.:1295-1301.
    [134]R. Yan and M. Naphade. Co-training non-robust classifiers for video semantic concept detection. In Proc. of the 2005 International Conference on Image Processing (ICIP 2005), Genoa, Italy,2005, USA, IEEE,2005, vol.1, pp.:1205-1208.
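    [135]N. Liu, Y. Zhao, and Z. F. Zhu. Video commercial text detection based on a co-training strategy (in Chinese). Journal of Beijing Jiaotong University (Natural Science Edition),2010, vol.34(5), p.17.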
    [136]A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proc. of the 11th Annual Conference on Computational Learning Theory (COLT 1998), Madison, USA, 1998, USA, ACM,1998, pp.:92-100.
    [137]R. Yan and M. Naphade. Semi-supervised cross feature learning for semantic concept detection in videos. In Proc. of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), San Diego, CA, USA,2005, USA, IEEE,2005, vol.1, pp.:657-663.
    [138]R. Yan and M. Naphade. Multi-modal video concept extraction using co-training. In Proc. of the 2005 IEEE International Conference on Multimedia and Expo (ICME 2005), Amsterdam, Netherlands,2005, USA, IEEE Computer Society,2005, pp.:514-517.
    [139]K. K. Sung and T. Poggio. Example-based learning for view-based human face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence,1998, vol.20 (1), pp.:39-51.
    [140]G. Tzanetakis and P. Cook. Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing,2002, vol.10(5), pp.:293-302.
    [141]B. S. Ong. Towards automatic music structural analysis:identifying characteristic within song excerpts in popular music. [Dissertation], University Pompeu Fabra, Barcelona,2005, pp.:
    [142]Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences,1997, vol.55(1), pp.:119-139.
    [143]Z. H. Zhou and M. Li. Tri-training:exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering,2005, vol.17(11), pp.:1529-1541.
    [144]A. M. Ferman, A. M. Tekalp, and R. Mehrotra. Robust color histogram descriptors for video segment retrieval and identification. IEEE Transactions on Image Processing,2002, vol.11(5), pp.:497-508.
    [145]A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In Proc. of the 25th International Conference on Very Large Data Bases (VLDB 1999), Edinburgh, UK, 1999, USA, Morgan Kaufmann Publishers,1999, pp.:518-529.
    [146]P. Indyk and R. Motwani. Approximate nearest neighbors:towards removing the curse of dimensionality. In Proc. of the Thirtieth Annual ACM Symposium on the Theory of Computing (STOC 1998), Dallas, Texas, USA,1998, USA, ACM,1998, pp.:604-613.
    [147]Y. Yu, M. Takata, and K. Joe. Similarity searching techniques in content-based audio retrieval via hashing. In Proc. of the 12th International Conference on Multimedia Modeling (MMM 2006), Advances in Multimedia Modeling, Beijing, China,2006, Berlin, Springer,2006, vol.1, pp.:397-407.
    [148]Y. Yu, M. Takata, and K. Joe. Index-based similarity searching with partial sequence comparison for query-by-content audio retrieval. In Proc. of the First Workshop on Learning Semantics of Audio Signals (LSAS 2006), Lecture Notes in Computer Science 4306, Athens, Greece, 2006, Berlin, Springer,2006, pp.:76-86.
    [149]S. Y. Hu. Efficient video retrieval by locality sensitive hashing. In Proc. of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), Philadelphia, USA,2005, USA, IEEE,2005, vol.2, pp.:449-452.
    [150]C. Zhu, W. S. Qi, W. Ser. A new successive elimination algorithm for fast block matching in motion estimation. In Proc. of the 2004 International Symposium on Circuits and Systems (ISCAS 2004), Vancouver, Canada,2004, USA, IEEE,2004, vol.3, pp.:733-736.
    [151]Y. J. Zhang. Image Engineering (Vol. II): Image Analysis (in Chinese). Beijing, Tsinghua University Press,2005, pp.:300-308.
    [152]LIBSVM, [Online], http://www.csie.ntu.edu.tw/~cjlin/libsvm/
    [153]B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics,2004, vol.32, pp.:407-499.
    [154]H. L. Lee, A. Battle, R. Raina, and A.Y. Ng. Efficient sparse coding algorithms. In Proc. of the Twenty-first Annual Conference on Neural Information Processing Systems (NIPS 2007), Advances in Neural Information Processing Systems 20, Vancouver, British Columbia, Canada, 2007,USA,MIT,2008,pp.:801-808.
    [155]I. Ramirez, P. Sprechmann, and G. Sapiro. Classification and clustering via dictionary learning with structured incoherence and shared features. In Proc. of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2010), San Francisco, CA, USA,2010, USA, IEEE,2010, pp.:3501-3508.
    [156]S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features:Spatial pyramid matching for recognizing natural scene categories. In Proc. of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), New York, NY, USA,2006,USA, IEEE,2006,pp.:2169-2178.
    [157]J. C. Yang, K. Yu, Y.H. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In Proc. of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), Miami, Florida, USA,2009, USA, IEEE, 2009,pp.:1794-1801.
    [158]J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Discriminative learned dictionaries for local image analysis. In Proc. of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, Alaska, USA,2008, USA, IEEE,2008,pp.:1-8.
    [159]M. A. Ranzato, F.J. Huang, Y. L. Boureau, and Y. LeCun. Unsupervised learning of invariant feature hierarchies with applications to object recognition. In Proc. of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), Minneapolis, Minnesota, USA,2007, USA,IEEE,2007,pp.:1-8.
    [160]K. Huang and S. Aviyente. Sparse Representation for Signal Classification. In Proc. of the Twentieth Annual Conference on Neural Information Processing Systems (NIPS 2006), Advances in Neural Information Processing Systems 19, Vancouver, British Columbia, Canada,2006, USA, MIT, 2007, pp.:1-8.
    [161]T. Serre, L. Wolf, and T. Poggio. Object recognition with features inspired by visual cortex. In Proc. of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), San Diego, CA, USA,2005, USA, IEEE,2005, vol.2, pp.:994-1000.
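    [162]H. Zhang, F. Wu, and Y. T. Zhuang. Research on cross-media correlation reasoning and retrieval (in Chinese). Journal of Computer Research and Development,2008, vol.45(5), pp.:869-876.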
