Research on Video Copy Detection Algorithms Based on Visual Hashing
Abstract
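With the development of multimedia technology, online video now spreads quickly and easily. Because digital video is so easy to shoot, edit, and process, thousands of new videos are created every day. At the same time, pirates often edit videos (attacks such as adding noise, adding borders, resizing, filtering, picture-in-picture, adding subtitles, JPEG compression, and contrast changes), so pirated copies multiply and spread rapidly, seriously harming copyright owners. As a result, video copy detection has become a research focus in multimedia copyright management and is being applied to video tracking, video content retrieval, video content authentication, copyright protection, and video content filtering. Building a more robust video copy detection system model has therefore become a key research problem in China and abroad.
     This thesis first introduces the basic theory of video copy detection systems and then presents a spatio-temporal joint hashing algorithm for video copy detection. On this basis, and to address the shortcomings of current video copy detection algorithms, it combines the comprehensiveness of joint spatio-temporal features in representing video content with the robustness contributed by ordinal features. Visual attention regions are the regions of an image that most attract a viewer's interest and best convey its content, and features of these regions can substantially improve the efficiency and accuracy of image processing and analysis. The thesis therefore introduces a human visual attention model and proposes a video copy detection algorithm based on visual attention, studying how the model is applied in forming the video hash and in weighting it. Finally, the thesis describes improvements to the visual-attention-based algorithm and their contribution to the recall and precision of video copy detection.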
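     The main innovations and contributions of this thesis fall into the following four areas:
     (1) A spatio-temporal joint hashing algorithm for video copy detection is proposed. Treating a video as a set of temporally continuous frames, the algorithm extracts joint spatio-temporal features rather than only temporal or only spatial features, as earlier methods did. Feature extraction schemes based on color histograms and motion vectors are incomplete, because of the spatial distribution of frame colors and the changes in frame edge information caused by brightness changes and blocking artifacts. The algorithm instead uses ordinal features of frame blocks to extract the fingerprint of the video content, which gives better detection performance.
     (2) A video copy detection algorithm based on visual attention is proposed. The algorithm takes into account the influence of the human visual system on the extraction of video content features and incorporates human attention into the detection system model. Frame blocks are given different weights according to how much attention the human eye pays to them, so that the hash bits no longer carry uniform weight during hash matching. Feature extraction and analysis are thus closer to human perception.
     (3) The application of the visual attention model to video hash formation is described. Previously, the video hash fingerprint was obtained directly from the ordinal features of frame blocks. Here, one binary sequence is computed from the temporally informative representative image and another from the visual saliency map, and the two are fused to give the final video hash fingerprint. The resulting fingerprint incorporates human visual attention, and experimental results show that precision improves while recall is maintained.
     (4) A further improvement of the visual attention model within the video copy detection system is described. To further raise recall and precision, the binary sequence of the temporally informative representative image and that of the visual saliency map are first fused into a binary sequence for the video segment. The attention model is then applied again: following the characteristics of the human eye, the representative image is divided into blocks and a weight is computed for each block. These weights are assigned to the binary sequence of the video segment to give the final video hash fingerprint used in hash matching. The stability of the experimental results provides a useful reference for video copy detection.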
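The ordinal block feature named in contribution (1) can be illustrated with a minimal sketch. It assumes a grayscale frame stored as a list of pixel rows and a fixed block grid; the function names, the 2x2 grid, and the median-threshold binarization are illustrative assumptions, not details taken from the thesis.

```python
from statistics import median

def block_means(frame, rows, cols):
    """Mean intensity of each block of a grayscale frame, in row-major order."""
    h, w = len(frame), len(frame[0])
    bh, bw = h // rows, w // cols  # block size; leftover edge pixels are ignored
    means = []
    for r in range(rows):
        for c in range(cols):
            pixels = [frame[y][x]
                      for y in range(r * bh, (r + 1) * bh)
                      for x in range(c * bw, (c + 1) * bw)]
            means.append(sum(pixels) / len(pixels))
    return means

def ordinal_ranks(values):
    """Rank of each value (0 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks

def binary_hash(values):
    """Assumed binarization: 1 if a block mean exceeds the median, else 0."""
    m = median(values)
    return [1 if v > m else 0 for v in values]

# Toy 4x4 frame split into a 2x2 grid of blocks (made-up pixel values).
frame = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [90, 90, 50, 50],
    [90, 90, 50, 50],
]
means = block_means(frame, 2, 2)   # [10.0, 200.0, 90.0, 50.0]
ranks = ordinal_ranks(means)       # [0, 3, 2, 1]
bits = binary_hash(means)          # [0, 1, 1, 0]

# A uniform brightness shift (without clipping) leaves the ranks unchanged.
brighter = [[min(255, p + 30) for p in row] for row in frame]
brighter_ranks = ordinal_ranks(block_means(brighter, 2, 2))
```

Because only the ordering of block means is used, the ranks survive global brightness changes that would alter a color histogram, which is the kind of robustness the ordinal feature is credited with above.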
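     The weighted hashing algorithm based on visual attention proposed in this thesis shows good robustness and discrimination in video copy detection. The weight sequence obtained from the attention model is assigned to the corresponding binary bit stream to give the final weighted video hash fingerprint, so that the extraction and analysis of video content features are closer to human perception.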
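Hash matching with non-uniform bit weights might look like the following sketch. It assumes the comparison is a normalized weighted Hamming distance, which the abstract does not state explicitly; the function name and the weight values are hypothetical.

```python
def weighted_hamming(hash_a, hash_b, weights):
    """Normalized weighted Hamming distance between two equal-length bit lists."""
    if not (len(hash_a) == len(hash_b) == len(weights)):
        raise ValueError("hashes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Only mismatching bit positions contribute their weight.
    mismatch = sum(w for a, b, w in zip(hash_a, hash_b, weights) if a != b)
    return mismatch / total

query = [1, 0, 1, 1]
reference = [1, 1, 1, 0]

uniform = [1, 1, 1, 1]   # every bit counts equally
salient = [4, 1, 4, 1]   # hypothetical weights: bits 0 and 2 come from salient blocks

d_uniform = weighted_hamming(query, reference, uniform)  # 0.5
d_salient = weighted_hamming(query, reference, salient)  # 0.2
```

In this toy case the two hashes differ only in low-weight positions, so the weighted distance is smaller than the uniform one: mismatches outside the attended regions are penalized less.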
With the development of multimedia technology, online video spreads quickly and conveniently. Digital video is so easy to shoot, edit, and process that thousands of videos are created every day. Meanwhile, pirates often edit videos (for example by adding noise, adding borders, rescaling, filtering, inserting picture-in-picture, adding subtitles, applying JPEG compression, or changing contrast), so pirated videos also multiply and propagate rapidly, seriously violating the interests of copyright owners. As a result, video copy detection has gradually become a research hotspot in multimedia copyright processing and is coming into use in video tracking, video content retrieval, video content authentication, copyright protection, and video filtering. How to build a more robust video copy detection system model has therefore become a key research topic both at home and abroad.
     This thesis first introduces the basic theory of video copy detection systems and then presents a spatio-temporal joint hashing algorithm for video copy detection. On this basis, it takes into account the effect of the human perceptual system on video content features, introduces a human visual attention model, and proposes a video copy detection algorithm based on visual attention. It studies the application of the visual attention model in video hash formation and in video hash weighting. Finally, it describes improvements to the visual-attention-based algorithm and their contribution to the recall and precision of video copy detection.
     The main innovations and contributions of this thesis include the following four aspects:
     (1) A spatio-temporal joint hashing algorithm for video copy detection is proposed. The algorithm treats a video as a series of temporally continuous frames and extracts joint temporal and spatial features, instead of only temporal or only spatial features as in earlier work. Because of the spatial distribution of frame colors and the changes in image edge information caused by brightness changes and blocking artifacts, feature extraction schemes that use color histograms and motion vectors are incomplete. Here the ordinal features of video frame blocks are used to extract the fingerprint of the video content, which turns out to perform better in detection.
     (2) A video copy detection algorithm based on visual attention is proposed. The algorithm fully considers the influence of the human visual system on the extracted video content features and adds human attention to the video copy detection system model. According to how much attention the human eye pays to the video content, it gives different weights to the video frame blocks, so the hash bits no longer share a uniform weight during hash matching. The extraction and analysis of video content features then accord better with human perception.
     (3) The application of the visual attention model in hash formation is described. Previously, the video hash fingerprint was formed directly from the ordinal features of video frame blocks. The improvement is to compute a binary sequence feature from the temporally informative representative image and another from the visual saliency map, and then to combine the two to obtain the final video hash fingerprint. A content fingerprint extracted in this way includes human visual attention. Experiments show that it improves precision while maintaining recall.
     (4) A further improvement of the visual attention model in the video copy detection system is described. To further improve recall and precision, the binary sequence features of the temporally informative representative image and of the visual saliency map are first combined into a binary sequence for the video clip. The attention model is then used again: the representative image is divided into blocks according to the characteristics of the human eye, a weight is computed for each block, and these weights are assigned to the binary sequence of the video clip to obtain the final hash fingerprint for hash matching. The stability of the experimental results provides a useful reference for video copy detection.
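The per-block weighting step in item (4) can be sketched as follows. The abstract does not say how block weights are derived from the attention model, so this example assumes each block's weight is its share of the total saliency in a saliency map; the function name, the grid size, and the saliency values are hypothetical.

```python
def block_weights(saliency, rows, cols):
    """Share of total saliency falling in each block, in row-major order."""
    h, w = len(saliency), len(saliency[0])
    bh, bw = h // rows, w // cols  # block size; leftover edge pixels are ignored
    sums = []
    for r in range(rows):
        for c in range(cols):
            sums.append(sum(saliency[y][x]
                            for y in range(r * bh, (r + 1) * bh)
                            for x in range(c * bw, (c + 1) * bw)))
    total = sum(sums)
    if total == 0:
        # Flat saliency map: fall back to uniform weights.
        return [1 / len(sums)] * len(sums)
    return [s / total for s in sums]

# Toy 4x4 saliency map split into a 2x2 grid (made-up values).
saliency = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
]
weights = block_weights(saliency, 2, 2)  # approximately [0, 1/3, 2/3, 0]
```

Each weight would then be attached to the hash bit derived from the same block, so that bits from highly attended regions dominate the match score.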
     A novel video hashing algorithm is proposed that takes visual saliency into account during hash generation. In the proposed algorithm, the weights obtained from the attention model are assigned to the corresponding hash bits. Experiments on different kinds of videos under different kinds of attacks verify that the proposed algorithm performs well in robustness and discrimination.
