Abstract
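With the rapid development of network communication and multimedia technologies, the volume of video information has grown explosively in recent years. Video-centric applications have proliferated accordingly, including Internet TV, 3G video communication, video on demand, and video sharing. This has profoundly changed how video information is acquired and distributed: the traditional single-channel, passive mode of information acquisition is giving way to diversified, interactive media services. At the same time, the growth in the amount of video information and the expansion of its application modes have exposed a range of technical and social problems. On the one hand, demand for video organization, utilization, copyright management, and content authentication keeps increasing. On the other hand, conventional indexing, retrieval, and information security techniques are difficult to apply directly to video information. Consequently, developing complete and efficient content management mechanisms tailored to the characteristics of video information has become a topic of wide interest in both academia and the multimedia industry.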
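Starting from the basic characteristics of video information and focusing on the needs that arise in video applications, this dissertation studies key problems in video information management. The work aims to improve the usability and trustworthiness of video information by designing effective content management mechanisms. The research covers video structure parsing, video abstraction, video content identification, and video content authentication.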
The research work and contributions of this dissertation are as follows:
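(1) A fast, general framework for shot boundary detection is proposed to address the high computational complexity of existing shot boundary detection algorithms. Rather than being tied to a specific cut or gradual transition detection algorithm, the work aims at a general, widely applicable framework that improves the efficiency of shot detection. The framework uses several pre-processing techniques to discard non-boundary regions at an early stage and to predict the attributes of shot boundaries. In addition, a fast shot detection algorithm that runs in parallel with video encoding is proposed; it exploits the side information generated during encoding to assist shot detection. Simulation experiments show that the proposed algorithms significantly improve shot detection efficiency while achieving satisfactory detection accuracy.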
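(2) A video abstraction algorithm based on a visual attention model and online clustering is proposed. Building on a detailed analysis of the visual neurophysiological mechanisms underlying the formation of attention, the attention model is introduced into key frame extraction. Key objects within a frame are detected automatically by simulating the roles that the functional units of the visual system play in forming attention, and these objects serve as the basis for key frame extraction. To keep the key frame set concise, reduce storage requirements, and enable real-time display of the abstract, an online clustering scheme operating on region-of-interest features is proposed. Simulation experiments show that the algorithm is content adaptive and that the extracted key frames agree reasonably well with subjective observation.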
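(3) A video identification algorithm based on spatio-temporal salient points is proposed. Building on the Harris salient point detector and motion trajectory tracking, it measures the spatial saliency and temporal stability of salient points and selects the most stable spatio-temporal salient points as identification features. The Hausdorff distance is introduced into feature matching to handle the unordered nature of salient points. A second video identification algorithm based on non-negative matrix factorization (NMF) is also proposed, and the NMF algorithm under the Euclidean norm criterion is derived. On this basis, NMF is used to extract basis images that jointly capture the essential spatio-temporal content of the video, and these basis images serve as the starting point for identification. Experimental results show that both algorithms achieve accurate video identification and outperform the algorithms they are compared with. Moreover, spatio-temporal salient points effectively resist the impact of geometric distortion on video identification.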
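(4) The application of robust hash functions to video content authentication is studied, and the extension of the concept and application domains of hash functions is described. A robust hash function based on random Gabor filtering and dithered lattice vector quantization is proposed. Rotation-invariant Gabor filters are constructed to strengthen the robustness of the hash against rotation. To secure feature extraction, a key-dependent random Gabor filtering scheme is proposed, and the relationship between security and randomness in robust hash functions is examined. To overcome the limitations of existing quantizers, a quantization scheme based on dithered lattice vector quantization is proposed, and its effectiveness is discussed through theoretical analysis and experimental validation. Experimental and analytical results show that the algorithm performs well in both robustness and discrimination, and that its robustness against rotation in particular is clearly better than that of representative algorithms. In addition, a robust hash function based on spatio-temporal energy relationships is proposed to suit the characteristics of video information. It extracts the energy relationships between different regions of a video by means of random pixel block partitioning and a three-dimensional signal transform. Compared with existing video robust hash functions, it offers a significant improvement in robustness, and the analysis shows that its feature extraction stage has a high degree of randomness.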
With the rapid development of network communication and multimedia technologies, the amount of video data has grown explosively. At the same time, video-oriented applications such as Internet TV, 3G video communication, video on demand (VOD), and video sharing have kept emerging in recent years. Consequently, the vast amount of video information and the extension of application scenarios have led to significant changes in the ways video is acquired, used, and distributed. Conventional monotonous and passive video acquisition modes are being replaced by diverse and interactive multimedia services. Meanwhile, the ever-increasing volume of video information also gives rise to a series of technological and social problems. There is a strong demand for video organization, utilization, copyright management, and content authentication techniques. However, conventional indexing, retrieval, and information security techniques cannot simply be extended to video information. Therefore, developing efficient and effective video information management techniques has become a major topic of interest in both academia and the multimedia industry.
Taking the characteristics of video information as the point of departure, this dissertation addresses the technical issues arising from video applications. Its principal goal is to design effective content management schemes that enhance the usability and trustworthiness of video information. The research focuses on video structure parsing, video abstraction, video identification, and content authentication.
The main work and contributions of this dissertation are as follows:
A fast shot boundary detection framework that employs pre-processing techniques is proposed. The aim of this work is not to design a specific hard cut or gradual transition (GT) detection method. Instead, we concentrate on a general framework that improves the efficiency of shot boundary detection. Several pre-processing techniques are incorporated in the framework to eliminate non-boundary frames and to predict the attributes of potential shot boundaries. We also propose a fast shot boundary detection scheme that runs in parallel with video encoding. The side information generated by the video encoder is exploited to facilitate shot boundary detection, so the detector can dispense with the computationally intensive feature extraction procedure. Experimental results indicate that both proposed methods effectively improve the efficiency of shot boundary detection while maintaining detection accuracy at a satisfactory level.
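The idea of discarding non-boundary frames before running a full detector can be illustrated with a minimal sketch. The abstract does not say which pre-processing techniques the framework uses, so everything below is an assumption made for illustration: frames are flat lists of gray levels, and the hypothetical `candidate_segments` keeps a segment only when the histograms of its first and last frames differ by more than a hypothetical `threshold`.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized gray-level histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for p in frame:
        counts[p * bins // max_value] += 1
    n = len(frame)
    return [c / n for c in counts]


def hist_distance(a, b):
    """L1 distance between two normalized histograms (range 0 to 2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def candidate_segments(frames, segment_len=4, threshold=0.5):
    """Return (start, end) frame index pairs that may contain a shot boundary.

    Assumption for illustration: a segment whose first and last frames have
    similar histograms is treated as boundary-free and skipped.
    """
    candidates = []
    for start in range(0, len(frames) - 1, segment_len):
        end = min(start + segment_len, len(frames) - 1)
        d = hist_distance(histogram(frames[start]), histogram(frames[end]))
        if d > threshold:
            candidates.append((start, end))
    return candidates


# Eight dark frames followed by eight bright frames: one cut between frames 7 and 8.
frames = [[10] * 16 for _ in range(8)] + [[200] * 16 for _ in range(8)]
segments = candidate_segments(frames)
```

Only the segments returned here would be handed to a more expensive cut or gradual transition detector, which is where the efficiency gain would come from under these assumptions.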
To facilitate video browsing, a video abstraction algorithm based on a visual attention model and on-line clustering is developed. We first investigate the neurophysiological mechanisms of human visual attention, and on that basis employ a visual attention model in video abstraction. Regions of interest (ROIs) are detected in each representative frame by simulating the functions that the components of the human visual system perform in forming attention. To reduce memory consumption and achieve on-the-fly key frame presentation, an on-line clustering scheme is proposed. Simulations show that the proposed key frame extraction algorithm is content adaptive and that the extracted key frames agree reasonably well with human perception.
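On-line clustering processes each feature vector once, as it arrives, which is what keeps memory use low. The abstract does not specify the clustering rule, so the following is only a generic sequential (leader-follower) sketch: the function name `online_cluster`, the Euclidean metric, and the `radius` parameter are all assumptions, and the ROI features are stand-in tuples.

```python
import math


def online_cluster(features, radius):
    """Cluster feature vectors in a single pass.

    Each vector joins the nearest existing cluster if its centre lies within
    `radius`; otherwise it opens a new cluster. Centres are running means.
    Returns (centres, labels).
    """
    centres, counts, labels = [], [], []
    for x in features:
        best, best_d = -1, float("inf")
        for i, c in enumerate(centres):
            d = math.dist(x, c)
            if d < best_d:
                best, best_d = i, d
        if best >= 0 and best_d <= radius:
            counts[best] += 1
            n = counts[best]
            # Incremental mean update: no past feature vectors need to be stored.
            centres[best] = [ci + (xi - ci) / n for ci, xi in zip(centres[best], x)]
            labels.append(best)
        else:
            centres.append(list(x))
            counts.append(1)
            labels.append(len(centres) - 1)
    return centres, labels


# Hypothetical 2-D ROI features from five candidate frames.
roi_features = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0), (0.1, 0.1)]
centres, labels = online_cluster(roi_features, radius=1.0)
```

In a key frame setting, one frame per cluster (for example the first one assigned to it) could then be shown as soon as the cluster is opened.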
We also present a video identification algorithm based on spatio-temporal salient points. The salient points are detected with the aid of the Harris detector and trajectory tracking techniques. Each salient point is evaluated in both the spatial and the temporal domain, and the points with the highest spatial saliency and temporal stability are selected as the features for video identification. To cope with the arbitrary order of salient points, the Hausdorff distance is employed as the metric for feature comparison. In addition, a video identification algorithm based on non-negative matrix factorization (NMF) is proposed. The update rule of NMF under the Euclidean norm criterion is derived in this work. NMF is then performed on the input video to obtain basis images that represent the essence of its spatio-temporal content, and video sequences are identified via the features of these basis images. Experiments demonstrate that the proposed algorithms achieve accurate video identification and outperform the algorithms they are compared with. In particular, the spatio-temporal salient points effectively resist geometric distortions.
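Two of the building blocks named above have compact standard forms, sketched here in plain Python. The first function is the classical symmetric Hausdorff distance, which compares point sets without relying on point order; the dissertation may use a variant. The second applies the widely used multiplicative update rules for NMF under the squared Euclidean (Frobenius) error, H ← H ∘ (WᵀV) / (WᵀWH) and W ← W ∘ (VHᵀ) / (WHHᵀ); this may differ in form from the rule derived in the dissertation. The helper names, the random initialization, and the small constant `eps` are assumptions.

```python
import math
import random


def hausdorff(a, b):
    """Classical symmetric Hausdorff distance between two non-empty point sets."""
    def directed(p, q):
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def frobenius_error(v, w, h):
    """Squared Euclidean (Frobenius) reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rwh in zip(v, wh) for a, b in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# The same two points in a different order are at distance zero.
d_same = hausdorff([(0, 0), (1, 0)], [(1, 0), (0, 0)])
d_extra = hausdorff([(0, 0), (1, 0), (1, 3)], [(0, 0), (1, 0)])

# Toy matrix standing in for vectorized frames (one frame per column).
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
w, h = nmf(v, rank=1)
```

Because the multiplicative updates only rescale entries by non-negative ratios, W and H stay non-negative throughout, which is what lets the columns of W be read as basis images.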
The application of robust hashing to content authentication is also investigated in this work. First, the extension of the concept of a hash function from generic data to multimedia data is elaborated. We then propose a robust hash function based on random Gabor filtering and dithered lattice vector quantization (DLVQ). To enhance robustness against rotation, the conventional Gabor filter is adapted to be rotation invariant. To secure feature extraction, a key-dependent random filtering scheme is developed, and the relationship between the security and the randomness of a robust hash function is investigated. Considering the limitations of existing quantization schemes, a DLVQ-based quantization scheme is developed, and its effectiveness is supported by analytical and experimental results. The proposed robust hash performs well in terms of both robustness and discrimination, and its robustness against rotation in particular is clearly better than that of representative algorithms. In addition, a video hashing algorithm based on spatio-temporal energy relationships is developed. The energy relationships between different regions are computed using random block partitioning and a three-dimensional signal transform. The proposed algorithm outperforms existing video hashing algorithms in terms of robustness, and analytical results show that its feature extraction stage exhibits a high degree of randomness.
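The principle of dithered lattice quantization can be shown with a minimal sketch. The abstract does not state which lattice or dither distribution DLVQ uses, so the sketch assumes the simplest case: the scaled integer lattice step·Zⁿ and a uniform dither generated from a secret key. The function names and the parameters `step` and `key` are hypothetical.

```python
import math
import random


def make_dither(n, step, key):
    """Key-dependent dither vector, uniform in [-step/2, step/2] per component."""
    rng = random.Random(key)
    return [rng.uniform(-step / 2, step / 2) for _ in range(n)]


def quantize(x, dither, step):
    """Add the dither, then map to the nearest point of the lattice step * Z^n.

    Returns the integer lattice coordinates, which could serve as hash symbols.
    """
    return [math.floor((xi + di) / step + 0.5) for xi, di in zip(x, dither)]


def reconstruct(q, dither, step):
    """Subtract the dither from the chosen lattice point."""
    return [qi * step - di for qi, di in zip(q, dither)]


features = [0.3, 1.7, -2.2]          # hypothetical feature vector
dither = make_dither(len(features), step=0.5, key=1234)
indices = quantize(features, dither, step=0.5)
approx = reconstruct(indices, dither, step=0.5)
```

Without the key, the positions of the quantization cell boundaries are unknown, while the reconstruction error of every component stays within half a step. A lattice with better packing properties than Zⁿ would replace the rounding step in `quantize`.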