Research on Key Issues in Video Information Content Management

Original Chinese title: 视频信息内容管理关键问题研究
Author: Li Yuenan (李岳楠)
Degree level: Doctoral
Discipline: Information and Communication Engineering
Keywords: video information content management; shot boundary detection; video abstraction; video identification; content authentication; robust hash function
Degree year: 2010
Supervisor: Lu Zheming (陆哲明)
Discipline code: 081001
Degree-granting institution: Harbin Institute of Technology

Abstract

With the rapid development of network communication and multimedia technologies, the amount of video data has grown explosively. At the same time, video-oriented applications such as Internet TV, 3G video communication, video on demand (VOD) and video sharing have kept emerging in recent years. Together, the sheer volume of video information and the broadening of application scenarios have led to significant changes in how video is acquired, used and distributed: the conventional monolithic, passive modes of acquisition are being replaced by diverse, interactive multimedia services. Meanwhile, the ever-increasing volume of video information has also given rise to a series of technological and social problems. On the one hand, there is a growing demand for techniques for video organization, utilization, copyright management and content authentication. On the other hand, conventional indexing, retrieval and information security techniques cannot be applied directly to video information. Developing efficient and effective video information content management techniques that account for the characteristics of video has therefore become a major topic of interest in both academia and the multimedia industry.
     Taking the characteristics of video information as its point of departure, this dissertation addresses the key issues in video information management that arise from the needs of video applications. Its principal goal is to design effective content management schemes that enhance the usability and trustworthiness of video information. The research covers video structure parsing, video abstraction, video identification and video content authentication.
     The main work and contributions of this dissertation are as follows:
     A fast, general-purpose shot boundary detection framework that employs pre-processing techniques is proposed to address the high computational complexity of existing shot boundary detection algorithms. The aim is not to design a specific hard cut or gradual transition (GT) detection method, but to provide a widely applicable framework that makes shot boundary detection more efficient. Several pre-processing techniques are incorporated in the framework to eliminate non-boundary frames at an early stage and to predict the attributes of potential shot boundaries. In addition, a fast shot boundary detection algorithm that runs in parallel with video coding is proposed. It exploits the side information generated by the video encoder, so the detector can dispense with the computationally intensive feature extraction procedure. Experimental results indicate that both proposed methods significantly improve the efficiency of shot boundary detection while maintaining satisfactory detection accuracy.
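The pre-processing idea described above, discarding stretches of video that cannot contain a boundary before running a detailed detector, can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the intensity histogram feature, the fixed `window` and `threshold` values, and the function names `histogram`, `hist_distance` and `detect_cuts` are all assumptions made for illustration, and only hard cuts are handled.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of one frame (a flat list of pixel values)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def hist_distance(a, b):
    """L1 distance between two normalized histograms (range 0..2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_cuts(frames, window=8, threshold=0.5):
    """Return (cut_positions, histograms_computed).

    Pre-processing step (an assumed heuristic): if the first and last frames
    of a window look alike, the whole window is treated as boundary-free.
    """
    cache = {}

    def hist(i):
        # Compute each histogram lazily so skipped frames cost nothing.
        if i not in cache:
            cache[i] = histogram(frames[i])
        return cache[i]

    cuts = []
    start, last = 0, len(frames) - 1
    while start < last:
        end = min(start + window, last)
        if hist_distance(hist(start), hist(end)) >= threshold:
            # Candidate window: locate the largest adjacent-frame change.
            best = max(range(start, end),
                       key=lambda i: hist_distance(hist(i), hist(i + 1)))
            if hist_distance(hist(best), hist(best + 1)) >= threshold:
                cuts.append(best + 1)  # index of the first frame of the new shot
        start = end
    return cuts, len(cache)


# Synthetic clip: 20 dark frames followed by 20 bright frames.
frames = [[10] * 64] * 20 + [[200] * 64] * 20
cuts, computed = detect_cuts(frames)
print(cuts, computed)
```

On this synthetic clip the sketch finds the single cut at frame index 20 while computing histograms for only a fraction of the 40 frames. The shortcut has an obvious cost: a window whose first and last frames look alike is skipped even if something changed briefly in between, so a practical framework would need additional checks and a separate treatment of gradual transitions.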
     To facilitate video browsing, a video abstraction algorithm based on a visual attention model and on-line clustering is developed. Building on a detailed analysis of the visual neurophysiological mechanisms by which human attention is formed, the attention model is introduced into key frame extraction. Regions of interest (ROIs) are detected in each representative frame by simulating the roles that the functional units of the human visual system play in forming attention, and these serve as the basis for key frame extraction. To keep the key frame set concise, reduce memory consumption and allow the abstract to be displayed on the fly, an on-line clustering scheme operating on ROI features is proposed. Simulations show that the proposed key frame extraction algorithm is content adaptive and that the extracted key frames agree reasonably well with human subjective observation.
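One common way to realize on-line clustering is a single-pass "leader" scheme, sketched below in Python. The abstract does not specify the dissertation's clustering rule, so the distance threshold `radius`, the running-mean centroid update and the name `online_key_frames` are assumptions; the ROI feature vectors are taken as given.

```python
import math


def online_key_frames(features, radius):
    """Single-pass clustering of per-frame ROI feature vectors.

    A frame whose feature lies within `radius` of an existing centroid is
    absorbed into that cluster; otherwise it opens a new cluster and is
    reported as a key frame. Only the centroids are kept in memory.
    """
    centroids, counts, key_frames = [], [], []
    for index, feature in enumerate(features):
        best, best_dist = None, float("inf")
        for c_index, centroid in enumerate(centroids):
            dist = math.dist(feature, centroid)
            if dist < best_dist:
                best, best_dist = c_index, dist
        if best is not None and best_dist <= radius:
            # Incremental mean update: no past features need to be stored.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n
                               for c, x in zip(centroids[best], feature)]
        else:
            centroids.append(list(feature))
            counts.append(1)
            key_frames.append(index)
    return key_frames, centroids


# Hypothetical 2-D ROI features for five frames.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
print(online_key_frames(features, radius=1.0)[0])
```

Because each frame is processed once and then discarded, memory use grows with the number of clusters rather than the number of frames, and key frames can be shown as soon as they are found, which matches the goals stated above.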
     A video identification algorithm based on spatial-temporal salient points is also presented. Salient points are detected with the aid of the Harris detector and trajectory tracking techniques. Each point is evaluated in terms of both spatial saliency and temporal stability, and the points that score highest on both are selected as the features for video identification. Because salient points have no inherent order, the Hausdorff distance is employed as the metric for feature matching. In addition, a video identification algorithm based on non-negative matrix factorization (NMF) is proposed, for which the NMF update rules under the Euclidean norm criterion are derived. NMF is then applied to the input video to obtain basis images that capture the essence of its spatial-temporal content, and video sequences are identified via features of these basis images. Experiments demonstrate that both proposed algorithms achieve accurate video identification and outperform the algorithms they are compared with. In particular, the spatial-temporal salient points effectively resist geometric distortions.
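The two building blocks named above can be sketched in a few lines of dependency-free Python. `hausdorff` is the textbook symmetric Hausdorff distance, which compares point sets regardless of ordering. `nmf` uses the standard multiplicative updates for the Euclidean (Frobenius) cost; this is a stand-in, since the abstract does not give the update rules actually derived in the dissertation. All function names, the iteration count and the initialization are assumptions.

```python
import math
import random


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Factor a non-negative n x m matrix v as w (n x rank) times h (rank x m).

    If the columns of v are vectorized frames, the columns of w play the
    role of basis images.
    """
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(len(h[0]))] for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(rank)] for i in range(len(w))]
    return w, h


def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return math.sqrt(sum((x - y) ** 2
                         for rv, ra in zip(v, approx) for x, y in zip(rv, ra)))


# The same points in a different order are at distance zero.
print(hausdorff([(0, 0), (1, 0)], [(1, 0), (0, 0)]))
```

The multiplicative form keeps every entry of `w` and `h` non-negative as long as the initialization is positive, which is why no explicit projection step appears in the loop.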
     The application of robust hashing to video content authentication is also investigated. First, the extension of the concept and application areas of hash functions from generic data to multimedia data is described. A robust hash function based on random Gabor filtering and dithered lattice vector quantization (DLVQ) is then proposed. To enhance robustness against rotation, the conventional Gabor filter is adapted to be rotation invariant. To keep feature extraction secure, a key-dependent random Gabor filtering scheme is developed, and the relationship between the security and the randomness of robust hash functions is examined. To overcome the limitations of existing quantizers, a DLVQ-based quantization scheme is developed, and its effectiveness is supported by theoretical analysis and experiments. The results show that the proposed robust hash function performs well in terms of both robustness and discrimination; in particular, it is markedly more robust against rotation than representative algorithms. In addition, a robust video hashing algorithm based on spatial-temporal energy relationships is developed, tailored to the characteristics of video. The energy relationships between different regions of the video are extracted using random block partitioning and a three-dimensional signal transform. The proposed algorithm is considerably more robust than existing video hashing functions, and analysis shows that its feature extraction stage exhibits a high degree of randomness.
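To make the quantization stage more concrete, the following Python sketch applies a key-dependent dither to a feature vector and then rounds it to the nearest point of a lattice. The abstract does not say which lattice or dither distribution the dissertation uses, so the choice of the D_n lattice (integer vectors with an even coordinate sum), the `step` scaling and the names `nearest_dn`, `key_dither` and `dlvq_hash` are assumptions for illustration only.

```python
import random


def nearest_dn(x):
    """Nearest point of the D_n lattice (integer vectors with an even sum)."""
    f = [round(v) for v in x]
    if sum(f) % 2 == 0:
        return f
    # Odd sum: move the worst-rounded coordinate to its second-nearest integer.
    k = max(range(len(x)), key=lambda i: abs(x[i] - f[i]))
    f[k] += 1 if x[k] > f[k] else -1
    return f


def key_dither(n, key):
    """Key-dependent dither vector, uniform over the Voronoi cell of D_n."""
    rng = random.Random(key)
    # [0, 2) x [0, 1)^(n-1) is a fundamental region of D_n; reducing a uniform
    # sample from it modulo the lattice gives a uniform sample of the cell.
    u = [rng.uniform(0, 2)] + [rng.uniform(0, 1) for _ in range(n - 1)]
    return [a - b for a, b in zip(u, nearest_dn(u))]


def dlvq_hash(features, key, step=1.0):
    """Quantize a feature vector to a lattice point after adding the dither."""
    dither = key_dither(len(features), key)
    scaled = [f / step + d for f, d in zip(features, dither)]
    return tuple(nearest_dn(scaled))


# Hypothetical 4-D feature vector and secret key.
print(dlvq_hash([3.2, -1.7, 0.4, 5.9], key=42, step=0.5))
```

In this sketch `step` trades robustness against discrimination: a larger step absorbs larger feature perturbations without changing the hash, but maps more distinct inputs to the same lattice point. The dither depends only on the key, so the cell boundaries are unknown to anyone who does not hold it.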

