Research on Visual-Perception-Based H.264 Region-of-Interest Coding
Abstract
Video coding technology is one of the key technologies for the effective transmission and storage of video information, and an indispensable component of modern information technology. H.264/AVC (hereafter H.264) is the latest video coding standard jointly developed by the ITU and ISO/IEC. Throughout the development of video coding, the core design problem has been how to achieve optimal rate-distortion performance under constraints on complexity and delay. Researchers previously improved rate-distortion performance mainly by reducing spatial, temporal, and statistical redundancy; region-based video coding that exploits visual processing is now one of the hot research directions in this field. Visual neuroscience has established that the human visual system (HVS) perceives video scenes selectively: different regions or objects carry different visual importance. Conventional video coding algorithms, however, ignore this diversity of perception when compressing video. An in-depth study of how visual perception principles can improve the coding quality and computational efficiency of H.264 therefore has significant theoretical and practical value. Against this background, this thesis investigates visual-perception-based H.264 region-of-interest (ROI) coding.
     Chapter 1 explains the motivation for the topic, surveys and summarizes the state of research at home and abroad, and outlines the main research content and the structure of the thesis.
     To address the high computational complexity of global motion estimation (GME), Chapter 2 proposes a fast GME algorithm based on the cancellation and differencing of motion vector pairs. The algorithm has two steps: first, exploiting the symmetric cancellation property between motion vector pairs in opposite quadrants, it estimates the translational motion parameters; then, using the difference of motion vector pairs together with a confidence-judgment strategy, it estimates the transform motion parameters. Fast and reliable estimation of the global motion parameters lays the foundation for the work in the following three chapters.
     Chapter 3 proposes a moving-region detection algorithm in the H.264 compressed domain that takes H.264 side information, such as motion vectors and the sum of absolute differences (SAD), as input features and detects moving regions in three steps. First, global motion estimation/compensation and a two-stage spatio-temporal motion-vector filter rapidly detect regions with salient motion. Second, a χ² distribution is fitted to the SAD values at zero motion vectors, and a change-detection method based on an F hypothesis test rapidly detects moving regions containing small-amplitude motion. Finally, the results of the two steps are combined into the final moving-region map. Fast and reliable detection of moving regions lays the foundation for the motion-perception sub-model studied in the next chapter.
     Chapter 4 proposes a novel visual perception model that fuses temporal and spatial features to compute a visual perception map of the video scene, effectively simulating how the HVS perceives it. The model consists of three parts: a motion perception sub-model, a texture perception sub-model, and a spatial-position perception sub-model. The motion perception of the HVS is first modeled from visual features including motion velocity, motion direction, motion coherence, and biological motion, simulating the HVS's perception of moving regions. Texture perception is then modeled from the HVS mechanisms of visual sensitivity and visual masking, simulating the perception of texture complexity. Finally, spatial-position perception is modeled from the foveation and eye-movement-control mechanisms of the HVS, yielding spatial-position weights that adapt to the global motion type.
     Chapter 5 proposes a visual-perception-based H.264 region-of-interest coding algorithm. Built on information sharing between the visual perception model and the H.264 ROI encoder, it first computes the visual perception map with the proposed model and then allocates bit and computation resources according to that map, improving H.264 coding quality and computational efficiency. In the bit-allocation algorithm, an adaptive frequency-coefficient suppression technique is first derived from the HVS's insensitivity to distortion in high-frequency signals; the distribution of bit resources in video coding is then analyzed both theoretically and experimentally; finally, coding quality is improved through the visual perception map and an effective overall rate-control strategy. In the computation-allocation algorithm, based on an experimental analysis of the relation between optimal H.264 coding modes and scene-content features, an efficient fast mode-analysis algorithm driven by the visual perception map and the global motion type is proposed, improving computational efficiency.
     Chapter 6 summarizes the contributions and novelties of the thesis and suggests directions for future work.
Video coding technology, one of the key technologies for the effective transmission and storage of video information, plays an important part in modern information technology. H.264/AVC (H.264 for short) is the latest video coding standard jointly recommended by the ITU and ISO/IEC. Throughout the development of video coding technology, how to achieve optimal rate-distortion performance under constraints on complexity and allowed delay has remained the core problem of video coding design. In the past, the rate-distortion performance of video coding was mainly improved by reducing spatial, temporal, and statistical redundancy; nowadays, region-based video coding using visual processing has become a major research direction in the field. The perception of the human visual system (HVS) is selective: different regions or objects in a video scene have different levels of visual importance. Conventional video coding algorithms, however, ignore this diversity of perception. It is therefore of theoretical and practical value to study in depth how the principles of HVS visual perception can improve the compression and computational efficiency of H.264 encoding.
     Chapter 1 presents the significance of the research together with a brief survey of the current state of the field, and outlines the main content and structure of the thesis.
     Chapter 2 proposes a fast global motion estimation (GME) method based on the symmetric cancellation and differencing of motion vectors, reducing the computational complexity of GME. The method consists of two stages: the translational parameters are first obtained through the symmetric cancellation of motion vector pairs, and the transform parameters are then estimated from the differences of motion vector pairs combined with a confidence-judgment strategy. The effective and efficient estimation of the global motion parameters lays the foundation for the subsequent research.
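The cancellation/differencing idea can be illustrated with a minimal sketch. This is not the thesis's exact algorithm (function names, the simplified 4-parameter motion model, and the single-pair solution are illustrative assumptions); it only shows why averaging motion vectors at centre-symmetric positions isolates the translation, while differencing isolates the transform part.

```python
# Hypothetical sketch of symmetry cancellation for GME, assuming a
# simplified 4-parameter global motion model:
#   mv(x, y) = (a*x - b*y + tx,  b*x + a*y + ty)
# For block positions symmetric about the frame centre, (x, y) and
# (-x, -y), the linear (zoom/rotation) terms cancel when the two MVs are
# averaged, leaving the translation; their difference isolates the
# linear part.

def estimate_from_symmetric_pair(pos, mv_p, mv_q):
    """pos: (x, y) of the first block relative to the frame centre.
    mv_p, mv_q: motion vectors observed at (x, y) and (-x, -y)."""
    x, y = pos
    # Averaging cancels the transform component -> translation (tx, ty).
    tx = (mv_p[0] + mv_q[0]) / 2.0
    ty = (mv_p[1] + mv_q[1]) / 2.0
    # Differencing cancels the translation -> solve for (a, b).
    dx = (mv_p[0] - mv_q[0]) / 2.0   # = a*x - b*y
    dy = (mv_p[1] - mv_q[1]) / 2.0   # = b*x + a*y
    det = x * x + y * y
    a = (dx * x + dy * y) / det
    b = (dy * x - dx * y) / det
    return (tx, ty), (a, b)

# Synthetic check: generate MVs from known parameters and recover them.
a_true, b_true, t = 0.01, -0.005, (2.0, -1.0)
def mv(x, y):
    return (a_true * x - b_true * y + t[0], b_true * x + a_true * y + t[1])
trans, lin = estimate_from_symmetric_pair((40, 24), mv(40, 24), mv(-40, -24))
```

In practice many quadrant pairs would be combined (e.g. by robust averaging with a confidence check) rather than solving from a single pair as above.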
     Chapter 3 presents a novel moving-region detection method in the H.264 compressed domain, in which H.264 side information, including motion vectors (MVs) and sums of absolute differences (SAD), serves as the input features. The method comprises three steps. First, global motion estimation/compensation and a spatio-temporal MV filter detect moving regions with salient motion. Second, a χ² distribution is constructed for the SAD values at zero MVs, and a change-detection algorithm derived from an F hypothesis test detects moving regions containing small, non-salient motion. Finally, the results of the two steps are combined into the final moving-region map.
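The second step can be sketched as a variance-ratio test. The sketch below is illustrative, not the thesis's exact formulation: the residual model, the block size, and the threshold value are assumptions. The idea is that zero-MV residual energy in a static block behaves like χ² camera noise, so a block whose residual variance is significantly larger than the noise variance (an F-type variance-ratio test) is flagged as containing small motion.

```python
# Hedged sketch of an F-test style change detector on zero-MV residuals.
import numpy as np

def detect_small_motion(block_sse, n_pixels, noise_var, crit=2.2):
    """block_sse: per-block sum of squared zero-MV residuals (1-D array).
    n_pixels: pixels per block; noise_var: static-scene noise variance.
    crit: variance-ratio threshold, e.g. an F(n, n) critical value
    (2.2 roughly corresponds to F(64, 64) at alpha ~ 0.001)."""
    # Under H0 (static block) the residual energy is chi-square noise and
    # this ratio is ~F-distributed around 1; large values reject H0.
    ratio = (block_sse / n_pixels) / noise_var
    return ratio > crit

rng = np.random.default_rng(0)
resid = rng.normal(0.0, 2.0, (100, 64))   # zero-MV residuals, 100 blocks
resid[:5] += 6.0                          # small scene change in 5 blocks
mask = detect_small_motion((resid ** 2).sum(axis=1), 64, 4.0)
```

With SAD instead of a sum of squares the statistic changes, but the same reject-if-too-large logic applies.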
     Chapter 4 proposes a novel visual perception model, composed of motion, texture, and spatial-position perception sub-models, built by fusing spatio-temporal visual features. The motion perception of the HVS is first modeled by fusing motion features including velocity, direction, coherence, and biological motion, simulating the HVS's perception of moving regions. The texture perception of the HVS is then modeled on the visual-sensitivity and visual-masking mechanisms of the HVS, simulating its perception of texture complexity. Finally, the spatial-position perception of the HVS is modeled on the foveation and eye-movement mechanisms of the HVS, so that the sub-model can adaptively adjust the perceptual importance of positions in the video scene according to the global motion type.
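The fusion structure can be sketched as a weighted combination of normalized sub-model maps. The weights, the linear fusion rule, and the centre-distance position map below are illustrative assumptions, not the thesis's actual sub-models; the sketch only shows how a position weight can switch with the global motion type.

```python
# Illustrative fusion of motion, texture, and position maps into a
# visual perception map (VPM); all names and weights are assumptions.
import numpy as np

def normalize(m):
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def fuse_vpm(motion_map, texture_map, global_motion="static",
             weights=(0.5, 0.2, 0.3)):
    h, w = motion_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Foveation-style centre weighting; flattened for a moving camera,
    # where attention is no longer biased toward the frame centre.
    d = np.hypot((ys - h / 2) / h, (xs - w / 2) / w)
    position_map = (1.0 - d) if global_motion == "static" \
                   else np.full((h, w), 0.5)
    wm, wt, wp = weights
    return (wm * normalize(motion_map) + wt * normalize(texture_map)
            + wp * normalize(position_map))

motion = np.zeros((8, 8)); motion[2, 2] = 1.0   # one moving block
texture = np.zeros((8, 8))                       # flat texture everywhere
vpm = fuse_vpm(motion, texture)                  # moving block scores highest
```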
     Chapter 5 presents a novel visual-perception-based H.264 region-of-interest coding method that allocates bit and computation resources according to the visual perception map (VPM) computed by the proposed model. For bit allocation, an adaptive frequency-coefficient suppression technique is first derived from the principle that the HVS is less sensitive to distortion in high-frequency signals; the distribution of bit resources is then analyzed theoretically and experimentally; finally, bit allocation is optimized through a novel encoding-control strategy. For computation allocation, the relation between the optimal encoding mode and the content features of the video scene is experimentally analyzed, from which a fast and effective H.264 mode-analysis algorithm driven by the VPM and the global motion type is derived.
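The frequency-suppression idea can be sketched on a 4x4 transform block. The mapping from perceptual weight to the number of retained coefficients (`k_min`, `k_max`) is an illustrative assumption, not the thesis's adaptive rule; only the zig-zag scan order of the H.264 4x4 transform is standard. Zeroing trailing high-frequency coefficients in low-importance blocks sacrifices distortion the HVS tolerates best, freeing bits for the region of interest.

```python
# Minimal sketch of adaptive frequency coefficient suppression.
import numpy as np

# Zig-zag scan order of an H.264 4x4 transform block (low -> high freq.).
ZIGZAG = [(0,0),(0,1),(1,0),(2,0),(1,1),(0,2),(0,3),(1,2),
          (2,1),(3,0),(3,1),(2,2),(1,3),(2,3),(3,2),(3,3)]

def suppress_block(coeffs, weight, k_min=3, k_max=16):
    """coeffs: 4x4 quantised coefficients; weight in [0, 1] from the VPM.
    Keeps more low-frequency coefficients as perceptual weight grows."""
    keep = k_min + int(round(weight * (k_max - k_min)))
    out = np.zeros_like(coeffs)
    for r, c in ZIGZAG[:keep]:
        out[r, c] = coeffs[r, c]
    return out

block = np.arange(16).reshape(4, 4) + 1
roi = suppress_block(block, 1.0)   # ROI block: all 16 coefficients survive
bg  = suppress_block(block, 0.0)   # background: only 3 low-freq coefficients
```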
     The final chapter summarizes the contributions of the research and outlines prospects for future work.
