Research on Video Adaptation Techniques Based on Video Signal Representation
Abstract
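Network-based multimedia applications are the inevitable trend in the development of multimedia technology. The diversity of multimedia application environments gives end users flexible and convenient ways to obtain multimedia information, improving their work efficiency and quality of life. However, this diversity also makes media information difficult to access. Video adaptation is currently the most promising research direction for solving these problems and achieving universal media access.
     This dissertation takes video adaptation as its subject and carries out exploratory research in this field. We first analyze the existing research framework for video adaptation and point out its main problem: it does not lend itself to the design of general and efficient video adaptation techniques. To overcome this shortcoming, we propose a new research framework and corresponding algorithms. The main contributions are:
     ① We propose a research scheme for adaptation based on video signal description, which focuses on how to describe the original video signal so that it is easy to adapt. We further propose representing the video signal as a set of feature descriptions, each of which reflects certain characteristics of the original signal. These characteristics make it possible to design simple and efficient adaptation operations.
     ② Based on this framework, we propose a motion information description algorithm for bit-rate adaptation. We first propose a hierarchical model based on block partition modes to describe motion information, then a pre-encoding algorithm to obtain it, and finally a fast algorithm to extract it. The resulting motion description supports fast bitstream generation: when video at a given bit rate is required, the corresponding motion information is extracted from the description and used for encoding, and because no motion estimation is needed, the bitstream can be generated quickly.
     ③ We propose an algorithm for extracting rate-distortion information that can be applied to bit allocation. We first analyze the dependency in predictive frame coding, propose a linear model to describe it, and define the concept of an impact factor. On this basis, we propose an algorithm for modifying the rate-distortion function of a single frame. The modified function implicitly accounts for the dependency and reflects the relationship between the overall distortion and the bit rate of a single frame. Applying the modified rate-distortion functions to bit allocation improves overall coding performance.
     ④ We propose a spatial adaptation solution based on attention information. Users of mobile terminals have difficulty viewing high-resolution video because of limited display size. Drawing on the idea of regions of interest (ROI), we propose converting high-resolution video into low-resolution video composed of the regions of interest, which overcomes the display-size limitation while retaining as much of the information in the original video as possible and improving the viewing experience. The solution comprises algorithms for extracting and encapsulating attention information, an attention-based adaptive adjustment algorithm for the quantization parameter, and a fast mode decision algorithm.
     In summary, this dissertation explores video adaptation in depth and obtains some valuable results. The proposed research framework for video adaptation is still at an early stage, and many open problems remain that merit further study.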
Internet-based multimedia applications are the evolutionary trend of multimedia technology. Today's diverse media environments give users flexible access to media data, which improves their working efficiency and quality of life. However, this diversity also makes media data access seriously difficult. Video adaptation is the most promising approach to achieving universal media access.
     This dissertation studies video adaptation techniques. First, we analyzed the existing video adaptation framework and found that it is not well suited to designing general and efficient video adaptation operations. Based on this analysis, we proposed a novel video adaptation framework and several related algorithms. The main contributions of this dissertation are summarized as follows.
     ① We proposed a video adaptation framework based on video signal representation, in which the key question is how to describe the video signal so that it is easy to adapt. We also proposed a set of feature descriptors as the representation of the video signal. Each descriptor reflects certain characteristics of the original video signal. By making use of these descriptors, we can develop simple and efficient video adaptation operations.
     ② Based on the framework above, we proposed a motion description algorithm for bit-rate adaptation. First, we proposed a mode-based hierarchical model for motion information. Second, we proposed a pre-encoding method to obtain the motion descriptor. We also developed a fast method for extracting motion information from it. The generated motion descriptor can be used for fast bitstream generation: when encoding a video at a given bit rate, the encoder extracts the appropriate motion information from the descriptor. Because motion estimation is skipped, the encoding complexity is greatly reduced.
     ③ We proposed an algorithm to extract rate-distortion information from the original video signal for use in bit allocation. We analyzed the dependency among predictive frames and proposed a linear model for it. We also introduced the definition of an impact factor. Based on the linear model, we proposed a method to modify the rate-distortion function of each frame. The modified functions take the dependency into account and reflect the relationship between the overall distortion and the rate of a single frame. Using these modified functions for bit allocation improves the overall encoding performance.
     ④ We proposed a spatial adaptation framework based on attention information. The limited display size of mobile devices degrades the viewing experience when users browse high-resolution videos. Based on the idea of regions of interest (ROI), we proposed transforming high-resolution videos into low-resolution ones composed of the attention areas in each frame, which satisfies the constraint imposed by the limited display size. At the same time, most of the attention information in the original video signal is retained. With this framework, the viewing experience of mobile users is improved. The framework includes three algorithms: extraction and encapsulation of attention information, adaptive adjustment of the quantization parameter (QP) based on attention information, and fast mode decision for transcoding.
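     The idea in ④ of building a low-resolution video from attention areas can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes a per-block attention map with non-negative scores is already available for a frame, and it simply picks the fixed-size window with the largest total attention using a summed-area table. The function name `best_window` and the map layout are hypothetical.

```python
def best_window(attention, win_h, win_w):
    """Return (top, left, score) of the win_h x win_w window with the
    largest total attention. `attention` is a 2-D list of non-negative
    scores (an assumed input, e.g. one score per macroblock)."""
    h, w = len(attention), len(attention[0])
    if win_h > h or win_w > w:
        raise ValueError("window is larger than the attention map")

    # Summed-area table with a zero border: integ[y][x] holds the sum of
    # attention[0:y][0:x], so any window sum costs four lookups.
    integ = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += attention[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum

    best = (-1.0, 0, 0)  # (score, top, left); scores are >= 0 by assumption
    for top in range(h - win_h + 1):
        for left in range(w - win_w + 1):
            score = (integ[top + win_h][left + win_w]
                     - integ[top][left + win_w]
                     - integ[top + win_h][left]
                     + integ[top][left])
            if score > best[0]:
                best = (score, top, left)
    return best[1], best[2], best[0]


# Toy 4x4 attention map with a salient 2x2 area in the right-centre.
toy_map = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
top, left, score = best_window(toy_map, 2, 2)
```

     A practical system would also have to smooth the window position over time and re-encode the cropped frames, which this sketch leaves out.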
     In conclusion, we studied video adaptation techniques and achieved some valuable results. The proposed framework is still at an initial stage, and many problems remain to be solved. Video adaptation based on video signal representation is a promising research field that is worthy of further study.
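     The dependency-aware bit allocation summarized in ③ can be made concrete with a small sketch under stated assumptions; the abstract does not give the dissertation's actual model. Here we assume a chain of predictive frames in which frame i inherits a fraction `alphas[i]` of the distortion of frame i-1 (a linear dependency), and a textbook per-frame model d_i(R_i) = variance_i * 2^(-2 R_i). Under these assumptions the total distortion is a weighted sum of the per-frame terms, and the weights play the role of impact factors. All names are hypothetical.

```python
import math


def impact_factors(alphas):
    """Weights w[i] such that total distortion = sum(w[i] * d_i).

    Assumed model: D_i = d_i + alphas[i] * D_{i-1}, so the distortion of
    frame i propagates to every later frame. alphas[0] is unused."""
    n = len(alphas)
    w = [1.0] * n
    for i in range(n - 2, -1, -1):
        # Frame i affects itself plus everything frame i+1 affects.
        w[i] = 1.0 + alphas[i + 1] * w[i + 1]
    return w


def allocate_bits(variances, weights, total_bits):
    """Closed-form minimiser of sum(w_i * var_i * 2**(-2 * R_i)) subject to
    sum(R_i) == total_bits. Non-negativity of R_i is not enforced here."""
    logs = [0.5 * math.log2(w * v) for w, v in zip(weights, variances)]
    mean_log = sum(logs) / len(logs)
    return [total_bits / len(logs) + l - mean_log for l in logs]


# Three frames of equal complexity; each inherits half of its
# predecessor's distortion (illustrative numbers only).
weights = impact_factors([0.0, 0.5, 0.5])
rates = allocate_bits([1.0, 1.0, 1.0], weights, total_bits=6.0)
```

     With equal frame complexity, the earlier frames receive more bits because their distortion propagates further along the prediction chain, which is the qualitative effect that a dependency-aware rate-distortion function is meant to capture.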
    [110] Y.-F Ma and H.-J. Zhang, "Contrast-based image attention analysis by using fuzzy growing," Proceedings of the eleventh ACM international conference on Multimedia, pp. 374-381, Berkeley, CA, USA, Nov. 2003.
    [111] "Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H. 264/ISO/IEC 14496-10 AVC," in Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-GOS0, 2003.
    [112] A. Sinha, G. Agarwal and A. Anbu, "Region-of-interest based compressed domain video transcoding scheme", in Proc. ICASSP'2004, Montreal, Canada, 17-21 May 2004.
    [113] G Agarwal, A. Anbu and A. Sinha, "A fast algorithm to find the region-of-interest in the compressed MPEG domain", in Proc. ICME'2003, Baltimore, MD, USA, 6-9 Jul. 2003.
    [114] A. Vetro, H. Sun and Y. Wang, "Object-based transcoding for adaptable video content delivery", IEEE Trans. CSVT, Vol. 11(3), pp.387-401, Mar. 2001.
    [115] L. Itti, C. Koch, E. Niebur, "Computational modeling of visual attention", Nature Reviews Neuroscience, 2001.2(3): pp 194-203.
    [116] L. Itti,C. Koch, "A comparison of feature combination strategies for saliency-based visual attention systems", in SPIE Human Vision and Electronic Imaging Ⅳ (HVEI'99). 1999. San Jose, CA.
    [117] L. Itti, C. Koch, E. Niebur, A. "Model of saliency-based visual attention for rapid scene analysis", IEEE Trans.s on Pattern Analysis and Machine Intelligence, 1998.
    [118] A. -A. Salah, E. Alpaydin, L. Akarun, "A selective attention-based method for visual pattern recognition with application to handwritten digit recognition and face recognition", IEEE Trans. Pattern Analysis and Machine Intelligence, 2002.24(3).
    [119] M. Bollmann, R. Hoischen, B. Mertsching, "Integration of static and dynamic scene features guiding visual attention", Paulus, E. Wahl, F. M. (eds.), Springer, 1997: pp 483-490.
    [120] X. -S. Li, and C. -W. Yuan, "A learned saliency map for eye detection", in iconip 2001.
    [121] A. Eleftheriadis and D. Anastassiou, "Constrained and general dynamic rate shaping of compressed digital video", in Proc. IEEE International Conference Image Processing, vol. 3, pp. 296-399, 1995.
    [122] H. Sun, W. Kwok, and J. W. Zdepski, "Architectures for MPEG compressed bitstream scaling," IEEE Trans. Circuits Syst. and Video Technol., vol. 6, no. 2, pp. 191-199, Apr. 1996.
    [123] Y. Nakajima, H. Hori, and T. Kanoh, "Rate conversion of MPEG coded video by re-quantization process," in Proc. IEEE International Conference Image Processing, vol. 3, pp. 408-411, 1995.
    [124] J. Youn, M. -T. Sun, and J. Xin, "Video transcoder architectures for bit rate scaling of H.263 bit streams," in Proc. ACM Multimedia, pp. 243-250, Nov. 1999.
    [125] D. -G. Morrison, M. -E. Nilson, and M. Ghanbari, "Reduction of the bit-rate of compressed video while in its coded form," in Proc. 6th International Workshop Packet Video, pp. D17.1-D17.4, 1994.
    [126] G Keesman, R. Hellinghuizen, F. Hoeksema, and G. Heideman, "Transcoding of MPEG bitstreams," Signal Process. Image Commun., voh 8, no. 6, pp. 481-500, Sep. 1996.
    [127] P. A. A. Assuncao and M. Ghanbari, "A frequency-domain video transcoder for dynamic bitrate reduction of MPEG-2 bit streams," IEEE Trans. Circuits Syst. and Video Technol., vol. 8, no. 8, pp. 953-967, Dec. 1998.