Implementation and Optimization of an H.264 Encoder Based on the TMS320DM642
Abstract
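As a new-generation video coding standard for multimedia applications, H.264/AVC adopts many advanced techniques that differ from those of earlier standards. Besides substantially improving coding efficiency and performance, it strengthens error resilience and network adaptability, and it has broad application prospects in broadcast television, video storage and playback, videoconferencing, and related fields. This gain in coding performance, however, comes at the cost of a marked increase in computational complexity. Developing a video encoder capable of real-time encoding in an embedded environment with limited hardware resources is therefore highly challenging.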
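     The TMS320DM642 is a fixed-point DSP developed by Texas Instruments (USA), built on a second-generation high-performance very-long-instruction-word (VLIW) architecture. It has 8 independent functional units and 64 32-bit general-purpose registers. The instruction set of the 8 functional units is extended with instructions dedicated to video and image processing, which improves video-processing performance and the parallelism of the instruction architecture. At a clock frequency of 600 MHz, the DM642 reaches a peak processing speed of 4800 MIPS (million instructions per second). The DM642 uses a two-level on-chip memory architecture and offers a rich set of on-chip peripheral interfaces, such as a 10/100 Mbps Ethernet interface, three configurable video ports, and a 64-bit external memory interface. Its strong processing and interface capabilities make the DM642 well suited to video and image processing applications such as audio/video transmission over IP and wireless networks and security monitoring.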
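The peak figure quoted above follows from simple arithmetic, assuming each of the eight functional units issues one instruction in every clock cycle (a best case that real code rarely sustains):

```latex
\[
8\ \frac{\text{instructions}}{\text{cycle}} \times 600 \times 10^{6}\ \frac{\text{cycles}}{\text{s}}
= 4800 \times 10^{6}\ \frac{\text{instructions}}{\text{s}}
= 4800\ \text{MIPS}
\]
```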
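     This thesis describes how to develop and optimize an H.264 “baseline” encoder on a TMS320DM642 hardware development platform. The encoder source is the encoding part of x264, one of the three major open-source H.264 codecs. Compared with the officially provided JM reference software, x264 discards several new features that contribute little to coding performance yet are computationally very expensive, which makes it easier to port and optimize. An efficient implementation of a video coding algorithm on a DSP must fully exploit the processor's parallelism and computing resources to meet real-time requirements. Starting from the original x264 encoder, we did three main things: first, we trimmed and modified the program and ported it to the DSP platform; second, we used the DM642's EDMA controller and related resources to optimize data transfer and memory usage; third, we used intrinsics and linear assembly to improve the core H.264 algorithms and code and to increase the parallelism of the executed code. Finally, we propose an embedded real-time H.264 encoder design with low complexity and high coding efficiency.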
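To make the intrinsics-based optimization more concrete, here is a minimal portable C sketch of a 16x16 sum of absolute differences (SAD) that processes four pixels per step. The abstract does not say which kernels or intrinsics were actually rewritten, so the choice of SAD and every function name here is an assumption for illustration. The helpers `subabs4_emul` and `sum4_emul` are hypothetical stand-ins written in plain C; on a C64x-class device the compiler intrinsics `_subabs4` and `_dotpu4` (with a multiplier of `0x01010101`) are the usual candidates for these two packed operations.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a packed "absolute difference of four
   unsigned bytes" operation. Each byte lane is handled independently. */
static uint32_t subabs4_emul(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (a >> (8 * i)) & 0xFFu;
        uint32_t y = (b >> (8 * i)) & 0xFFu;
        uint32_t d = (x > y) ? (x - y) : (y - x);
        r |= d << (8 * i);
    }
    return r;
}

/* Hypothetical stand-in for summing the four unsigned bytes of a word. */
static uint32_t sum4_emul(uint32_t v)
{
    return (v & 0xFFu) + ((v >> 8) & 0xFFu) + ((v >> 16) & 0xFFu) + (v >> 24);
}

/* 16x16 SAD, four pixels per inner step. Strides are in bytes. */
uint32_t sad16x16_packed(const uint8_t *cur, int cur_stride,
                         const uint8_t *ref, int ref_stride)
{
    assert(cur != NULL && ref != NULL);
    uint32_t sad = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x += 4) {
            uint32_t c, r;
            memcpy(&c, cur + x, 4);   /* alignment-safe 4-byte load */
            memcpy(&r, ref + x, 4);
            sad += sum4_emul(subabs4_emul(c, r));
        }
        cur += cur_stride;
        ref += ref_stride;
    }
    return sad;
}

/* Straightforward one-pixel-at-a-time version, kept as a reference. */
uint32_t sad16x16_scalar(const uint8_t *cur, int cur_stride,
                         const uint8_t *ref, int ref_stride)
{
    assert(cur != NULL && ref != NULL);
    uint32_t sad = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++) {
            int d = (int)cur[x] - (int)ref[x];
            sad += (uint32_t)(d < 0 ? -d : d);
        }
        cur += cur_stride;
        ref += ref_stride;
    }
    return sad;
}
```

The packed version performs the same arithmetic as the scalar one but groups it into word-sized operations. That is the shape a VLIW compiler needs in order to map the inner loop onto packed-data instructions and software-pipeline it.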
     At present, our H.264 encoder encodes 28 to 38 QCIF frames per second. The decoded video has high subjective and objective quality.
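For a rough sense of scale, the reported frame rates can be turned into a per-macroblock cycle budget. This assumes the measurements were taken at the 600 MHz clock rate mentioned earlier, which the abstract does not state explicitly. A QCIF frame is 176x144 luma samples, that is, 11 x 9 = 99 macroblocks of 16x16.

```latex
\[
\frac{600 \times 10^{6}\ \text{cycles/s}}{38\ \text{frames/s} \times 99\ \text{MB/frame}}
\approx 1.59 \times 10^{5}\ \text{cycles/MB},
\qquad
\frac{600 \times 10^{6}\ \text{cycles/s}}{28\ \text{frames/s} \times 99\ \text{MB/frame}}
\approx 2.16 \times 10^{5}\ \text{cycles/MB}
\]
```

Under that assumption, the whole encoding pipeline for one macroblock takes roughly 160,000 to 216,000 cycles on average.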
As a video coding standard for next-generation multimedia, H.264/AVC adopts a number of advanced technologies that differ from those of previous standards. In addition to improved coding efficiency and performance, the new standard enhances other capabilities, including error resilience and the flexibility to operate effectively over a broad variety of network types. H.264/AVC provides a technical solution for a broad range of applications, including broadcast television, video storage and playback, and videoconferencing. However, the improved coding efficiency comes at the cost of higher computational complexity. Developing an embedded real-time video encoder with limited on-chip memory is a considerable challenge.
    The TMS320DM642 is a fixed-point digital signal processor (DSP) based on VelociTI.2™, the second-generation high-performance very-long-instruction-word (VLIW) architecture developed by Texas Instruments (TI). It has eight highly independent functional units and 64 32-bit general-purpose registers. The VelociTI.2™ extensions in the eight functional units of the DM64x include new instructions that accelerate video and imaging applications and extend the parallelism of the architecture. At a clock rate of 600 MHz, the DM642 can execute up to 4800 million instructions per second (MIPS). The DM642 uses a two-level internal memory architecture for program and data and has a powerful and diverse set of peripherals, including a 10/100 Mbps Ethernet MAC (EMAC), three configurable video ports, and a 64-bit external memory interface (EMIFA). Its strong data-processing and interface capabilities make the DM642 well suited to video and imaging applications, for example audio/video transmission and security monitoring over IP (Internet Protocol) and wireless networks.
    The main task of this paper is to describe how to develop and optimize an H.264 "baseline" encoder on a hardware platform based on the TMS320DM642. The source program is the encoder part of x264, one of the open-source H.264 codecs. Compared with the official JM software, x264 drops some new features that contribute little to coding performance but carry high computational complexity, so it is easier to port and optimize. The effective method to
