Research and Implementation of an H.264 Hardware Decoder IP Core for Handheld Devices
Abstract
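H.264 is a digital video compression standard developed jointly by ITU-T and ISO/IEC, and it is the most advanced compression standard currently available. Its range of application is very broad: it suits different network environments and use cases, such as standard-definition and high-definition television services, consumer electronics such as mobile phones and digital cameras, and multimedia network video conferencing. Multimedia services are already widely used in the consumer field. Decoding a video sequence in real time with an H.264 software decoder requires the CPU to run at 300 MHz to 400 MHz, which increases power consumption. With the development of large-scale integrated circuit design, and because integrated circuit chips offer small area, high performance, and low power consumption, implementing H.264 video decoding on a chip platform has broad application prospects and practical significance in the consumer field.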
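     The goal of this project is to design a hardware video decoder IP core for handheld devices that conforms to the H.264 standard and supports CIF images, the baseline profile at level 3, and a decoding rate of 30 fps. After an overview of each technical stage of an H.264 decoding system, this thesis partitions the hardware decoder architecture into modules, gives implementation details of the ping-pong buffer and of the master control state machine and image reconstruction information parsing within the bitstream parsing module, and designs both the timing strategy for parallel H.264 decoding and the overall hardware architecture of the decoder IP core. Because the image reconstruction module carries a large share of the decoding computation and has the greatest effect on the IP core's area, performance, and power consumption, this module is studied in depth and designed with care. For inverse quantization and inverse transform, inter prediction, and intra prediction, the thesis proposes one hardware architecture per algorithm module, three in total, each weighing area cost, performance, and power consumption.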
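     In the inverse quantization and inverse transform hardware module, the inverse DCT matrix computation is analyzed in detail, and a storage matrix is used so that the one-dimensional and two-dimensional inverse DCT share the same computing resources. For inverse quantization, the six 4×4 lookup tables formed by the scaling factors are merged by position and reduced to 2×2 lookup tables, which shrinks the lookup-table storage. To balance performance and power, a computation pipeline driven by multiple gated clocks is proposed; this timing structure raises computing performance while lowering the system's dynamic power.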
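The lookup-table reduction can be illustrated with a minimal Python sketch. It assumes the AC-coefficient rescaling rule of the H.264 standard with flat scaling (no scaling matrices); the function names are hypothetical and the thesis's actual table layout is not given in this abstract. Each 4×4 table contains only three distinct values, selected by the parity of the row and column index, so a 2×2 table indexed by those parities carries the same information.

```python
# Scaling factors for AC coefficients in H.264 4x4 inverse quantization,
# one row per QP % 6. Each row holds the three distinct values of one table:
# (both indices even, both indices odd, mixed parity).
V = [
    (10, 16, 13),
    (11, 18, 14),
    (13, 20, 16),
    (14, 23, 18),
    (16, 25, 20),
    (18, 29, 23),
]

def folded_table(m):
    """2x2 table for QP % 6 == m, indexed by (row parity, column parity)."""
    even, odd, mixed = V[m]
    return [[even, mixed], [mixed, odd]]

def level_scale(m, i, j):
    """Scaling factor at position (i, j), read from the folded 2x2 table."""
    return folded_table(m)[i & 1][j & 1]

def expand(m):
    """Rebuild the full 4x4 table from the folded one (for checking only)."""
    return [[level_scale(m, i, j) for j in range(4)] for i in range(4)]

def dequant_ac(c, qp, i, j):
    """Rescale one AC coefficient; assumes flat scaling (illustrative only)."""
    return (c * level_scale(qp % 6, i, j)) << (qp // 6)
```

Under these assumptions the six tables shrink from 6 × 16 = 96 entries to 6 × 4 = 24 entries, and the lookup address is formed from the two low-order index bits alone.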
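     In the inter prediction hardware module, the computation is relatively complex, so this thesis proposes an architecture in which an interpolation control module selects the data output and latches the results of the luma 6-tap filter, while the other computation modules run as a pipeline under its control signals. This lowers the overall computational complexity. In addition, because the luma 6-tap computation and the chroma interpolation use the same number of reference samples, the data lines are shared, which saves system bandwidth. With the interpolation control module in place the computation data flow becomes regular, so the thesis further proposes inserting a 5×4 storage matrix into the linear computation, in place of the large number of data latches that the standard algorithm requires and that would sharply increase on-chip storage. Depending on which of the five interpolation types is being processed, the first addend of the linear computation is stored row by row or column by column; under the control signals it is then combined directly with the second addend once that is computed, giving the final inter prediction value.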
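As a rough illustration of the two-step computation described above, the following Python sketch applies the H.264 luma 6-tap filter, with weights (1, -5, 20, 20, -5, 1), and then forms a quarter-sample value as the rounded average of a stored first addend and a later-computed second addend. The function names and the plain list standing in for the storage matrix are assumptions made for this sketch; the thesis's 5×4 matrix and control logic are not detailed in the abstract.

```python
def clip1(x):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, x))

def tap6(e, f, g, h, i, j):
    """Unrounded 6-tap sum with weights (1, -5, 20, 20, -5, 1)."""
    return e - 5 * f + 20 * g + 20 * h - 5 * i + j

def half_pel(samples):
    """Half-sample value between samples[2] and samples[3]."""
    return clip1((tap6(*samples) + 16) >> 5)

def quarter_pel(first, second):
    """Quarter-sample value: rounded average of the two addends."""
    return (first + second + 1) >> 1

def quarter_row(row, width=4):
    """Quarter-sample values just right of row[2], row[3], ... for one line.

    `row` must hold width + 5 integer samples.
    """
    stored = [row[k + 2] for k in range(width)]   # first addends, held in a buffer
    result = []
    for k in range(width):
        second = half_pel(row[k:k + 6])           # second addend, computed afterwards
        result.append(quarter_pel(stored[k], second))
    return result
```

For example, `quarter_row([10, 20, 30, 40, 50, 60, 70, 80, 90])` combines the stored samples 30, 40, 50, 60 with the half-sample values 35, 45, 55, 65.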
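     In the intra prediction hardware module, the 17 intra prediction modes are analyzed. The non-plane prediction modes take five computational forms; to remove the large amount of redundant computation in the algorithm, the thesis merges these five forms into a single computation pattern that covers all of them, which supports a reconfigurable hardware implementation. For plane prediction, a hardware-oriented optimization is given: before the prediction of each 4×4 block, a base value is computed, and all predicted values in the block are then obtained by summation according to the horizontal and vertical indices. This avoids the many multiplications in the original algorithm and reduces hardware area.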
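The base-value idea can be sketched in Python for the 16×16 luma plane mode, whose prediction in the standard is Clip1((a + b(x - 7) + c(y - 7) + 16) >> 5). The sketch takes the plane parameters a, b and c as already computed from the neighbouring samples, and the function names are hypothetical; it shows one possible accumulation order, not necessarily the one used in the thesis.

```python
def clip1(x):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, x))

def plane_direct(a, b, c, size=16):
    """Reference form: two multiplications per predicted sample."""
    return [[clip1((a + b * (x - 7) + c * (y - 7) + 16) >> 5)
             for x in range(size)] for y in range(size)]

def plane_incremental(a, b, c, size=16, block=4):
    """One base value per 4x4 block, then only additions inside the block."""
    pred = [[0] * size for _ in range(size)]
    for y0 in range(0, size, block):
        for x0 in range(0, size, block):
            # Base value at the block's top-left sample. The position factors
            # are small constants, so hardware can form them with shifts/adds.
            row_start = a + b * (x0 - 7) + c * (y0 - 7) + 16
            for dy in range(block):
                acc = row_start
                for dx in range(block):
                    pred[y0 + dy][x0 + dx] = clip1(acc >> 5)
                    acc += b          # step one sample to the right
                row_start += c        # step one sample down
    return pred
```

Both functions produce identical blocks for any integer a, b and c, because the incremental form only rearranges the same sum.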
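     Finally, the thesis reports the logic resources that the hardware video decoder IP core occupies on an FPGA platform and tests it on several standard 300-frame 4:2:0 video sequences. The results show that at a clock frequency of 10 MHz, CIF video is decoded in real time at 30 fps.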
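To put the reported operating point in perspective, the short Python calculation below derives the average clock-cycle budget per macroblock implied by CIF at 30 fps on a 10 MHz clock. It assumes a single clock domain and ignores any per-frame overhead; the abstract itself does not state a cycle count, so the result is only an upper bound derived from the quoted figures.

```python
def cycle_budget(width, height, fps, clock_hz):
    """Return (macroblocks per frame, macroblocks per second, cycles per macroblock)."""
    mbs_per_frame = (width // 16) * (height // 16)   # 16x16 luma macroblocks
    mbs_per_second = mbs_per_frame * fps
    return mbs_per_frame, mbs_per_second, clock_hz // mbs_per_second

# CIF is 352x288; 30 fps and 10 MHz are the figures quoted in the abstract.
MBS_PER_FRAME, MBS_PER_SECOND, CYCLES_PER_MB = cycle_budget(352, 288, 30, 10_000_000)
```

With these inputs a frame holds 396 macroblocks, the decoder must sustain 11,880 macroblocks per second, and each macroblock has at most 841 clock cycles on average.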
H.264 is a new-generation video coding standard established by ISO/IEC and ITU-T, and it is currently the most advanced video coding standard. It has a wide range of applications and suits different network environments, for instance standard-definition and high-definition TV, cell phones and digital cameras, and IP video telephony. Multimedia services are now widely used in the consumer field. Decoding video sequences in software in real time requires the CPU to run at 300 MHz to 400 MHz, which greatly increases power consumption. With the rapid development of large-scale integrated circuits, integrated circuit chips offer small size, high stability, and low cost. Developing an H.264 decoder with integrated circuit technology therefore has significant application prospects in the consumer field.
     The main objective of this paper is to design an H.264 hardware decoder IP core for mobile equipment in the consumer field. After analyzing the H.264 decoding algorithms in detail, the paper divides the hardware architecture into three modules according to function and gives detailed hardware implementations of the ping-pong buffer, the FSM, and the image information analysis in the bitstream controller module. It then designs the parallel computing strategy and the overall architecture of the H.264 hardware decoder IP core. Because the image reconstruction module involves a large amount of computation, which strongly affects the area, performance, and power consumption of the ASIC, the paper studies and designs this module with particular care. Three hardware architectures are proposed, for IQIT, inter prediction, and intra prediction.
     In the IQIT module, after analyzing the IDCT algorithm, the paper proposes a hardware architecture that shares one computing module between the one-dimensional and two-dimensional IDCT. For inverse quantization, the paper reduces the size of the rescaling lookup tables according to pixel position, in order to save on-chip lookup-table resources. To balance performance and power consumption, a computation pipeline driven by gated clocks is presented. This timing structure increases computing capability and decreases power consumption.
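The sharing of one computing module between the two passes can be sketched as follows. The Python code uses the 4×4 inverse integer transform butterfly of the H.264 standard and applies the same one-dimensional routine first to rows and then, through a transposed intermediate buffer, to columns. The transposition buffer stands in for the storage matrix mentioned in the abstract; that mapping, and the function names, are assumptions of this sketch.

```python
def butterfly(d):
    """One-dimensional H.264 inverse transform of four values."""
    e0 = d[0] + d[2]
    e1 = d[0] - d[2]
    e2 = (d[1] >> 1) - d[3]
    e3 = d[1] + (d[3] >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]

def transpose(m):
    """Swap rows and columns of a 4x4 matrix."""
    return [list(col) for col in zip(*m)]

def inverse_transform_4x4(coeffs):
    """Two passes through the same butterfly, with a transpose in between."""
    rows_done = [butterfly(r) for r in coeffs]                  # pass 1: rows
    cols_done = [butterfly(c) for c in transpose(rows_done)]    # pass 2: columns
    out = transpose(cols_done)
    return [[(v + 32) >> 6 for v in row] for row in out]        # final rounding
```

A block whose only nonzero rescaled coefficient is a DC value of 64 yields a residual of 1 at every position, which is a quick sanity check on the rounding step.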
     In the inter prediction module, because the algorithm is complicated, the paper adds an interpolation control module that selects the data output and registers the computation results, while the other computation modules work under its control signals. This architecture simplifies the inter prediction computation. The 6-tap filter and the chroma interpolation use the same number of reference pixels, so the data lines are multiplexed to reduce system bandwidth. Based on the data stream in the interpolation control module, a 5×4 storage matrix replaces the large number of flip-flops that the algorithm of the H.264 standard would require. The first addends of the linear computation are stored in the 5×4 storage matrix according to the type of interpolation position. Once the second addends are obtained, they are added directly to the data in the 5×4 storage matrix.
     In the intra prediction module, the non-plane prediction modes are used to remove spatial redundancy in the current picture and improve coding efficiency. Based on the characteristics of intra prediction, a reconfigurable hardware decoding architecture is proposed that combines the operations common to the different prediction modes. For plane prediction, an optimized, hardware-oriented method is proposed: every intra predicted value is obtained from a base value according to its horizontal and vertical index. This architecture reduces the hardware implementation area and also improves module utilization.
     On the FPGA platform, the IP core has passed RTL-level and gate-level simulation. It meets the quality and speed requirements of the H.264 baseline profile, decoding 352×288 video at 30 fps at a clock frequency of 10 MHz.