Hardware Architecture for the Discrete Cosine Transform Based on Distributed Arithmetic
Abstract
The discrete cosine transform (DCT) is widely used in image coding and decoding and has been adopted by many international standards, including JPEG, MPEG-1, MPEG-2, MPEG-4, and H.26x. Because the DCT is computationally intensive, software implementations struggle to meet real-time processing requirements, so applications that demand high processing speed generally use dedicated hardware DCT circuits. This thesis studies several problems in the hardware implementation of an 8×8 two-dimensional DCT IP core for image processing applications.
     The thesis first introduces the role and principle of the DCT in image processing and the process of compressing images with it, and compares it with other transforms to show its advantages for image compression. It then analyzes the various fast DCT algorithms and summarizes the strengths and weaknesses of earlier fast algorithms and their VLSI implementations. On this basis, and taking into account the characteristics of image processing and IP design methodology, it proposes a DCT IP core design intended to increase speed while reducing area and power.
     The design exploits the row-column separability of the DCT and uses a 6-stage pipeline to convert the two-dimensional DCT into two one-dimensional DCTs. In the one-dimensional DCT, the symmetry and rotation properties of the cosine factors are used so that multiplication is implemented with shift-and-add logic, which avoids the resource and area cost of hardware multipliers and improves speed. Finally, the core was synthesized and verified by simulation; the results show that it correctly performs the 8×8 DCT at a clock frequency of 100 MHz.
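The two ideas named in the abstract, row-column decomposition and shift-and-add multiplication, can be illustrated with a small software model. The sketch below is not the thesis's hardware design: the function names (`dct_1d`, `dct_2d_row_column`, `dct_2d_direct`, `quantize`, `shift_add_multiply`) and the 8-bit fixed-point precision `FRAC_BITS` are assumptions chosen for illustration, and the orthonormal DCT-II scaling is assumed because the abstract does not state a normalization.

```python
import math

N = 8            # block size from the abstract (8x8)
FRAC_BITS = 8    # assumed fixed-point precision, for illustration only


def dct_1d(x):
    """Orthonormal N-point DCT-II, computed directly from its definition."""
    out = []
    for k in range(N):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
        acc = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d_row_column(block):
    """2-D DCT as two 1-D passes: every row first, then every column."""
    row_pass = [dct_1d(row) for row in block]
    col_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(col_pass)


def dct_2d_direct(block):
    """Reference 2-D DCT from the double-sum definition (for checking only)."""
    def c(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / N)

    def basis(n, k):
        return math.cos((2 * n + 1) * k * math.pi / (2 * N))

    return [[c(u) * c(v) * sum(block[i][j] * basis(i, u) * basis(j, v)
                               for i in range(N) for j in range(N))
             for v in range(N)]
            for u in range(N)]


def quantize(coefficient):
    """Round a non-negative real coefficient to FRAC_BITS fractional bits."""
    return round(coefficient * (1 << FRAC_BITS))


def shift_add_multiply(x, constant):
    """Multiply integer x by a non-negative integer constant with shifts and adds."""
    if constant < 0:
        raise ValueError("constant must be non-negative")
    acc = 0
    shift = 0
    while constant:
        if constant & 1:          # one adder per set bit of the constant
            acc += x << shift
        constant >>= 1
        shift += 1
    return acc


# Example: scale a sample by cos(pi/4) without a multiplier.
c4 = quantize(math.cos(math.pi / 4))
scaled = shift_add_multiply(100, c4) >> FRAC_BITS   # approximates 100 * cos(pi/4)
```

In this model the row-column version needs 16 one-dimensional transforms per block instead of a 64-term double sum per output coefficient, and each constant multiplication costs one addition per set bit of the quantized coefficient, which is the kind of saving that motivates multiplierless designs.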
