Research and Design of a 16-bit High-Performance Embedded Digital Signal Processor
Abstract
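Microelectronics technology is advancing rapidly: process feature sizes have shrunk below 0.18 μm, and the 0.13 μm process is already mature. Driven by these process improvements, microprocessors, which represent the state of integrated-circuit development, are continually upgraded and deliver ever higher performance. This is especially true of digital signal processors (DSPs). At present, the highest operating frequency of 16-bit fixed-point DSPs has reached 600 MHz, with a processing capability of 4.8 billion multiply-accumulate operations per second. The arrival of the 3G communication era will push DSP processing capability further, and the future development of software-defined radio will place even higher demands on DSP performance.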
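     To meet the large demand for high-performance DSPs, the Institute of Microelectronics of the Chinese Academy of Sciences undertook the major National 863 Program project "Research and Design of High-Performance DSP". As part of that project, we designed a 16-bit fixed-point DSP as a prototype; this thesis describes its design in detail.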
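     The DSP uses a 4-stage pipeline, which keeps pipeline control relatively simple without sacrificing pipeline throughput. The bus structure follows the Harvard architecture, which ensures enough data throughput to keep the computing units supplied with data. The instruction set is compatible with mainstream DSPs to ease later application development. A high-speed multiply-accumulate unit completes one multiply-accumulate operation per clock cycle. Parallel techniques allow one computation and the fetch of two operands from memory within a single clock cycle. "Zero-overhead" looping improves the efficiency of loop instructions, and delayed branching improves pipeline throughput. Post-indexed addressing and bit-reversed addressing improve the efficiency of memory-access instructions. At the behavioral level, the DSP is modeled in terms of its data path and its control path. In the RTL (register transfer level) design, the RMM (Reuse Methodology Manual) rules are followed as far as possible to keep the code portable. Functional verification is hierarchical: modules are verified with the white-box method, the top level with a combination of white-box and grey-box methods, and random test vectors are combined with directed test vectors to raise test coverage. Synthesis was done with Synopsys Physical Compiler and automatic place and route with Astro. The whole design is about 800,000 gates, with a clock period of 9 ns.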
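The bit-reversed addressing mentioned above is commonly used to reorder FFT buffers. The following is a minimal software sketch of the index sequence such an addressing mode would generate; the function names `bit_reverse` and `bit_reversed_order` are hypothetical, and nothing here is taken from the thesis's hardware design.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        # Shift the lowest bit of index into the bottom of result.
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n: int) -> list:
    """Return the bit-reversed access order for a buffer of length n.

    n must be a power of two (an assumption typical of radix-2 FFTs).
    """
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]


if __name__ == "__main__":
    print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

A hardware address generator typically produces this sequence with a reverse-carry adder rather than a loop, so that each access still takes a single cycle; the loop here only illustrates the resulting order.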
Microelectronics technology is advancing rapidly, and process technologies with feature sizes of 0.18 um and even 0.13 um are widely used in commercial applications. With this progress, microprocessors, which represent the development level of integrated circuits and include digital signal processors (DSPs), are updated constantly and their processing ability keeps growing. To date, 16-bit fixed-point DSPs running at 600 MHz can perform 4.8 billion multiply-accumulate (MAC) operations per second. In particular, the strong demand from third-generation (3G) mobile communication applications is driving the rapid development of DSPs. In the near future, applications based on Software Defined Radio (SDR) technology will need even more powerful DSPs.
    To meet the great demand for high-performance DSPs, a DSP research project supported by the National 863 Program is being carried out at the Institute of Microelectronics, Chinese Academy of Sciences (IMECAS). As part of this project, we designed a prototype 16-bit fixed-point DSP. This thesis mainly discusses the design of this DSP.
    In our DSP design, the architecture uses a four-stage pipeline, which simplifies pipeline control without loss of performance. The bus structure follows the Harvard architecture to provide a sufficient data rate for the computing units. The instruction set is compatible with mainstream DSPs, which makes later development based on our DSP more convenient. The MAC unit is fast enough to complete a multiplication and an accumulation in a single cycle. Parallel techniques are used so that the DSP can perform a computation and fetch two operands from on-chip memory in a single cycle. Zero-overhead looping and delayed branching are implemented to make the pipeline more efficient. Behavioral models are built for the data path and the control path. We follow the rules of the Reuse Methodology Manual in the register-transfer-level design to make the code reusable. Functional verification is done hierarchically: white-box verification is used at the module level, and both white-box and grey-box verification at the chip level. To improve coverage, both random and directed stimuli are used. The whole design is about 800K gates and the clock period is 9 ns.
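As an illustration of what a 16-bit fixed-point multiply-accumulate step computes, here is a minimal sketch in Python. It assumes Q15 operands and a 40-bit saturating accumulator; both are assumptions chosen because they are common in 16-bit fixed-point DSPs, and the abstract does not state the number format or accumulator width of this design. The names `to_q15`, `mac`, and `dot_q15` are hypothetical.

```python
ACC_BITS = 40  # assumed accumulator width; not stated in the abstract
ACC_MAX = (1 << (ACC_BITS - 1)) - 1
ACC_MIN = -(1 << (ACC_BITS - 1))


def to_q15(x: float) -> int:
    """Quantize a value in [-1, 1) to a signed 16-bit Q15 integer, saturating."""
    v = int(round(x * 32768))
    return max(-32768, min(32767, v))


def mac(acc: int, a: int, b: int) -> int:
    """One multiply-accumulate step: acc + a * b, saturated to ACC_BITS.

    a and b are Q15 integers, so the product and accumulator are in Q30.
    """
    acc += a * b
    return max(ACC_MIN, min(ACC_MAX, acc))


def dot_q15(xs, ys) -> int:
    """Dot product of two Q15 sequences, one MAC step per element pair."""
    acc = 0
    for a, b in zip(xs, ys):
        acc = mac(acc, a, b)
    return acc


if __name__ == "__main__":
    half = to_q15(0.5)
    acc = dot_q15([half, half], [half, half])
    print(acc / (1 << 30))  # 0.5, i.e. 0.25 + 0.25 in Q30
```

In hardware, each call to `mac` corresponds to the work the MAC unit completes in one clock cycle; the loop in `dot_q15` is where zero-overhead looping and dual operand fetch would remove the per-iteration bookkeeping.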
