Design of a Low-Power Scalable FFT Application-Specific Integrated Circuit
Abstract
Digital signal processing offers high accuracy, good flexibility, strong immunity to interference, and ease of large-scale integration, and it has replaced traditional analog signal processing in many fields. The discrete Fourier transform (DFT) plays a central role among digital signal processing algorithms, but its heavy computational load and large memory requirement make real-time processing difficult, which limited its application. The fast Fourier transform (FFT) algorithm proposed by Cooley and Tukey in 1965 made DFT computation hundreds of times faster and removed this bottleneck in implementing and applying digital signal processing. The FFT processor has consequently become the most fundamental and important unit in digital signal processing, and it is now widely used in digital communications, speech signal processing, image processing, biomedical engineering, radar, seismology, astronomy, and other fields.
     Because an application-specific integrated circuit (ASIC) has an advantage in power consumption, this project designs the FFT processor as an ASIC and makes it scalable: it can be configured for 8-, 16-, 32-, 64-, 128-, 256-, 512-, and 1024-point transforms. This paper first presents the theoretical basis of the fast Fourier transform. After comparing the complexity of algorithms of different radices and their effect on power consumption, it adopts the radix-2 decimation-in-time (DIT) algorithm. It then discusses the sequential, cascade, parallel, and array structures of FFT hardware, and chooses the sequential structure to meet the low-power requirement. To reduce power further, the wide-word memory is split into two narrower memories, and a mathematical transformation in the butterfly unit reduces the number of real multipliers from four to three. After a discussion of multipliers, a modified Booth-encoded multiplier is selected. The memory address generation algorithm is also discussed in detail. Pipelining between memory reads and writes and the butterfly computation raises the processing speed. In addition, the paper describes the low-power methods applied at each design stage and analyzes the power consumption.
     The low-power scalable FFT processor was coded in Verilog HDL, functionally simulated with ModelSim, and verified on a Xilinx FPGA. It was then synthesized, placed, and routed with an SMIC 0.18 μm CMOS library. As required by the project, a version scaled to 8 to 256 points was successfully taped out on a multi-project wafer (MPW). Test results show that the calculation error of the FFT processor is below 3%, that a 256-point FFT takes 2063 clock cycles, and that at a 1.8 V supply voltage the average power consumption is about 1.17 mW/MHz. The design met its targets for power, speed, and calculation accuracy.
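The abstract's two central algorithmic choices, a radix-2 DIT transform computed by one butterfly at a time and a complex multiplication that uses three real multiplications instead of four, can be illustrated with a short software sketch. This is not the authors' Verilog design. The function names (`cmul3`, `bit_reverse`, `fft_radix2_dit`, `dft`) are hypothetical, the arithmetic is floating point rather than the chip's fixed-point datapath, and the particular three-multiplication identity is one common rearrangement, assumed here because the abstract does not say which one the design uses.

```python
import cmath


def cmul3(a, b, c, d):
    """Return (a + jb)(c + jd) as (real, imag) using three real multiplications.

    Assumed identity (one common form; the source does not specify its own):
        k1 = c(a + b), k2 = a(d - c), k3 = b(c + d)
        real = k1 - k3 = ac - bd
        imag = k1 + k2 = ad + bc
    """
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    return k1 - k3, k1 + k2


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (input ordering for an in-place DIT FFT)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_radix2_dit(x):
    """Iterative radix-2 decimation-in-time FFT, one butterfly per inner-loop step."""
    n = len(x)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two and at least 2")
    bits = n.bit_length() - 1
    # Load the input in bit-reversed order so the output comes out in natural order.
    data = [complex(x[bit_reverse(i, bits)]) for i in range(n)]
    half = 1
    while half < n:  # log2(n) stages
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))  # twiddle factor
                lo = data[start + k]
                hi = data[start + k + half]
                re, im = cmul3(hi.real, hi.imag, w.real, w.imag)
                t = complex(re, im)
                data[start + k] = lo + t
                data[start + k + half] = lo - t
        half *= 2
    return data


def dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking the FFT."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
```

The same loop handles every power-of-two length, which mirrors in software the configurability the abstract describes for 8 to 1024 points. It performs (N/2)·log2(N) butterflies per transform, for example 1024 butterflies at N = 256. Comparing `fft_radix2_dit(x)` against `dft(x)` for a small input is a quick way to check the three-multiplication butterfly.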
