时间约束下数字系统的设计空间搜索方法研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
在市场需求的推动下,数字系统的规模越来越大,逻辑功能越来越复杂。传统设计方法所能提供的设计效率渐渐无法满足设计需要,高层次综合(High-level Synthesis, HLS)已成为EDA(Electronic Design Automation)设计以及系统设计方法学领域的研究热点。而设计空间搜索是数字系统高层次综合中最重要的问题。
     在数字系统设计中,系统的时间性能通常是最重要的性能指标,它直接决定了系统是否能够满足用户的需要。因此,本文对时间约束下数字系统的设计空间搜索问题进行研究,主要研究工作包括:
     1)根据常用算术模块的多种设计原理,采用不同的结构、不同的流水线深度,以FPGA(Field Programmable Gate Array)为平台对这些模块进行设计实现,为本文设计空间搜索方法的研究提供实际的模块库支持。研究分析了模块的数据吞吐间隔、数据延迟、最高运行时钟频率和资源代价等属性,提出优选资源集的概念。在给定时钟频率下,通过筛选优选资源集以排除不必考虑的模块,从而缩小搜索空间。
     2)研究非流水系统的设计空间搜索问题。寻找在满足系统数据延时约束的条件下,系统资源代价最小的设计方案。首先提出设计空间的分层搜索方法,进而又提出调度与搜索相融合的LSBB(List Scheduling Branch and Bound)搜索算法。该算法将模块选择及模块数量的搜索与列表调度融合到一个分支定界过程中,将搜索与调度同时进行,从而使搜索树更加细化,提高了搜索速度。对于实际设计中可能出现的数据流图规模,LSBB算法都能在很短的时间内给出近似最优的设计方案。
     3)研究流水系统的设计空间搜索问题。考虑的是单向数据流系统,寻找在满足系统数据吞吐间隔约束的条件下,系统资源代价最小的设计方案。针对可变启动间隔的模块,对流水线调度问题进行建模和分析,得到存在完整调度方案的充分必要条件。基于这个结论,得以将模块选择问题与调度问题相分离,分别进行处理,从而降低了设计空间搜索问题的难度。
     采用整数线性规划(Integer Linear Programming, ILP)方法对模块选择问题进行建模并求解。提出一种混合流水的迭代调度算法,即HPIS(Hybria Pipeline Iterative Scheduler)算法,处理流水线调度问题,寻求系统数据延迟尽可能小的调度方案。该算法在调度过程中通过保证模块的剩余调度能力,从而避免资源冲突的产生;通过优先级提升和迭代过程以寻找更紧凑的调度安排。对于实际设计中可能出现的数据流图规模,ILP方法都能在很短的时间内求解出模块选择方案,HPIS算法可以在多项式时间内给出比较好的流水线调度方案。
     4)基于本文的模块库,采用常用的数字信号处理算法作为测试样例,对本文的设计空间搜索算法进行实验测试。基于实验结果,研究分析了系统的时间约束、时钟频率对设计方案的资源代价和数据延迟的影响。
     5)将本文的设计空间搜索方法应用于实际工程项目中一个高斯滤波器的设计,并按照本文方法给出的设计方案进行设计实现。在满足系统时间性能要求的前提下降低了系统的资源消耗。这表明本文方法对实际数字系统设计具有一定的实用价值。
Driven by the market requirement, the size of the digital system is becoming larger and larger while the logic is becoming more and more complex. The traditional design methodology seems unable to meet the design needs. High-level synthesis (HLS) is now a popular research topic in electronic design automation (EDA) and system design methodology. Design space exploration is the most important problem in high-level synthesis.
     In digital design, the timing performance is most important, since it affects whether the custom's requirements can be met. So we focus on the design space ex-ploration problem under time constraints. Our research work mainly includes:
     1) Based on various design principles of common used arithmetic modules, we build a module library to support our research work. The modules in this library are implemented on the FPGA (Field Programmable Gate Array) platform, with different structures and different pipeline depths. We study the properties of the modules, such as the data introduction interval, the delay time, the maximum frequency and the cost. We propose the concept called the optimal module set. Under a certain system frequency, only the modules in the optimal module set need be considered. So the design space that needs to explore is reduced.
     2) The design space exploration problem in non-pipeline systems is studied. We try to find the design scheme with the minimal cost, under the delay time constraint. A three-stage exploration method is proposed firstly. Then the List Scheduling Branch and Bound (LSBB) algorithm is proposed. In this algorithm, the exploration for module selection and module amount is combined with the list scheduling, and they are carried out concurrently. So the exploration tree is refined, and the speed of exploration is improved. For system size in practical design, the LSBB algorithm can provide near optimal design scheme in a little time.
     3) The design space exploration problem in pipeline systems is studied. We fo- cus on the problem in non-recursive systems and try to find the design scheme with the minimal cost, under the constraint of data introduction interval. We consider the pipeline modules with variable introduction interval and build a model for the pipeline scheduling problem. The necessary and sufficient condition for a full schedule to exist is obtained in our analysis. With this conclusion, the module selection problem and the scheduling problem could be handled respectively, so the difficulty of the whole exploration problem is reduced.
     We use the integer linear programming (ILP) method to solve the module selec-tion problem. The Hybria Pipeline Iterative Scheduler (HPIS) algorithm is proposed to solve the pipeline scheduling problem, which tries to find the scheduling scheme with the minimal system delay. In this algorithm, the remaining capacities of the modules are guaranteed in the scheduling procedure, so resource conflict is avoided. This algo-rithm tries to find a schedule with a small delay using the method of priority promotion and the iterative progress. For system size in practical design, the ILP method can al-ways solve the module selection problem in a little time. And the HPIS algorithm can solve the scheduling problem in polynomial time.
     4) Based on our module library, we take several common used digital signal pro-cessing (DSP) benchmarks for experiments. We analyze the influence of timing con-strains and system frequency on the system cost and system delay.
     5) We apply our method on the design of a Gaussian filter, which is a module in a practical project. And we implement the Gaussian filter according to the design scheme given by our method. The system cost is reduced while the timing constraint is met. This indicates that our method has some practical value for digital system design.
引文
[1]International Technology Roadmap for Semiconductors 2009 Edition Executive Summary. ITRS, http://www.itrs.net/,2009.
    [2]Cetin J, Balasinski A. SoC Design Quality, Cycletime, and Yield Improvement Through DfM. Proceedings of Proceedings of the 6th International Workshop on System on Chip for Real Time Applications,2006.86-90.
    [3]International Technology Roadmap for Semiconductors 2005 Edition Executive Summary. ITRS, http://www.itrs.net/,2005.
    [4]Goering R. Panelists ponder why U.S.lags in ESL design. http://www.eetuk.com/,2005.
    [5]Memik S O, Kastner R, Bozorgzadeh E, et al. A scheduling algorithm for optimization and early planning in high-level synthesis. ACM Transactions on Design Automation of Electronic Systems,2005,10(1):33-57.
    [6]Coutinho J, Luk W. Optimising and adapting high-level hardware designs. Proceedings of IEEE International Conference on Field-Programmable Technology,2002.150-157.
    [7]Lin Y. Recent developments in high-level synthesis. ACM Transactions on Design Automa-tion of Electronic Systems,1997,2(1):2-21.
    [8]McFarland M C, Parker S A C, Composano R. Tutorial on high-level synthesis. Proceedings of Proceedings of the 25th ACM/IEEE Design Automation Conference,1986.330-336.
    [9]Chandrachoodan N, Bhattacharyya S S, Liu K J R. High-Level Synthesis of DSP Appli-cations Using Adaptive Negative Cycle Detection. EURASIP Journal on Applied Signal Processing,2002,9:893-907.
    [10]Wang C Y, Parhi K K. High-Level DSP Synthesis Using Concurrent Transformations, Scheduling and Allocation. IEEE Transaction on Computer-Aided Design,1995,14:274-295.
    [11]Tugsinavisut S, Su R, Beerel P. High-level synthesis for highly concurrent hardware sys-tems. Proceedings of Proceedings of the 6th International Conference on Application of Concurrency to System Design,2006.79-90.
    [12]McFarland M C, Parker A, Camposano R. The high-level synthesis of digital systems. Proceedings of the IEEE,1990,78(2):310-318.
    [13]Ku D C, Micheli G D. HardwareC:A Language for Hardware Design. Technical report, Computer Systems Lab, Stanford University,1990.
    [14]Gajski D, Zhu J, Domer R, et al. SpecC:Specification Language and Methodology. Kluwer Academic Publishers,2000.
    [15]SpecC Homepage.2007. http://www.ics.uci.edu/specc/.
    [16]IEEE Std 1800-2009. IEEE Standard for System Verilog-Unified Hardware Design, Speci-fication, and Verification Language. New York:IEEE,2009.
    [17]Dick R P, Rhodes D L, Wolf W. TGFF:Task Graphs for Free. Proceedings of Proceedings of the 6th International Workshop on Hardware-Software Codesign. IEEE Computer Society, 1998.97-101.
    [18]Chantrapornchai C, Sha E H M, Hu X S. Efficient Design Exploration Based on Module Utility Selection. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2000,19(1):19-29.
    [19]Rardin R L. Optimization in Operations Research. New Jersey:Prentice Hall,1997.
    [20]Hallschmid P, Saleh R. Fast Design Space Exploration Using Local Regression Modeling With Application to ASIPs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2008,27(3):508-515.
    [21]Hu X, Greenwood G, Ravichandran S, et al. A framework for user assisted design space exploration. Proceedings of Proceedings of the 36th Design Automation Conference,1999. 414-419.
    [22]Peixoto H, Jacome M. Algorithm and architecture-level design space exploration using hierarchical data flows. Proceedings of Proceedings on Application-Specific Systems, Ar-chitectures and Processors,1997.272-282.
    [23]Raje S, Bergamaschi R. Generalized resource sharing. Proceedings of Proceedings of the IEEE/ACM International Conference on Computer Aided Design,1997.326-332.
    [24]Hwang K S, Casavant A E, Dragomirecky M, et al. Constrained conditional resource sharing in pipeline synthesis. Proceedings of Proceedings of the IEEE Conference on Computer-Aided Design,1988.52-55.
    [25]Hwang K S, Casavant A E, Chang C, et al. Scheduling and Hardware sharing in pipelined data paths. Proceedings of Proceedings of the IEEE Conference on Computer-Aided Design, 1989.21-27.
    [26]Mandal C A, Chakrabarti P P, Ghose S. Complexity of scheduling in high level synthesis. VLSI Design,1998,7(4):337-346.
    [27]Gary M R, Johnson D S. Computers and Intractablity:A Guide to the Theory of NP-Completeness. New York:W. H. Freeman & Co.,1979.
    [28]Su M, Xue H X, Hong X L. A Global Scheduling Algorithm for CDFC with Nested Condi-tional Branches. Proceedings of Proceedings of the 3nd International CAD/Graphics,1993. 526-530.
    [29]Gebotys C, Elmasry M. Simultaneous scheduling and allocation for cost constrained op-timal architectural synthesis. Proceedings of Proceedings of the 28th ACM/IEEE Design Automation Conference,1991.2-7.
    [30]Avalon interface specifications. Altera Corporation,2008.
    [31]Hwang C T, Lee J H, Hsu Y C. A Formal Approach to the Scheduling Problem in High Level Synthesis. IEEE Transction on Computer-Aided Design,1991,10(4):464-475.
    [32]Micheli G D. Synthesis and Optimization of Digital Circuits. New York:McGraw-Hill, 1994.
    [33]Fujita S, Nakagawa T, Yamashita M. A new list scheduling method based on structural properties of task graphs. Proceedings of Proceedings of the 7th International Conference on Parallel and Distributed Systems:Workshops,2000.3-8.
    [34]Paulin P G, Knight J P. Force-directed scheduling in automatic data path synthesis. Proceed-ings of Proceedings of the 24th ACM/IEEE Conference on Design Automation Conference, 1987.195-202.
    [35]Park I C, Kyung C M. Fast and near optimal scheduling in automatic data path synthesis. Proceedings of Proceedings of the Design Automation Conference (DAC),1991.680-685.
    [36]Note S, Catthoor F, Goossens G, et al. Combined hardware selection and pipelining in high-performance data-path design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,1992, 11(4):413-423.
    [37]Jain R, Parker A C, Park N. Predicting System-Level Area and Delay for Pipelined and Nonpipelined Designs. IEEE Transactions on Computer-Aided Design,1992,11(8):955-965.
    [38]Shim S, Moon S M. Split-path enhanced pipeline scheduling. IEEE Transactions on Parallel and Distributed Systems,2003,14(5):447-462.
    [39]Park N, A P. Sehwa:A software package for synthesis of pipelines from behavioral speci-fications. IEEE Transction on Computer-Aided Design of Integrated Circuits and Systems, 1988,7(3):356-370.
    [40]Gebotys C H, Elmasry M I. Global Optimization Approach for Architectural Synthesis. IEEE Transction on Computer-Aided Design of Integrated Circuits and Systems,1993, 12(10):1266-1278.
    [41]Nemhauser G L, Sigismondi G. A Strong Cutting Plane/Branch-and-Bound Algorithm for Node Packing. The Journal of the Operational Research Society,1992,43(5):443-457.
    [42]Paulin P G, Knight J P. Force-directed scheduling for the behavioral synthesis of ASIC's. IEEE Transction on Computer-Aided Design,1989,8(6):661-679.
    [43]Hwang C, Hsu Y, Y L. PLS:A scheduler for pipeline synthesis. IEEE Transction on Computer-Aided Design of Integrated Circuits and Systems,1993,12(9):1279-1286.
    [44]Park I C, Kyung C M. FAMOS:An Efficient Scheduling Algorithm for High-Level Synthe-sis. IEEE Transction on Computer-Aided Design of Integrated Circuits and Systems,1993, 12(10):1437-1448.
    [45]Kernighan B W, Lin S. An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal,1970,49(2):291-308.
    [46]Wang C Y, Parhi K K. High level DSP synthesis using the MARS design system. Proceed-ings of Proceedings of the IEEE International Symposium on Circuits and Systems,1992. 164-167.
    [47]Langevin M, Cerny E. A recursive technique for computing lower-bound performance of schedules. ACM Transactions on Design Automation of Electronic Systems,1996, 1(4):443-456.
    [48]Tiruvuri G, Chung M. Estimation of Lower Bounds in Scheduling Algorithms for High-Level Synthesis. ACM Transactions on Design Automation of Electronic Systems,1998, 3(2):162-180.
    [49]Jain R, Parker A, Park N. Module selection for pipelined synthesis. Proceedings of Pro-ceedings of the 25th ACM/IEEE Design Automation Conference,1998.542-547.
    [50]Bakshi S, Gajski D D. Component Selection for High-Performance Pipelines. IEEE Trans-actions on Very Large Scale Integration (VLSI) Systems,1996,4(2):181-194.
    [51]Shen Z, Jong C. Exploring module selection space for architectural synthesis of low power designs. Proceedings of Proceedings of 1997 IEEE International Symposium on Circuits and Systems,1997.1532-1535.
    [52]Brassard G, Bratley P. Fundamentals of algorithmics. New Jersey:Prentice Hall,1996.
    [53]Mitra B, Rajam S, Roberts M. QUEST:a system for module selection and design optimiza-tion under multiple constraints. Proceedings of Proceedings of the 5th Annual European Computer Conference on Advanced Computer Technology, Reliable Systems and Applica-tions,1991.516-520.
    [54]Dick R P, Jha N K. MOGAC:A Multi-objective Genetic Algorithm for Hardware-Software Co-synthesis of Distributed Embedded Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,1998,17(10):920-935.
    [55]Ahmad I, Dhodhi M K. Integrated scheduling, allocation and module selection for design-space exploration in high-level synthesis. Proceedings of IEE Proceedings on Computers and Digital Techniques, volume 142,1995.65-71.
    [56]Abdelhalim M B, Salama A E, Habib S E D. Hardware Software Partitioning using Particle Swarm Optimization Technique. Proceedings of The 6th International Workshop on System on Chip for Real Time Applications,2006.189-194.
    [57]Yiguo Z, Wenjian L, Zhang Zeming e a. A hardware/software partitioning algorithm based on artificial immune principles. Applied Soft Computing,2008,8(1):383-391.
    [58]W Sun M J W, Neuendorffer S. FPGA Pipeline Synthesis Design Exploration Using Mod-ule Selection and Resource Sharing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2007,26(2):254-265.
    [59]Rau B R. Iterative modulo scheduling:an algorithm for software pipelining loops. Pro-ceedings of Proceedings of the 27th annual International Symposium on Micro-architecture, volume 21. ACM Press,1994.63-74.
    [60]Shiue W T, Denison J, Horak A. A Novel Scheduler for Low Power Real Time Systems. Proceedings of Proceedings of the 43rd IEEE Midwest Symposium on Circuits and Systems, 2000.312-315.
    [61]Raghunathan V, Ravi e a. Transient power management through high level synthesis. Pro-ceedings of IEEE/ACM International Conference on Computer Aided Design,2001.545-552.
    [62]Zhong L, Jha N K. Interconnect-aware low-power high-level synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2005,24(3):336-351.
    [63]Shiue W T. High Level Synthesis for Peak Power Minimization using ILP. Proceedings of Proceedings of the International Conference on Application-Specific Systems, Architec-tures, and Processors,2000.103-112.
    [64]Bakshi S, Gajski D, Juan H. Component selection in resource shared and pipelined DSP applications. Proceedings of Proceedings of the European Design Automation Conference, 1996.370-375.
    [65]Jain R. MOSP:Module selection for pipelined designs with multi-cycle operations. Pro-ceedings of Proceedings of the IEEE International Conference on Computer-Aided Design, 1990.212-215.
    [66]Snider G. Performance-constrained pipelining of software loops onto reconfigurable hardware. Proceedings of Proc. ACM/SIGDA 10th International Symposium on Field-Programmable Gate Arrays,2002.177-186.
    [67]Harris I G, Orailoglu A. Intertwined scheduling, module selection and allocation in time-and-area constrained synthesis. Proceedings of Proceedings of IEEE International Sympo-sium on Circuits and Systems,1993.1682-1685.
    [68]Fann Y, Rim M, Jaiin R. Global scheduling for high-level synthesis applications. Proceed-ings of Proceedings of the 31 st ACM/IEEE Design Automation Conference,1994.542-546.
    [69]Munch M, When N, Glesner M. An Efficient ILP-Based Scheduling Algorithm for Control-Dominated VHDL Descriptions. ACM Transactions on Design Automation of Electronic Systems,1993,2(4):344-364.
    [70]Hsu Y C, Jeang Y L. Pipeline scheduling techniques in high-level synthesis. Proceedings of Proceedings of 6th Annual IEEE International ASIC Conference and Exhibit,1993.396-403.
    [71]Chen C R, Moricz M Z. Data path scheduling for two-level pipelining. Proceedings of Proceedings of the 28th ACM/IEEE Design Automation Conference,1991.603-606.
    [72]Wang Q, Shayan Y R. A Versatile Signed Array Multiplier Suitable for VLSI Implemen-tation. Proceedings of Canadian Conference on Electrical and Computer Engineering, vol-ume 1,2003.199-202.
    [73]Ienne P, Viredaz M A. Bit-Serial Multipliers and Squarers. IEEE Transactions on Comput-ers,1994,43(12):1445-1450.
    [74]Mottaghi-Dastjerdi A K A, Pedram M. BZ-FAD:A Low-Power Low-Area Multiplier Based on Shift-and-Add Architecture. IEEE Transactions on Very Large Scale Integration (VLSI) Systems,2008,17(2):302-306.
    [75]Cheng E, Mead C. A two's complement pipeline multiplier. Proceedings of IEEE Interna-tional Conference on Acoustics, Speech, and Signal Processing,1976.647-650.
    [76]Cooper A. Parallel architecture modified Booth multiplier. Proceedings of IEE Proceedings on Electronic Circuits and Systems, volume 135,1988.125-128.
    [77]Arithmetic Module Generator. http://modgen.dnsalias.com/.
    [78]Fang C J, Huang C H, Wang J S, et al. Fast and compact dynamic ripple carry adder design. Proceedings of Proceedings of the IEEE Asia-Pacific Conference on ASIC,2002.25-28.
    [79]Doran R. Variants of an improved carry look-ahead adder. IEEE Transactions on Computers, 1988,37(9):1110-1113.
    [80]Hierarchical Carry Save Algorithm-HCSA Generic ALU. http://www.deversys.com/.
    [81]Narasimhan M, Ramanujam J. A fast approach to computing exact solutions to the resource-constrained scheduling problem. ACM Transactions on Design Automation of Electronic Systems,2001,6(4):490-500.
    [82]Sun H. Throughput Constrained and Area Optimized Dataflow Synthesis for FPGAs PhD dissertation[D]. England:Department of Electrical and Computer Engineering, Brigham Young University,2008.
    [83]Wang G, Gong W, Derenzi B, et al. Exploring time/resource trade-offs by solving dual scheduling problems with the ant colony optimization. ACM Transactions on Design Au-tomation of Electronic Systems,2007,12(4):46-65.
    [84]Sllame A M. Design Space Exploration Methodology for High-Performance System-on-a-Chip Hardware Cores. Proceedings of Proceedings of the 3rd IEEE International Workshop on System-on-Chip for Real-Time Applications,2003.216-221.
    [85]Topcuoglu H, Hariri S, Wu M. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transaction on Parallel and Distributed Systems,2002, 13(3):260-274.
    [86]Mallon D J, Denyer P B. A new approach to pipeline optimisation. Proceedings of Proceed-ing of European Design Automation Conference,1990.83-88.
    [87]Gebotys C H. Synthesizing embedded speed-optimized architectures. IEEE Journal of Solid-state Circuits,1993,28(3):242-252.
    [88]Ramanathan S, Visvanathan V, Nandy S. Synthesis of ASIPs for DSP algorithms. The VLSI journal,1999,28:13-32.
    [89]黄少滨.EDA中高层次综合算法及两种时序综合理论比较研究[博士学位论文].哈尔滨:哈尔滨工程大学,2004.
    [90]Chatha K S, Vemuri R. Hardware-Software Partitioning and Pipelined Scheduling of Trans-formative Applications. IEEE Transactions on Very Large Scale Integration Systems,2002, 10(3):193-208.
    [91]Lee T F, Wu A C H, Gajski D D, et al. An effective methodology for functional pipelining. Proceedings of Proceedings of the IEEE/ACM International Conference on Computer-Aided Design,1992.230-233.
    [92]Ciletti M D. Advanced Digital Design with the Verilog HDL. New Jersey:Prentice Hall, 2005.
    [93]Girczyc E F. Loop winding-a data flow approach to functional pipelining. Proceedings of Proceedings of the International Symposium on Circuits and Systems,1987.382-385.
    [94]Park N, Kurdahi F J. Register-transfer synthesis of pipelined data paths. VLSI Design,1994, 2(1):17-32.
    [95]Hwang K, Briggs F A. Computer Architecture and Parallel Processing. New York:McGraw-Hill,1986.
    [96]Rubinfield L. A Proof of the Modified Booth's Algorithm for Multiplication. IEEE Trans-actions on Computers,1975, (10):1014-1015.
    [97]The elements of GigE Vision. Basler AG, http://www.basler-vc.com.,1999,28:13-32.
    [98]MVC930DAM-GE30 CCD数字摄像机硬件使用说明书V5.0b.微视图像,2007.
    [99]ADSP-BF561 Blackfin Processor Hardware Reference. Analog Devices Inc.,2007.
    [100]Nios Ⅱ processor reference handbook Version8.0. Altera Corporation,2008.
    [101]Gonzalez R C, Woods R E. Digital Image Processing.2nd ed., Bostion:Addison-Wesley Longman Publishing Co. Inc,1992.
    [102]Fries R, Modestino J. Image enhancement by stochastic homomorphic filtering. IEEE Transactions on Acoustics, Speech and Signal Processing,1979,27(6):625-637.
    [103]CycloneⅡ Device Handbook. Altera Corporation,2007.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700