Research on Buffer Management and High-Speed Interconnect Technology for MPSoC On-Chip Interconnection Networks
Abstract
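The rapid development of social life and military technology places ever higher demands on high-performance embedded computing. Rapid advances in VLSI technology have steadily raised the integration density of systems-on-chip, so that microprocessors, memories, I/O devices, and a growing number of other hardware units can be integrated on a single chip. Driven by application demand and by VLSI technology, the Multiprocessor System-on-Chip (MPSoC) has become a main research topic in high-performance embedded computing. As MPSoC technology develops, the number of units integrated on a single chip keeps growing, as does their performance, which makes communication architecture design the dominant factor limiting system area, performance, and power. Network-on-Chip (NoC) technology offers MPSoC a better interconnect solution: compared with traditional on-chip communication schemes, an NoC provides better predictability, lower power consumption, and better scalability. Studying the core theoretical and design problems of on-chip multiprocessor interconnection networks can provide a sound theoretical and technical foundation for designing and implementing future high-performance embedded multi-core processor chips, and therefore has both theoretical significance and practical value.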
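     Building on a description and classification of on-chip multiprocessor interconnection network techniques, this dissertation studies in depth the allocation, management, and use of buffers in such networks, including application-oriented buffer allocation strategies and techniques for the dynamic use and management of buffers in router nodes. Based on the design and implementation of the main functional units, an RTL-level interconnection network simulation platform is built, a prototype system is implemented on an FPGA, and performance analysis and design-space exploration are carried out for the relevant design parameters. Finally, for the in-house heterogeneous multi-core system YHFT-QDSP, inter-chip high-speed interconnect expansion is studied and implemented. The main innovations and research results are as follows: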
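     1) To address the severe resource constraints in on-chip multiprocessor interconnection networks, an NoC buffer allocation method based on a queuing model is proposed. The router buffer allocation problem in on-chip networks is characterized and formally described, an analytical router model based on the M/M/1 queuing system is established, the relevant parameters are extracted, and the procedure for solving the target parameters is given. Using this model, a buffer allocation algorithm oriented to the data load of the application mapping is implemented; it customizes the allocation of buffer resources to the traffic characteristics of different application mappings. System buffer resources are used efficiently: compared with the traditional uniform buffer allocation strategy, about 50% of the buffer usage can be saved while performance changes little.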
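     2) The behavior and shortcomings of the static multi-channel structure are analyzed, and on that basis an output-oriented multi-channel router structure with dynamic buffers, OOMCR-DBU, is proposed. The structure manages dynamic buffer resources with linked lists and uses a threshold-controlled resource reservation technique to mitigate the congestion interference that arises when network congestion leaves dynamic buffer resources uselessly occupied. Router nodes with two different parameter settings are designed and implemented in VLSI. Experimental results show that the method dynamically adjusts the virtual channel organization under different network traffic loads and improves network performance, alleviating the low buffer utilization and frequent congestion of on-chip routers. The threshold-controlled resource reservation strategy also effectively avoids congestion interference between virtual channels.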
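     3) A general on-chip network performance analysis model is proposed for system performance analysis. An RTL-level software simulation environment and an FPGA-based hardware emulation platform are built. An on-chip interconnection network is constructed from the proposed router structure with dynamically allocated virtual channels, and, with network latency and throughput as evaluation functions, network performance is analyzed against design parameters including network size, packet length, buffer capacity, number of virtual channels, and routing algorithm. Experiments show that, with the simulation and emulation environment and the performance analysis method, parameter configurations can be selected for different design goals and constraints to obtain good design results.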
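     4) For the heterogeneous multi-core embedded system YHFT-QDSP, a high-speed interconnect method for on-chip multi-core systems based on PCI Express is proposed. The technical features of PCI Express and its use in China and abroad are analyzed, and, in view of the hierarchical interconnect structure of YHFT-QDSP, the on-chip/off-chip protocol conversion and routing module QPB is designed and implemented. A rapid design method based on IP reuse and tailoring applies PCI Express high-speed interconnect to YHFT-QDSP, realizing peer-to-peer connection of PCI Express master and slave modes, shortening the design cycle, and achieving high-speed off-chip expansion interconnect for YHFT-QDSP.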
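Contribution 2) describes managing a router's dynamic buffer resources with linked lists and reserving resources under a threshold. The Python sketch below illustrates that general idea only; it is not the OOMCR-DBU design, whose details the abstract does not give. The class name `SharedBufferPool`, its methods, and in particular the reservation rule (once the free-slot count drops to `reserve` or below, only virtual channels that currently hold no slot may allocate) are assumptions made for illustration.

```python
from typing import Any, List, Optional

NIL = -1  # sentinel marking the end of a linked list


class SharedBufferPool:
    """Hypothetical shared flit buffer: one slot pool, one linked list per virtual channel."""

    def __init__(self, num_slots: int, num_vcs: int, reserve: int) -> None:
        if num_slots < 1 or num_vcs < 1 or reserve < 0:
            raise ValueError("num_slots and num_vcs must be >= 1, reserve >= 0")
        self.data: List[Any] = [None] * num_slots
        # All slots start on the free list, chained through next_slot.
        self.next_slot: List[int] = list(range(1, num_slots)) + [NIL]
        self.free_head = 0
        self.free_count = num_slots
        self.head = [NIL] * num_vcs   # oldest flit of each virtual channel
        self.tail = [NIL] * num_vcs   # newest flit of each virtual channel
        self.count = [0] * num_vcs    # slots currently held by each virtual channel
        self.reserve = reserve        # assumed threshold for the reservation rule

    def can_accept(self, vc: int) -> bool:
        if self.free_count == 0:
            return False
        # Assumed policy: the last `reserve` slots are kept for channels that
        # hold nothing, so one congested channel cannot take the whole pool.
        if self.free_count <= self.reserve and self.count[vc] > 0:
            return False
        return True

    def enqueue(self, vc: int, flit: Any) -> bool:
        if not self.can_accept(vc):
            return False
        slot = self.free_head                 # pop a slot from the free list
        self.free_head = self.next_slot[slot]
        self.free_count -= 1
        self.data[slot] = flit
        self.next_slot[slot] = NIL
        if self.tail[vc] == NIL:              # channel was empty
            self.head[vc] = slot
        else:                                 # append behind the current tail
            self.next_slot[self.tail[vc]] = slot
        self.tail[vc] = slot
        self.count[vc] += 1
        return True

    def dequeue(self, vc: int) -> Optional[Any]:
        slot = self.head[vc]
        if slot == NIL:
            return None
        flit = self.data[slot]
        self.head[vc] = self.next_slot[slot]
        if self.head[vc] == NIL:
            self.tail[vc] = NIL
        self.count[vc] -= 1
        self.data[slot] = None
        self.next_slot[slot] = self.free_head  # push the slot back on the free list
        self.free_head = slot
        self.free_count += 1
        return flit
```

With `SharedBufferPool(num_slots=4, num_vcs=2, reserve=1)`, channel 0 can take three slots, after which the last slot is refused to channel 0 but granted to the idle channel 1. Flits leave each channel in FIFO order, and every freed slot returns to the shared pool for any channel to reuse.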
With the rapid development of social life and military technology, higher requirements have been placed on high-performance embedded computing. Driven by advances in VLSI technology, the integration density of Systems-on-Chip keeps increasing: microprocessors, memory, I/O devices, and a growing number of other hardware units can be integrated on a single chip. Driven by application requirements and VLSI technology, the Multiprocessor System-on-Chip (MPSoC) has become a major research area in high-performance embedded computing. As MPSoC technology develops, the number of components on a single chip and their performance continue to increase, so the design of the communication architecture plays a major role in determining the area, performance, and energy consumption of the overall system. The Network-on-Chip (NoC) approach was proposed as a better solution for MPSoC interconnection; it offers better predictability, lower power consumption, and greater scalability than classical on-chip communication solutions. Studying the theory and design problems of on-chip interconnection networks in MPSoC is of great theoretical and practical significance and will provide a theoretical and technical foundation for the design and implementation of future high-performance embedded multi-core systems.
     This dissertation presents an in-depth study of buffer allocation, management, and usage, including application-specific buffer allocation and the dynamic use and management of router buffers. This work builds on a description, classification, and discussion of the relevant issues in networks-on-chip. An RTL-level NoC simulation and emulation platform is built on the design and implementation of the major functional units, and performance analysis and design exploration of several technical parameters are carried out on this platform. Finally, high-speed inter-chip interconnect technology is studied for the expansion of YHFT-QDSP, an independently developed heterogeneous multi-core system. The main contributions are as follows.
     1) A buffer allocation approach based on a queuing model is proposed to address the severe resource constraints in NoCs. Buffer allocation in NoC router design is characterized and formally described. We establish an analytical router model based on the M/M/1 queuing system, extract the relevant parameters, and give the procedure for solving the objective function. An application-specific buffer allocation algorithm is implemented using the analytical model; it customizes buffer resource allocation to the traffic pattern of each application mapping. Compared with the traditional uniform buffer allocation strategy, about 50% of the buffer resources can be saved while performance changes little, so the buffer resources in the system are used efficiently.
     2) A dynamic buffer allocation scheme, OOMCR-DBU, is proposed to address low buffer utilization and reduce congestion, based on an analysis of the behavior of the static virtual channel structure. A dynamic virtual channel architecture is presented using this scheme, and the VLSI implementation of a router with dynamic virtual channels is completed. The router adjusts its channel organization according to the traffic pattern, increasing throughput and reducing latency with a clear saving in silicon area and power consumption.
     3) An RTL-level software simulation environment and an FPGA-based hardware emulation platform are presented. We build a network-on-chip system from the proposed dynamic virtual channel router. With latency and throughput as the evaluation functions, performance is analyzed against design parameters such as network size, packet length, buffer size, number of virtual channels, and routing strategy. Using the simulation and emulation platforms and the performance analysis approach, a suitable parameter configuration can be selected to achieve the design objective under different constraints.
     4) A high-speed interconnect scheme based on PCI Express is proposed for YHFT-QDSP, an embedded heterogeneous multi-core system. The technical features and applications of PCI Express are analyzed. QLink-PCIE-Bridge, the protocol conversion and routing module, is implemented to suit the hierarchical interconnection architecture of YHFT-QDSP. PCI Express is applied to YHFT-QDSP through IP reuse and tailoring, achieving high-speed inter-chip expansion of YHFT-QDSP and shortening the design cycle.
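Contribution 1) allocates router buffers using an analytical model built on the M/M/1 queuing system. The abstract does not state the objective function or the algorithm, so the Python sketch below is only an illustration under stated assumptions: each input channel is modeled as a finite M/M/1/K queue, the cost of a channel is its arrival rate times the probability that its queue is full, and a fixed slot budget is handed out greedily. The function names `blocking_probability` and `allocate_buffers` and the `min_slots` parameter are hypothetical.

```python
import heapq
from typing import List, Sequence


def blocking_probability(rho: float, k: int) -> float:
    """Probability that an M/M/1/K queue with load rho = lambda/mu is full (capacity k)."""
    if k < 0 or rho < 0:
        raise ValueError("rho and k must be non-negative")
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (k + 1)          # limit of the general formula at rho = 1
    return (1.0 - rho) * rho ** k / (1.0 - rho ** (k + 1))


def allocate_buffers(arrival_rates: Sequence[float], service_rate: float,
                     total_slots: int, min_slots: int = 1) -> List[int]:
    """Greedily split total_slots among channels to cut the assumed blocking cost."""
    n = len(arrival_rates)
    if total_slots < n * min_slots:
        raise ValueError("budget too small for the minimum allocation")

    def cost(i: int, k: int) -> float:
        # Assumed cost: rate of arrivals that find channel i's queue full.
        rho = arrival_rates[i] / service_rate
        return arrival_rates[i] * blocking_probability(rho, k)

    def gain(i: int, k: int) -> float:
        # Cost reduction from growing channel i from k to k + 1 slots.
        return cost(i, k) - cost(i, k + 1)

    alloc = [min_slots] * n
    # heapq is a min-heap, so gains are stored negated to pop the largest first.
    heap = [(-gain(i, min_slots), i) for i in range(n)]
    heapq.heapify(heap)
    for _ in range(total_slots - n * min_slots):
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        heapq.heappush(heap, (-gain(i, alloc[i]), i))
    return alloc
```

Calling `allocate_buffers([0.1, 0.5, 0.9], service_rate=1.0, total_slots=12)` gives heavily loaded channels more slots than lightly loaded ones, in contrast to a uniform split of four slots each. The greedy step is a simple stand-in for whatever solution procedure the dissertation actually uses.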
    [158] Ho W H, Pinkston T M. A Methodology for Designing Efficient On-ChipInterconnects on Well-Behaved Communication Patterns [C]//Proceedings of the9thInternational Symposium on High-Performance Computer Architecture. Anaheim,California: IEEE,2003:377~388.
    [159] Ogras U Y, Marculescu R. Energy-and Performance-Driven NoCCommunication Architecture Synthesis Using a Decomposition Approach [C]//Proceedings of the conference on Design, Automation and Test in Europe. Munich,Germany: IEEE Computer Society,2005:352~357.
    [160] Pinto A, Carloni L P, Sangiovanni-Vincentelli A L. Efficient Synthesis ofNetworks On Chip [C]//Proceedings of the21st International Conference on ComputerDesign. San Jose, CA: IEEE Computer Society,2003:146~150.
    [161] Srinivasan K, Chatha K S, Konjevod G. Linear-programming-basedtechniques for synthesis of network-on-chip architectures [J]. IEEE Transactions onVery Large Scale Integration (VLSI) Systems,2006,14(4):407~420.
    [162] Ogras U Y, Marculescu R."It's a small world after all": noc performanceoptimization via long-range link insertion [J]. IEEE Transactions on Very Large ScaleIntegration (VLSI) Systems,2006,14(7):693~706.
    [163] Chang M F, Cong J, Kaplan A, et al. CMP network-on-chip overlaid withmulti-band RF-interconnect [C]//Proceedings of14th International Symposium on HighPerformance Computer Architecture. Salt Lake City, UT: IEEE,2008:191~202.
    [164] Peh L S, Dally W J. Flit-reservation flow control [C]//Proceedings of6thInternational Symposium on High Performance Computer Architecture. Toulouse,France: IEEE,2000:73~84.
    [165] Kim J, Nicopoulos C, Park D, et al. A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks [C]//Proceedings of the33rd annual international symposium on Computer Architecture. Boston, MA: IEEEComputer Society,2006:4~15.
    [166] Saastamoinen I, Alho M, Nurmi J. Buffer implementation for Proteonetwork-on-chip [C]//Proceedings of the2003International Symposium on Circuits andSystems. Bangkok, Thailand: IEEE,2003: II-113~II-116.
    [167] Hansson A, Coenen M, Goossens K. Undisrupted quality-of-service duringreconfiguration of multiple applications in networks on chip [C]//Proceedings of theconference on Design, automation and test in Europe. Nice, France: EDA Consortium,2007:954~959.
    [168] Lin T, Pileggi L T. Throughput-driven IC communication fabric synthesis[C]//Proceedings of the2002IEEE/ACM international conference on Computer-aideddesign. San Jose, California: ACM,2002:274~279.
    [169] Kim B, Stojanovic V. Equalized interconnects for on-chip networks:modeling and optimization framework [C]//Proceedings of the2007IEEE/ACMinternational conference on Computer-aided design. San Jose, CA: IEEE Press,2007:552~559.
    [170] Nigussie E, Lehtonen T, Tuuna S, et al. High-Performance Long NoC LinkUsing Delay-Insensitive Current-Mode Signaling [J]. Hindawi VLSI Design,2007.
    [171] Jose A P, Patounakis G, Shepard K L. Near speed-of-light on-chipinterconnects using pulsed current-mode signaling [C]//Proceedings of Symposium onVLSI Circuits. IEEE,2005:108~111.
    [172] Mak T S T, Sedcole P, Cheung P Y K, et al. A Hybrid Analog-DigitalRouting Network for NoC Dynamic Routing [C]//Proceedings of the First InternationalSymposium on Networks-on-Chip. Princeton: IEEE Computer Society,2007:173~182.
    [173] Zhao D, Wang Y. SD-MAC: Design and Synthesis of a Hardware-EfficientCollision-Free QoS-Aware MAC Protocol for Wireless Network-on-Chip [J]. IEEETransactions on Computers,2008,57(9):1230~1245.
    [174] Pamunuwa D, Oberg J, Zheng L R, et al. Layout, Performance and PowerTrade-Offs in Mesh-Based Network-on-Chip Architectures [C]//Proceedings of the12thIFIP International Conference on Very Large Scale Integration. Darsmstadt,2003:362.
    [175] Angiolini F, Meloni P, Carta S, et al. Contrasting a NoC and a traditionalinterconnect fabric with layout awareness [C]//Proceedings of the conference on Design,automation and test in Europe. Munich, Germany: European Design and AutomationAssociation,2006:124~129.
    [176] Ye T T, De Micheli G. Physical Planning for On-Chip MultiprocessorNetworks and Switch Fabrics [C]//Proceedings of International Conference onApplication-Specific Systems, Architecture, and Processor.2003:97~103.
    [177] Srinivasan K, Chatha K S. A low complexity heuristic for design of customnetwork-on-chip architectures [C]//Proceedings of the conference on Design,automation and test in Europe. Munich: European Design and Automation Association,2006:130~135.
    [178] Kumar R, Zyuban V, Tullsen D M. Interconnections in Multi-CoreArchitectures: Understanding Mechanisms, Overheads and Scaling [C]//Proceedings ofthe32nd annual international symposium on Computer Architecture. Madison: IEEEComputer Society,2005:408~419.
    [179] Carloni L P, McMillan K L, sangiovanni-vincetelli A L. Theory of Latency-Insensitive Design [J]. IEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems,2001,20(9):1059~1076.
    [180] Bainbridge W J, Furber S B. Delay Insensitive System-on-Chip Interconnectusing1-of-4Data Encoding [C]//Proceedings of the7th International Symposium onAsynchronous Circuits and Systems. Salt Lake City: IEEE,2001:118~126.
    [181] Sheibanyrad A, Panades I M, Greiner A. Systematic comparison between theasynchronous and the multi-synchronous implementations of a network on chiparchitecture [C]//Proceedings of the conference on Design, automation and test inEurope. Nice, France: EDA Consortium,2007:1090~1095.
    [182] Bjerregaard T, Stensgaard M B, Sparso J. A scalable, timing-safe,network-on-chip architecture with an integrated clock distribution method [C]//Proceedings of the conference on Design, automation and test in Europe. Nice, France:EDA Consortium,2007:648~653.
    [183] Shibayama A, Nose K, Torii S, et al. Skew-Tolerant Global SynchronizationBased on Periodically All-in-Phase Clocking for Multi-Core SOC Platforms [C]//Proceedings of IEEE Symposium on VLSI Circuits. Kyoto: IEEE,2007:158~159.
    [184] Cortadella J, Kishinevsky M, Grundmann B. Synthesis of synchronouselastic architectures [C]//Proceedings of the43rd annual Design AutomationConference. San Francisco: ACM,2006:657~662.
    [185] Yu Z Y, Baas B M. Implementing Tile-based Chip Multiprocessors withGALS Clocking Styles [C]//Proceedings of The IEEE International Conference ofComputer Design. San Jose: IEEE,2006:174~179.
    [186] Chelcea T, Nowick S M. Robust interfaces for mixed-timing systems withapplication to latency-insensitive protocols [C]//Proceedings of the38th annual DesignAutomation Conference. Las Vegas: ACM,2001:21~26.
    [187] Campobello G, Castano M, Ciofi C, et al. GALS networks on chip: a newsolution for asynchronous delay-insensitive links [C]//Proceedings of the conference onDesign, automation and test in Europe. Munich, Germany: European Design andAutomation Association,2006:160~165.
    [188] Madsen J, Mahadevan S, Virk K, et al. Network-on-Chip Modeling forSystem-Level Multiprocessor Simulation [C]//Proceedings of the24th IEEEInternational Real-Time Systems Symposium. Cancun, Mexico: IEEE ComputerSociety,2003:265~274.
    [189] Wang H S, Zhu X P, Peh L S, et al. Orion: a power-performance simulatorfor interconnection networks [C]//Proceedings of the35th annual ACM/IEEEinternational symposium on Microarchitecture. Istanbul, Turkey: IEEE ComputerSociety Press,2002:294~305.
    [190] Ogras U Y, Marculescu R. Analytical Router Modeling forNetworks-on-Chip Performance Analysis [C]//Proceedings of the conference on Design,automation and test in Europe. Nice, France: IEEE,2007:1096~1101.
    [191] Banerjee N, Vellank P, Chatha K S. A power and performance model fornetwork-on-chip architectures [C]//Proceedings of the conference on Design,automation and test in Europe.2004:1250~1255.
    [192] Chan J, Parameswaran S. NoCEE: Energy macro-model extractionmethodology for network on chip routers [C]//Proceedings of the2005IEEE/ACMInternational conference on Computer-aided design. San Jose: IEEE,2005:254~259.
    [193] Chen X, Peh L S. Leakage power modeling and optimization ininterconnection networks [C]//Proceedings of the2003international symposium onLow power electronics and design. Seoul, Korea: ACM,2003:90~95.
    [194] Eisley N, Peh L S. High-level power analysis for on-chip networks [C]//Proceedings of the2004international conference on Compilers, architecture, andsynthesis for embedded systems. ACM,2004:104~115.
    [195] Kim J S, Taylor M B, Miller J, et al. Energy characterization of a tiledarchitecture processor with on-chip networks [C]//Proceedings of the2003internationalsymposium on Low power electronics and design. Seoul: ACM,2003:424~427.
    [196] Kogel T, Doerper M, Wieferink A, et al. A modular simulation frameworkfor architectural exploration of on-chip interconnection networks [C]//Proceedings ofthe1st IEEE/ACM/IFIP international conference on Hardware/software codesign andsystem synthesis. Newport Beach, CA: ACM,2003:7~12.
    [197] Dally W J. Performance analysis of k-ary n-cube interconnection networks[J].IEEE Trans. on Computers,1990:39(6).
    [198] Ould-Khaoua M, Sarbazi-Azad H.An analytical model of adaptive wormholerouting in hypercubes in the presence of hot spot traffic[J]. IEEE Trans on Parallel andDistributed Systems,2001:12(3).
    [199] Guan W, Tsai W, Blough D. An analytical model for wormhole routing inmulticomputer interconnection networks[C]//Proceedings of International Symposim onParallel Processing,1993.
    [200] Adve V S.Performance analysis of mesh interconnection networks withdeterministic routing[J]. IEEE Transaction on Parallel and Distributed Systems,1994,5(3):225~246.
    [201]朱红雷.基于动态缓冲管理的片上网络体系结构研究[D].长沙:国防科学技术大学,2010.
    [202] Hu P, Kleinrock L. An analytical model for wormhole routing with finitesize input buffers[C]//15th Intl. Teletraffic Congress,1997.
    [203] Guz Z, et al. Efficient link capacity and QoS design for wormholenetwork-on-chip[C]//Proceedings of the conference on Design, automation and test inEurope,2006.
    [204] Bertsekas D, Gallager R. Data Networks[M]. Prentice Hall,1992.
    [205] Park J, et al. Design and Evaluation of a DAMQ Multiprocessor NetworkWith Self-Compacting Buffers[C]//ACM/IEEE Conf. on Supercomputing,1994.
    [206] Evripidou M, Nicopoulos C,et al. Virtualizing Virtual Channels for IncreasedNetwork-on-Chip Robustness and Upgradeability[C]//IEEE Computer Society AnnualSymposium on VLSI,2012:21~26.
    [207] Becker D U, et al. Adaptive Backpressure: Efficient Buffer Management forOn-Chip Networks[C]//In Proceedings of the30th IEEE International Conference onComputer Design,2012.