Analysis and Optimization of Network-on-Chip Communication Performance (片上网络通信性能分析与优化)

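English title: Analyzing and Optimizing Methods of NoC Communication Performance
Author: Wang Jian (王坚)
Degree level: Doctoral
Discipline: Communication and Information Systems
Chinese keywords: 片上网络 ; 分析建模 ; 缓存分配 ; 带宽优化 ; 映射
English keywords: Network-on-Chip ; analyzing and modeling ; buffer allocation ; bandwidth optimization ; mapping
Degree year: 2011
Advisor: Peng Qicong (彭启琮)
Discipline code: 081001
Degree-granting institution: University of Electronic Science and Technology of China (电子科技大学)
Thesis submission date: 2011-10-01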

Abstract

Network-on-Chip (NoC) technology has become an active research topic worldwide. It addresses the communication bottleneck among on-chip processors, improves the performance of chip multiprocessor systems, and supports the development of high-performance computers. To make NoC design more efficient and to reduce development time and cost, a method that evaluates NoC performance quickly and accurately is valuable. Moreover, because of the particular nature of the on-chip communication environment, an NoC faces strict limits on system resources, area, and power consumption, so methods for optimizing NoC performance under such constraints are needed.
     This thesis explores the analysis and optimization of NoC communication performance in the following respects:
     1. Starting from the system and application characteristics of NoCs, the thesis addresses the analytical modeling of NoC communication performance. For wormhole-switched NoCs, a semi-Markov process describes the states of a router, and router performance and NoC communication performance are analyzed on that basis. For store-and-forward and virtual cut-through NoCs, each input buffer of a router is abstracted as a queuing system. Analyzing these queuing systems yields the average packet latency through each router, from which the communication performance of the whole NoC is derived.
     2. Under a buffer-resource constraint, new buffer optimization methods are proposed to improve NoC communication performance. For router buffers, a buffer allocation method improves NoC performance without increasing the total buffer cost. For network-interface buffers, a buffer sizing method determines the proper buffer size, maintaining data quality of service while avoiding wasted buffer resources.
     3. Under a bandwidth-resource constraint, the thesis studies the influence of link bandwidth on NoC performance and simulates NoCs with non-uniform link bandwidth in order to refine the analytical performance model. On this basis, a link bandwidth optimization algorithm is proposed that determines the bandwidth of every link so as to reduce the system's bandwidth demand, which provides a theoretical basis for optimizing NoC area and power consumption.
     4. The thesis analyzes the on-chip resource requirements of a specific application under different mapping schemes and proposes a constrained mapping algorithm based on ant colony optimization (ACO) to reduce NoC cost and improve NoC performance. The feasibility of the algorithm is then verified under latency, buffer-resource, and bandwidth-resource constraints, respectively.
     5. Common NoC scheduling strategies are summarized and their shortcomings analyzed. On this basis, and taking the characteristics of NoCs into account, a dynamic scheduling strategy and a corresponding arbiter design are proposed to solve the "starvation" problem in NoC scheduling. Compared with earlier scheduling algorithms, the dynamic-arbitration router improves NoC communication performance to some extent and reduces the buffer requirement at the network interface.
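The queuing abstraction in contribution 1 can be illustrated with a minimal sketch. The abstract does not say which queue model the thesis uses, so the M/M/1 queue below is an assumption chosen only for its simple closed form, and the function names `mm1_sojourn_time`, `path_latency`, and `average_network_latency` are hypothetical. The sketch treats each router input buffer on a route as an independent queue, sums the per-buffer delays to get a path latency, and averages over flows weighted by injection rate.

```python
def mm1_sojourn_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time a packet spends in an M/M/1 queue (waiting plus service).

    Assumption: Poisson arrivals and exponential service, which the
    thesis abstract does not confirm.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, service rate positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)


def path_latency(hops):
    """Sum the per-buffer delays along one route.

    `hops` is a list of (arrival_rate, service_rate) pairs, one per
    router input buffer the packet traverses.
    """
    return sum(mm1_sojourn_time(lam, mu) for lam, mu in hops)


def average_network_latency(flows):
    """Traffic-weighted mean latency over all flows.

    `flows` is a list of (injection_rate, hops) pairs.
    """
    total_rate = sum(rate for rate, _ in flows)
    if total_rate <= 0:
        raise ValueError("total injection rate must be positive")
    weighted = sum(rate * path_latency(hops) for rate, hops in flows)
    return weighted / total_rate


if __name__ == "__main__":
    # Two illustrative flows with made-up rates (packets per cycle).
    flows = [
        (1.0, [(0.5, 1.0)]),   # one hop, lightly loaded buffer
        (3.0, [(0.75, 1.0)]),  # one hop, more heavily loaded buffer
    ]
    print(average_network_latency(flows))
```

Treating the buffers as independent queues ignores blocking between neighboring routers, so a sketch like this is at best a first-order estimate of the kind of model the abstract describes.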

