Research on Interconnection Network Architecture and Performance Analysis for Large-Scale Manycore Microprocessors
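Abstract
High-performance processors based on multi-core and even manycore designs are the enabling technology for future exascale high-performance computers. An interconnection network with high bandwidth, low latency, low power, and strong scalability is essential for unlocking the parallel computing power of the processor cores and improving manycore processor performance. Among the design challenges of manycore systems, interconnect communication is increasingly becoming the bottleneck that limits system performance. Emerging 3D integration technology and silicon photonic devices offer distinct advantages in chip functionality, integration density, and power consumption. As these technologies and devices mature, they bring new opportunities for resolving the interconnect bottleneck of manycore systems.
     Starting from the interconnect bottleneck of manycore systems, this thesis explores innovative architectures for manycore microprocessor interconnection networks and uses network calculus theory to model and analyze them. The research covers four main topics: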
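     (1) On-chip inter-core interconnection network architecture for manycore systems
     Messages exchanged between cores are mostly control messages, which have extremely strict real-time requirements. As the number of compute cores grows, transmission latency becomes the primary factor limiting the performance of inter-core interconnection networks in large-scale manycore processors. Simple low-dimensional on-chip networks, typified by the mesh, are easy to route, but their hop count grows in proportion to the system's node count, so they struggle to meet the low-latency requirements of large-scale manycore chips. Using 3D integration technology, this thesis proposes a three-dimensional flattened butterfly topology for electrical inter-core message transfer in large-scale manycore processors. With an integer linear programming model, we overcome the routing challenges posed by the high-radix routers and long interconnect wires of the butterfly network and successfully embed the flattened butterfly network into a 3D stack. The flattened butterfly is a high-dimensional, highly scalable topology that is especially well suited to interconnecting large numbers of compute cores. The 3D butterfly network preserves mesh connectivity while adding extra shortcut links, and it exploits fast vertical interconnects to deliver inter-core messages quickly. Experimental results show that the 3D butterfly network effectively reduces inter-core interconnect latency and significantly improves manycore processor performance.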
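     (2) Optical memory access network architecture for manycore microprocessors
     The memory access interconnect is critical to a manycore processor: if data cannot be read and written quickly, the processor's parallel computing power is hard to exploit. As more and more processor cores are integrated on a single chip, the demand for memory access bandwidth grows sharply. The traditional processor-to-memory interconnect based on electrical I/O pins runs into difficulty in large-scale manycore chips, because electrical interconnects can hardly provide enough memory bandwidth for the on-chip cores within a strict power budget. Using emerging silicon photonic devices and 3D integration technology, we propose a high-bandwidth, low-power optical memory access network for communication between a manycore processor and DRAM. This memory access network, based on an optical burst switching protocol, replaces electrical I/O pins with optical interconnect interfaces and provides a seamless high-bandwidth connection between the manycore processor and memory. Besides its bandwidth advantage, the new scheme uses wavelength resources far more efficiently than earlier optical memory access networks, which further improves the power efficiency of memory access communication. Experimental results show that the memory access network based on optical burst switching improves power efficiency by nearly 2x over an optical circuit-switched memory access network and by 6x over the electrical interface scheme.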
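     (3) A congestion-avoidance scheme for the electrical control plane in chip-scale optical networks
     Because optical buffers and optical logic devices are not available, most hybrid optical-electrical networks use an electrical control plane for resource arbitration and link control. In our study of chip-scale optical burst switching networks, we found that the large number of fine-grained optical bursts, the strict transmission latency limits, and the moderate network operating frequency constrain the processing capacity of the electrical control plane and can easily cause severe network congestion. We therefore propose a traffic shaping scheme to solve the congestion problem in the electrical control plane. Before injection into the network, all flows in the system are globally coordinated and shaped so that the aggregate rate of control messages at any intermediate node does not exceed that node's maximum processing capacity, which relieves control-plane congestion. We use an optimization algorithm to select the parameters of each flow's shaper (for example, the flow rate and the burstiness parameter). To some extent, this congestion control scheme reserves resources for the end-to-end transmission of each flow and provides a basic quality-of-service guarantee in terms of bandwidth, which effectively mitigates the optical burst losses caused by control-plane congestion. Experiments with synthetic traffic and real application traces show that the new method effectively avoids control-plane congestion, lowers the burst loss rate, and improves the system performance of chip-scale optical burst switching networks.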
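     (4) Performance analysis of chip-scale optical interconnection networks
     The design of a chip-scale optical interconnection network must balance many factors, including network latency, throughput, energy consumption, and silicon area. The choice of these system-level interconnect parameters directly affects the performance of the whole chip, so performance analysis of the on-chip network is important for system design. To this end, we carried out analytical modeling of chip-scale optical networks. Using stochastic network calculus, we built a model of the buffer resource requirements of an optical burst switching network and a model for estimating the wavelength resources required by the optical devices. Simulation experiments and numerical analysis show that the bounds computed by these analytical models are quite tight. With these stochastic network calculus models, we can quickly evaluate system-level design parameters of the optical interconnection network in a manycore system, such as buffer resource requirements, transmission latency, and optical device resource requirements. Modeling and analyzing network performance early in the design also reduces design risk ahead of time. Overall, our analytical models capture the relationship between system performance, network load, and architecture, which helps to quickly identify the key factors and design bottlenecks that affect performance and speeds the convergence of the design space.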
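     In summary, this thesis studies the interconnect bottleneck of manycore systems, proposes new network architectures, and builds analytical models and performance analyses of these architectures based on network calculus theory. The work ties theory closely to practice, provides new solutions to the interconnect bottleneck of manycore processors, contributes to the advancement of high-performance processor technology, and further extends the application field of network calculus theory.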
High-performance multi-core, or even manycore, processors are the enabling technology for the future exascale computing era. To efficiently exploit the unprecedented parallelism of these cores and further boost the throughput of manycore systems, it is important to provide a high-bandwidth, low-latency, low-power, and highly scalable chip-scale interconnection infrastructure. Recently, the challenge of manycore processor design has gradually shifted from logic design to interconnects; on-chip inter-core communication and processor-to-memory interconnects have become the bottleneck for system improvement. Advances in 3D integration technology and silicon photonic devices provide new opportunities for manycore interconnect design.
     In this thesis, aiming at the manycore interconnect design challenges, we propose new interconnection network solutions for both inter-core and processor-to-memory communication by exploiting the advantages of 3D integration and silicon photonics. We also develop analytical models to study the performance of these new architectures using network calculus. The main contributions are summarized as follows.
     (1) A three-dimensional flattened butterfly network for on-chip inter-core communication
     Recent studies show that inter-core messages place stringent demands on transmission delay, as most of them are small control packets, e.g. cache-coherence messages. Transmission delays get much worse as more cores are integrated, for example, 1000 cores. Although low-radix topologies, e.g. the popular 2D mesh, are easy to place and route, they are unable to meet the latency budget of a large-scale manycore system, because the hop count of a low-radix network grows with the number of cores. Therefore, we propose a three-dimensional flattened butterfly network for inter-core communication in large-scale manycore systems by exploiting the advantages of 3D integration technology. We overcome the routing challenges of area-hungry high-radix routers and long global wires in the flattened butterfly using 3D stacking, and we successfully embed the network into multiple stacked layers by formulating the problem as an integer linear programming model. A three-dimensional flattened butterfly is very efficient for fast inter-core message transfer, because it not only employs express one-hop vertical interconnects but also provides additional links beyond the connectivity of a 2D mesh. Thus, as shown by our simulation results, the new scheme can greatly reduce inter-core message delays and boost the performance of manycore processors.
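     The hop-count argument can be illustrated with a small sketch. It is not the thesis's simulator or its ILP embedding: it assumes a generic flattened butterfly in which every router links directly to all routers that differ from it in exactly one coordinate, ignores core concentration, and counts hops under minimal routing. The function names and the 64-node sizes are chosen here for illustration only.

```python
from itertools import product


def mesh_hops(src, dst):
    """Hops in a mesh under minimal routing: the Manhattan distance."""
    return sum(abs(s - d) for s, d in zip(src, dst))


def flattened_butterfly_hops(src, dst):
    """Hops in a generic flattened butterfly: one hop per differing
    coordinate, assuming full connectivity along every dimension."""
    return sum(1 for s, d in zip(src, dst) if s != d)


def hop_stats(dims, hop_fn):
    """Return (average, maximum) hop count over all ordered node pairs."""
    nodes = list(product(*(range(k) for k in dims)))
    hops = [hop_fn(a, b) for a in nodes for b in nodes if a != b]
    return sum(hops) / len(hops), max(hops)


if __name__ == "__main__":
    # Two illustrative 64-node networks (sizes are assumptions).
    for name, dims, fn in [
        ("8x8 2D mesh", (8, 8), mesh_hops),
        ("4x4x4 flattened butterfly", (4, 4, 4), flattened_butterfly_hops),
    ]:
        avg, worst = hop_stats(dims, fn)
        print(f"{name}: average hops {avg:.2f}, worst case {worst}")
```

     Under these assumptions the worst-case hop count of the flattened butterfly equals its number of dimensions regardless of how many routers sit along each dimension, whereas the mesh diameter keeps growing with the side length.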
     (2) A photonic burst-switched memory access network for large-scale manycore processors
     Processor-to-memory interconnects are vital for manycore systems, since slow memory access limits the performance of the parallel computing cores. Memory bandwidth demand increases proportionally with the number of integrated cores. As projected by the ITRS, traditional electrical I/Os are unable to provide enough bandwidth for large-scale manycore systems under a stringent power budget. Therefore, we propose a high-bandwidth, low-power optical memory access scheme for manycore processor-to-DRAM communication by exploiting the advantages of 3D integration technology and silicon photonic devices. Our photonic burst-switched (PBS) scheme is an adaptation of optical burst switching to chip-scale networks using silicon photonic devices. The PBS network meets the enormous bandwidth demand and stringent energy constraints by using high-speed, low-power, CMOS-compatible photonic devices. Furthermore, it has higher bandwidth utilization than previous wavelength-routed schemes and optical circuit-switched memory access networks because of sub-wavelength optical switching. We examine the system's feasibility and performance using a physically accurate network-level simulation environment. We evaluate the architecture using synthetic traffic patterns and real workload traces. Simulation results show that our scheme achieves considerable energy savings compared with optical circuit-switched memory access networks and traditional electrical I/O schemes.
     (3) A new method to reduce control-plane congestion in chip-scale OBS networks
     In current optical burst switching (OBS) networks, many control-plane operations, such as shared-resource arbitration and link management, are usually performed in the electrical domain because optical buffers and optical logic devices are not available. Due to the random nature of burst arrivals at core nodes, control-plane congestion can occur in an OBS network when the short-term arrival rate of headers at a core node exceeds the maximum rate at which they can be processed. The problem gets even worse in chip-scale OBS, since (1) a chip-scale OBS network is characterized by massive numbers of short bursts (fine-grained control messages, such as memory read/write requests) that have stringent requirements on communication delay; and (2) the operating frequency of a chip-scale OBS network is constrained by thermal limits and a limited power budget, and therefore cannot be very high. These features intensify control-plane congestion. Thus, we propose a new approach that addresses the control-plane congestion problem in chip-scale OBS using traffic regulation. Before being injected, all concurrent control flows are globally regulated and coordinated so that the aggregated flows do not exceed the header processing capacity of any intermediate core node, which alleviates control-plane congestion. In other words, our regulation method provides some end-to-end bandwidth guarantees for each flow, resulting in a significant reduction in burst losses. To select optimal regulator parameters, we formulate the regulation method as an optimization problem. Simulation results with both real application traces and synthetic flows show that our approach can effectively resolve control-plane congestion and achieve considerable performance improvements in terms of network delay and burst loss rate.
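     The regulation constraint, that the aggregate header rate at every core node stays within its processing capacity, can be sketched as follows. This is a simplified illustration, not the thesis's optimization: it swaps in plain proportional scaling of flow rates, and the flow records (route, rate, burst), node names, and capacity value are hypothetical.

```python
from collections import defaultdict


def node_loads(flows):
    """Sum the regulated header rates of all flows crossing each node."""
    loads = defaultdict(float)
    for flow in flows:
        for node in flow["route"]:
            loads[node] += flow["rate"]
    return dict(loads)


def scale_rates(flows, capacity):
    """Scale all flow rates by one common factor so that no node's
    aggregate rate exceeds `capacity`.

    Proportional scaling is a stand-in chosen for illustration; the
    thesis selects regulator parameters by solving an optimization
    problem whose details are not given in the abstract.
    """
    worst = max(node_loads(flows).values(), default=0.0)
    factor = min(1.0, capacity / worst) if worst > 0 else 1.0
    return [dict(flow, rate=flow["rate"] * factor) for flow in flows]


if __name__ == "__main__":
    # Hypothetical flows: rates in headers per cycle, bursts in headers.
    flows = [
        {"route": ["A", "B"], "rate": 0.6, "burst": 2},
        {"route": ["B", "C"], "rate": 0.8, "burst": 1},
    ]
    print("loads before regulation:", node_loads(flows))
    print("loads after regulation: ", node_loads(scale_rates(flows, capacity=1.0)))
```

     A single common factor is the crudest choice that satisfies the constraint; an optimizer can instead slow only the flows that cross the congested node and can tune the burst parameter as well.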
     (4) Resource dimensioning and performance analysis of chip-scale optical networks using stochastic network calculus
     The design of a chip-scale optical network is characterized by challenging trade-offs among latency, throughput, energy consumption, and silicon area. These architectural parameters directly influence system performance. Thus, it is very useful to analyze network performance in the early stages of design so as to avoid bottlenecks and reduce design risks. We therefore develop analytical models to study chip-scale OBS networks. Using stochastic network calculus, we propose an analytical model of the ingress node to dimension the buffer size and calculate end-to-end latency; we also develop a “virtual wavelength buffer” model to estimate the number of wavelengths required for a tolerable burst loss probability. Analytical performance bounds on buffer size and delay are computed and compared with simulations. The simulation results verify that the bounds are tight. Using these stochastic network calculus models, we can quickly evaluate interconnect architecture parameters, including buffer size, transmission delay, and wavelength requirements. Our analytical models capture the relationship between system performance and network architecture, so they are very useful for locating system bottlenecks, resulting in fast convergence of the complex design space.
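     To give a feel for how network calculus turns traffic and service descriptions into buffer and delay bounds, here is a minimal sketch using the textbook deterministic case rather than the thesis's stochastic models. It assumes a token-bucket arrival curve alpha(t) = burst + rate * t and a rate-latency service curve beta(t) = service_rate * max(0, t - latency); all parameter names and values are illustrative.

```python
def delay_bound(burst, rate, service_rate, latency):
    """Worst-case delay for a token-bucket flow served by a rate-latency
    server: latency + burst / service_rate (requires rate <= service_rate)."""
    if rate > service_rate:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return latency + burst / service_rate


def backlog_bound(burst, rate, service_rate, latency):
    """Worst-case backlog (buffer requirement) for the same setting:
    burst + rate * latency (requires rate <= service_rate)."""
    if rate > service_rate:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return burst + rate * latency


if __name__ == "__main__":
    # Illustrative values: bursts in packets, rates in packets per cycle,
    # latency in cycles.
    params = dict(burst=4, rate=0.5, service_rate=1.0, latency=2)
    print("delay bound (cycles):", delay_bound(**params))
    print("buffer bound (packets):", backlog_bound(**params))
```

     These deterministic bounds hold for every admissible arrival pattern. Stochastic network calculus, as used in the thesis, instead allows a bound to be violated with a small probability, which generally yields smaller buffer and wavelength estimates.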
     In summary, we investigate the manycore interconnect bottleneck and propose new interconnection network architectures for large-scale manycore processors; we also build analytical performance models for the new interconnect schemes using network calculus. We contribute new solutions to the manycore communication problem and further extend the application field of network calculus theory. Our work has both academic and practical value in promoting the advancement of high-performance processors.
    [143]C. Tze-chiang, Device technology innovation for exascale computing [C]//VLSI Technology,2009Symposium on,2009, pp.8-11.
    [144]G. Varatkar and R. Marculescu, Traffic analysis for on-chip networksdesign of multimedia applications [C]//Proceedings of the39th annualDesign Automation Conference, New Orleans, Louisiana, USA,2002.
    [145]J.A. Kash, IntraChip Optical Networks for a FutureSupercomputer-on-a-Chip [J]. Photonics in Switching,2007,2007, pp.55-56.
    [146]C. Cheng-Shang, Stability, queue length, and delay of deterministic andstochastic queueing networks [J]. Automatic Control, IEEE Transactions on39(1994)913-931.
    [147]Antoine Scherrer, Antoine Fraboulet, and Tanguy Risset. Networks-on-Chips:Theory and Practice, chapter4: On-chip Processor Traffic Modeling for NoCDesign [M]. CRC Press,2008.
    [148]Y.M. Jiang, and P.J. Emstad, Analysis of stochastic service guarantees incommunication networks: A server model.[C]//Quality of Service-Iwqos2005, Proceedings,2005, pp.233-245.
    [149]X. Yu, et al., Queueing processes in GPS and PGPS with LRD traffic inputs[J]. IEEE/ACM Trans. Netw., vol.13, pp.676-689,2005
    [150]Kobrinsky M J, Block B A, Zheng J F, et al. On-Chip Optical Interconnects[J]. Intel Technology Journal,2004,8(2):129–142.
    [151]C. Chen-Shang, and J.A. Thomas, Effective bandwidth in high-speed digitalnetworks [J]. Selected Areas in Communications, IEEE Journal on13(1995)1091-1100.
    [152]W. Kui, J. Yuming, and L. Jie, On the model transform in stochastic networkcalculus [C]//Quality of Service (IWQoS),201018th InternationalWorkshop on,2010, pp.1-9.
    [153]L. Chengzhi, et al., A Network Calculus With Effective Bandwidth [J]IEEE/ACM Transactions on Networking. vol.15, pp.1442-1453,2007.
    [154]S. Huimin, L. Zhonghai, A. Jantsch, Z. Dian, and Z. Li-Rong, Modeling andanalysis of Rayleigh fading channels using stochastic network calculus [C]//Wireless Communications and Networking Conference (WCNC),2011IEEE,2011, pp.1056-1061.
    [155]R.L. Cruz, and H.N. Liu, Single Server Queues with Loss: A Formulation[C]//Proceedings of the Twenty-Seventh Annual Conference on InformationSciences and Systems (1993)107-111.
    [156]Cardenas J, Poitras C B, Robinson J T, et al. Low loss etchless siliconphotonic waveguides [J]. Opt. Express,2009,17(6):4752–4757.
    [157]R. Wang, A.D. Raki, and M.L. Majewski, Design of microchannel free-spaceoptical interconnects based on vertical-cavity surface-emitting laser arrays [J].Appl. Opt.41,3469-3478,2002.
    [158]Q. Yue, et al., From2D to3D NoCs: A case study on worst-casecommunication performance [C]//in Computer-Aided Design-Digest ofTechnical Papers,2009. ICCAD2009. IEEE/ACM International Conferenceon,2009, pp.555-562.
    [159]V.M. Hietala, C. Chun, J. Laskar, K.D. Choquette, K.M. Geib, A.A.Allerman, and J.J. Hindi, Twodimensional8×8photoreceiver array andVCSEL drivers for high-throughput optical data links [J]. IEEE J. Solid-StateCircuit.36,1297–1302,2001.
    [160]R. Baets, L. Vanwassenhove,2D inter-chip optical interconnect [J].International Journal of Optical Materials, vol.17, iss.1-2, p.227-233,Dec.2001
    [161]S. Palermo, A. Emami-Neyestanak, and M. Horowitz, A90nm CMOS16Gb/s Transceiver for Optical Interconnects [J]. IEEE Journal of Solid-StateCircuits, vol.43, no.5, pp.1235-1246, May2008.
    [162]Ian O'Connor, Optical solutions for system-level interconnect [C]//Proceedings of the2004international workshop on System level interconnectprediction (SLIP '04),2004.
    [163]A. Palaniappan*and S. Palermo, Power Efficiency Comparisons ofInter-chip Optical Interconnect Architectures [J]. IEEE Transactions onCircuits and Systems-II, vol.57, no.5, pp.343-347, May2010.
    [164]B.E. Lemoff, M.E. Ali, G. Panotopoulos, G.M. Flower, B. Madhavan, A.F.J.Levi, D.W. Dolfi, MAUI: enabling fiber-to-the-Processor with parallelmultiwavelength optical interconnects [J]. J. Lightwave Technol.22,2043–2054,2004.
    [165]Jinhua wu, Introduction to O/E PCB[R]. R&D DepartmentDocument,Shanghai Meadville Science and Technology Co., Ltd,Nov.,2008
    [166]L.A. Buckman Windover, J.N. Simon, S.A. Rosenau, K. Giboney, G.M.Flower, L.W. Mirkarimi, A. Grot, B. Law, C.K. Lin, A. Tandon, R.W.Gruhlke, H. Xia, G. Rankin, and D. W. Dolfi, Parallel optical interconnectsbeyond>100Gb/s [J]. J. Lightwave Technol.22,2055–2063,2004.
    [167]R.T. Chen, L. Lin, C. Choi, Y.J. Liu, B. Bihari, L. Wu, S. Tang, R. Wickman,B. Picor, M.K. Hibbsbrenner, J. Bristow, and Y. S. Liu, Fully embeddedboard-level guided-wave optoelectronic interconnects [C]//Proc. IEEE.88,780–793,2000.
    [168]F. Mederer, R. Jager, H.J. Unold, R. Michalzik, K.J. Ebeling, S. Lehmacher,A. Neyer, and E. Griese,3-Gb/s data transmission with GaAs VCSEL’s overPCB integrated polymer waveguides[J]. IEEE Photon. Technol. Lett.13,1032–1034,2001.
    [169]Y. Li, J. Ai, and J. Popelek, Board-level2-D data-capable opticalinterconnect circuits using polymer fiber-image guides [J]. Proc. IEEE.88,794–805,2000.
    [170]T. May, A.G. Kirk, D.V. Plant, J.F. Ahadian, C.G. Fonstad, K.L. Lear, K.Tatah, M.S. Robinson, and J.A. Trezza, Interconnection of a two-dimensionalarray of vertical-cavity surface-emitting lasers to a receiver array by means ofa fiber image guide [J]. Appl. Opt.39,683–689,2000.
    [171]H.F. Bare, F. Haas, D.A. Honey, D. Mikolas, H.G. Craighead, G. Pugh, andR. Soave, A simple surfaceemitting LED array useful for developingfree-space optical interconnect [J]. IEEE Photon. Technol. Lett.5,172–175,1993.
    [172]J. Jahns, Y.H. Lee, C.A. Burrus, and J.L. Jewell, Optical interconnects usingtop-surface-emitting microlasers and planar optics [J]. Appl. Opt.31,592-597,1992.
    [173]D.V. Plant, M.B. Venditti, E. Laprise, J. Faucher, K. Razavi, M. Chateauneuf,A.G. Kirk, and J.S. Ahearn,256-channel bidirectional optical interconnectusing VCSEL’s and photodiodes on CMOS [J]. J. Lightwave Technol.19,1093–1103,2001.
    [174]D.V. Plant, and A.G. Kirk, Optical interconnects at the chip and board level:challenges and solutions [J]. Proc IEEE.88,806–818,2000.
    [175]J.J Liu, Z. Kalayjian, B. Riely, W. Chang, G.J. Simonis, A. Apsel, and A.Andreou, Multichannel ultrathin silicon-on-sapphire optical interconnects[J].IEEE J. Sel. Top. Quant. Elect.9,380–386,2003.
    [176]Y. Ye, J. Xu, X. Wu, W. Zhang, X. Wang, M. Nikdast, Z. Wang, and W. Liu,Modeling and Analysis of Thermal Effects in Optical Networks-on-Chip[C]//Proc.2011IEEE Computer Society Annual Symposium on VLSI(ISVLSI),2011, pp.254-259.
