k元n方的高维交换结构和多播研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
在因特网快速发展的今天,宽带视频、多媒体等业务对路由器技术提出了更高的要求,快速增长的网络流量也要求交换容量不断升级。k元n方交换结构由于其灵活的扩展性,已成为构建大容量可扩展路由器的常用选择。
     一方面,采用多维交换结构建分布式的大容量交换网络是分组交换技术的发展趋势。比较相同拓扑类型、不同维度的多维交换结构,一般情况下,高维的结构在吞吐率、交换延迟等性能指标上具有天生的优势。如果交换节点总数(即交换结构规模)固定不变,维度高意味着每维节点数少,节点连接度高,从而交换结构的直径更小,对分带宽更大。直径小有利于减小交换延迟,而对分带宽大有利于提高吞吐率,因为多维交换结构理论上能达到的交换能力与其对分带宽成正比。
     但是高维度也会带来成本上与技术上的问题。维度高意味着每个节点的连接度高,从而需要更多的互连通道及节点缓存。这直接增加了交换结构的实现成本,同时加大了交换节点上的缓存管理、调度、控制信息传输与处理等模块的难度,不利于节点实现。而且由于互连通道数随着节点数的增加而快速增大,使得其可扩展性受到限制。因为受限于复杂互连的可实现性限制等,所以今后的研究方向之一是采用光介质来连接多维交换结构。
     另一方面,面对视频会议、计算机协同工作等新业务的快速增长,多播通信的应用越来越广泛,相对传统的点到点通信方式,多播不仅能够节约大量的网络带宽,而且可以提高工作效率。随着对多播性能要求的不断增加,多播应用逐渐从应用层向网络结构的下层延伸,在交换和路由层上实现多播已经成为当前国内外研究的热点。
     k元n方中的多播既可以通过软件,也可以依靠硬件的支持得以实现。但硬件开销会增加系统的设计成本和实现的复杂度,并会降低路由硬件的速度。同时,现有的k元n方实现系统大多只支持点对点的单播路由。因此,利用现有的单播技术在软件层面上实现多播是目前和今后很长一段时间内的很好的选择。
     针对以上问题,对现有的软件多播算法做出了改进;设计了以单播路由为基础的多播路由算法:KMPAMR(K-Mesh Partition-based Adaptive Multicast Routing)算法;搭建了k元n方交换结构的通用仿真模型,采用面向策略的设计模式为不同的路由算法提供了通用的访问接口,并对算法性能进行了分析和讨论。实验表明,本文提出的两种算法能够取得较好的性能。
     首先,本文从k元n方交换结构、死锁问题、虫孔交换、路由算法和虚通道流控制几个角度出发,介绍了本文研究的背景知识。接着描述了构建多维交换结构需要注意的问题和介绍了多播的前景。
     其次,介绍了多维交换结构的研究背景,并对不同维数的拓扑结构进行了仿真。分析了由维数引起的torus交换结构性能和复杂度上的变化。
     再次,本文在分析已有的软件多播算法的基础上,提出一种针对并发多播业务设计的多播路由算法:KMPAMR算法。KMPAMR算法具有更为灵活的分区方式,通过增大多播的并行性和路由的灵活性来改善多播性能。仿真表明,KMPAMR算法在负载较低的情况下能取得较高的吞吐率和较低的延迟,但会加速交换结构“过饱和”现象的出现。一旦出现“过饱和”,交换结构的性能(特别是吞吐率)会急剧下降。
     最后,为了考察多维交换结构和KMPAMR算法的性能,我们在OPNET平台下搭建了基于虫孔交换和虚通道流控制的k元n方交换结构的通用仿真模型,采用面向策略的设计模式,为不同的路由算法提供了通用的访问接口。此外,本文还结合对仿真结果的讨论,提出了开展下一步研究的参考性建议。
Confronted with the explosion of communication traffic in Internet, the applications of broadband video and multimedia require routers to provide better services and a new switch fabric is needed to constantly meet the requirement of increasing capacity. With the flexible scalability and large capacity, k-ary n-cube switching fabric is being widely used to construct high speed terabit routers.
     On the one hand, it would be a develop trend of packet switch technology to build distributed, vast capacity switch networks with multi-dimension switch fabrics. Under the condition that the topologies are same, in general, high-dimensional torus networks have some inherent advantages in performance indexes such as throughput, switch delay, and so on. High dimension implies few nodes in each dimension, short diameter and wide bisection bandwidth if the total number of switch nodes (the size of switch topology) is fixed. Short diameter is beneficial to decrease switch delay, and wide bisection bandwidth is helpful to increase throughput as the switch capacity of multi-dimension switch fabric and bisection bandwidth has the direct ratio theoretically. But, high-dimension also incurs the problems of cost and technology.
     Higher-dimension implies more node degree, so there must be more interconnect channels and node temporary memory. That directly adds the cost of switch fabric, and increases architectural complexity in managing temporary, scheduling and controlling information transport and handle. The Higher-dimension switch fabrics’potential of scalable is restricted, because the total number of interconnection channels increases fleetly while the number of nodes adds.
     On the other hand, to meet the development of new services, such as video conference and cooperative computing etc., the traditional point to point communication would be replaced by multicast communication. Not only multicast can save vast communication bandwidth, but also it improves the working efficiency of terminal equipment. Along with the fast growing of performance requirement, multicast is extending from application layer to lower layers of network architectures. Therefore, supporting multicast in high speed switch and routers becomes a hot issue for research.
     Although multicast can be implemented in either software or hardware, adding hardware support for it will increase cost and complexity, possibly slowing down the routing speed. Furthermore, most existing wormhole-switched systems only support unicast in hardware. Hence, multicast should be implemented in software in such systems by sending multiple unicast messages, and therefore is suitable for current and future systems.
     To address the above issues, we enhance the existent software multicast algorithm and propose a novel software multicast algorithm, KMPAMR (K-Mesh Partition-based Adaptive Multicast Routing) algorithm, which are based on unicast routing and applicable for multiple multicasts. And this thesis also introduces the general model for simulation, provides a common interface for different routing algorithms with the strategy design pattern, and also discusses the simulation results. Simulation results show that the propose algorithm can perform efficient multiple multicast operations in k-ary n-cube switching fabric.
     First, this thesis introduces the basic research backgrounds about k-ary n-cube switching fabric, such as topology, deadlock problem, wormhole switching and virtual channel flow control. Then, some issues about building multi-dimension switch and the prospect of multicast research in k-ary n-cube switching fabric is described.
     Second, the research background of multi-dimension switch fabric has been introduced. And we have done some simulation in the topologies with different dimensions, then the effect of dimension on performance and complexity has been studied.
     Third, based on problem analysis, this thesis proposes a novel algorithms: KMPAMR (K-Mesh Partition-based Adaptive Multicast Routing) algorithm. KMPAMR can achieve high degree of parallelism by dividing the network into several separated blocks dynamically and sending message copies concurrently in each block. By applying adaptive unicast scheme, KMPAMR can decrease the delay caused by channel contention. Simulation results show that KMPAMR achieves good performance under low applied load. However, KMPAMR might accelerate the switching fabric reaching saturation point and result in beyond saturation. Beyond saturation could cause considerably degraded on performance.
     Finally, a general simulation model is built to test the performance of multi-dimension switch fabric and the proposed algorithm, and provides a common interface for different routing algorithms with the strategy design pattern. Furthermore, simulation results are shown and discussed. Some improvements and future works are also be proposed.
引文
[1] P. K. McKinley, Y. J. Tsai and D. F. Robinson. Collective Communication in Wormhole Routed Massively Parallel Computers. IEEE Trans. on Computers. 1995, 28(12):39-50.
    [2] J. Duato, S. Yalamanchili and L.M. Ni, Interconnection Networks: An Engineering Approach. Revised edition, Morgan Kaufmann, 2003.
    [3] W.J. Dally. Scalable Swithcing Fabric for Internet Routers. Computer Systems Laboratory. Stanford University and Avici Systems. 1999. http://www.avici.com.
    [4]许都,朱旭东,李乐民.下一代可扩展交换机/路由器.电子科技大学学报, 2003,32 (3):230-234
    [5] L.M. Ni. Issues in Designing Truly Scalable Interconnection Networks. Proceedings of 1998 ICPP Workshop on Challenges for Parallel Processing, 1995, 31-44.
    [6] L.M. Ni and P.K. McKinley. A Survey of Wormhole Routing Techniques in Direct Networks. IEEE Computer, 1993, 26(2):62-76.
    [7] A.Singh, W.J. Dally, A.K. Gupta and B. Towles. GOAL: A Load-balanced Adaptive Routing Algorithm for Torus Networks. Proc. 30th Annual International Symposium on Computer Architecture ISCA’03. San.Diego, California, 2003.
    [8] W. J. Dally and C. L. Seitz. The Torus Routing Chip. Jounal of Distributed Computing. 1986, 1(3):187-196.
    [9] W. J. Dally. Virtual-channel Flow Control. IEEE Transaction on Parallel and Distributed Systems. 1992, 3(2):194-205.
    [10] W.J. Dally and C. L. Seitz. Deadlock-free Message Routing in Multiprocessor Interconnection Networks. IEEE Trans. On Computers, 1987, 36(5):547-553.
    [11]朱旭东.多维分组交换结构的研究.博士学位论文.光纤重点实验室.电子科技大学. 2005年3月.
    [12] J. Duato. A General Theory for Deadlock-free Adaptive Routing Using a Mixed Set of Resources. IEEE Transaction on Parallel and Distributed Systems. 2001, 12(12):1219-1235.
    [13] Anjan. K. V. and T. M. Pinkston. DISHA: A Deadlock Recovery Scheme for Fully Adaptive Routing. Proceddings of the 22nd International Symposium on Computer Architecture. 1995, pp. 201-210.
    [14] Juan-Miguell Martinez Rubio, P. Lo′pez and J. Duato. FC3D: Flow Control-Based Distributed Deadlock Detection Mechanism for True Fully Adaptive Routing in Wormhole Networks. IEEE Transaction on Parallel and Distributed Systems. 2003, 14(8):765-779.
    [15] L. Gravano, G. Pifarre, G. Pifarre, P. Berman and J. Sanz. Adaptive Deadlock- and Livelock-free Routing with All Minimal Paths in Torus Networks. IEEE Transaction on Parallel and Distributed Systems. 1994, 5(12):1233-1252.
    [16] Ni L M. Should scalable parallel computers support efficient hardware multicast? In Proc. the 1995 ICPP Workshop on Challenges for Parallel Processing, Aug.1995, pp.2-5.
    [17] R.J. Littlefield. Characterizing and Tuning Communications Performance for Real Applications. Proceedings of the First Intel DELTA Applications Workshop, 1999.
    [18] X. Lin, P. K. McKinley and L. M. Ni. Deadlock-free Multicast Wormhole Routing in 2D Mesh Multicomputers. IEEE Trans. on Parallel and Distributed Systems. 1994, 5(8):793-804.
    [19] M.R. Garey and D. S. Johnson. Computer and Intractability, A Guide to the Theory of NP-Completeness. W. H. Freeman. 1979
    [20] Y.Lan, L.M. Ni and A. H. Esfahanian. A VLSI Router Design for Hypercube Multiprocessors. Integration: The VLSI Journal, 1989, 8(1): 103-125
    [21] T. M. Pinkston and S. Warnakulasuriya. On Deadlocks in Interconnection Networks. Proceedings of the 24th International Symposium on Computer Architecture. 1997.
    [22] M. P. Malumbres, J. Duato and J. Torrelas. An Efficient Implementation of Tree-Based Multicast Routing in Distributed Shared-memory Multiprocessors. Proceedings of the 8th IEEE Symposium on Parallel and Distributed Processing. 1996, pp. 186-189.
    [23] A.Yassin, Al-Dubai, M. Ould-Khaoua and L. M. Makenzie. An Efficient Path-Based Multicast algorithm for mesh networks. Parallel and Distributed Processing Symposium. 2003, pp. 8-17.
    [24] C. M. Chiang and L. M. Ni. Multi-address Encoding for Multicast. Proceedings of the Workshop on Parallel Computer Routing and Communication. 1994, pp. 146-160.
    [25] Hong Wang, Du Xu, Yao Yao, Guo Jiang, Shizhong Xu and Lemin Li. On the load patterns in mesh/torus interconnection networks as packet switching fabrics. Proc. of ICCCAS 2006, 3:1696-1700.
    [26] P.K. Mckinley, Hong Xu, A.H. Esfahanian and Li. M. Ni. Unicast-Based Multicast Communication in Wormhole-routed Networks. IEEE Transactions on Parallel and Distributed Systems. 1994, 5(12):1252-1265.
    [27] D.F. Robinson, B.H. Cheng, and P.K McKinley. Optimal Multicast Communication in Wormhole-routed Torus Networks. IEEE Tran. on Parallel and Distributed Systems. 1995, 6(10):1029-1042.
    [28] H. Xu, Y. Gui, and L.M. Ni. Optimal Software multicast in Wormhole-routed Multistage Networks. IEEE Tran. on Parallel and Distributed Systems. 1997, 8(6):597-606.
    [29] D.R. Kumar, W.A. Najjar and P.K. Srimani. A New Adaptive Hardware Tree-Based Multicast Routing in k-ary n-cubes. IEEE Transactions on Computers. 2001, 50(7):1021-1032.
    [30] D.K. Panda, S. Singal and P. Prabhakaran. Multi-destination Message Passing in Wormhole k-ary n-cube Networks with Base Routing Conformed Paths. IEEE Transactions on Parallel and Distributed Systems. 1999, 10(1): 76-96.
    [31] J. Duato. A Necessary and Sufficient Condition for Deadlock-free Adaptive Routing in Wormhole Networks. IEEE Transactions on Parallel and Distributed Systems. 1995, 6(10):1055-1067.
    [32] Guo Jiang, Du Xu, Lemin Li. A New Partition-Based Multicast Scheme for Torus Networks. Proceedings of International Conference on Communications and Networking in China. 2006, 4: 892-897.
    [33] Nick McKeown. The iSLIP Scheduling Algorithm for Input-Queued Switches. IEEE/ACM Transactions on Networking, 1999, 7 (2): 188-201.
    [34] W. J. Dally. Virtual-channel Flow Control. IEEE Transactions on Parallel and Distributed Systems. 1992, 3(2):194-295.
    [35] E. Baydal, P. Lo′pez and J. Duato. A Simple and Efficient Mechanism to Prevent Saturation in Wormhole Networks. Proc. of International Parallel and Distributed Processing Symposium. 2000.
    [36] Guo Jiang, Du Xu, H. Wang, S.Z. Xu. Multiple Multicasts in Mesh/Torus Interconnection Networks with Relative Quadrant-Based Algorithm. Proceedings of International Conference on Wireless, Mobile & Multimedia Networks. 2006.
    [37]江果,许都,柯灵. k元n方交换结构中的优化多播策略.电子与信息学报. 2007.
    [38] V. Jacobson. Congestion Avoidance and Control. ACM Communication Review, 1988,18(4):314-329.
    [39] J. Duato and P. Lo′pez. Performance Evaluation of Adaptive Rouging Algorithms for k-Ary n-Cubes. Proc. Parallel Computer Routing and Communication Workshop. 1994:45-49.
    [40] P. Lo′pez and J. Duato. Deadlock-Free Adaptive Routing Algorithms for the 3D-Torus: Limitations and Solutions. Computers and Artificial Intelligence. 1995, 14(1):105-125.
    [41] R. V. Boppana and S. Chalasani. A Comparison of Adaptive Wormhole Routing Algorithms. Proceedings of the 20th International Symposium on Computer Architecture. 1993:351-360.
    [42] W. J. Dally and H. Aoki. Deadlock-free Adaptive Routing in Multcomputer Networks Using Virtual Channels. IEEE Transactions on Parallel and Distributed Systems. 1993, 4(4):466-475.
    [43] E. Baydal, P. Lo′pez and J. Duato. A Family of Mechanisms for Congestions Controls in Wormhole Networks. IEEE Transaction on Parallel and Distributed Systems. 2006, 16(9):772-784.
    [44] M.Thottetodi, A.R. Lebeck, S. S. Mukheriee. Exploiting Global Knowledge to Achieve Self-Tuned Congestion Control for k-Ary n-Cube Networks. IEEE Transaction on Parallel and Distributed Systems. 2004, 15(3):257-271.
    [45]王宏.用于分组交换的多维互连交换结构关键技术研究.硕士学位论文.光纤重点实验室.电子科技大学. 2003年5月.
    [46] OPNET Modeler documentation, OPNET Technologies, Inc. http://www.opnet.com.
    [47] Abhay K. Parekh, Robert G. Gallager. A Generalized Processor Sharing Approach to Flow Control in Integrated Services Network: The Single-Node Case. IEEE/ACM Transactions on Networking, 1993, 1(3): 344-357.
    [48] E. Gamma, R. Helm, R. Johnson and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley Longman, Inc., 1998.
    [49] W.Horst, R.,TNet:A Reliavble System Area Network. IEEE Micro, 1995:37-35.
    [50] R.E.Kessler and Swatszmeier, J.L.,Cray T3D: A new dimension for Cray research. Compcon, Spring.1993.
    [51] Horst,R.,Tnet:a reliable system area network for I/O and IPC. Hot Interconnects II,1994.Symposium Record.1994.
    [52] WJ Dally,“Performance Analysis of k-ary n-Cube Interconnection Networks,”IEEE Trans. Computers, vol. 39, no. 6, pp. 775-785, June 1990.
    [53] Luis Gravano , Gustavo D. Pifarré, Jorge LC Sanz,“Adaptive deadlock- and livelock-free routing with all minimal paths in torus networks,”IEEE Trans. Parallel and Distributed systems, Volume 5, Issuse 12, Pages:1233-1251, May 1994.
    [54] LM Ni,PK McKinley,“A Survey of Wormhole Routing Techniques in Directed Networks,”Computer, 26(2) :62~76, Feb. 1993.
    [55] P. Rajabzadeh, H. Sarbazi-azad, H.H. Najaf-abadi, M. Ould-Khaoua,“Performance modeling of fully adaptive wormhole routing in n-dimensional mesh-connected multicomputers,”Performance, Computing, and Communications Conference, 2006. IPCCC 2006. 25th IEEE International, Page(s):8 pp, 10-12 April 2006.
    [56] H. Sarbazi-Azad, M. Ould-Khaoua, L.M.Mackenzie, S.G.Akl,“On some properties of k-ary n-cubes,”Parallel and Distributed Systems, 2001. ICPADS 2001. Proceedings. Eighth International Conference, Page(s):517-524, 26-29 June 2001.
    [57] S. Loucif, H.Sarbazi-Azad, M. Ould-Khaoua,“Message latency in k-ary n-cubes with hop-based routing,”IEEE Proc.-Comput. Digit. Tech, Vol148, No.2, March 2001.
    [58] Xudong Zhu, Hong Wang, Lemin Li,“Analysis of a Switching Fabric Employing Direct Interconnection Networks Parallel and Distributed Computing”, Application and Technologies, 2003. PDCAT’2003. Proceedings of the Fourth International Conference, Page(s):37– 41, 27-29 Aug. 2003.
    [59] Xudong Zhu, Hong Wang, Lemin Li,“A Terabit switching fabric employing direct interconnecion networks,”Globle Telecommunications Conference, 2003. GLOBECOM '03. IEEE Volume5, Page(s):2982-2985, 2003.
    [60] WANG Yuli, YANG Xiaodong,“Analysis and Study on the Performance of k-ary n-cube Interconnection Network,”Computer Engineering, Vol.26, Dec. 2000.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700