并行路由器体系结构若干关键技术研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
互联网流量的快速增长对网络容量提出很高要求。密集波分多路复用技术极大地提高了光纤的通信能力,解决了传输链路带宽问题。目前,路由器的发展速度无法适应网络流量和传输带宽的增长速度,成为互联网发展的瓶颈。采用并行路由器体系结构可有效提高路由器的性能。并行路由器由多个较低速的转发和交换模块组成,各模块并行工作,独立处理报文,具有可扩展性好、性能价格比高等优点。
     为提高并行路由器的吞吐量,必须保证负载均衡。但是负载均衡可能导致报文乱序,不必要地触发TCP拥塞控制机制,降低发送速率,从而影响TCP连接的性能和网络的利用率。如何解决这对矛盾是并行路由器设计面临的难点。
     分析报文乱序对TCP连接的性能影响以及并行处理可能引入的报文乱序概率,对合理解决负载均衡和报文保序问题具有重要意义。本文针对NewReno拥塞控制算法,建立了报文丢失率、乱序率和TCP发送速率三者之间的量化关系。并基于该关系,分析了乱序报文对TCP连接的影响。基于Internet2上捕获的实际网络流量,本文估算了并行处理可能产生的报文乱序概率。根据上述分析,本文指出在并行处理时必须充分考虑报文的保序支持,结合报文丢失率等流量特性,对负载均衡和报文保序作出合理的折衷。在网络流量负载较轻或突发性较弱的情况下,需要优先考虑报文的保序问题;在重负载或强突发流量情况时可以优先考虑负载均衡。
     交换系统是路由器的核心组成部件。本文提出一种基于输入缓存交叉开关的并行交换体系结构PSIQC,给出信元调度算法RRDS。RRDS算法能够保证信元顺序,具有负载均衡、吞吐率高的特点,对PSIQC的规模表现出较好的可扩展性。基于队列分割思想,本文改进了PSIQC的队列组织方式,提出SQ-PSIQC交换结构。除保持PSIQC的保序和负载均衡等特点外,SQ-PSIQC还降低对缓存的带宽要求,提高了系统性能。在RRDS算法的基础上,本文提出基于局部序列号机制的LS-RRDS算法。该算法与缓冲区准入机制协同工作,解决了PSIQC的容错问题。LS-RRDS算法充分利用了RRDS算法的轮询特性,具有吞吐率高、容错量大的优点。
     针对高速环境下转发决策困难的问题,本文提出一种由多个网络处理器组成的并行转发引擎结构。为解决负载分配问题,提出一种基于映射表的自适应负载分配算法AIHDA。AIHDA算法根据各网络处理器的负载状况和网络流量特性调整负载分配方式,折衷考虑了负载均衡和报文保序要求,具有较好的综合性能。
The rapid growth of Internet traffic sets a great demand on network capacity. Dense Wavelength Division Multiplexing (DWDM) is making fiber-optic links with very high bandwidth, which satisfies the underlying users' demand. Routers can not keep up with the growth of user traffic and link bandwidth at present, and thus become the bottleneck of the Internet. Deploying the parallel router architecture can efficiently improve the router performance. Parallel router is comprised of multiple lower speed forwarding and switch modules which operate independently and in parallel. Parallel router is scalable and costs effectively.Load balance is the key to improving the throughput of parallel router. Unfortunately, load balance may cause packet reordering, which will unnecessarily trigger the TCP congestion control mechanism, and thus deteriorate the performance of TCP connections and the network utility. How to deal with the contradiction is a big problem in parallel router design.Analysis of the effect of packet reordering on TCP throughput and the reorder rate caused by parallel processing is important to compromise between load balance and packet order guarantee. This dissertation establishes a quantitative relationship among packet loss rate, reorder rate, and TCP throughput in NewReno algorithm. Based on the relationship, the effect on TCP throughput caused by reordered packets is analyzed. This dissertation estimates the reorder rate caused by parallel processing based on the traffic captured in the Internet2. According to the analysis, this dissertation points out that packet order guarantee should be considered during parallel processing, and reasonable tradeoff must be made between it and load balance. The parallel router should first guarantee the packet order under light load, and do more load balance with the increase of the load.Switch is the core of router. This dissertation presents the PSIQC parallel switch architecture comprised of input-queued crossbar switches, and its cell scheduling algorithm, RRDS. RRDS algorithm avoids packet reordering, balances the load efficiently, and has high throughput. It scales well with the size of PSIQC. A revised version of PSIQC based on split queues that is initialed as SQ-PSIQC is proposed. SQ-PSIQC not only has the characteristics of PSIQC, but also reduces the memory bandwidth requirement, and improves the performance. LS-RRDS algorithm is proposed based on RRDS and local sequence number mechanism. LS-RRDS and the buffer admission control mechanism are deployed in PSIQC to make it fault tolerant. Taking advantage of the round robin working property of RRDS, LS-RRDS tolerates lots of cell losses
    
    with, little cost.In order to make forwarding decision at line rate in high speed networks, this dissertation presents a parallel forwarding engine comprised of several network processors. An adaptive load dispatching algorithm based on a mapping table, AIHDA, is proposed. AIHDA dispatches the incoming packets to different NPs according to the workload at each NP and traffic characteristics. It compromises between load balance and packet order guarantee, and has good performance.
引文
[1] D. Comer. Internetworking with TCP/IP volume 1: principles, protocols, and architecture. Prentice Hall International.
    [2] A.M. Odlyzko. Comments on the Larry Roberts and Caspian networks study of Internet traffic growth. The Cook Report on the Internet, Dec. 2001:12~15
    [3] A.M. Odlyzko. The current state and likely evolution of the Internet. Proc. of Globecom'99, 1999:1869~1875
    [4] N. McKeown. Growth in router capacity. IPAM Workshop, Oct. 2003, Lake Arrowhead, CA. Available at http://tiny-tera.stanford.edu/~nickm/talks/index.html
    [5] R. Schaller. Moore's law: past, present and future. IEEE Spectrum, 1997, 34(6):52~59
    [6] 杨壮,杨名.波分复用系统的发展和面临的挑战.http://www.c114.net./ZHANHUI/863/yz.htm
    [7] D. Patterson and J. Hennessy. Computer architecture, a quantitative approach. Morgan Kaufmann, 1996.
    [8] C. Villamizar and C. Song. High performance TCP in ANSNET. Computer Communication Review, 1994, 24(5):45~60
    [9] S. Iyer, R. Kompella, N. McKeown. Analysis of a memory architecture for fast packet buffers. IEEE high performance switching and routing, Dallas, Texas, May 2001:368~373
    [10] Y. Rekhter and T. Li. An architecture for IP address allocation with CIDR. RFC 1518, Sep. 1993. Available at http://www.ietf.org/rfc/rfc1518.txt
    [11] V. Fuller, T. Li, J. Yu, and K. Varadhan. Classless inter-domain routing (CIDR): an address assignment and aggregation strategy. RFC 1519, Sep. 1993. Available at http://www.ietf.org/rfc/rfc1519.txt
    [12] M. Ruiz-Sanchez, E. Biersack, and W. Dabbous. Survey and taxonomy of IP address lookup algorithms. IEEE Network, March 2001
    [13] H. Tzeng and T. Przygienda. On fast address-lookup algorithms. IEEE Journal on Selected Areas in Communications, 1999, 17(6): 1067-1082.
    [14] 徐恪,徐明伟,吴建平,吴剑.路由查找算法研究综述.软件学报,2002,13(1):42~50
    [15] 喻中超,吴建平,徐恪.IP分类技术研究.电子学报,2001,29(2):260~262
    [16] P. Gupta and N. McKeown. Algorithms for packet classification. IEEE Network, March 2001
    [17] V. Srinivasan. Fast and efficient Internet lookups. Ph.D Thesis. Washington University, 1999
    [18] N. McKeown. Scheduling algorithms for input-queued switches [PhD Thesis] . Unive rsity of California, Berkeley, 1995. Available at http://tiny-tera.stanford.edu/~nickm/papers/index.html
    [19] I Keslassy, N McKeown. Maintaining packet order in two-stage switches. Proc. of IEEE Infocom2002. New York, June 2002.
    [20] C. Minkenberg On packet switch design [PhD Thesis] . Eindhoven University of Technology, The Netherlands, 2001.
    [2
    
    [21] H. Khosravi and T. Anderson. Requirements for Separation of IP Control and Forwarding. RFC 3654, Nov. 2003.
    [22] P. Fernndez and N. McKeown. TCP switching: exposing circuits to IP. IEEE Micro, 2002, 22(1):82~89
    [23] P. Fernndez and N. McKeown. The performance of circuit switching in the Internet. The journal of optical networking, 2003, 2(4):83~96
    [24] A. Singhal. Terabit switches and routers. Available at http://www.cis.ohio-state.edu/~jain/cis788-99/terabit-routing/index.html
    [25] 徐恪,熊勇强,吴建平.宽带IP路由器的体系结构分析.软件学报,2000,11(2):179~186
    [26] C. Partridge, P. Carvey, E. Burgess, I. Castineyra, et al. A 50-Gbps IP router. IEEE/ACM Transactions on Networking, 1998, 6(3):237~247
    [27] K. Fendick, V. Kumar, T. Lakshman, and D. Stiliadis. The packetStar 6400 IP switch-an IP switch for the converged network. Bell Labs Technical Journal, 1998:32~47
    [28] D. Kachelmeyer. A new router architecture for tomorrow's Internet. NetStar Inc., Available at http://www.netstar.com
    [29] N. McKeown, M. Izzard, A. Mekkittikul, W. Ellersick, and M.Horowitz. Tiny tera: a packet switch core. IEEE Micro, 1997, 17(1):26~33
    [30] 胡晓峰,邵荣平,孙志刚.并行可扩展路由器体系结构.国家自然科学基金重大研究计划网络与信息安全2003年度学术交流大会.
    [31] W. Stevens. TCP slow start, congestion avoidance, fast retransmit, and fast recovery algorithms. RFC 2001, Jan. 1997.
    [32] M. Allman, V. Paxson, and W. Stevens. TCP congestion control. RFC 2581, April 1999.
    [33] N. McKeown. Scaling routers: where do we go from here? Available at http://klamath. stanford.edu/~nickm/talks/index.html
    [34] Transmission control protocol. RFC 793, Sep. 1981.
    [35] V. Jacobson, R. Braden, and D. Borman. TCP extensions for high performance. RFC 1323, May 1992.
    [36] J. Wang, K. Nahrstedt. Parallel IP packet forwarding for tomorrow's IP routers. Proc. of IEEE workshop on HPSR'01, Dallas, May 2001.
    [37] F. Chiussi, J. Kneuer, and V. Kumar. Low-cost scalable switching solutions for broadband networking: the ATLANTA architecture and chipset. IEEE Communication Magazine, 1997, 35(12):44~53
    [38] S. Floyd and T. Henderson. The NewReno modification to TCP's fast recovery algorithm. RFC 2582, April 1999.
    [39] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP selective acknowledgement options. RFC 2018, Oct. 1996.
    [40] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An extension to the selective acknowledgement (SACK) option for TCP RFC 2883, July 2000.
    [41] 胡晓峰,孙志刚,苏金树,卢锡城.高速路由器并行交换技术研究.计算机研究与 发展,2004,41(1):60~64
    [4
    
    [42] X. Hu, Z. Sun, X. Lu, J. Su. Using split queues to improve the performance of parallel switch. Lecture Notes in Computer Science 2834. Proc. of Advanced Parallel Processing Technologies2003, Xiamen China, Sep. 2003.
    [43] 胡晓峰,孙志刚,卢锡城.并行交换系统PSIQC的容错机制.计算机工程与科学,已录用
    [44] NLANR PMA daily traces, http://pma.nlanr.net/DailyTr/
    [45] Internet protocol. RFC 791, Sep. 1981.
    [46] S. Deering and R. Hinden. Internet protocol, version 6 (IPv6) specification. RFC 2460, Dec. 1998.
    [47] V. Paxson. End-to-end Internet packet dynamics. IEEE/ACM Transactions on Networking, 1999, 7(3): 277~292.
    [48] J. Bennett, C. Partridge, and N. Shectman. Packet reordering is not pathological network behavior. IEEE/ACM Transactions on Networking, 1999, 7(6): 789~798
    [49] K. Thompson, G. Miller, and R. Wilder. Wide-area Internet traffic patterns and characteristics. IEEE Network, 1997, 11(6):10~23
    [50] K. Claffy. The nature of the beast: recent traffic measurements from an Internet backbone. Proc. of INET'98, Geneva, Switzerland, July 1998.
    [51] M. Laor and L. Gendel. The effect of packet reordering in a backbone link on application throughput. IEEE Network, 2002, 16(5): 28-36
    [52] E. Blanton and M. Allman. On making TCP more robust to packet reordering. ACM Sigcomm Computer Communication Review, 2002, 32(1):20~30
    [53] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP Reno performance: a simple model and its empirical validation. IEEE/ACM Transactions on Networking, 200, 8(2):133~145
    [54] T. Lakshman and U. Madhow. The performance of TCP-IP for networks with high bandwidth-delay products and random loss. IEEE/ACM Transactions on Networking, 1997, 5(3):336~350
    [55] M. Mathis, J. Semke, J. Mahdavi, and T. Ott. The macroscopic behavior of the TCP congestion avoidance algorithm. Computer Communication Review, 1997, 27(3)
    [56] A. Abouzeid, S. Roy, M. Azizoglu. Stochastic modeling of TCP over lossy links. Proc. of IEEE Infocom2000, Tel-Aviv, Israel, March 2000.
    [57] M. Yajnik, S. Moon, J. Kurose, and D. Towsley. Measurement and modeling of the temporal dependence in packet loss. Proc. of IEEE Infocom1999, New York, March 1999.
    [58] T. Ott, J. Kemperman, and M. Mathis, The stationary behavior of ideal TCP congestion avoidance. Proc. of IEEE Infocom 1999, New York, March 1999.
    [59] M. Allman and V. Paxson. On estimating end-to-end network path properties. Proc. of ACM Sigcomm'99, Sep. 1999.
    [60] M. Zhang, B. Karp, S. Floyd, and L. Peterson. RR-TCP: a reordering-robust TCP with DSACK. ICSI TR-02-006, July 2002
    [61] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, D. Towsley. Measurement and classification of out-of-sequence packets in a Tier-1 IP backbone. Proc. of IEEE Infocom2003, San Francisco, March 2003.
    [6
    
    [62] G. Iannaccone, S. Jaiswal, and C. Diot. Packet reordering inside the Sprint backbo ne. Technical Report TR01-ATL-062917, Sprint ATL. Available at http://www.sprint labs.com/People/gianluca/papers/
    [63] J. Bellardo and S. Savage. Measuring Packet Reordering. Proc. of ACM/USENIX Internet Measurement Workshop2002, Marseille, France. Nov. 2002.
    [64] X. Zhou and P. Mieghem Reordering of IP Packets in Internet. Proc. of Passive and Active Measurement Workshop2004, Antibes Juan-les-Pins, France, April. 2004.
    [65] S Iyer, A Awadallah, N McKeown. Analysis of a packet switch with memories running at slower than the line rate. Proc. of IEEE Infocom2000. Tel-Aviv, Israel, March 2000
    [66] S Iyer, N McKeown. Making parallel packet switches practical. Proc. of IEEE Infocom2001, Alaska, April 2001
    [67] W Wang, L Dong, W Wolf. A distributed switch architecture with dynamic load-balancing and parallel input-queued crossbars for terabit switch fabrics. Proc. of IEEE Infocom2002, New York, June 2002.
    [68] F. Chiussi and A. Francini. A distributed scheduling architecture for scalable packet switches. IEEE Journal on Selected Areas in Communication, 2000, 18(12):2665~2683
    [69] D. Khotimsky and S. Krishnan. Stability analysis of a parallel packet switch with bufferless input demultiplexors. Proc. of ICC2001, Helsinki, Finland, June 2001.
    [70] F. Chiussi, D. Khotimsky, and S. Krishnan. Generalized Inverse Multiplexing of Switched ATM Connections. In: Proc. of IEEE Globecom'98, Sydney, Australia, Nov. 1998.
    [71] N. McKeown. A fast switched backplane for a gigabit switched router. Business C ommunications Review, 1997, 27(12). Also available at http://tiny-tera.stanford.edu/~nickm/papers/index.html
    [72] H. Zhang. Service disciplines for guaranteed performance service in packet-switching networks. Proc. of the IEEE, 1995, 83(10): 1374~1396
    [73] P. Gupta. Scheduling in input queued switches: a survey. Available at http://klamat h. stanford, edu/~pankaj/research.html
    [74] M. Karol, M. Hluchyj, and S. Morgan. Input versus output queueing on a space division switch. IEEE Transactions on Communications, 1987, 35(12): 1347~1356
    [75] T. Anderson, S. Owicki, J. Saxe, and C. Thacker. High speed switch scheduling for local area networks. ACM Transactions on Compututer System, 1993, 11(4):319~352
    [76] N. McKeown, V. Anantharam, and J. Walrand. Achieving 100% throughput in an input-queued switch. Proc. of IEEE INFOCOM1996, San Francisco, March 1996
    [77] 谢政,李建平著.网络算法与复杂性理论.长沙:国防科技大学出版社,1995.
    [78] N. McKeown. The iSLIP scheduling algorithm for input-queued switches. IEEE/ACM Transactions on Networking, 1999, 7(2): 188~201
    [79] 孙志刚,苏金树,卢锡城.高效的Crossbar仲裁算法——ISP.计算机学报,2000, 23(10):1078~1082
    [80] S. Chuang, A. Goel, N. McKeown, and B. Prabhakar. Matching output queueing with a combined input output queued switch. EEEE Journal on Selected Areas in Communications, 1999, 17(6): 1030-1039
    
    [81] J. Dai, B. Prabhakar. The throughput of data switches with and without speedup. Proc. of IEEE Infocom2002, New York, June 2002.
    [82] L. Kencl, J. Boudec. Adaptive load sharing for network processors. Proc. of IEEE Infocom2002, New York, June 2002.
    [83] L. Kencl. Load sharing for multiprocessor network nodes [PhD Thesis]. Swiss Federal Institute of Technology in Lausanne, Jan. 2003.
    [84] J. Postel. User datagram protocol. RFC 768, Aug. 1980.
    [85] D. Thaler and C. Ravishankar. Using name-based mappings to increase hit rates. IEEE/ACM Transactions on Networking, 1998, 6(1): 1-14
    [86] S. Floyd. A report on some recent developments in TCP congestion control. EEEE Communications Magazine, April 2001
    [87] NS-2 simulator. Available at http://www.isi.org/nsnam/dist/
    [88] Internet core router test. March 2001, http://www.lightreading.com
    [89] R. LaMaire and D. Serpanos. Two-dimentional round-robin schedulers for packet switches with multiple input queues. EEEE/ACM Transactions on Networking, 1994, 2(5):471-482
    [90] N. McKeown, V. Anantharam, and J. Walrand. Achieving 100% throughput in an input-queued switch. Proc. of EEEE Infocoml996, San Francisco, March 1996.
    [91] SEM simulator. http://klamath.stanford.edu/tools/SIM/
    [92] K. Sklower, B. Lloyd, G McGregor, D. Carr, T. Coradetti. The PPP multilink protocol (MP). RFC 1990, Aug. 1996.
    [93] J. Turner. Resequencing cells in an ATM switch. Washington University Computer Science Department, Report WUCS-91-21, 1991.
    [94] F. Chiussi, D. Khotimsky, and S. Krishnan. Advanced frame recovery in switched connection inverse multiplexing for ATM. Proc. of ICATM1999, Colmar, France, June 1999.
    [95] D. Khotimsky and S. Krishnan. Evaluation of open-loop sequence control schemes for multi-path switches. Proc. of ICC2002, New York, March 2002.
    [96] IBM PowerNP NP4GS3. July 2000. Available at http://www-3.ibm.com/chips/product s/wired/products/network_processors.html
    [97] Intel EXP2800. Available at http://www.intel.eom/designhttp://www.intel.com/design/ne twork/ixa. htm/network/ixa. htm
    [98] P. Sagmeister Scaling Network Processor performance to 40 GbPs. June 2001. Available at http://www.comsoc.org/tcgn/conference/gbn2001/sagmeister_scaling-presentation.pdf
    [99] Z. Cao, Z. Wang, and E. Zegura. Performance of hashing-based schemes for Internet load balancing. Proc. of EEEE Infocom2000, Tel-Aviv, Israel, March 2000.
    [100] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoidance. EEEE/ACM Transactions on Networking, 1993, 1(4):397~413

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700