Research and Implementation of Load Balancing Techniques for Network Processors

Chinese title: 网络处理器负载均衡技术研究与实现
Author: 李丹丹
Degree level: Master's
Discipline: Computer Science and Technology
Keywords: network processor ; load balancing ; packet ordering ; AHRW ; flow classification ; parallel forwarding system
Degree year: 2006
Advisor: 龚雪春
Discipline code: 081201
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Submission date: 2006-11-01

Abstract

The processing speed of network processors has grown far more slowly than network traffic and link bandwidth, so network processors commonly adopt a parallel organization to improve performance. A parallel network processor consists mainly of multiple processing elements (PEs), coprocessors, and hardware logic blocks. Several modes of parallelism exist among PEs and between PEs and coprocessors, which allows the programmability and scalability of network processors to be fully exploited.
     To raise the throughput of a network processor, its processing load must be balanced. Load balancing, however, can reorder packets, which unnecessarily triggers the TCP congestion control mechanism, lowers the sending rate, and degrades both TCP connection performance and network utilization. Resolving this conflict is a central difficulty in the design of parallel network processor architectures.
     This thesis studies the main existing load balancing algorithms and proposes the adaptive highest random weight (AHRW) algorithm. AHRW preserves packet order, has low overhead, causes minimal disruption, and adjusts the load allocation dynamically according to the current load. In addition, based on the characteristics of network traffic, the thesis proposes a flow-classification processing model. The model divides packet flows into aggressive (very large) flows and normal flows. While the system is balanced, both classes are mapped by AHRW. When the system becomes unbalanced, the aggressive flows are adjusted individually and the normal flows continue to follow the AHRW mapping. This maintains load balance while reducing the disruption caused by adjustment. To validate the model, we designed FES, a parallel packet scheduling simulator that combines real network traces as input with virtual-clock advancement. FES shows how each algorithm schedules packets and reports statistics on four metrics: load balance degree, packet loss rate, remapping disruption, and packet reordering rate. The simulations show that the AHRW-based flow classification method performs well on all four metrics.
     Finally, the thesis applies AHRW to the design of a 10G high-speed interface for a core router and proposes a method for building a high-speed parallel forwarding system from three lower-speed IBM NP4GS3 network processors. The thesis describes in detail the design of the packet parallel forwarding module, which is the core of the 10G parallel forwarding system.
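
The abstract does not spell out how AHRW works, so the following is only a minimal Python sketch of the plain highest random weight (HRW) mapping that AHRW is named after, combined with a hypothetical override table for aggressive flows. The hash function (SHA-256), the names `weight`, `hrw_map`, `FlowClassifyingMapper` and `remap_aggressive`, the PE labels, and the load figures are all assumptions made for illustration and are not taken from the thesis.

```python
import hashlib


def weight(flow_id, pe):
    """Pseudo-random weight of a (flow, PE) pair, derived from a hash."""
    digest = hashlib.sha256(f"{flow_id}|{pe}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def hrw_map(flow_id, pes):
    """Plain HRW: send the flow to the PE with the highest weight."""
    return max(pes, key=lambda pe: weight(flow_id, pe))


class FlowClassifyingMapper:
    """HRW mapping plus an override table for remapped aggressive flows.

    Hypothetical structure: normal flows always follow HRW; an aggressive
    flow follows HRW until it is explicitly moved.
    """

    def __init__(self, pes):
        self.pes = list(pes)
        self.overrides = {}  # flow_id -> PE chosen during rebalancing

    def map(self, flow_id):
        if flow_id in self.overrides:
            return self.overrides[flow_id]
        return hrw_map(flow_id, self.pes)

    def remap_aggressive(self, flow_id, load):
        """Move one aggressive flow to the currently least-loaded PE."""
        target = min(self.pes, key=lambda pe: load[pe])
        self.overrides[flow_id] = target
        return target


# Demo with made-up PE names and flow identifiers.
pes = ["np0", "np1", "np2"]
flows = [f"10.0.0.{i}:80" for i in range(200)]

# Same flow, same PE: packets of one flow are never split across PEs.
mapper = FlowClassifyingMapper(pes)
assert all(mapper.map(f) == mapper.map(f) for f in flows)

# Minimal disruption: dropping np2 only moves flows that were on np2.
before = {f: hrw_map(f, pes) for f in flows}
after = {f: hrw_map(f, pes[:2]) for f in flows}
moved = [f for f in flows if before[f] != after[f]]
assert all(before[f] == "np2" for f in moved)

# Only the aggressive flow is adjusted; every other mapping is untouched.
load = {"np0": 0.9, "np1": 0.4, "np2": 0.7}  # assumed utilisation figures
mapper.remap_aggressive(flows[0], load)
assert mapper.map(flows[0]) == "np1"
assert all(mapper.map(f) == before[f] for f in flows[1:])
```

Moving a flow while its packets are in flight can itself reorder that flow, which is one reason a scheme of this kind would touch only the few flows flagged as aggressive and leave the rest on their HRW mapping.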

