摘要
本文着重研究对等(Peer-to-Peer, 简称P2P)计算。对等网应用在近
几年内已得到突飞猛进的发展。P2P文件共享系统是对等计算最重要的应
用之一。P2P文件共享系统能否成功极大地取决于搜索机制的多样性和扩
展性。
当前支持分布式哈希表(Distributed Hash Table, DHT)功能的结构
化系统(如CAN [5]、Chord [6]、Pastry [7]和Tapestry [8])易扩展但不能
有效地支持部分匹配的查询;而基于扩散的非结构化系统(如Gnutella)
支持多样化查询但不易扩展。本文作者提出了一种新型的对等网体系结
构—语义对等网(Semantic Peer-to-peer Networks, SPNs),其中语义相关
的结点互相连接在一起。基于内容编址网(Content Addressable Networks,
CAN),我们分别构造了粗粒度语义对等网pGroup 和细粒度语义对等
网fGroup。根据不同的查询情况,我们分别对pGroup和fGroup提出相应的
搜索算法。模拟结果表明,搜索效率比Gnutella网络大大提高了。
为加速大文件的接收,许多P2P系统如Kazaa、Grokster和Morpheus等
采用了并行下载机制。为研究多个用户同时使用这一机制时其对整个网络
的影响,本文作者利用非合作博弈论对并行下载问题进行建模。就我们所
知,这是第一次用非合作博弈论方法分析并行下载问题的工作。在这个框
架下,我们给出了纳什均衡在一般网络中的特征;针对特殊的网络分析了
纳什均衡的性质;并对两个特殊的网络建立了纳什均衡的动态收敛性;最
后,分别从用户和系统的角度,即分别从单个用户的下载时延和所有连接
1本研究受到科学技术部基础研究重大项目前期研究专项第2001CCA0300号,国家自然
科学基金第60273045号和上海市科技发展基金第025115032号资助。
I
II
上总的时延,研究了纳什均衡的效率。结果发现,尽管从用户的角度来
看,纳什均衡是最优的;但从系统的角度来看,纳什均衡却很糟糕。
大多数DHTs如CAN、Chord、Pastry和Tapestry 需要O(logN)的邻居
数和O(logN)的路由长度。为维护路由的正确性和效率,当一个结点
加入或离开系统时它们需要对路由表进行O(logN)的修复操作。考虑
到P2P系统中用户的高度动态性,构造具有O(1)邻居数和O(logN)路由长
度的DHTs是很重要的。最近的文献 [70, 73, 74]提出了基于de Bruijn 图
的P2P网络,但他们都仅使用了单方向的链路。通过引入双向链路,我们
进一步改进了其中的路由算法,并使用连续-离散方法 [75] 构造了DHTs。
This thesis focuses on Peer-to-Peer (P2P) computing. P2P networking
applications have seen explosive growth in recent years. One of the most
important applications of P2P computing is P2P ?le sharing system. The
success of P2P ?le sharing system highly depends on the scalability and
versatility of its search mechanism.
Existing structured P2P networks (such as CAN [5], Chord [6], Pastry
[7] and Tapestry [8]) supporting Distributed Hash Table (DHT) functionality
are scalable, but they can not support partial-match queries e?ectively. On
the opposite, unstructured P2P networks (such as Gnutella) rely on ?ood-
ing for search, thus supporting partial-match queries, but such ?ooding does
not make the systems scalable. We propose a novel architecture called Se-
mantic Peer-to-peer Networks (SPNs) where semantically related nodes are
connected to each other. Based on Content Addressable Networks (CAN),
we construct coarse-grained SPNs (pGroup) and ?ne-grained SPNs (fGroup)
respectively. According to the di?erent querying, we present correspond-
ing search algorithms in pGroup and fGroup respectively. As is shown by
simulations, search e?ciency is improved greatly compared to Gnutella.
To expedite the reception of a large ?le, many P2P systems such as
Kazaa, Grokster and Morpheus have employed the scheme of parallel down-
2This work is supported by a grant from the Ministry of Science and Technology (grant
#2001CCA03000), National Natural Science Fund (grant #60273045) and Shanghai Sci-
ence and Technology Development Fund (grant #025115032)
III
IV
loading. To investigate the impact that this scheme might have on the net-
work if multiple clients simultaneously use it, we formulate parallel down-
loading as a non-cooperative game. To the best of our knowledge this is
the ?rst work that analyzes the parallel downloading problem in a nonco-
operative game theoretical fashion. Within this framework, we present a
characterization of the tra?c con?guration at Nash equilibrium in a general
network, and analyze its properties in a speci?c network. We also establish
the dynamic convergence to equilibrium for two speci?c networks. Finally,
we investigate the e?ciency of Nash equilibrium from the point of view of
the clients and the system respectively, i.e., downloading latencies perceived
by individual clients and total latencies over all connections. We ?nd that
although the tra?c con?guration at Nash equilibrium is optimal from the
point of view of the clients, it may be bad from the point of view of the
system.
Most DHTs such as CAN, Chord, Pastry and Tapestry require O(logN)
neighbors and have O(logN) pathlengths. In order to maintain the e?ciency
and correctness of routing, they require O(logN) repair operations of their
routing tables when a node joins or leaves. Given the highly transient user
populations in P2P systems, it is therefore of great importance to construct
DHTs with O(1) neighbors and O(logN) pathlengths. Several recent papers
[70, 73, 74] have proposed de Bruijn graphs for peer-to-peer networks. But
they all use links in only one direction. By introducing bi-directionality of
links, we further improve the routing algorithm and construct such DHTs by
using the continuous-discrete approach [75].
引文
[1] Peer-to-Peer Working Group. 2001. www.p2pwg.org,
[2] Peer-to-Peer Working Group. There is no P-to-P Market ... But There
is a Market for P-to-P. Aberdeen Group Presentation at the P2PWG,
2001.
[3] C. Shirky. What is P2P ... and what isn’t.
www.openp2p.com/pub/a/p2p/2000/11/24/shirky1-whatisp2p.html,
2001.
[4] F. Heylighen. Principia Cybernetica Web. pespmc1.vub.ac.be, 1997.
[5] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A
scalable content addressable network. In Proc. ACM SIGCOMM, 2001.
[6] Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger,
M. Frans Kaashoek, Frank Dabek and Hari Balakrishnan. Chord:
A Scalable Peer-to-peer Lookup Protocol for Internet Applications.
IEEE/ACM Transactions on Networking, Vol. 11, No. 1,pp. 17-32,
February 2003.
[7] A. Rowstron and P. Druschel. Pastry: Scalable, distributed object lo-
cation and routing for large-scale peer-to-peer systems. IFIP/ACM In-
ternational Conference on Distributed Systems Platforms (Middleware),
Heidelberg, Germany, pages 329-350, November, 2001.
[8] Y. Zhao, J. Kubiatowicz, and A. Joseph. Tapestry: An infrastruc-
ture for fault-tolerant wide-area location and routing, Technical Report
UCB/CSD-01-1141, University of California, Berkeley, 2000. 16.
[9] IRIS:Infrastructure for Resilient Internet Systems
http://iris.lcs.mit.edu, 2002.
[10] D. Malkhi, M. Naor, D. Ratajczak. Viceroy: A Scalable and Dynamic
Emulation of the Butter?y,” ACM PODC, 2002.
85
参考文献 86
[11] Li Xiao, Zhichen Xu, and Xiaodong Zhang. Low-Cost and Reliable
Mutual Anonymity Protocols in Peer-to-Peer Networks. Special issue on
Security Issues in Distributed Computing Systems of IEEE Transactions
on Parallel and Distributed Systems, 2003.
[12] Steven Hazel and Brandon Wiley. Achord: A Variant of the Chord
Lookup Service for Use in Censorship Resistant Peer-to-Peer. The 1st
International Workshop on Peer-to-Peer Systems (IPTPS ’02).
[13] Amos Fiat and Jared Saia Censorship Resistant Peer-To-Peer Content
Addressable Networks,in a special issue of Journal of Algorithms, 2002.
[14] Dejan S. Milojicic, Vana Kalogeraki, Rajan Lukose, Kiran Nagaraja, Jim
Pruyne, Bruno Richard, Sami Rollins, Zhichen Xu. Peer-to-Peer Com-
puting. http://www.hpl.hp.com/techreports/2002/HPL-2002-57.pdf
[15] Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion
Stoica. Wide-area cooperative storage with CFS. In Proceedings of the
18th ACM Symposium on Operating Systems Principles (SOSP’01)
[16] P. DRUSCHEL AND A. ROWSTRON. Past: Persistent and anonymous
storage in a peer-to-peer networking environment. In Proceedings of the
8th IEEE Workshop on Hot Topics in Operating Systems (HotOS 2001)
(Elmau/Oberbayern, Germany, May 2001), pp. 65–70.
[17] KUBIATOWICZ, J., BINDEL, D., CHEN, Y., CZERWINSKI, S.,
EATON, P., GEELS, D., GUMMADI, R., RHEA, S., WEATHER-
SPOON, H., WEIMER, W., WELLS, C., AND ZHAO, B. OceanStore:
An architecture for global-scale persistent storage. In Proceeedings of
the Ninth international Conference on Architectural Support for Pro-
gramming Languages and Operating Systems (ASPLOS 2000) (Boston,
MA, November 2000), pp. 190–201.
[18] RATNASAMY, S., HANDLEY, M., KARP, R., AND SHENKER, S.
Application-level Multicast using Content- Addressable Networks. In
Proceedings of NGC 2001 (Nov. 2001). Oct. 2001).
[19] ZHUANG,S.,ZHAO, B., JOSEPH, A. D., KATZ, R. H., AND KUBIA-
TOWICZ, J. Bayeux: An architecture for wide-area, fault-tolerant data
dissemination. In Proceedings of NOSSDAV’01 (Port Je?erson, NY,
June 2001).
参考文献 87
[20] A. ROWSTRON, A-M. KERMARREC, M. C., AND DRUSCHEL, P.
Scribe: The design of a large-scale event noti?cation infrastructure. In
Proceedings of NGC 2001 (Nov. 2001).
[21] CABRERA, L. F., JONES,M.B., AND THEIMER,M. Herald: Achiev-
ing a global event noti?cation service. In Proceedings of the 8th IEEE
Workshop on Hot Topics in Operating Systems (HotOS-VIII) (El-
mau/Oberbayern, Germany, May 2001).
[22] C.Plaxton, R.Rajaraman, and A.Richa Accessing nearby copies of repli-
cated objects in a distributed environment. In Proceedings of the ACM
SPAA, pp. 311-320.
[23] D. KARGER, E. LEHMAN, F. LEIGHTON, M. LEVINE, D. LEWIN,
AND R. PANIGRAHY Consistent hashing and random trees: Dis-
tributed caching protocols for relieving hot spots on the World Wide
Web. In Proceedings of the 29th Annual ACM Symposium on Theory
of Computing (El Paso, TX, May 1997), pp. 654–663.
[24] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker. Search and replication in
unstructured peer-to-peer networks. In Proceedings of the 16th annual
ACM International Conference on Supercomputing, 2002.
[25] A. Crespo and H. Garcia-Molina. Routing indices for peer-to-peer sys-
tems. In 22nd International Conference on Distributed Computing Sys-
tems (ICDCS’02) July 02 - 05, 2002.
[26] Beverly Yang, Hector Garcia-Molina. Improving Search in Peer-to-Peer
Networks. In 22nd International Conference on Distributed Computing
Systems (ICDCS’02) July 02 - 05, 2002.
[27] E. Cohen and S. Shenker. Replication Strategies in Unstructured Peer-
to-Peer Networks. In Proceedings of ACM SIGCOMM’ 02, Pittsburgh,
PA, August 2002.
[28] Edith Cohen, Amos Fiat, Haim Kaplan. Associative Search in Peer
to Peer Networks: Harnessing Latent Semantics. In IEEE INFOCOM
2003.
[29] Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and
Scott Shenker. Making Gnutella-like P2P Systems Scalable. In Proceed-
ings of ACM SIGCOMM 2003.
参考文献 88
[30] SAROIU, S., GUMMADI, P. K., AND GRIBBLE, S. D. A Measurement
Study of Peer-to-Peer File Sharing Systems. In Proceedings of Multime-
dia Computing and Networking 2002 (MMCN’02) (San Jose, CA, Jan.
2002).
[31] Kunwadee Sripanidkulchai, Bruce Maggs, and Hui Zhang. E?cient Con-
tent Location Using Interest-Based Locality in Peer-to-Peer Systems, In
INFOCOM 2003.
[32] Adriana Iamnitchi, Matei Ripeanu, Ian Foster. Locating Data in (Small-
World?) Peer-to-Peer Scienti?c Collaborations, 1st International Work-
shop on Peer-to-Peer Systems (IPTPS’02), March 2002, Cambridge,
MA.
[33] Arturo Crespo and Hector Garcia-Molina. Semantic Overlay Net-
works, Submitted for Publication. http://www-db.stanford.edu/ cre-
spo/publications/op2p.pdf
[34] Chunqiang Tang, Zhichen Xu, and Sandhya Dwarkadas. Peer-to-Peer In-
formation Retrieval Using Self-Organizing Semantic Overlay Networks.
In ACM SIGCOMM 2003.
[35] Ricardo Baeza-Yates, Berthier Ribeiro-Neto. Modern Information Re-
trieval. Addison Wesley, 1999.
[36] I. Witten and E. Frank. Data Mining. Morgan Kaufmann Publishers,
1999.
[37] P. Triantallou, C. Xiruhaki, M. Koubarakis, and N. Ntarmos. Towards
high performance peer-to-peer content and resource sharing systems. In
the Conference on Innovative Data Systems Research (CIDR), 2003.
[38] Edith Cohen, Amos Fiat, Haim Kaplan. E?cient sequences of trials. In
Proc.14th ACM-SIAM Symposium on Discrete Algorithms, 2003.
[39] B. Bloom. Space/time tradeo?s in hash coding with allowable errors.
Communications of the ACM, 1970, 13 (7): 422-426.
[40] A. Broder and M. Mitzenmacher. Network applications of Bloom ?lters:
a survey. In: Proceedings of the 40th Annual Allerton Conference on
Communication, Control, and Computing, 2002, 636-646.
[41] M. Mitzenmacher. Compressed bloom ?lters. In Proceedings of the 20th
ACM SIGACT-SIGOPS Symposium on Principles of Distributed Com-
puting, pages 144-150, August 2001.
参考文献 89
[42] J. Byers, J. Considine, M. Mitzenmacher. Simple Load Balancing for
Distributed Hash Tables. IPTPS ’03, February 2003, Berkeley, CA, USA.
[43] Lee Breslau, Pei Cao, Li Fan, Graham Philips, Scott Shenker. Web
Caching and Zipf-like Distributions: Evidence and Implications. IEEE
INFOCOM, pp. 126 -134, 1999.
[44] J. Byers, M. Luby, and M. Mitzenmacher. Accessing Multiple Mirror
Sites in Parallel: Using Tornado Codes to Speed Up Downloads. In Proc.
of INFOCOM, Mar. 1999.
[45] J. Byers, J. Considine, M. Mitzenmacher, and S. Rost. Informed Con-
tent Delivery Across Adaptive Overlay Networks. In Proc. of ACM SIG-
COMM, 2002.
[46] S. Philopoulos and M. Maheswaran. Experimental Study of Parallel
Downloading Schemes for Internet Mirror Sites. In Thirteenth IASTED
International Conference on Parallel and Distributed Computing Sys-
tems (PDCS ’01), pages 44-48, Aug. 2001.
[47] P. Rodriguez and E. Biersack. Dynamic parallel-access to replicated con-
tent in the Internet. IEEE/ACM Transactions on Networking, 10(4),
Aug. 2002.
[48] C. Gkantsidis, M. Ammar, E. Zegura. On the E?ect of Large-Scale
Deployment of Parallel Downloading。In IEEE Workshop on Internet
Applications (WIAPP’03), 2003.
[49] Simon Koo, Catherine Rosenberg, and Dongyan Xu. Analysis of Parallel
Downloading for Large File Distribution. In Proceedings of IEEE Inter-
national Workshop on Future Trends in Distributed Computing Systems
(FTDCS 2003), San Juan, PR, May 2003.
[50] J. B. Rosen. Existence and uniqueness of equilibrium points for concave
N-person games. Econometrica, 33(3):520-534, 1965.
[51] G. Owen. Game Theory. Academic Press, 1995. Third Edition.
[52] A. Orda, R. Rom, and N. Shimkin. Competitive routing in multi-
user communication networks. IEEE/ACM Transactions on Networking,
1:510-521, 1993.
[53] E. Altman, T. Basar, T. Jimenez, and N. Shimkin. Routing into two
parallel links: Game-Theoretic Distributed Algorithms, Special Issue of
参考文献 90
Journal of Parallel and Distributed Computing on “Routing in Com-
puter and Communication Networks”, pp. 1367-1381, Vol. 61, No. 9,
September 1, 2001.
[54] T. Boulogne, E. Altman and O. Pourtallier. On the convergence to Nash
equilibrium in problems of distributed computing. Annals of Operation
research, 109(1): 279-291; Jan 2002.
[55] P. Dubey. Ine?ciency of Nash equilibria. Mathematics of Operations
Research, 1986.
[56] E. Koutsoupias and C. Papadimitriou. Worst-case equilibria. In Pro-
ceedings of the 16th Annual Symposium on Theoretical Aspects of Com-
puter Science, pages 404-413, 1999.
[57] M. Mavronicolas and P. Spirakis. The price of sel?sh routing. In Proc.
33rd ACM STOC, pp. 510-519, 2001.
[58] A. Czumaj, P. Krysta, and B. V?cking. Sel?sh Tra?c Allocation for
Server Farms. In Proc. 34th STOC (Montreal, 2002); 287-296.
[59] A. Czumaj and B. V¨ocking. Tight Bounds for Worst-Case Equilibria.
In Proc. 13th SODA (San Francisco, 2002), pp. 413-420.
[60] T. Roughgarden, The price of anarchy is independent of the network
topology. In Proc. 34th STOC, pages 428-437, 2002.
[61] T. Roughgarden, Sel?sh Routing. PhD thesis, Cornell University, 2002.
[62] T. Roughgarden and E. Tardos. How Bad is Sel?sh Routing? Journal
′
of the ACM, 49(2):236-259, March 2002.
[63] T. Roughgarden and E. Tardos. Bounding the Ine?ciency of Equilibria
′
in Nonatomic Congestion Games, to appear in Games and Economic
Behavior.
[64] J. G. Wardrop. Some theoretical aspects of road tra?c research. In
Proceedings of the Institute of Civil Engineers, Pt. II, volume 1, pages
325-378, 1952.
[65] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, and T. Berners-Lee. Hy-
pertext transfer protocol―HTTP/1.1. Network Working Group, RFC
2068, Jan. 1997.
参考文献 91
[66] LUBY, M., MITZENMACHER, M., SHOKROLLAHI, A., SPIELMAN,
D., AND STEMANN, V. Practical loss-resilient codes. In Proceedings
of the 29th ACM Symposium on Theory of Computing (April 1997).
[67] MACWILLIAMS, F. J., AND SLOANE, N. The Theory of Error-
Correcting Codes. North Holland, Amsterdam, 1977.
[68] RABIN, M. E?cient dispersal of information for security, load balancing
and fault tolerance. Journal of the ACM 38 (1989), 335–348.
[69] V. N. Padmanabhan and J. Mogul. Improving HTTP Latency. In Second
World Wide Web Conference ’94: Mosaic and the Web, pp. 995-1005,
October 1994.
[70] D. Loguinov, A. Kumar, V. Rai, and S. Ganesh, Graph-Theoretic Anal-
ysis of Structured Peer-to-Peer Systems: Routing Distances and Fault
Resilience, in ACM SIGCOMM, August 2003.
[71] K.P. Gummadi, R. Gummadi, S.D. Gribble, S. Ratnasamy, S. Shenker,
and I. Stoica. Impact of DHT Routing Geometry on Resilience and
Proximity, in ACM SIGCOMM, August 2003.
[72] J. Xu, A. Kumar, and X. Yu. On the Fundamental Tradeo?s between
Routing Table Size and Network Diameter in Peer- to-Peer Networks,
in IEEE JSAC, Nov. 2003.
[73] F. Kaashoek and D.R. Karger. Koorde: A Simple Degree-optimal Hash
Table, in IPTPS, February 2003.
[74] P. Fraigniaud and P. Gauron. An Overview of the Content-Addressable
Network D2B, in ACM PODC, 2003.
[75] M. Naor and U. Wieder. Novel Architectures for P2P Applications: the
Continuous-Discrete Approach, ACM SPAA, June 2003.
[76] Sylvia. Ratnasamy, Scott Shenker and Ion Stoica. Routing Algorithms
for DHTs: Some Open Questions. In proc. of IPTPS, March, 2002.
[77] J. Considine and T.A. Florio. Scalable Peer-to-Peer Indexing with Con-
stant State. Boston U. Technical Report 2002-026, August 2002.
[78] M. Imase and M. Itoh. Design to Minimize Diameter on Building-Block
Network, IEEE Trans. on Computers, vol.30, 1981.
参考文献 92
[79] F.T. Leighton. Introduction to Parallel Algorithms and Architectures:
Arrays, Trees, Hypercubes, Academic Press, Morgan Kaufmann, 1991.
[80] K.N. Sivarajan and R. Ramaswami. Lightwave Networks Based on de
Bruijn Graphs, IEEE/ACM Trans. on Networking, vol. 2, no. 1, 1994.
[81] P Ganesan and G S Manku. Optimal Routing in Chord. In Proc. 15th
Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004.
[82] Z. Liu, T. Y. Sung. Routing and Transmitting Problems in de Bruijn
Networks, IEEE Trans. on Computers, Vol. 45, pp. 1056-1062, Sept.
1996.
[83] M. Mitzenmacher, A. Richa, and R. Sitaraman. The Power of Two
Choices: A Survey of Techniques and Results. Kluwer Academic Pub-
lishers, Norwell, MA, 2001.
[84] By Keren Horowitz and Dahlia Malkhi. Estimating Network Size
from Local Information. The Information Processing Letters journal
88(5):237–243, December 2003.
[85] 施锡铨. 博弈论. 上海财经大学出版社, 1996.
[86] 谢识予. 经济博弈论. 复旦大学出版社, 1996.
[87] 张维迎. 博弈论与信息经济学. 上海人民出版社, 1996.
[88] L.V. 阿尔福斯著, 张立译. 复分析. 上海科学技术出版社, 1962.
参加的科研项目及发表的论文
参加的科研项目
[1] 国家自然科学基金: NP优化问题的难近似性、随机算法和在线算
法;
[2] 国家自然科学基金: NP优化问题的难近似性、随机算法和计算经济
学;
[3] 基础研究重大项目前期研究专项:“信息和知识共享”的系统理论。
发表的论文
[1] Jiantao Song, Chaofeng sha, and Hong Zhu. Nash Equilibria in Parallel
Downloading with Multiple Clients. The 24th International Conference
on Distributed Computing Systems (ICDCS-2004), 2004, pp94-101.
[2] Jiantao Song, Yong Zhang, Chaofeng Sha and Hong Zhu. Building
Semantic Peer to Peer Networks upon CAN. The 5th. International
Workshop on Networked Group Communications(NGC 2003), Lecture
Notes in Computer Science 2816 pp.95-106, 2003.
[3] 宋建涛,沙朝峰,杨智应,朱洪。语义对等网构造及搜索机制研究,
《计算机研究与发展》,Vol. 41,No.4,2004,pp:645-652。
[4] 杨智应,朱洪,宋建涛。矩阵条件数与高斯算法平滑分析的进一步研
究,《软件学报》,已录用。
[5] 宋建涛,沙朝峰,杨智应,朱洪。多类流量下大容量路由器分组调度
算法研究,《计算机学报》,已修改待录用。
[6] Jiantao Song and Hong Zhu. Design of DHTs with Heterogeneity Con-
siderations,拟提交INFOCOM 2005。