面向P2P的Markov模型

设为首页

收藏本站

网站地图 | English | 公务邮箱

远程访问

NSTL服务站

面向P2P的Markov模型

详细信息本馆镜像全文| 推荐本文 | | 获取CNKI官网全文

英文题名：Markov-based Models for P2P Systems
作者：徐陈锋
论文级别：博士
学科专业名称：控制理论与控制工程
中文关键词：P2P ; Markov决策过程 ; Markov切换空间模型 ; 事件 ; 网络电视 ; 视频点播
英文关键词：P2P ; Markov Decision Processes ; Markov Switching-Space Models ; Events ; IPTV ; VoD
学位年度：2008
导师：奚宏生
学科代码：081101
学位授予单位：中国科学技术大学
论文提交日期：2008-04-01

摘要

随着网络和信息技术的发展,基于计算机网络的各种应用和创新不断出现。在当前网络技术研究和应用中,P2P(Peer-to-Peer)和IPTV(Intemet Protocol Television)都是人们关注和研究的热点。由于节点或用户的频繁进出,这两类系统都具有极大的随机性。而且在大规模应用中,这些系统在网络中的覆盖面往往极为广泛,并且和其它大量服务集成在一起,表现出分布式分层的结构特征,确定其性能的主要因素往往很复杂,并且各种因素相互制约。应用系统的性能瓶颈问题和优化问题成为研究、设计和实现这些系统的关键问题,其中有些可以直接通过技术改造和升级来解决,但需要大量人力物力。从系统理论的观点来看,其中一些问题也是可以在现有环境下通过现代优化和控制的理论和方法来解决的,特别是,还可以利用基于Markov过程的模型,以避开具有复杂物理背景的精确数学模型的构造与辨识。
     考虑一类具有分层结构的非结构化P2P系统的层级构成和“组划分”问题。这类系统按照某种原则分成一个个由若干节点构成的组,并在每个组中选出一个超级节点来索引本组内的节点和资源,普通节点的查询通过超级节点的组内查询和组间查询完成。这类系统实际将网络分成了两层,第一层是组内系统,以超级节点为中心;第二层是以超级节点为代表的组间系统,采用纯分布式P2P方式和使用泛洪式搜索。由于系统规模和节点分布是随时间动态变化的,组的划分需要进行适应,以保汪最优的系统总体性能。本文将这类系统的“组划分”及其切换行为进行抽象,利用Markov切换空间模型进行描述:每一个组划分下的状态过程对应于一个状态空间,其动态性可以用Markov过程描述;而组划分切换的控制对应于Markov切换空间模型的行动,根据给定的策略进行选取。可以证明Markov切换空间模型等价为Markov决策过程或参数化Markov报酬过程,在此基础上,利用策略迭代和梯度方法求解最优策略并通过仿真验证所获结果的正确性。
     在实际工程中的一种具有P2P特征的分布式VoD(Video on Demand)系统,采用了全局中心化的目录服务器对存储到网络边缘服务节点上分段缓存的数据进行索引和定位。针对这种中心化资源定位服务存在的问题,使用将随机漫步和中心目录服务器结合的混合搜索方法提高系统的扩展性和降低目录服务器或网路单点失效问题的影响。在此基础上,利用基于Markov过程的模型来描述和讨论采用这种方法的资源定位服务。为了应对大状态空间的问题,保证状态空间不会进一步扩大,本文利用基于事件的方法进行建模,并根据状态的不同性质进行状态聚集,以降低问题的复杂程度。还利用简化的一次定位模型,讨论了混合方法的搜索效率和代价的控制问题。考虑到实际中接入控制和定位服务是息息相关的,本文将接入控制引入资源定位服务模型,和定位服务结合在一起进行了讨论。
With the development of network and information technology, there are more and more various applications and innovations based on computer networks. Recently, P2P (Peer-to-Peer) and IPTV (Internet Protocol Television) are those famous research and application fields of the current network technology. They are both stochastic systems due to the frequent accesses of their nodes or users. And they are generally integrated with many other services and cover a large range of Internet. Therefore, each of them may have a distributed and hierarchical structure and there exist many complicated factors constraining each other in system performances. The performance bottlenecks and optimizations have been become the key problems currently on research, design and implementation of these application systems. Some of them can be solved by direct technical reconstruction or update, which is generally not low-cost. From the view of system theory, some of them can also be solved by using modern optimization and control theories and methods. Especially, the models based on Markov processes can also be applied in order to avoid constructing and identifying the accuracy mathematical descriptions with complicated physical backgrounds.
     Considering the hierarchical structure and switching problem for a class of hierarchical unstructured P2P systems. According to some given principle, these systems are divided to many groups with lots of nodes and elect a supper node for one group. Each supper node indexes the nodes and resources in its own group. A normal node has to search resources via the super node's location service, which consists of the internal search engine in group and the one between groups. These systems actual have a two-layered structure: one is the single group around its supper node, the other is the subsystem between groups, which is a pure P2P and uses flooding-like search. Since the system scale and distribution of nodes varies dynamically, the partitioning has to be adaptive to maintain the system's overall performance. In this paper, a Markov switching-space model is introduced to describe their behavior of dynamic grouping. The state process under each partitioning is characterized as a Markov process with a special state space. The controls of grouping behavior correspond to the actions of Markov switching-space model, which are selected according to some given policy. It can be proved that the Markov switching-space model is equivalent to a Markov decision process or a parameterized Markov reward process. Therefore, the optimal policies for dynamic grouping can be given by applying policy iterations and gradient methods, which are verified by some simulations in this paper.
     A kind of P2P-enhanced distributed VoD (Video on Demand) system in a practical project has a global centralized directory server to index all the data of segment caches on edge service nodes and to provide location service for them. This solution has poor scalability and suffers from single point of failure problem. Therefore, a hybrid search which combines random walk and the existent global service is proposed. And models based on Markov processes are introduced to characterize and discuss this hybrid location service. In order not to extend the large state space, event is used in models. And also state clustering is applied to reduce computational complexity. Then, a simplified single location model is placed to discuss its effect, search efficiency and control problem of search costs. Considering the close relation between admission control and location service, a more complicated model combined with admission control is also discussed.

引文

刘克.2004.实用马尔可夫决策过程[M]北京:清华大学出版社.
    胡奇英,刘建墉.2000.马尔可夫决策过程引论[M].西安:西安电子科技大学出版社.
    候振挺,郭先平.1998.马尔可夫决策过程[M].长沙:湖南科技出版社
    唐昊.2002.Markov控制过程的优化理论和算法[D]:博士.合肥:中国科学技术大学
    候振挺.1982.Q-过程的唯一性准则[M].长沙:湖南科技出版社
    李衍杰.2006.扩展Markov决策过程的性能灵敏度分析与优化[D]:博士.合肥:中国科学技术大学
    许永明,谢质文,欧阳春.2006.IPTV:技术与应用实践[M].北京:电子工业出版社.
    胡兴军.2007.我国交互式网络电视(IPTV)发展现状、存在的问题及促进措施[J].影像技术,2007(3):6-10.
    冯亮.2006.P2P技术综述[J].科技经济市场,2006(11):22-23.
    周文莉,吴晓非.2006.P2P技术综述[J].计算机工程与设计,27(1):76-79.
    罗杰文.2005.Peer to Peer(P2P)综述[EB/OK].中科院计算技术研究所.http://www.intsci.ac.cn/users/luojw/papers/p2p.htm.
    徐恪,叶明江,胡懋智.2007 P2P技术现状及未来发展[J].中兴通讯技术,13(6):6-10.
    王洪波,马轶慧.2007 P2P流媒体技术原理及应用[J].中兴通讯技术,13(6):18-21.
    Feinberg E A and Shwartz A.2002.Handbook of Markov decision processes[M].Boston:Kluwer Academic.
    Puteman M L.1994.Markov decision processes:discrete stochastic dynamic programming[M].NewYork:Wiley.
    Alterman E.1999.Constrianed Markov decision processes[M].Boca Raton:Chapman and Hall/CRC.Bertsekas D P and Shreve S E.1978.Stochastic optimal control:the discrete-time case[M]New York:Academic Press.(republished by Athena Scientific,1997)
    Bertsekas D P.1995a.Dynamic Programming and optimal control:volume Ⅰ[M]Bellmont:Athena Scientific.
    Bertsekas D P.1995b.Dynamic Programming and optimal control:volume Ⅱ[M]Bellmont:Athena Scientific.
    Filar J A and Vrieze K.1996.Competitive Markov decision Processes[M]New York/Berlin/Heidelberg:Springer-Verlag.
    Hernandez-Lerma O.1989.Adaptive Markov control processes[M]New York:Springer-Verlag
    Hemandez-Lerma O and Lasserre J B.1996,Discrete-time Markov control processes-basic optimality criteria[M].New York:Springer-Verlag
    Hernandez-Lerma O and Lasserre J B.1999.Furter topics on discrete-time Markov control processes [M].New York:Springer-Verlag.
    Sennot L I.1999.Stochastic dynamic programming and control of queueing system[M]New York:Wiley Press.
    White D J.1993 Markov decision processes[M].Chichester:Wiley Press.
    Bersekas D P and Tsitsiklis J N 1996 Neuro-Dynamic Programming[M].Belmont/MA:Athena Scientific
    Sutton R S.and Barto A G.1998.Reinforcement learning:an introduction[M].Cambridge/MA MIT Press.
    Cao X R.1998 The relation among potentials,perturbation analysis,Markov decision processes,and other topics[J].Journal of Discrete Event Dynamic Systems,8:71-87.
    Cao X R and Guo X R.2004.A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria:Multichain cases[J].Automatica,40:1749-1759.
    Cao X R.2003.From perturbation analysis to Markov decision processes and reinforcement learning[J]Discrete Event Dynamic Systems:Theory and Applications,13:9-39.
    Cao X R and Chen H F.1997.Perturbation realization,potentials,and sensitivity analysis of Markov processes[J]IEEE Transactions on Automation and Control,42(10):1382-1393.
    Xi H S,Tang H and Yin B Q.2003.Optimal policies for a continuous time MCP with compact action set[J].Acta automatica sinica,29(2):206-211.
    Cao X R.2007.Stochastic learning and optimization:a sensitivity-based approach[M].New York:Springer.
    Marbach P and Tsitsiklis J N.2001.Simulation-based optimization of Markov reward processes[J].IEEE Transactions on Automation and Control,46(2):191-209.
    Howard R A.1960.Dynamic Programming and Markov Processes[M].New York:Wiley.
    Cao X R.2005 Basic ideas for event-based optimization[J].Discrete Event Dynamic Systems:Theory and Applications,15(2):169-197.
    Cao X R.2004.The potential structure of sample paths and performance sensitivities of Markov systems [J].IEEE Transaction on Automatic Control,49(12):2129-2142. Dijk N V 1993 Queueing Networks and Product Forms: A Systems Approach[M]. Chichester: John Willey and Sons

    Ren Z Y and Krogh B H. 2001. Switching control in Multi-mode Markov decision processes[J]. Pro-ceedings of the 40th IEEE Conference on Decision and Control, 2095-2101.

    Guo X P and Liu K. 2001. A Note on Optimality Conditions for Continuous-Time Markov Decision Processes With Average Cost Criterion[J]. IEEE Transaction on Automatic Control, 49(12): 1984-1989

    Guo X P and Lerma O H. 2003 Continuous-time controlled Markov chains[J] The annals of applied probability, 13(1): 363-1989.

    Jiang Q, Xi H S and Yin B Q. 2007. Optimization of Semi-Markov switching state-space control pro-cesses for network communication systems[C]. Proceedings of the 26th Chinese Control Conference,707-711.

    Benevenuto F, Ismael J Jr and Almeida J. 2004. Quantitative evaluation of unstructured peer-to-peer architectures[J]. Proc. of HOT-P2P'04, Washington, IEEE Computer Society, 56-65.

    Lin T N, Wang H P and Wang J M. 2004. Search performance analysis and robust search algorithm in unstructured peer-to-peer networks[C], Proc. of CCGrid 2004, Washington, IEEE Computer Society,346-354.

    Wang S Q, Dong X and Zhao W. 2005. Analyzing and enhancing the resilience of structured peer-to-peer systems[J]. Journal of parallel and distributed computing, 65(2): 207-219.

    Spognardi A and Di Pietro R. 2006. A formal framework for the performance analysis of P2P net-works protocols[EB/OK]. IEEE Computer Society. http://www.cecs uci.edu/papers/ipdpsO6/pdfs/1568976643-HOTP2P-paper-1 pdf.

    Gaeta R and Sereno M. 2006. Model-based evaluation of search strategies in peer-to-peer networks.IEEE Computer Society. http://www.cecs.uci.edu/papers/ipdps06/pdfs/1568976692-HOTP2P-paper-1.pdf

    Cooper R B 1981. Introduction to querying Theory[M]. New York: Elsevier North Holland, Inc.Tang H, Xi H S and Yin B Q. 2003. Performance optimization of continuous-time Markov control processes based on performance potentials[J]. International Journal of Systems Science, 34(1): 63-71.

    Cao X R. 2004 The Potential Structure of Sample Paths and Performance Sensitivities of Markov Systems[J]. IEEE Transactions on Automatic Control, 49(12): 2129-2142.
    Chong E K P and Ramadge P J.1994.Stochastic Optimization of Regenerative Systems Using Innitesimal Perturbation Analysis[J].IEEE Transactions on Automatic Control,39:1400-1410.
    Fang H T and Cao X R.2004.Potential-Based On-line Policy Iteration Algorithms for Markov Decision Processes[J].IEEE Transactions on Automatic Control,49(4):493-505.
    Cranor C D,Green M,Kalmanek G,Shut D,Sibal S,Van der Merwe J E and Sreenan C J.2001
    Enhanced streaming services in a content distribution network[J].IEEE Internet Computing,5(4):66-75.
    Rejaie R,Yu M H H and Estrin D.2000.Multimedia proxy caching mechanism for quality adaptive streaming apphcations in the internet[C].In Proc.IEEE INFOCOM 2000,2:980-989.
    Fabmi H,Latif M,Sedigh-Ali S,Ghafoor A,Liu P and Hsu LH.2001 Proxy servers for scalable interactive video support[J].IEEE Computer,34(9)54-60.
    Liu J C and Xu J L.2004 Proxy caching for media streaming over the Internet[J]IEEE Communications Magazine,42(8):88-94.
    Gruber S,Rexford J and Basso A.2000.Protocol considerations for a prefix-caching for multimedia streams[J]Computer Network,33(1-6):657-668.
    Chen S,Wang H,Zhang X,Shen B and Wee S.2005.Segment-based proxy caching for internet streaming media delivery[J].IEEE Multimedia,12(3):59-67.
    Chen S,Shen B,Wee S and Zhang X.2006.Segment-based streaming media proxy:modeling and optimization[J].IEEE Transactions on Multimedia,8(2):243-256.
    Chen S,Shen B,Wee S and Zhang X 2007.SProxy:A Caching Infrastructure to Support Internet Streaming[J]IEEE Transactions on Multimedia,9(5):1062-1072.
    Wang Y,Zhang Z,Du D and Su D.1998.A network-conscious approach to end-to-end video delivery over wide area networks using proxy servers[C].In Proceedings of IEEE INFOCOM98,2:660-667.
    Jiang X,Dong Y,Xu D and Bhargava B.2003.Gnustream:a p2p media streaming system prototype[C].In Proceedings of the International Conference on Multimedia and Expo(ICME),2:325-328.
    Hefeeda M,Habib A,Botev B,Xu D and Bhargava B.2003.PROMISE:Peer-to-Peer Media Streaming Using CollectCast.In Proc.of ACM Multimedia 2003,45-54.
    Hefeeda M,Bhargava B K and Yau D K Y.2004.A hybrid architecture for cost-effective on-demand media streaming Computer Networks 44(3):353-382.
    Do T T,Hua K A and Tantaoui M A.2004.P2VOD:providing fault tolerant video-on-demand streaming in peer-to-peer environment[C].2004 IEEE International Conference on Communications,3:1467-1472.
    Jin Li. 2005. PeerStreaming: An On-Demand Peer-to-Peer Media Streaming Solution Based On A Receiver-Driv en Streaming Protocol[C] 2005 IEEE 7th Workshop on Multimedia Signal Processing,1-4.

    Padmanabhan V, Wang H, Chou P and Stripanidkulchai K. 2002. Distributing streaming media content using cooperative networkmg[C]. In Proceedings of the 12th international workshop on Network and operating systems support for digital audio and video, 177-186

    Cahill A J and Sreenan C J. 2006. An Efficient Resource Management System for a Streaming Me-dia Distribution Network[J]. International Journal of Interactive Technology and Smart Education(ITSE), 3(1): 31-44.

    Guo L, Chen S and Zhang X. 2006. Design and evaluation of a scalable and reliable P2P assisted proxy for on-demand streaming media delivery[J]. IEEE Transactions on Knowledge and Data Engineering,18(5): 669-682.

    CaoXR. 1994. Realization probabilities: The dynamics of queueingsystems[M]. New York: Springer-Verlag.

    Gong WB. 1998. Smoothed Perturbation Analysis of Markovian Queueing Networks[C]. In Proceed-ings of American Control Conference, 456-461.

    Fu M C and H J Q. 1994 Smoothed perturbation analysis derivative estimation for Markov chains[J].Operations Research Letter, 15(2): 241-251.

    Cao X R and Wan Y W. 1998. Algorithms for Sensitivity Analysis of Markov Systems Through Poten-tials and Pertur bation Realization[J]. IEEE Transactions on Automation and Control, 6(4): 482-494.

    Ho Y C, Zhao Q C and Pepyne D. 2003. The no free lunch theorem, complexity and compute security[J].IEEE Transactions on Automation and Control, 48(5): 783-793.

    Cao X R, Ren Z Y, Bhatnagar S and Marcus S 2002. A time aggregation approach to Markov decision processes[J] Automatica, 38(6): 929-943.

    Forestier J and Varaiya P. 1978. Multilayer control of large Markov chains[J]. IEEE Transactions on Automatic Control, AC-23: 298-304.

    Chang H S, Fard P J, Marcus S I and Shayman M. 2003. Multitime scale Markov decision processes[J].IEEE Transactions on Automatic Control, 48(6): 976-987.

    Wan Y W and Cao X R. 2006. The control of a two-level Markov decision process by time aggregation.Automatica, 42(3): 393-403.

    Ren Z Y and Krogh B H. 2002. State aggregate in Markov decision processes[C]. In Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, 3819-3824.
    Ren Z Y and Krogh B H.2001.Switching control in multimode Markov decision processes[C].In Proceedings of the 40th IEEE Conference on Decision and Control,Orlando,2095-2101.
    Ratnasamy S,Francis P,Handley M,Karp R and Shenker S.2001.A scalable content-addressable network[C].In Proceedings of the 2001 ACM SIGCOMM Conference,San Diego,California,USA,161-172.
    Stoica I,Morris R,Karger D,Kaashoek F and Balakrishnan H,Chord:a scalable peer-to-peer lookup service for internet applications[C]In Proceedings of the 2001 ACM SIGCOMM Conference,San Diego,California,USA,149-160.
    Rowstron A and Druschel P.2001 Storage management and caching in past,a large-scale,persistent peer-to-peer storage utility[C].ACM SOSP,Banff,Alberta,Canada,188-201.
    Zhao B Y,Kubiatowicz J and Joseph A D.2001 Tapestry:an infrastructure for fault-resilient wide-area location and routing[C].Technical Report UCB/CSD-01-1141,University of California at Berkeley

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700