Research on Replica Management Strategies in Educational Resource Grids
Abstract
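With the spread of computer technology and the development of digital campuses, primary and secondary schools across the country have accumulated large volumes of educational resources, including high-quality courseware, video tutorials, and lab reports. However, each school manages these resources in its own way, so little is shared between schools and isolated information islands have formed. The educational resource grid was proposed to share these resources effectively: it treats each school as a virtual organization and manages educational resources using grid concepts, which provides an efficient platform for sharing them and avoids wasted resources and duplicated software development.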
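     Because an educational resource grid is a kind of data grid, replica management plays an equally important role in it. First, replicas improve resource availability and prevent resources from becoming unavailable when a node leaves the grid. Second, replication spreads requests for a file across the nodes that hold replicas instead of concentrating them on a single node, which balances grid load to some extent. Third, creating replicas on the requesting node or on nearby nodes shortens the distance data must travel and improves data access efficiency. The distinctive characteristics of educational resource grids also make replica management in this environment well worth studying. This thesis addresses replica management in the educational resource grid environment; the main work is as follows: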
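     First, the thesis studies the educational resource grid environment and replica management techniques, and explains why replica management strategies matter in this environment. It introduces the concepts, characteristics, architecture, and research status of grids and data grids, and on that basis presents the distinctive characteristics and architecture of the educational resource grid, giving a clear description of the research environment. It then describes the concepts and characteristics of replica management and introduces each of its functional modules in detail.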
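     Second, it proposes a replica creation strategy based on node load. Educational resource grids have low storage and processing capacity and unevenly distributed bandwidth; to address this, the thesis proposes a load-based replica placement strategy. The strategy distinguishes intra-domain from inter-domain replica requests according to the administrative domains of the nodes involved. For an intra-domain request, the replica is created on a lightly loaded node to balance load within the domain; for an inter-domain request, the replica is created on the management node of the requesting domain to reduce the number of remote replica transfers. Experiments show that, in an educational resource grid environment, the strategy effectively shortens the average job execution time and improves grid performance.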
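The placement rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the `Node` record, the normalized `load` value, and the function name `choose_replica_site` are all hypothetical, and the abstract does not say how load is measured or how ties are broken.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    domain: str               # administrative domain the node belongs to
    load: float               # hypothetical normalized load in [0, 1]
    is_manager: bool = False  # True for the domain's management node


def choose_replica_site(requester: Node, holder: Node, nodes: List[Node]) -> Node:
    """Pick the node on which a new replica should be created.

    requester: node asking for the file
    holder:    node that currently stores the file
    """
    domain_nodes = [n for n in nodes if n.domain == requester.domain]
    if holder.domain == requester.domain:
        # Intra-domain request: place the replica on the least-loaded node
        # of the domain (excluding the node that already holds the file).
        candidates = [n for n in domain_nodes if n is not holder]
        return min(candidates, key=lambda n: n.load)
    # Inter-domain request: place the replica on the management node of the
    # requesting domain, so later requests from that domain stay local.
    return next(n for n in domain_nodes if n.is_manager)
```

With a file held in domain A, a request from another node in A would return A's least-loaded node, while a request from domain B would return B's management node.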
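     Third, it proposes a replica selection strategy based on fuzzy multi-attribute decision-making. The strategy uses a replica's availability, security, response time, and cost as the criteria for judging replica quality, which places more emphasis on quality of service. Because these attributes are hard to analyze and measure quantitatively, the thesis represents each attribute value as a triangular fuzzy number. The attributes also differ in evaluation standards and physical units, and may even conflict with and constrain one another. To make replica selection more objective and fair, and to reach the best choice by balancing all attribute values, the thesis treats replica selection as a multi-attribute decision-making problem, which offers a new approach to replica selection. A worked example illustrates the feasibility of the strategy.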
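To make the idea concrete, here is a minimal Python sketch of replica selection with triangular fuzzy numbers. The abstract does not state which aggregation method the thesis uses, so this sketch assumes a simple fuzzy weighted sum followed by centroid defuzzification; the attribute keys, the weights, and all function names are hypothetical, and every attribute value is assumed to be strictly positive.

```python
from typing import Dict, Tuple

TFN = Tuple[float, float, float]  # triangular fuzzy number (low, mode, high)

BENEFIT = ("availability", "security")  # larger is better
COST = ("response_time", "cost")        # smaller is better


def normalize(replicas: Dict[str, Dict[str, TFN]]) -> Dict[str, Dict[str, TFN]]:
    """Rescale every attribute into (0, 1] so that larger always means better."""
    norm: Dict[str, Dict[str, TFN]] = {name: {} for name in replicas}
    for attr in BENEFIT + COST:
        values = [r[attr] for r in replicas.values()]
        top = max(v[2] for v in values)     # largest upper bound
        bottom = min(v[0] for v in values)  # smallest lower bound
        for name, r in replicas.items():
            low, mode, high = r[attr]
            if attr in BENEFIT:
                norm[name][attr] = (low / top, mode / top, high / top)
            else:
                # Invert cost attributes; the bounds swap places.
                norm[name][attr] = (bottom / high, bottom / mode, bottom / low)
    return norm


def score(attrs: Dict[str, TFN], weights: Dict[str, float]) -> float:
    """Weighted sum of fuzzy numbers, defuzzified by the centroid (l + m + u) / 3."""
    total = [sum(weights[a] * attrs[a][i] for a in weights) for i in range(3)]
    return sum(total) / 3


def select_replica(replicas: Dict[str, Dict[str, TFN]], weights: Dict[str, float]) -> str:
    """Return the name of the replica with the highest defuzzified score."""
    norm = normalize(replicas)
    return max(norm, key=lambda name: score(norm[name], weights))
```

Normalizing first is what lets attributes with different units (seconds, monetary cost, probabilities) be combined; the weights then express how much each attribute matters to the user.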
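     Finally, it studies the OptorSim simulator in depth and improves the classic algorithms shipped with it. After introducing OptorSim's architecture, its main framework and classes, and several of its built-in classic replica management algorithms, the thesis analyzes the shortcomings of OptorSim's built-in LRU and LFU algorithms, proposes improved versions, implements them in OptorSim, and reports experimental results. Comparison with the results of the original algorithms shows that both improved algorithms perform noticeably better.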
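For readers unfamiliar with the two baseline policies, the following Python sketch shows how textbook LRU and LFU choose which replica to evict when local storage is full. It does not reproduce OptorSim's code or the thesis's improved variants, which the abstract does not describe; the `ReplicaStore` class is hypothetical and treats every file as having unit size.

```python
from collections import Counter
from typing import Optional


class ReplicaStore:
    """Toy local store that evicts one replica when capacity is reached."""

    def __init__(self, capacity: int, policy: str = "LRU"):
        self.capacity = capacity  # number of files that fit (unit-size files)
        self.policy = policy      # "LRU" or "LFU"
        self.files = set()
        self.last_used = {}       # file -> logical time of the latest access
        self.hits = Counter()     # file -> number of accesses while stored
        self.clock = 0

    def access(self, name: str) -> Optional[str]:
        """Record a request for `name`; return the evicted file, if any."""
        self.clock += 1
        evicted = None
        if name not in self.files:
            if len(self.files) >= self.capacity:
                evicted = self._victim()
                self.files.remove(evicted)
                del self.last_used[evicted]
                del self.hits[evicted]
            self.files.add(name)
        self.last_used[name] = self.clock
        self.hits[name] += 1
        return evicted

    def _victim(self) -> str:
        if self.policy == "LRU":
            # Least recently used: oldest last-access time.
            return min(self.files, key=lambda f: self.last_used[f])
        # Least frequently used: fewest accesses, oldest access breaks ties.
        return min(self.files, key=lambda f: (self.hits[f], self.last_used[f]))
```

For the request sequence a, a, b, c with capacity 2, LRU evicts `a` (it was touched longest ago) while LFU evicts `b` (it was requested only once), which shows how the two policies can disagree on the same workload.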
With the popularization of computer technology and the development of digital campuses, primary and middle schools across the country have accumulated a large number of educational resources, including excellent courseware, video tutorials, experimental reports, and so on. However, different schools manage these resources in their own ways. Resource sharing among schools is low, which creates isolated information islands. The educational resource grid is proposed in order to share educational resources efficiently. It views different schools as virtual organizations and manages educational resources based on the grid, which provides an efficient platform for educational resource sharing and avoids wasted resources and duplicated software development.
     Because the educational resource grid is a kind of data grid, replica management technology also plays a very important role in it. First, the existence of replicas improves the availability of resources and prevents resources from becoming unavailable when nodes leave the grid. Second, replica technology keeps requests for a given file from concentrating on a single node, which balances the grid load to some extent. Third, replica technology improves the efficiency of data access, because creating a replica on or near the requesting node reduces the distance of data transmission. The unique characteristics of the educational resource grid also make research on replica management in this environment significant. To address the replica management problems in the educational resource grid, the paper does the following work.
     First, the paper studies the educational resource grid environment and replica management technology, and explains the significance of replica management strategies in the educational resource grid. The paper introduces the concepts, features, architecture, and research status of grids and data grids, analyzes the unique characteristics and architecture of the educational resource grid, and clearly describes the research environment. Then, the paper elaborates on the concepts and features of replica management and introduces in detail the functional modules that replica management comprises.
     Second, the paper proposes a replica creation strategy based on node load. After analyzing the specific features of the educational resource grid model, the paper proposes a replica placement strategy based on node load. Depending on the domain in which the requesting node is located, the strategy distinguishes intra-domain replica requests from inter-domain replica requests. For a request within the same domain, the replica is created on a lightly loaded node to balance node load. For a request between domains, the replica is created on the management node to reduce the cost of remote replica transfers. The experiments suggest that, in the educational resource grid environment, the strategy can reduce the average execution time of jobs.
     Third, the paper proposes a replica selection strategy using fuzzy multi-attribute decision-making. The strategy uses availability, security, response time, and cost as the criteria for choosing between replicas. To evaluate the attributes more conveniently, triangular fuzzy numbers are used to describe them. The selection criteria (availability, security, response time, and cost) are heterogeneous and cannot be directly aggregated for selection; furthermore, the criteria may conflict with one another. To select replicas fairly and satisfy users, the paper treats the problem as a multi-attribute decision-making problem, which provides a new solution for replica selection. Finally, an example illustrates the feasibility of the strategy.
     Finally, the paper studies the OptorSim simulator in depth and improves the classic algorithms shipped with the simulator. To verify the validity of the replica creation strategy, the paper chooses OptorSim as its experimental tool. The architecture of OptorSim, its main framework and classes, and several of its classic replica management algorithms are introduced. By analyzing the shortcomings of OptorSim's implementations of LRU and LFU, the paper improves both algorithms. Comparison with the original algorithms shows that the improved algorithms perform better in the experiments.
