Research on Grid Data Sharing Technology for Ocean Environment Information Visualization
Abstract
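Ocean environment information is the foundation for developing and building the ocean, and it carries decisive weight in China's politics, economy, military affairs, and the protection of national rights and interests. Relevant national departments and local authorities have invested substantial manpower, material resources, and funding in ocean surveys and have obtained extremely rich materials and data. For a variety of reasons, however, these valuable scientific data are not fully used and cannot be shared across society, which causes considerable waste. Only by sharing ocean environment information can duplicate investment and duplicate surveys be reduced, long-standing data barriers be broken down, and the resources be used more effectively.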
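     This work is an important part of the National 863 Program project “Research on Grid-Based Ocean Environment Data Sharing and Information Service Technology” (No. 2006AA09Z139). The volume of global ocean environment information is enormous, and ocean data are stored across many different platforms with differing formats and structures. To manage and use these data effectively, this paper analyzes the formats and characteristics of ocean environment data and, on that basis, proposes a method for shared management of the data using grid technology. Massive information such as ocean environment data is stored in a distributed manner across multiple grid nodes and managed effectively across those nodes, so that users can access and use the resources conveniently, transparently, and quickly from any place at any time. This makes it easier for marine science research to obtain ocean environment data.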
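     Fully sharing the massive ocean environment data stored on multiple nodes requires the grid's metadata management and replica management techniques. By analyzing the various kinds of ocean environment data, this paper designs the overall structure of the metadata service and the data structures for each kind of metadata. With metadata management, a small amount of metadata can manage a massive amount of data. Wherever the data are stored, users can query by description or attribute to find the unique logical file name of the data they need, and then use replica management to obtain the physical location of the replica that can be accessed fastest. The paper also studies the strategies for replica creation, replica deletion, replica selection, and replica consistency management. Providing multiple copies of the data on multiple nodes in a wide-area network effectively reduces data access time and lowers the network bandwidth load.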
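The two-step lookup described above (attributes to logical file name, then logical file name to the fastest replica) can be illustrated with a minimal Python sketch. It is not the system's implementation: the catalog contents, the `lfn://` and `gsiftp://` names, the `est_seconds` access-time estimates, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    url: str            # physical location of one copy (hypothetical)
    est_seconds: float  # assumed, pre-measured access time from the client


# Metadata catalog: logical file name -> descriptive attributes (hypothetical).
METADATA = {
    "lfn://ocean/sst/2006-07": {
        "variable": "sea_surface_temperature", "year": 2006},
    "lfn://ocean/salinity/2006-07": {
        "variable": "salinity", "year": 2006},
}

# Replica catalog: logical file name -> copies held on different grid nodes.
REPLICAS = {
    "lfn://ocean/sst/2006-07": [
        Replica("gsiftp://node-a.example.org/data/sst_200607.nc", 4.2),
        Replica("gsiftp://node-b.example.org/data/sst_200607.nc", 1.3),
    ],
    "lfn://ocean/salinity/2006-07": [
        Replica("gsiftp://node-a.example.org/data/sal_200607.nc", 2.0),
    ],
}


def find_logical_names(**attrs):
    """Step 1: return logical file names whose metadata match every attribute."""
    return sorted(
        lfn for lfn, meta in METADATA.items()
        if all(meta.get(key) == value for key, value in attrs.items())
    )


def select_replica(lfn):
    """Step 2: pick the copy with the lowest estimated access time."""
    return min(REPLICAS[lfn], key=lambda replica: replica.est_seconds)


def locate(**attrs):
    """Map each matching logical file name to its fastest physical copy."""
    return {lfn: select_replica(lfn).url for lfn in find_logical_names(**attrs)}
```

Calling `locate(variable="sea_surface_temperature", year=2006)` returns the copy on `node-b`, because its assumed access time is the lowest. A real replica selection strategy would estimate that cost from live network and storage measurements rather than a fixed number.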
Ocean environment data are the foundation for exploiting and developing the ocean. They play an important role in the politics, economy, and military affairs of our country and in protecting its national interests. Many departments and local authorities have spent a great deal of manpower, material resources, and money on ocean surveys and have obtained a wealth of materials and data. For many reasons, however, these valuable data cannot be put to good use or shared adequately across society. To reduce duplicate investment and duplicate surveys, break down the various data barriers, and improve the benefit obtained from these resources, the data must be shared adequately.
     The work in this paper is an important part of the national “863” Program project “Research on Grid-Based Ocean Environment Data Sharing and Information Service Technology” (No. 2006AA09Z139). The amount of ocean environment data is massive. The data are stored across multiple different platforms, and their formats and structures differ as well. To make good use of the ocean environment data, a method for shared management of ocean data using grid technology is proposed, based on an analysis of the data formats and characteristics. The large volumes of ocean environment data are distributed across many nodes in the grid and managed efficiently, so users can access the data transparently at any time and from any place. This makes ocean research more convenient.
     To make good use of the massive ocean environment data on these nodes, metadata management and replica management technology must be applied. With metadata management, massive data can be managed with a small amount of metadata. No matter where the data are located, users can find the unique logical file name of the data from its description and attributes, and can then obtain the physical location of a replica through replica management. Replica management provides several copies of the data at different nodes in the WAN, which reduces data access time and the load on network bandwidth.
     Based on an analysis of the ocean environment data, a storage and management model for the data is designed. Service modules are developed with the grid middleware and toolkits provided by Globus, and a grid system for ocean environment data visualization is implemented. The system uses the Globus Toolkit and Web Services technology with OGSA-DAI as the underlying support platform, and adds data sharing services and data format conversion services to give users reliable support for data sharing.