Research on InfiniBand-Based Network Storage System Architecture and Volume Allocation Strategy
Abstract
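With data growing rapidly and storage capacity expanding sharply in networked environments, I/O (input/output) performance has become the main metric for evaluating external storage systems. Because InfiniBand (IB) offers high bandwidth and low latency, it is widely used as the interconnect between high-performance computing nodes. Extending this interconnect to the storage system can significantly improve storage I/O performance, and it also makes it easier to unify the network between compute nodes with the storage network, which greatly simplifies cluster management. With this goal, this paper studies the architecture of InfiniBand network storage systems and the allocation strategy for storage volumes. The research covers the following: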
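     For the network storage architecture, the InfiniBand architecture is mapped onto the SCSI architecture on the basis of SRP (SCSI RDMA Protocol), and the I/O path of an InfiniBand storage network is summarized. A target simulator for an InfiniBand-based storage system is designed using communication queue pairs, kernel bypass, and remote direct memory access. Experiments show that the I/O bandwidth a remote host obtains when accessing the target's storage volumes over InfiniBand is much higher than over Fibre Channel or Gigabit Ethernet, and is even five times the I/O bandwidth of access to the host's local disk. The study also concludes that the target's I/O bandwidth utilization is constrained by the bandwidth of the RAID card, and therefore proposes combining multiple RAID cards at the back end of the target to raise the aggregate bandwidth of back-end disk access.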
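The RAID-card constraint described above can be illustrated with a small model. This sketch is not taken from the paper: it assumes that the bandwidth the target can deliver is the smaller of the link bandwidth and the sum of the back-end RAID-card bandwidths, and that card bandwidths add up linearly. The function names and all numbers are hypothetical placeholders, not measured results.

```python
def target_bandwidth(link_mb_s, raid_cards_mb_s):
    """Bandwidth the target can deliver, in MB/s.

    Assumption: it is the slower of the link and the summed bandwidth of
    the back-end RAID cards, which are taken to aggregate linearly.
    """
    if link_mb_s <= 0:
        raise ValueError("link bandwidth must be positive")
    backend_mb_s = sum(raid_cards_mb_s)
    return min(link_mb_s, backend_mb_s)


def link_utilization(link_mb_s, raid_cards_mb_s):
    """Fraction of the link bandwidth that the back end can actually fill."""
    return target_bandwidth(link_mb_s, raid_cards_mb_s) / link_mb_s


if __name__ == "__main__":
    LINK = 1000.0  # hypothetical link bandwidth, MB/s
    CARD = 300.0   # hypothetical bandwidth of one RAID card, MB/s
    for cards in range(1, 5):
        share = link_utilization(LINK, [CARD] * cards)
        print(f"{cards} card(s): {share:.0%} of link bandwidth")
```

Under these assumptions, utilization rises with each added card until the link itself becomes the limit, which is the motivation for combining several RAID cards behind one target.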
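     For logical volume authorization, authorized access to and masking of the storage system's logical storage resources are implemented through a dynamic mapping between the globally unique identifier (GUID) of a host channel and the logical volumes. The algorithm is added to the InfiniBand target as an authorization module. System tests show that the dynamic mapping authorization algorithm effectively ensures both the security and the flexibility of access-permission settings for the storage system's logical volumes.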
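A minimal sketch of such a GUID-to-volume mapping follows. It is an illustration only, not the paper's algorithm: the class name, method names, GUID strings, and volume names are all hypothetical, and the mapping is held in an in-memory dictionary that can be changed at run time.

```python
class VolumeAuthorizer:
    """Hypothetical authorization table: host GUID -> set of logical volumes."""

    def __init__(self):
        self._grants = {}

    @staticmethod
    def _norm(guid):
        # Treat GUIDs as case-insensitive hex strings (an assumption).
        return guid.strip().lower()

    def grant(self, guid, volume):
        """Make a logical volume visible to the host with this GUID."""
        self._grants.setdefault(self._norm(guid), set()).add(volume)

    def revoke(self, guid, volume):
        """Mask a logical volume from the host again."""
        key = self._norm(guid)
        volumes = self._grants.get(key)
        if volumes is not None:
            volumes.discard(volume)
            if not volumes:
                del self._grants[key]

    def visible_volumes(self, guid):
        """Volumes the host may see; every other volume is masked."""
        return sorted(self._grants.get(self._norm(guid), ()))

    def is_authorized(self, guid, volume):
        return volume in self._grants.get(self._norm(guid), ())
```

A target could call `is_authorized` before serving a request and `visible_volumes` when a host asks which volumes exist. Because the table is mutable, permissions can be changed without restarting the target, which is one way to read "dynamic mapping".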
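     For data sharing in a SAN, a centrally controlled shared-storage management scheme for IB-SAN is presented. It uses a metadata server to implement SAN storage management and improves system performance through InfiniBand network extension, third-party data transfer, and software RAID. A strategy for dynamically establishing disk mappings is proposed to provide the node servers in the system with a virtual view of the physical storage devices. Tests show that the shared storage manager keeps storage resources synchronized, supports online addition and removal of volumes by users, is portable across platforms, has a simple interface, and is easy to expand.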
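The idea of a metadata server that hands node servers a virtual view of physical disks can be sketched as follows. This is an assumption-laden illustration, not the paper's design: the `Extent` and `MetadataServer` names, the extent-based layout, and the version counter used to detect stale views are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one physical disk."""
    disk: str
    start_block: int
    block_count: int


class MetadataServer:
    """Hypothetical central map: virtual volume -> ordered list of extents."""

    def __init__(self):
        self._volumes = {}
        self._version = 0  # bumped on every change so nodes can spot stale views

    def add_volume(self, name, extents):
        """Add a volume online; node servers see it in their next view."""
        if name in self._volumes:
            raise ValueError(f"volume {name!r} already exists")
        self._volumes[name] = tuple(extents)
        self._version += 1

    def remove_volume(self, name):
        """Remove a volume online."""
        del self._volumes[name]
        self._version += 1

    def view(self):
        """Snapshot of the virtual view handed to a node server."""
        return self._version, dict(self._volumes)

    def resolve(self, name, block):
        """Translate a virtual block number into (disk, physical block)."""
        if block < 0:
            raise IndexError("block number must be non-negative")
        offset = block
        for extent in self._volumes[name]:
            if offset < extent.block_count:
                return extent.disk, extent.start_block + offset
            offset -= extent.block_count
        raise IndexError("block beyond end of volume")
```

A node server that caches a view can compare its cached version number with the server's current one before issuing I/O, which is one simple way to keep all nodes synchronized while volumes are added or removed.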
With rapid data growth and fast storage capacity expansion in networked environments, the I/O performance of a storage system has become the key indicator for evaluating it. Because InfiniBand offers high bandwidth and low latency, it is widely used to interconnect high-performance computing nodes. Extending this network to storage systems can markedly improve storage I/O performance and also helps unify the compute-node network with the storage network, which makes cluster systems much easier to manage. To this end, this paper studies InfiniBand network storage system architecture and volume allocation strategy. The research covers the following:
     For the network storage system architecture, the InfiniBand architecture is mapped to the SCSI architecture on the basis of the SCSI RDMA Protocol, and the I/O path of the InfiniBand storage network is summarized. Using queue pairs, kernel bypass, and Remote Direct Memory Access, a target simulator for an InfiniBand-based storage system is designed. Experiments show that the I/O bandwidth over an InfiniBand connection is much higher than over Fibre Channel or Gigabit Ethernet, and even five times the I/O bandwidth of accessing the local disk. The study further concludes that the I/O bandwidth utilization of the target is constrained by the bandwidth of the RAID card, and proposes combining multiple RAID cards and accessing the target concurrently to increase the aggregate bandwidth of back-end disk access.
     For logical volume authorization, authorization and masking of logical resources are implemented through a mapping between the port Globally Unique Identifier (GUID) and the logical volumes. Tests verify that the scheme makes access authorization both secure and flexible.
     For data sharing in the storage area network, a centrally controlled scheme for managing shared storage in IB-SAN is given. It uses a metadata server to implement shared-storage management, extends the network through InfiniBand, and improves system performance through third-party data transfer and software RAID. A strategy for dynamically establishing the disk map is proposed to provide node servers with a virtual view of the physical storage devices. Tests indicate that the shared-storage manager keeps storage resources synchronized, supports adding and removing volumes online, works across platforms, has a simple interface, and is easy to expand.