Research on a High-Speed Data Storage Method
Abstract
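The rapid development of information technology has brought human society fully into the digital age, most visibly through the explosive growth in the total volume of information and the amount exchanged, and through the continual emergence of new application fields. Transmitting, processing, and storing data at this scale presents storage systems with unprecedented opportunities and challenges.
     Data transmission and data storage are the two most important stages in a storage system. Against the background of seismic oil and gas exploration instrument systems, this thesis explores how to extract the maximum performance from network bandwidth and disk drive throughput using existing, widely available equipment. With the current architecture of a general-purpose computer, handling network communication and data storage requires at least two data copies, first from the network to memory and then from memory to disk; combined with frequent interrupt handling, this makes high-speed, sustained, reliable data transmission and storage difficult. Therefore, building on a study of current data storage systems, this thesis designs a prototype for high-speed data transmission and storage. The prototype exploits the strength of FPGAs in parallel event processing to reduce the processor's burden in handling transmission and storage protocols and to lower its interrupt rate, yielding a low-power, low-cost data storage system; it also frees the processor to devote more resources to algorithm processing and data computation.
     This thesis is organized into the following five chapters:
     Chapter 1 serves as the introduction. Starting from storage requirements, it introduces the criteria for evaluating storage devices and storage systems. After analyzing the basic requirements that current instruments place on data transmission, processing, and storage, together with the throughput bottlenecks of existing general-purpose computer platforms, it compares alternative schemes and proposes a storage prototype design based on a hardware implementation.
     Chapter 2 starts from basic storage devices and introduces methods for expanding storage system capacity, improving performance, and enhancing reliability. To give a better understanding of how storage systems operate, it discusses how storage channels and interfaces are implemented. Finally, with storage system design methods in view, it briefly analyzes approaches to optimizing data storage.
     Chapter 3 describes in detail the technical approach and the hardware implementation used in developing the storage prototype. It explains the system design from several aspects, including clocking, reset, interfaces and buses, and power supply. It focuses on the principles, analysis, and implementation of the two components that play key roles in data transmission and storage: reliable data transmission based on Gigabit Ethernet, and the Serial ATA interface.
     Chapter 4 presents the preliminary test results obtained so far, including the synchronization accuracy achieved by digital clock synchronization, the throughput of reliable data transmission over Gigabit Ethernet, the link initialization results between the Serial ATA interface and a disk drive, and the throughput of PCI Express, followed by a discussion of the results.
     The final chapter summarizes the thesis and looks ahead, offering suggestions for further in-depth research on data transmission, processing, and storage using this prototype system as a platform.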
The rapid development of modern information technology has brought human society into the digital age. The volume of information and the amount exchanged have grown dramatically, and new application fields continue to emerge. Transmitting, processing, and storing such large amounts of data bring unprecedented opportunities and challenges for storage systems.
     Data transmission and data storage are the two most important aspects of a storage system. Against the background of seismic oil and gas exploration systems, this thesis explores how to obtain the maximum network and disk drive performance from pervasive devices. Memory copies and frequent interrupts make it difficult for a PC to perform high-speed, continuous, reliable data transmission and storage. We design a hardware-based prototype that achieves low-power, low-cost, high-speed, reliable data transmission and storage. Taking advantage of the FPGA's parallel processing, we relieve the CPU of the burden of transmission and storage protocols, so that more of its resources can be used for algorithm processing and data operations.
     The thesis is divided into the following five chapters:
     Based on the requirements of data storage, Chapter 1 introduces the evaluation criteria for storage devices and systems. After analyzing the needs of seismic oil and gas exploration systems, we propose a storage prototype with a hardware implementation.
     Chapter 2 begins with the tiered storage approach, introduces the basic storage devices, and analyzes methods for improving storage system performance and reliability. We then discuss the implementation of storage interfaces and channels. Finally, a brief analysis of storage optimization is presented.
     Chapter 3 describes the details of the hardware implementation of the prototype, covering digital clock synchronization, reset, interfaces, buses, and the power system. The schematic diagram and the state machine are provided. We mainly discuss reliable data transmission over Gigabit Ethernet and data storage through the SATA interface.
     Chapter 4 gives the preliminary test results, including the accuracy of the digital clock synchronization, the throughput of reliable data transmission over Gigabit Ethernet, and the link initialization results for a SATA hard disk drive, along with a brief discussion and analysis of the results.
     Finally, the thesis closes with a summary and outlook. The prototype can serve as a platform for in-depth study of data transmission, processing, and storage.
