网络存储系统预取协调优化研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
随着网络技术的飞速发展及云计算概念的提出,用户对存储系统的性能及安全性等方面都提出了更高的要求,云存储也成为目前各存储厂商进军的领域之一。为了提高存储系统的性能,存储系统提供商分别在存储客户端和存储服务器端增加了大容量的内存(main-memory buffers)空间作为缓存(caching)使用。然而,随着硬件的升级和缓存空间的不断增加,系统表现出来的性能却未随之提高,有时甚至出现下降的情况。研究表明,有两个原因限制了这种层次缓存有效性的发挥:第一,随着缓存能力的增加,存储客户端的命中率不断提高,致使到达存储服务器端请求的时间和空间局部性较差,从而导致基于局部性的算法如LRU等对缓存的利用率降低。第二,大量数据被存储客户端和服务器端冗余存储,使整个系统表现出来的性能远远没有达到系统中层次缓存的总和应该达到的性能。因此,如何才能更有效的利用缓存空间成为目前最为广泛研究的问题。
     本文针对当前存储系统的研究现况分析发现,目前存储系统多是直接应用单级存储系统中的预取算法,这些算法直接应用于多级存储系统将带来很多局限,如太过保守的预取算法不能有效的利用底层的缓存空间,而太过积极的预取算法又有可能穿越多层,产生大量的冗余数据,从而浪费了存储空间等。预取协调算法在不改变原有预取算法的基础上,只通过分析当前数据流的请求方式和Cache的状态来调整预取算法的积极性以提高Cache的利用率。本文通过对当前的预取协调方法进行详细分析,发现其存在两个问题:第一,对于随机请求的影响较小,不能明显提高服务器端的利用率。第二,只能用在同构的系统结构中,即在客户和服务器端使用相同的预取算法。
     针对预取协调算法存在的两个缺陷,本文提出了相应的改进方案,不仅提高了其在随机请求下的效率,还可将其应用在异构的存储系统架构当中。为了对改进后的预取协调算法进行验证,本文设计并实现了一个两级存储系统实验平台,分别使用不同的预取算法对改进的预取协调算法进行验证。实验结果表明,改进后的算法无论是对顺序访问模式还是随机访问模式,都能够有效的提高存储系统的性能。该算法还可以应用在异构的系统环境之中,这种异构的系统性能几乎完全取决于服务器端使用的预取算法的性能。预取协调算法为提高存储系统的性能提出了一种新的解决方案,不需要任何的额外花费,具有一定的理论价值和应用价值。
With the development of network technology and introduction of concept of cloud computing, users have put forward higher requirements in performance and security of storage system. Recently, cloud storage has become a new area that the storage vendors joined. In order to improve the performance of storage system, storage providers are using a large buffer as caching in both client and server. However, as hardware upgrades, the capacity of cache is increasing, the performance of storage system is not increasing, or decreasing in sometimes. Researches show that there are two reasons limit the effectiveness of the multi-level cache architecture. First, with the increase of cache capacity, the hit rate in storage client is increasing, then, the requests sent to server has poor temporal locality. This result in the algorithms based-locality such as LRU has a lower utilization. Second, the large amount of data was stored client-side and server-side. These redundant data made the whole system performance lower than the total amount of caches in the hierarchy should be demonstrated. So, how can improve the utilization of cache space become the most extensively studied problems.
     The paper makes a research on the storage system. The results show that the prefetching algorithm applied to multi-level storage system is the algorithm that used in single system. These algorithms directly applied to multi-level storage system also brings a lot of limitations, such as overly conservative prefetching algorithm will not be able to effectively use the lower-level cache space, while overly aggressive prefetching algorithm will be compounded across levels and generate large amounts of wasted prefetch. Thus wasting a lot of storage space. Prefetching coordination algorithm can improve the utilization of L2 Cache by analyzing the mode of the requests and forecasting the state of L2 Cache. And it does not change the original prefetching algorithm. Based on a detailed analysis to the current prefetching coordination algorithm, the thesis found two problems: first, it has little effect for the random request, and cannot improve the utilization of L2 Cache significantly. Second, it can only be worked with homogeneous combinations of prefetching algorithms at multiple levels.
     For the two defects of perfetching coordination algorithm, the thesis put forward a new improvement algorithm. It can not only improve its efficiency in the random request, can also be used in heterogeneous storage systems architecture. In order to validate the improved coordination algorithm, this thesis designs and implements a two-level storage system test bed, and uses different prefetching algorithm in it. The results show that the new algorithm can improve the performance of storage system in both random and sequential requests. It can be used in heterogeneous system environment, and the performance of this kind of heterogeneous system depends almost entirely on the prefetching algorithm that used in server side. Prefetching coordination algorithm provides a new method in improve the performance of storage system, and does not require any additional cost. So, it has very important theoretical and practical value.
引文
1 EMC Corporation. Symmetrix 3000 and 5000 Enterprise Storage Systems product description guide. http://www.emc.com/, 1999
    2 Transaction processing performance council. http://www.tpc.org
    3 T. Wong and J. Wilkes. My cache or yours? Making storage more exclusive. In Proceedings of the 2002 USENIX Annual Technical Conference, Jun 2002:161-175
    4 Y. Zhou, J. Philbin, and K. Li. The multi-queue replacement algorithm for second level buffer caches. In Proceedings of the 2001 USENIX Annual Technical Conference, 2001:91-104
    5胡越明.计算机系统结构.北京航空航天大学出版社,2007:148-162
    6连瑞琦,张兆庆,乔如良.指令集并行编译器的数据预取及优化方法.计算机学报. 2000,23(6):576-584
    7沈立,王志英,鲁健壮,戴葵.基于控制流的混合指令预取.电子学报. 2003,31(8):1141-1144
    8沈立,戴葵,王志英.以基于块为单位元的非顺序指令预取.计算机工程与科学. 2003,25(14):94-98
    9邓让钰,谢伦国,肖立权.一种硬件预取机构及其对系统影响的研究.计算机工程与科学. 2001,23(6):70-72
    10张百达.一种软硬结合的预取技术研究.国防科学技术大学研究生院. 2007:20-34
    11黄震春,李三立.填补内存间距的一种方法-前瞻性Cache.小型微型计算机系统. 2002,23(6):641-645
    12郇丹丹.高性能存储系统研究.中国科学院研究生院.2006:29-50
    13马婉良.微处理器片内存储系统的研究与设计.西北工业大学.1998:26-52
    14马婉良,高德远,张盛兵.微处理器设计中提高访存效率的一种方法.西北工业大学学报. 1999,17(3):338-343
    15孙华锦.面向存储系统瓶颈问题的技术研究.西北工业大学. 2004:23-69
    16杨波,高德远,张盛兵,一种高效预取机制的设计和实现.微电子学与计算机. 2001,18(l): 51-54
    17 Zhifeng Chen, Yuanyuan Zhou, and Kai Li. Eviction-based cache placement for storage caches. In Proceedings of the2003 USENIX Annual Technical Conference, 2003: 269-281
    18 Shuang Liang, Song Jiang, xiaodong Zhang. STEP:Sequentiality and Thrashing Detection Based Prefetching to Improve Performance of Networked Storage Servers. International Conference on Distributed Computing Systems(ICDC’07) 2007:550-559
    19 Song Jiang and Xiaodong Zhang. ULC: a file block placement and replacement protocol to effectively exploit hierarchical locality in multi-level buffer caches. In Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS), 2004: 168-177
    20 Gala Yadgar, Michael Factor, and Assaf Schuster. Karma: Know-it-all replacement for a multilevel cache. In Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST), 2007: 169-184
    21 D. P. Bovet and M. Cesati. Understanding the Linux Kernel. O’Reilly, 2005:123-156
    22 Z. Zhang, K. Lee, X. Ma, and Y. Zhou. PFC: Transparent Optimization of Existing Prefetching Strategies for Multi-level Storage Systems . In Proceedings of the 28th International Conference on Distributed Computing Systems, 2008: 740-751
    23 Storage Networking Industry Association. http://www.snia.org
    24冯丹.网络存储关键技术的研究及进展.移动通信.2009.6:35-39
    25姜宁康,时成阁.网络存储导论.清华大学出版社.2007:28-30
    26 Acharya,M.Uysal,and J.Saltz.Active disks:programming model,algorithms and evaluation.In:Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating System(ASPLOS VIII).1998:81-91
    27 Lakshmi N. Bairavasundaram, Muthian Sivathanu, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. X-ray: A non-invasive exclusive caching mechanism for raids. SIGARCH Comput. Archit. News. 2004,32(2):176
    28薛燕.Cache预测技术的研究.西北工业大学.2005,18-34
    29 D. Brown, T. C. Mowry, and O. Krieger. Compiler-based I/O prefetching for out-of-core applications. ACM ransactions on Computer Systems. 2001,19(2):111-170
    30 S. Devadas P. Jain and L. Rudolph. Controlling cache pollution in prefetching with software-assisted cache replacement.In Tech. Rep. CSG-462,M.I.T, 2001:127-129
    31 E. Shriver, C. Small, and K. A. Smith. Why does file system prefetching work? .In Proceedings of the USENIX 1999 Annual Technical Conference (USENIX), 1999:71-84
    32 A. Smith. Cache memories. In ACM Computing Surveys (CSUR), 1982, 4:473-530
    33 Myoung Kwon Tcheun, Hyunsoo Yoon, and Seung Ryoul Maeng. An adaptive sequential prefetching scheme in shared-memory multiprocessors. In ICPP’97: Proceedings of the international Conference on Parallel Processing, Washington, DC, USA,IEEE Computer Society.1997:306-313
    34 B. Gill and D. Modha. Sarc: Sequential prefetching in adaptive replacement cache. In Proceedings of the 2005 USENIX Annual Technical Conference(USENIX’05), 2005:293-308
    35 B. Gill and L. Bathen. Amp: Adaptive multi-stream prefetching in a shared cache. In Proceedings of the 5th USENIX Conference on File and Storage Technologies(FAST’07), 2007:185-198
    36 A. R. Butt, C. Gniady, and Y. C. Hu. The performance impact of kernel prefetching on buffer cache replacement algorithms. In SIGMETRICS '05: Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, New York, NY, USA, ACM.2005:157-168
    37 B. L. W. Gregory R. Ganger and Y. N. Patt. The disksim simulation environment. In University of Michigan, EECS, Technical Report CSE-TR-358-98, Feb. 1998: 125-131
    38 J. Wilkes. The Pantheon storage-system simulator. Technical Report HPL-SSP-95-14. Storage Systems Program, Hewlett-Packard Laboratories, Palo Alto, CA, 1996:126-131
    39 Z. Chen, Y. Zhang, Y. Zhou, H. Scott, and B. Schiefer. Empirical evaluation of multi- level buffer cache collaboration for storage systems. In SIGMETRICS '05: Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, New York, NY, USA, ACM..2005:145-156
    40何裕南,安虹,郭锐,梁博. OpenCMP:一个支持事务存储模型的多核处理器模拟器.计算机科学, 2007,(01):248-254
    41路放,安虹,梁博,任建.OpenSMT:一个同时多线程处理器模拟器的设计与实现.计算机科学.2006,33(1):158-163
    42 S.W.Schlosser,J.Schindler,S.Papadomanolakis,M.Shao,A.Ailamaki,C.Faloutsos,and G.R.Ganger.On multidimensional data and modern disks. In Proceedings of the 4th USENIX Conference on File and Storage Technologies(FAST),2005:225-238
    43 E.Papaththanasiou and M.L.Scott. Aggressive prefetching: An idea whose time has come. In Proceedings of the 10th Workshop on Hot Topics in Operating Systems(HotOS),2005:21-23
    44 Qingbo Zhu,Zhifeng Chen, Lin Tan, Yuanyuan Zhou, Kimberly Keeton, and John Wilkes. Hibernator: helping disk arrays sleep through the winter. In Proceedings of the twentieth ACM symposium on Operating systems principles(SOSP),2005: 177-190
    45 Alexander James Balik. MST: A Multi-level Storage Testbed. Computer Science, Raleigh, North Carolina, 2008:11-25
    46 Storage Performance Council. SPC Benchmark-1: Specification, version 1.10.1, 2006
    47 Storage Performance Council. SPC Benchmark-2: Specification, version 1.2, 2006
    48 http://traces.cs.umass.edu/index.php/Storage/Storage

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700