Research and Implementation of YaFS, a File System Supporting Heterogeneous Network Storage Services
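Abstract
With economic and technological development, demand for mobile devices keeps growing, and ever more smartphones, netbooks, and other mobile devices are in use. Requirements on these devices' storage capacity, on data synchronization across multiple devices, on information sharing between users, and on access to resources from anywhere pose challenges for data storage and management. At the same time, with the wide deployment of high-speed mobile networks and the growth of network storage services such as cloud storage, it has become feasible for mobile devices to use network storage services as a storage back end over a high-speed mobile network. This thesis designs and implements YaFS, a file system that supports heterogeneous network storage services. Over the network environment offered by network providers, it uses the various network storage services on the Internet to expand the storage capacity of smart devices, and it meets reasonably well the needs of mobile devices for storing, managing, accessing, and synchronizing large amounts of data. The main work and contributions of the thesis are:
     1. To address the limited storage capacity of mobile devices and the difficulty of managing data across multiple devices, the YaFS file system framework is designed and implemented. A POSIX-compliant file system interface gives seamless support for existing applications; a storage abstraction layer and a plugin mechanism give support for heterogeneous network storage services.
     2. To reduce the traffic the system generates and the latency caused by network interaction, a cache management strategy is designed. Data prefetching improves read/write performance and lowers operation response time; mLRU, a replacement algorithm based on cache replacement cost, raises the cache hit rate; a lease-lock-based consistency protocol keeps caches correct across multiple clients; and the local cache keeps the system usable offline, improving reliability and availability.
     3. Given the characteristics of mobile networks, and to further improve performance and data security, a set of dynamically loadable, modular optimizations is designed. Data striping raises the concurrency of reads and writes and improves overall performance; a mode that stores metadata and data together improves performance on small files; dynamically configurable processing such as data compression and encryption reduces traffic and protects the data being handled.
     4. The prototype system is tested in two respects: interface support and system performance. The interface tests show that the system effectively supports the standard POSIX interface; the performance results indicate that the system is a reasonably effective way to give applications simplified, transparent access to heterogeneous storage services.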
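As a rough illustration of the storage abstraction layer and plugin mechanism named in point 1, the sketch below shows one way such a layer could be shaped in Python. It is not taken from YaFS: the names `StorageBackend`, `register_backend`, `open_backend`, and `MemoryBackend` are hypothetical, and the in-memory plugin merely stands in for a real network storage service.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical plugin interface: one subclass per storage service."""

    @abstractmethod
    def put(self, key, data):
        """Store the bytes `data` under `key`."""

    @abstractmethod
    def get(self, key):
        """Return the bytes stored under `key`; raise KeyError if absent."""

    @abstractmethod
    def delete(self, key):
        """Remove `key`; raise KeyError if absent."""


# Registry from a plugin name to its backend class (illustrative only).
_BACKENDS = {}


def register_backend(name):
    """Class decorator that makes a backend available under `name`."""
    def decorator(cls):
        _BACKENDS[name] = cls
        return cls
    return decorator


@register_backend("memory")
class MemoryBackend(StorageBackend):
    """Stand-in plugin that keeps objects in a dict instead of on a network."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]


def open_backend(name):
    """Instantiate the plugin registered under `name`."""
    return _BACKENDS[name]()
```

Under this assumed design, the file system layer would call only `put`, `get`, and `delete`, so supporting another service would mean registering one more subclass rather than changing the callers.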
With economic and technological development, it has become common for users to own one or more mobile devices. Data synchronization, information sharing, and resource accessibility pose growing challenges for data management. At the same time, with the widespread deployment of high-speed mobile networks and of various network storage services, it has become possible for a mobile device to use network storage services as its storage back end over a high-speed network.
     This thesis designs and implements YaFS, an extensible file system framework that uses heterogeneous network storage services as its back ends to increase the storage capacity of mobile devices. The main work and contributions are as follows:
     First, to address the limited storage capacity of mobile devices and the difficulty of data management, we design and implement YaFS, a file system that uses heterogeneous storage services as its back ends. It implements a POSIX-compliant interface so that existing applications remain usable, and a storage abstraction layer that allows the file system to support heterogeneous services.
     Second, to reduce network traffic and improve system performance, we design a cache management strategy for YaFS. Data prefetching and mLRU, a cost-based cache replacement algorithm, give system operations high performance and low latency. A lease-lock-based cache coherence protocol keeps cached data correct across multiple clients. The cache also allows the system to run in offline mode.
     Third, in view of the properties of mobile networks, we design a modular optimization framework to improve system performance and data security. Borrowing the idea of data striping, we process data stripes concurrently to achieve high performance. We design a hybrid mode in YaFS to improve performance on small files. In addition, we use an encryption module for data security and a compression module to limit network traffic.
     Fourth, we test our system with respect to interface support and system performance. An evaluation of a prototype implementation that uses email services as its storage back end shows that the framework is viable in both performance and usability.
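The abstract does not spell out how mLRU weighs replacement cost, so the following is only a generic sketch of a cost-aware LRU variant, not the algorithm from the thesis. It assumes each cached entry carries a numeric `cost` (for example, an estimate of how expensive it would be to fetch again over the network) and that eviction picks the cheapest entry among the `window` least recently used ones. The class name `CostAwareLRU` and both parameters are hypothetical.

```python
from collections import OrderedDict
from itertools import islice


class CostAwareLRU:
    """LRU cache that prefers to evict entries that are cheap to refetch."""

    def __init__(self, capacity, window=2):
        if capacity < 1 or window < 1:
            raise ValueError("capacity and window must be positive")
        self.capacity = capacity
        self.window = window            # how many of the oldest entries to compare
        self._entries = OrderedDict()   # key -> (value, cost), oldest first

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][0]

    def put(self, key, value, cost):
        """Insert or update an entry, evicting one entry if the cache is full."""
        if key in self._entries:
            del self._entries[key]      # re-inserted below as most recent
        elif len(self._entries) >= self.capacity:
            self._evict()
        self._entries[key] = (value, cost)

    def _evict(self):
        # Look only at the `window` least recently used entries and drop the
        # one with the lowest cost; on ties, min() keeps the oldest entry.
        candidates = islice(self._entries.items(), self.window)
        victim = min(candidates, key=lambda item: item[1][1])[0]
        del self._entries[victim]
        return victim
```

With `window=1` this reduces to plain LRU; a larger window trades some recency for keeping entries that would be costly to fetch again, which is the general idea a cost-based policy is after.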