Research on data pre-deployment in information service flow of digital ocean cloud computing
  • Authors: Suixiang Shi (1) (2)
    Lingyu Xu (3)
    Han Dong (1) (2)
    Lei Wang (3)
    Shaochun Wu (3)
    Baiyou Qiao (4)
    Guoren Wang (4)
  • Keywords: HDFS; data prefetching; cloud computing; service flow; digital ocean
  • Journal: Acta Oceanologica Sinica
  • Publication year: 2014
  • Publication date: September 2014
  • Volume: 33
  • Issue: 9
  • Pages: 82-92
  • Full-text size: 1,899 KB
  • Affiliations:
    1. National Marine Data and Information Service, State Oceanic Administration, Tianjin, 300171, China
    2. Key Laboratory of Digital Ocean, State Oceanic Administration, Tianjin, 300171, China
    3. College of Computer Engineering and Science, Shanghai University, Shanghai, 200072, China
    4. College of Information Science and Engineering, Northeastern University, Shenyang, 110819, China
  • ISSN: 1869-1099
Abstract
Data pre-deployment in the Hadoop Distributed File System (HDFS) is more complicated than in traditional file systems. Several key issues need to be addressed, such as determining the target location of prefetched data, the amount of data to prefetch, and the balance between data prefetching services and normal data accesses. To address these problems, we exploit the characteristics of digital ocean information service flows and propose a deployment scheme that combines input data prefetching with output data oriented storage strategies. The method runs data preparation in parallel with data processing, thereby greatly reducing the I/O time cost of digital ocean cloud computing platforms when processing multi-source information synergistic tasks. The experimental results show that the scheme has a higher degree of parallelism than traditional Hadoop mechanisms, shortens the waiting time of a running service node, and significantly reduces data access conflicts.
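
The idea of overlapping data preparation with data processing can be illustrated with a minimal sketch. This is not the authors' implementation and does not call any HDFS API: the names `run_service_flow`, `fetch_input`, and `process` are hypothetical, and the sketch covers only the input prefetching half of the scheme, not the output oriented storage strategy. It assumes a service flow is an ordered list of stages whose inputs can be fetched independently.

```python
from concurrent.futures import ThreadPoolExecutor


def run_service_flow(stages, fetch_input, process):
    """Run stages in order, prefetching the next stage's input
    while the current stage is being processed.

    stages      -- ordered list of stage identifiers (hypothetical)
    fetch_input -- callable(stage) -> input data for that stage
    process     -- callable(stage, data) -> result for that stage
    """
    results = []
    if not stages:
        return results

    # A single background worker plays the role of the prefetching service.
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        pending = prefetcher.submit(fetch_input, stages[0])
        for i, stage in enumerate(stages):
            # Blocks only if the prefetch has not finished yet.
            data = pending.result()
            # Start fetching the next stage's input before processing this one.
            if i + 1 < len(stages):
                pending = prefetcher.submit(fetch_input, stages[i + 1])
            results.append(process(stage, data))
    return results
```

With this structure, the wait for input at each stage shrinks to whatever part of the fetch did not overlap with the previous stage's processing. Choosing where to place prefetched data and how much to fetch, which the abstract names as key issues, is outside the scope of this sketch.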
