Design of a multi-level distributed memory architecture for a video array processor
  • English title: Design of distributed memory architecture for video array processor
  • Authors: JIANG Lin (蒋林); CUI Pengfei (崔朋飞); SHAN Rui (山蕊); WU Xin (武鑫); TIAN Rujia (田汝佳); School of Electronic Engineering, Xi'an University of Posts & Telecommunications
  • Keywords: video array processor; distributed storage architecture; directory protocol; cache; hierarchical
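  • Chinese journal title: JSGG
  • English journal title: Computer Engineering and Applications
  • Institution: School of Electronic Engineering, Xi'an University of Posts & Telecommunications
  • Publication date: 2018-06-15
  • Publisher: Computer Engineering and Applications
  • Year: 2018
  • Issue: v.54; No.907
  • Funding: National Natural Science Foundation of China (No.61272120, No.61634004, No.61602377); Natural Science Foundation of Shaanxi Province (No.2015JM6326); Shaanxi Province Science and Technology Coordination and Innovation Project (No.2016KTZDGY02-04-02); Natural Science Research Project of the Shaanxi Provincial Department of Education (No.17JK0689)
  • Language: Chinese
  • Pages: JSGG201812010
  • Page count: 6
  • CN: 12
  • Classification number: 62-67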
Abstract
As video coding standards continue to evolve, the volume of data that codec algorithms must process has grown sharply. Multi-core parallel processing speeds up the computation, but it also makes the memory architecture the bottleneck of the whole codec system. Video codec algorithms exhibit locality in their memory accesses, exchange data frequently with one another, and generate large amounts of internal temporary data that never needs to be exchanged. Based on these characteristics, this paper designs and implements a multi-level distributed storage architecture consisting of a private storage layer and a shared storage layer. The design was tested on a Xilinx Virtex-6 xc6vlx550T development board. The experimental results show that the architecture, while remaining simple and scalable, provides a peak memory-access bandwidth of 9.73 GB/s, which meets the data-access needs of video codec algorithms.
References
[1]Schmitz J A,Gharzai M K,Balkir S,et al.A 1,000 frames/s vision chip using scalable pixel-neighborhood-level parallel processing[J].IEEE Journal of Solid-State Circuits,2017,52(2):556-568.
[2]Oh K,So J,Kim J.Low complexity implementation of slim HEVC encoder design[C]//International Conference on Systems,Signals and Image Processing,2016:1-4.
[3]Sullivan G J,Ohm J,Han W J,et al.Overview of the High Efficiency Video Coding(HEVC)standard[J].IEEE Transactions on Circuits & Systems for Video Technology,2012,22(12):1649-1668.
[4]Ohta M.Optical switching of many wavelength packets:A conservative approach for an energy efficient exascale interconnection network[C]//IEEE International Conference on High Performance Switching and Routing,2016:69-74.
[5]Wang K,Gu H,Yang Y,et al.Optical interconnection network for parallel access to multi-rank memory in future computing systems[J].Optics Express,2015,23(16):20480-20494.
[6]Wang Y,Gu H,Wang K,et al.Low-power low-latency optical network architecture for memory access communication[J].IEEE/OSA Journal of Optical Communications and Networking,2016,8(10):757-764.
[7]Huang Anwen,Gao Jun,Zhang Minxuan.Research on on-chip memory systems of multi-core processors[J].Computer Engineering,2010,36(4):4-6.
[8]Yu Xueqiu.Research on key technologies of a scalable 64-core processor:network-on-chip,memory hierarchy and LTE implementation[D].Shanghai:Fudan University,2014.
[9]Xiangli Bo.Research and verification of shared cache in multi-core processors based on VMM[D].Xi'an:Xidian University,2016.
[10]Li J,Dai Z,Li W,et al.Study and implementation of cluster hierarchical memory system of multicore cryptographic processor[C]//IEEE International Conference on ASIC,2015:1-4.
[11]Hwang K.Advanced computer architecture:parallelism,scalability,programmability[M].[S.l.]:McGraw-Hill,1993.
[12]Girão G,Oliveira B C D,Soares R,et al.Cache coherency communication cost in a NoC-based MPSoC platform[C]//Symposium on Integrated Circuits and Systems Design,Copacabana,Rio de Janeiro,Brazil,September,2007:288-293.
[13]Pang Zhengbin.Research and implementation of cache coherence protocols in SMP-based CC-NUMA large-scale systems[D].Changsha:National University of Defense Technology,2007.
[14]Patterson D A,Hennessy J L.Computer architecture:A quantitative approach[M]//Computer Architecture:A Quantitative Approach.[S.l.]:Morgan Kaufmann Publishers Inc,2007:93.
[15]Girão G,Oliveira B C D,Soares R,et al.Design and performance evaluation of a cache consistent NoC-based MP-SoC[C]//Iberchip Workshop,2007.
[16]Wang Hao.Research on hardware template design and reconfigurable array architecture based on video compression algorithms[D].Shanghai:Shanghai Jiao Tong University,2011.
[17]Li T,Xiao L,Huang H,et al.PAAG:A polymorphic array architecture for graphics and image processing[C]//Fifth International Symposium on Parallel Architectures,Algorithms and Programming,2012:242-249.
