Reconfigurable Design of Model View Transformation Unit in Mobile GPU (移动GPU中模型视图变换单元的可重构设计)
  • English title: Reconfigurable Design of Model View Transformation Unit in Mobile GPU
  • Authors: 谢晓燕 (XIE Xiao-yan); 芦守鹏 (LU Shou-peng); 邓军勇 (DENG Jun-yong); 田汝佳 (TIAN Ru-jia)
  • Affiliations: School of Computer Science and Technology, Xi'an University of Posts and Telecommunications; School of Electronic Engineering, Xi'an University of Posts and Telecommunications
  • Keywords: model view transformation; reconfigurable; parallel computing; mobile graphics processors; matrix computations
  • Journal code: WJFZ
  • Journal: Computer Technology and Development (计算机技术与发展)
  • Publication date: 2018-11-15 10:11
  • Publisher: Computer Technology and Development (计算机技术与发展)
  • Year: 2019
  • Issue: v.29; No.261
  • Funding: National Natural Science Foundation of China (61272120, 61602377, 61634004); Shaanxi Province Science and Technology Coordinated Innovation Project (2016KTZDGY02-04-02); Shaanxi Province Key Research and Development Program (2017GY-060)
  • Language: Chinese
  • Page: WJFZ201901012
  • Page count: 6
  • CN: 01
  • ISSN: 61-1450/TP
  • Classification code: 61-66
Abstract
For mobile graphics processors, rendering and displaying images correctly is a key requirement. In particular, while preserving image quality, it is important to parallelize the matrix computations of image transformation on a structure of multiple PE (processing element) arrays, and to achieve efficient rendering and high-quality graphics under low power consumption and limited bandwidth. Drawing on the basic concepts and process of 3D affine transformation in computer graphics, and on the algorithm-mapping characteristics of the PE array structure, this paper proposes a parallelized design scheme for a reconfigurable model view transformation unit. The scheme uses 32 PEs to process the matrix operations of translation, scaling, and rotation in parallel, and a prototype was verified on an FPGA (field programmable gate array) development board. A comparison of the output of the FPGA platform with the output of a Linux platform shows that the reconfigurable circuit correctly produces the expected graphics transformations. The design is more flexible and can adapt in real time to changes in computing task requirements; its circuit operating frequency reaches 183.23 MHz.
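As a rough illustration of the computation the abstract describes, the sketch below builds 4x4 homogeneous matrices for translation, scaling, and rotation, composes them into one model view matrix, and applies it to a batch of vertices. The record does not give the paper's actual PE mapping, so the round-robin split of vertices over 32 lanes is an assumption made for illustration, and all function names are hypothetical. The lanes are simulated sequentially here; on real hardware each would run concurrently.

```python
import math

NUM_PES = 32  # PE count from the abstract; the lane mapping below is an assumption


def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def scaling(sx, sy, sz):
    """4x4 homogeneous scaling matrix."""
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(theta):
    """4x4 homogeneous rotation about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(m, v):
    """Apply a 4x4 matrix to one homogeneous vertex [x, y, z, w]."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def transform_batch(m, vertices, num_pes=NUM_PES):
    """Assign vertices to lanes round-robin; each vertex is independent,
    so the lanes need no communication with one another."""
    out = [None] * len(vertices)
    for pe in range(num_pes):
        for idx in range(pe, len(vertices), num_pes):
            out[idx] = transform(m, vertices[idx])
    return out


if __name__ == "__main__":
    # Scale by 2, rotate 90 degrees about z, then translate by (1, 2, 3).
    model_view = matmul(translation(1.0, 2.0, 3.0),
                        matmul(rotation_z(math.pi / 2), scaling(2.0, 2.0, 2.0)))
    print(transform_batch(model_view, [[1.0, 0.0, 0.0, 1.0]]))
```

Composing the three matrices once and reusing the product for every vertex keeps the per-vertex work to a single matrix-vector multiplication, which is the part that divides cleanly across independent processing elements.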
