Abstract
This paper proposes an FPGA-based pipelined parallel acceleration scheme tailored to the characteristics of convolutional neural networks. The convolution, activation, and down-sampling module circuits are designed and optimized to form the basic FPGA unit for convolutional neural network computation. With the same network structure and input data, an FPGA running at 50 MHz achieves 8 times the computational efficiency of the CPU and nearly 5 times that of the GPU, while consuming only 27.8% of the GPU's power.
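The data flow through the three modules named in the abstract (convolution, activation, down-sampling) can be illustrated with a small software reference model. This is only a sketch under stated assumptions: the abstract does not specify the activation function, the pooling type, or the window sizes, so ReLU and 2x2 max pooling are used here purely for illustration, and the function names (`conv2d_valid`, `relu`, `max_pool_2x2`, `basic_unit`) are hypothetical. It models the arithmetic only, not the FPGA pipeline timing or fixed-point behavior.

```python
def conv2d_valid(image, kernel):
    """Sliding-window multiply-accumulate without padding.

    As is usual for CNNs, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Assumed activation: clamp negative values to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Assumed down-sampling: 2x2 max pooling with stride 2.

    A trailing odd row or column is dropped.
    """
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


def basic_unit(image, kernel):
    """Chain the three stages in the order named in the abstract."""
    return max_pool_2x2(relu(conv2d_valid(image, kernel)))


if __name__ == "__main__":
    image = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    # Kernel that passes through the center pixel of each 3x3 window.
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    print(basic_unit(image, identity))
```

In hardware, each stage would consume the previous stage's output as a stream, so the three functions correspond to consecutive pipeline stages rather than to sequential passes over a stored feature map.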