Abstract
The finite-difference time-domain (FDTD) method is one of the most widely used numerical methods in computational electromagnetics. Its accuracy and flexibility have made it a powerful tool for solving a wide range of electromagnetic field problems. Although rapid advances in semiconductor technology have greatly improved CPU performance, FDTD simulations on CPUs remain very time-consuming, which severely limits the use of the method in engineering applications. This paper implements and optimizes the FDTD algorithm on a GPU to improve its computational efficiency and shorten simulation time. Experimental results show that the GPU implementation achieves a speedup of up to 166 times over a serial program running on an Intel Xeon processor. According to the Roofline model, the GPU implementation reaches 89% of the theoretical performance.