Parallel Acceleration Techniques for Electromagnetic Scattering Computation on GPU Computing Platforms
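Abstract
Computing the electromagnetic scattering of targets, in particular radar cross section (RCS) prediction for electrically large targets and inverse synthetic aperture radar (ISAR) imaging, is of great importance to national defense and has long been a research focus in computational electromagnetics. However, analyzing the high-frequency scattering characteristics of realistic targets such as aircraft and ships often runs into an enormous computational load and insufficient hardware computing power.
     To compute target scattering characteristics quickly, this thesis borrows techniques such as fast ray tracing from computer graphics and exploits the parallel numerical computing power of the graphics processing unit (GPU). Frequency-domain electromagnetic methods are parallelized on three computing platforms: a single GPU, a heterogeneous CPU-GPU architecture, and a GPU cluster.
     This thesis proposes a multiresolution shooting and bouncing ray (SBR) method based on the Compute Unified Device Architecture (CUDA), which combines two kinds of SBR acceleration. First, multiresolution ray tubes reduce the total number of ray tubes that take part in the computation. Second, a stackless kd-tree traversal algorithm augmented with ropes greatly reduces unnecessary visits to interior nodes and speeds up the intersection of a single ray with the target. On the GPU platform, the method of moments (MoM) is also accelerated with CUDA. During impedance matrix filling, singular and non-singular elements are computed by separate kernels, which avoids the efficiency loss caused by CUDA serializing divergent branches. In addition, the biconjugate gradient stabilized method (BiCGSTAB) is built on CUBLAS, the basic linear algebra library provided with CUDA, which improves the efficiency of solving the matrix equation.
     The SBR method and the truncated-wedge incremental length diffraction coefficients (TW-ILDC) are mapped onto the heterogeneous CPU-GPU architecture so that all available computing resources are used efficiently. The GPU's strong single-precision floating-point performance accelerates the SBR method, while the TW-ILDC, which are comparatively sensitive to numerical precision, are implemented on the CPU in double precision. Because the workload and computation time of adjacent angles are nearly identical, a dynamic load-balancing algorithm uses the computation time of the previous angle to adjust the load distribution for the current angle, which keeps the CPU and GPU balanced. The method improves the accuracy and efficiency of high-frequency methods in applications such as target imaging.
     Finally, this thesis proposes a parallel SBR method on a GPU cluster. It adopts a parallel strategy that partitions the virtual aperture, which overcomes the limitation that an angle-based load distribution scheme is bounded by the number of GPUs. To balance the load among GPU nodes, the method does not assume that all compute nodes have the same computing power. Instead, it dynamically adjusts the partition of the virtual aperture at the current angle according to each node's computation time at the previous angle, so it also suits heterogeneous GPU clusters equipped with different GPUs.
     By combining fast ray tracing techniques from graphics with three computing platforms (the GPU, the heterogeneous CPU-GPU architecture, and the GPU cluster), this thesis accelerates several frequency-domain methods and improves both the accuracy and the computational efficiency of scattering analysis for electrically large targets.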
The calculation of the electromagnetic scattering of targets, especially radar cross section (RCS) prediction and inverse synthetic aperture radar (ISAR) imaging, is of great significance to national defense and is a hot research topic in computational electromagnetics. However, analyzing the electromagnetic scattering characteristics of realistic targets (e.g., airplanes and ships) at high frequencies is very time-consuming because of the extensive computation involved and insufficient processing power.
     To solve electromagnetic scattering problems quickly, this thesis adopts real-time ray tracing algorithms from computer graphics and exploits the parallel computing power of the GPU. A single GPU, the heterogeneous CPU-GPU architecture, and a GPU cluster are each used to accelerate frequency-domain methods.
     The proposed CUDA-based multiresolution shooting and bouncing ray (MSBR) method with a kd-tree acceleration structure is implemented entirely on the GPU to accelerate the SBR method. The multiresolution grid algorithm greatly reduces the total number of ray tubes because it adapts the ray-tube density to the structural complexity of each region, while the kd-tree acceleration structure sharply reduces the number of ray-patch intersection tests. We also present a CUDA-based method of moments (MoM) that computes the singular and non-singular elements of the impedance matrix separately to avoid the performance loss caused by branch divergence. In addition, the CUBLAS library provided with CUDA is used to implement the biconjugate gradient stabilized (BiCGSTAB) method, which solves the matrix equation efficiently.
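The BiCGSTAB solver mentioned above consists almost entirely of BLAS-style primitives (matrix-vector products, inner products, vector updates, norms), which is why it maps naturally onto a library such as CUBLAS. The following is a minimal CPU-side sketch in pure Python, not the thesis implementation: the helper names `dotc`, `norm2`, and `matvec`, the tolerance, and the small test matrix are all assumptions made for illustration.

```python
import math


def dotc(a, b):
    """Conjugated inner product a^H b (a level-1 BLAS 'dotc' style operation)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def norm2(a):
    """Euclidean norm (a level-1 BLAS 'nrm2' style operation)."""
    return math.sqrt(sum(abs(x) ** 2 for x in a))


def matvec(A, x):
    """Dense matrix-vector product (a level-2 BLAS 'gemv' style operation)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def bicgstab(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for a complex dense matrix; returns (x, iterations used)."""
    n = len(b)
    x = [0j] * n
    r = [complex(v) for v in b]   # residual for the zero initial guess
    r_hat = list(r)               # fixed shadow residual
    rho = alpha = omega = 1 + 0j
    v = [0j] * n
    p = [0j] * n
    b_norm = norm2(b) or 1.0
    for it in range(1, max_iter + 1):
        rho_new = dotc(r_hat, r)
        if rho_new == 0:
            raise ArithmeticError("BiCGSTAB breakdown: rho is zero")
        beta = (rho_new / rho) * (alpha / omega)
        p = [r_i + beta * (p_i - omega * v_i) for r_i, p_i, v_i in zip(r, p, v)]
        v = matvec(A, p)
        alpha = rho_new / dotc(r_hat, v)
        s = [r_i - alpha * v_i for r_i, v_i in zip(r, v)]
        if norm2(s) / b_norm < tol:
            # The half step is already accurate enough.
            x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
            return x, it
        t = matvec(A, s)
        omega = dotc(t, s) / dotc(t, t)   # minimizes ||s - omega * t||
        x = [x_i + alpha * p_i + omega * s_i for x_i, p_i, s_i in zip(x, p, s)]
        r = [s_i - omega * t_i for s_i, t_i in zip(s, t)]
        if norm2(r) / b_norm < tol:
            return x, it
        rho = rho_new
    return x, max_iter


# Hypothetical 3x3 complex system standing in for an impedance matrix equation.
A = [[4 + 1j, 1, 0],
     [1, 3, 1j],
     [0, 1, 5 - 1j]]
b = [1, 2j, 3]
x, iters = bicgstab(A, b)
```

Each iteration needs two matrix-vector products and a handful of inner products and vector updates. For a dense impedance matrix the two matrix-vector products dominate the cost, so that is the step a GPU port would aim to offload first.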
     The SBR method and the truncated-wedge incremental length diffraction coefficients (TW-ILDC) are combined and implemented on the heterogeneous CPU-GPU architecture to use all available resources. The SBR method runs on the GPU because its many independent ray tubes can fully occupy the GPU's massively parallel resources, while the TW-ILDC are computed on the CPU because they require complex, high-precision numerical calculation to give accurate results. Since the workload and computation time of neighboring aspect angles are similar, a dynamic load adjustment method is presented to balance the load between the CPU and the GPU. The proposed method provides higher accuracy and efficiency for ISAR imaging of electrically large, complex targets.
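The dynamic load adjustment can be illustrated with a small sketch. The abstract does not state which part of the work is moved between the two devices, so the code below assumes a single divisible workload and a hypothetical function `rebalance` that receives the share given to the GPU at the previous angle together with the two measured times. The device speeds in the simulation are invented.

```python
def rebalance(gpu_share, t_gpu, t_cpu, damping=1.0):
    """Return the GPU share of the work for the next aspect angle.

    gpu_share: fraction of the work the GPU handled at the previous angle.
    t_gpu, t_cpu: wall-clock times each device needed for its part.
    damping: 1.0 jumps straight to the estimated balance point; smaller
             values move more cautiously when timings are noisy.
    """
    if not 0.0 < gpu_share < 1.0:
        raise ValueError("gpu_share must lie strictly between 0 and 1")
    if t_gpu <= 0 or t_cpu <= 0:
        raise ValueError("measured times must be positive")
    gpu_rate = gpu_share / t_gpu            # work fraction per second
    cpu_rate = (1.0 - gpu_share) / t_cpu
    target = gpu_rate / (gpu_rate + cpu_rate)
    new_share = gpu_share + damping * (target - gpu_share)
    # Keep both devices busy so that both can still be timed next round.
    return min(max(new_share, 0.01), 0.99)


# Simulated sweep over aspect angles: the (made-up) GPU is 9x faster than the CPU.
GPU_SPEED, CPU_SPEED = 9.0, 1.0
share = 0.5
for _ in range(3):
    t_gpu = share / GPU_SPEED
    t_cpu = (1.0 - share) / CPU_SPEED
    share = rebalance(share, t_gpu, t_cpu)
```

The update relies on the observation in the text that neighboring angles cost about the same, so the rates measured at one angle are a reasonable prediction for the next.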
     Finally, an efficient parallel shooting and bouncing ray (SBR) method on a GPU cluster is introduced. The parallel SBR method partitions the virtual aperture among the nodes, which overcomes the drawback of the angle-based distribution scheme, whose parallelism is limited by the number of GPUs. The method does not assume that all GPUs have the same performance. Instead, it uses the computation time at the previous angle to dynamically adjust the partitioning at the current angle. This strategy achieves good load balance and also lets the method work well on heterogeneous GPU clusters.
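For a cluster, the same timing-driven idea has to produce an integer partition of the aperture. The sketch below assumes the virtual aperture is split into contiguous bands of rows, one band per GPU node. That granularity, the function name `repartition`, and the sample timings are assumptions for illustration and are not taken from the thesis.

```python
def repartition(rows, times):
    """Redistribute aperture rows among nodes using the previous angle's timings.

    rows[i]:  number of aperture rows node i processed at the previous angle.
    times[i]: seconds node i needed for them.
    Returns a new row count per node with the same total, giving faster
    nodes proportionally more rows and every node at least one row.
    """
    n = len(rows)
    total = sum(rows)
    if n == 0 or len(times) != n:
        raise ValueError("rows and times must be non-empty and of equal length")
    if min(rows) < 1 or min(times) <= 0 or total < n:
        raise ValueError("every node needs at least one row and a positive time")
    rates = [r / t for r, t in zip(rows, times)]   # rows per second
    rate_sum = sum(rates)
    free = total - n                               # one row per node is reserved
    ideal = [free * rate / rate_sum for rate in rates]
    new_rows = [int(v) for v in ideal]             # round down first
    leftover = free - sum(new_rows)
    # Largest-remainder rounding: hand out the rows lost to truncation.
    by_remainder = sorted(range(n), key=lambda i: ideal[i] - new_rows[i], reverse=True)
    for i in by_remainder[:leftover]:
        new_rows[i] += 1
    return [v + 1 for v in new_rows]


# Hypothetical cluster of four nodes, initially given equal bands of 256 rows.
prev_rows = [256, 256, 256, 256]
prev_times = [1.0, 1.0, 2.0, 4.0]   # nodes 2 and 3 carry slower GPUs
next_rows = repartition(prev_rows, prev_times)
```

Because the rates are measured rather than configured, the same routine handles nodes with different GPUs and aperture regions whose rays are cheaper or more expensive to trace. Reserving one row per node keeps every node measurable at the next angle.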
     This thesis combines real-time ray tracing algorithms from computer graphics with three parallel computing platforms, i.e., the GPU, the heterogeneous CPU-GPU architecture, and the GPU cluster, to improve several frequency-domain approaches. The numerical results show that these methods improve the accuracy, efficiency, and scale of scattering analysis for electrically large targets.