Time-Domain BEM for the Wave Equation: Optimization and Hybrid Parallelization
  • Authors: Berenger Bramas (16)
    Olivier Coulaud (16)
    Guillaume Sylvand (17)
  • Keywords: Boundary element method (BEM); time domain; sparse matrix-vector product (SpMV); shared/distributed memory parallelization; SIMD
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2014
  • Volume: 8632
  • Issue: 1
  • Pages: 511-523
  • Full-text size: 992 KB
  • Author affiliations: Berenger Bramas (16)
    Olivier Coulaud (16)
    Guillaume Sylvand (17)

    16. Inria Bordeaux, Sud-Ouest, 33405, Talence, France
    17. Airbus Group Innovations, Applied Mathematics and Simulation, Toulouse, France
  • ISSN:1611-3349
Abstract
The problem of time-domain BEM for the wave equation in acoustics and electromagnetism can be expressed as a sparse linear system composed of multiple interaction/convolution matrices. It can be solved using sparse matrix-vector products, which are ill-suited to achieving a high Flop-rate. In this paper we present a novel approach based on re-ordering the interaction matrices into slices. We end up with a custom multi-vectors/vector product operation and compute it using SIMD intrinsic functions. We take advantage of the new order of computation to parallelize in shared and distributed memory. We demonstrate the performance of our system by studying the sequential Flop-rate and the parallel scalability, and provide results based on an industrial test case with up to 32 nodes.
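
To make the re-ordering idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not spell out the slice layout, so the code assumes that a slice gathers column j of every interaction matrix, and that the past values of unknown j form the vector it is multiplied with. All function names are hypothetical, the matrices are stored as small dense lists rather than in a sparse format, and no SIMD is used. The sketch only shows that summing per slice gives the same result as summing one matrix-vector product per time step.

```python
def convolution_by_matrices(mats, history):
    """Reference: s[i] = sum_k sum_j mats[k][i][j] * history[k][j],
    computed as one matrix-vector product per interaction matrix."""
    n = len(mats[0])
    s = [0.0] * n
    for m_k, a_k in zip(mats, history):
        for i in range(n):
            for j in range(n):
                s[i] += m_k[i][j] * a_k[j]
    return s


def build_slices(mats):
    """Assumed layout: slices[j][i][k] = mats[k][i][j], i.e. column j of
    every interaction matrix placed side by side."""
    n = len(mats[0])
    return [[[m_k[i][j] for m_k in mats] for i in range(n)] for j in range(n)]


def convolution_by_slices(slices, history):
    """Same sum, re-ordered: each slice row is a contiguous run over k,
    multiplied with the past values of a single unknown j."""
    n = len(slices)
    s = [0.0] * n
    for j, slice_j in enumerate(slices):
        past_j = [a_k[j] for a_k in history]  # history of unknown j
        for i, row in enumerate(slice_j):
            s[i] += sum(w * x for w, x in zip(row, past_j))
    return s


# Two small interaction matrices (hypothetical values) and two past solutions.
mats = [
    [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0], [4.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 5.0]],
]
history = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

by_matrices = convolution_by_matrices(mats, history)
by_slices = convolution_by_slices(build_slices(mats), history)
```

In this layout the inner loop runs over contiguous values in k, which is the kind of access pattern that vector instructions favor; whether this matches the paper's slice definition in detail is an assumption.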
