Fast Solution Techniques for Finite Element Equations in Electromagnetic Analysis

Chinese title: 电磁问题分析中的有限元方程组的快速求解技术
Author: Tian Jin (田瑾)
Degree level: Doctoral
Discipline: Electromagnetic Field and Microwave Technology
Keywords: Finite element method; ANSYS modeling; Multifrontal (MF); Preconditioning; GPU
Degree year: 2012
Advisor: Shi Xiaowei (史小卫)
Discipline code: 080904
Degree-granting institution: Xidian University (西安电子科技大学)
Thesis submission date: 2012-01-01

Abstract

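With the rapid development of computer technology, the finite element method (FEM) and FEM-based software such as ANSYS are becoming indispensable tools for analyzing electromagnetic problems in engineering and scientific research. Any electromagnetic problem analyzed with the FEM ultimately reduces to solving a sparse system of linear equations. Fast solution of sparse linear systems is therefore one of the key technologies of finite element computation and an active topic in computational electromagnetics. Although CPU peak speed and memory capacity keep increasing, problems in engineering and scientific research keep growing in scale and complexity, and the demands on accuracy and speed keep rising, so computer performance still falls short of what large-scale electromagnetic computation requires. Research into further reducing storage requirements and improving computational efficiency is therefore of both theoretical and practical value.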
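     This thesis builds ANSYS-based finite element models and general computational formulas for the vector finite element method. For the sparse linear systems that result, and on the basis of an analysis of existing sparse solution techniques, it extends the multifrontal method, preconditioning techniques, and GPU acceleration of preconditioned iterative solvers, with the aim of reducing storage requirements and improving computational efficiency. The main work and contributions are as follows: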
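     1. ANSYS-based finite element models and general vector finite element formulas are constructed, and solution techniques for sparse linear systems are analyzed. FEM modeling based on ANSYS is studied and a method for building models of multi-dielectric regions is given; the results verify the feasibility of the modeling technique. The basic finite element formulas are derived separately for closed-domain and partially open-domain problems, and a general computational expression is given. For the sparse linear systems obtained from the general formulas, direct solution methods and their key techniques are analyzed and discussed; in particular, the characteristics of several ways of carrying out the Cholesky factorization of sparse matrices are compared.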
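     2. Solution methods for sparse finite element linear systems are proposed, and a preconditioning technique built on them is proposed and combined with an iterative method. Building on Cholesky factorization, an extended Cholesky factorization suitable for complex symmetric matrices is proposed and combined with the multifrontal method for symmetric problems. For unsymmetric problems, an LU-based multifrontal method is used, and a concrete implementation strategy is given. Based on the study of the two multifrontal methods, a preconditioning technique is proposed to improve computational efficiency and is combined with an iterative method to solve the finite element equations. The results verify the correctness and effectiveness of the proposed methods.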
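     3. Memory access rules for the GPU are proposed and, based on them, one row-major and three column-major compressed storage formats for the GPU are proposed. The GPU memory model is analyzed first, and memory access rules are derived from the characteristics of the different memory types. Based on these rules, the compressed storage formats used in GPU computing are studied and divided into two classes, row-major and column-major. A more efficient row-major format, MCTO, is then proposed; it coalesces simultaneous accesses to the same memory and thus satisfies the memory access rules. Finally, three column-major formats, sliced EET, sliced EEV-T, and sliced EEV-F, are proposed, all of which satisfy the memory access rules. Performance comparisons show that the proposed formats improve computing performance.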
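     4. GPU parallel computing strategies are proposed for the four compressed storage formats introduced in this thesis. Based on a study of row-major parallel strategies, a parallel method suited to the MCTO format is proposed, with concrete implementation strategies for addition, dot product, and sparse matrix-vector multiplication. Based on a study of column-major parallel strategies, parallel methods suited to the sliced EET, sliced EEV-T, and sliced EEV-F formats are proposed. Performance evaluation of the parallel algorithms for the two classes of storage formats shows that the parallel strategies achieve better computing performance.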
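Item 2 above describes an extended Cholesky factorization for complex symmetric matrices. The abstract does not give the algorithm, so the following is only a minimal sketch of one textbook variant, assuming a dense matrix, the factorization A = L L^T with a plain (non-conjugate) transpose, and no pivoting. The names `complex_symmetric_cholesky` and `solve_with_factor` are hypothetical and are not taken from the thesis.

```python
import cmath


def complex_symmetric_cholesky(a):
    """Factor a complex symmetric matrix as A = L * L^T (plain transpose).

    Illustrative assumption: dense storage and no pivoting, so the routine
    stops on a zero pivot and may lose accuracy when a pivot is tiny.
    """
    n = len(a)
    l = [[0j] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: complex square root, with no conjugation anywhere.
        d = a[j][j] - sum(l[j][k] * l[j][k] for k in range(j))
        if d == 0:
            raise ZeroDivisionError("zero pivot; pivoting would be needed")
        l[j][j] = cmath.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def solve_with_factor(l, b):
    """Solve (L * L^T) x = b by forward and backward substitution."""
    n = len(l)
    y = [0j] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0j] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x
```

Because the transpose is not conjugated, the factor exists for symmetric (not Hermitian) complex matrices as long as no pivot vanishes; a production sparse solver would add an ordering step and some form of pivoting or pivot perturbation.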
With the rapid progress of computer technology, the finite element method (FEM) and FEM-based software such as ANSYS are becoming indispensable tools for the design and analysis of electromagnetic problems. The key step of the FEM in electromagnetic problems is solving sparse linear equations. Fast solution of sparse linear equations is one of the most important technologies in FEM calculation and is a hot issue in computational electromagnetics. With the increasing scale, growing complexity, and rising requirements for accuracy and speed in engineering and scientific research, computer performance still cannot meet the needs of large-scale calculation, even though CPU peak processing power and memory capacity have constantly increased. Therefore, research on how to further reduce storage requirements and improve computational efficiency has important theoretical value and practical significance.
     This thesis builds finite element models based on ANSYS and a general finite element expression. Based on an analysis of existing techniques for directly solving sparse linear equations, we extend the multifrontal (MF) algorithm, preconditioning techniques, and graphics processing unit (GPU) acceleration strategies (for preconditioned iterative methods) for the solution of the sparse linear equations arising from the vector FEM. The main work and innovations of this thesis include:
     1. FEM models and general expressions are established, and direct methods for solving sparse linear equations are discussed. We present ANSYS-based FEM modeling techniques for multi-dielectric regions, and the examples show the feasibility of these techniques. We then give general expressions of the vector FEM for closed-region and some open-region problems. For the sparse linear equations obtained from these expressions, we analyze and discuss the direct methods and their key technologies, and we compare the implementation characteristics of several Cholesky decomposition methods.
     2. Several solution methods for sparse linear equations are presented, and a preconditioning technique is proposed and combined with an iterative method to solve the equations. Based on the Cholesky-based MF method, we propose an expanded Cholesky method (ECM) and combine it with MF for solving complex symmetric equations. We then use an LU-based MF method to solve unsymmetric equations and present its specific implementation measures. Based on the study of the MF algorithm, we propose a preconditioning technique to improve computational efficiency and combine it with an iterative method to solve the equations. Results show that the proposed methods are correct and effective.
     3. Memory access rules are presented and, based on these rules, one row-major and three column-major compressed storage formats are proposed. We derive the memory access rules from an analysis of the GPU memory models. Based on the rules, we study the compressed storage formats used in GPU computation and group them into two main classes. We then propose a more efficient row-major format called MCTO, which coalesces simultaneous accesses to memory addresses and thus satisfies the memory access rules. We also propose three column-major formats, sliced EET, sliced EEV-T, and sliced EEV-F, which likewise satisfy the memory access rules. Performance comparisons show that the proposed formats deliver fast and efficient computing performance.
     4. Parallelization strategies are proposed for the one row-major and three column-major compressed storage formats. Based on the study of row-major parallelization strategies, we propose a suitable parallelization strategy for MCTO and present specific implementation methods for addition, dot product, and sparse matrix-vector product (SMVP). Based on the study of column-major parallelization strategies, we propose suitable parallelization strategies for the sliced EET, sliced EEV-T, and sliced EEV-F formats. Performance evaluations verify the better performance of the parallel algorithms.
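Items 3 and 4 concern compressed storage formats whose layout lets neighbouring GPU threads read neighbouring memory addresses. The thesis's own formats (MCTO, sliced EET, sliced EEV-T, sliced EEV-F) are not defined in the abstract, so the sketch below uses the generic sliced ELLPACK layout instead, written as sequential Python, to illustrate the column-major idea and the SMVP that runs on it. The function names, the tuple layout of a slice, and the `slice_height` parameter are assumptions made for illustration.

```python
def csr_to_sliced_ell(n_rows, row_ptr, col_idx, values, slice_height):
    """Convert CSR arrays to a sliced ELLPACK layout.

    Each slice covers `slice_height` consecutive rows and is padded only to
    the longest row inside that slice. Entries are stored column-major within
    a slice, so entry k of neighbouring rows sits at neighbouring addresses.
    """
    slices = []
    for start in range(0, n_rows, slice_height):
        rows = range(start, min(start + slice_height, n_rows))
        height = len(rows)
        width = max(row_ptr[r + 1] - row_ptr[r] for r in rows)
        cols = [0] * (width * height)    # padded column indices
        vals = [0.0] * (width * height)  # padded values; zeros add nothing
        for local, r in enumerate(rows):
            for k, p in enumerate(range(row_ptr[r], row_ptr[r + 1])):
                cols[k * height + local] = col_idx[p]
                vals[k * height + local] = values[p]
        slices.append((start, height, width, cols, vals))
    return slices


def sliced_ell_spmv(slices, x, n_rows):
    """Compute y = A x from the sliced layout.

    The loop over `local` stands in for one GPU thread per row; for a fixed k,
    those threads would read vals[k * height + local] at consecutive addresses.
    """
    y = [0.0] * n_rows
    for start, height, width, cols, vals in slices:
        for local in range(height):
            acc = 0.0
            for k in range(width):
                idx = k * height + local
                acc += vals[idx] * x[cols[idx]]
            y[start + local] = acc
    return y
```

Slicing limits padding to the longest row of each slice rather than of the whole matrix, which matters for finite element matrices whose row lengths vary; the slice height would normally be matched to the GPU's thread-group size.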

