Linearly Constrained Matrix Least Squares Problems: Theory and Algorithms
Abstract
In recent decades, with the development of society and the progress of science, and driven by urgent needs arising in engineering technology and many other fields, linear systems of matrix equations and the corresponding least squares problems with linear constraints on the solutions have attracted more and more attention. Such systems include the Sylvester equation, the Lyapunov equation, inverse eigenvalue problems with various constraints, and the corresponding least squares problems, all of which are widely applied in many domains.
     Theoretically, a least squares (LS) problem in matrix form can be rewritten equivalently in vector form by means of the Kronecker product, so its general solutions can be expressed as higher-dimensional column vectors. The same representation applies to constrained matrix equations, since such an equation can be viewed as the special case of the corresponding least squares problem with zero residual. However, this equivalent transformation from matrix LS to vector LS enlarges the coefficient matrices unnecessarily, which directly multiplies the computational cost and storage requirements; moreover, special structure of the coefficient matrices may be lost in the vector expressions of the general solutions, even if matrix solutions are reconstructed from them. To overcome the difficulties that the Kronecker product introduces into structural analysis and solution representation, many researchers are more interested in matrix-level representations of the general solutions of unconstrained or linearly constrained LS problems, and there have been valuable efforts on this issue. The techniques commonly used are matrix factorizations such as the generalized singular value decomposition (GSVD), the canonical correlation decomposition (CCD), and the quotient singular value decomposition (QSVD). In general, these factorizations reduce the matrix equations to a simple diagonal-like form, and the resulting new systems are easy to solve: one only needs to partition the new matrix variables suitably, so that some sub-blocks are determined immediately while the others are obtained by solving a small linear system. In addition, some researchers have also solved special constrained least squares problems by numerical iteration.
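The vectorization described above rests on the standard identity vec(AXB) = (B^T ⊗ A) vec(X), where vec() stacks columns. The following minimal NumPy sketch (an illustration of ours, not code from the thesis) checks the identity and shows the size inflation that the text warns about:

```python
import numpy as np

# Minimal sketch (ours, not from the thesis) of the standard identity
# vec(A X B) = (B^T kron A) vec(X), where vec() stacks columns.
m, n, p, q = 4, 3, 3, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n))
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, q))

vec = lambda M: M.reshape(-1, order="F")        # column-stacking vec()
K = np.kron(B.T, A)                             # (m q) x (n p) coefficient matrix
print(np.allclose(K @ vec(X), vec(A @ X @ B)))  # True: the identity holds

# A and B together store m n + p q = 27 numbers; the vectorized
# coefficient matrix K stores (m q)(n p) = 180, the unnecessary
# enlargement criticized in the text.
print(A.size + B.size, K.size)
```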
     In general, a matrix factorization applied to a matrix LS problem is a special-purpose technique that requires proficient skill, and it depends on both the system of matrix equations and the constraints on the solutions. It is therefore difficult to apply a particular factorization directly to a family of related but different problems; the techniques are hardly portable. Furthermore, these approaches are generally not robust: minor modifications of the constraints may cause great changes in the solution procedure. A natural question is whether there exists a uniform pattern, in both general-solution representation and iteration, for solving general linearly constrained least squares problems. This thesis concentrates on this question and tries to give answers by approaching these well-known problems from a new viewpoint and jumping-off point. We present a framework in both theory and algorithms, and we believe that the methods proposed here are enlightening and helpful for solving other constrained matrix LS problems.
     The contributions of this thesis are outlined as follows:
     1. Using basis matrices of the constraint space (the constraint matrices), we characterize the linearly constrained matrix space by a lower-dimensional column-vector space of parameters (the coordinate space), and we give a matrix representation of the orthogonal projection of an arbitrary matrix onto the constraint space. These properties are used throughout to analyze and solve least squares problems with linear constraints (a small numerical illustration follows this list).
     2. We uncover the inherent relations between the solution set of a matrix LS problem with (relatively) stronger constraints and the solution set of the same problem with weaker constraints. Whether the former is contained in the latter depends solely on the right-hand side matrix T of the equation. For the general case we prove that there exists an equivalent map φ(T) of T such that the strongly constrained LS problem with right-hand side φ(T) has the same solutions as the original strongly constrained problem. More importantly, the solution set of the weakly constrained problem with right-hand side φ(T) contains all solutions of the original strongly constrained problem. Therefore, if the general solution of the weakly constrained problem with right-hand side φ(T) is known or easily obtained, the original strongly constrained problem is also easy to solve. Theorem 3.5 reveals these latent properties and shows that the strongly constrained solution set is exactly the intersection of the strong-constraint space with the solution set of the weakly constrained problem with right-hand side φ(T). Obtaining the equivalent map is thus the key to solving the constrained LS problem. The map φ(T) is not unique; it can be taken as the orthogonal projection of T onto the range of the linear map of the problem restricted to the strong-constraint space. We propose several theorems characterizing equivalent maps, give verifiable ways of constructing them, and apply the theory to some special constrained LS problems, obtaining simple representations of their general solutions. Seeking the strongly constrained solutions inside a weakly constrained solution set provides a uniform pattern for solving general constrained matrix LS problems, and it is particularly suitable when more and more linear constraints are imposed step by step (a schematic derivation follows this list).
     3. For two kinds of structured matrix equations with specific constraints, we remove the constraints by means of an eigenvalue decomposition of the matrix appearing in the constraints: the original constrained equation is shown to be equivalent to an unconstrained equation whose general solution is known, and the solutions of the constrained equation are then reconstructed. The eigenvalue decomposition drops out of the final representation, so the formula for the general solution is simple and eigenvector-free (see the sketch after this list).
     4. We propose several matrix-form Krylov subspace methods for solving linearly constrained matrix LS problems. In essence, they are the algorithms CGLS, GMRES, bidiagonalization Lanczos, and LSQR in matrix-iteration form, but they are not trivial rewritings of the vector Krylov methods. Theoretically, the matrix Krylov methods are derived by removing the constraints with a basis of the constraint space, employing the Kronecker product to form an unconstrained linear system in long-vector form, and applying vector-form Krylov methods; in the actual matrix iterations, however, neither Kronecker products nor basis matrices are involved, so the vector-to-matrix correspondence is nontrivial. Theorem 5.2 characterizes a one-to-one correspondence between given vectors and particular matrices in the constraint space, which accomplishes the equivalent transformation of the Krylov subspace methods from vector form to matrix form, and concrete algorithms are given for equations with special constraints. Several examples are reported to show the numerical efficiency of the matrix Krylov methods (a matrix-form CGLS sketch closes this list below). We believe the ideas proposed here are also suitable for deriving other matrix-form iterative algorithms.
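As a concrete instance of the orthogonal projection in contribution 1, take the constraint space to be the symmetric matrices (an example of our own choosing; the thesis handles general linear constraint spaces). Under the Frobenius inner product the projection is P(A) = (A + A^T)/2, as this sketch verifies:

```python
import numpy as np

# Illustration for contribution 1 (our own example): the constraint
# space is the symmetric n x n matrices; the orthogonal projection of
# an arbitrary A is P(A) = (A + A^T)/2, and the residual A - P(A) is
# Frobenius-orthogonal to every member S of the constraint space.
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
PA = 0.5 * (A + A.T)                          # projection onto the space
S = rng.standard_normal((n, n)); S = S + S.T  # arbitrary symmetric matrix
print(np.sum((A - PA) * S))                   # <A - P(A), S>_F, ~ 0
```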
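The mechanism behind contribution 2 can be sketched in generic notation of our own (the thesis's precise setting may differ in detail). Let S_s and S_w, with S_s contained in S_w, be the strong and weak constraint spaces, and let L be the linear map of the problem:

```latex
\varphi(T) \;=\; P_{\mathcal{L}(S_s)}\,T
\qquad \text{(orthogonal projection of $T$ onto the range $\mathcal{L}(S_s)$).}

% For X in S_s, L(X) - phi(T) lies in L(S_s) while T - phi(T) is
% orthogonal to L(S_s), so by the Pythagorean theorem
\|\mathcal{L}(X)-T\|_F^{2}
  \;=\; \|\mathcal{L}(X)-\varphi(T)\|_F^{2} \;+\; \|T-\varphi(T)\|_F^{2}.

% The last term is constant in X, so replacing T by phi(T) leaves the
% strongly constrained minimizers unchanged; they attain residual zero
% for phi(T), which yields the intersection property of Theorem 3.5:
\operatorname*{arg\,min}_{X\in S_s}\|\mathcal{L}(X)-T\|_F
  \;=\; S_s \,\cap\, \operatorname*{arg\,min}_{X\in S_w}\|\mathcal{L}(X)-\varphi(T)\|_F .
```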
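For contribution 3 the thesis treats two specific equations; the following stand-in of our own shows the general device of removing a constraint by an eigenvalue decomposition. For the commuting constraint RX = XR with R symmetric, writing R = Q diag(lam) Q^T and Y = Q^T X Q turns the constraint into (lam_i - lam_j) Y_ij = 0, an unconstrained problem on the surviving entries of Y:

```python
import numpy as np

# Stand-in example for contribution 3 (not the thesis's equations):
# constraint R X = X R with R symmetric. With R = Q diag(lam) Q^T and
# Y = Q^T X Q, the constraint reads (lam[i] - lam[j]) * Y[i, j] = 0,
# so zeroing Y where the eigenvalues differ removes the constraint.
rng = np.random.default_rng(2)
n = 4
R = rng.standard_normal((n, n)); R = R + R.T
lam, Q = np.linalg.eigh(R)

X = rng.standard_normal((n, n))
Y = Q.T @ X @ Q
Y[np.abs(lam[:, None] - lam[None, :]) > 1e-8] = 0.0  # keep free entries only
Xc = Q @ Y @ Q.T                                     # reconstructed solution
print(np.linalg.norm(R @ Xc - Xc @ R))               # ~ 0: constraint holds
```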
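Finally, a minimal sketch of what a matrix-form Krylov iteration as in contribution 4 can look like. We use CGLS for min ||A X B - T||_F over symmetric X; the symmetric constraint, the equation, and all names here are stand-ins of our own, and the thesis's algorithms built on Theorem 5.2 may differ in detail. The point is that the iteration works on matrices only, with the orthogonal projection replacing any basis or Kronecker machinery:

```python
import numpy as np

def matrix_cgls(A, B, T, n_iter=200, tol=1e-10):
    """Sketch of a matrix-form CGLS for min ||A X B - T||_F over
    symmetric X (our stand-in problem). The constraint is enforced by
    the projection P onto symmetric matrices; no Kronecker products or
    basis matrices appear in the iteration."""
    P = lambda M: 0.5 * (M + M.T)            # projection onto the constraint space
    n = A.shape[1]
    X = np.zeros((n, n))
    R = T.copy()                              # residual T - A X B
    S = P(A.T @ R @ B.T)                      # projected normal-equations residual
    p, gamma = S, np.sum(S * S)
    for _ in range(n_iter):
        Q = A @ p @ B
        alpha = gamma / np.sum(Q * Q)
        X += alpha * p                        # stays in the constraint space
        R -= alpha * Q
        S = P(A.T @ R @ B.T)
        gamma_new = np.sum(S * S)
        if gamma_new < tol**2:
            break
        p = S + (gamma_new / gamma) * p
        gamma = gamma_new
    return X

rng = np.random.default_rng(3)
A, B = rng.standard_normal((6, 4)), rng.standard_normal((4, 6))
T = rng.standard_normal((6, 6))
X = matrix_cgls(A, B, T)
print(np.linalg.norm(X - X.T))                # X is symmetric by construction
G = A.T @ (A @ X @ B - T) @ B.T
print(np.linalg.norm(0.5 * (G + G.T)))        # projected optimality residual ~ 0
```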
