The /iterative folder contains various iterative solution algorithms. Generally speaking, direct solvers favor robustness over speed, whereas iterative solvers lean the other direction. Often the two solution frameworks are combined. In finite element substructuring methods, for example, direct solvers are used to explicitly eliminate interior subdomain unknowns, and the leftover interface-matching problem is then solved using iterative techniques. Another noteworthy example is mixed precision refinement, wherein a low (single) precision direct solver is used as a preconditioner when solving a high (double) precision A*x=b problem. By providing both direct and iterative solution routines, MyraMath seeks to be an attractive "ecosystem" for implementing these kinds of algorithms.
All the iterative algorithms in MyraMath take Action's as their inputs. Action is an abstraction for a "callback" that applies a matrix-vector product (b=A*x), the key step in Krylov-type linear solution and eigensolution algorithms. Besides Krylov-type algorithms, simpler ones like backwards refinement and randomized low-rank sampling are also implemented in terms of Action.
Most of the classes we've used so far have been encapsulations of data. For example, Matrix encapsulates a 2D table of numbers, PatternBuilder encapsulates a bitmask of nonzero entries, and SparseRCholeskySolver encapsulates a (permuted) sparse triangular factor L such that A = L*L'. Action represents a slightly different concept: it encapsulates behavior. Specifically, Action encapsulates the behavior of applying a linear operator A to a column vector x to produce another column vector b. This is a fundamental operation behind many iterative solution algorithms: given a candidate solution x, the Action A can be applied to it to measure a residual, which is used in turn to improve the solution (ad infinitum).
Many functions and methods can readily be adapted into Action's. For instance, diagonal scaling (think dense/dimm.h), a sparse matrix-vector multiply (think sparse/gemm.h), and even the .solve() method of a Solver all model the concept of an Action by mapping one input vector x into another output vector b. Within /iterative there are many "adaptor" methods for making atomic Action's.
This language of composition can be used to rapidly prototype all sorts of preconditioning effects (additive/multiplicative combination, deflation, etc). Even third-party or end-user code can be encapsulated within an Action (see iterative/UserAction.h). Any process that maps input vectors to output vectors is a suitable model for the Action concept. See the Appendix below for a working example. Just like MatrixRange served as the "interface type" to get external algebraic data into direct routines (like getrf()), Action serves as the "interface type" to get external algebraic behavior into iterative routines (like gmres()).
Refinement
Backwards refinement (see iterative/refine.h) is a simple process for improving the accuracy of a direct solver. Each step entails:
Forward multiply to evaluate the residual, r = b-A*x
Backsolve to find the correction vector, A*c = r
Update the solution with the correction, x += c
Refinement delegates to two Action's: one to apply A when forming the residual, and another to solve by A when forming the correction (this "inverse" Action is traditionally called M). The example below (tutorial/iterative/refine.cpp) demonstrates how to use refine(). The forward multiply by A is accomplished by make_SymmAction(), and the backsolution M comes from wrapping a make_SolveAction() around an existing instance of an RLDLTSolver.
Most of the sparse direct solvers in /multifrontal already encapsulate this sequence of steps into a solver.refine() method, see Iterative Refinement for details.
It's possible to mix precision types (float and double) when performing backwards refinement, typically by applying a low-precision solver (M) to a high-precision system (A*x=b). The advantage is that a low-precision solver requires roughly half the memory of a high-precision one, but there is some risk that a low precision factorization is more likely to break down due to ill-conditioning. See iterative/RaiseAction.h and iterative/LowerAction.h for helper classes that adjust the precision of an Action, or iterative/mixed_refine.h for canned algorithms.
Krylov linear solvers
The conditions under which iterative refinement converges are fairly restrictive. In particular, the original factorization must be fairly accurate to guarantee convergence/improvement. Krylov schemes are an alternative framework for solving A*x=b, wherein the k'th solution x_k is sought as a linear combination of {b, A*b, A^2*b, A^3*b, .. A^k*b}, called the Krylov space of (A,b). Under some mild assumptions about A and b, this collection of vectors will completely span R^n after n iterations, guaranteeing it captures the exact solution to A*x=b (in exact arithmetic, at least).
In practice n iterations is still far too long to wait, but a more detailed analysis of the convergence properties of Krylov methods (too advanced to pursue here) reveals that they can converge quickly if A has favorable spectral properties (clustered eigenvalues, well conditioned). This observation leads to preconditioned Krylov schemes, in which another Action M is introduced and the alternative problem M^-1*A*x = M^-1*b is iterated instead. The preconditioner M^-1 serves to improve the spectral properties of A, and should be in some sense an approximation of A's inverse (such that M^-1*A is better conditioned and/or better clustered). Finding a good M^-1 is problem dependent, but preconditioning strategies often draw from (i) incomplete factorizations, (ii) classical smoothers, or (iii) cheap approximations of the underlying "physics" of A.
There are many Krylov-type solvers, but the most important ones are probably:
iterative/bicgstab.h, for Biconjugate Gradient Stabilized, suitable for most A. Inexpensive, but somewhat erratic convergence.
iterative/pcg.h, for Preconditioned Conjugate Gradients. Superior performance, but only suitable for symmetric positive A.
iterative/minres.h, for Minimum Residual. Almost as good as pcg(), still requires symmetric A but it can be indefinite.
iterative/gmres.h, for Generalized Minimum Residual, suitable for arbitrary A. Optimally convergent, but costly in memory/time.
All of these routines take an Action that applies A and another Action that applies M^-1, and they overwrite your initial guess for x with the final solution. They also return profiling data (convergence history, etc); check the source documentation for details. The example code below (tutorial/iterative/pcg.cpp) demonstrates pcg() on a symmetric positive definite A (a 2D graph laplacian), using an incomplete Cholesky factorization as the preconditioner M^-1. For comparison, the problem is also solved with no preconditioner (by setting M^-1=I).
Incomplete Cholesky preconditioner. Presents a solve() function, but it's only approximate.
-------- no preconditioner -------
|A*x-b|/|b| = 7.24724e-09
iterations = 28
------- with preconditioner ------
|A*x-b|/|b| = 3.55127e-09
iterations = 10
For sake of completeness, tutorial/iterative/bicgstab_gmres.cpp (shown below) compares the performance of bicgstab() and gmres() on a small 50x50 unsymmetric system.
The performance of the two methods is difficult to compare directly: gmres() converges better (monotonically) but requires more internal workspace, while each bicgstab() iteration requires two applications of A and M. End user experimentation is encouraged. It is likely that more Krylov solvers will be added to MyraMath over time.
Regardless of the method, successful Krylov solution hinges on good preconditioning. The .schur() and .partialsolve() methods on MyraMath's /multifrontal Solver's are intended to be the building blocks of substructuring preconditioners. This falls under strategy (iii), exploiting the physical intuition that "long range interactions" are less important, and that solving a collection of local subproblems instead can often strongly cluster the spectrum of a large sparse A that arises from discretizing a partial differential equation.
Krylov eigensolvers
Spans of Krylov vectors {b, A*b, A^2*b, A^3*b, .. A^k*b} are also good spaces for approximating extremal eigenpairs (consider their similarity to the iterates generated by the classical power method). Eigensolvers are considerably harder to implement than linear solvers. However, unstructured nested dissection requires solving eigenproblems over sparse graph laplacians, so MyraMath does implement a few simple Krylov eigensolvers (as details of the internal reorder() routine), and these algorithms are callable from user code.
Two algorithms are provided:
iterative/lanczos1.h, the Lanczos algorithm, suitable for finding a large eigenpair of a symmetric A
iterative/lopcg1.h, the locally optimal conjugate gradient algorithm, suitable for finding a small eigenpair of a symmetric A
The example below (tutorial/iterative/lanczos_lopcg.cpp) uses both these algorithms to find one large eigenpair and one small eigenpair of a sparse real symmetric A:
Note these methods are only suitable for real symmetric A (because graph laplacians are real symmetric). For general A, more sophisticated algorithms (like the Arnoldi method) are needed, which are not implemented here. Fortunately there are robust open source packages in this space that can help, like ARPACK and SLEPc.
Appendix 1: Writing your own Action
The algorithms in /iterative can be used with any Action. A variety of common Action's (sparse matrix-vector multiply, diagonal scaling, etc) and a language to compose them (sum, cascade, etc) are available. However, advanced users may discover a need to inject their own code as an Action (most likely as a preconditioner). Adaptor routines for this use case can be found in iterative/UserAction.h.
User code should be encapsulated within a class that provides the following:
A typedef for the Number type that the user code operates upon (float for example, or std::complex<double>)
A size() method that returns the size of the Action as a std::pair<int,int>
A multiply() method that maps b:=A*x, for (const) CMatrixRange x and (mutable) MatrixRange b
Then, call make_UserAction() to adapt an instance of your class into an Action. Any user code wrapped like this can be further composed (added, cascaded, scaled, etc) just like any other native Action. The following program (tutorial/iterative/action1.cpp) shows a working example of this process. The "user code" applies a 1D graph laplacian over N points and is encapsulated within the MyAction class. It is adapted into an Action using make_UserAction(), then composed with a rank-1 correction (a regularization), and finally fed into pcg() for linear solution.
Appendix 2: Writing an Action that calls third party code
Another notable use case for make_UserAction() is adapting third party code (like an external preconditioning algorithm) into an Action. This third party code probably isn't something you can easily modify or cut/paste into a MyAction.multiply() method, and it may only be available in compiled form. You can still make this work: you just need to write a class (similar to MyAction from before) that delegates into the external library. The following program (tutorial/iterative/action2.cpp) shows a working example of this process.
This is essentially the Adaptor design pattern: MyraMath is the Client, the third party code is the Adaptee, and you write MyAction to be the Adaptor. The pre-canned Action's for calling BLAS routines (gemm(), etc) basically work like this. You're encouraged to inject your own preconditioning routines this way, or to build new ones from the other building blocks in MyraMath.