MyraMath /sparse routines

The /sparse folder supplies containers, ranges and algorithms for matrices that are sparse, in the sense that most of their entries are zero-valued and not explicitly stored. The chief containers are Pattern and SparseMatrix, which store their contents in compressed-sparse-column (CSC) format. CSC is compact and contiguous, which leads to reasonably efficient algorithms for e.g. sparse matrix-vector multiplication. However, the rigidity of this data structure also implies that Pattern and SparseMatrix are basically immutable, in that their nonzero entries must be explicitly provided at construction-time and you can never add any more. This is a drag for casual usage, so some helper classes (PatternBuilder, SparseMatrixBuilder) that do allow random-access/mutable stencils are provided.
Following similar design idioms that were laid out in /dense, SparseMatrix has an associated interface type (SparseMatrixRange) which all of the algorithms in /sparse operate upon. Like MatrixRange, SparseMatrixRange provides chainable methods for arbitrary row/column slicing into shallow subviews. A new concept in /sparse is SparseMatrixIterator, which provides fine-grained access for one-by-one traversals over all the structural nonzeroes in a SparseMatrixRange. SparseMatrixRange also provides the means for getting appropriately formatted "external data" into MyraMath routines.
A Pattern is used to specify the nonzero stencil of a SparseMatrix, but the rigidity of these data structures makes both of them essentially immutable. Helper classes, called PatternBuilder and SparseMatrixBuilder, are provided to assist in their construction. PatternBuilder's are constructed empty, then populated with a variety of .insert() statements. The code excerpt below (tutorial/sparse/pattern.cpp) visualizes the process of constructing a Pattern using a PatternBuilder. Neither Pattern nor PatternBuilder carry any numeric data, only symbolic nonzeroes.
The last few lines created a SparseMatrix from the resulting PatternBuilder (with 1's for most of the numeric values), then examined the contents in a dense format. Separating the symbolic fill from the numeric fill is commonplace in finite element modeling, because the symbolic fill is cheap to compute and can be used to load balance (and synchronize) the more expensive numeric fill in parallel environments.
SparseMatrixBuilder can also be used to construct a SparseMatrix, but unlike Pattern and PatternBuilder it can hold numeric data too. It is constructed empty and then populated using "function call" syntax A(i,j)=value for each nonzero. When finished, a SparseMatrix can be extracted by calling the .make_SparseMatrix() method. The code excerpt below (tutorial/sparse/builder.cpp) uses a SparseMatrixBuilder to produce the exact same SparseMatrix as before:
Filling a finite element system directly into a SparseMatrixBuilder is easier than the two-phase serial-symbolic/parallel-numeric scheme outlined previously, but it is a strictly serial process. In both cases, PatternBuilder and SparseMatrixBuilder play similar roles as "precursor" ingredients for SparseMatrix, and the choice between them is largely a matter of taste.
The listing below (tutorial/sparse/range.cpp) generates a test SparseMatrix and experiments with various SparseMatrixRange slicing operations. SparseMatrixRange supports the same suite of slices as the MatrixRange from /dense. Namely: window() for general slices, top(), bottom(), row(), rows() for row slicing, and left(), right(), column(), columns() for column slicing.
The native output from std::cout << A for a SparseMatrix is a little cryptic, but basically it prints all the nonzero entries, displaying them one column at a time. Recall that in CSC format, A(:,j) is stored contiguously, so accessing/printing A by column is cheap. The last few lines of this example show that:
Existing nonzero values can be modified (lines 33-36 of the output), and nonzeroes can be removed via the .erase() method of SparseMatrixBuilder (lines 38-43 of the output). For the record, removing nonzeroes from SparseMatrix's by roundtripping in and out of a SparseMatrixBuilder is discouraged because it leads to excessive data movement. For best performance, try to organize/separate your sparse calculations into (i) a "populate" phase using PatternBuilder and/or SparseMatrixBuilder, (ii) a single call to construct a SparseMatrix, and (iii) a "calculate" phase where you feed it into non-modifying downstream algorithms (like iterative or direct solvers).
SparseMatrixRange's provide coarse-grained access to large collections of nonzeroes. For fine-grained access to single nonzeroes, using a SparseMatrixIterator is preferred. Note that this is a little different from how /dense Matrix's and MatrixRange's work. To visit every nonzero of a dense Matrix, you'd typically use a double for loop over all (i,j) indices. But for a SparseMatrix, this is awkward and asymptotically inefficient because not every (i,j) pair maps to an actual nonzero. Though SparseMatrix and SparseMatrixRange provide a .pointer(i,j) method that can be used to "probe" for the presence of a nonzero, it would be preferable to somehow restructure the loops to "jump directly" from one nonzero to the next. This is what SparseMatrixIterator does. The code excerpt below (tutorial/sparse/iterator.cpp) demonstrates its usage.
It should come as little surprise that the iterator-based traversal reported the entries in column-major order, because SparseMatrix is internally stored in CSC format. The iterator presents each nonzero through three accessor functions: the row/column indices i() and j(), and the numeric value value() (which is actually mutable, if A is mutable). Many algorithms like sparse/frobenius.h, sparse/minmax.h, sparse/transpose.h, etc. are implemented in terms of SparseMatrixIterator's.
Compared to the dense BLAS, the sparse BLAS has (in this author's opinion) never really caught on. MyraMath does not bother to wrap or exploit any standard sparse BLAS routines, but instead provides a few routines that are BLAS-like in nature.
The first is a set of overloads in sparse/gemm.h to perform multiplication by a SparseMatrix. Like the dense gemm(), it performs a fused multiply-add update of the form C=alpha*op(A)*op(B)+beta*C for dense C. However, exactly one of its A or B operands is sparse. (So in effect it is either a sparse*dense multiply from the left, or a dense*sparse multiply from the right.) This operation, applying a sparse matrix to a dense one, is the dominant calculation behind many iterative/Krylov solution schemes. It is also required internally by MyraMath's direct solvers, for updating residuals inside iterative refinement routines. The following code excerpt (tutorial/sparse/gemm.cpp) demonstrates how to use the sparse version of gemm() and also checks it against the dense gemm() on an equivalent dense Matrix.
There is also a set of overloads in sparse/trsm.h for solving triangular systems of the form op(A)*X=B or X*op(A)=B for sparse A, overwriting B with X. It still supports the same wide variety of options that dense trsm() supports: solving from either the 'L'eft or 'R'ight side, solving by either the 'U'pper or 'L'ower triangle of A, and applying a variety of modifiers (N,T,H,C) to A. The following code excerpt (tutorial/sparse/trsm.cpp) demonstrates how to use trsm() and also checks it against the dense trsm() on an equivalent dense Matrix.
This routine is used internally for solving by the triangular factors that result from incomplete LU and incomplete Cholesky factorizations. Additional routines of note are (i) sparse/trmm.h, for multiplying by a triangular SparseMatrix, (ii) sparse/symm.h, for multiplying by a symmetric SparseMatrix, and (iii) sparse/hemm.h, for multiplying by a hermitian SparseMatrix.
Like MatrixRange in /dense, SparseMatrixRange also serves as the interface class for injecting external data into /sparse algorithms. This data must be stored in compressed-sparse-column (CSC) format, which requires 3 flat arrays: the numeric value of each nonzero, the row index of each nonzero, and the offsets where each column begins (the cumsum() of the column counts). SparseMatrixRange has a constructor that accepts pointers to these arrays (and some size metadata), and the resulting range can be passed into any of the /sparse algorithms. The code excerpt below (tutorial/sparse/external.cpp) shows an example of passing external data into the sparse gemm() algorithm.
The same technique will be used to get external datasets into MyraMath's sparse direct solvers.
Continue to Tutorial for /multifrontal solvers (part 1), or go back to API Tutorials.