The central part of my GSoC project concerns implementing the Jacobi-Davidson method natively in Julia, available in JacobiDavidson.jl. This method computes a few approximate solutions of the eigenvalue problem $Ax = \lambda Bx$ for large matrices $A$ and $B$. As it uses iterative solvers internally, much time has gone into improving IterativeSolvers.jl in general. Lastly, as iterative solvers are typically used with preconditioners, I have also implemented an incomplete LU factorization for sparse matrices in ILU.jl.
JacobiDavidson.jl
The Jacobi-Davidson implementation is ready for use and can be applied to the (generalized) eigenvalue problem for non-Hermitian matrices. It’s similar to the eigs method already available in Julia: you can get a couple of eigenvalues near a specified target in the complex plane.
The jdqr function computes a partial Schur decomposition $AQ = QR$ of a general matrix $A$. The columns of $Q$ are orthonormal Schur vectors, and the diagonal of the upper triangular matrix $R$ contains the eigenvalues of $A$. Working with Schur vectors is numerically more stable than working with eigenvectors, but eigenvectors can be computed from the Schur vectors when necessary.
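To make that last point concrete, here is a small dense sketch using Julia's built-in schurfact on an arbitrary random matrix (so a full decomposition, not the partial one jdqr produces): if $AQ = QR$ and $RY = Y\Lambda$, then $A(QY) = (QY)\Lambda$, so eigenvectors of $A$ follow from eigenvectors of the small triangular factor $R$.

```julia
# Dense illustration with Julia's built-in Schur factorization (not jdqr itself).
A = rand(Complex128, 50, 50)
F = schurfact(A)            # A * Q = Q * R with Q unitary, R upper triangular
Q, R = F[:Z], F[:T]
diag(R)                     # the eigenvalues of A sit on the diagonal of R

λ, Y = eig(R)               # eigen-decomposition of the small triangular factor
X = Q * Y                   # eigenvectors of A recovered from the Schur vectors
norm(A * X - X * diagm(λ))  # ≈ 0
```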
For the generalized problem we have the jdqz function, which computes a partial generalized Schur decomposition $AZ = QS$ and $BZ = QT$ for a matrix pencil $(A, B)$. Here $Q$ and $Z$ have orthonormal columns, and $S$ and $T$ are upper triangular. The eigenvalues of the pencil can be found by computing the ratios $S_{kk} / T_{kk}$.
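The same relations can be illustrated with the built-in QZ (generalized Schur) factorization of a small dense pencil; this is just a sketch with random matrices, not jdqz itself:

```julia
# Dense illustration with the built-in generalized Schur factorization.
A = rand(Complex128, 50, 50)
B = rand(Complex128, 50, 50)
F = schurfact(A, B)          # A = Q * S * Z' and B = Q * T * Z'
Q, Z, S, T = F[:Q], F[:Z], F[:S], F[:T]
norm(A * Z - Q * S)          # ≈ 0
norm(B * Z - Q * T)          # ≈ 0
λ = diag(S) ./ diag(T)       # eigenvalues of the pencil (A, B)
```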
At this point no official release has been tagged yet, as there is still some work to be done: hopefully the jdqr and jdqz methods can largely be merged, as they are very similar; some optimizations for Hermitian problems are not yet implemented; and lastly the methods do not yet support generic vectors and numbers.
For an example of jdqz, see the last section of this post.
IterativeSolvers.jl
My contributions to IterativeSolvers.jl include the following (a brief usage sketch of some of these methods follows after the list):
- improving speed and memory usage in GMRES, CG, Chebyshev iteration, the power method and the stationary methods mentioned below;
- adding the missing iterative methods MINRES and BiCGStab(l);
- implementing stationary methods for sparse matrices #156;
- upgrading the package to Julia 0.6 and updating the tests along the way;
- improving and restructuring the documentation.
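As a rough usage sketch of a few of these methods — the matrix below is just an arbitrary symmetric positive definite test problem, the variable names are mine, and all keyword arguments are left at their defaults:

```julia
using IterativeSolvers

# A symmetric positive definite test problem: the 1D Poisson matrix.
n = 1_000
A = spdiagm((fill(-1.0, n - 1), fill(2.0, n), fill(-1.0, n - 1)), (-1, 0, 1))
b = rand(n)

x_cg     = cg(A, b)              # Conjugate Gradients (A symmetric positive definite)
x_minres = minres(A, b)          # MINRES (A symmetric, possibly indefinite)
x_gmres  = gmres(A, b)           # GMRES (general A)
x_bicg   = bicgstabl(A, b, 2)    # BiCGStab(l) with l = 2
```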
ILU.jl
Computing the full LU decomposition of a large, sparse matrix requires too much memory and computational work. By simply dropping small terms during the factorization these problems can be remedied, at the expense of the factorization being inexact or incomplete (ILU). Such an incomplete factorization can still be an excellent preconditioner to speed up iterative methods.
As ILU for the SparseMatrixCSC type was not yet available in Julia, I have implemented it based on the article “Crout versions of ILU for general sparse matrices” by Na Li, Yousef Saad and Edmond Chow.
The package is completely ready for use and is well tested. See below for an example.
In a later version the package can be improved by adding different dropping strategies.
Examples
Below you can find a few examples of how to use the packages I’ve been working on.
Jacobi-Davidson
Let’s take a look at a toy example of the generalized eigenvalue problem $Ax = \lambda Bx$, where $A$ and $B$ are diagonal matrices of size $n \times n$ with entries $A_{kk} = \sqrt{k}$ and $B_{kk} = 1 / \sqrt{k}$. The eigenvalues of this problem are just the integers $1, 2, \ldots, n$.
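A quick sanity check of that claim with small dense matrices (purely illustrative; the actual example below is matrix-free):

```julia
# Dense check for a small n: the generalized eigenvalues are 1, 2, ..., n.
n = 5
A = diagm(sqrt.(1:n))
B = diagm(1 ./ sqrt.(1:n))
sort(real(eig(A, B)[1]))   # ≈ [1.0, 2.0, 3.0, 4.0, 5.0]
```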
We implement the action of the matrices $A$ and $B$ matrix-free, using LinearMaps.jl:
using LinearMaps
# In-place action y = A * x of the diagonal matrix A with entries sqrt(i)
function myA!(y, x)
    for i = 1 : length(x)
        @inbounds y[i] = sqrt(i) * x[i]
    end
end

# In-place action y = B * x of the diagonal matrix B with entries 1 / sqrt(i)
function myB!(y, x)
    for i = 1 : length(x)
        @inbounds y[i] = x[i] / sqrt(i)
    end
end
n = 100
A = LinearMap{Complex128}(myA!, n; ismutating = true)
B = LinearMap{Complex128}(myB!, n; ismutating = true)
Having fixed $n = 100$ above, we request 5 eigenpairs and target the eigenvalues with the largest real part, somewhere near $110$. We use a couple of steps of BiCGStab(2) as the internal iterative solver, and let the dimension of the search space vary between 5 and 10 vectors:
using JacobiDavidson
target = LR(110.0 + 0.0im) # Largest Real part
schur, resnorms = jdqz(A, B,
    bicgstabl_solver(n, max_mv_products = 10, l = 2),
    target = target,
    pairs = 5,
    ɛ = 1e-9,
    min_dimension = 5,
    max_dimension = 10,
    max_iter = 100,
    verbose = true
)
println.(schur.alphas ./ schur.betas)
using Plots
plot(resnorms, yscale = :log10, marker = :+, label = "Residual norm", xlabel = "Iteration")
It indeed finds the eigenvalues from 96 up to 100:
100.00000000000041 + 5.329070518200757e-14im
98.99999999999996 - 1.7674527390216706e-14im
98.00000000000013 - 3.542337114695263e-14im
97.00000000000024 + 1.749508591650104e-14im
96.00000000000013 + 1.3053503572900918e-14im
The plot shows the convergence history, i.e. the residual norm at each iteration.
Preconditioning
The problem gets harder when we increase $n$ by a few orders of magnitude. Especially if we want to target eigenvalues in the interior of the spectrum (in our example near $n / 2$), GMRES and BiCGStab(l) might have trouble solving a very indefinite system.
In that case a preconditioner for $(A - \tau B)$ can be used, where $\tau$ is the target. Here we can simply use the exact inverse: $A - \tau B$ is diagonal with entries $(k - \tau) / \sqrt{k}$, so its inverse is the diagonal matrix with entries $\sqrt{k} / (k - \tau)$.
It can be implemented matrix-free and in-place:
import Base.LinAlg.A_ldiv_B!

# Matrix-free preconditioner: the exact inverse of (A - τB) for our diagonal pencil
struct SuperPreconditioner{numT <: Number}
    target::numT
end

# Apply the preconditioner in-place: x[i] ← x[i] * sqrt(i) / (i - τ)
function A_ldiv_B!(p::SuperPreconditioner, x)
    for i = 1 : length(x)
        @inbounds x[i] *= sqrt(i) / (i - p.target)
    end
end
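As a quick sanity check (this snippet is only illustrative and not part of the package), applying the preconditioner to $(A - \tau B)x$ should give back $x$:

```julia
# Verify on a small vector that the preconditioner inverts (A - τB) exactly.
let n = 10, τ = 5.1 + 0im
    x = rand(Complex128, n)
    y = [(sqrt(i) - τ / sqrt(i)) * x[i] for i = 1 : n]   # y = (A - τB) * x
    A_ldiv_B!(SuperPreconditioner(τ), y)                 # y should now equal x again
    norm(y - x)                                          # ≈ 0
end
```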
Now we call Jacobi-Davidson with a Near target and pass the preconditioner. This time we use GMRES as the internal solver, but BiCGStab(l) works as well.
n = 100_000
τ = 50_000.1 + 0im
target = Near(τ)
A = LinearMap{Complex128}(myA!, n; ismutating = true)
B = LinearMap{Complex128}(myB!, n; ismutating = true)
P = SuperPreconditioner(τ)
schur, residuals = jdqz(A, B,
    gmres_solver(n, iterations = 10),
    preconditioner = P,
    target = target,
    pairs = 5,
    ɛ = 1e-9,
    min_dimension = 5,
    max_dimension = 10,
    max_iter = 200,
    verbose = true
)
It converges to the eigenvalues 49999, 50000, 50001, 50002 and 50004:
50004.00000000014 + 3.5749921718300463e-12im
49999.999999986496 - 7.348301591250897e-12im
50001.00000000359 - 1.9761169705101647e-11im
49998.99999999998 - 1.0866253642291695e-10im
50002.00000000171 - 2.3559720511618024e-11im
It does not yet detect 50003, but that might happen when pairs is increased a bit. What is more interesting is that the total number of iterations stays small as a result of the preconditioner.
It’s not easy to construct a preconditioner this good for an arbitrary problem, but people usually know what works well for specific classes of problems. If no problem-specific preconditioner is available, you can always try a general-purpose one such as ILU. The next section illustrates that.
ILU example
As an example of how ILU can be used, we generate a non-symmetric, banded matrix with a structure that typically arises in finite difference discretizations of three-dimensional problems:
n = 64
N = n^3

# Non-symmetric tridiagonal matrix for one dimension
A = spdiagm((fill(-1.0, n - 1), fill(3.0, n), fill(-2.0, n - 1)), (-1, 0, 1))
Id = speye(n)

# Build the three-dimensional operator via Kronecker products
A = kron(A, Id) + kron(Id, A)
A = kron(A, Id) + kron(Id, A)

x = ones(N)
b = A * x
The matrix has size $N \times N$ with $N = 64^3 = 262144$. We want to solve the problem $Ax = b$ using for instance BiCGStab(2), but it turns out that convergence slows down as the size of the problem grows. A quick benchmark shows it takes about 2.0 seconds to solve the problem to a reasonable tolerance:
> using BenchmarkTools, IterativeSolvers
> my_x = @btime bicgstabl($A, $b, 2, max_mv_products = 2000);
2.051 s
> norm(b - A * my_x) / norm(b)
1.6967043606691152e-9
Now let’s construct the incomplete LU decomposition:
> using ILU
> LU = crout_ilu(A, τ = 0.1)
> nnz(LU) / nnz(A)
2.1180353639352374
Using the drop tolerance $\tau = 0.1$ from above, we get an approximate LU decomposition with only about twice as many entries as the original matrix, which is reasonable. Let’s see what happens when we benchmark the solver again, now with ILU as a preconditioner:
> my_x = @btime bicgstabl($A, $b, 2, Pl = $LU, max_mv_products = 2000);
692.187 ms
> norm(b - A * my_x) / norm(b)
2.133397068536056e-9
It solves the problem to the same tolerance in roughly a third of the time (a 66% reduction). There is of course a caveat: constructing the preconditioner itself takes time as well:
> LU = @btime crout_ilu($A, τ = 0.1);
611.019 ms
So all in all the problem is solved about 36% faster once the time for constructing the preconditioner is included. However, if we have multiple right-hand sides for the same matrix, we can construct the preconditioner only once and reuse it. Even when the matrix changes slightly you could reuse the ILU factorization; the latter is exactly what happens in Jacobi-Davidson.
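As a sketch, with a few hypothetical extra right-hand sides for the matrix from above, the factorization is built once and only the iterative solves are repeated:

```julia
# Construct the preconditioner once...
LU = crout_ilu(A, τ = 0.1)

# ...and reuse it for several right-hand sides of the same matrix.
for k = 1 : 3
    b_k = A * rand(N)
    x_k = bicgstabl(A, b_k, 2, Pl = LU, max_mv_products = 2000)
end
```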