Jacobi-Davidson

As part of Google Summer of Code I'm implementing several iterative methods for linear systems of equations and eigenvalue problems in Julia. So far I have studied Jacobi-Davidson and have a poor man's version working. This first post contains my notes on the method.

June 1, 2017 - 11 minute read - gsoc, julia

Jacobi-Davidson is an iterative method for approximately solving the generalized eigenvalue problem

$$Ax = \lambda Bx$$

for a handful of eigenpairs. The matrices are huge and can be general. For ease of discussion we take $B = I$.

Its major selling point is that it can compute interior eigenvalues without applying $A^{-1}$ exactly in every iteration; approximate solves are enough. For very large matrices an exact solve per iteration is typically infeasible.

Other advantages include the possibility to target any point in the complex plane; freedom in the selection procedure (ordinary, harmonic, or refined Ritz pairs); and the possibility to provide a good initial search subspace when the directions of the eigenvectors are approximately known.

Its main disadvantage is that its iterations are expensive in comparison to Arnoldi and the like.

Subspace methods & Ritz pairs

Jacobi-Davidson is a subspace method, which means that it selects an approximate eigenvector from a low-dimensional search subspace $\mathcal{V} \subset \mathbb{C}^n$. Clearly one cannot expect to find the exact eigenvector in a small subspace; therefore one requires the residual to be perpendicular to a test space $\mathcal{W} \subset \mathbb{C}^n$. For now we will assume that these two coincide: $\mathcal{V} = \mathcal{W}$. This is called a Galerkin condition.

If the columns of the matrix $V \in \mathbb{C}^{n \times m}$ form an orthonormal basis for $\mathcal{V}$, then the above can be summarized as

$$Au - \theta u \perp \mathcal{V} \text{ with } u = Vy \in \mathcal{V} \quad \Longleftrightarrow \quad V^*AVy = \theta y$$

for $y \in \mathbb{C}^m$. Finding $(y, \theta)$ is cheap, because the matrix $V^*AV$ is small. An approximate eigenpair is then obtained as $(\theta, Vy)$ and is called a Ritz pair. We pick the approximate eigenpair that is of most interest; for instance the one with Ritz value closest to a specified target.
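To make the extraction step concrete, here is a minimal Julia sketch (the function name and the target argument `τ` are my own, not part of any package): given a basis `V` with orthonormal columns and a target `τ`, it solves the small projected eigenproblem and lifts the selected eigenvector back to the full space.

```julia
using LinearAlgebra

# Sketch only: extract the Ritz pair whose Ritz value lies closest to a target τ.
# V has orthonormal columns spanning the search space; m = size(V, 2) is small.
function ritz_pair_near_target(A, V, τ)
    M = V' * A * V                        # small m×m projected matrix V*AV
    F = eigen(M)                          # cheap, since m is tiny compared to n
    _, i = findmin(abs.(F.values .- τ))   # Ritz value closest to the target
    θ = F.values[i]
    u = V * F.vectors[:, i]               # lift back: Ritz vector u = V y
    return θ, u
end
```

In a real implementation the projected matrix $V^*AV$ would be updated incrementally as the basis grows rather than recomputed from scratch.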

Note that iterative methods build these search and test spaces incrementally: each iteration the spaces are expanded by a new basis vector.

Jacobi-Davidson’s search space

Now what really defines Jacobi-Davidson is the way in which it expands the search subspace: it extends it with a vector that roughly corrects the error in the best approximate eigenvector so far. Suppose we find a Ritz pair $(\theta, u)$ in our search subspace; then we can improve on it as follows: find a vector $t \perp u$ and a scalar $\varepsilon$ such that the actual eigenvector is $x = u + t$ and the actual eigenvalue is $\lambda = \theta + \varepsilon$. Let $r := Au - \theta u$ be the residual. Substituting $x$ and $\lambda$ into $Ax = \lambda x$ and expanding gives the identity

$$r + (A - \theta I)t - \varepsilon u - \varepsilon t = 0.$$

The term $\varepsilon t$ should be really small, as it is the product of two errors, so we disregard it. Premultiplying with the projection matrix $(I - uu^*)$, which removes components in the direction of $u$, kills the $\varepsilon u$ term and leaves $r$ untouched (recall that $r \perp u$ by the Galerkin condition), so we find an equation in terms of $t$ only:

$$(I - uu^*)(A - \theta I)t = -r \quad \text{with } t \perp u.$$

This is the correction equation. Having solved it for $t$, we could in principle decide to just update our approximate eigenvector $u$. However, the point of Jacobi-Davidson is rather to extend the search subspace with $t$ and repeat the whole process until the residual is small enough.

Lastly, note that if our matrix $A$ is Hermitian, we should try to preserve that property. One way is to apply the projection matrix on the other side as well: let $\tilde{A} := (I - uu^*)(A - \theta I)(I - uu^*)$ and rewrite the correction equation as

$$\tilde{A}t = -r \quad \text{with } t \perp u.$$

Since $t \perp u$ anyway, this does not influence the solution.
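As a small illustration (my own sketch, not the GSoC code), $\tilde{A}$ can be applied matrix-free, without ever forming the projector $(I - uu^*)$ as an $n \times n$ matrix:

```julia
using LinearAlgebra

# Sketch: apply Ã = (I - uu*)(A - θI)(I - uu*) to a vector v, matrix-free.
# Assumes u has unit norm.
function apply_deflated(A, θ, u, v)
    w = v - u * (u' * v)        # (I - uu*) v
    w = A * w - θ * w           # (A - θI) w
    return w - u * (u' * w)     # (I - uu*) w
end
```

This is the kind of operator application the inner iterative solver below needs.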

Solving the correction equation (approximately)

Remember the selling point of Jacobi-Davidson? We aren’t even going to solve the correction equation exactly, but only approximately, using a handful of iterations of an iterative method.

It turns out that Krylov methods are an excellent choice for this. The condition $t \perp u$ is automatically satisfied, since the Krylov subspace

$$\mathcal{K}_s(\tilde{A}, r) = \text{span} \{ r, \tilde{A}r, \cdots, \tilde{A}^{s-1}r \}$$

as a whole is orthogonal to $u$: we have $r \perp u$, and $\tilde{A}$ maps every vector into the orthogonal complement of $u$ thanks to the outer projection. To keep the iterates in this subspace you should take $0$ as the initial guess of the solution.
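To illustrate, below is a bare-bones, hand-rolled GMRES sketch (in practice one would call an existing Krylov solver; the function name is mine). It performs $s$ Arnoldi steps with $\tilde{A}$, implicitly starts from the zero guess, and returns the residual-minimizing correction in $\mathcal{K}_s(\tilde{A}, r)$, which is therefore orthogonal to $u$.

```julia
using LinearAlgebra

# Sketch of a few GMRES steps for Ãt = -r, starting from t = 0.
# `apply_Atilde` applies the deflated operator; no breakdown handling.
function approximate_correction(apply_Atilde, r, s)
    n = length(r)
    Q = zeros(eltype(r), n, s + 1)     # orthonormal Krylov basis
    H = zeros(eltype(r), s + 1, s)     # small Hessenberg matrix
    β = norm(r)
    Q[:, 1] = -r / β                   # right-hand side is b = -r
    for j in 1:s
        w = apply_Atilde(Q[:, j])
        for i in 1:j                   # modified Gram-Schmidt
            H[i, j] = dot(Q[:, i], w)
            w -= H[i, j] * Q[:, i]
        end
        H[j + 1, j] = norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    end
    y = H \ [β; zeros(s)]              # least-squares solve of the projected problem
    return Q[:, 1:s] * y               # t lies in K_s(Ã, r), hence t ⊥ u
end
```

Combined with the `apply_deflated` sketch from the previous section, a call could look like `t = approximate_correction(v -> apply_deflated(A, θ, u, v), r, 5)`.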

Solving the correction equation for testing purposes

Julia's multiple dispatch allows some inversion of control when solving the correction equation. In the Jacobi-Davidson routine itself we don't bother about how the correction $t$ is obtained; we simply specify that it should be produced. So we pass the correction equation solver as an argument to the Jacobi-Davidson routine, and simply call it when needed.
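One way this can look is sketched below, with a plain function argument (the actual code may dispatch on a solver type instead; all names here are illustrative): the outer routine receives a `solve_correction` function, calls it, and expands the basis with the result.

```julia
using LinearAlgebra

# Illustrative sketch: the Jacobi-Davidson loop only knows that
# `solve_correction(A, θ, u, r)` returns some correction t.
function expand_search_space(V, A, θ, u, r, solve_correction)
    t = solve_correction(A, θ, u, r)   # exact or approximate, the caller decides
    t -= V * (V' * t)                  # orthogonalize against the current basis
    return hcat(V, t / norm(t))        # expanded orthonormal basis
end
```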

It is useful to have an exact correction equation solver, so you can isolate the components of the algorithm when writing tests. The way to go is to use a direct solver for the augmented linear system

$$\begin{bmatrix} A - \theta I & u \\ u^* & 0 \end{bmatrix} \begin{bmatrix} t \\ z \end{bmatrix} = \begin{bmatrix} -r \\ 0 \end{bmatrix}$$

to find $t$.
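A sketch of such an exact solver (again with a made-up name, matching the `solve_correction(A, θ, u, r)` signature assumed above): build the bordered matrix once and let a direct solve produce $t$ together with the auxiliary scalar $z$.

```julia
using LinearAlgebra

# Sketch of an exact correction-equation solver via the bordered system;
# only the first n entries of the solution form the correction t.
function solve_correction_exactly(A, θ, u, r)
    n = length(u)
    K = [(A - θ * I) u; u' 0]   # (n+1)×(n+1) bordered matrix
    tz = K \ [-r; 0]            # direct solve for [t; z]
    return tz[1:n]
end
```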

What’s next?

So far we've seen that, given a search and test subspace, we can cheaply compute a Ritz pair as an approximate eigenpair. Jacobi-Davidson is characterized by roughly computing a correction to this approximation and using it to extend the search space.

What we haven’t seen so far is:

  • How to find the next eigenpair once a previous one is converged. (This is really easy: simply remove it from the search space and subsequently update the correction equation by requiring orthogonality to the converged vectors as well.)
  • Implementation tricks for solving the small-dimensional eigenvalue problem. (Actually you want to use a Schur decomposition rather than an eigenvalue decomposition.)
  • How to restart the method when the search subspace becomes too big. (If you use Schur vectors, then this is identical to what IRAM does.)
  • How to incorporate preconditioners when solving the correction equation. (You need to apply the deflation to the preconditioner as well, but surprisingly this does not require many extra matrix-vector multiplications.)
  • How to extract different types of Ritz pairs from the search subspace. (It turns out harmonic Ritz pairs are better for interior eigenvalue problems. The problem with ordinary Ritz values is that they are typically on their way to converging to an exterior eigenvalue, and therefore the corresponding approximate eigenvectors can be useless.)
  • Actual Julia code. (At this point my code is somewhat Matlab-ish, just coding the ideas to verify I get everything.)

I hope to write more about these things very soon and actually show code.