Interior eigenvalues in Jacobi-Davidson

From the Arnoldi method we know that Ritz values tend to converge to exterior eigenvalues; similar behaviour is observed in Jacobi-Davidson. This is an issue when restarting the method: a Ritz value close to a specified interior target is not necessarily a good approximation of an eigenvalue in that neighbourhood; the Ritz value might as well be on its way to an exterior eigenvalue, and in that case we would not want the Ritz vector to be retained at restart. An alternative is to use harmonic Ritz pairs.

June 6, 2017 - 14 minute read -
gsoc julia

Ritz pairs turn out to be good approximate eigenpairs when one is interested in exterior eigenpairs. If we are interested in interior eigenpairs (somewhere inside the convex hull of the spectrum), we must ensure that we do not restart Jacobi-Davidson with a set of Ritz vectors that correspond to Ritz values converging to exterior eigenvalues. The simple criterion of Ritz values being close to our target might give us a low-quality search space at restart. Maybe we could make a better judgement about which Ritz vectors to keep if we inspected their residuals, but that strategy would be far too expensive.

Shift-and-invert is a well-known trick to solve the above problem: rather than solving the eigenvalue problem $Ax = \lambda x$ for $\lambda$ close to $\tau$, we reformulate the problem to solving $(A - \tau I)^{-1}x = \theta^{-1} x$ with $\theta = \lambda - \tau$. The eigenvectors remain unchanged, but the eigenvalues close to our target are transformed to exterior eigenvalues. The message is that we do not change our methods; rather, we change the problem itself.

However, changing the problem like this can be very costly: iterating with a matrix $(A - \tau I)^{-1}$ requires solving a linear system at each iteration. In fact, the Arnoldi method requires accurate solves, because otherwise the underlying Krylov subspace would be inexact.
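As a quick illustration of the spectral transformation (with a hypothetical diagonal matrix, so the spectrum is known in advance), eigenvalues of $A$ near the target become the largest-magnitude eigenvalues of $(A - \tau I)^{-1}$:

```julia
using LinearAlgebra

# Hypothetical example: a diagonal matrix, so the spectrum is known.
A = Diagonal([1.0, 2.0, 3.0, 10.0])
τ = 2.1                                # target near the interior eigenvalue 2.0

# The eigenvalues of (A - τI)⁻¹ are 1/(λ - τ): eigenvalues λ near the
# target τ are mapped to exterior (largest magnitude) eigenvalues.
θinv = eigvals(inv(Matrix(A - τ * I)))

# Transforming the dominant eigenvalue back recovers λ = 2.0.
λ = τ + 1 / θinv[argmax(abs.(θinv))]
```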

Harmonic Ritz values

The main question is whether we can get higher quality approximations to eigenvalues (without doing much work), so that we can restart with a search space that does not contain partly converged exterior eigenvectors.

One way to go is to look at the Ritz values of $(A - \tau I)^{-1}$ with respect to a specially chosen subspace $(A - \tau I)V$, so that the $(A - \tau I)^{-1}$ term drops out:

$$(A - \tau I)^{-1}x - \theta^{-1} x \perp (A - \tau I)V \quad \text{for } x = (A - \tau I)Vy$$

is equivalent to the requirement

$$(A - \tau I)z - \theta z \perp (A - \tau I)V \quad \text{for } z = Vy.$$

The latter is a Petrov-Galerkin projection: the search space is $V$ and the test space is $(A - \tau I)V$. We refer to a pair $(\theta, Vy)$ as a harmonic Ritz pair.

The important thing is that $\theta$ is identical in both expressions. The second shows a practical means to compute it, while the first shows its interpretation as a Ritz value of $(A - \tau I)^{-1}$.

Now the line of thought is as follows: if the Ritz values $\theta^{-1}$ converge from within the convex hull of the spectrum of $(A - \tau I)^{-1}$ towards its exterior eigenvalues, then conversely a reciprocal $\theta$ becomes small only when it approximates an eigenvalue of $(A - \tau I)$ well. This in turn means that $\theta + \tau$ can only be close to an eigenvalue of $A$ near $\tau$ if it is nearly converged. And that is what we wanted to ensure at restart.

All in all we work with the practical Petrov-Galerkin projection. Let’s write $\tilde{W} := (A - \tau I)V$ so that the last equation can be rewritten as the generalized eigenvalue problem

$$\tilde{W}^*\tilde{W}y = \theta \tilde{W}^*Vy.$$
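A small sketch of this extraction. The matrix, target and search space below are random stand-ins (a symmetric matrix is used only so that the harmonic Ritz values come out real); in the method itself $V$ is of course the Jacobi-Davidson search space:

```julia
using LinearAlgebra

# Stand-in data: symmetric A so the harmonic Ritz values are real.
n, m = 100, 10
B = rand(n, n); A = B + B'
τ = 0.5
V = Matrix(qr(rand(n, m)).Q)    # orthonormal search space, assumed given
Wt = A * V - τ * V              # Wt plays the role of W̃ = (A - τI)V

# Harmonic Ritz values θ and primitive vectors y from the projected
# generalized eigenproblem  W̃*W̃ y = θ W̃*V y.
F = eigen(Wt' * Wt, Wt' * V)
θ, Y = F.values, F.vectors

# Approximate eigenpairs of A: values τ .+ θ, vectors V * Y.
```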

For stability reasons we like to work with orthonormal matrices. The Gram-Schmidt procedure allows us to perform the QR decomposition of $\tilde{W}$ into $\tilde{W} = WM_A$ with $W$ orthonormal and $M_A$ upper triangular. If we also define $M := W^*V$ similarly to what we had before, then the generalized eigenvalue problem becomes

$$M_Ay = \theta My$$

assuming that $M_A$ is invertible (which is the case as long as the columns of $\tilde{W}$ are independent).

So we can just obtain harmonic Ritz pairs by computing eigenpairs $(\theta, y)$ from the small generalized eigenvalue problem, and transform them back to approximate eigenpairs by setting $\lambda = \tau + \theta$ and $z = Vy$.

In the implementation we will explicitly build these two small matrices $M$ and $M_A$.
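For instance (again with random stand-in data, and with the QR decomposition done in one shot, whereas the implementation builds $V$, $W$, $M$ and $M_A$ up column by column as the search space grows):

```julia
using LinearAlgebra

# Stand-in data; symmetric A so the harmonic Ritz values are real.
n, m = 100, 10
B = rand(n, n); A = B + B'
τ = 0.5
V = Matrix(qr(rand(n, m)).Q)

F = qr(A * V - τ * V)           # thin QR: W̃ = W * M_A
W = Matrix(F.Q)                 # orthonormal test space basis
M_A = F.R                       # upper triangular
M = W' * V

# Harmonic Ritz values from the small pencil M_A y = θ M y.
θ = eigen(M_A, M).values
```

These values agree with the ones obtained from the larger projected problem $\tilde{W}^*\tilde{W}y = \theta \tilde{W}^*Vy$ whenever $M_A$ is invertible.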

Schur vectors vs eigenvectors

We have used a Schur decomposition in favour of an eigendecomposition of the low-dimensional problem to make it easier to shrink the search subspace. Let’s see how this works with harmonic Ritz values.

We compute the generalized Schur (QZ) decomposition $M_AS^R = S^LT^A$ and $MS^R = S^LT$ for unitary matrices $S^R, S^L$ and upper triangular matrices $T^A, T$. The eigenvalues $\theta$ are now the ratios $T_{ii}^A / T_{ii}$ and can be reordered; in Julia this is done with the familiar `ordschur!` function. We must order them from small to large in absolute magnitude. It is not hard to verify that the first column $s_1^R$ of $S^R$ is an eigenvector with $T_{11}^A / T_{11}$ as its eigenvalue.

Now if the $m_{\min}$ harmonic Ritz values we want to keep are moved to the front, we can easily restart by updating both our search space as $V \leftarrow VS^R[:, 1 : m_{\min}]$ and our test space as $W \leftarrow WS^L[:, 1 : m_{\min}]$.
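A sketch of such a restart, using `schur` and `ordschur!` on the small pencil $(M_A, M)$; the data and the value of $m_{\min}$ are stand-ins. Note that in Julia's convention $M_A = QSZ^*$ and $M = QTZ^*$, so $Z$ plays the role of $S^R$ and $Q$ of $S^L$:

```julia
using LinearAlgebra

# Stand-in data and a stand-in m_min; V, W, M and M_A as before.
n, m, m_min = 100, 10, 4
B = rand(n, n); A = B + B'
τ = 0.5
V = Matrix(qr(rand(n, m)).Q)
F = qr(A * V - τ * V)
W, M_A = Matrix(F.Q), F.R
M = W' * V

G = schur(M_A, M)               # QZ: M_A = Q*S*Z', M = Q*T*Z'
θ = G.alpha ./ G.beta           # harmonic Ritz values
keep = abs.(θ) .<= sort(abs.(θ))[m_min]   # m_min smallest in magnitude
ordschur!(G, keep)              # reorder them to the front

V = V * G.Z[:, 1:m_min]         # shrink the search space
W = W * G.Q[:, 1:m_min]         # shrink the test space
```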

Rayleigh quotient vs harmonic Ritz value

Harmonic Ritz values are an excellent choice to determine which Ritz vectors to use at restart. They are, however, suboptimal for expansion. A few posts ago we wanted the residual $r$ to be perpendicular to the approximate eigenvector $u$, so that when solving the correction equation $(I - uu^*)(A - \phi I)t = -r$ with a Krylov method, $t$ would automatically be perpendicular to $u$.

With harmonic Ritz values this might not be the case. Therefore we can choose to use the harmonic Ritz values $\theta$ when it comes to shrinking the search space, yet use the Rayleigh quotients $\phi$ when it comes to expanding it.

We obtain it by setting $(u, r) = 0$; in our case $u = Vs_1^R$ and $r = (A - \tau I)u - \phi u$. After some manipulation this gives $\phi = \bar{T}_{11} T_{11}^A$, where the bar denotes complex conjugation.
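We can check this formula numerically (stand-in data once more; a symmetric matrix keeps the real QZ form triangular, so $T_{11}^A$ and $T_{11}$ are plain scalars, with $G.S$ playing the role of $T^A$ and $G.T$ of $T$):

```julia
using LinearAlgebra

# Stand-in data; symmetric A so all harmonic Ritz values are real
# and the real QZ form has no 2x2 blocks.
n, m = 100, 10
B = rand(n, n); A = B + B'
τ = 0.5
V = Matrix(qr(rand(n, m)).Q)
F = qr(A * V - τ * V)
W, M_A = Matrix(F.Q), F.R
M = W' * V

G = schur(M_A, M)               # QZ: M_A = Q*S*Z', M = Q*T*Z'
u = V * G.Z[:, 1]               # u = V s₁ᴿ, a unit vector

# Rayleigh quotient read off from the QZ factors: ϕ = conj(T₁₁) * Tᴬ₁₁.
ϕ = conj(G.T[1, 1]) * G.S[1, 1]
```

This matches the Rayleigh quotient $u^*(A - \tau I)u$ computed directly, since $u$ has unit norm.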