Applied Mathematics 205

Unit 5. Eigenvalue Problems
and Iterative Methods


Lecturer: Petr Karnakov

November 21, 2022

Outline

  • In this Unit, we will discuss
    • methods to compute eigenvalues and eigenvectors of matrices
    • iterative methods to solve linear systems
  • Eigenvalue problems have applications in stability analysis,
    vibration analysis, and are useful to study properties of matrices
  • Iterative methods are better suited for large-scale problems
    and parallel computation than direct methods (e.g. Gaussian elimination)

Eigenvalues and Eigenvectors

  • Consider a matrix $A\in\mathbb{R}^{n\times n}$
  • Vector $v\in\mathbb{R}^{n}$ is called an eigenvector of $A$ if \[ Av = \lambda v \] for a scalar $\lambda\in\mathbb{R}$
  • The corresponding $\lambda$ is called an eigenvalue of $A$
  • Pair $(\lambda,v)$ is called an eigenpair
  • The prefix comes from German “eigen” meaning “own”
  • In the following, we will also consider complex matrices $A\in\mathbb{C}^{n\times n}$,
    eigenvectors $v\in\mathbb{C}^n$, and eigenvalues $\lambda\in\mathbb{C}$

Motivation: Eigenvalue Problems

  • The definition of eigenvalues extends to linear operators in general,
    including differential operators in a function space
  • Recall the wave equation describing the vibration of a string \[ u_{tt}-c^2u_{xx}=0 \] with zero Dirichlet boundary conditions $u(0, t)=u(1, t)=0$
  • Eigenfunctions $U(x)$ of the second-derivative operator $\frac{\partial^2}{\partial x^2}$, found from the problem \[ U_{xx}=\lambda U \] correspond to solutions of the wave equation called standing waves \[ u(x,t)=e^{i\omega t} U(x) \]

Motivation: Eigenvalue Problems

  • By substituting the ansatz $u(x,t)=e^{i\omega t} U(x)$ into the wave equation \[ \frac{\partial^2}{\partial t^2}\Big(e^{i\omega t} U(x)\Big) - c^2\frac{\partial^2}{\partial x^2}\Big(e^{i\omega t} U(x)\Big) = 0 \] and using that $U(x)$ is an eigenfunction, we get \[ (-\omega^2 - c^2 \lambda) e^{i\omega t} U(x) = 0 \]
  • So the wave equation is satisfied for \[ \omega = c\sqrt{-\lambda} \]

Motivation: Eigenvalue Problems

  • Eigenfunctions $U(x)$ of the second-derivative operator \[ U_{xx}=\lambda U \] that satisfy the boundary conditions $U(0)=U(1)=0$ are given by \[ U_k(x)=\sin{(\pi k x)} \qquad k=1,2,\dots \] with eigenvalues $\lambda_k=-\pi^2k^2$

  [Plots of the first four eigenfunctions: $\sin(\pi x)$, $\sin(2\pi x)$, $\sin(3\pi x)$, $\sin(4\pi x)$]

Motivation: Eigenvalue Problems

  • Wave equation with forcing [Unit 3] \[ u_{tt} - u_{xx} = f \]
  • Energy $\int{u_t^2dx}$
  • Sound $\int{u_x^2dx}$ (change in arc length)
  • Forcing $f = x \sin(\omega(t) t)$ with $\omega(t)=at + b$

Motivation: Eigenvalue Problems

  • This is an example of resonance:
    the system is able to store energy at certain frequencies
  • Other systems and phenomena related to resonance
    • pendulums
    • natural vibration modes of structures
    • musical instruments
    • lasers
    • nuclear magnetic resonance (NMR)

Eigenvalue Problems

  • Eigenvalues and eigenvectors of real-valued matrices can be complex
  • Therefore, we will generally work with complex-valued matrices and vectors
  • For $A \in \mathbb{C}^{n\times n}$, consider the eigenvalue problem:
    find $(\lambda,v) \in \mathbb{C}\times\mathbb{C}^n$ such that \[ \begin{aligned}Av &= \lambda v\\ \|v\|_2 &= 1\end{aligned} \]
  • The 2-norm of a complex vector $v\in\mathbb{C}^n$ is defined
    using absolute values of components (as opposed to just $(v_k)^2$):
    \[ \textstyle \|v\|_2 = \big(\sum_{k=1}^n |v_k|^2\big)^{1/2} \]

Eigenvalues and Eigenvectors

  • This problem can be reformulated as \[ (A - \lambda{I})v = 0 \]
  • We know this system has a non-trivial solution
    if and only if $(A - \lambda I)$ is singular, therefore \[ \det(A - \lambda{I}) = 0 \]
  • The polynomial $p(z) = \det(A - z{I})$ is called
    the characteristic polynomial of $A$
  • Eigenvalues $\lambda$ are roots of $p(z)$

Characteristic Polynomial

  • By the fundamental theorem of algebra, we can factorize $p(z)$ as \[ p(z) = c_n(z - \lambda_1)(z - \lambda_2)\cdots(z - \lambda_n) \] where the roots $\lambda_i \in \mathbb{C}$ need not be distinct
  • Note also that complex eigenvalues of a real matrix $A \in \mathbb{R}^{n\times n}$
    must occur as complex conjugate pairs
  • That is, if $\lambda = \alpha + i\beta$ is an eigenvalue,
    then so is its complex conjugate $\overline\lambda = \alpha - i\beta$

Characteristic Polynomial

  • This follows from the fact that for a polynomial $p$ with real coefficients,
    $p(\overline{z}) = \overline{p(z)}$ for any $z \in \mathbb{C}$: \[ p(\overline{z}) = \sum_{k=0}^n c_k (\overline{z})^k = \sum_{k=0}^n c_k \overline{z^k} = \overline{\sum_{k=0}^n c_k z^k} = \overline{p(z)} \]
  • Therefore, if $w \in \mathbb{C}$ is a root of $p$, then so is $\overline{w}$, since \[ 0 = p(w) = \overline{p(w)} = p(\overline{w}) \]

Companion Matrix

  • We have seen that every matrix has an associated characteristic polynomial
  • Conversely, every polynomial has an associated companion matrix
  • The companion matrix $C_n$ of a polynomial $p \in \mathbb{P}_n$
    is a matrix whose eigenvalues are the roots of $p$
  • Divide $p$ by its leading coefficient to get a monic polynomial,
    i.e. with leading coefficient equal to 1 (this doesn’t change the roots) \[ p_\text{monic}(z) = c_0 + c_1 z + \cdots + c_{n-1}z^{n-1} + z^n \]

Companion Matrix

  • Then $p_\text{monic}$ is the characteristic polynomial
    of the following $n\times n$ matrix \[ C_n = \left[ \begin{array}{ccccc} 0 & 0 & \cdots & 0 & -c_0\\ 1 & 0 & \cdots & 0 & -c_1\\ 0 & 1 & \cdots & 0 & -c_2\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 1 & -c_{n-1} \end{array} \right] \]
  • Therefore, $C_n$ is the companion matrix of $p$

Companion Matrix

  • Let us show this for the $n=3$ case
  • Consider \[ p_{\text{monic}}(z) = c_0 + c_1z + c_2 z^2 + z^3 \] for which \[ \small C_3 = \left[ \begin{array}{ccc} 0 & 0 & -c_0\\ 1 & 0 & -c_1\\ 0 & 1 & -c_2 \end{array} \right] \]
  • Recall that the determinant of a $3 \times 3$ matrix is \[ \small\textstyle \det \left[ \begin{array}{ccc} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{array} \right] = \begin{array}{c} \\ a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} \\ - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} \end{array} \]

Companion Matrix

  • Substituting entries of $C_3$ then gives \[ \det(z {\rm I} - C_3) = c_0 + c_1 z + c_2 z^2 + z^3 = p_{\text{monic}}(z) \]
  • This link between matrices and polynomials is used by numpy.roots()
    that computes roots of a polynomial as eigenvalues of the companion matrix
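  • As a quick illustration (a sketch, not the actual numpy.roots() implementation),
    we can build the companion matrix of a cubic and compare its eigenvalues
    with the roots returned by numpy.roots()

    # Build the companion matrix of p(z) = c0 + c1 z + c2 z^2 + z^3
    # and check that its eigenvalues coincide with the roots of p.
    import numpy as np

    c = np.array([2.0, -3.0, 1.0])   # hypothetical coefficients c0, c1, c2
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)       # ones on the first subdiagonal
    C[:, -1] = -c                    # last column holds -c0, -c1, -c2

    print(np.sort(np.linalg.eigvals(C)))
    # numpy.roots() expects coefficients from the highest degree down
    print(np.sort(np.roots([1.0, c[2], c[1], c[0]])))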

Eigenvalue Decomposition

  • Let $\lambda$ be an eigenvalue of $A \in \mathbb{C}^{n\times n}$; the set of all eigenvalues is called the spectrum of $A$
  • The algebraic multiplicity of $\lambda$ is the multiplicity of the corresponding root of the characteristic polynomial
  • The geometric multiplicity of $\lambda$ is the number of linearly independent eigenvectors corresponding to $\lambda$
  • For example, for $A = {\rm I}$, $\lambda = 1$ is an eigenvalue with algebraic and geometric multiplicity of $n$

Eigenvalue Decomposition

  • Theorem: The geometric multiplicity of an eigenvalue
    is less than or equal to its algebraic multiplicity
  • If $\lambda$ has geometric multiplicity strictly less than algebraic multiplicity,
    then $\lambda$ is said to be defective
  • We say a matrix is defective if it has at least one defective eigenvalue

Eigenvalue Decomposition

  • For example, the matrix \[ A = \left[ \begin{array}{ccc} 2 & 1 & 0\\ 0 & 2 & 1\\ 0 & 0 & 2 \end{array} \right] \] has one eigenvalue with algebraic multiplicity of 3
    and geometric multiplicity of 1
    >>> import numpy as np
    >>> a = np.array([[2, 1, 0], [0, 2, 1], [0, 0, 2]])
    >>> d, v = np.linalg.eig(a)
    >>> d
    array([ 2.,  2.,  2.])
    >>> v
    array([[  1.00000e+00,  -1.00000e+00,   1.00000e+00],
           [  0.00000e+00,   4.44089e-16,  -4.44089e-16],
           [  0.00000e+00,   0.00000e+00,   1.97215e-31]])

Eigenvalue Decomposition

  • Let $A \in\mathbb{C}^{n\times n}$ be a nondefective matrix, then it has a full set of $n$ linearly independent eigenvectors $v_1, v_2, \ldots, v_n \in \mathbb{C}^n$
  • Suppose $V \in \mathbb{C}^{n\times n}$ contains the eigenvectors of $A$ as columns,
    and let $D = \mathop{\mathrm{diag}}(\lambda_1,\ldots,\lambda_n)$
  • Then $A v_i = \lambda_i v_i$, $i=1,2,\ldots,n$ is equivalent to $AV = VD$
  • Since we assumed $A$ is nondefective, we can invert $V$ to obtain \[ \htmlClass{color5}{ A = VDV^{-1}} \]
  • This is the eigendecomposition of $A$
  • This shows that for a non-defective matrix, $A$ is diagonalized by $V$

Eigenvalue Decomposition

  • We introduce the conjugate transpose $A^\ast \in \mathbb{C}^{n\times m}$ of a matrix $A \in \mathbb{C}^{m\times n}$ \[ (A^\ast)_{ij} = \overline{A_{ji}}, \quad i = 1,2,\ldots,m, ~ j = 1,2,\ldots,n \]
  • A matrix is said to be hermitian if $A = A^\ast$
    (this generalizes matrix symmetry)
  • A matrix is said to be unitary if $AA^\ast = {\rm I}$
    (this generalizes the concept of an orthogonal matrix)
  • Also, for $v \in\mathbb{C}^n$, $\|v\|_2 = \sqrt{v^*v}$

Eigenvalue Decomposition

  • For a numpy matrix, the .T attribute gives the transpose,
    while the .getH() method gives the conjugate transpose
    (for a plain numpy array, use .conj().T)
    
    >>> import numpy as np
    >>> a = np.matrix([[1+1j, 2+3j], [0, 4]])
    >>> a.T
    matrix([[ 1.+1.j,  0.+0.j],
            [ 2.+3.j,  4.+0.j]])
    >>> a.getH()
    matrix([[ 1.-1.j,  0.-0.j],
            [ 2.-3.j,  4.-0.j]])
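
  • With a plain numpy array (np.matrix is discouraged in recent NumPy versions),
    the conjugate transpose can be obtained as .conj().T, for example

    import numpy as np
    b = np.array([[1 + 1j, 2 + 3j], [0, 4]])
    print(b.T)          # transpose only
    print(b.conj().T)   # conjugate transpose, same values as a.getH() above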
    

Eigenvalue Decomposition

  • In some cases, the eigenvectors of $A$
    can be chosen such that they are orthonormal \[ v_i^\ast v_j = \begin{cases} 1, \quad i=j\\ 0, \quad i\neq j \end{cases} \]
  • In such a case, the matrix of eigenvectors $Q$ is unitary,
    and hence $A$ can be unitarily diagonalized \[ A = Q D Q^\ast \]

Eigenvalue Decomposition

  • Theorem: A hermitian matrix is unitarily diagonalizable,
    and its eigenvalues are real
  • But hermitian matrices are not the only matrices that can be unitarily diagonalized
  • Matrix $A \in\mathbb{C}^{n\times n}$ is called normal if \[ A^\ast A = A A^\ast \]
  • Theorem: A matrix is unitarily diagonalizable if and only if it is normal

Gershgorin’s Theorem

  • Due to the link between eigenvalues and polynomial roots,
    in general one has to use iterative methods to compute eigenvalues
    (recall that polynomials of degree higher than four cannot be solved in radicals)
  • However, it is possible to gain some information about eigenvalue locations more easily from Gershgorin’s Theorem
  • Let $D(c, r) = \{ x \in \mathbb{C}: |x - c| \leq r\}$ denote a disk in the complex plane centered at $c$ with radius $r$
  • For a matrix $A \in \mathbb{C}^{n\times n}$, disk $D(a_{ii}, R_i)$ is called a Gershgorin disk, where \[ \textstyle R_i = \sum_{\substack{j=1\\j\neq i}}^n |a_{ij}| \]

Gershgorin’s Theorem

  • Theorem: All eigenvalues of $A \in \mathbb{C}^{n\times n}$ are contained
    within the union of all $n$ Gershgorin disks of $A$
  • Proof: Assume that $Av=\lambda v$, and $i=\mathrm{argmax}_j|v_j|$. \[ \left|\lambda -a_{ii}\right|=\left|\sum _{j\neq i}{\frac {a_{ij}v_{j}}{v_{i}}}\right|\leq \sum _{j\neq i}\left|a_{ij}\right|=R_{i}\quad\square \]
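  • A small numerical check of the theorem (an illustrative sketch with a random matrix):

    # Verify that every eigenvalue lies in at least one Gershgorin disk.
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))

    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)  # R_i = sum_{j != i} |a_ij|

    for lam in np.linalg.eigvals(A):
        assert np.any(np.abs(lam - centers) <= radii + 1e-12)
    print("all eigenvalues lie in the union of the Gershgorin disks")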

Gershgorin’s Theorem

  • Recall that a matrix is diagonally dominant if \[ \textstyle |a_{ii}| > \sum_{\substack{j=1\\j\neq i}}^n |a_{ij}|, \quad \text{for}~i=1,2,\ldots,n \]
  • It follows from Gershgorin’s Theorem that a diagonally dominant matrix cannot have a zero eigenvalue, hence must be invertible
  • For example, the finite difference discretization matrix of the differential operator $-\nabla^2 + {\rm I}$ is diagonally dominant
  • In two dimensions, $(-\nabla^2 + {\rm I})u = -u_{xx} - u_{yy} + u$
    (each row of the corresponding discretization matrix contains
    the diagonal entry $4/h^2 + 1$ and four off-diagonal entries equal to $-1/h^2$)

Algorithms for Eigenvalue Problems

Power Method

  • The power method is perhaps the simplest eigenvalue algorithm
  • It finds the eigenvalue of $A \in \mathbb{C}^{n\times n}$ with largest absolute value

    1:$\hspace{0em}$choose $x_0 \in \mathbb{C}^n$ arbitrarily
    2:$\hspace{0em}$for $k = 1,2,\ldots$ do
    3:$\hspace{1.2em}x_k = A x_{k-1}$
    4:$\hspace{0em}$end for

  • Question: How does this algorithm work?

Power Method

  • Assume that $A$ is nondefective, so that the eigenvectors $v_1,v_2,\ldots,v_n$
    provide a basis for $\mathbb{C}^n$
  • Assume that the eigenvalues are ordered: $|\lambda_1| \leq |\lambda_2| \leq \cdots \leq |\lambda_n|$
  • Therefore there exist coefficients $\alpha_i$ such that $x_0 = \sum_{j=1}^n \alpha_j v_j$
  • Then, we have \[ \begin{aligned}\small x_k &= Ax_{k-1} = A^2x_{k-2} = \cdots = A^kx_0\\ &= A^k \left(\sum_{j=1}^n \alpha_j v_j\right) = \sum_{j=1}^n \alpha_j A^k v_j = \sum_{j=1}^n \alpha_j \lambda_j^k v_j \\ &= \lambda_n^k\left(\alpha_nv_n + \sum_{j=1}^{n-1} \alpha_j \left[\frac{\lambda_j}{\lambda_n}\right]^k v_j \right)\end{aligned} \]

Power Method

  • Then if $|\lambda_n| > |\lambda_j|$, $1 \leq j < n$, we see that $x_k \to \lambda_n^k\alpha_nv_n$ as $k \to \infty$
  • This algorithm converges linearly: the error terms are scaled by a factor at most $|\lambda_{n-1}|/|\lambda_{n}|$ at each iteration
  • Also, we see that the method converges faster if $\lambda_n$ is well-separated from the rest of the spectrum

Power Method

  • However, in practice the exponential factor $\lambda_n^k$ could cause overflow or underflow after relatively few iterations
  • Therefore the standard form of the power method is actually
    the normalized power method

    1:$\hspace{0em}$choose $x_0 \in \mathbb{C}^n$ arbitrarily
    2:$\hspace{0em}$for $k = 1,2,\ldots$ do
    3:$\hspace{1.2em}y_k = A x_{k-1}$
    4:$\hspace{1.2em}x_k = y_k/\|y_k\|$
    5:$\hspace{0em}$end for
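  • A minimal numpy sketch of the normalized power method
    (the matrix, starting vector, and iteration count are arbitrary choices):

    # Normalized power method: estimates the dominant eigenpair of A.
    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 3.0]])   # example symmetric matrix
    x = np.array([1.0, 0.0])                 # arbitrary starting vector

    for k in range(50):
        y = A @ x
        x = y / np.linalg.norm(y)

    # here the eigenvalue is recovered with the Rayleigh quotient x^T A x
    # (discussed later); comparing entries of y and x works as well
    print(x @ A @ x, x)                      # compare with np.linalg.eig(A)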

Power Method

  • Convergence analysis of the normalized power method is essentially the same as the un-normalized case
  • Only difference is we now get an extra scaling factor, $c_k \in \mathbb{R}$, due to the normalization at each step \[ x_k = c_k \lambda_n^k\left(\alpha_nv_n + \sum_{j=1}^{n-1} \alpha_j \left[\frac{\lambda_j}{\lambda_n}\right]^k v_j \right) \]

Power Method

  • This algorithm directly produces the eigenvector $v_n$
  • One way to recover $\lambda_n$ is to note that \[ y_k = A x_{k-1} \approx \lambda_n x_{k-1} \]
  • Hence we can compare an entry of $y_k$ and $x_{k-1}$ to approximate $\lambda_n$
  • We also note two potential issues:
    1. we require $x_0$ to have a nonzero component of $v_n$
    2. there may be more than one eigenvalue with maximum absolute value

Power Method

  • These issues rarely cause problems in practice
  • Issue 1:
    • Very unlikely that $x_0$ will be orthogonal to $v_n$
    • Even if $x_0^\ast v_n = 0$, rounding error will introduce a component of $v_n$ during the power iterations
  • Issue 2:
    • We cannot ignore the possibility that there is more than one maximum eigenvalue
    • In this case $x_k$ would converge to a member of the corresponding eigenspace

Power Method

  • An important idea in eigenvalue computations is to consider the “shifted” matrix $A - \sigma {\rm I}$, for $\sigma \in \mathbb{R}$
  • We see that \[ (A - \sigma {\rm I}) v_i = (\lambda_i - \sigma) v_i \] and hence the spectrum of $A-\sigma{\rm I}$ is shifted by $-\sigma$, and the eigenvectors are the same
  • For example, if all the eigenvalues are real, a shift can be used with the power method to converge to $\lambda_1$ instead of $\lambda_n$

Inverse Iteration

Inverse Iteration

  • The eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of $A$, since \[ Av = \lambda v \Longleftrightarrow A^{-1} v = \frac{1}{\lambda} v \]
  • Question: What happens if we apply the power method to $A^{-1}$?

Inverse Iteration

  • Answer: We converge to the largest (in absolute value) eigenvalue of $A^{-1}$, which is $1/\lambda_1$ (recall that $\lambda_1$ is the smallest eigenvalue of $A$)
  • This is called inverse iteration

1:$\hspace{0em}$choose $x_0 \in \mathbb{C}^n$ arbitrarily
2:$\hspace{0em}$for $k = 1,2,\ldots$ do
3:$\hspace{1.2em}$solve $Ay_k = x_{k-1}$ for $y_k$
4:$\hspace{1.2em}x_k = y_k/\|y_k\|$
5:$\hspace{0em}$end for

Inverse Iteration

  • Hence inverse iteration gives $\lambda_1$ without requiring a shift
  • This is helpful since it may be difficult to determine
    what shift is required to get $\lambda_1$ in the power method
  • Question: What happens if we apply inverse iteration
    to the shifted matrix $A - \sigma {\rm I}$?

Inverse Iteration

  • The smallest (in absolute value) eigenvalue of $A - \sigma {\rm I}$ is $(\lambda_{i^\ast}-\sigma)$, where \[ i^\ast = \mathop{\mathrm{argmin}}_{i=1,\dots,n}|\lambda_i - \sigma| \]
  • Answer: We converge to $\tilde\lambda = 1/(\lambda_{i^\ast} - \sigma)$, then recover $\lambda_{i^\ast}$ via \[ \lambda_{i^\ast} = \frac{1}{\tilde\lambda} + \sigma \]
  • Inverse iteration with shift allows us to find the eigenvalue closest to $\sigma$
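  • A sketch of inverse iteration with shift
    (the matrix and shift below are arbitrary illustrations):

    # Inverse iteration with shift sigma converges to the eigenvector
    # whose eigenvalue is closest to sigma.
    import numpy as np

    A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
    sigma = 2.5
    x = np.ones(3)

    for k in range(30):
        y = np.linalg.solve(A - sigma * np.eye(3), x)
        x = y / np.linalg.norm(y)

    # the Rayleigh quotient recovers the eigenvalue itself,
    # the same value as 1/lambda_tilde + sigma from the slide
    print(x @ A @ x)               # compare with np.linalg.eigvals(A)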

Rayleigh Quotient

Rayleigh Quotient

  • Consider a real matrix $A \in \mathbb{R}^{n\times n}$
  • Assume that the eigenvalues are real (as for a symmetric matrix) and ordered: $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$
  • The Rayleigh quotient is a function $r:\mathbb{R}^n\to\mathbb{R}$ defined as \[ r(x) = \frac{x^T A x}{x^T x} \]
  • If $Av=\lambda v$, then \[ r(v) = \frac{v^T A v}{v^T v} = \frac{\lambda v^T v}{v^T v} = \lambda \]

Rayleigh Quotient

  • Theorem: Suppose $A \in \mathbb{R}^{n\times n}$ is a symmetric matrix, then for any $x \in\mathbb{R}^n$ \[ \lambda_{\rm 1} \leq r(x) \leq \lambda_n \]
  • Proof: We write $x$ as a linear combination of orthogonal eigenvectors
    $x = \sum_{j=1}^n \alpha_j v_j$, and the lower bound follows from \[ r(x) = \frac{x^T A x}{x^T x} = \frac{\sum_{j=1}^n \lambda_j \alpha_j^2}{\sum_{j=1}^n \alpha_j^2} \geq \lambda_1 \frac{\sum_{j=1}^n \alpha_j^2}{\sum_{j=1}^n \alpha_j^2} = \lambda_1 \]
  • The proof of the upper bound $r(x) \leq \lambda_n$ is analogous $\quad \square$
  • Therefore, the Rayleigh quotient of a symmetric matrix
    always remains within the range of its spectrum

Rayleigh Quotient

  • Theorem: A symmetric matrix $A \in \mathbb{R}^{n\times n}$ is positive definite
    if and only if all of its eigenvalues are positive
  • Proof: ($\Rightarrow$) Suppose $A$ is symmetric positive definite,
    then for any nonzero $x \in \mathbb{R}^n$, we have $x^T A x > 0$. Take $x=v_1$ \[ \lambda_1 = r(v_1) = \frac{v_1^TAv_1}{v_1^T v_1} > 0 \]
  • ($\Leftarrow$) Suppose $A$ has positive eigenvalues, then for any nonzero $x \in \mathbb{R}^n$,
    from the previous theorem \[ x^T A x = r(x) (x^T x) \geq \lambda_1 \|x\|_2^2 > 0\quad \square \]

Rayleigh Quotient

  • If $x\in\mathbb{R}^n$ approximates an eigenvector,
    then $r(x)$ approximates the eigenvalue
  • Consider Taylor’s expansion of $r(x)$ about a vector $v$ \[ r(x) = r(v) + \nabla r(v)^T (x-v) + \mathcal{O}(\|x-v\|^2_2) \]

Rayleigh Quotient

  • Let’s compute the gradient $\nabla r(x)$
  • Recall from [Unit 1, slide 69] that $\nabla (x^T A x) = (A+A^T)x$
  • Then using the product rule \[ \begin{aligned}\nabla r(x) &= \nabla \Big({x^TAx}\frac{1}{x^Tx}\Big) = \frac{\nabla (x^TAx)}{x^Tx} - (x^TAx)\frac{\nabla(x^T x)}{(x^Tx)^2} = \\ &= \frac{(A+A^T)x}{x^Tx} - (\htmlClass{color5}{x^TAx})\frac{2x}{(\htmlClass{color5}{x^Tx})^2} = \frac{(A+A^T)x}{x^Tx} - \htmlClass{color5}{r(x)}\frac{2x}{x^Tx} = \\ &= \frac{2}{x^T x} \Big( \frac{A+A^T}{2}x - r(x) x \Big) = \frac{2}{x^T x} \Big( \frac{A+A^T}{2} - r(x) I \Big) x \end{aligned} \]

Rayleigh Quotient

  • If $A$ is symmetric, then $A=\tfrac{A+A^T}{2}$ and \[ \nabla r(x) = \frac{2}{x^T x} \big(A - r(x) I \big)x \]
  • Therefore, eigenvectors of a symmetric matrix
    coincide with stationary points of its Rayleigh quotient
  • Indeed, for any $x\neq 0$ and $\lambda\in\mathbb{R}$ \[ Ax=\lambda x \;\Leftrightarrow\; \frac{2}{x^T x}(A-\lambda I) x=0 \;\Leftrightarrow\; \nabla r(x) = 0, \; \lambda=r(x) \]

Rayleigh Quotient

  • Suppose that $Av=\lambda v$
  • Then $r(v)=\lambda$ and $\nabla r(v)=0$, therefore Taylor’s expansion turns into \[ r(x) = r(v) + \nabla r(v)^T (x-v) + \mathcal{O}(\|x-v\|^2_2) = \lambda + \mathcal{O}(\|x-v\|^2_2) \]
  • Then the approximation error is \[ |r(x) - \lambda| = \mathcal{O}(\|x-v\|^2_2) \]
  • That is, the Rayleigh quotient approximation to an eigenvalue
    squares the error of the approximation to the eigenvector

Rayleigh Quotient Iteration

  • The Rayleigh quotient iteration combines the inverse iteration,
    spectrum shifts, and Rayleigh quotient approximations to an eigenvalue

    1:$\hspace{0em}$choose $x_0 \in \mathbb{R}^n$ arbitrarily
    2:$\hspace{0em}$for $k = 1,2,\ldots$ do
    3:$\hspace{1.2em}\sigma_k = \tfrac{x_{k-1}^T A x_{k-1}}{x_{k-1}^T x_{k-1}}$
    4:$\hspace{1.2em}$solve $(A-\sigma_k I)y_k = x_{k-1}$ for $y_k$
    5:$\hspace{1.2em}x_k = y_k/\|y_k\|$
    6:$\hspace{0em}$end for
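  • A minimal sketch of this iteration (not necessarily identical to the course
    example script), using for illustration the $3\times 3$ matrix from the example below:

    # Rayleigh quotient iteration for a symmetric matrix.
    import numpy as np

    A = np.array([[5.0, 1.0, 1.0], [1.0, 6.0, 1.0], [1.0, 1.0, 7.0]])
    x = np.array([1.0, 0.0, 0.0])

    for k in range(4):
        sigma = x @ A @ x / (x @ x)                    # Rayleigh quotient shift
        y = np.linalg.solve(A - sigma * np.eye(3), x)
        x = y / np.linalg.norm(y)
        # near convergence A - sigma*I becomes nearly singular;
        # the normalization keeps x well-scaled
        print(k, sigma, np.linalg.norm(A @ x - (x @ A @ x) * x))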

Rayleigh Quotient Iteration

  • For a symmetric matrix $A$, if the Rayleigh quotient iteration
    converges, it results in cubic convergence
  • Let’s show the idea for the case $0 < \lambda_1 < \lambda_2 \leq \dots \leq \lambda_n$
    assuming that $x_k\to v_1$ and $\sigma_k \to \lambda_1$
  • Convergence of the inverse iteration is linear, and the rate is determined
    by the ratio of the two eigenvalues closest to zero. Asymptotically, \[ \|x_k - v_1\| \sim \tfrac{|\lambda_1 - \sigma_k|}{|\lambda_2 - \sigma_k|}\, \|x_{k-1} - v_1\| \sim \tfrac{|\lambda_1 - \sigma_k|}{|\lambda_2 - \lambda_1|}\, \|x_{k-1} - v_1\| \]
  • On the other hand, the Rayleigh quotient squares the error \[ |\lambda_1 - \sigma_k| = \mathcal{O}(\|x_{k-1} - v_1\|^2) \]
  • This shows cubic convergence \[ \|x_k - v_1\| \leq C \|x_{k-1} - v_1\|^3 \]

Rayleigh Quotient Iteration: Example

  • See [examples/unit5/rayleigh_iter.py]:
    the Rayleigh quotient iteration applied to a $3\times 3$ matrix \[ A = \begin{bmatrix} 5 & 1 & 1 \\ 1 & 6 & 1 \\ 1 & 1 & 7 \end{bmatrix} \]
    
    it=0
    |Ax - sigma x|   = 2.2176638128637163e-01
    |sigma - lambda| = 2.1431974337752990e-01
    
    it=1
    |Ax - sigma x|   = 1.2052279264915474e-03
    |sigma - lambda| = 1.2049892791683448e-03
    
    it=2
    |Ax - sigma x|   = 1.9350397099098787e-10
    |sigma - lambda| = 1.9349855051586928e-10
    
    it=3
    |Ax - sigma x|   = 0.0000000000000000e+00
    |sigma - lambda| = 5.3290705182007514e-15
    

QR Algorithm

QR Algorithm

  • The QR algorithm is a standard algorithm for computing eigenvalues
  • It was developed independently in the late 1950s
    by John G.F. Francis (England) and Vera N. Kublanovskaya (USSR)
  • The QR algorithm efficiently provides approximations
    to all eigenvalues and eigenvectors of a matrix
  • In the following, assume that $A\in\mathbb{R}^{n\times n}$ is symmetric

QR Algorithm

  • To motivate the QR-algorithm, let’s start with
    the power method applied to $p$ vectors at once
  • Let $x_1^{(0)},\ldots,x_p^{(0)}$ denote $p$ linearly independent starting vectors
    stored in the columns of $X_0\in\mathbb{R}^{n\times p}$
  • The power method applied to these vectors results in

    1:$\hspace{0em}$choose an $n\times p$ matrix $X_0$ arbitrarily
    2:$\hspace{0em}$for $k = 1,2,\ldots$ do
    3:$\hspace{1.2em}$$X_k = A X_{k-1}$
    4:$\hspace{0em}$end for

QR Algorithm

  • Assume that the eigenvalues are ordered $|\lambda_1| \leq |\lambda_2| \leq \cdots \leq |\lambda_n|$
    and $v_1,\dots,v_n$ is a full set of eigenvectors
  • Again, to analyze convergence of the method, express each $x_i^{(k)}$
    in the basis of $v_1,\dots,v_n$ for each $i=1,2,\ldots,p$ \[ \begin{aligned} x_i^{(k)} &= \lambda_n^k\alpha_{i,n}v_n + \lambda_{n-1}^k\alpha_{i,n-1}v_{n-1} + \cdots + \lambda_{1}^k\alpha_{i,1}v_{1} \\ &= \lambda_{n-p}^k\Big( \htmlClass{color5}{ \sum_{j=n-p+1}^{n} \big(\tfrac{\lambda_j}{\lambda_{n-p}}\big)^k\alpha_{i,j} v_j} + \htmlClass{color1}{ \sum_{j=1}^{n-p} \big(\tfrac{\lambda_j}{\lambda_{n-p}}\big)^k\alpha_{i,j} v_j }\Big)\end{aligned} \]
  • If $|\lambda_{n-p+1}| > |\lambda_{n-p}|$, the first sum (over $j = n-p+1,\ldots,n$)
    will dominate the second sum as $k \to \infty$
  • Therefore, the columns of $X_k$ will converge to a basis of $\mathrm{span}\{v_{n-p+1},\ldots,v_n\}$

QR Algorithm

  • However, this method does not provide a good basis:
    since $\lambda_n$ is the largest eigenvalue, all columns of $X_k$ will approach multiples of $v_n$
  • Therefore the columns of $X_k$ become closer and closer to linearly dependent
  • We can resolve this issue by enforcing orthonormality at each step

QR Algorithm

  • Using the reduced QR factorization, we orthonormalize the vectors after each iteration
  • This algorithm is called the simultaneous iteration

    1:$\hspace{0em}$choose $n\times p$ matrix $\hat{Q}_0$ with orthonormal columns
    2:$\hspace{0em}$for $k = 1,2,\ldots$ do
    3:$\hspace{1.2em}$$X_k = A\hat Q_{k-1}$
    4:$\hspace{1.2em}$$\hat Q_k \hat R_k = X_{k}$
    5:$\hspace{0em}$end for

  • The column spaces of $\hat Q_k$ and $X_{k}$ in line 4 are the same
  • Columns of $\hat Q_k$ converge to an orthonormal basis of $\mathrm{span}\{v_{n-p+1},\ldots,v_n\}$

QR Algorithm

  • In fact, the columns of $\hat{Q}_k$ do not just converge to a basis,
    they actually converge to a set of eigenvectors
  • Theorem: The columns of $\hat Q_k$ converge to the $p$ dominant eigenvectors of $A$
  • We will not discuss the full proof, but this result is not surprising since
    • the eigenvectors of a symmetric matrix are orthogonal
    • columns of $\hat Q_k$ converge to a basis of $\mathrm{span}\{v_{n-p+1},\ldots,v_n\}$
  • To approximate the eigenvalues, we again use the Rayleigh quotient
    \[ \hat Q_k^T A \hat Q_k \approx \mathop{\mathrm{diag}}(\lambda_{n-p+1},\ldots,\lambda_n) \]

QR Algorithm

  • With $p=n$, the simultaneous iteration will converge to all eigenpairs of $A$
  • We now show a more convenient formulation of the simultaneous iteration
  • To distinguish matrices from two different formulations,
    we introduce some extra notation: the $Q$ and $R$ matrices in
    the simultaneous iteration will be underlined: $\underline{Q}_k$, $\underline{R}_k$

QR Algorithm

  • Define the $k$-th Rayleigh quotient matrix $A_k = \underline{Q}_k^TA \underline{Q}_k$
    and the QR factors $Q_k$, $R_k$ as $Q_{k} R_k = A_{k-1}$
  • Our goal is to show that \[ A_k = R_k Q_k, \qquad k=1,2,\ldots \]
  • Initialize $\underline{Q}_0 = {\rm I} \in \mathbb{R}^{n\times n}$
  • Then in the first iteration: $X_1 = A$ and $\underline{Q}_1 \underline{R}_1 = A$
  • It follows that $\htmlClass{color5}{ A_1} = \underline{Q}_1^T A \underline{Q}_1 = \underline{Q}_1^T (\underline{Q}_1 \underline{R}_1) \underline{Q}_1 = \htmlClass{color5}{ \underline{R}_1 \underline{Q}_1}$
  • Also $Q_1 R_1 = A_0 = \underline{Q}_0^T A \underline{Q}_0 = A$, so that $Q_1 = \underline{Q}_1$, $R_1 = \underline{R}_1$, and $A_1 = R_1 Q_1$

QR Algorithm

  • In the second iteration, we have $X_2 = A \underline{Q}_1$,
    and we compute the QR factorization $\underline{Q}_2 \underline{R}_2 = X_2$
  • Also, using our QR factorization of $A_1$ gives \[ X_2 = A \underline{Q}_1 = (\underline{Q}_1 \underline{Q}_1^T) A \underline{Q}_1 = \underline{Q}_1 A_1 = \underline{Q}_1 (Q_2 R_2) \] which implies that $\underline{Q}_2 = \underline{Q}_1 Q_2 = Q_1 Q_2$ and $\underline{R}_2 = R_2$
  • Therefore \[ \htmlClass{color5}{ A_2} = \underline{Q}_2^T A \underline{Q}_2 = Q_2^T\underline{Q}_1^T A \underline{Q}_1 Q_2 = Q_2^T A_1 Q_2 = Q_2^T Q_2 R_2 Q_2 = \htmlClass{color5}{ R_2 Q_2} \]

QR Algorithm

  • The same pattern continues for $k=3,4,\ldots$:
    we QR factorize $A_{k-1}$ to get $Q_k$ and $R_k$, then we compute $A_k = R_k Q_k$
  • The columns of the product $\underline{Q}_k = Q_1 Q_2 \cdots Q_k$
    approximate the eigenvectors of $A$
  • The diagonal entries of the Rayleigh quotient matrix $A_k = \underline{Q}_k^T A \underline{Q}_k$
    approximate the eigenvalues of $A$
  • Also, $A_k$ converges to a diagonal matrix due to the eigenvalue decomposition

QR Algorithm

  • This discussion motivates the QR algorithm

    1:$\hspace{0em}$$A_0 = A$
    2:$\hspace{0em}$for $k = 1,2,\ldots$ do
    3:$\hspace{1.2em}$$Q_k R_k = A_{k-1}$
    4:$\hspace{1.2em}$$A_k = R_k Q_{k}$
    5:$\hspace{0em}$end for

QR Algorithm: Example

  • See [examples/unit5/qr_algorithm.py]: computing
    eigenvalues and eigenvectors of a 4 by 4 matrix \[ A = \left[ \begin{array}{cccc} 2.9766 & 0.3945 & 0.4198 & 1.1159\\ 0.3945 & 2.7328 & -0.3097 & 0.1129\\ 0.4198 & -0.3097 & 2.5675 & 0.6079\\ 1.1159 & 0.1129 & 0.6079 & 1.7231 \end{array} \right] \]
  • This matrix has eigenvalues 1, 2, 3 and 4
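  • A minimal sketch of the unshifted QR algorithm applied to this matrix
    (not necessarily identical to the course script):

    # Unshifted QR algorithm: the diagonal of A_k approaches the eigenvalues,
    # and the accumulated product of the Q_k approaches the eigenvectors.
    import numpy as np

    A = np.array([[2.9766, 0.3945, 0.4198, 1.1159],
                  [0.3945, 2.7328, -0.3097, 0.1129],
                  [0.4198, -0.3097, 2.5675, 0.6079],
                  [1.1159, 0.1129, 0.6079, 1.7231]])

    Ak = A.copy()
    Q_acc = np.eye(4)
    for k in range(100):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
        Q_acc = Q_acc @ Q

    print(np.diag(Ak))   # approximately 1, 2, 3, 4 (in some order)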

QR Algorithm

  • We have presented the simplest version of the QR algorithm:
    the “unshifted” QR algorithm
  • Practically relevant implementations include various improvements
    • introduce shifts to accelerate convergence,
      like in the Rayleigh quotient iteration
    • reduce $A$ to a tridiagonal form (e.g. via Householder reflectors)
      to reduce computational cost
    • add reliable convergence criteria for the eigenvalues and eigenvectors
  • One example is _geev() in LAPACK used by numpy.linalg.eig()

Iterative Methods
for Linear Systems

Conjugate Gradient Method

Krylov Subspaces

  • Given a matrix $A$ and vector $b$, a Krylov sequence
    is the set of vectors \[ \{b, Ab, A^2 b, A^3 b, \ldots \} \]
  • The corresponding Krylov subspaces are the spaces spanned
    by successive groups of these vectors
    \[ \mathcal{K}_m(A,b) = \mathop{\mathrm{span}}\{b, Ab, A^2 b, \ldots, A^{m-1} b\} \]

Krylov Subspaces

  • Krylov subspaces are the basis for iterative methods
  • An important advantage: Krylov methods do not deal directly with $A$,
    but rather with matrix–vector products involving $A$
  • This is particularly helpful when $A$ is large and sparse,
    since matrix–vector multiplications are relatively cheap

Conjugate Gradient Method

  • The conjugate gradient method (CG) is one of the Krylov subspace methods
  • Assume that $A\in\mathbb{R}^{n\times n}$ is symmetric and positive definite
  • CG is an iterative method for solving $Ax = b$

Conjugate Gradient Method

  • Iterative solvers (e.g. CG) and direct solvers (e.g. Gaussian elimination)
    for solving $Ax=b$ are fundamentally different
    • direct solvers: in exact arithmetic, give the exact answer after finitely many steps
    • iterative solvers: in principle require infinitely many iterations,
      but should give an accurate approximation after a few iterations
  • Iterative methods are typically more efficient for very large, sparse systems
  • Also, iterative methods are generally better suited to parallelization,
    hence an important topic in high performance computing

Conjugate Gradient Method

  • This is the conjugate gradient algorithm

    1:$\hspace{0em}x_0 = 0$, $r_0 = b$, $p_0 = r_0$
    2:$\hspace{0em}$for $k = 1,2,3,\ldots$ do
    3:$\hspace{1.2em}\alpha_k = (r_{k-1}^T r_{k-1}) / (p^T_{k-1}A p_{k-1})$
    4:$\hspace{1.2em}x_{k} = x_{k-1} + \alpha_k p_{k-1}$
    5:$\hspace{1.2em}r_{k} = r_{k-1} - \alpha_k A p_{k-1}$
    6:$\hspace{1.2em}\beta_{k} = (r_{k}^T r_{k}) / (r_{k-1}^T r_{k-1})$
    7:$\hspace{1.2em}p_{k} = r_{k} + \beta_{k} p_{k-1}$
    8:$\hspace{0em}$end for
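  • A sketch of this algorithm in numpy
    (the stopping criterion and the test system are illustrative additions):

    # Conjugate gradient method following the pseudocode above.
    import numpy as np

    def conjugate_gradient(A, b, tol=1e-12, maxiter=1000):
        x = np.zeros_like(b)
        r = b.copy()
        p = r.copy()
        rr = r @ r
        for k in range(maxiter):
            Ap = A @ p
            alpha = rr / (p @ Ap)
            x = x + alpha * p
            r = r - alpha * Ap
            rr_new = r @ r
            if np.sqrt(rr_new) < tol:
                break
            p = r + (rr_new / rr) * p
            rr = rr_new
        return x

    # small symmetric positive definite example
    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b), np.linalg.solve(A, b))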

Conjugate Gradient Method

  • We will now discuss CG in more detail
  • Let $x_\ast = A^{-1}b$ denote the exact solution,
    and let $e_k = x_\ast - x_k$ denote the error at step $k$
  • Also, let $\|\cdot\|_A$ denote the norm \[ \|x\|_A = \sqrt{x^T A x} \]

Conjugate Gradient Method

  • Theorem: The CG iterate $x_k$ is the unique member of $\mathcal{K}_k(A,b)$
    which minimizes $\|e_k\|_A$. Also, $x_k = x_\ast$ for some $k \leq n$.
  • Proof: This result relies on a set of identities which can be derived
    by induction from the CG algorithm:
    • (i) $\mathcal{K}_k(A,b) = \mathrm{span}\{x_1,x_2,\ldots,x_k\} = \mathrm{span}\{ p_0,p_1\ldots,p_{k-1}\}$
      $\hspace{5.2em}= \mathrm{span}\{r_0,r_1,\ldots,r_{k-1}\}$
    • (ii) $r_k^T r_j = 0$ for $j < k$
    • (iii) $p_k^T A p_j = 0$ for $j < k$

Conjugate Gradient Method

  • From the first identity above, it follows that $x_k \in \mathcal{K}_k(A,b)$
  • We will now show that $x_k$ is the unique minimizer in $\mathcal{K}_k(A,b)$
  • Let $\tilde x \in \mathcal{K}_k(A,b)$ be another “candidate minimizer”
    and let $\Delta x = x_k - \tilde x$, then \[ \begin{aligned} \|x_\ast - \tilde x\|_A^2 &= \|(x_\ast - x_k) + (x_k - \tilde x)\|_A^2\\ &= \|e_k + \Delta x\|_A^2\\ &= (e_k + \Delta x)^T A (e_k + \Delta x)\\ &= e_k^TA e_k + \htmlClass{color5}{ 2e_k^T A \Delta x} + \Delta x^T A \Delta x\end{aligned} \]

Conjugate Gradient Method

  • Next, let $r(x_k) = b - Ax_k$ denote the residual at step $k$, so that \[ \htmlClass{color5}{ r(x_k)} = b - Ax_k = b - A(x_{k-1} + \alpha_k p_{k-1}) = \htmlClass{color5}{ r(x_{k-1}) - \alpha_k A p_{k-1}} \]
  • Since $r(x_0) = b = r_0$, by induction we see that
    for $r_k$ computed in line 5 of CG, \[ r_k = r_{k-1} - \alpha_k A p_{k-1} \] we have $r_k = r(x_k)$, $k=1,2,\ldots$

Conjugate Gradient Method

  • Now, recall our expression for $\|x_\ast - \tilde x\|_A^2$: \[ \|x_\ast - \tilde x\|_A^2 = e_k^TA e_k + \htmlClass{color5}{ 2e_k^T A \Delta x} + \Delta x^T A \Delta x \] and note that \[ \htmlClass{color5}{ 2 e_k^T A \Delta x} = 2\Delta x^T A (x_\ast - x_k) = 2\Delta x^T (b - A x_k) = 2\Delta x^T r_k \]
  • Now,
    • $\Delta x = x_k - \tilde x \in \mathcal{K}_k(A,b)$
    • from (i), $\mathcal{K}_k(A,b) = \text{span}\{r_0,r_1,\ldots,r_{k-1}\}$
    • from (ii), $r_k \perp \text{span}\{r_0,r_1,\ldots,r_{k-1}\}$
  • Therefore, we have $\htmlClass{color5}{ 2 e_k^T A \Delta x} = 2\Delta x^T r_k = \htmlClass{color5}{ 0}$

Conjugate Gradient Method

  • This implies that, \[ \|x_\ast - \tilde x\|_A^2 = e_k^TA e_k + \Delta x^T A \Delta x \geq \|e_k\|_A^2, \] with equality only when $\Delta x = 0$, so $x_k \in \mathcal{K}_k(A,b)$ is the unique minimizer
  • This also tells us that if $x_\ast \in \mathcal{K}_k(A,b)$, then $x_k = x_\ast$
  • Therefore CG will converge to $x_\ast$ in at most $n$ iterations
    since $\mathcal{K}_k(A,b)$ is a subspace of $\mathbb{R}^n$ of dimension $k$ $\quad \square$

Conjugate Gradient Method

  • The theorem implies that CG will converge in at most $n$ steps
  • However, in floating point arithmetic we will not get exact convergence to $x_\ast$
  • Also, if $n$ is large, we want to terminate CG well before $n$ iterations,
    after reaching sufficient accuracy
  • Steps of CG are chosen to give the orthogonality properties (ii), (iii),
    which lead to the remarkable CG optimality property:
    CG minimizes the error over the Krylov subspace $\mathcal{K}_k(A,b)$ at step $k$

Conjugate Gradient Method

  • Question: Where did the steps in the CG algorithm come from?
  • Answer: It turns out that CG can be derived by developing an optimization algorithm for $\phi : \mathbb{R}^n \to \mathbb{R}$ given by \[ \phi(x) = \frac{1}{2} x^T A x - x^T b \]

  • e.g. lines 3 and 4 in CG perform a line search, and line 7 gives a search direction $p_k$

Conjugate Gradient Method

  • The name “conjugate gradient” comes from the property \[ \htmlClass{color5}{ \text{(ii)}} \quad \nabla \phi(x_k)^T\nabla \phi(x_j) = r_k^T r_j = 0 \text{ for } j < k \] since $-\nabla \phi(x) = b - Ax = r(x)$
  • That is, the gradient directions at $x_k$ and $x_j$ are orthogonal, or “conjugate”

Conjugate Gradient Method

  • Question: Why is the quadratic objective function $\phi$ relevant to $Ax = b$?
  • Answer: Minimizing $\phi$ is equivalent to minimizing $\|e_k\|_A^2$, since \[ \begin{aligned} \|e_k\|_A^2 &= (x_\ast - x_k)^T A (x_\ast - x_k) \\ &= x_k^T A x_k - 2 x_k^T A x_\ast + x_\ast^T A x_\ast\\ &= x_k^T A x_k - 2 x_k^T b + x_\ast^T b\\ &= 2\phi(x_k) + \text{const}\end{aligned} \]
  • Our argument from above shows that at iteration $k$,
    CG solves the optimization problem \[ \htmlClass{color5}{ \min_{x \in \mathcal{K}_k(A,b)} \phi(x)} \]

Conjugate Gradient Method

  • How fast does $\|e_k\|_A$ converge?
  • One result for CG is that if $A$ has 2-norm condition number $\kappa$, then \[ \frac{\|e_k\|_A}{\|e_0\|_A} \leq 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa} + 1}\right)^k \]
  • Smaller condition number $\kappa(A)$ implies faster convergence
  • Taking this upper bound as an error estimate, it can be shown that
    the number of CG iterations to reach $\frac{\|e_k\|_A}{\|e_0\|_A} \leq \epsilon$ for a given tolerance $\epsilon>0$
    grows approximately as $\sqrt{\kappa}$

Multigrid Method

Model Problem

  • Consider a boundary value problem for the discrete Poisson equation \[ \begin{aligned} -u_{i-1} + 2u_i - u_{i+1} &= f_i \qquad i=1,\dots,n-1 \\ u_0 &= 0 \\ u_n &= 0 \\ \end{aligned} \] where $u_i$ approximate a function $u(x_i)$ with $x_i=i/n\in[0,1]$

Model Problem

  • This problem is linear and can be expressed in matrix form \[ Au=f \] \[ \small A = \left[ \begin{array}{cccccc} 1 & 0 & & & & \\ 0 & 2 & -1 & & & \\ & -1 & \ddots & \ddots & & \\ & & \ddots & \ddots & -1 & \\ & & & -1 & 2 & 0 \\ & & & & 0 & 1 \\ \end{array} \right] \quad u = \left[ \begin{array}{c} u_0\\ u_1\\ \vdots \\ u_{n-1}\\ u_n\\ \end{array} \right] \quad f = \left[ \begin{array}{c} 0\\ f_1\\ \vdots \\ f_{n-1}\\ 0\\ \end{array} \right]. \]
  • Note that the boundary conditions are eliminated
    from the equations for $i=1$ and $i=n-1$
  • The matrix $A\in\mathbb{R}^{(n+1)\times(n+1)}$ is symmetric

Jacobi Method

  • One iterative method for solving $Au=f$ is the Jacobi method
  • The update rule to obtain $u^{(k+1)}_i$ is \[ -u^{(k)}_{i-1} + 2u^{(k+1)}_i - u^{(k)}_{i+1} = f_i \] or equivalently \[ 2(u^{(k+1)}_i- u^{(k)}_i) -u^{(k)}_{i-1} + 2u^{(k)}_i - u^{(k)}_{i+1} = f_i \]
  • In matrix form \[ D(u^{(k+1)} - u^{(k)}) + Au^{(k)} = f \] \[ Du^{(k+1)} = f- (A-D)u^{(k)} \] where $D$ is the diagonal part of $A$
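  • A sketch of this iteration for the model problem, written componentwise
    (the right-hand side is generated from a chosen exact solution,
    as in the plots that follow):

    # Jacobi iteration for -u_{i-1} + 2 u_i - u_{i+1} = f_i with u_0 = u_n = 0.
    import numpy as np

    n = 64
    x = np.linspace(0, 1, n + 1)
    u_exact = np.sin(4 * np.pi * x) + 0.5 * np.sin(16 * np.pi * x)
    f = np.zeros(n + 1)
    f[1:-1] = -u_exact[:-2] + 2 * u_exact[1:-1] - u_exact[2:]   # f = A u_exact

    u = np.zeros(n + 1)
    for k in range(1000):
        u_new = u.copy()
        u_new[1:-1] = 0.5 * (f[1:-1] + u[:-2] + u[2:])   # 2 u_i = f_i + u_{i-1} + u_{i+1}
        u = u_new

    # the error decays slowly: low-frequency components converge slowest
    print(np.max(np.abs(u - u_exact)))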

Jacobi Method

  • The black line is the exact solution $u(x) = \sin{(4 \pi x)} + 0.5 \sin{(16 \pi x)}$
    for the right-hand side generated as $f=Au$

Gauss-Seidel Method

  • Another iterative method for solving $Au=f$ is the Gauss-Seidel method
  • The update rule to obtain $u^{(k+1)}_i$ is \[ -u^{(k+1)}_{i-1} + 2u^{(k+1)}_i - u^{(k)}_{i+1} = f_i \]
  • In matrix form \[ L(u^{(k+1)} - u^{(k)}) + Au^{(k)} = f \] \[ Lu^{(k+1)} = f- (A-L)u^{(k)} \] where $L$ is the lower triangular part of $A$ (including the diagonal)
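  • A sketch of this iteration, identical to the Jacobi sketch above except that
    updated values of $u_{i-1}$ are used immediately within the sweep:

    # Gauss-Seidel iteration for the same model problem.
    import numpy as np

    n = 64
    x = np.linspace(0, 1, n + 1)
    u_exact = np.sin(4 * np.pi * x) + 0.5 * np.sin(16 * np.pi * x)
    f = np.zeros(n + 1)
    f[1:-1] = -u_exact[:-2] + 2 * u_exact[1:-1] - u_exact[2:]

    u = np.zeros(n + 1)
    for k in range(1000):
        for i in range(1, n):                 # u[i - 1] is already updated
            u[i] = 0.5 * (f[i] + u[i - 1] + u[i + 1])

    print(np.max(np.abs(u - u_exact)))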

Gauss-Seidel Method

  • The black line is the exact solution $u(x) = \sin{(4 \pi x)} + 0.5 \sin{(16 \pi x)}$
    for the right-hand side generated as $f=Au$
  • Gauss-Seidel converges slightly faster than Jacobi

Multigrid Method

  • However, both of these methods are local,
    i.e. each update of $u_i$ only involves the neighboring points
  • Note that the high-frequency components of the solution
    are recovered much faster than the low-frequency components
  • The idea of the multigrid method is to solve the problem
    on a hierarchy of coarser grids

Multigrid Method

  • Convergence of the multigrid method using the Gauss-Seidel smoother

Multigrid Method

  • Convergence of the multigrid method using the Jacobi smoother

Multigrid Method

  • Convergence of the multigrid method using the Jacobi smoother
    with relaxation factor $\omega=0.5$

    \[ u^{(k+1)} = u^{(k)} + \omega\big(D^{-1} (f- (A-D)u^{(k)}) - u^{(k)}\big) \]
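  • In code, one sweep of this relaxed (weighted) Jacobi smoother for the
    model problem can be written as follows (a sketch of the smoother only;
    the restriction and prolongation between grids are not shown):

    # One sweep of weighted Jacobi:
    # u <- u + omega * (D^{-1}(f - (A - D) u) - u), with D = 2 I on interior points.
    import numpy as np

    def weighted_jacobi_sweep(u, f, omega=0.5):
        u_new = u.copy()
        u_new[1:-1] = u[1:-1] + omega * (0.5 * (f[1:-1] + u[:-2] + u[2:]) - u[1:-1])
        return u_new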