%matplotlib inline
$x$: a nonzero vector
Rayleigh quotient: $R(A,x)={x^{*}Ax \over x^{*}x}$
If $x$ is an eigenvector of $A$, then $R(A,x)$ is the corresponding eigenvalue. Otherwise, $R(A,x)$ is the scalar that best approximates an eigenvalue for $x$, in the least-squares sense:
Exercise 1:
What is the value of $\alpha$ that minimizes $||Ax - \alpha x||_2^2$?
Hint: compute the derivative of $||Ax - \alpha x||_2^2$ with respect to $\alpha$.
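One possible derivation (a sketch following the hint, in the real case): expand the objective, which is quadratic in $\alpha$, and set its derivative to zero.
$$||Ax - \alpha x||_2^2 = x^{T}A^{T}Ax - 2\alpha\, x^{T}Ax + \alpha^2\, x^{T}x$$
$$\frac{d}{d\alpha}||Ax - \alpha x||_2^2 = -2\, x^{T}Ax + 2\alpha\, x^{T}x = 0 \quad\Longrightarrow\quad \alpha = \frac{x^{T}Ax}{x^{T}x} = R(A,x)$$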
Exercise 2:
Draw random vectors, compute their Rayleigh quotients with respect to a random symmetric matrix $A$, and plot a histogram of the resulting values (useful functions: np.random.seed, np.random.randn, np.linalg.eig, matplotlib.pyplot.hist).
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
def rayleigh_quotient(x, A):
    return x.T.dot(A).dot(x) / x.T.dot(x)
# random symmetric 2x2 matrix (A^T A is symmetric positive semi-definite)
A = np.random.randn(2, 2)
A = A.T.dot(A)
l, X = np.linalg.eig(A)
# check the eigendecomposition: A X = X diag(l)
print(A.dot(X))
print(X.dot(np.diag(l)))
# the Rayleigh quotient of an eigenvector is the corresponding eigenvalue
rayleigh_quotient(X[:, 1], A)
eigs = []
for i in range(1000):
    random_vector = np.random.randn(2) * 10
    eigs.append(rayleigh_quotient(random_vector, A))
_ = plt.hist(eigs, bins=10)
n = 4
A = np.random.randn(n,n)
A = A.T.dot(A)
l,X = np.linalg.eig(A)
l
X
eigs = []
for i in range(100000):
    random_vector = np.random.uniform(-1, 1, n)
    eigs.append(rayleigh_quotient(random_vector, A))
min(eigs), max(eigs)
_ = plt.hist(eigs, bins=100)
# mark the true eigenvalues on top of the histogram
for ll in l:
    plt.vlines(x=ll, ymax=1600, ymin=0)
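As a quick sanity check (a minimal sketch reusing eigs and l from above): for a symmetric matrix, every Rayleigh quotient lies between the smallest and largest eigenvalues.
# every Rayleigh quotient of a symmetric matrix lies in [lambda_min, lambda_max]
print(min(l) <= min(eigs) and max(eigs) <= max(l))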
A well-known method to compute eigenvalues is the power iteration method (also known as the von Mises iteration).
This method takes a diagonalizable matrix $A$ as input, and it outputs its largest eigenvalue in absolute value, $\lambda$, together with a corresponding eigenvector $v$ (i.e., $Av=\lambda v$).
Power iteration($A$):
1. Pick a random vector $b_0$ with $||b_0||_2 = 1$.
2. Repeat: $b_{k+1} = \frac{A b_k}{||A b_k||_2}$.
3. Return $b_k$ and the eigenvalue estimate $\lambda = b_k^{T} A b_k$.
This method does not converge if:
- the starting vector $b_0$ has no component along the dominant eigenvector, or
- the two largest eigenvalues have equal absolute value ($|\lambda_1| = |\lambda_2|$), as illustrated below.
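A toy illustration of the second failure case (a hypothetical example, not part of the exercises): the matrix below has eigenvalues $+1$ and $-1$, so the normalized iterate oscillates between two directions instead of converging.
# eigenvalues of B are +1 and -1: equal absolute value
B = np.array([[0., 1.], [1., 0.]])
b = np.array([2., 1.])
b = b / np.linalg.norm(b)
for _ in range(4):
    # the component along the eigenvector of -1 flips sign at every step,
    # so b keeps oscillating between two directions
    b = B.dot(b) / np.linalg.norm(B.dot(b))
    print(b)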
Exercise 3: How does this method work?
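A hint (a sketch of the standard argument, assuming $A$ is diagonalizable with eigenvalues $|\lambda_1| > |\lambda_2| \geq \dots \geq |\lambda_n|$ and eigenvectors $v_i$): expand the starting vector in the eigenbasis and apply $A$ repeatedly.
$$b_0 = \sum_i c_i v_i \quad\Longrightarrow\quad A^k b_0 = \sum_i c_i \lambda_i^k v_i = c_1 \lambda_1^k \Big( v_1 + \sum_{i \geq 2} \frac{c_i}{c_1} \Big(\frac{\lambda_i}{\lambda_1}\Big)^{k} v_i \Big)$$
Since $|\lambda_i / \lambda_1| < 1$ for $i \geq 2$, the normalized iterate aligns with $v_1$ (provided $c_1 \neq 0$), at a rate governed by $|\lambda_2 / \lambda_1|$.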
Exercise 4:
Implement the power iteration method. At each iteration, record the error between the dominant eigenvalue computed with np.linalg.eig and the current value estimated by the power iteration method, then plot this error.
def power_iteration(A, num_simulations):
    # choose a random starting vector and normalize it
    b_k = np.random.rand(A.shape[1])
    b_k = b_k / np.linalg.norm(b_k)
    # reference eigenvalues, sorted in decreasing order (np.linalg.eig does not sort)
    l, X = np.linalg.eig(A)
    l = np.sort(l)[::-1]
    diff_l1_l2 = np.abs(l[0] - l[1])
    error = []
    for _ in range(num_simulations):
        # calculate the matrix-by-vector product A b_k
        b_k1 = np.dot(A, b_k)
        # calculate the norm and re-normalize the vector
        b_k1_norm = np.linalg.norm(b_k1)
        b_k = b_k1 / b_k1_norm
        # Rayleigh quotient of the current iterate (b_k has unit norm)
        lambda_ = b_k.T.dot(A).dot(b_k)
        error.append(np.abs(lambda_ - l[0]))
    return b_k, lambda_, error, diff_l1_l2
A = np.random.randn(10, 10)
A = A.T.dot(A)
b_k, lambda_, error, diff_l1_l2 = power_iteration(A, 30)
plt.plot(error)
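The decay is easier to see on a logarithmic scale. For symmetric matrices, the Rayleigh quotient estimate converges like $|\lambda_2/\lambda_1|^{2k}$; a minimal sketch (reusing A and error from the cell above) compares the measured error with this reference rate.
# compare the measured error with the theoretical rate |lambda_2/lambda_1|^(2k)
l_ref = np.sort(np.linalg.eig(A)[0])[::-1]
ratio = np.abs(l_ref[1] / l_ref[0])
plt.semilogy(error, label='measured error')
plt.semilogy([error[0] * ratio ** (2 * k) for k in range(len(error))],
             '--', label='reference rate')
plt.legend()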
The matrix $(A - \mu I)^{-1}$ has the same eigenvectors as $A$, and its eigenvalues are $(\lambda_i - \mu)^{-1}$, where the $\lambda_i$ are the eigenvalues of $A$.
Exercise 5: Prove the previous statement.
$AX = X \Lambda$
$A = X \Lambda X^{-1}$
$A-\mu \mathbf{I} = X \Lambda X^{-1} - \mu \mathbf{I}$
$A-\mu \mathbf{I} = X \Lambda X^{-1} - \mu X X^{-1}$
$A-\mu \mathbf{I} = X \Lambda X^{-1} - \mu X \mathbf{I} X^{-1}$
$A-\mu \mathbf{I} = X \Lambda X^{-1} - X (\mu \mathbf{I}) X^{-1}$
$A-\mu \mathbf{I} = X (\Lambda - \mu \mathbf{I}) X^{-1}$
$(A-\mu \mathbf{I})^{-1} = (X (\Lambda - \mu \mathbf{I}) X^{-1})^{-1}$
$(A-\mu \mathbf{I})^{-1} = X (\Lambda - \mu \mathbf{I})^{-1} X^{-1}$
If $\mu$ is close to the eigenvalue $\lambda_j$, then $(\lambda_j - \mu)^{-1}$ is much larger in absolute value than $(\lambda_i - \mu)^{-1}$ for all $i \neq j$.
Hence the power iteration method applied to $(A - \mu \mathbf{I})^{-1}$ converges quickly to the eigenvector associated with the eigenvalue closest to $\mu$.
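A quick numerical check of the statement above (a minimal sketch; the shift mu = 1.0 and the matrix A_test are arbitrary illustration values):
mu = 1.0
A_test = np.random.randn(4, 4)
A_test = A_test.T.dot(A_test)
l_A = np.linalg.eig(A_test)[0]
l_inv = np.linalg.eig(np.linalg.inv(A_test - mu * np.eye(4)))[0]
# the eigenvalues of (A - mu I)^{-1} should equal (lambda - mu)^{-1}
print(np.sort(1.0 / (l_A - mu)))
print(np.sort(l_inv))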
Inverse iteration algorithm($A$, $\mu$):
1. Pick a random vector $v_0$ with $||v_0||_2 = 1$.
2. Repeat: solve $w = (A - \mu \mathbf{I})^{-1} v_k$, then set $v_{k+1} = w / ||w||_2$.
3. Return $v$ and the eigenvalue estimate $\lambda = v^{T} A v$.
Exercise 6:
Implement the inverse iteration method and run it for several values of $\mu$. Compare the recovered eigenvalues with the output of the np.linalg.eig function (round the results with the np.round function).
def inverse_iteration(A, mu, nb_iterations):
    n = A.shape[0]
    # random unit starting vector
    v = np.random.randn(n)
    v = v / np.linalg.norm(v)
    for i in range(nb_iterations):
        # one inverse-iteration step: solve (A - mu I) w = v
        w = np.linalg.solve(A - mu * np.eye(n), v)
        v = w / np.linalg.norm(w)
    # Rayleigh quotient of the final iterate
    l = v.T.dot(A).dot(v)
    return v, l
A = np.random.randn(10,10)
A = A.T.dot(A)
inverse_iteration(A,10,100)
lambdas,V = np.linalg.eig(A)
ls = []
for i in range(100):
    v, l = inverse_iteration(A, np.random.randn(1)[0] * 10, 500)
    ls.append(np.round(l, decimals=2))
np.unique(ls)
lambdas
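To compare the recovered values directly with the reference (reusing ls and lambdas from above; eigenvalues whose shift was never drawn nearby may be missing from the recovered set):
# recovered eigenvalues (rounded) vs. reference eigenvalues
print(np.sort(np.unique(ls)))
print(np.sort(np.round(lambdas, decimals=2)))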
Instead of keeping a fixed shift $\mu$, we can use the Rayleigh quotient of the current iterate as the shift.
Updating $\mu$ at each iteration gives much faster convergence.
These ideas give rise to the Rayleigh quotient iteration method.
Rayleigh quotient iteration($A$):
1. Pick a random vector $v_0$ with $||v_0||_2 = 1$ and set $\lambda_0 = v_0^{T} A v_0$.
2. Repeat: solve $w = (A - \lambda_k \mathbf{I})^{-1} v_k$, set $v_{k+1} = w / ||w||_2$, and update $\lambda_{k+1} = v_{k+1}^{T} A v_{k+1}$.
3. Return $v$ and $\lambda$.
Exercise 7:
Implement the Rayleigh quotient iteration method and run it several times with random starting vectors. Compare the recovered eigenvalues with the output of the np.linalg.eig function (round the results with the np.round function).
def rayleigh_quotient_iteration(A, nb_iterations):
    n = A.shape[0]
    # random unit starting vector
    v = np.random.randn(n)
    v = v / np.linalg.norm(v)
    # initial shift: Rayleigh quotient of the starting vector
    l = v.T.dot(A.dot(v))
    for i in range(nb_iterations):
        # inverse-iteration step with the current Rayleigh quotient as shift
        # (A - l I becomes nearly singular at convergence, which is expected)
        w = np.linalg.solve(A - l * np.eye(n), v)
        v = w / np.linalg.norm(w)
        l = v.T.dot(A).dot(v)
    return v, l
A = np.random.randn(10,10)
A = A.T.dot(A)
rayleigh_quotient_iteration(A,100)
lambdas,V = np.linalg.eig(A)
lambdas
ls = []
for i in range(100):
    v, l = rayleigh_quotient_iteration(A, 100)
    ls.append(np.round(l, decimals=2))
np.unique(ls)
Goal: factorize $A$ in order to reveal its eigenvalues.
Theorem: every square matrix has a Schur decomposition $A = Q T Q^{*}$, where $Q$ is unitary and $T$ is upper triangular; the eigenvalues of $A$ appear on the diagonal of $T$.
Method: this factorization is computed iteratively using the $QR$ factorization (the QR algorithm).
Schur factorization($A$):
1. Repeat: compute the factorization $A_k = Q_k R_k$, then set $A_{k+1} = R_k Q_k$.
2. $A_k$ converges to the triangular factor $T$ (diagonal when $A$ is symmetric), and the product of the $Q_k$ converges to $Q$.
Exercise 8:
Implement the QR algorithm and compare the diagonal of $T$ with the eigenvalues given by the np.linalg.eig function.
def schur_factorization(A, nb_iterations):
    Q_total = np.eye(A.shape[0])  # accumulated orthogonal factor
    for i in range(nb_iterations):
        # one QR step: A_{k+1} = R_k Q_k = Q_k^T A_k Q_k
        Q, R = np.linalg.qr(A)
        A = R.dot(Q)
        Q_total = Q_total.dot(Q)
    return Q_total, A, Q_total.T
A = np.random.randn(10,10)
A = A.T.dot(A)
Q,T,Q_t = schur_factorization(A,1000)
plt.imshow(T)
np.diag(T)
l,V = np.linalg.eig(A)
l
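As a sanity check (a minimal sketch reusing Q, T, Q_t, and l from above), the factors should reconstruct $A$, and the diagonal of $T$ should match the eigenvalues up to ordering and convergence tolerance:
# verify the factorization A = Q T Q^T and compare diag(T) with np.linalg.eig
print(np.allclose(Q.dot(T).dot(Q_t), A))
print(np.allclose(np.sort(np.diag(T)), np.sort(l)))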