How to understand Eigenvector/Eigenvalue?

This topic has been on my wish list for a long time. I will try to articulate my understanding of eigenvectors and eigenvalues after digesting informative materials from different sources.

First of all, we can consider multiplication between a square matrix A \in \mathbb{R}^{k \times k} and a vector \vec{v} \in \mathbb{R}^k as a transformation of \vec{v} such that \vec{v} is rotated and scaled. By definition, eigenvectors of A are vectors lying in the directions on which A applies no rotation, only scaling, i.e., A\vec{v}=\lambda\vec{v}.
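
To make the definition concrete, here is a minimal numpy sketch (the matrix A below is a made-up example, not anything from the references):

```python
import numpy as np

# A made-up symmetric 2x2 matrix, used only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# numpy returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # A times an eigenvector equals the eigenvalue times that eigenvector.
    print(np.allclose(A @ v, lam * v))  # True
```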

What kind of matrices have eigenvectors? Only square matrices can have eigenvectors. This is because:

If A \in \mathbb{R}^{m \times k} with m \neq k, then A\vec{v} \in \mathbb{R}^m and \vec{v} \in \mathbb{R}^k live in different spaces, so the equation A\vec{v}=\lambda\vec{v} cannot even be written down; it only makes sense when A maps \mathbb{R}^k back to \mathbb{R}^k, i.e., when A is square.

ref: http://math.stackexchange.com/a/583976/235140

However, not all square matrices have eigenvectors: http://math.stackexchange.com/a/654713/235140
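
For example, a 2D rotation matrix rotates every nonzero vector, so no real vector keeps its direction; numpy then reports complex eigenvalues. A small sketch (the angle is an arbitrary choice):

```python
import numpy as np

# A 90-degree rotation matrix: every nonzero real vector changes direction,
# so there is no real eigenvector.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues, _ = np.linalg.eig(R)
print(eigenvalues)  # approximately [0.+1.j, 0.-1.j]: complex, not real
```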

Two useful interpretations of eigenvectors, if you regard a square matrix as a rotation + scale transformation:

  1. no other vector, when acted on by this matrix, can be stretched as much as the eigenvectors (see the sketch after this list). (ref: http://math.stackexchange.com/a/243553/235140)
  2. eigenvectors keep the same direction after the transformation. (ref: https://www.quora.com/What-do-eigenvalues-and-eigenvectors-represent-intuitively)
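
The sketch below illustrates interpretation 1 for a symmetric matrix (where the claim holds exactly); the matrix and the number of random vectors are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up symmetric matrix; for symmetric matrices the eigenvector with the
# largest |eigenvalue| is exactly the direction of maximum stretch.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
top = eigenvectors[:, np.argmax(np.abs(eigenvalues))]  # unit-length top eigenvector

print(np.linalg.norm(A @ top))      # the maximum stretch factor (= max |eigenvalue|)
for _ in range(5):
    v = rng.normal(size=2)
    v /= np.linalg.norm(v)          # random unit vector
    print(np.linalg.norm(A @ v))    # never exceeds the value printed above
```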

Now, let’s see some scenarios in the computer science realm where eigenvectors/eigenvalues play a role.

1.  PCA

Now we look at all kinds of matrices, not only square ones. One way to understand a matrix is that it represents a coordinate system. Suppose P \in \mathbb{R}^{m \times k} is a matrix and \vec{x} \in \mathbb{R}^k is a column vector. Then one way to interpret P \vec{x}=\vec{y} is that every row of P is a base vector of its coordinate system. Every component of \vec{y} is the dot product of a row of P and \vec{x}, which can be seen as a projection of \vec{x} onto the corresponding row of P.
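
A tiny sketch of this row-by-row reading of P \vec{x}=\vec{y} (all numbers are arbitrary):

```python
import numpy as np

# A made-up 2x3 matrix whose rows we treat as base vectors, and an arbitrary input vector.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
x = np.array([2.0, 3.0, 4.0])

y = P @ x
print(y)                        # [2. 7.]
# Each component of y is the dot product of the corresponding row of P with x.
print([row @ x for row in P])   # same values, computed row by row
```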

If we now have a transformation matrix P \in \mathbb{R}^{m \times k} whose rows are base vectors of some coordinate system, and X \in \mathbb{R}^{k \times n} is a data matrix where n is the number of data points and k is the number of dimensions, then Y=PX, Y \in \mathbb{R}^{m \times n}, is a dimension-reduced version of X (when m < k) such that every column of Y is the representation of the corresponding column of X using the base vectors in P.

PCA tries to find a P that not only reduces the dimension of X but also satisfies other desirable properties. What it concludes is that P is the matrix whose rows are eigenvectors of X‘s covariance matrix. (ref: https://czxttkl.com/?p=339, https://www.cs.princeton.edu/picasso/mats/PCA-Tutorial-Intuition_jp.pdf)
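
Here is a minimal PCA sketch following the Y = PX convention above; the data is random and only for illustration (in practice a library such as sklearn does this for you):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data matrix X: k = 3 dimensions, n = 200 points, one column per data point.
k, n, m = 3, 200, 2
X = rng.normal(size=(k, n))

# Center each dimension, then form the k x k covariance matrix of X.
X_centered = X - X.mean(axis=1, keepdims=True)
C = X_centered @ X_centered.T / (n - 1)

# Eigendecomposition of the symmetric covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# P: rows are the m eigenvectors with the largest eigenvalues.
order = np.argsort(eigenvalues)[::-1][:m]
P = eigenvectors[:, order].T          # shape (m, k)

# Y = PX is the dimension-reduced data.
Y = P @ X_centered
print(Y.shape)                        # (2, 200)
```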

2.  PageRank / Stationary distribution of Markov chains

Both try to find eigenvectors with eigenvalue equal to 1: P\vec{x} = \vec{x}. In PageRank, P is a fractional PageRank contribution link matrix in which P_{ij} is the PageRank contributed to webpage i when webpage j links to it, and \vec{x} is the PageRank value vector over all webpages that we want to calculate. When calculating the stationary distribution of a Markov chain, P is simply the transition matrix of pairwise transition probabilities between states (arranged so that P_{ij} is the probability of moving from state j to state i), while \vec{x} is the ultimate stationary distribution over states.
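
A small sketch of the Markov-chain case, using a made-up column-stochastic transition matrix and plain power iteration (real PageRank additionally mixes in a damping factor, which is omitted here):

```python
import numpy as np

# A made-up column-stochastic transition matrix: P[i, j] is the probability of
# moving from state j to state i, so the stationary distribution x satisfies P x = x.
P = np.array([[0.1, 0.4, 0.5],
              [0.6, 0.3, 0.1],
              [0.3, 0.3, 0.4]])

# Power iteration: repeatedly apply P until x stops changing.
x = np.ones(3) / 3
for _ in range(1000):
    x = P @ x

print(x)                       # the stationary distribution
print(np.allclose(P @ x, x))   # True: x is an eigenvector with eigenvalue 1
```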
