How to understand Eigenvector/Eigenvalue?

This topic has been on my wish list for a long time. I will try to articulate my understanding of eigenvectors and eigenvalues after digesting informative materials from different sources.

First of all, we can consider multiplication between a square matrix A \in \mathbb{R}^{k \times k} and a vector \vec{v} \in \mathbb{R}^k as a transformation of \vec{v} such that \vec{v} is rotated and scaled. By definition, eigenvectors of A are vectors lying in the directions on which A applies no rotation, only scaling, i.e., A\vec{v}=\lambda\vec{v}.
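
To make the definition concrete, here is a minimal numpy sketch (the matrix A below is a made-up example, not anything from the references):

```python
import numpy as np

# A made-up symmetric 2x2 matrix, used only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# numpy returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]
    # A times an eigenvector equals the eigenvalue times that eigenvector.
    print(np.allclose(A @ v, lam * v))  # True
```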

What kind of matrices have eigenvectors? Only square matrices can have eigenvectors. This is because:

If A \in \mathbb{R}^{m \times k} with m \neq k, then A\vec{v} \in \mathbb{R}^m and \vec{v} \in \mathbb{R}^k live in different spaces, so the equation A\vec{v}=\lambda\vec{v} cannot even be written down; it only makes sense when A maps \mathbb{R}^k back to \mathbb{R}^k, i.e., when A is square.

ref: http://math.stackexchange.com/a/583976/235140

However, not all square matrices have eigenvectors: http://math.stackexchange.com/a/654713/235140
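
For example, a 2D rotation matrix rotates every nonzero vector, so no real vector keeps its direction; numpy then reports complex eigenvalues. A small sketch (the angle is an arbitrary choice):

```python
import numpy as np

# A 90-degree rotation matrix: every nonzero real vector changes direction,
# so there is no real eigenvector.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues, _ = np.linalg.eig(R)
print(eigenvalues)  # approximately [0.+1.j, 0.-1.j]: complex, not real
```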

Two useful interpretations of eigenvectors, if you regard a square matrix as a rotation + scale transformation:

  1. no other vector, when acted on by this matrix, can be stretched as much as the eigenvectors (see the sketch after this list). (ref: http://math.stackexchange.com/a/243553/235140)
  2. eigenvectors keep the same direction after the transformation. (ref: https://www.quora.com/What-do-eigenvalues-and-eigenvectors-represent-intuitively)
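
The sketch below illustrates interpretation 1 for a symmetric matrix (where the claim holds exactly); the matrix and the number of random vectors are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up symmetric matrix; for symmetric matrices the eigenvector with the
# largest |eigenvalue| is exactly the direction of maximum stretch.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
top = eigenvectors[:, np.argmax(np.abs(eigenvalues))]  # unit-length top eigenvector

print(np.linalg.norm(A @ top))      # the maximum stretch factor (= max |eigenvalue|)
for _ in range(5):
    v = rng.normal(size=2)
    v /= np.linalg.norm(v)          # random unit vector
    print(np.linalg.norm(A @ v))    # never exceeds the value printed above
```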

Now, let’s see some scenarios in the computer science realm where eigenvectors/eigenvalues play a role.

1.  PCA

Now we look at all kinds of matrices, not only square ones. One way to understand a matrix is that it represents a coordinate system. Suppose P \in \mathbb{R}^{m \times k} is a matrix and \vec{x} \in \mathbb{R}^k is a column vector. Then one way to interpret P \vec{x}=\vec{y} is that every row of P is a base vector of its coordinate system. Every component of \vec{y} is the dot product of a row of P and \vec{x}, which can be seen as a projection of \vec{x} onto the corresponding row of P.
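
A tiny sketch of this row-by-row reading of P \vec{x}=\vec{y} (all numbers are arbitrary):

```python
import numpy as np

# A made-up 2x3 matrix whose rows we treat as base vectors, and an arbitrary input vector.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
x = np.array([2.0, 3.0, 4.0])

y = P @ x
print(y)                        # [2. 7.]
# Each component of y is the dot product of the corresponding row of P with x.
print([row @ x for row in P])   # same values, computed row by row
```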

If we now have a transformation matrix P \in \mathbb{R}^{m \times k} whose rows are base vectors of some coordinate system, and X \in \mathbb{R}^{k \times n} is a data matrix where n is the number of data points and k is the number of dimensions, then Y=PX, Y \in \mathbb{R}^{m \times n}, is a dimension-reduced version of X (when m < k) such that every column of Y is the representation of the corresponding column of X using the base vectors in P.

PCA tries to find a P that not only reduces the dimension of X but also satisfies other desirable properties. What it concludes is that P is the matrix whose rows are eigenvectors of X‘s covariance matrix. (ref: https://czxttkl.com/?p=339, https://www.cs.princeton.edu/picasso/mats/PCA-Tutorial-Intuition_jp.pdf)
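
Here is a minimal PCA sketch following the Y = PX convention above; the data is random and only for illustration (in practice a library such as sklearn does this for you):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data matrix X: k = 3 dimensions, n = 200 points, one column per data point.
k, n, m = 3, 200, 2
X = rng.normal(size=(k, n))

# Center each dimension, then form the k x k covariance matrix of X.
X_centered = X - X.mean(axis=1, keepdims=True)
C = X_centered @ X_centered.T / (n - 1)

# Eigendecomposition of the symmetric covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# P: rows are the m eigenvectors with the largest eigenvalues.
order = np.argsort(eigenvalues)[::-1][:m]
P = eigenvectors[:, order].T          # shape (m, k)

# Y = PX is the dimension-reduced data.
Y = P @ X_centered
print(Y.shape)                        # (2, 200)
```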

2.  PageRank / Stationary distribution of Markov chains

Both try to find eigenvectors with eigenvalue equal to 1: P\vec{x} = \vec{x}. In PageRank, P is a fractional PageRank contribution link matrix in which P_{ij} is the PageRank contributed to webpage i when webpage j links to it, and \vec{x} is the PageRank value vector over all webpages that we want to calculate. When calculating the stationary distribution of a Markov chain, P is simply the transition matrix of pairwise transition probabilities between states (arranged so that P_{ij} is the probability of moving from state j to state i), while \vec{x} is the ultimate stationary distribution over states.
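
A small sketch of the Markov-chain case, using a made-up column-stochastic transition matrix and plain power iteration (real PageRank additionally mixes in a damping factor, which is omitted here):

```python
import numpy as np

# A made-up column-stochastic transition matrix: P[i, j] is the probability of
# moving from state j to state i, so the stationary distribution x satisfies P x = x.
P = np.array([[0.1, 0.4, 0.5],
              [0.6, 0.3, 0.1],
              [0.3, 0.3, 0.4]])

# Power iteration: repeatedly apply P until x stops changing.
x = np.ones(3) / 3
for _ in range(1000):
    x = P @ x

print(x)                       # the stationary distribution
print(np.allclose(P @ x, x))   # True: x is an eigenvector with eigenvalue 1
```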
