PCA和SVD的关系

Update 07/15/2015:

First, we should know how to calculate covariance matrix and correlation matrix.

According to the wikipedia page, we have the calculation procedure of covariance matrix as:

And the calculation procedure of correlation matrix is:

I am going to use an R example to illustrate the calculation procedures. Let’s say we have a `m1` matrix composed of six columns.

v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1,v2,v3,v4,v5,v6)

The covariance matrix of `m1` is:

> cov(m1)
         v1        v2       v3        v4       v5       v6
v1 2.535948 2.3071895 1.300654 1.0849673 1.183007 1.026144
v2 2.307190 2.3790850 1.013072 0.9934641 1.071895 1.052288
v3 1.300654 1.0130719 2.535948 2.2026144 1.300654 1.084967
v4 1.084967 0.9934641 2.202614 2.4869281 1.084967 1.075163
v5 1.183007 1.0718954 1.300654 1.0849673 2.535948 2.379085
v6 1.026144 1.0522876 1.084967 1.0751634 2.379085 2.486928

So, let’s look at `cov(m1)[1,1] = 2.535948` and `cov(m2)[1,2] = 2.3071895`:

# nrow(m1) is the number of data points. 
> (v1-mean(v1)) %*% (v1 - mean(v1) )/ (nrow(m1) - 1)
         [,1]
[1,] 2.535948
> (v1-mean(v1)) %*% (v2 - mean(v2) )/ (nrow(m1) - 1)
        [,1]
[1,] 2.30719

The correlation matrix of `m1` is:

> cor(m1)
          v1        v2        v3        v4        v5        v6
v1 1.0000000 0.9393083 0.5128866 0.4320310 0.4664948 0.4086076
v2 0.9393083 1.0000000 0.4124441 0.4084281 0.4363925 0.4326113
v3 0.5128866 0.4124441 1.0000000 0.8770750 0.5128866 0.4320310
v4 0.4320310 0.4084281 0.8770750 1.0000000 0.4320310 0.4323259
v5 0.4664948 0.4363925 0.5128866 0.4320310 1.0000000 0.9473451
v6 0.4086076 0.4326113 0.4320310 0.4323259 0.9473451 1.0000000

`cor(m1)[1,1]=1.000` and `cor(m1)[1,2]=0.9393083`. Let’s see how they are calculated:

> (v1-mean(v1)) %*% (v1 - mean(v1) )/ (nrow(m1) - 1)/ (sd(v1) * sd(v1))
     [,1]
[1,]    1
> (v1-mean(v1)) %*% (v2 - mean(v2) )/ (nrow(m1) - 1)/ (sd(v1) * sd(v2))
          [,1]
[1,] 0.9393083

Update 07/15/2015. end.

————————————————————————————————————————————————-

现在就是这篇帖子的主题，PCA是什么？以及PCA和SVD的关系。

[gview file=”https://czxttkl.com/wp-content/uploads/2015/06/PCA-SVD-NNMF.pdf” height=”1000px” width=”700px”]

Two examples of visualizing PCA

下图是一个SVD分解的示意图。每个矩阵都代表着一种空间的变化，那么M代表的变化可以分解为：

1. 先经过V这个旋转(Rotation)变化

2. 再经过$latex \Sigma$这个拉伸(Scaling)变化

3. 再经过U这个旋转(Rotation)变化

图中，起始的图形是一个半径为单位长度的圆形。

PCA, as visualized here, is trying to minimize the error between data and the principle component’s direction:

Update on 06/25/2015:

The following is a R script I wrote myself:

# A test file related to my post: https://czxttkl.com/?p=339 
# and https://czxttkl.com/?p=397 
remove(list=ls())
data("iris")
log.ir <- log(iris[,1:2])

# center but not scale by myself
log.ir[,1] <- log.ir[,1] - mean(log.ir[,1])
log.ir[,2] <- log.ir[,2] - mean(log.ir[,2])

ir.pca <- prcomp(log.ir, center = FALSE, scale. = FALSE)

# corresponds to p, x, y in my post.
p <- t(ir.pca$rotation)
x <- t(log.ir)
y <- p %*% x

# cov() calculates covariance matrix of the input matrix columns
# cov_y should have non-zero diaognal elements and close-to-zero non-diagonal elements
cov_y <- cov(t(y), t(y))

# Check if my calculation of covariance of y is consistent with cov_y
y_col2 <- t(y)[,2]
cov_y_2_2 <- (y_col2 %*% y_col2) / (length(y_col2) - 1)

# The red lines are the directions of principle component vectors. They should be orthogonal.
plot(log.ir, xlim=c(-1,1), ylim=c(-1,1))
points(t(y), col="blue")
abline(b=ir.pca$rotation[1,2]/ir.pca$rotation[1,1], a=0, col="red")
abline(b=ir.pca$rotation[2,2]/ir.pca$rotation[2,1], a=0, col="red")

PCA和SVD的关系

Leave a comment

Cancel reply