PCA
First, for one-dimensional data \(Y=(Y_{1},Y_{2},\dots ,Y_{n})\) with sample mean \(\bar{Y}\), the sample variance is
\[S_{y}=\frac{1}{n-1}\sum_{i=1}^n(Y_{i}-\bar{Y})^2
\]
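As a small check of the variance formula, the sketch below evaluates it directly and compares the result with NumPy's `np.var` using `ddof=1`. The data values are made up for illustration.

```python
import numpy as np

# Illustrative one-dimensional data (values are made up).
Y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = Y.size

# Sample variance with the 1/(n-1) factor, written out as in the formula.
S_y = np.sum((Y - Y.mean()) ** 2) / (n - 1)

# NumPy computes the same quantity when ddof=1 is passed.
S_y_numpy = np.var(Y, ddof=1)
print(S_y, S_y_numpy)
```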
For the matrix \(X=[x_{1},x_{2},\dots ,x_{n}]^T\), where each \(x_{i}=(x_{i1},x_{i2},\dots,x_{in})^T\) is a column vector, so that the rows of \(X\) are the samples \(x_{i}^T\):
\[X=[x_{1},x_{2},\dots ,x_{n}]^T=\left[ \begin{matrix}
x_{11}&x_{12}&\dots&x_{1n}\\
x_{21}&x_{22}&\dots&x_{2n} \\
\dots&\dots&\dots&\dots \\
x_{n 1} &x_{n 2} & \dots & x_{n n}
\end{matrix} \right]_{n\times n}
\]
The means of \(X\) can be expressed as follows. The mean of each row (each sample \(x_{i}^T\)) is
\[\bar{X}_{n\times 1}=\left[ \begin{matrix}
\overline{x_{1}^T} \\
\overline{x_{2}^T} \\
\cdots \\
\overline{x_{n}^T}
\end{matrix}\right]=\frac{1}{n}X \cdot 1_{n\times 1}
\]
and the mean of each column (each feature) is
\[\bar{x}_{1\times n}=\left[ \begin{matrix}
\bar{x}_{*1} &
\bar{x}_{*2} &
\cdots &
\bar{x}_{*n}
\end{matrix}\right]=\frac{1}{n}1_{1\times n}\cdot X
\]
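The two mean vectors can be checked numerically. The sketch below forms them as products with a vector of ones, dividing by the number of entries \(n\), and compares them with NumPy's built-in `mean`. The \(3\times 3\) matrix is made up for illustration.

```python
import numpy as np

# Illustrative 3x3 data matrix (values are made up); each row is one sample x_i^T.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
n = X.shape[0]
ones = np.ones((n, 1))

# Row means: X times a column of ones, divided by n. Shape (n, 1).
row_means = X @ ones / n

# Column means: a row of ones times X, divided by n. Shape (1, n).
col_means = ones.T @ X / n

print(row_means.ravel())
print(col_means.ravel())
```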
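The covariance matrix of \(X\) can be expressed as
\[S=\frac{1}{n-1}\sum_{i=1}^n(x_{i}-\bar{x})(x_{i}-\bar{x})^T
\]
where \(\bar{x}\) is the mean sample written as a column vector, that is, the transpose of \(\bar{x}_{1\times n}\). Each term is an outer product, so \(S\) is an \(n\times n\) matrix.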
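To make the covariance formula concrete, the sketch below accumulates the outer products of the centered samples and compares the result with `np.cov`. The random data and its shape (5 samples, 3 features) are assumptions chosen only so that samples and features are easy to tell apart.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 5 samples stored as rows, 3 features each.
X = rng.normal(size=(5, 3))
n, d = X.shape
x_bar = X.mean(axis=0)  # mean sample

# Sum of outer products of the centered samples, scaled by 1/(n-1).
S = np.zeros((d, d))
for x_i in X:
    diff = (x_i - x_bar).reshape(-1, 1)  # column vector
    S += diff @ diff.T
S /= n - 1

# np.cov with rowvar=False treats rows as samples and uses 1/(n-1) by default.
S_numpy = np.cov(X, rowvar=False)
print(np.allclose(S, S_numpy))
```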
Project each vector \(x_{i}\) onto a unit vector \(v\) to obtain a new vector \(s_{i}\), whose length (up to sign) is
\[\mid\mid s_{i}\mid\mid_{2}=v\cdot x_{i}=v^{T}x_{i}=x_{i}^Tv
\]
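We look for the direction \(v\) that maximizes the variance of the projected vectors. Assuming the data have been centered so that \(\bar{x}=0\), this variance is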
\[\begin{align}
S_{v}&=\frac{1}{n-1}\sum_{i=1}^n\mid\mid s_{i}\mid\mid_{2}^2=\frac{1}{n-1}\sum_{i=1}^n(v^{T}x_{i}\cdot x_{i}^Tv) \\
&=\frac{1}{n-1}\sum_{i=1}^n v^{T} x_{i}x_{i}^{T} v \\
&=\frac{1}{n-1}v^T\left[ \begin{matrix}
x_{1}&x_{2} & \cdots & x_{n}
\end{matrix}\right]\left[ \begin{matrix}
x_{1}^T \\
x_{2}^T \\
\cdots \\
x_{n}^T
\end{matrix}\right]v \\
&=\frac{1}{n-1}v^TX^TXv=v^{T} \frac{1}{n-1}X^TXv=v^TSv
\end{align}
\]
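The identity \(S_{v}=v^TSv\) can be verified numerically. The sketch below assumes centered data with samples stored as rows, as in the definition of \(X\). The random data and the direction `v` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 200 samples stored as rows, 3 features each.
X = rng.normal(size=(200, 3))
Xc = X - X.mean(axis=0)  # center the data so the mean sample is zero
n = Xc.shape[0]

# Covariance matrix of centered data with samples as rows.
S = Xc.T @ Xc / (n - 1)

# An arbitrary unit direction to project onto.
v = rng.normal(size=3)
v /= np.linalg.norm(v)

# Signed projection lengths v^T x_i for every sample.
s = Xc @ v

var_direct = np.sum(s ** 2) / (n - 1)  # variance of the projections
var_quadratic = v @ S @ v              # quadratic form v^T S v
print(var_direct, var_quadratic)
```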
The core problem of PCA is therefore
\[\max_{v} S_{v} \quad \text{s.t.} \quad v^Tv=1
\]
Construct the Lagrangian
\[G=v^TSv-\lambda (v^Tv-1)
\]
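Differentiating \(G\) with respect to the vector \(v\) and setting the result to zero gives
\[\frac{\partial G}{\partial v}=2S\cdot v-2\lambda v=0\implies Sv=\lambda v
\]
Hence \(v\) must be an eigenvector of \(S\) with eigenvalue \(\lambda\). Since \(S_{v}=v^TSv=\lambda v^Tv=\lambda\), the variance is maximized by the eigenvector belonging to the largest eigenvalue of \(S\).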
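The eigenvalue condition can be illustrated with a short sketch. It assumes centered data with samples as rows and uses `np.linalg.eigh`, which returns the eigenvalues of a symmetric matrix in ascending order. The data, the per-feature scaling, and the random comparison directions are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 300 samples with a different spread in each feature.
X = rng.normal(size=(300, 3)) * np.array([3.0, 1.0, 0.3])
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (Xc.shape[0] - 1)

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order
# and the eigenvectors are the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(S)
lam = eigvals[-1]   # largest eigenvalue
v = eigvecs[:, -1]  # corresponding unit eigenvector (first principal direction)

# Compare against the projected variance along many random unit directions.
dirs = rng.normal(size=(1000, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
variances = np.einsum("ij,jk,ik->i", dirs, S, dirs)  # u^T S u for each row u

print(np.allclose(S @ v, lam * v))
print(variances.max() <= lam + 1e-12)
```

No random direction should exceed the variance along `v`, which matches the conclusion that the top eigenvector solves the constrained maximization.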
