## Minimizing the sum of squared errors

$min\ {J} = \sum_{n=1}^{N}(y_n - \overrightarrow{w}^T * \overrightarrow{x_n})^2 \cdots \cdots (1)$

$\frac{\partial{J}}{\partial{\overrightarrow{w}}}=-2\sum_{n=1}^{N}{(y_n - \overrightarrow{w}^T * \overrightarrow{x_n})*\overrightarrow{x_n}}=0 \cdots \cdots (2)$

$\sum_{n=1}^{N}{(\overrightarrow{w}^T * \overrightarrow{x_n}*\overrightarrow{x_n})} = \sum_{n=1}^{N}{(y_n * \overrightarrow{x_n})} \cdots \cdots (3)$

$\sum_{n=1}^{N}{[(\overrightarrow{w}^T * \overrightarrow{x_n})*\overrightarrow{x_n}]} = \sum_{n=1}^{N}{[\overrightarrow{x_n} * (\overrightarrow{w}^T * \overrightarrow{x_n})]} = \sum_{n=1}^{N}{[\overrightarrow{x_n} * (\overrightarrow{x_n}^T * \overrightarrow{w})]} = [\sum_{n=1}^{N}{(\overrightarrow{x_n} * \overrightarrow{x_n}^T)}]*\overrightarrow{w} = \sum_{n=1}^{N}{(y_n * \overrightarrow{x_n})} \cdots \cdots (4)$

$X = \begin{bmatrix} \overrightarrow{x_1} & \overrightarrow{x_2} & \cdots & \overrightarrow{x_N} \end{bmatrix}= \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{N1} \\ \vdots & \vdots & \cdots & \vdots \\ x_{1M} & x_{2M} & \cdots & x_{NM} \end{bmatrix}$

$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}$

$(XX^T)*\overrightarrow{w} = XY \cdots \cdots (5)$

$\overrightarrow{w} = (XX^T)^{-1}(XY) \cdots \cdots (6)$
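
Equation (6) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `fit_normal_equation`, the row-per-sample input layout, and the sample data are illustrative and not part of the original derivation. It builds $X$ with one column per sample and a leading row of ones, as defined above, and solves equation (5) with `numpy.linalg.solve` instead of forming the inverse explicitly.

```python
import numpy as np


def fit_normal_equation(features, targets):
    """Solve (X X^T) w = X Y for w, following equations (5) and (6).

    features: array of shape (N, M), one row per sample (hypothetical layout);
              a 1-D array is treated as N samples with M = 1.
    targets:  array of shape (N,).
    Returns w of shape (M + 1,), with w[0] the intercept.
    """
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if features.ndim == 1:
        features = features[:, None]
    n_samples = features.shape[0]
    # X is (M + 1) x N: a row of ones stacked on top of the transposed features.
    X = np.vstack([np.ones(n_samples), features.T])
    # Solving the linear system avoids computing (X X^T)^{-1} explicitly.
    return np.linalg.solve(X @ X.T, X @ targets)


# Illustrative data lying exactly on y = 1 + 2x.
w = fit_normal_equation([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w)
```

If $XX^T$ is singular (for example, when all samples share the same feature values), `numpy.linalg.solve` raises `LinAlgError`, which matches the fact that equation (6) requires the inverse to exist.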

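For a single feature ($M = 1$), with $\overrightarrow{x_n} = [1, x_n]^T$ and $\overrightarrow{w} = [w_0, w_1]^T$, equation (6) expands to:

$\begin{bmatrix} w_0 \\ w_1 \end{bmatrix} = \frac{1}{N\sum{x_n^2} - (\sum{x_n})^2} \begin{bmatrix} (\sum{x_n^2}) * (\sum{y_n}) - (\sum{x_n}) * (\sum{x_ny_n}) \\ N(\sum{x_ny_n}) - (\sum{x_n})(\sum{y_n}) \end{bmatrix} \cdots \cdots (7)$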
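
Equation (7) translates directly into a few running sums. The sketch below is illustrative only: the function name `fit_line_closed_form` and the sample data are assumptions, not something taken from the original post.

```python
def fit_line_closed_form(xs, ys):
    """Return (w0, w1) from equation (7) for the model y = w0 + w1 * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Denominator of equation (7); it is zero when all x values coincide.
    det = n * sum_xx - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; the line is not unique")
    w0 = (sum_xx * sum_y - sum_x * sum_xy) / det
    w1 = (n * sum_xy - sum_x * sum_y) / det
    return w0, w1


# Illustrative data lying exactly on y = 1 + 2x.
w0, w1 = fit_line_closed_form([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w0, w1)  # prints: 1.0 2.0
```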

## Minimizing the sum of squared point-to-line distances

$min\ J = \frac{\sum_{n=1}^{N}(ax_n + by_n + c)^2}{a^2 + b^2} \cdots \cdots (8)$

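Scaling $(a, b, c)$ by any nonzero constant leaves (8) unchanged, so we may fix $a^2 + b^2 = 1$ and solve the equivalent constrained problem:

$min\ J = \sum_{n=1}^{N}(ax_n + by_n + c)^2 \cdots \cdots (9)$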

$s.t. \ a^2 + b^2 = 1$

$L(a, b, c, \lambda) = \sum_{n=1}^{N}(ax_n + by_n + c)^2 + \lambda(a^2 + b^2 - 1) \cdots \cdots (10)$

\begin{align*} \frac{\partial{L}}{\partial{a}} & = 2\sum{(ax_n + by_n + c)x_n} + 2\lambda a & = 0 & \cdots \cdots (11) \\ \frac{\partial{L}}{\partial{b}} & = 2\sum{(ax_n + by_n + c)y_n} + 2\lambda b & = 0 & \cdots \cdots (12) \\ \frac{\partial{L}}{\partial{c}} & = 2\sum{(ax_n + by_n + c)} & = 0 & \cdots \cdots (13) \end{align*}

From equation (13):

$c = -\frac{1}{N}\sum{(ax_n + by_n)} \cdots \cdots (14)$

Let $\overrightarrow{w} = [a, b]^T$ and $\overrightarrow{s} = [x, y]^T$. Then $c = -\frac{1}{N}\sum{\overrightarrow{w}^T * \overrightarrow{s_n}} = -\overrightarrow{w}^T * \overrightarrow{s_0}$, where $\overrightarrow{s_0} = \frac{1}{N}\sum{\overrightarrow{s_n}}$. Substituting into the Lagrangian gives:

$L(a, b, c, \lambda) = \sum{(\overrightarrow{w}^T * \overrightarrow{s_n} - \overrightarrow{w}^T * \overrightarrow{s_0})^2} + \lambda (\overrightarrow{w}^T * \overrightarrow{w} - 1) \cdots \cdots (15)$

\begin{align*} \frac{\partial{L(\overrightarrow{w}, \lambda)}}{\partial{\overrightarrow{w}}} & = 2\sum{[\overrightarrow{w}^T(\overrightarrow{s_n} - \overrightarrow{s_0})*(\overrightarrow{s_n} - \overrightarrow{s_0})]} + 2\lambda \overrightarrow{w} & \\ & = 2\sum{(\overrightarrow{s_n} - \overrightarrow{s_0})(\overrightarrow{s_n} - \overrightarrow{s_0})^T\overrightarrow{w} + 2\lambda \overrightarrow{w}} & \cdots \cdots (16)\\ & = 0 & \end{align*}

$S = [\overrightarrow{s_1} - \overrightarrow{s_0}, \overrightarrow{s_2} - \overrightarrow{s_0}, \cdots, \overrightarrow{s_N} - \overrightarrow{s_0}] = \begin{bmatrix} x_1 - x_0 & x_2 - x_0 & \cdots & x_N - x_0 \\ y_1 - y_0 & y_2 - y_0 & \cdots & y_N - y_0 \end{bmatrix}$

$A = \sum{(\overrightarrow{s_n} - \overrightarrow{s_0})(\overrightarrow{s_n} - \overrightarrow{s_0})^T} = SS^T \cdots \cdots (17)$

$\frac{\partial{L(\overrightarrow{w}, \lambda)}}{\partial{\overrightarrow{w}}} = 2A\overrightarrow{w} + 2\lambda \overrightarrow{w} = 0 \cdots \cdots (18)$

$A\overrightarrow{w} = -\lambda \overrightarrow{w} \cdots \cdots (19)$

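$A$ is a $2 \times 2$ matrix with two eigenvalues and two eigenvectors, so the problem has two candidate solutions. Because $A$ is real and symmetric, its two eigenvectors are orthogonal, which means two mutually perpendicular lines correspond to the two stationary points of the original problem. At a unit eigenvector the objective equals $\overrightarrow{w}^T A \overrightarrow{w}$, which is the corresponding eigenvalue $-\lambda$, so the eigenvector of the smaller eigenvalue is the optimal solution.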
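
The eigenvector recipe can be sketched in a few lines, assuming NumPy is available. The function name `fit_line_orthogonal` and the sample points are hypothetical. The code centers the points on $\overrightarrow{s_0}$, forms $A = SS^T$ as in equation (17), and takes the eigenvector of the smaller eigenvalue as $[a, b]^T$. The sign of the returned $(a, b, c)$ is arbitrary, since negating all three describes the same line.

```python
import numpy as np


def fit_line_orthogonal(xs, ys):
    """Fit a*x + b*y + c = 0 with a^2 + b^2 = 1 by minimizing squared distances."""
    points = np.column_stack([xs, ys]).astype(float)  # shape (N, 2)
    centroid = points.mean(axis=0)                     # s_0
    S = (points - centroid).T                          # 2 x N matrix of centered points
    A = S @ S.T                                        # equation (17)
    # eigh handles symmetric matrices and returns eigenvalues in ascending
    # order, so column 0 is the eigenvector of the smaller eigenvalue.
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    a, b = eigenvectors[:, 0]
    c = -(a * centroid[0] + b * centroid[1])           # c = -w^T s_0
    return a, b, c


# Illustrative points on the vertical line x = 2, which the model
# y = w0 + w1 * x from the first section cannot represent.
a, b, c = fit_line_orthogonal([2.0, 2.0, 2.0], [0.0, 1.0, 5.0])
print(a, b, c)
```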

posted on 2015-08-08 14:29  balabala已被注册