Gradient Descent and the Normal Equation

Normal equation: a method for solving for θ analytically

For the cost function

\[J\left( {{\theta _0},{\theta _1},...,{\theta _n}} \right) = \frac{1}{{2m}}\sum\limits_{i = 1}^m {{{\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)}^2}} \]

it is enough to require that every partial derivative vanishes,

\[\frac{\partial }{{\partial {\theta _0}}}J\left( \theta  \right) = \frac{\partial }{{\partial {\theta _1}}}J\left( \theta  \right) =  \cdots  = \frac{\partial }{{\partial {\theta _n}}}J\left( \theta  \right) = 0\]

in order to obtain all of the parameters directly:

\[{{\theta _0},{\theta _1},...,{\theta _n}}\]
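
One way to see what this condition amounts to is to write it in matrix form. The sketch below assumes the linear hypothesis of linear regression, which the text does not state explicitly, and uses X for the matrix whose i-th row is the i-th training input with a leading 1, and y for the vector of outputs:

\[{h_\theta }\left( x \right) = {\theta _0} + {\theta _1}{x_1} + \cdots + {\theta _n}{x_n}\]

Differentiating the cost function with respect to a single parameter (with the convention that the 0-th feature is always 1) gives

\[\frac{\partial }{{\partial {\theta _j}}}J\left( \theta  \right) = \frac{1}{m}\sum\limits_{i = 1}^m {\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)x_j^{\left( i \right)}} ,\quad x_0^{\left( i \right)} = 1\]

Stacking these n + 1 equations for j = 0, 1, ..., n and setting them to zero yields a linear system in θ:

\[\frac{1}{m}{X^T}\left( {X\theta  - y} \right) = 0\quad  \Rightarrow \quad {X^T}X\theta  = {X^T}y\]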

The solution of the system of equations above is

\[\theta  = {\left( {{X^T}X} \right)^{ - 1}}{X^T}y\]

where

\[X = \left[ {\begin{array}{*{20}{c}}
1&{x_1^{\left( 1 \right)}}&{x_2^{\left( 1 \right)}}& \cdots &{x_n^{\left( 1 \right)}}\\
1&{x_1^{\left( 2 \right)}}&{x_2^{\left( 2 \right)}}& \cdots &{x_n^{\left( 2 \right)}}\\
 \vdots & \vdots & \vdots & \ddots & \vdots \\
1&{x_1^{\left( m \right)}}&{x_2^{\left( m \right)}}& \cdots &{x_n^{\left( m \right)}}
\end{array}} \right]\]

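is the matrix of input variables (one row per training example, with a leading 1 for the intercept term), and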

\[y = \left[ {\begin{array}{*{20}{c}}
{{y^{\left( 1 \right)}}}\\
{{y^{\left( 2 \right)}}}\\
 \vdots \\
{{y^{\left( m \right)}}}
\end{array}} \right]\]

is the vector of corresponding output values.
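
As a rough illustration of the closed-form solution, here is a minimal NumPy sketch. The function name `normal_equation` and the toy data are hypothetical and chosen only for this example. It solves the linear system with `np.linalg.solve` rather than forming the inverse explicitly, which gives the same θ whenever the matrix is invertible.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta solving (X^T X) theta = X^T y.

    Assumes X^T X is invertible; X must already contain the column of ones.
    """
    # Solving the system is equivalent to theta = (X^T X)^{-1} X^T y
    # but avoids computing the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical toy data generated exactly by y = 1 + 2 * x
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # leading column of ones for theta_0
y = 1.0 + 2.0 * x

theta = normal_equation(X, y)  # theta[0] is the intercept, theta[1] the slope
```

Because the toy data lie exactly on a line, the recovered parameters should match the intercept and slope used to generate them, up to floating-point rounding.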

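\[\begin{array}{l|l}
\text{Gradient Descent} & \text{Normal Equation}\\
\hline
\text{Need to choose } \alpha & \text{No need to choose } \alpha\\
\text{Needs many iterations} & \text{Don't need to iterate}\\
O\left( {k{n^2}} \right) & O\left( {{n^3}} \right)\text{; need to compute } {\left( {{X^T}X} \right)^{ - 1}}\\
\text{Works well even when } n \text{ is large} & \text{Slow if } n \text{ is very large}
\end{array}\]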
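
To make the contrast concrete, the following is a minimal batch gradient descent sketch for the same cost function, compared against the closed-form answer. The function name `gradient_descent`, the learning rate `alpha=0.1`, the iteration count, and the toy data are all illustrative assumptions rather than values from the text.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Minimize J(theta) = (1 / 2m) * ||X theta - y||^2 by batch gradient descent."""
    m, num_params = X.shape
    theta = np.zeros(num_params)
    for _ in range(iterations):
        # Gradient of J with respect to theta: (1 / m) * X^T (X theta - y)
        gradient = X.T @ (X @ theta - y) / m
        theta = theta - alpha * gradient
    return theta

# Hypothetical toy data generated exactly by y = 1 + 2 * x
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_gd = gradient_descent(X, y)             # iterative, needs alpha
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)  # closed form, no alpha
```

On this small problem both approaches should arrive at essentially the same parameters. Gradient descent needs a suitable α and enough iterations to get there, while the normal equation needs neither.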

If the matrix \(X^T X\) is not invertible, its pseudo-inverse can be computed instead.
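
A small sketch of the pseudo-inverse variant, using NumPy's `np.linalg.pinv`. The toy data below are hypothetical: one feature column is deliberately duplicated so that the matrix is singular. The explicit `rcond=1e-10` cutoff is also an assumption of this example, chosen so that the numerically-zero singular value is discarded reliably.

```python
import numpy as np

# Hypothetical data with a duplicated feature column, so X^T X is singular
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x, x])
y = 1.0 + 2.0 * x

# theta = pinv(X^T X) X^T y; the ordinary inverse does not exist here
theta = np.linalg.pinv(X.T @ X, rcond=1e-10) @ X.T @ y

predictions = X @ theta
```

With redundant features there are infinitely many parameter vectors that fit the data equally well. The pseudo-inverse picks the one with the smallest norm, which here splits the slope evenly between the two identical columns.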

posted @ 2018-10-23 22:39  qkloveslife