Gradient Descent and the Normal Equation

Normal equation: a method to solve for θ analytically

For the cost function

\[J\left( {{\theta _0},{\theta _1},...,{\theta _n}} \right) = \frac{1}{{2m}}\sum\limits_{i = 1}^m {{{\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)}^2}} \]

if we require that every partial derivative vanishes,

\[\frac{\partial }{{\partial {\theta _0}}}J\left( \theta \right) = \frac{\partial }{{\partial {\theta _1}}}J\left( \theta \right) = \cdots = \frac{\partial }{{\partial {\theta _n}}}J\left( \theta \right) = 0\]

then all of the parameters can be obtained directly:

\[{{\theta _0},{\theta _1},...,{\theta _n}}\]
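
As a sketch of why this condition leads to a closed form, the cost can be rewritten in matrix notation and differentiated. This assumes the linear hypothesis \(h_\theta(x) = \theta^T x\), which the post does not state explicitly, and uses the matrix X and vector y defined further below:

\[J\left( \theta \right) = \frac{1}{2m}{\left( X\theta - y \right)^T}\left( X\theta - y \right), \qquad \nabla_\theta J\left( \theta \right) = \frac{1}{m}{X^T}\left( X\theta - y \right)\]

Setting the gradient to zero gives the linear system

\[{X^T}X\theta = {X^T}y\]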

The solution that satisfies the equalities above is

\[\theta = {\left( {{X^T}X} \right)^{ - 1}}{X^T}y\]

where

\[X = \left[ \begin{array}{ccccc}
1 & x_1^{\left( 1 \right)} & x_2^{\left( 1 \right)} & \cdots & x_n^{\left( 1 \right)}\\
1 & x_1^{\left( 2 \right)} & x_2^{\left( 2 \right)} & \cdots & x_n^{\left( 2 \right)}\\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_1^{\left( m \right)} & x_2^{\left( m \right)} & \cdots & x_n^{\left( m \right)}
\end{array} \right]\]

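is the matrix of input variables, with one row per training example and a leading 1 for the intercept term, and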

\[y = \left[ \begin{array}{c}
{y^{\left( 1 \right)}}\\
{y^{\left( 2 \right)}}\\
\vdots \\
{y^{\left( m \right)}}
\end{array} \right]\]

is the vector of corresponding output values.
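
The closed-form solution can be illustrated with a minimal sketch in plain Python. The helper names (`transpose`, `matmul`, `solve`, `normal_equation`) and the sample data are hypothetical and not from the original post. Rather than forming the inverse explicitly, the sketch solves the equivalent linear system XᵀXθ = Xᵀy by Gaussian elimination, which is one common way to evaluate the formula.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def solve(a, b):
    """Solve a @ x = b (a square) by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [a | b] as a copy so the inputs are untouched.
    aug = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # crude absolute threshold (assumption)
            raise ValueError("X^T X is singular: no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def normal_equation(features, y):
    """Return theta solving X^T X theta = X^T y, where X gets a leading column of ones."""
    X = [[1.0] + list(row) for row in features]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Hypothetical training set generated from y = 1 + 2*x1 + 3*x2 (no noise).
features = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in features]
theta = normal_equation(features, y)
print(theta)
```

Because the data is noise-free, `theta` should come out very close to the generating coefficients (1, 2, 3), up to floating-point rounding.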

Gradient Descent	Normal Equation
Need to choose α	No need to choose α
Needs many iterations	No need to iterate
O(kn²)	O(n³), need to compute (XᵀX)⁻¹
Works well even when n is large	Slow if n is very large
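
To make the gradient descent column of the comparison concrete, here is a minimal batch gradient descent sketch in plain Python. The function name, the sample data, the learning rate α = 0.05, and the iteration count are all assumptions chosen for illustration. In particular, α was picked small enough for this unscaled data to converge, and a noticeably larger value would make the updates diverge.

```python
def gradient_descent(features, y, alpha, iterations):
    """Minimize J(theta) = 1/(2m) * sum((h(x) - y)^2) by batch gradient descent."""
    # Prepend the constant feature 1 so theta[0] acts as the intercept.
    X = [[1.0] + list(row) for row in features]
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        # Prediction error h_theta(x^(i)) - y^(i) for every training example.
        errors = [sum(t * x for t, x in zip(theta, row)) - yi
                  for row, yi in zip(X, y)]
        # Partial derivative of J with respect to each theta_j.
        grad = [sum(e * row[j] for e, row in zip(errors, X)) / m
                for j in range(n)]
        # Update all parameters simultaneously.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical training set generated from y = 1 + 2*x1 + 3*x2 (no noise).
features = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in features]

# alpha and the iteration count are hand-picked assumptions for this data.
theta = gradient_descent(features, y, alpha=0.05, iterations=5000)
print(theta)
```

Unlike the closed form, this needs both a suitable α and thousands of passes over the data to approach the same parameters, but each pass only involves sums over the examples and features and never requires solving an n × n system.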

If the matrix XᵀX is not invertible, its pseudoinverse can be computed instead.

posted @ 2018-10-23 22:39 qkloveslife
