Gradient descent
Tags (space-separated): ML
Contents
1. Flaws of contour plots
A contour plot meets our needs when there are two independent variables and one dependent variable.
But what if there are more than three variables, say 100? Obviously, a contour plot can no longer work!
2. Gradient descent
2.1 First impression
Gradient descent is able to cope with more than three variables. But how does it work?
2.2 Algorithm
// pseudo-code
start
    initialize \(\Theta_1, \Theta_2\) to zero
    keep changing \(\Theta_1, \Theta_2\) to reduce \(J(\Theta_1, \Theta_2)\)
    until we hopefully end up at a minimum
Gradient descent algorithm:
\(\text{repeat until convergence} \quad \{\)
\(\quad \Theta_j := \Theta_j - \alpha\frac{\partial}{\partial\Theta_j}J(\Theta_1,\Theta_2) \quad (\text{for } j = 1, 2)\)
\(\}\)
Notes:
Update \(\Theta_1\) and \(\Theta_2\) simultaneously: compute both partial derivatives first, then assign both new values.
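The update rule above can be sketched in Python. The cost \(J(\Theta_1,\Theta_2) = \Theta_1^2 + \Theta_2^2\) is an assumed example (not from the notes) chosen so the partial derivatives are easy to write down; note how all partials are evaluated before any parameter is overwritten.

```python
# Minimal sketch of gradient descent with a simultaneous update,
# assuming the example cost J(t1, t2) = t1^2 + t2^2.

def gradient_descent(grad, theta, alpha=0.1, iters=100):
    """Repeat theta_j := theta_j - alpha * dJ/dtheta_j for all j at once."""
    for _ in range(iters):
        g = grad(theta)  # evaluate every partial derivative first...
        theta = [t - alpha * gj for t, gj in zip(theta, g)]  # ...then update together
    return theta

# Gradient of the assumed cost J(t1, t2) = t1^2 + t2^2
grad_J = lambda th: [2 * th[0], 2 * th[1]]

theta = gradient_descent(grad_J, [3.0, -4.0])
print(theta)  # both components shrink toward the minimum at (0, 0)
```

Building the new parameter list in one expression, rather than assigning `theta[0]` and then `theta[1]` in turn, is exactly what the "simultaneously update" note warns about.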
2.3 Flaws of gradient descent
Different initializations may converge to different local optima, so gradient descent does not guarantee finding the global minimum.
