Gradient descent


Tags (space-separated): ML

Syllabus


1. Flaws of the contour plot

A contour plot satisfies our needs when there are two independent variables and one dependent variable.

\(\to\)
But what if there are more than three variables? How about 100? Obviously, a contour plot cannot work!

2. Gradient descent

2.1 First impression

Gradient descent is able to cope with more than three variables. But how does it work?

2.2 Algorithm

// pseudo-code

algorithm start

initialize x, y to zero

keep changing x, y to reduce J(x, y) until we hopefully end up at a minimum


Gradient descent algorithm:

\(\to \text{repeat until convergence} \; \{\)

\(\quad \Theta_j := \Theta_j - \alpha\frac{\partial}{\partial\Theta_j}J(\Theta_1,\Theta_2) \quad (\text{for } j = 1, 2)\)

\(\}\)

Notes:

Update simultaneously! Compute both partial derivatives first, then assign the new values to \(\Theta_1\) and \(\Theta_2\); never overwrite \(\Theta_1\) before the derivative with respect to \(\Theta_2\) has been computed.
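The update rule above can be sketched in Python. This is a minimal illustration, not the only way to implement it: the cost function \(J(\Theta_1,\Theta_2)=\Theta_1^2+\Theta_2^2\), its hand-written gradient, and the names `gradient_descent`, `alpha`, and `iters` are all assumptions chosen for the example. Note how both partial derivatives are computed before either parameter changes, which is exactly the simultaneous update.

```python
# Sketch of gradient descent with simultaneous updates.
# Assumed example cost: J(t1, t2) = t1**2 + t2**2 (minimum at the origin).

def gradient_descent(grad, theta1, theta2, alpha=0.1, iters=100):
    """Repeat the update rule for a fixed iteration budget."""
    for _ in range(iters):
        # Compute BOTH partial derivatives before touching theta1/theta2,
        # so the two parameters are updated simultaneously.
        d1, d2 = grad(theta1, theta2)
        theta1 = theta1 - alpha * d1
        theta2 = theta2 - alpha * d2
    return theta1, theta2

# Gradient of the assumed cost J(t1, t2) = t1**2 + t2**2.
grad = lambda t1, t2: (2 * t1, 2 * t2)

t1, t2 = gradient_descent(grad, theta1=5.0, theta2=-3.0)
print(t1, t2)  # both values shrink toward 0, the minimum of J
```

A buggy "sequential" version would update `theta1` first and then use the new `theta1` when computing `d2`; for some cost functions that still converges, but it is not the algorithm stated above.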

2.3 Flaws of gradient descent

Depending on where the parameters are initialized, gradient descent may converge to different local optima rather than the global minimum.
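This flaw is easy to demonstrate in one dimension. The function below, \(J(x) = x^4 - 2x^2\), is a hypothetical example chosen because it has two local minima, at \(x = -1\) and \(x = +1\); the helper name `descend` and the step size are also assumptions for the sketch.

```python
# Different starting points can land in different local minima.
# Assumed example cost: J(x) = x**4 - 2*x**2, with minima at x = -1 and x = +1.

def descend(x, alpha=0.05, iters=200):
    for _ in range(iters):
        x = x - alpha * (4 * x**3 - 4 * x)  # dJ/dx = 4x^3 - 4x
    return x

print(descend(0.5))   # converges near +1
print(descend(-0.5))  # converges near -1
```

Both runs use the same algorithm and the same learning rate; only the initialization differs, yet they end up at different optima.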

posted @ 2022-10-04 21:42 by 44636346