Gradient descent
Tags (space-separated): ML
Contents
1. Flaws of contour plots
A contour plot meets our needs when there are two independent variables and one dependent variable.
But what if there are more than three variables, say 100? Obviously, a contour plot can no longer work!
2. Gradient descent
2.1 First impression
Gradient descent is able to cope with more than three variables. But how does it work?
2.2 Algorithm
// pseudo-code
start
    initialize \(\Theta_1, \Theta_2\) to zero
    keep changing \(\Theta_1, \Theta_2\) to reduce \(J(\Theta_1, \Theta_2)\)
    until we hopefully end up at a minimum
Gradient descent algorithm:
\(\text{repeat until convergence} \quad \{\)
\(\quad \Theta_j := \Theta_j - \alpha\frac{\partial}{\partial\Theta_j}J(\Theta_1,\Theta_2) \quad (\text{for } j = 1, 2)\)
\(\}\)
Notes:
Update \(\Theta_1\) and \(\Theta_2\) simultaneously: compute both partial derivatives first, then assign both new values.
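The update rule above can be sketched in Python. The cost \(J(\Theta_1,\Theta_2) = \Theta_1^2 + \Theta_2^2\) is an assumed example (not from the notes) chosen so the partial derivatives are easy to write down; note how all partials are evaluated before any parameter is overwritten.

```python
# Minimal sketch of gradient descent with a simultaneous update,
# assuming the example cost J(t1, t2) = t1^2 + t2^2.

def gradient_descent(grad, theta, alpha=0.1, iters=100):
    """Repeat theta_j := theta_j - alpha * dJ/dtheta_j for all j at once."""
    for _ in range(iters):
        g = grad(theta)  # evaluate every partial derivative first...
        theta = [t - alpha * gj for t, gj in zip(theta, g)]  # ...then update together
    return theta

# Gradient of the assumed cost J(t1, t2) = t1^2 + t2^2
grad_J = lambda th: [2 * th[0], 2 * th[1]]

theta = gradient_descent(grad_J, [3.0, -4.0])
print(theta)  # both components shrink toward the minimum at (0, 0)
```

Building the new parameter list in one expression, rather than assigning `theta[0]` and then `theta[1]` in turn, is exactly what the "simultaneously update" note warns about.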
2.3 Flaws of gradient descent
Different initializations may converge to different local optima, so gradient descent does not guarantee finding the global minimum.
