NetEase Open Course Notes

Class 2: Gradient Descent

For $n\times n$ matrices $A$, $B$, $C$:

  $tr(AB)=tr(BA)$

  $tr(ABC)=tr(CAB)=tr(BCA)$

  $tr(A)=tr(A^T)$

$tr(\cdot)$ denotes the trace of a matrix, i.e. the sum of its diagonal elements.
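The cyclic and transpose properties above are easy to check numerically; a minimal sketch with random matrices (sizes are arbitrary):

```python
import numpy as np

# Numerical check of the trace identities on random 4x4 matrices.
rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((4, 4)) for _ in range(3))

assert np.isclose(np.trace(A @ B), np.trace(B @ A))          # tr(AB) = tr(BA)
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))  # cyclic shift
assert np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A))  # cyclic shift
assert np.isclose(np.trace(A), np.trace(A.T))                # tr(A) = tr(A^T)
print("all trace identities hold")
```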

 

For $A\in \mathbb{R}^{m\times n}$ and $f:\mathbb{R}^{m\times n}\to\mathbb{R}$:

$\nabla_A f(A)=\left[\dfrac{\partial f(A)}{\partial A_{ij}}\right]_{m\times n}$

 

$\nabla_A\, tr(ABA^TC)=CAB+C^TAB^T$

 

 

 

Least-squares closed-form solution

 

We use $x\theta$ to predict $y$, where

$x=\begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} & \cdots & x_{1n} \\ 1 & x_{21} & x_{22} & x_{23} & \cdots & x_{2n} \\ \vdots & & & & & \vdots \\ 1 & x_{m1} & x_{m2} & x_{m3} & \cdots & x_{mn} \end{bmatrix}$

with $m$ the number of observations and $n$ the number of features, and

$\theta=[\theta_0, \theta_1, \theta_2, \ldots, \theta_n]^T$ is the parameter vector.
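Since the class topic is gradient descent, here is a minimal batch gradient-descent sketch for fitting $\theta$ on this model; the synthetic data, learning rate, and iteration count are illustrative assumptions, not from the lecture:

```python
import numpy as np

# Batch gradient descent for least squares: theta <- theta - alpha * x^T(x theta - y) / m.
# Data, alpha, and iteration count are illustrative choices.
rng = np.random.default_rng(0)
m, n = 100, 2
x = np.hstack([np.ones((m, 1)), rng.standard_normal((m, n))])  # prepend intercept column
theta_true = np.array([1.0, 2.0, -3.0])
y = x @ theta_true + 0.01 * rng.standard_normal(m)

theta = np.zeros(n + 1)
alpha = 0.1
for _ in range(1000):
    grad = x.T @ (x @ theta - y) / m  # gradient of (1/2m) * ||x theta - y||^2
    theta -= alpha * grad
print(theta)  # close to theta_true
```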

To minimize the squared error, we set the gradient with respect to $\theta$ to zero, which yields the normal equation:

$x^T x \theta = x^T y$

$\theta=(x^T x)^{-1} x^T y$
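The closed form can be applied directly; a small sketch on synthetic data (names and values are illustrative), solving the linear system rather than forming the inverse explicitly:

```python
import numpy as np

# Normal-equation fit on a toy dataset (data and names are illustrative).
rng = np.random.default_rng(2)
m, n = 50, 3
X = np.hstack([np.ones((m, 1)), rng.standard_normal((m, n))])  # intercept column
theta_true = np.array([2.0, -1.0, 0.5, 3.0])
y = X @ theta_true + 0.01 * rng.standard_normal(m)

# theta = (X^T X)^{-1} X^T y, computed via a linear solve for numerical stability.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # close to theta_true
```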

 

 

 

posted @ 2014-03-09 08:44 by yjjsdu