September 2012 Archives
Abstract: Stochastic gradient descent minimizes a cost function: $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j}J(\theta)$, while gradient ascent maximizes a likelihood function: $\theta_j := \theta_j + \alpha \frac{\partial}{\partial \theta_j}\ell(\theta)$.
Read more
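A minimal NumPy sketch of the two update rules, one descent step on a squared-error cost and one ascent step on a logistic log-likelihood; the function names and toy data here are my own illustration, not from the post:

```python
import numpy as np

def sgd_step_linear(theta, x_i, y_i, alpha):
    """One stochastic gradient DESCENT step for linear regression:
    theta_j := theta_j - alpha * d/dtheta_j J(theta), using a single example."""
    error = x_i @ theta - y_i           # h_theta(x) - y
    return theta - alpha * error * x_i  # subtract the gradient of the squared-error cost

def ascent_step_logistic(theta, x_i, y_i, alpha):
    """One stochastic gradient ASCENT step for logistic regression:
    theta_j := theta_j + alpha * d/dtheta_j l(theta), maximizing the log-likelihood."""
    h = 1.0 / (1.0 + np.exp(-(x_i @ theta)))  # sigmoid hypothesis
    return theta + alpha * (y_i - h) * x_i    # add the gradient of the log-likelihood

# Toy usage on a single training example
theta = np.zeros(3)
x_i, y_i = np.array([1.0, 2.0, -1.0]), 1.0
theta = sgd_step_linear(theta, x_i, y_i, alpha=0.01)
theta = ascent_step_logistic(theta, x_i, y_i, alpha=0.01)
```

The sign of the update is the only structural difference: descent moves against the gradient of $J(\theta)$, ascent moves along the gradient of $\ell(\theta)$.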
Abstract: Bernoulli distribution: $y \in \{0,1\}$, $\phi = p(y=1)$, $p(y;\phi) = \phi^y (1-\phi)^{1-y}$; the mean of the Bernoulli is given by $\phi$.
Read more
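A short sketch checking the pmf and the mean numerically (the helper name and the value of $\phi$ are assumptions for illustration):

```python
import numpy as np

def bernoulli_pmf(y, phi):
    """p(y; phi) = phi^y * (1 - phi)^(1 - y) for y in {0, 1}."""
    return phi ** y * (1 - phi) ** (1 - y)

phi = 0.3
# The mean E[y] = 0 * p(0) + 1 * p(1) equals phi
mean = 0 * bernoulli_pmf(0, phi) + 1 * bernoulli_pmf(1, phi)
print(mean)  # 0.3

# Empirical check with random draws
samples = np.random.rand(100_000) < phi
print(samples.mean())  # approximately 0.3
```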
Abstract: $x, y \in \mathbb{R}^n$, $x^T y = \sum_{i=1}^n x_i y_i \in \mathbb{R}$; note how this expression generalizes. $X$ is an $m \times n$ matrix; the $j$-th diagonal element of $X^T X$ is $\sum_i X_{ij}^2$, so $\sum_i \sum_j X_{ij}^2 = \sum_j (X^T X)_{jj} = \operatorname{tr}(X^T X)$. $x^{(i)}$ is an $n \times 1$ vector, $\vec{y} \in \mathbb{R}^m$, and $X^T = [x^{(1)}\ x^{(2)}\ \cdots\ x^{(m)}]$, where $m$ is the number of training examples and $n$ ...
Read more
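A quick NumPy check of the trace identity above, on a random design matrix (sizes chosen arbitrarily for the example):

```python
import numpy as np

# X is an m x n design matrix whose rows are the training examples x^(i)
m, n = 5, 3
X = np.random.randn(m, n)

# (X^T X)_{jj} = sum_i X_{ij}^2, so summing the diagonal gives the sum of all squared entries
lhs = np.sum(X ** 2)          # sum_i sum_j X_{ij}^2
rhs = np.trace(X.T @ X)       # tr(X^T X)
print(np.allclose(lhs, rhs))  # True
```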
