Linear Regression

Hypothesis

Linear regression is a very commonly used supervised learning method in machine learning. The hypothesis is:

\[h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n \]

The vectorized version, with the convention \(x_0 = 1\), is:

\[h_\theta(x) = \left[ \begin{matrix} \theta_{0}&\theta_{1}&\cdots&\theta_{n} \end{matrix} \right] \left[ \begin{matrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{matrix} \right] = \theta^T x \]

Training examples are stored row-wise in \(\boldsymbol{X}\), for example:

\[\boldsymbol{X} = \left[ \begin{matrix} x_{0}^{(1)}&x_{1}^{(1)} \\ x_{0}^{(2)}&x_{1}^{(2)} \\ x_{0}^{(3)}&x_{1}^{(3)} \end{matrix} \right], \boldsymbol{\theta} = \left[ \begin{matrix} \theta_{0} \\ \theta_{1} \end{matrix} \right] \]

so that the hypothesis for all training examples can be computed at once:

\[h_\theta(\boldsymbol{X}) = \boldsymbol{X}\boldsymbol{\theta} \]
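
A minimal NumPy sketch of the vectorized hypothesis (the sample values and array names are illustrative assumptions):

```python
import numpy as np

# Three training examples, one feature each; a column of ones is
# prepended so that x_0 = 1 multiplies the intercept theta_0.
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])      # shape (m, n+1)
theta = np.array([0.5, 1.5])    # shape (n+1,)

h = X @ theta                   # h_theta(X) = X theta, shape (m,)
print(h)                        # [3.5, 5.0, 8.0]
```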

Cost Function

\[J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta^{(i)} - y^{(i)})^2 \]

The vectorized version is:

\[J(\theta) = \frac{1}{2m} (\boldsymbol{X}\boldsymbol{\theta} - \vec{\boldsymbol{y}} )^T(\boldsymbol{X}\boldsymbol{\theta} - \vec{\boldsymbol{y}}) \]
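
A sketch of the vectorized cost under the same conventions (the function name and array shapes are assumptions, not a fixed API):

```python
import numpy as np

def cost(X, theta, y):
    """J(theta) = 1/(2m) * (X theta - y)^T (X theta - y)."""
    m = len(y)
    residual = X @ theta - y          # X theta - y, shape (m,)
    return (residual @ residual) / (2 * m)
```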

Gradient Descent

\[\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m}(h_\theta^{(i)} - y^{(i)}) x_0^{(i)} \]

\[\theta_1 = \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m}(h_\theta^{(i)} - y^{(i)}) x_1^{(i)} \]

\[\theta_2 = \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m}(h_\theta^{(i)} - y^{(i)}) x_2^{(i)} \]

\[\cdots \]

In other words, for every \(j\) (updating all \(\theta_j\) simultaneously):

\[\theta_j = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m}(h_\theta^{(i)} - y^{(i)}) x_j^{(i)} \]

The vectorized version is:

\[\boldsymbol{\theta} = \boldsymbol{\theta} - \frac{\alpha}{m} \boldsymbol{X}^T (\boldsymbol{X}\boldsymbol{\theta} - \vec{\boldsymbol{y}}) \]
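
A sketch of batch gradient descent implementing the vectorized update above (the learning rate and iteration count are illustrative):

```python
import numpy as np

def gradient_descent(X, y, theta, alpha=0.01, iterations=1000):
    """Repeat theta := theta - (alpha/m) * X^T (X theta - y)."""
    m = len(y)
    for _ in range(iterations):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
    return theta
```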

Stochastic Gradient Descent

  1. Randomly shuffle the dataset.
  2. For i = 1, …, m:

\[\theta_j = \theta_j - \alpha (h_\theta^{(i)} - y^{(i)}) x_j^{(i)} \]
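
A sketch of one stochastic pass, with np.random.permutation standing in for the shuffle step (function and parameter names are illustrative):

```python
import numpy as np

def sgd_epoch(X, y, theta, alpha=0.01):
    """One shuffled pass over the data, updating theta per example."""
    m = len(y)
    for i in np.random.permutation(m):    # step 1: shuffle
        error = X[i] @ theta - y[i]       # scalar h_theta(x_i) - y_i
        theta = theta - alpha * error * X[i]
    return theta
```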

Regularized Linear Regression

Cost Function

\[\min_\theta \frac{1}{2m} \left[ \sum_{i=1}^m (h_\theta^{(i)} - y^{(i)})^2 + \lambda \sum_{j=1}^{n}\theta_{j}^2 \right] \]
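
A sketch of the regularized cost; note that the penalty sums from \(j = 1\), so \(\theta_0\) is excluded (names are illustrative):

```python
import numpy as np

def regularized_cost(X, theta, y, lam=1.0):
    """J(theta) with an L2 penalty on theta_1..theta_n (theta_0 excluded)."""
    m = len(y)
    residual = X @ theta - y
    penalty = lam * (theta[1:] @ theta[1:])   # lambda * sum_{j>=1} theta_j^2
    return (residual @ residual + penalty) / (2 * m)
```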

Gradient Descent

\[\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m}(h_\theta^{(i)} - y^{(i)}) x_0^{(i)} \]

\[\theta_j = \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m}(h_\theta^{(i)} - y^{(i)}) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right], j \in \{1, 2, \ldots, n \} \]
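
A sketch of a single regularized update step; only \(\theta_1, \ldots, \theta_n\) receive the \(\frac{\lambda}{m}\theta_j\) term (names and defaults are illustrative):

```python
import numpy as np

def regularized_step(X, y, theta, alpha=0.01, lam=1.0):
    """One gradient step; the (lambda/m) * theta_j term skips j = 0."""
    m = len(y)
    grad = (X.T @ (X @ theta - y)) / m   # unregularized gradient
    grad[1:] += (lam / m) * theta[1:]    # add penalty term for j >= 1
    return theta - alpha * grad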

Code

# todo
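
A minimal end-to-end sketch combining the formulas above, fitting synthetic data (the data-generating line \(y = 4 + 3x\), the seed, and the hyperparameters are all illustrative assumptions):

```python
import numpy as np

# Synthetic data: y = 4 + 3x plus noise (illustrative).
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=100)
y = 4 + 3 * x + rng.normal(scale=0.5, size=100)

X = np.column_stack([np.ones_like(x), x])   # prepend x_0 = 1
theta = np.zeros(2)

m, alpha = len(y), 0.1
for _ in range(2000):                       # batch gradient descent
    theta -= (alpha / m) * (X.T @ (X @ theta - y))

print(theta)   # roughly [4, 3]; exact values depend on the noise
```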