
Andrew Ng Machine Learning Notes, by OrangeStar

Week 1

A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E.


1. Supervised and Unsupervised Learning

  • Supervised learning

  The training data comes with the "right answers", and the goal is either to
  predict a continuous-valued output (house price prediction): a regression problem,
  or to assign inputs to discrete categories (tumor diagnosis): a classification problem.


  • Unsupervised learning

In unsupervised learning, the data comes with no labels: every example looks the same.
The algorithm can nevertheless separate the data into several distinct clusters; this is what clustering algorithms do (grouping news articles, separating audio sources).

Summary: the task is to find the structure hidden in the data.

2. Linear Regression (supervised learning)

Notation
m = number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
\((x^{(i)},y^{(i)})\) = one training example (the i-th)

3. Cost Function

\(h_\theta(x) = \theta_0 + \theta_1x\)
\(\theta_i\) are called the model parameters.
\(h\) is called the hypothesis function.

We want to minimize the gap between the prediction \(h_\theta(x)\) and the true value y.

\(J(\theta_0,\theta_1)=\frac{1}{2m}\sum^m_{i=1}\bigl(h_\theta(x^{(i)})-y^{(i)}\bigr)^2\)

J is called the cost function; our goal is to find the values of \((\theta_0,\theta_1)\) that minimize it.
It is also called the squared error function.

Dividing by 2m (rather than m) is chosen so that the factor of 2 produced by differentiating the square cancels, leaving a clean 1/m in the gradient.
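
A minimal sketch of this cost function in Python with NumPy (the names `compute_cost`, `theta0`, and `theta1` are illustrative choices, not from the course):

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Squared error cost J(theta0, theta1) for univariate linear regression."""
    m = len(y)                           # number of training examples
    predictions = theta0 + theta1 * x    # h_theta(x^(i)) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Example: data generated exactly by y = 1 + 2x gives zero cost at (1, 2).
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(compute_cost(1.0, 2.0, x, y))  # 0.0
```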

4. Cost Function J(\(\theta_1\)) - Intuition

Goal -> minimize J(\(\theta_0,\theta_1\))

5. Gradient Descent

Used to find \(\min J(\theta_0,\theta_1)\).

Gradient descent algorithm:
repeat until convergence:

\(\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)\)
(for j=0 and j=1)

(\(\theta_0\) and \(\theta_1\) must be updated simultaneously)

(:= denotes assignment)
(\(\alpha\) is called the learning rate; it controls the size of each "downhill" step, i.e., how big an update the thetas receive)

1. Compute temp0
2. Compute temp1
3. Assign temp0 and temp1 to theta0 and theta1
4. If the stopping condition has not been reached, return to step 1 (see the sketch below)
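
A minimal runnable sketch of this simultaneous-update loop in Python, using a toy cost \(J(\theta_0,\theta_1)=\theta_0^2+\theta_1^2\) (not the regression cost) so that the partial derivatives are simply \(2\theta_0\) and \(2\theta_1\):

```python
# Toy cost J(t0, t1) = t0^2 + t1^2, whose partial derivatives are 2*t0 and 2*t1.
theta0, theta1 = 3.0, -4.0
alpha = 0.1  # learning rate

for _ in range(100):
    temp0 = theta0 - alpha * (2 * theta0)  # step 1: compute temp0 from the OLD values
    temp1 = theta1 - alpha * (2 * theta1)  # step 2: compute temp1 from the OLD values
    theta0, theta1 = temp0, temp1          # step 3: assign both simultaneously

print(theta0, theta1)  # both approach 0, the minimizer of the toy cost
```

Computing both temporaries before assigning either is the whole point: updating \(\theta_0\) first and then using its new value inside the \(\theta_1\) update would be a subtly different algorithm.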

\(\alpha\) plays an important role:

  1. If it is too small, gradient descent will be slow.
  2. If it is too large, gradient descent may fail to converge, or may even diverge.

6. The First Machine Learning Algorithm (Linear Regression)

Gradient descent algorithm
Repeat until convergence
\(\theta_0 := \theta_0 - \alpha\frac1m\sum^m_{i=1}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\)
\(\theta_1 := \theta_1 - \alpha\frac1m\sum_{i=1}^m\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}\)
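
As a quick check (standard calculus, not spelled out in the lecture notes), these two rules come from the partial derivatives of the squared error cost with \(h_\theta(x) = \theta_0 + \theta_1x\):

\(\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)=\frac1m\sum_{i=1}^m\bigl(h_\theta(x^{(i)})-y^{(i)}\bigr)\)

\(\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)=\frac1m\sum_{i=1}^m\bigl(h_\theta(x^{(i)})-y^{(i)}\bigr)x^{(i)}\)

The 2 produced by differentiating the square cancels the 2 in the \(\frac{1}{2m}\) factor, which is exactly why 2m was chosen above.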

Combining this with the previous section, both updates are instances of the general rule:

\(\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)\)
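
Putting the pieces together, a minimal end-to-end sketch of batch gradient descent for univariate linear regression in Python (names like `gradient_descent`, `alpha`, and `num_iters` are illustrative choices, not from the course):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(num_iters):
        error = (theta0 + theta1 * x) - y           # h_theta(x^(i)) - y^(i)
        temp0 = theta0 - alpha * np.sum(error) / m      # update rule for theta0
        temp1 = theta1 - alpha * np.sum(error * x) / m  # update rule for theta1
        theta0, theta1 = temp0, temp1               # simultaneous update
    return theta0, theta1

# Toy data generated from y = 1 + 2x; the fit should recover roughly (1, 2).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
print(gradient_descent(x, y, alpha=0.05, num_iters=5000))
```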

7. Linear Algebra Review

 
