Andrew Ng Machine Learning Notes --- by OrangeStar
Week 1
A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E.
1. Supervised and Unsupervised Learning
- Supervised learning
Given labeled data, learn to predict outputs from inputs. Two main problem types:
predicting a continuous-valued output, e.g., housing prices (regression);
assigning data to discrete categories, e.g., tumor diagnosis (classification).
- Unsupervised learning
In unsupervised learning, the data come with no labels.
An unsupervised algorithm can still separate the data into distinct clusters; this is called a clustering algorithm. (Examples: grouping news stories, separating audio sources.)
Summary: the task is to find structure hidden in the data.
2. Linear Regression (supervised learning)
Notation:
m = number of training examples
x's = "input" variables / features
y's = "output" variable / "target" variable
\((x^{(i)},y^{(i)})\) = the \(i\)-th training example
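A minimal sketch of this notation in Python; the housing numbers below are invented purely for illustration:

```python
# Hypothetical training set: living area (x) -> price (y).
x = [2104.0, 1416.0, 1534.0, 852.0]   # input features
y = [460.0, 232.0, 315.0, 178.0]      # target values

m = len(x)               # m = number of training examples (here, 4)
i = 0                    # Python is 0-indexed; the notes index from 1
example = (x[i], y[i])   # (x^(i), y^(i)) = one training example
print(m, example)        # prints: 4 (2104.0, 460.0)
```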
3. Cost Function
\(h_\theta(x) = \theta_0 + \theta_1x\)
The \(\theta_i\) are called the model parameters.
Hypothesis -> the hypothesis function \(h\)
The goal is to make \(h_\theta(x)\) as close as possible to the true value \(y\).
\(J(\theta_0,\theta_1)=\frac{1}{2m}\sum^m_{i=1}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2\)
Find the minimum of \(J\); this \(J\) is the cost function,
also called the squared error function.
Dividing by \(2m\) rather than \(m\) is chosen because the factor of 2 cancels when taking the derivative, leaving a cleaner gradient.
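As a sketch, the cost function can be computed directly; the data points and parameter values below are made up to show how the cost behaves:

```python
import numpy as np

def h(theta0, theta1, x):
    """Hypothesis: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def J(theta0, theta1, x, y):
    """Squared error cost: (1/2m) * sum over i of (h(x^(i)) - y^(i))^2."""
    m = len(x)
    return np.sum((h(theta0, theta1, x) - y) ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(J(0.0, 1.0, x, y))   # 0.0: theta0=0, theta1=1 fits these points exactly
print(J(0.0, 0.5, x, y))   # ~0.58: a worse fit gives a larger cost
```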
4. Cost Function \(J(\theta_1)\) - Intuition
Goal -> minimize J(\(\theta_0,\theta_1\))
5. Gradient Descent
Used to find \(\min J(\theta_0,\theta_1)\)
Gradient descent algorithm:
repeat until convergence:
\(\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)\)
(for j=0 and j=1)
(\(\theta_0\) and \(\theta_1\) must be updated simultaneously)
(:= denotes assignment)
(\(\alpha\) is called the learning rate; it controls the size of each "downhill step", i.e., how large an update is applied to \(\theta\))
1. Compute temp0
2. Compute temp1
3. Assign temp0 and temp1 to \(\theta_0\) and \(\theta_1\)
4. If the stopping condition has not been reached, return to step 1 (see the sketch after the notes on \(\alpha\) below)
\(\alpha\) plays an important role:
- If \(\alpha\) is too small, gradient descent is too slow.
- If \(\alpha\) is too large, gradient descent may fail to converge, or may even diverge.
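A minimal sketch of steps 1-4 in Python. The helper grad_J and the toy objective are assumptions made for the demo; the actual partial derivatives for linear regression are derived in the next section:

```python
def gradient_descent_step(theta0, theta1, grad_J, alpha):
    """One simultaneous update: grad_J must return (dJ/dtheta0, dJ/dtheta1)."""
    g0, g1 = grad_J(theta0, theta1)
    temp0 = theta0 - alpha * g0   # step 1: compute temp0
    temp1 = theta1 - alpha * g1   # step 2: compute temp1
    return temp0, temp1           # step 3: assign both at once

# Toy objective for the demo: J(t0, t1) = t0^2 + t1^2, gradient (2*t0, 2*t1).
def grad_J(t0, t1):
    return 2 * t0, 2 * t1

theta0, theta1 = 5.0, -3.0
for _ in range(100):   # step 4: repeat until stopping (here, a fixed count)
    theta0, theta1 = gradient_descent_step(theta0, theta1, grad_J, alpha=0.1)
print(theta0, theta1)  # both approach 0, the minimizer of this J
```

Computing both temporaries before assigning is what makes the update simultaneous: temp1 must not be computed from an already-updated \(\theta_0\).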
6. The First Machine Learning Algorithm (Linear Regression)
Gradient descent algorithm
Repeat until convergence:
\(\theta_0 := \theta_0 - \alpha\frac1m\sum^m_{i=1}\left(h_\theta(x^{(i)}) - y^{(i)}\right)\)
\(\theta_1 := \theta_1 - \alpha\frac1m\sum_{i=1}^m\left(h_\theta(x^{(i)}) - y^{(i)}\right)x^{(i)}\)
Combining with the previous lesson, both updates are instances of:
\(\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)\)
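Putting the pieces together, a sketch of batch gradient descent for linear regression using the two update rules above; the data, learning rate, and iteration count are illustrative choices, not values from the course:

```python
import numpy as np

def linear_regression_gd(x, y, alpha=0.01, num_iters=2000):
    """Fit h_theta(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(num_iters):
        error = theta0 + theta1 * x - y                  # h_theta(x^(i)) - y^(i)
        temp0 = theta0 - alpha * np.sum(error) / m       # update rule for theta0
        temp1 = theta1 - alpha * np.sum(error * x) / m   # update rule for theta1
        theta0, theta1 = temp0, temp1                    # simultaneous update
    return theta0, theta1

# Illustrative data scattered around the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
theta0, theta1 = linear_regression_gd(x, y)
print(theta0, theta1)   # roughly 1.0 and 2.0
```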
7. Linear Algebra Review
(omitted)
