Andrew Ng, Lesson 1
Week 1 Introduction
Supervised Learning
Requires both inputs (x) and outputs (y)
- image data → CNN
- one-dimensional sequence data → RNN
Structured Data (database tables) vs. Unstructured Data (audio, images, text)
forward propagation step
backward propagation step
Week 2 Basics of Neural Network
Binary Classification
feature vector of dimension \(n_x\): \(64\times64\times3 \to 12288\times1\)
That is, the input has 12288 features.
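The flattening step above can be sketched in NumPy (a random array stands in for a real image; the variable names are illustrative):

```python
import numpy as np

# A hypothetical 64x64 RGB image; random values stand in for pixel data
img = np.random.rand(64, 64, 3)

# Flatten into a single feature column vector x of shape (n_x, 1) = (12288, 1)
x = img.reshape(-1, 1)
print(x.shape)  # (12288, 1)
```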
Notation
- (x, y): a single training example
- training set : \(\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), ... , (x^{(m)}, y^{(m)})\}\)
- \(m_{train}, m_{test}\): the number of training and test examples
- \(X \in \mathbb{R}^{n_x\times m}\): the matrix formed by all input examples \(x^{(1)}, x^{(2)}, \ldots, x^{(m)}\) stacked as columns, with \(n_x\) rows (the feature dimension) and \(m\) columns (the number of examples); X.shape = (n_x, m)
- \(Y \in \mathbb{R}^{1 \times m}\): the matrix formed by all outputs \(y^{(1)}, y^{(2)}, \ldots, y^{(m)}\); Y.shape = (1, m)
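The notation above maps directly onto NumPy arrays. A minimal sketch (with a small made-up \(n_x\) and \(m\), and hypothetical variable names) of building X and Y:

```python
import numpy as np

n_x, m = 4, 3  # small hypothetical feature dimension and example count

# Individual examples x^(i), each a column vector of shape (n_x, 1)
examples = [np.random.rand(n_x, 1) for _ in range(m)]
labels = [1, 0, 1]

# X stacks the examples as columns; Y is a row vector of the labels
X = np.hstack(examples)              # X.shape = (n_x, m)
Y = np.array(labels).reshape(1, m)   # Y.shape = (1, m)
print(X.shape, Y.shape)  # (4, 3) (1, 3)
```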
Logistic Regression
A regression algorithm for binary classification problems whose output is 0/1:
Given \(x \in \mathbb{R}^{n_x}\) (here \(n_x = 12288\)), we want \(\hat{y} = P(y=1 \mid x)\)
Parameters of Logistic Regression: \(w \in \mathbb{R}^{n_x}\), \(b \in \mathbb{R}\)
Output:
- Linear regression would give \(\hat{y} = w^Tx + b\)
- To constrain the output to the range 0~1, logistic regression uses \(\hat{y} = \sigma(w^Tx + b)\), where
\[\text{sigmoid}(z) = \sigma(z) = \frac{1}{1+e^{-z}}\]
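The sigmoid and the vectorized forward pass can be sketched as follows (NumPy assumed; `predict` is a hypothetical helper name, operating on X with examples stacked as columns as defined above):

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z}), maps any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, X):
    # Vectorized logistic-regression output y_hat for all m columns of X at once
    return sigmoid(w.T @ X + b)

print(sigmoid(0.0))  # 0.5
```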
Logistic Regression cost function
The loss (error) function measures how well the algorithm does on a single example:
- \(L(\hat{y}, y) = \frac12(\hat{y} - y)^2\) is not used for logistic regression, because it would make the optimization problem non-convex.
\[L(\hat{y}, y) = -\left(y\log\hat{y} + (1-y)\log(1-\hat{y})\right)\]
Logistic regression uses this loss function. Since y is either 1 or 0:
- When y = 1, \(L = -\log\hat{y}\); since \(\hat{y}\) lies between 0 and 1, the larger \(\hat{y}\) is, the smaller \(L\) is.
- When y = 0, \(L = -\log(1 - \hat{y})\); the smaller \(\hat{y}\) is, the smaller \(L\) is.
The cost function measures how well the algorithm does on the entire training set:
\[J(w, b) = \frac1m\sum^{m}_{i=1}L(\hat{y}^{(i)}, y^{(i)}) = -\frac1m\sum^m_{i=1}\left[y^{(i)}\log\hat{y}^{(i)} + (1-y^{(i)})\log(1-\hat{y}^{(i)})\right]\]
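The cost function above can be sketched as a NumPy one-liner over the row vectors \(\hat{y}\) and Y (the example values here are made up):

```python
import numpy as np

def cost(y_hat, Y):
    # J = -(1/m) * sum over examples of [y log(y_hat) + (1-y) log(1-y_hat)]
    m = Y.shape[1]
    return -np.sum(Y * np.log(y_hat) + (1 - Y) * np.log(1 - y_hat)) / m

# Example: confident, mostly-correct predictions give a small cost
Y = np.array([[1, 0]])
y_hat = np.array([[0.9, 0.1]])
print(round(cost(y_hat, Y), 3))  # 0.105
```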
Gradient Descent
Repeat: {
\(w := w - \alpha\frac{\partial{J(w, b)}}{\partial{w}}\)
\(b := b - \alpha\frac{\partial{J(w, b)}}{\partial{b}}\)
}
\(\alpha\) is the learning rate. Repeat these updates until \(w\) and \(b\) converge.
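A minimal sketch of the update loop for logistic regression, assuming NumPy and the X, Y layout defined earlier; the gradient formulas \(dw = \frac1m X(\hat{y}-Y)^T\) and \(db = \frac1m\sum(\hat{y}-Y)\) come from differentiating J, and the tiny 1-D dataset is made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, Y, alpha=0.1, iters=2000):
    # Batch gradient descent for logistic regression: X is (n_x, m), Y is (1, m)
    n_x, m = X.shape
    w = np.zeros((n_x, 1))
    b = 0.0
    for _ in range(iters):
        y_hat = sigmoid(w.T @ X + b)   # forward pass, shape (1, m)
        dz = y_hat - Y                 # per-example error
        dw = X @ dz.T / m              # dJ/dw, shape (n_x, 1)
        db = np.sum(dz) / m            # dJ/db, scalar
        w -= alpha * dw                # w := w - alpha * dJ/dw
        b -= alpha * db                # b := b - alpha * dJ/db
    return w, b

# Tiny 1-D example: negative x -> y = 0, positive x -> y = 1
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0.0, 0.0, 1.0, 1.0]])
w, b = gradient_descent(X, Y)
print((sigmoid(w.T @ X + b) > 0.5).astype(int))  # [[0 0 1 1]]
```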
Derivatives
Omitted.
