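Assignment 01 - Classification with Logistic Regression
This is more or less my first big assignment, and the implementation is still fairly rough.
Task: given 200 images as training samples, train a program that can tell whether an image is a picture of a cat.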
Consider $\mathrm{P(Y=1|X) = \sigma (W^{T}X+b)}$, where $\mathrm{\sigma(z) = \frac{1}{1 + \exp(-z)}}$ is the logistic (sigmoid) function.
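As a minimal sketch of this model, the snippet below computes $\mathrm{P(Y=1|X)}$ for a small batch. The helper name `predict_proba` and the toy numbers are assumptions made for illustration; the one-sample-per-column layout is chosen to match the $\mathrm{W^{T}X+b}$ form above.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(W, b, X):
    # W: (n, 1) weights, b: scalar bias, X: (n, m) with one sample per column.
    # Returns a (1, m) row holding P(Y=1 | x) for every column of X.
    return sigmoid(np.dot(W.T, X) + b)

# Toy example (hypothetical numbers): 2 features, 3 samples.
W = np.array([[1.0], [-1.0]])
b = 0.0
X = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
probs = predict_proba(W, b, X)
print(probs)
```

The first sample gives $\mathrm{z = 0}$ and therefore a probability of exactly 0.5; the other two have $\mathrm{z = \pm 2}$, so their probabilities sum to 1.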
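The objective can be defined as $\mathrm{J = \sum_{i = 1}^{m} \left[ y[i]\ln(a[i]) + (1 - y[i])\ln(1 - a[i]) \right]}$, where $\mathrm{a[i] = \sigma(W^{T}x[i]+b)}$ is the predicted probability for sample $\mathrm{i}$ and $\mathrm{y[i]}$ is its label. In other words, it is the logarithm of the probability the model assigns to the correct labels (the log-likelihood).
The goal is to learn $\mathrm{W, b}$ from the training data so that $\mathrm{J}$ is maximized. For convenience we flip the sign and minimize instead; from here on, $\mathrm{J}$ denotes the negated version.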
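The negated objective can be sketched in a few lines. The function name `negative_log_likelihood` and the `eps` clipping are assumptions added here to keep `log` away from zero; they are not part of the original assignment code.

```python
import numpy as np

def negative_log_likelihood(a, y, eps=1e-12):
    # a: predicted probabilities, y: 0/1 labels, both of shape (1, m).
    # eps is an assumed safeguard so that log() never receives exactly 0.
    a = np.clip(a, eps, 1.0 - eps)
    log_likelihood = np.sum(y * np.log(a) + (1 - y) * np.log(1 - a))
    return -log_likelihood

# Hypothetical predictions for three samples with labels 1, 0, 1.
y = np.array([[1, 0, 1]])
good = np.array([[0.9, 0.1, 0.8]])  # confident and correct
bad = np.array([[0.2, 0.7, 0.4]])   # mostly wrong
loss_good = negative_log_likelihood(good, y)
loss_bad = negative_log_likelihood(bad, y)
print(loss_good, loss_bad)
```

Predictions that put high probability on the correct labels yield the smaller value, which is why minimizing this quantity is a reasonable training target.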
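The forward pass can be computed directly with vectors, and the partial derivatives can then be obtained with backpropagation (also vectorized).
Formula: $\mathrm{\frac{\partial J}{\partial z[i]} = a[i] - y[i]}$, where $\mathrm{z[i] = W^{T}x[i] + b}$ is the input to the sigmoid for sample $\mathrm{i}$ and $\mathrm{J}$ is the negated objective.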
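The formula follows from the chain rule together with $\mathrm{\sigma'(z) = \sigma(z)(1 - \sigma(z))}$. One way to gain confidence in it is a finite-difference check, sketched below; the values of `z`, `y`, and the step size `h` are arbitrary assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(z, y):
    # Negated log-likelihood, written directly as a function of z.
    a = sigmoid(z)
    return -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a))

z = np.array([0.3, -1.2, 2.0])
y = np.array([1.0, 0.0, 1.0])

# Analytic gradient from the formula a[i] - y[i].
analytic = sigmoid(z) - y

# Central finite differences, one coordinate at a time.
h = 1e-6
numeric = np.zeros_like(z)
for i in range(z.size):
    step = np.zeros_like(z)
    step[i] = h
    numeric[i] = (cost(z + step, y) - cost(z - step, y)) / (2 * h)

max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

If the two gradients agree up to a small numerical error, the backward pass for $\mathrm{z}$ is very likely correct.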
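A few points to note:
1. Normalize the pixel values to the range $\mathrm{[0,1]}$ first, because the $\mathrm{sigmoid}$ function saturates when its input is large in magnitude, which makes the gradient vanish.
2. Training for too many iterations leads to overfitting.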
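To illustrate the first point, the sketch below compares the slope of the sigmoid for raw and normalized pixel values. The feature count, the constant pixel value 128, and the constant weight 0.01 are toy assumptions chosen only to make the effect easy to see.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_slope(z):
    # Derivative of the sigmoid: a * (1 - a).
    a = sigmoid(z)
    return a * (1.0 - a)

n = 100                      # assumed number of pixels in the toy "image"
w = np.full(n, 0.01)         # assumed small constant weights
x_raw = np.full(n, 128.0)    # raw pixel values in [0, 255]
x_scaled = x_raw / 255.0     # normalized pixel values in [0, 1]

z_raw = np.dot(w, x_raw)        # about 128: far into the flat region
z_scaled = np.dot(w, x_scaled)  # about 0.5: near the steep region

slope_raw = sigmoid_slope(z_raw)
slope_scaled = sigmoid_slope(z_scaled)
print(z_raw, slope_raw)
print(z_scaled, slope_scaled)
```

With raw pixels the sigmoid output is pinned at 1 and its slope is effectively zero, so gradient descent barely moves the weights; after dividing by 255 the input stays where the sigmoid still responds.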
Code:
    import numpy as np
    import h5py
    import matplotlib.pyplot as plt
    from lr_utils import load_dataset


    def sigmoid(x):
        return 1 / (1.0 + np.exp(-x))


    # Gradient descent for logistic regression
    def function(train_x, train_y, step, lr):
        # train_x: (n, m) with one sample per column, train_y: (1, m)
        n, m = train_x.shape
        w, b = np.random.randn(1, n) * 0.008, 0
        for i in range(step):
            # Forward pass
            z = np.dot(w, train_x) + b
            a = sigmoid(z)
            # Backward pass: dJ/dz = a - y, averaged over the m samples
            dz = a - train_y
            db = np.sum(dz)
            dw = np.dot(train_x, dz.T).T
            dw = dw / m
            db = db / m
            # Parameter update
            w = w - lr * dw
            b = b - lr * db
        return w, b


    train_x, train_y, test_x, test_y, classes = load_dataset()
    # Flatten each image into a column and scale the pixels to [0, 1]
    train_x = train_x.reshape(train_x.shape[0], -1).T / 255
    test_x = test_x.reshape(test_x.shape[0], -1).T / 255

    # Retrain with an increasing number of steps and print the test error rate
    for i in range(0, 1004, 20):
        tw, tb = function(train_x, train_y, i, 0.009)
        a = sigmoid(np.dot(tw, test_x) + tb)
        cc = 0  # number of misclassified test images
        for j in range(0, 50):
            an = 0
            if a[0][j] > 0.5:
                an = 1
            if an != test_y[0][j]:
                cc = cc + 1
        print(i, cc / 50)
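The inner counting loop in the evaluation can also be written without an explicit loop. The following is a sketch only: `error_rate` is a hypothetical helper and the sample values are made up, but it computes the same quantity as `cc / 50` for arrays of shape `(1, m)`.

```python
import numpy as np

def error_rate(a, y, threshold=0.5):
    # a: (1, m) predicted probabilities, y: (1, m) true 0/1 labels.
    # Predict 1 where the probability exceeds the threshold, else 0.
    predictions = (a > threshold).astype(int)
    # Fraction of samples whose prediction differs from the label.
    return float(np.mean(predictions != y))

# Hypothetical probabilities and labels for four samples.
a = np.array([[0.9, 0.2, 0.6, 0.4]])
y = np.array([[1, 0, 0, 1]])
rate = error_rate(a, y)
print(rate)
```

Because it averages over however many columns it receives, this version does not need the number of test images hard-coded.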