Linear Regression Algorithm

Online formula editor: https://www.codecogs.com/latex/eqneditor.php

Introduction

  • Solves regression problems
  • Simple idea, easy to implement
  • The foundation of many powerful nonlinear models
  • Results are highly interpretable
  • Embodies many important ideas in machine learning

  Linear regression treats one dimension of a coordinate system as the output and the other dimensions as features (for example, in a two-dimensional plane the horizontal axis is the feature and the vertical axis is the output). When the training samples are placed in this coordinate system, they are found to be distributed around a straight line. The goal of linear regression is to find the straight line that best "fits" the relationship between the sample features and the sample output labels.

Simple Linear Regression

Model

  An equation that describes the relationship between the dependent variable Y, the independent variable X, and an error term is called a regression model. The simple linear regression model is:

$y=ax+b+\varepsilon $

$a$: the slope of the regression line, i.e., the increase in Y caused by a one-unit increase in X

$b$: the intercept of the regression line on the vertical axis

$\varepsilon$: the error between Y and the regression line; a random variable assumed to follow a normal distribution

  Since the population regression line cannot be known, the true slope and intercept cannot be obtained directly; instead we estimate them from the sample regression line. For each sample point $x^{i}$:

Predicted value: $\widehat{y}^{i}=ax^{i}+b$

True value: $y^{i}$

  We want $y^{i}$ and $\widehat{y}^{i}$ to be as close as possible, so we make $\sum_{i=1}^{m}(y^{i}-\widehat{y}^{i})^{2}$ as small as possible, which gives

$\widehat{y}^{i}=ax^{i}+b$

$a=\frac{\sum_{i=1}^{m}(x^{i}-\overline{x})(y^{i}-\overline{y})}{\sum_{i=1}^{m}(x^{i}-\overline{x})^{2}}$,$b=\overline{y}-a\overline{x}$

Basic Idea

  Find $a$ and $b$ such that $\sum_{i=1}^{m}(y^{i}-ax^{i}-b)^{2}$ is as small as possible.

Least Squares Derivation

  Take the partial derivatives with respect to $a$ and $b$ and set them to zero; the full derivation is omitted.
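
  For completeness, a brief sketch of the omitted steps (standard least-squares algebra): let $J(a,b)=\sum_{i=1}^{m}(y^{i}-ax^{i}-b)^{2}$, set both partial derivatives to zero, and solve.

$\frac{\partial J}{\partial b}=-2\sum_{i=1}^{m}(y^{i}-ax^{i}-b)=0\;\Rightarrow\; b=\overline{y}-a\overline{x}$

$\frac{\partial J}{\partial a}=-2\sum_{i=1}^{m}(y^{i}-ax^{i}-b)x^{i}=0\;\Rightarrow\; a=\frac{\sum_{i=1}^{m}(x^{i}-\overline{x})(y^{i}-\overline{y})}{\sum_{i=1}^{m}(x^{i}-\overline{x})^{2}}$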

Simple Linear Regression Example

Data

  The number of TV ads a car dealer runs each week and the number of cars sold (the same data used in the code below):

  TV ads (x): 1, 3, 2, 1, 3
  Cars sold (y): 14, 24, 18, 17, 27

  How do we find the regression line that best fits the simple linear regression model?

  Suppose that in some week the number of ads is 2; what is the predicted number of cars sold?

Python Implementation

  Plotting the data

import numpy as np
import matplotlib.pyplot as plt
x = np.array([1, 3, 2, 1, 3])
y = np.array([14, 24, 18, 17, 27])
plt.scatter(x, y, c='r')
plt.axis([0, 4, 0, 28])
plt.show()

  Plotting the regression line

x_mean = np.mean(x)
y_mean = np.mean(y)
num = 0.0  # numerator of a
d = 0.0    # denominator of a
for x_i, y_i in zip(x, y):
    num += (x_i - x_mean) * (y_i - y_mean)
    d += (x_i - x_mean) ** 2
a = num/d
b = y_mean - a * x_mean
y_hat = a * x + b  # points on the fitted regression line
plt.scatter(x, y, c='r')
plt.plot(x, y_hat, color='b')
plt.axis([0, 4, 0, 28])
plt.show()

  Plotting the predicted value

x_predict = 2  # a week with 2 TV ads
y_predict = a * x_predict + b
plt.scatter(x, y, c='r')
plt.plot(x, y_hat, color='b')
plt.scatter(x_predict, y_predict, c='g')  # predicted point in green
plt.show()

Code Encapsulation

import numpy as np
import matplotlib.pyplot as plt
class SimpleLinearRegression1:
    def __init__(self):
        # Initialize the Simple Linear Regression model
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        # Fit the Simple Linear Regression model on the training set x_train, y_train
        assert x_train.ndim == 1, \
            "Simple Linear Regression can only solve single feature training data"
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to the size of y_train"
        # Compute the means
        x_mean = x_train.mean()
        y_mean = y_train.mean()
        # Numerator
        num = 0.0
        # Denominator
        d = 0.0
        # Accumulate the numerator and denominator
        for x_i, y_i in zip(x_train, y_train):
            num += (x_i - x_mean) * (y_i - y_mean)
            d += (x_i - x_mean) ** 2
        # Compute parameters a and b
        self.a_ = num / d
        self.b_ = y_mean - self.a_ * x_mean
        return self

    def predict(self, x_predict):
        # Given a set of inputs x_predict, return the corresponding predictions
        assert x_predict.ndim == 1, \
            "Simple Linear Regression can only solve single feature training data"
        assert self.a_ is not None and self.b_ is not None, \
            "must fit before predict!"
        return np.array([self._predict(x) for x in x_predict])

    def _predict(self, x_single):
        # Given a single input x_single, return its prediction
        return self.a_ * x_single + self.b_

    def __repr__(self):
        return "SimpleLinearRegression1()"

x = np.array([1, 3, 2, 1, 3])
y = np.array([14, 24, 18, 17, 27])
reg1 = SimpleLinearRegression1()
reg1.fit(x, y)
x_predict = 2
y_predict = reg1.a_ * x_predict + reg1.b_
plt.scatter(x_predict, y_predict, c='g')
# reg1.predict(np.array([x_predict]))  # single-value prediction
# print(reg1.a_)
# print(reg1.b_)
y_hat1 = reg1.predict(x)  # predictions for all samples
plt.scatter(x, y)
plt.plot(x, y_hat1, color='r')
plt.axis([0, 4, 0, 28])
plt.show()
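
  For this data set the fit gives reg1.a_ = 5.0 and reg1.b_ = 10.0, so a week with 2 TV ads is predicted to sell 5 × 2 + 10 = 20 cars, which answers the question posed above.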

Vectorization

  The multiply-and-accumulate summation can be implemented as a dot product of two vectors; both the numerator and the denominator of $a$ have the form shown below.

$a=\frac{\sum_{i=1}^{m}(x^{i}-\overline{x})(y^{i}-\overline{y})}{\sum_{i=1}^{m}(x^{i}-\overline{x})^{2}}$

$\sum_{i=1}^{m}w^{i}\cdot v^{i}$

$w=(w^{1},w^{2},...,w^{m})$

$v=(v^{1},v^{2},...,v^{m})$

def fit(self, x_train, y_train):
    # Fit the Simple Linear Regression model on the training set x_train, y_train
    assert x_train.ndim == 1, \
        "Simple Linear Regressor can only solve single feature training data."
    assert len(x_train) == len(y_train), \
        "the size of x_train must be equal to the size of y_train"
    x_mean = np.mean(x_train)
    y_mean = np.mean(y_train)
    self.a_ = (x_train - x_mean).dot(y_train - y_mean) / (x_train - x_mean).dot(x_train - x_mean)
    self.b_ = y_mean - self.a_ * x_mean
    return self
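
  As a quick check, this vectorized fit can be wrapped in a class (here called SimpleLinearRegression2, an illustrative name not from the original post) and should reproduce the same parameters as the loop-based version; a minimal sketch:

import numpy as np

class SimpleLinearRegression2:
    # Same interface as SimpleLinearRegression1, but fit uses dot products
    def __init__(self):
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)
        # Numerator and denominator of a computed as vector dot products
        self.a_ = (x_train - x_mean).dot(y_train - y_mean) / \
                  (x_train - x_mean).dot(x_train - x_mean)
        self.b_ = y_mean - self.a_ * x_mean
        return self

    def predict(self, x_predict):
        return self.a_ * np.asarray(x_predict) + self.b_

x = np.array([1, 3, 2, 1, 3])
y = np.array([14, 24, 18, 17, 27])
reg2 = SimpleLinearRegression2().fit(x, y)
print(reg2.a_, reg2.b_)  # 5.0 10.0, matching the loop-based fit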

Metrics for Evaluating Linear Regression

R Squared

$R^{2}=1-\frac{\sum (\widehat{y}^{i}-y^{i})^{2}}{\sum (\overline{y}-y^{i})^{2}}=1-\frac{\frac{\sum_{i=1}^{m}(\widehat{y}^{i}-y^{i})^{2}}{m}}{\frac{\sum_{i=1}^{m}(\overline{y}-y^{i})^{2}}{m}}=1-\frac{MSE(\widehat{y},y)}{Var(y)}$

  Code Implementation

r2_score in scikit-learn (here y_test and y_predict are assumed to be the true and predicted values on a test set):

from sklearn.metrics import r2_score
s = r2_score(y_test, y_predict)
print(s)
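
  For comparison with the formula above, R Squared can also be computed directly with NumPy; a minimal sketch (the function name r2_score_manual is illustrative, and y_true, y_pred are assumed to be NumPy arrays of the same length):

import numpy as np

def r2_score_manual(y_true, y_pred):
    # R^2 = 1 - MSE(y_hat, y) / Var(y)
    mse = np.mean((y_pred - y_true) ** 2)
    return 1 - mse / np.var(y_true)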

The Meaning of R Squared

  • $\sum (\widehat{y}^{i}-y^{i})^{2}$: the error produced by our model's predictions
  • $\sum (\overline{y}-y^{i})^{2}$: the error produced by the baseline prediction $y=\overline{y}$
  • $R^{2}\leq 1$
  • The larger $R^{2}$, the better; when the model makes no errors at all, $R^{2}$ reaches its maximum value of 1
  • When our model does no better than the baseline model, $R^{2}=0$
  • If $R^{2}<0$, our model is worse than simply predicting the mean, which suggests the data may not have a linear relationship

Multiple Linear Regression

  In regression analysis, a model with two or more independent variables is called multiple regression. In practice, a phenomenon is usually associated with several factors, so predicting or estimating the dependent variable from the optimal combination of multiple independent variables is more effective and more realistic than using a single independent variable. Multiple linear regression is therefore of greater practical value than simple linear regression.

The Multiple Linear Regression Equation

$y=\theta _{0}+\theta _{1}x_{1}+\theta _{2}x_{2}+\cdots +\theta _{n}x_{n}$
$\theta _{0}$: the constant (intercept) term
$\theta _{1}$, $\theta _{2}$, ..., $\theta _{n}$: the partial regression coefficients of $y$ with respect to $x_{1}$, $x_{2}$, ..., $x_{n}$
$\widehat{y}^{(i)}=\theta _{0}+\theta _{1}x_{1}^{(i)}+\theta _{2}x_{2}^{(i)}+\cdots +\theta _{n}x_{n}^{(i)}$

Goal: find $\theta _{0}$, $\theta _{1}$, ..., $\theta _{n}$ such that $\sum_{i=1}^{m}(y^{(i)}-\widehat{y}^{(i)})^{2}$ is as small as possible.

Derivation of the Multiple Linear Regression Solution

$\widehat{y}^{(i)}=\theta _{0}x_{0}^{(i)}+\theta _{1}x_{1}^{(i)}+\theta _{2}x_{2}^{(i)}+\cdots +\theta _{n}x_{n}^{(i)}$, where $x_{0}^{(i)}\equiv 1$

$x^{(i)}=(x_{0}^{(i)},x_{1}^{(i)},x_{2}^{(i)},\cdots ,x_{n}^{(i)})$

$\theta =(\theta _{0},\theta _{1},\theta _{2},\cdots ,\theta _{n})^{T}$

$\widehat{y}^{(i)}=x^{(i)}\cdot \theta $

$X_{b}=\begin{pmatrix}
1 & x_{1}^{(1)} & x_{2}^{(1)} & \cdots & x_{n}^{(1)}\\ 
1 & x_{1}^{(2)} & x_{2}^{(2)} & \cdots & x_{n}^{(2)}\\ 
\vdots & \vdots & \vdots & & \vdots\\ 
1 & x_{1}^{(m)} & x_{2}^{(m)} & \cdots & x_{n}^{(m)}
\end{pmatrix}$     $\theta =\begin{pmatrix}
\theta _{0}\\ 
\theta _{1}\\ 
\theta _{2}\\ 
\vdots \\ 
\theta _{n}
\end{pmatrix}$

$\widehat{y}=X_{b}\cdot \theta $

Make $\sum_{i=1}^{m}(y^{(i)}-\widehat{y}^{(i)})^{2}$ as small as possible,

i.e., make $(y-X_{b}\cdot \theta )^{T}(y-X_{b}\cdot \theta )$ as small as possible. The solution (the normal equation) is

$\theta =(X_{b}^{T}X_{b})^{-1}X_{b}^{T}y$
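
  A minimal sketch of fitting multiple linear regression via the normal equation in NumPy (the class name LinearRegression and its attribute names are illustrative, not from the original post; np.linalg.pinv can be substituted for inv when $X_{b}^{T}X_{b}$ is ill-conditioned):

import numpy as np

class LinearRegression:
    def __init__(self):
        self.coef_ = None        # theta_1 ... theta_n
        self.intercept_ = None   # theta_0
        self._theta = None

    def fit_normal(self, X_train, y_train):
        # X_train is expected to have shape (m, n); prepend a column of ones
        # so that theta_0 acts as the intercept
        X_b = np.hstack([np.ones((len(X_train), 1)), X_train])
        # theta = (X_b^T X_b)^{-1} X_b^T y
        self._theta = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y_train)
        self.intercept_ = self._theta[0]
        self.coef_ = self._theta[1:]
        return self

    def predict(self, X_predict):
        X_b = np.hstack([np.ones((len(X_predict), 1)), X_predict])
        return X_b.dot(self._theta)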
