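Linear Regression

In mathematics, regression starts from the idea that real-world variables are linked by some functional relationship, and that this relationship can be recovered from a batch of sample data. In other words, the samples let us "regress" to the true underlying function.

Linear regression assumes that the relationship between the variables is linear and estimates it from sample data. With a single independent variable, the graph of a linear function is a straight line.

The equation of the linear function is:

$y = wx + b$

Linear regression uses a batch of sample data to determine this equation, that is, to determine the weight w and the bias b.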

So, to build a linear model we need:

Dependent variable (y)
Slope, or weight (w)
Intercept, or bias (b)
Independent variable (x)
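
To make "determining w and b from sample data" concrete, here is a minimal sketch in plain Python. It uses the closed-form least-squares solution for a single variable, which is not the approach this article takes (the article trains with gradient descent in TensorFlow). The function name `fit_line` and the sample data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b (closed form, one variable).

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    w = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means
    b = mean_y - w * mean_x
    return w, b


# Samples drawn from y = 2x, so the fit should recover w = 2 and b = 0
w, b = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
print(w, b)
```

The closed form only works for simple cases like this one; iterative training, as used in the rest of the article, generalizes to models where no such formula exists.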

Let's build the linear model with TensorFlow:

import tensorflow.compat.v1 as tf

# Use TF1-style graph execution (Session and placeholders need eager mode off)
tf.disable_eager_execution()

# Variable for the slope (W), initial value 0.4
W = tf.Variable([0.4], dtype=tf.float32)

# Variable for the intercept (b), initial value -0.4
b = tf.Variable([-0.4], dtype=tf.float32)

# Placeholder for the independent variable x
x = tf.placeholder(tf.float32)

# Linear regression equation
linear_model = W * x + b

# Initialize all variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

# Run the model and print the y values
print(sess.run(linear_model, feed_dict={x: [1, 2, 3, 4]}))

Output

C:\Anaconda3\python.exe "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev\pydevconsole.py" --mode=client --port=60639
import sys; print('Python %s on %s' % (sys.version, sys.platform))
sys.path.extend(['C:\\app\\PycharmProjects', 'C:/app/PycharmProjects'])
Python 3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 7.12.0
Python 3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] on win32
runfile('C:/app/PycharmProjects/ArtificialIntelligence/test.py', wdir='C:/app/PycharmProjects/ArtificialIntelligence')
WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
2020-06-19 18:08:36.592548: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-06-19 18:08:36.612575: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1a941da5370 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-19 18:08:36.614292: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
[0.        0.4       0.8000001 1.2      ]

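The code above only evaluates the linear equation: it takes x values as input and outputs y values.

We still need to train the weight w and the bias b on sample data. From the predicted y values we compute the error (the difference between the predictions and the known results) to obtain a cost function, then use gradient descent to minimize that cost function, which yields the final w and b.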

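Cost Function

The cost function measures the gap between the model's actual output and the expected output. We use the common squared-error cost (often loosely called mean squared error). For a single sample:

$E = \frac{1}{2}(t - y)^2$

t: target output
y: actual output
E: squared-error cost for that sample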

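# Placeholder y receives the known y values from the samples (the targets)
y = tf.placeholder(tf.float32)

# Sum of squared errors over all samples
# (no 1/2 factor and no averaging; this rescales the cost but not its minimizer)
error = linear_model - y
squared_errors = tf.square(error)
loss = tf.reduce_sum(squared_errors)

# Print the loss
print(sess.run(loss, feed_dict={x: [1, 2, 3, 4], y: [2, 4, 6, 8]}))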
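
As a cross-check, the same loss can be reproduced in plain Python, assuming the initial values W = 0.4 and b = -0.4 used above. The helper name `sum_squared_error` is hypothetical. The summed loss should come out at approximately 90.24, up to floating-point rounding; the second number shows the same cost with the 1/2 factor applied and averaged per sample.

```python
def sum_squared_error(w, b, xs, ts):
    """Sum over all samples of (prediction - target)^2 for y = w * x + b."""
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, ts))


xs = [1, 2, 3, 4]   # inputs
ts = [2, 4, 6, 8]   # targets (known y values)

loss = sum_squared_error(0.4, -0.4, xs, ts)
half_mean = loss / (2 * len(xs))  # 1/2 factor applied, averaged per sample

print(round(loss, 2), round(half_mean, 2))
```

Both variants have their minimum at the same w and b; they differ only by a constant scale.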

Full code

import tensorflow.compat.v1 as tf

# Use TF1-style graph execution (Session and placeholders need eager mode off)
tf.disable_eager_execution()

# Variable for the slope (W), initial value 0.4
W = tf.Variable([0.4], dtype=tf.float32)

# Variable for the intercept (b), initial value -0.4
b = tf.Variable([-0.4], dtype=tf.float32)

# Placeholder for the independent variable x
x = tf.placeholder(tf.float32)

# Linear regression equation
linear_model = W * x + b

# Initialize all variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

# Run the model and print the y values
print(sess.run(linear_model, feed_dict={x: [1, 2, 3, 4]}))

# Placeholder y receives the known y values from the samples (the targets)
y = tf.placeholder(tf.float32)

# Sum of squared errors over all samples
error = linear_model - y
squared_errors = tf.square(error)
loss = tf.reduce_sum(squared_errors)

# Print the loss
print(sess.run(loss, feed_dict={x: [1, 2, 3, 4], y: [2, 4, 6, 8]}))

Output

[0.        0.4       0.8000001 1.2      ]
90.24

The error is large, so we need to adjust the weight (W) and the bias (b) to reduce it.
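
To see why adjusting the parameters helps, the sketch below evaluates the same sum-of-squared-errors loss for a few candidate slopes while holding b at 0. The candidate values and the helper name `loss_for` are arbitrary choices for illustration. The loss shrinks as w approaches 2, the slope that generated the samples.

```python
xs = [1, 2, 3, 4]
ts = [2, 4, 6, 8]


def loss_for(w, b=0.0):
    # Sum of squared errors of the line y = w * x + b on the samples
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, ts))


candidates = [0.4, 1.0, 1.5, 2.0]
losses = [loss_for(w) for w in candidates]
for w, value in zip(candidates, losses):
    print(w, round(value, 2))
```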

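Model Training

TensorFlow provides optimizers that change each variable (weight w, bias b) a little at a time in order to minimize the cost function.
The simplest optimizer is gradient descent: it adjusts each variable according to the rate of change (the derivative) of the cost function with respect to that variable, and iterates toward the minimum of the cost function.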

# Create a gradient descent optimizer with learning rate 0.01
optimizer = tf.train.GradientDescentOptimizer(0.01)

# Use the optimizer to minimize the cost function
train = optimizer.minimize(loss)

# Run 1000 iterations; each one lets the optimizer update W and b to reduce the loss
for i in range(1000):
    sess.run(train, {x: [1, 2, 3, 4], y: [2, 4, 6, 8]})

# Print the weight and bias
print(sess.run([W, b]))
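
What the optimizer does on each step can be sketched by hand. The plain-Python loop below is an illustration under stated assumptions, not TensorFlow's implementation. For the loss $L = \sum_i (w x_i + b - t_i)^2$ it uses the derivatives $\partial L / \partial w = \sum_i 2 (w x_i + b - t_i) x_i$ and $\partial L / \partial b = \sum_i 2 (w x_i + b - t_i)$, and moves each parameter a small step against its derivative.

```python
xs = [1, 2, 3, 4]
ts = [2, 4, 6, 8]

w, b = 0.4, -0.4        # same starting values as the TensorFlow variables
learning_rate = 0.01

for _ in range(1000):
    # Prediction error for every sample
    errors = [w * x + b - t for x, t in zip(xs, ts)]
    # Derivatives of the summed squared error with respect to w and b
    grad_w = sum(2 * e * x for e, x in zip(errors, xs))
    grad_b = sum(2 * e for e in errors)
    # Step against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # should end up close to w = 2, b = 0
```

With this data a learning rate of 0.01 is small enough for the updates to converge; a much larger value could make the parameters oscillate or diverge.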

Full code:

import tensorflow.compat.v1 as tf

# Use TF1-style graph execution (Session and placeholders need eager mode off)
tf.disable_eager_execution()

# Variable for the slope (W), initial value 0.4
W = tf.Variable([0.4], dtype=tf.float32)

# Variable for the intercept (b), initial value -0.4
b = tf.Variable([-0.4], dtype=tf.float32)

# Placeholder for the independent variable x
x = tf.placeholder(tf.float32)

# Linear regression equation
linear_model = W * x + b

# Initialize all variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

# Run the model and print the y values
print(sess.run(linear_model, feed_dict={x: [1, 2, 3, 4]}))

# Placeholder y receives the known y values from the samples (the targets)
y = tf.placeholder(tf.float32)

# Sum of squared errors over all samples
error = linear_model - y
squared_errors = tf.square(error)
loss = tf.reduce_sum(squared_errors)

# Print the loss
print(sess.run(loss, feed_dict={x: [1, 2, 3, 4], y: [2, 4, 6, 8]}))

# Create a gradient descent optimizer with learning rate 0.01
optimizer = tf.train.GradientDescentOptimizer(0.01)

# Use the optimizer to minimize the cost function
train = optimizer.minimize(loss)

# Run 1000 iterations; each one lets the optimizer update W and b to reduce the loss
for i in range(1000):
    sess.run(train, {x: [1, 2, 3, 4], y: [2, 4, 6, 8]})

# Print the weight and bias
print(sess.run([W, b]))

Output

[0.        0.4       0.8000001 1.2      ]
90.24
[array([1.9999996], dtype=float32), array([9.863052e-07], dtype=float32)]

Posted on 2020-06-19 18:18 by 大码王
