# Machine Learning: An Introduction to TensorFlow

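The idea behind TensorFlow is simple: you define a computation graph in Python, and TensorFlow then takes that graph and runs it efficiently using optimized C++ code.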

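• Runs on many platforms: Windows, Linux, macOS, iOS, and Android
• Provides a simple Python API
• Has many higher-level libraries built on top of it
• Is extensible
• Has a high-performance C++ implementation
• Provides many nodes that make it easy to compute cost functions, with automatic differentiation
• Comes with TensorBoard, a powerful visualization tool
• Offers cloud computing support
• Has an active developer community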

## Creating Your First Graph and Running It in a Session

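import numpy as np
import tensorflow as tf

# reset_graph() is not defined in this post; the helper below is an
# assumed definition (TensorFlow 1.x) that makes runs reproducible.
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

reset_graph()

x = tf.Variable(3, name='x')
y = tf.Variable(4, name='y')
f = x * x * y + y + 2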


sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)
print(result)
sess.close()


with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()

"""
42
"""


init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    result1 = f.eval()


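A TensorFlow program is typically split into two steps: first you build the computation graph (the construction phase), then you run it (the execution phase).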
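To make the two steps concrete without depending on TensorFlow itself, here is a minimal pure-Python sketch of a deferred computation graph. The `Node`, `Const`, `Add`, and `Mul` classes and the `wrap` helper are hypothetical names invented for this illustration; they are not part of any TensorFlow API, and the real library works quite differently internally.

```python
# Toy deferred-execution graph; all names here are illustrative only.
class Node:
    def __add__(self, other):
        return Add(self, wrap(other))

    def __mul__(self, other):
        return Mul(self, wrap(other))


class Const(Node):
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class Add(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def evaluate(self):
        return self.a.evaluate() + self.b.evaluate()


class Mul(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def evaluate(self):
        return self.a.evaluate() * self.b.evaluate()


def wrap(value):
    """Turn plain numbers into Const nodes so they can join the graph."""
    return value if isinstance(value, Node) else Const(value)


# Step 1: build the graph. Nothing is computed yet, only objects are created.
x = Const(3)
y = Const(4)
f = x * x * y + y + 2

# Step 2: run it. The graph is walked and the value is produced.
result = f.evaluate()
print(result)  # 42
```

Note that `f` is just an object describing the computation until `evaluate()` is called, which mirrors why `f` in the TensorFlow code above has no value until a session runs it.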

## Managing Graphs

reset_graph()

x1 = tf.Variable(1)
x1.graph is tf.get_default_graph()

'''
True
'''


graph = tf.Graph()
with graph.as_default():
    x2 = tf.Variable(2)

x2.graph is graph

'''
True
'''

x2.graph is tf.get_default_graph()

'''
False
'''


## Lifecycle of a Node Value

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

with tf.Session() as sess:
    print(y.eval())  # 10
    print(z.eval())  # 15


All node values are dropped between graph runs, except variable values, which are maintained by the session across graph runs (queues and readers also maintain some state, as we will see in Chapter 12). A variable starts its life when its initializer is run, and it ends when the session is closed.

with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val)  # 10
    print(z_val)  # 15
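The difference between the two snippets above can be illustrated with a small pure-Python sketch that counts how often each node is computed. The `Op` class and the `run` function are hypothetical stand-ins made up for this example, under the assumption that node values are cached only within a single run.

```python
# Toy graph that counts how often each node is computed (illustrative only).
class Op:
    def __init__(self, name, fn, *inputs):
        self.name, self.fn, self.inputs = name, fn, inputs


def run(fetches, counts):
    """Evaluate several nodes in one 'run', caching values only within it."""
    cache = {}

    def value_of(node):
        if node.name not in cache:
            args = [value_of(i) for i in node.inputs]
            counts[node.name] = counts.get(node.name, 0) + 1
            cache[node.name] = node.fn(*args)
        return cache[node.name]

    return [value_of(n) for n in fetches]


w = Op("w", lambda: 3)
x = Op("x", lambda w_: w_ + 2, w)
y = Op("y", lambda x_: x_ + 5, x)
z = Op("z", lambda x_: x_ * 3, x)

separate = {}
y_val = run([y], separate)[0]   # first run: computes w, x, y
z_val = run([z], separate)[0]   # second run: computes w and x again, then z

together = {}
y_val2, z_val2 = run([y, z], together)  # one run: w and x computed once

print(y_val, z_val, separate["x"])    # 10 15 2
print(y_val2, z_val2, together["x"])  # 10 15 1
```

Fetching `y` and `z` in the same run avoids recomputing the shared nodes `w` and `x`.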



In single-process TensorFlow, multiple sessions do not share any state, even if they reuse the same graph (each session would have its own copy of every variable). In distributed TensorFlow (see Chapter 12), variable state is stored on the servers, not in the sessions, so multiple sessions can share the same variables.

## Linear Regression with TensorFlow

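TensorFlow operations (ops for short) can take any number of inputs and produce any number of outputs. For example, the addition and multiplication ops each take two inputs and produce one output. Constants and variables take no inputs and simply output a value. The inputs and outputs are multidimensional arrays, called tensors.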

import numpy as np
from sklearn.datasets import fetch_california_housing

reset_graph()

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')

XT = tf.transpose(X)

theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_value = theta.eval()

"""
array([[-3.68962631e+01],
[ 4.36777472e-01],
[ 9.44449380e-03],
[-1.07348785e-01],
[ 6.44962370e-01],
[-3.94082872e-06],
[-3.78797273e-03],
[-4.20847952e-01],
[-4.34020907e-01]], dtype=float32)
"""



X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

print(theta_numpy)

"""
[[-3.69419202e+01]
[ 4.36693293e-01]
[ 9.43577803e-03]
[-1.07322041e-01]
[ 6.45065694e-01]
[-3.97638942e-06]
[-3.78654265e-03]
[-4.21314378e-01]
[-4.34513755e-01]]
"""



from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))

print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])

"""
[[-3.69419202e+01]
[ 4.36693293e-01]
[ 9.43577803e-03]
[-1.07322041e-01]
[ 6.45065694e-01]
[-3.97638942e-06]
[-3.78654265e-03]
[-4.21314378e-01]
[-4.34513755e-01]]
"""



from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]



reset_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)

    best_theta = theta.eval()

"""

Epoch 0 MSE = 9.161542
Epoch 100 MSE = 0.7145004
Epoch 200 MSE = 0.56670487
Epoch 300 MSE = 0.5555718
Epoch 400 MSE = 0.5488112
Epoch 500 MSE = 0.5436363
Epoch 600 MSE = 0.5396291
Epoch 700 MSE = 0.5365092
Epoch 800 MSE = 0.53406775
Epoch 900 MSE = 0.5321473
"""

best_theta

"""
array([[ 2.0685523 ],
[ 0.8874027 ],
[ 0.14401656],
[-0.34770882],
[ 0.36178368],
[ 0.00393811],
[-0.04269556],
[-0.6614529 ],
[-0.6375279 ]], dtype=float32)
"""

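The update rule used above, with gradient 2/m · Xᵀ(Xθ − y), can be sketched in plain Python on a tiny made-up dataset. The function name `batch_gradient_descent` and the four data points are assumptions chosen for illustration only; they are unrelated to the housing data.

```python
def batch_gradient_descent(X, y, learning_rate=0.1, n_epochs=1000):
    """Minimise the MSE of a linear model with full-batch gradient descent."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(n_epochs):
        # error_i = prediction_i - target_i
        errors = [sum(X[i][j] * theta[j] for j in range(n)) - y[i]
                  for i in range(m)]
        # gradient_j = 2/m * sum_i X[i][j] * error_i
        gradients = [2.0 / m * sum(X[i][j] * errors[i] for i in range(m))
                     for j in range(n)]
        theta = [theta[j] - learning_rate * gradients[j] for j in range(n)]
    return theta


# Toy data generated from y = 1 + 2 * x; the first column is the bias term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = batch_gradient_descent(X, y)
print(theta)  # approximately [1.0, 2.0]
```

On this toy problem the parameters converge to the intercept and slope that generated the data, which is the same behaviour the falling MSE values above show for the housing data.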


### Using autodiff

reset_graph()

n_epochs = 1000
learning_rate = 0.01

X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
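mse = tf.reduce_mean(tf.square(error), name="mse")

# Let TensorFlow derive the gradient instead of writing it by hand.
gradients = tf.gradients(mse, [theta])[0]
training_op = tf.assign(theta, theta - learning_rate * gradients)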

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)

    best_theta = theta.eval()

print("Best theta:")
print(best_theta)

"""
Epoch 0 MSE = 9.161542
Epoch 100 MSE = 0.71450037
Epoch 200 MSE = 0.56670487
Epoch 300 MSE = 0.5555718
Epoch 400 MSE = 0.54881126
Epoch 500 MSE = 0.5436363
Epoch 600 MSE = 0.53962916
Epoch 700 MSE = 0.5365092
Epoch 800 MSE = 0.53406775
Epoch 900 MSE = 0.5321473
Best theta:
[[ 2.0685523 ]
[ 0.8874027 ]
[ 0.14401656]
[-0.3477088 ]
[ 0.36178365]
[ 0.00393811]
[-0.04269556]
[-0.66145283]
[-0.6375278 ]]
"""



TensorFlow uses reverse-mode autodiff, which is well suited to functions with many inputs and few outputs.
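To show why reverse mode handles many inputs cheaply, here is a minimal scalar reverse-mode autodiff sketch in plain Python. The `Var` class and `backward` function are hypothetical names written for this illustration; this is not how TensorFlow implements `tf.gradients()`, only the same idea in miniature.

```python
# Minimal reverse-mode autodiff on scalars; names are illustrative only.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuples of (parent_node, local_gradient)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Propagate d(output)/d(node) from the output back to every input."""
    # Build a topological order so each node is handled after its consumers.
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad


x = Var(3.0)
y = Var(4.0)
f = x * x * y + y + 2   # forward pass computes the value

backward(f)             # one reverse pass yields all partial derivatives

print(f.value)  # 42.0
print(x.grad)   # 24.0  (df/dx = 2*x*y)
print(y.grad)   # 10.0  (df/dy = x*x + 1)
```

A single backward pass fills in the gradient for every input at once, regardless of how many inputs there are.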

### Using an Optimizer

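# Before: gradients via autodiff, update step written by hand
gradients = tf.gradients(mse, [theta])[0]
training_op = tf.assign(theta, theta - learning_rate * gradients)



# After: let an optimizer build the update op
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)



# Switching to another optimizer is a one-line change
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate,
                                       momentum=0.9)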



## Feeding Data to the Training Algorithm

reset_graph()

A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

print(B_val_1)
"""
[[6. 7. 8.]]
"""

print(B_val_2)
"""
[[ 9. 10. 11.]
[12. 13. 14.]]
"""



n_epochs = 1000
learning_rate = 0.01

reset_graph()

X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
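# Re-create the optimizer, since reset_graph() started a fresh graph
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)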

init = tf.global_variables_initializer()

n_epochs = 10

batch_size = 100
n_batches = int(np.ceil(m / batch_size))

def fetch_batch(epoch, batch_index, batch_size):
    np.random.seed(epoch * n_batches + batch_index)   # not shown in the book
    indices = np.random.randint(m, size=batch_size)   # not shown
    X_batch = scaled_housing_data_plus_bias[indices]  # not shown
    y_batch = housing.target.reshape(-1, 1)[indices]  # not shown
    return X_batch, y_batch

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

"""
array([[ 2.0703337 ],
[ 0.8637145 ],
[ 0.12255152],
[-0.31211877],
[ 0.38510376],
[ 0.00434168],
[-0.0123295 ],
[-0.83376896],
[-0.8030471 ]], dtype=float32)
"""

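The batching arithmetic above can be checked with a short standard-library sketch. The generator `iterate_minibatches` is a hypothetical helper written for this illustration, and `m = 20640` is assumed to be the number of rows in the California housing data. Like `fetch_batch`, it samples indices with replacement and reseeds per batch so every batch is reproducible.

```python
import math
import random


def iterate_minibatches(m, batch_size, n_epochs, seed=42):
    """Yield (epoch, batch_index, indices), sampling indices with replacement."""
    n_batches = math.ceil(m / batch_size)
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            # A per-batch seed keeps each batch reproducible across runs.
            rng = random.Random(seed + epoch * n_batches + batch_index)
            indices = [rng.randrange(m) for _ in range(batch_size)]
            yield epoch, batch_index, indices


m = 20640          # assumed number of training instances
batch_size = 100
batches = list(iterate_minibatches(m, batch_size, n_epochs=2))

print(math.ceil(m / batch_size))  # 207 batches per epoch
print(len(batches))               # 414 batches over 2 epochs
```

Because indices are drawn with replacement, an epoch here does not visit every instance exactly once; it only processes roughly `m` samples.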


## Saving and Restoring Models

• Create a Saver
• Call saver.save() to save the model
• Call saver.restore() to restore it

saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())  # not shown
            save_path = saver.save(sess, "/tmp/my_model.ckpt")
        sess.run(training_op)

    best_theta = theta.eval()
    save_path = saver.save(sess, "/tmp/my_model_final.ckpt")



with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    best_theta_restored = theta.eval()  # not shown in the book



By default the saver also saves the graph structure itself in a second file with the extension .meta. You can use the function tf.train.import_meta_graph() to restore the graph structure. This function loads the graph into the default graph and returns a Saver that can then be used to restore the graph state (i.e., the variable values):

reset_graph()

saver = tf.train.import_meta_graph("/tmp/my_model_final.ckpt.meta")  # this loads the graph structure
theta = tf.get_default_graph().get_tensor_by_name("theta:0") # not shown in the book

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")  # this restores the graph's state
    best_theta_restored = theta.eval()  # not shown in the book



## Visualizing the Graph and Training Curves Using TensorBoard

TensorBoard is a powerful web-based tool. It works by reading log data saved locally and plotting it, so it can display both the graph and the training progress.

1. Define the directory where the logs will be saved

from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)


2. After the construction phase, add the following code

mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())


3. Write the summary data wherever it needs to be recorded

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            if batch_index % 10 == 0:
                # Evaluate the summary and log it every 10 mini-batches
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

    best_theta = theta.eval()

file_writer.close()


4. Launch TensorBoard with the following command

python3 -m tensorboard.main --logdir=tf_logs



## Name Scopes

with tf.name_scope("loss") as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")



## Modularity

Suppose you want to create a graph that adds the output of two rectified linear units (ReLU). A ReLU computes a linear function of the inputs and outputs the result if it is positive, and 0 otherwise. Written out by hand, the code is repetitive and error-prone:

reset_graph()

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")
b1 = tf.Variable(0.0, name="bias1")
b2 = tf.Variable(0.0, name="bias2")

z1 = tf.add(tf.matmul(X, w1), b1, name="z1")
z2 = tf.add(tf.matmul(X, w2), b2, name="z2")

relu1 = tf.maximum(z1, 0., name="relu1")
relu2 = tf.maximum(z1, 0., name="relu2")  # Oops, cut&paste error! Did you spot it?



reset_graph()

def relu(X):
    w_shape = (int(X.get_shape()[1]), 1)
    w = tf.Variable(tf.random_normal(w_shape), name="weights")
    b = tf.Variable(0.0, name="bias")
    z = tf.add(tf.matmul(X, w), b, name="z")
    return tf.maximum(z, 0., name="relu")

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X) for i in range(5)]



When TensorFlow creates a node, it gives that node a unique name. It is therefore a good idea to use name scopes inside such functions, so that the graph structure is clearer.

def relu(X):
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)                          # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")    # not shown
        b = tf.Variable(0.0, name="bias")                             # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")                      # not shown
        return tf.maximum(z, 0., name="max")                          # not shown



## Sharing Variables

reset_graph()

def relu(X, threshold):
    with tf.name_scope("relu"):
        w_shape = (int(X.get_shape()[1]), 1)                        # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")                           # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")                    # not shown
        return tf.maximum(z, threshold, name="max")

threshold = tf.Variable(0.0, name="threshold")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = [relu(X, threshold) for i in range(5)]



reset_graph()

def relu(X):
    with tf.name_scope("relu"):
        if not hasattr(relu, "threshold"):
            relu.threshold = tf.Variable(0.0, name="threshold")
        w_shape = int(X.get_shape()[1]), 1                          # not shown in the book
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")                           # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")                    # not shown
        return tf.maximum(z, relu.threshold, name="max")



TensorFlow provides the get_variable() function for creating or fetching a shared variable; it works together with variable_scope() (variable scopes):

reset_graph()

with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))



with tf.variable_scope("relu", reuse=True):
    threshold = tf.get_variable("threshold")



with tf.variable_scope("relu") as scope:
    scope.reuse_variables()
    threshold = tf.get_variable("threshold")



Once reuse is set to True, it cannot be set back to False within the block. Moreover, if you define other variable scopes inside this one, they will automatically inherit reuse=True. Lastly, only variables created by get_variable() can be reused this way.

reset_graph()

def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold")
        w_shape = int(X.get_shape()[1]), 1                          # not shown
        w = tf.Variable(tf.random_normal(w_shape), name="weights")  # not shown
        b = tf.Variable(0.0, name="bias")                           # not shown
        z = tf.add(tf.matmul(X, w), b, name="z")                    # not shown
        return tf.maximum(z, threshold, name="max")

X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]



Variables created using get_variable() are always named using the name of their variable_scope as a prefix (e.g., "relu/threshold"), but for all other nodes (including variables created with tf.Variable()) the variable scope acts like a new name scope. In particular, if a name scope with an identical name was already created, then a suffix is added to make the name unique. For example, all nodes created in the preceding code (except the threshold variable) have a name prefixed with "relu_1/" to "relu_5/".
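The suffixing behaviour described above can be imitated with a few lines of plain Python. The `NameRegistry` class is a hypothetical, simplified stand-in written for this illustration; it is not TensorFlow's actual naming code.

```python
# Simplified imitation of name uniquification; not TensorFlow's real code.
class NameRegistry:
    def __init__(self):
        self._used = {}

    def unique_name(self, name):
        """Return name on first use, then name_1, name_2, ... afterwards."""
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        return name if count == 0 else "{}_{}".format(name, count)


registry = NameRegistry()
# The scope "relu" is opened once to create the shared threshold...
first = registry.unique_name("relu")
# ...and then five more times, once per call to relu(X).
scopes = [registry.unique_name("relu") for _ in range(5)]

print(first)   # relu
print(scopes)  # ['relu_1', 'relu_2', 'relu_3', 'relu_4', 'relu_5']
```

This matches the prefixes "relu_1/" to "relu_5/" mentioned in the paragraph above: the first use of the name is taken by the scope that created the threshold variable.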

## Extra material

reset_graph()

with tf.variable_scope("my_scope"):
    x0 = tf.get_variable("x", shape=(), initializer=tf.constant_initializer(0.))
    x1 = tf.Variable(0., name="x")
    x2 = tf.Variable(0., name="x")

with tf.variable_scope("my_scope", reuse=True):
    x3 = tf.get_variable("x")
    x4 = tf.Variable(0., name="x")

with tf.variable_scope("", default_name="", reuse=True):
    x5 = tf.get_variable("my_scope/x")

print("x0:", x0.op.name)
print("x1:", x1.op.name)
print("x2:", x2.op.name)
print("x3:", x3.op.name)
print("x4:", x4.op.name)
print("x5:", x5.op.name)
print(x0 is x3 and x3 is x5)

"""

x0: my_scope/x
x1: my_scope/x_1
x2: my_scope/x_2
x3: my_scope/x
x4: my_scope_1/x
x5: my_scope/x
True
"""



## Exercises

1. What are the main benefits of creating a computation graph rather than directly executing the computations? What are the main drawbacks?

Main benefits and drawbacks of creating a computation graph rather than directly executing the computations:

• Main benefits:
  • TensorFlow can automatically compute the gradients for you (using reverse-mode autodiff).
  • TensorFlow can take care of running the operations in parallel in different threads.
  • It makes it easier to run the same model across different devices.
  • It simplifies introspection, for example to view the model in TensorBoard.
• Main drawbacks:
  • It makes the learning curve steeper.
  • It makes step-by-step debugging harder.
2. Is the statement a_val = a.eval(session=sess) equivalent to a_val = sess.run(a)?

Yes, the statement a_val = a.eval(session=sess) is indeed equivalent to a_val = sess.run(a).

3. Is the statement a_val, b_val = a.eval(session=sess), b.eval(session=sess) equivalent to a_val, b_val = sess.run([a, b])?

4. Can you run two graphs in the same session?

5. If you create a graph g containing a variable w, then start two threads and open a session in each thread, both using the same graph g, will each session have its own copy of the variable w or will it be shared?

6. When is a variable initialized? When is it destroyed?

7. What is the difference between a placeholder and a variable?

8. What happens when you run the graph to evaluate an operation that depends on a placeholder but you don’t feed its value? What happens if the operation does not depend on the placeholder?

If you run the graph to evaluate an operation that depends on a placeholder but you don’t feed its value, you get an exception. If the operation does not depend on the placeholder, then no exception is raised.

9. When you run a graph, can you feed the output value of any operation, or just the value of placeholders?

When you run a graph, you can feed the output value of any operation, not just the value of placeholders. In practice, however, this is rather rare (it can be useful, for example, when you are caching the output of frozen layers).


10. How can you set a variable to any value you want (during the execution phase)?

You can specify a variable’s initial value when constructing the graph, and it will be initialized later when you run the variable’s initializer during the execution phase. If you want to change that variable’s value to anything you want during the execution phase, then the simplest option is to create an assignment node (during the graph construction phase) using the tf.assign() function, passing the variable and a placeholder as parameters. During the execution phase, you can run the assignment operation and feed the variable’s new value using the placeholder.


import tensorflow as tf
x = tf.Variable(tf.random_uniform(shape=(), minval=0.0, maxval=1.0))
x_new_val = tf.placeholder(shape=(), dtype=tf.float32)
x_assign = tf.assign(x, x_new_val)
with tf.Session():
    x.initializer.run()  # random number is sampled *now*
    print(x.eval())  # 0.646157 (some random number)
    x_assign.eval(feed_dict={x_new_val: 5.0})
    print(x.eval())  # 5.0


11. How many times does reverse-mode autodiff need to traverse the graph in order to compute the gradients of the cost function with regards to 10 variables? What about forward-mode autodiff? And symbolic differentiation?

Reverse-mode autodiff (implemented by TensorFlow) needs to traverse the graph only twice in order to compute the gradients of the cost function with regards to any number of variables. On the other hand, forward-mode autodiff would need to run once for each variable (so 10 times if we want the gradients with regards to 10 different variables). As for symbolic differentiation, it would build a different graph to compute the gradients, so it would not traverse the original graph at all (except when building the new gradients graph). A highly optimized symbolic differentiation system could potentially run the new gradients graph only once to compute the gradients with regards to all variables, but that new graph may be horribly complex and inefficient compared to the original graph.
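The "one pass per variable" cost of forward-mode autodiff can be seen in a small dual-number sketch. The `Dual` class and `forward_mode_gradient` function are hypothetical names written for this illustration, using the same function f(x, y) = x²y + y + 2 as earlier in this post.

```python
# Forward-mode autodiff with dual numbers; names are illustrative only.
class Dual:
    def __init__(self, value, eps=0.0):
        self.value, self.eps = value, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.eps + other.eps)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.value * other.eps + self.eps * other.value)


def f(x, y):
    return x * x * y + y + 2


def forward_mode_gradient(func, point):
    """Run one forward pass per input, seeding eps=1 for that input only."""
    grads, passes = [], 0
    for i in range(len(point)):
        args = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(point)]
        grads.append(func(*args).eps)
        passes += 1
    return grads, passes


grads, passes = forward_mode_gradient(f, [3.0, 4.0])
print(grads)   # [24.0, 10.0]
print(passes)  # 2
```

With 10 variables the loop would run 10 times, whereas a reverse-mode pass recovers all partial derivatives from one forward and one backward traversal.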


12. Implement Logistic Regression with Mini-batch Gradient Descent using TensorFlow. Train it and evaluate it on the moons dataset (introduced in Chapter 5). Try adding all the bells and whistles:

• Define the graph within a logistic_regression() function that can be reused easily.
• Save checkpoints using a Saver at regular intervals during training, and save the final model at the end of training.
• Restore the last checkpoint upon startup if training was interrupted.
• Define the graph using nice scopes so the graph looks good in TensorBoard.
• Add summaries to visualize the learning curves in TensorBoard.
• Try tweaking some hyperparameters such as the learning rate or the mini-batch size and look at the shape of the learning curve.

from sklearn.datasets import make_moons

m = 1000
X_moons, y_moons = make_moons(m, noise=0.1, random_state=42)



import matplotlib.pyplot as plt

plt.plot(X_moons[y_moons == 1, 0], X_moons[y_moons == 1, 1], 'go', label="Positive")
plt.plot(X_moons[y_moons == 0, 0], X_moons[y_moons == 0, 1], 'r^', label="Negative")
plt.legend()
plt.show()



X_moons_with_bias = np.c_[np.ones((m, 1)), X_moons]



y_moons_column_vector = y_moons.reshape(-1, 1)



test_ratio = 0.2
test_size = int(m * test_ratio)
X_train = X_moons_with_bias[:-test_size]
X_test = X_moons_with_bias[-test_size:]
y_train = y_moons_column_vector[:-test_size]
y_test = y_moons_column_vector[-test_size:]



def random_batch(X_train, y_train, batch_size):
    rnd_indices = np.random.randint(0, len(X_train), batch_size)
    X_batch = X_train[rnd_indices]
    y_batch = y_train[rnd_indices]
    return X_batch, y_batch



reset_graph()
n_inputs = 2
X = tf.placeholder(tf.float32, shape=(None, n_inputs + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
theta = tf.Variable(tf.random_uniform([n_inputs + 1, 1], -1.0, 1.0, seed=42), name="theta")
logits = tf.matmul(X, theta, name="logits")
y_proba = 1 / (1 + tf.exp(-logits))



y_proba = tf.sigmoid(logits)



epsilon = 1e-7  # to avoid an overflow when computing the log
loss = -tf.reduce_mean(y * tf.log(y_proba + epsilon) + (1 - y) * tf.log(1 - y_proba + epsilon))



loss = tf.losses.log_loss(y, y_proba)  # uses epsilon = 1e-7 by default
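The role of epsilon in the log loss can be checked with a plain-Python version of the same formula. The function `log_loss` below is a hypothetical re-implementation for illustration, not the TensorFlow function, and the probabilities are made-up inputs.

```python
import math


def log_loss(y_true, y_proba, epsilon=1e-7):
    """Mean binary cross-entropy; epsilon keeps log() away from log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_proba):
        total += y * math.log(p + epsilon) + (1 - y) * math.log(1 - p + epsilon)
    return -total / len(y_true)


confident = log_loss([1, 0], [0.9, 0.1])   # good predictions: small loss
wrong = log_loss([1, 0], [0.1, 0.9])       # bad predictions: large loss
extreme = log_loss([1], [0.0])             # finite only thanks to epsilon

print(confident < wrong)       # True
print(math.isfinite(extreme))  # True
```

Without epsilon, the last call would evaluate log(0) and fail, which is the situation the epsilon term in the TensorFlow code above guards against.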


learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)

init = tf.global_variables_initializer()

n_epochs = 1000
batch_size = 50
n_batches = int(np.ceil(m / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = random_batch(X_train, y_train, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val = loss.eval({X: X_test, y: y_test})
        if epoch % 100 == 0:
            print("Epoch:", epoch, "\tLoss:", loss_val)

    y_proba_val = y_proba.eval(feed_dict={X: X_test, y: y_test})



Epoch: 0 	Loss: 0.792602
Epoch: 100 	Loss: 0.343463
Epoch: 200 	Loss: 0.30754
Epoch: 300 	Loss: 0.292889
Epoch: 400 	Loss: 0.285336
Epoch: 500 	Loss: 0.280478
Epoch: 600 	Loss: 0.278083
Epoch: 700 	Loss: 0.276154
Epoch: 800 	Loss: 0.27552
Epoch: 900 	Loss: 0.274912



y_pred = (y_proba_val >= 0.5)


from sklearn.metrics import precision_score, recall_score

precision_score(y_test, y_pred)

"""
0.86274509803921573
"""

recall_score(y_test, y_pred)
"""
0.88888888888888884
"""

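For reference, precision and recall can be computed by hand from the counts of true positives, false positives, and false negatives. The function `precision_recall` and the label lists below are made up for this illustration; they are not the moons test set.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive).

    Assumes at least one predicted positive and one actual positive,
    otherwise the divisions below would fail.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)   # share of predicted positives that are right
    recall = tp / (tp + fn)      # share of actual positives that were found
    return precision, recall


y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```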


X_train_enhanced = np.c_[X_train,
                         np.square(X_train[:, 1]),
                         np.square(X_train[:, 2]),
                         X_train[:, 1] ** 3,
                         X_train[:, 2] ** 3]
X_test_enhanced = np.c_[X_test,
                        np.square(X_test[:, 1]),
                        np.square(X_test[:, 2]),
                        X_test[:, 1] ** 3,
                        X_test[:, 2] ** 3]



def logistic_regression(X, y, initializer=None, seed=42, learning_rate=0.01):
    n_inputs_including_bias = int(X.get_shape()[1])
    with tf.name_scope("logistic_regression"):
        with tf.name_scope("model"):
            if initializer is None:
                initializer = tf.random_uniform([n_inputs_including_bias, 1], -1.0, 1.0, seed=seed)
            theta = tf.Variable(initializer, name="theta")
            logits = tf.matmul(X, theta, name="logits")
            y_proba = tf.sigmoid(logits)
        with tf.name_scope("train"):
            loss = tf.losses.log_loss(y, y_proba, scope="loss")
            optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
            training_op = optimizer.minimize(loss)
            loss_summary = tf.summary.scalar('log_loss', loss)
        with tf.name_scope("init"):
            init = tf.global_variables_initializer()
        with tf.name_scope("save"):
            saver = tf.train.Saver()
    return y_proba, loss, training_op, loss_summary, init, saver


from datetime import datetime

def log_dir(prefix=""):
    now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
    root_logdir = "tf_logs"
    if prefix:
        prefix += "-"
    name = prefix + "run-" + now
    return "{}/{}/".format(root_logdir, name)



n_inputs = 2 + 4
logdir = log_dir("logreg")

X = tf.placeholder(tf.float32, shape=(None, n_inputs + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

y_proba, loss, training_op, loss_summary, init, saver = logistic_regression(X, y)

file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())



n_epochs = 10001
batch_size = 50
n_batches = int(np.ceil(m / batch_size))

checkpoint_path = "/tmp/my_logreg_model.ckpt"
checkpoint_epoch_path = checkpoint_path + ".epoch"
final_model_path = "./my_logreg_model"

import os

with tf.Session() as sess:
    if os.path.isfile(checkpoint_epoch_path):
        # if the checkpoint file exists, restore the model and load the epoch number
        with open(checkpoint_epoch_path, "rb") as f:
            start_epoch = int(f.read())
        print("Training was interrupted. Continuing at epoch", start_epoch)
        saver.restore(sess, checkpoint_path)
    else:
        start_epoch = 0
        sess.run(init)

    for epoch in range(start_epoch, n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = random_batch(X_train_enhanced, y_train, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val, summary_str = sess.run([loss, loss_summary], feed_dict={X: X_test_enhanced, y: y_test})
        file_writer.add_summary(summary_str, epoch)
        if epoch % 500 == 0:
            print("Epoch:", epoch, "\tLoss:", loss_val)
            saver.save(sess, checkpoint_path)
            with open(checkpoint_epoch_path, "wb") as f:
                f.write(b"%d" % (epoch + 1))

    saver.save(sess, final_model_path)
    y_proba_val = y_proba.eval(feed_dict={X: X_test_enhanced, y: y_test})
    os.remove(checkpoint_epoch_path)
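The resume logic above (read the saved epoch number if the marker file exists, otherwise start from zero, and delete the marker once training finishes) can be exercised without TensorFlow. The function `train_with_resume` and its `interrupt_at` parameter are hypothetical, written only to simulate an interruption; the "training" here just records which epochs ran, and the marker is written every epoch rather than every 500.

```python
import os
import tempfile


def train_with_resume(epoch_path, n_epochs, interrupt_at=None):
    """Run a dummy loop, recording the next epoch so a restart can resume."""
    if os.path.isfile(epoch_path):
        with open(epoch_path, "rb") as f:
            start_epoch = int(f.read())
    else:
        start_epoch = 0

    executed = []
    for epoch in range(start_epoch, n_epochs):
        if interrupt_at is not None and epoch == interrupt_at:
            return executed  # simulate a crash before this epoch runs
        executed.append(epoch)
        with open(epoch_path, "wb") as f:
            f.write(b"%d" % (epoch + 1))

    os.remove(epoch_path)  # training finished, the marker is no longer needed
    return executed


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.ckpt.epoch")
    first_run = train_with_resume(path, n_epochs=5, interrupt_at=3)
    second_run = train_with_resume(path, n_epochs=5)
    marker_left = os.path.exists(path)

print(first_run)    # [0, 1, 2]
print(second_run)   # [3, 4]
print(marker_left)  # False
```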



y_pred = (y_proba_val >= 0.5)
precision_score(y_test, y_pred)
"""
0.97979797979797978
"""

recall_score(y_test, y_pred)
"""
0.97979797979797978
"""



from scipy.stats import reciprocal

n_search_iterations = 10

for search_iteration in range(n_search_iterations):
    batch_size = np.random.randint(1, 100)
    learning_rate = reciprocal(0.0001, 0.1).rvs(random_state=search_iteration)

    n_inputs = 2 + 4
    logdir = log_dir("logreg")

    print("Iteration", search_iteration)
    print("  logdir:", logdir)
    print("  batch size:", batch_size)
    print("  learning_rate:", learning_rate)
    print("  training: ", end="")

    reset_graph()

    X = tf.placeholder(tf.float32, shape=(None, n_inputs + 1), name="X")
    y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

    y_proba, loss, training_op, loss_summary, init, saver = logistic_regression(
        X, y, learning_rate=learning_rate)

    file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

    n_epochs = 10001
    n_batches = int(np.ceil(m / batch_size))

    final_model_path = "./my_logreg_model_%d" % search_iteration

    with tf.Session() as sess:
        sess.run(init)

        for epoch in range(n_epochs):
            for batch_index in range(n_batches):
                X_batch, y_batch = random_batch(X_train_enhanced, y_train, batch_size)
                sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
            loss_val, summary_str = sess.run([loss, loss_summary], feed_dict={X: X_test_enhanced, y: y_test})
            file_writer.add_summary(summary_str, epoch)
            if epoch % 500 == 0:
                print(".", end="")

        saver.save(sess, final_model_path)

        print()
        y_proba_val = y_proba.eval(feed_dict={X: X_test_enhanced, y: y_test})
        y_pred = (y_proba_val >= 0.5)

        print("  precision:", precision_score(y_test, y_pred))
        print("  recall:", recall_score(y_test, y_pred))



Iteration 0
logdir: tf_logs/logreg-run-20170606195328/
batch size: 19
learning_rate: 0.00443037524522
training: .....................
precision: 0.979797979798
recall: 0.979797979798
Iteration 1
logdir: tf_logs/logreg-run-20170606195605/
batch size: 80
learning_rate: 0.00178264971514
training: .....................
precision: 0.969696969697
recall: 0.969696969697
Iteration 2
logdir: tf_logs/logreg-run-20170606195646/
batch size: 73
learning_rate: 0.00203228544324
training: .....................
precision: 0.969696969697
recall: 0.969696969697
Iteration 3
logdir: tf_logs/logreg-run-20170606195730/
batch size: 6
learning_rate: 0.00449152382514
training: .....................
precision: 0.980198019802
recall: 1.0
Iteration 4
logdir: tf_logs/logreg-run-20170606200523/
batch size: 24
learning_rate: 0.0796323472178
training: .....................
precision: 0.980198019802
recall: 1.0
Iteration 5
logdir: tf_logs/logreg-run-20170606200726/
batch size: 75
learning_rate: 0.000463425058329
training: .....................
precision: 0.912621359223
recall: 0.949494949495
Iteration 6
logdir: tf_logs/logreg-run-20170606200810/
batch size: 86
learning_rate: 0.0477068184194
training: .....................
precision: 0.98
recall: 0.989898989899
Iteration 7
logdir: tf_logs/logreg-run-20170606200851/
batch size: 87
learning_rate: 0.000169404470952
training: .....................
precision: 0.888888888889
recall: 0.808080808081
Iteration 8
logdir: tf_logs/logreg-run-20170606200932/
batch size: 61
learning_rate: 0.0417146119941
training: .....................
precision: 0.980198019802
recall: 1.0
Iteration 9
logdir: tf_logs/logreg-run-20170606201026/
batch size: 92
learning_rate: 0.000107429229684
training: .....................
precision: 0.882352941176
recall: 0.757575757576



Posted 2019-09-06 11:30 by 马在路上