TensorFlow Beginner Notes (3): Annotated CNN Text-Classification Code

I haven't really gotten started with TensorFlow yet, but work requires me to implement text classification with a CNN, so I'm using the ready-made cnn-text-classification-tf code from GitHub and learning as I read.

 

The source consists of four Python files:

  • text_cnn.py: network architecture
  • train.py: network training
  • eval.py: prediction & evaluation
  • data_helpers.py: data preprocessing

The annotations follow below.

import tensorflow as tf
import numpy as np

# Define the network structure
class TextCNN(object):
    """
    A CNN for text classification.
    Uses an embedding layer, followed by a convolutional, max-pooling and softmax layer.
    """
    def __init__(
      self, sequence_length, num_classes, vocab_size,
      embedding_size, filter_sizes, num_filters, l2_reg_lambda=0.0):

        # Placeholders for input, output and dropout
        self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
        self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
        self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

        # Keeping track of l2 regularization loss (optional)
        l2_loss = tf.constant(0.0)

        # Embedding layer
        with tf.device('/cpu:0'), tf.name_scope("embedding"):
            self.W = tf.Variable(
                tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
                name="W")
            self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
            self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

        # Create a convolution + maxpool layer for each filter size
        pooled_outputs = []
        for i, filter_size in enumerate(filter_sizes):
            with tf.name_scope("conv-maxpool-%s" % filter_size):
                # Convolution Layer
                filter_shape = [filter_size, embedding_size, 1, num_filters]
                W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
                b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
                conv = tf.nn.conv2d(
                    self.embedded_chars_expanded,
                    W,
                    strides=[1, 1, 1, 1],
                    padding="VALID",
                    name="conv")
                # Apply nonlinearity
                h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
                # Maxpooling over the outputs
                pooled = tf.nn.max_pool(
                    h,
                    ksize=[1, sequence_length - filter_size + 1, 1, 1],
                    strides=[1, 1, 1, 1],
                    padding='VALID',
                    name="pool")
                pooled_outputs.append(pooled)

        # Combine all the pooled features
        num_filters_total = num_filters * len(filter_sizes)
        self.h_pool = tf.concat(pooled_outputs, 3)
        self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

        # Add dropout
        with tf.name_scope("dropout"):
            self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

        # Final (unnormalized) scores and predictions
        with tf.name_scope("output"):
            W = tf.get_variable(
                "W",
                shape=[num_filters_total, num_classes],
                initializer=tf.contrib.layers.xavier_initializer())
            b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
            l2_loss += tf.nn.l2_loss(W)
            l2_loss += tf.nn.l2_loss(b)
            self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
            self.predictions = tf.argmax(self.scores, 1, name="predictions")

        # Calculate mean cross-entropy loss
        with tf.name_scope("loss"):
            losses = tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y)
            self.loss = tf.reduce_mean(losses) + l2_reg_lambda * l2_loss

        # Accuracy
        with tf.name_scope("accuracy"):
            correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
            self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")
text_cnn.py

As you can see, the TextCNN class defines the structure of the neural network and is initialized with a number of constructor parameters:

  • sequence_length: fixed sentence length (shorter sentences are padded, longer ones truncated)
  • num_classes: number of classes
  • vocab_size: vocabulary size
  • embedding_size: word-embedding dimension
  • filter_sizes: convolution kernel sizes
  • num_filters: number of kernels per size
  • l2_reg_lambda=0.0: L2 regularization coefficient

Now let's build the network, walking through it statement by statement.

self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")
l2_loss = tf.constant(0.0)

The placeholder input_x holds the sentence matrix: its width is sequence_length and its first dimension is left open (= number of sentences); input_y holds the corresponding class labels, with width num_classes and likewise an open first dimension.

The placeholder dropout_keep_prob holds the dropout keep probability, and l2_loss (initialized to the constant 0.0) accumulates the L2 regularization loss; the actual regularization strength is the hyperparameter l2_reg_lambda.
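As a quick sanity check, here is a minimal sketch of how these placeholders accept a batch of arbitrary size. It assumes a TensorFlow 1.x environment (as in the original code); the toy sizes and values are invented for illustration only.

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical toy sizes, not from the original project.
sequence_length, num_classes = 5, 2

input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

with tf.Session() as sess:
    # The first dimension is None, so any number of sentences can be fed at once.
    batch = np.zeros((3, sequence_length), dtype=np.int32)  # 3 padded sentences of word ids
    print(sess.run(tf.shape(input_x), {input_x: batch}))    # [3 5]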

# Embedding layer
with tf.device('/cpu:0'), tf.name_scope("embedding"):
    self.W = tf.Variable(
        tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
        name="W")
    self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
    self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

Embedding layer

self.W can be thought of as the word-embedding dictionary: it stores vocab_size word vectors of size embedding_size, randomly initialized with values between -1 and 1.

self.embedded_chars is the word-vector representation of the input input_x; shape: [number of sentences, sequence_length, embedding_size].

self.embedded_chars_expanded adds one extra dimension to that representation, so the shape becomes [number of sentences, sequence_length, embedding_size, 1], which is what the convolution expects (the input argument of tf.nn.conv2d must be a 4-D tensor, see below).

The function tf.expand_dims(input, axis=None, name=None, dim=None) inserts a dimension of size 1 at position axis of input (dim is equivalent to axis and is deprecated in the official docs).
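To make the shapes concrete, here is a minimal sketch (TensorFlow 1.x, with hypothetical toy sizes) of embedding_lookup followed by expand_dims:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical toy sizes, not from the original project.
vocab_size, embedding_size, sequence_length = 10, 4, 6

W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0))
input_x = tf.placeholder(tf.int32, [None, sequence_length])

embedded_chars = tf.nn.embedding_lookup(W, input_x)           # [batch, sequence_length, embedding_size]
embedded_chars_expanded = tf.expand_dims(embedded_chars, -1)  # [batch, sequence_length, embedding_size, 1]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    ids = np.random.randint(0, vocab_size, size=(2, sequence_length))
    print(sess.run(tf.shape(embedded_chars), {input_x: ids}))           # [2 6 4]
    print(sess.run(tf.shape(embedded_chars_expanded), {input_x: ids}))  # [2 6 4 1]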

# Create a convolution + maxpool layer for each filter size
pooled_outputs = []
for i, filter_size in enumerate(filter_sizes):
    with tf.name_scope("conv-maxpool-%s" % filter_size):
        # Convolution Layer
        filter_shape = [filter_size, embedding_size, 1, num_filters]
        W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
        conv = tf.nn.conv2d(
                    self.embedded_chars_expanded,
                    W,
                    strides=[1, 1, 1, 1],
                    padding="VALID",
                    name="conv")
        # Apply nonlinearity
        h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
        # Maxpooling over the outputs
        pooled = tf.nn.max_pool(
                    h,
                    ksize=[1, sequence_length - filter_size + 1, 1, 1],
                    strides=[1, 1, 1, 1],
                    padding='VALID',
                    name="pool")
        pooled_outputs.append(pooled)  

Convolution layer (the names below all live under the conv-maxpool-i name scope)

Convolution

conv-maxpool-i/filter_shape: the shape of the kernel tensor. It describes num_filters kernels (the number of output channels), each of size filter_size * embedding_size, with a single input channel. Because the kernel width equals embedding_size, each kernel spans the entire word vector, so the filter only slides along the sentence (word) direction and never across the embedding dimension.

conv-maxpool-i/W: the kernels themselves, with shape filter_shape; the elements are randomly drawn from a truncated normal distribution.

conv-maxpool-i/b: the biases; there are num_filters kernels, hence that many bias values.

conv-maxpool-i/conv: the convolution of self.embedded_chars_expanded with conv-maxpool-i/W.

The function tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None) performs the convolution (see http://blog.csdn.net/mao_xiao_feng/article/details/78004522). The arguments used here are listed below; a shape sketch follows the list.

  • input: the embedded input, [number of sentences (batch, cf. number of images), fixed sentence length (cf. image height), embedding dimension (cf. image width), 1 (cf. number of image channels)]
  • filter: the kernels, [kernel height, embedding dimension (kernel width), 1 (input channels), number of kernels (output channels)]
  • strides: the step size along each dimension, a 1-D vector of length 4; for images usually [1, x, x, 1]
  • padding: the convolution mode, 'SAME' for same-length convolution, 'VALID' for narrow convolution
  • the output feature map has shape [batch, height, width, channels]
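Here is a minimal shape sketch of the convolution (TensorFlow 1.x, hypothetical toy sizes) showing why VALID padding leaves sequence_length - filter_size + 1 positions and an output width of 1:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical toy sizes: 2 sentences, 6 words, 4-d embeddings,
# one filter size (3) with 8 output channels.
batch, sequence_length, embedding_size = 2, 6, 4
filter_size, num_filters = 3, 8

x = tf.constant(np.random.randn(batch, sequence_length, embedding_size, 1), dtype=tf.float32)
W = tf.truncated_normal([filter_size, embedding_size, 1, num_filters], stddev=0.1)

conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="VALID")

with tf.Session() as sess:
    # Output height = sequence_length - filter_size + 1 = 4; output width = 1
    # because the kernel spans the full embedding dimension.
    print(sess.run(tf.shape(conv)))  # [2 4 1 8]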

 

Activation

conv-maxpool-i/h: the result of applying the nonlinear activation to WX + b.

The function tf.nn.bias_add(value, bias, name=None) adds the bias term bias to value with broadcasting; bias must be 1-D, value may have any rank, and its last dimension must match the size of bias.

The function tf.nn.relu(features, name=None) applies the ReLU nonlinearity.

Pooling

conv-maxpool-i/pooled: the result after max-pooling.

The function tf.nn.max_pool(value, ksize, strides, padding, name=None) max-pools value:

  • value: the 4-D tensor to be pooled, with shape [batch, height, width, channels]
  • ksize: the pooling window size, an array of length (at least) 4 matching the dimensions of value; usually [1, height, width, 1], so no pooling happens over the batch or channels dimensions
  • strides: analogous to the convolution strides
  • padding: analogous to the convolution padding argument
  • the returned tensor again has shape [batch, height, width, channels]

The pooled result is appended to pooled_outputs. The loop repeats this for every filter size, so pooled_outputs ends up with len(filter_sizes) entries (each entry already contains all num_filters channels for that size); a shape sketch follows below.
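A minimal sketch of the pooling shapes (TensorFlow 1.x, continuing the hypothetical toy sizes used above):

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical conv output: batch=2, height 4 (= sequence_length - filter_size + 1),
# width 1, num_filters=8 channels.
h = tf.constant(np.random.randn(2, 4, 1, 8), dtype=tf.float32)

pooled = tf.nn.max_pool(
    h,
    ksize=[1, 4, 1, 1],    # pool over the whole remaining length in one step
    strides=[1, 1, 1, 1],
    padding='VALID')

with tf.Session() as sess:
    # One max value per filter per sentence: shape [2, 1, 1, 8].
    print(sess.run(tf.shape(pooled)))  # [2 1 1 8]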

# Combine all the pooled features
num_filters_total = num_filters * len(filter_sizes)
self.h_pool = tf.concat(pooled_outputs, 3)
self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

tf.concat(values, axis) concatenates the tensors in values along dimension axis (counted from 0). If values[i].shape = [D0, D1, ..., Daxis(i), ..., Dn], the result has shape [D0, D1, ..., Raxis, ..., Dn], where Raxis is the sum of the Daxis(i). Recall what is stored in pooled_outputs: one pooled tensor per filter size, each of shape [batch, height=1, width=1, channels=num_filters]. Concatenating along dimension 3 (the channel dimension) therefore stitches together, for each sentence, the features produced by the different kernel sizes into a single channel dimension of size num_filters_total, and the reshape then flattens that into one feature vector per sentence, as in the sketch below.
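A minimal sketch of the concatenation and flattening (TensorFlow 1.x, hypothetical toy shapes: three filter sizes, 8 filters each, batch of 2):

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical pooled outputs for three filter sizes, each [batch=2, 1, 1, num_filters=8].
pooled_outputs = [tf.constant(np.random.randn(2, 1, 1, 8), dtype=tf.float32)
                  for _ in range(3)]

num_filters_total = 8 * 3
h_pool = tf.concat(pooled_outputs, 3)                      # concatenate along the channel axis
h_pool_flat = tf.reshape(h_pool, [-1, num_filters_total])  # one feature vector per sentence

with tf.Session() as sess:
    print(sess.run(tf.shape(h_pool)))       # [2 1 1 24]
    print(sess.run(tf.shape(h_pool_flat)))  # [2 24]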

        # Add dropout
        with tf.name_scope("dropout"):
            self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

        # Final (unnormalized) scores and predictions
        with tf.name_scope("output"):
            W = tf.get_variable(
                "W",
                shape=[num_filters_total, num_classes],
                initializer=tf.contrib.layers.xavier_initializer())
            b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
            l2_loss += tf.nn.l2_loss(W)
            l2_loss += tf.nn.l2_loss(b)
            self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
            self.predictions = tf.argmax(self.scores, 1, name="predictions")

        # Calculate mean cross-entropy loss
        with tf.name_scope("loss"):
            losses = tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y)
            self.loss = tf.reduce_mean(losses) + l2_reg_lambda * l2_loss

        # Accuracy
        with tf.name_scope("accuracy"):
            correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
            self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")
——————————————————————————————————————————————————————————————————
By request, here are the annotations for the rest of the code; it has been a while, so corrections are welcome if anything is wrong.

Dropout layer
tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)
The dropout layer applies dropout to the pooled-and-flattened result h_pool_flat with keep probability dropout_keep_prob, to prevent overfitting.
Everything above makes up the hidden layers of the network.
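Before moving on to the output layer, here is a minimal sketch of what dropout does to a feature vector (TensorFlow 1.x, toy values invented for illustration):

import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical flattened features for one sentence, 6 features.
h_pool_flat = tf.ones([1, 6])
dropout_keep_prob = tf.placeholder(tf.float32)

h_drop = tf.nn.dropout(h_pool_flat, dropout_keep_prob)

with tf.Session() as sess:
    # With keep probability 0.5, roughly half of the features are zeroed out and the
    # survivors are scaled by 1/0.5 = 2; with 1.0, dropout is effectively disabled (evaluation).
    print(sess.run(h_drop, {dropout_keep_prob: 0.5}))
    print(sess.run(h_drop, {dropout_keep_prob: 1.0}))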

Output layer (+ softmax layer)
W and b are the parameters of the linear layer; since two new parameters are introduced here, their L2 losses are added to l2_loss.
self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores") computes WX + b, the model's final score for each class.
self.predictions = tf.argmax(self.scores, 1, name="predictions") takes the class with the highest score as the model's prediction.
Note that scores is not normalized.
Next, tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y) computes the cross-entropy loss between the predicted scores and the true labels input_y.
The final loss is the mean cross-entropy loss plus the L2 regularization loss, where l2_reg_lambda is the regularization coefficient.
Training presumably uses this loss value as the objective for gradient descent.
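A minimal numeric sketch of the loss computation (TensorFlow 1.x; the scores and labels are invented for illustration):

import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical unnormalized scores for 2 sentences and 2 classes, with one-hot labels.
scores = tf.constant([[2.0, 0.5],
                      [0.1, 1.5]])
input_y = tf.constant([[1.0, 0.0],
                       [0.0, 1.0]])

losses = tf.nn.softmax_cross_entropy_with_logits(logits=scores, labels=input_y)
loss = tf.reduce_mean(losses)  # the real model also adds l2_reg_lambda * l2_loss

with tf.Session() as sess:
    print(sess.run(losses))  # per-sentence cross-entropy
    print(sess.run(loss))    # mean over the batch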

Model evaluation
This part is not really part of the model structure; it just computes the model's accuracy: tf.equal checks whether each prediction matches the true label, and tf.reduce_mean computes the fraction of predictions that agree with the ground truth.
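A minimal sketch of the accuracy computation (TensorFlow 1.x; the predictions and labels are invented for illustration):

import tensorflow as tf  # TensorFlow 1.x API

# Hypothetical predicted class indices and one-hot labels for 4 sentences.
predictions = tf.constant([0, 1, 1, 0], dtype=tf.int64)
input_y = tf.constant([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 0.0],
                       [1.0, 0.0]])

correct_predictions = tf.equal(predictions, tf.argmax(input_y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"))

with tf.Session() as sess:
    print(sess.run(accuracy))  # 0.75: three of the four predictions match the labels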

posted @ 2018-01-29 17:13  黄昏与钟声