keras

Keras is a deep learning framework that gives users a concise, friendly interface; its backend can be TensorFlow, Theano, or CNTK.

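1. Sequential model

A Sequential model is a linear stack of layers. Use it when the network is a single path from input to output with no branches.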

from keras.models import Sequential
from keras.layers import Dense, Activation
# The first layer declares the input shape; 32 is its number of output units.
model = Sequential([Dense(32, input_shape=(784,)), Activation('relu'),
                    Dense(10), Activation('softmax')])

Layers can also be added one at a time with add():

model = Sequential()
model.add(Dense(32, input_shape=(784,)))
model.add(Activation('relu'))

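The first layer must be told the shape of its input; later layers infer their shapes automatically. The batch dimension is not part of input_shape: for data of shape B x H x W x C, input_shape=(H, W, C) is enough.
compile() configures training. It takes an optimizer, a loss, and evaluation metrics; optimizer and loss accept either a string or an object, and metrics must be a list, e.g. metrics=['accuracy'].
Finally, fit() trains the model. The simplest call is model.fit(data, label), with the data as numpy.ndarray; other arguments can be given as well, such as epochs=100, batch_size=64, validation_split=0.1, shuffle=True, verbose=1, initial_epoch=0. Note how shuffle and validation_split interact: the split happens first, taking the last fraction of the samples as validation data, and only then is the training data shuffled. If the samples are ordered by class, the validation set can end up containing a single class.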
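The split-before-shuffle order can be illustrated without Keras. The following is a minimal NumPy sketch that mimics the behaviour described above; split_then_shuffle is a hypothetical helper written for this illustration, not a Keras function, and the data is made up.

```python
import numpy as np

def split_then_shuffle(x, y, validation_split=0.1, seed=0):
    """Hold out the tail of the data first, then shuffle only the training part."""
    split_at = int(len(x) * (1.0 - validation_split))
    x_train, y_train = x[:split_at], y[:split_at]
    x_val, y_val = x[split_at:], y[split_at:]
    # Shuffling happens after the split, so it cannot mix classes into x_val
    order = np.random.RandomState(seed).permutation(len(x_train))
    return x_train[order], y_train[order], x_val, y_val

# 100 samples sorted by class: 90 of class 0 followed by 10 of class 1
x = np.arange(100).reshape(100, 1)
y = np.array([0] * 90 + [1] * 10)

x_train, y_train, x_val, y_val = split_then_shuffle(x, y)
val_classes = set(y_val.tolist())      # classes present in the validation set
train_classes = set(y_train.tolist())  # classes present in the training set
```

With class-sorted data like this, the validation set holds only class 1 and the training set only class 0. Shuffling the arrays once before calling fit() is one way to avoid that.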

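2. Functional model

Complex models, such as multi-output models, directed acyclic graphs, or models with shared layers, cannot be expressed as a Sequential model and need the functional API. The functional API can of course also express a Sequential model.

Use Input to declare the input shape, call layers on tensors one after another to obtain the final output, then pass the network's inputs and outputs to Model().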

from keras.layers import Input, Dense
from keras.models import Model

# This returns a tensor
inputs = Input(shape=(784,))

# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)  # starts training

3. Common models

3.1 Multi-input and multi-output models

import keras  # needed for keras.layers.concatenate below
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')

# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)

# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)

auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

# Two inputs, two outputs
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
model.fit([headline_data, additional_data], [labels, labels],
          epochs=50, batch_size=32)

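Because the input and output layers were given names above, the arguments can also be passed as dictionaries:

# In compile, set the loss used by each output and the weights of the output losses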
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})

# In fit, pass the data for each input and the labels for each output so the model can train
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=50, batch_size=32)

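3.2 Shared layers

Task: decide whether two Weibo posts were written by the same person. Both inputs must be fed in at once, and we want a single shared layer to process both of them:

from keras.layers import Input, Conv2D

a = Input(shape=(32, 32, 3))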
b = Input(shape=(64, 64, 3))

conv = Conv2D(16, (3, 3), padding='same')
conved_a = conv(a)

# Only one input so far, the following will work:
assert conv.input_shape == (None, 32, 32, 3)

conved_b = conv(b)
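# now the `.input_shape` property wouldn't work, but `get_input_shape_at()` does

Because the same conv layer was applied to two inputs, it now has two nodes; query each one by index with the following two calls: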

assert conv.get_input_shape_at(0) == (None, 32, 32, 3)
assert conv.get_input_shape_at(1) == (None, 64, 64, 3)

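3.3 Inception module

Several convolution branches run in parallel on the same input, and their outputs are merged.

import keras  # needed for keras.layers.concatenate below
from keras.layers import Conv2D, MaxPooling2D, Input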

input_img = Input(shape=(256, 256, 3))

tower_1 = Conv2D(64, (1, 1), padding='same', activation='relu')(input_img)
tower_1 = Conv2D(64, (3, 3), padding='same', activation='relu')(tower_1)

tower_2 = Conv2D(64, (1, 1), padding='same', activation='relu')(input_img)
tower_2 = Conv2D(64, (5, 5), padding='same', activation='relu')(tower_2)

tower_3 = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(input_img)
tower_3 = Conv2D(64, (1, 1), padding='same', activation='relu')(tower_3)

# Merge the towers along the channel axis (the last axis for channels-last input)
output = keras.layers.concatenate([tower_1, tower_2, tower_3], axis=-1)
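The choice of concatenation axis matters for channels-last tensors such as the (256, 256, 3) input above. A minimal NumPy sketch of the shapes involved follows; the arrays stand in for the three tower outputs, and the spatial size is reduced to 8 x 8 here purely to keep the example light (an assumption of this illustration).

```python
import numpy as np

# Stand-ins for the tower outputs: (batch, height, width, channels)
tower_1 = np.zeros((1, 8, 8, 64))
tower_2 = np.zeros((1, 8, 8, 64))
tower_3 = np.zeros((1, 8, 8, 64))

# Joining along the last axis stacks the feature maps channel-wise
by_channels = np.concatenate([tower_1, tower_2, tower_3], axis=-1)

# Joining along axis 1 stacks along the height dimension instead
by_height = np.concatenate([tower_1, tower_2, tower_3], axis=1)

print(by_channels.shape)  # (1, 8, 8, 192)
print(by_height.shape)    # (1, 24, 8, 64)
```

For an inception-style merge the channel-wise result is the intended one: the spatial size is unchanged and the 64-channel outputs of the three towers sit side by side.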

3.4 Residual connection

import keras  # needed for keras.layers.add below
from keras.layers import Conv2D, Input

# input tensor for a 3-channel 256x256 image
x = Input(shape=(256, 256, 3))
# 3x3 conv with 3 output channels (same as input channels)
y = Conv2D(3, (3, 3), padding='same')(x)
# this returns x + y.
z = keras.layers.add([x, y])

3.5 Shared vision model

import keras  # needed for keras.layers.concatenate below
from keras.layers import Conv2D, MaxPooling2D, Input, Dense, Flatten
from keras.models import Model

# First, define the vision modules
digit_input = Input(shape=(27, 27, 1))
x = Conv2D(64, (3, 3))(digit_input)
x = Conv2D(64, (3, 3))(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)

vision_model = Model(digit_input, out)

# Then define the tell-digits-apart model
digit_a = Input(shape=(27, 27, 1))
digit_b = Input(shape=(27, 27, 1))

# The vision model will be shared, weights and all
out_a = vision_model(digit_a)
out_b = vision_model(digit_b)

concatenated = keras.layers.concatenate([out_a, out_b])
out = Dense(1, activation='sigmoid')(concatenated)

classification_model = Model([digit_a, digit_b], out)

3.6 Visual question answering model

import keras  # needed for keras.layers.concatenate below
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense
from keras.models import Model, Sequential

# First, let's define a vision model using a Sequential model.
# This model will encode an image into a vector.
vision_model = Sequential()
vision_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
vision_model.add(Conv2D(64, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(128, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Flatten())

# Now let's get a tensor with the output of our vision model:
image_input = Input(shape=(224, 224, 3))
encoded_image = vision_model(image_input)

# Next, let's define a language model to encode the question into a vector.
# Each question will be at most 100 word long,
# and we will index words as integers from 1 to 9999.
question_input = Input(shape=(100,), dtype='int32')
embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
encoded_question = LSTM(256)(embedded_question)

# Let's concatenate the question vector and the image vector:
merged = keras.layers.concatenate([encoded_question, encoded_image])

# And let's train a logistic regression over 1000 words on top:
output = Dense(1000, activation='softmax')(merged)

# This is our final model:
vqa_model = Model(inputs=[image_input, question_input], outputs=output)

# The next stage would be training this model on actual data.

3.7 Video question answering model

Adding an LSTM on top of the previous vision model lets it handle video input.

from keras.layers import TimeDistributed

video_input = Input(shape=(100, 224, 224, 3))
# This is our video encoded via the previously trained vision_model (weights are reused)
encoded_frame_sequence = TimeDistributed(vision_model)(video_input)  # the output will be a sequence of vectors
encoded_video = LSTM(256)(encoded_frame_sequence)  # the output will be a vector

# This is a model-level representation of the question encoder, reusing the same weights as before:
question_encoder = Model(inputs=question_input, outputs=encoded_question)

# Let's use it to encode the question:
video_question_input = Input(shape=(100,), dtype='int32')
encoded_video_question = question_encoder(video_question_input)

# And this is our video question answering model:
merged = keras.layers.concatenate([encoded_video, encoded_video_question])
output = Dense(1000, activation='softmax')(merged)
video_qa_model = Model(inputs=[video_input, video_question_input], outputs=output)

4. API

4.1 compile

model.compile() provides loss_weights for weighting multiple losses, e.g. loss_weights=[0.8, 0.2].

compile(self, optimizer, loss, metrics=None, loss_weights=None, sample_weight_mode=None, weighted_metrics=None, target_tensors=None)
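The effect of loss_weights can be sketched in plain Python: the quantity being minimized is the weighted sum of the per-output losses. The function name total_loss and the loss values below are hypothetical, chosen only for illustration.

```python
def total_loss(losses, loss_weights):
    """Weighted sum of per-output losses, one weight per output."""
    if len(losses) != len(loss_weights):
        raise ValueError("need exactly one weight per loss")
    return sum(w * l for w, l in zip(loss_weights, losses))

# Made-up loss values for a main output and an auxiliary output
main_loss, aux_loss = 0.5, 1.0
combined = total_loss([main_loss, aux_loss], loss_weights=[0.8, 0.2])
```

Here the auxiliary output contributes 0.2 * 1.0 and the main output 0.8 * 0.5, so the main output dominates the gradient even though its raw loss is smaller.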

4.2 fit

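In model.fit(), class_weight assigns a different loss weight to each class. steps_per_epoch is the number of iterations in one epoch; when left at its default of None, it equals (nb_samples + batch_size - 1) // batch_size. validation_data supplies a validation set and overrides validation_split.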

fit(self, x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None)
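The default step count quoted above is a ceiling division: a final, smaller batch still counts as one step. A short sketch, where default_steps_per_epoch is a hypothetical helper name and not part of Keras:

```python
def default_steps_per_epoch(nb_samples, batch_size):
    """Ceiling of nb_samples / batch_size using integer arithmetic."""
    return (nb_samples + batch_size - 1) // batch_size

print(default_steps_per_epoch(1000, 64))  # 16: 15 full batches plus one of 40
print(default_steps_per_epoch(1024, 64))  # 16: exactly 16 full batches
print(default_steps_per_epoch(1025, 64))  # 17: one extra batch for the last sample
```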

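fit_generator() trains the model on batches produced one at a time by a Python generator. The generator runs in parallel with the model for efficiency; for example, data can be preprocessed in real time on the CPU while the model trains on the GPU. Its main parameters:

steps_per_epoch: number of iterations (batches drawn from the generator) in one epoch
workers: number of workers used to produce data
max_q_size: maximum size of the generator queue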

fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)

Example:

def generate_arrays_from_file(path):
    while 1:
        f = open(path)
        for line in f:
            # create numpy arrays of input data
            # and labels, from each line in the file
            x1, x2, y = process_line(line)
            yield ({'input_1': x1, 'input_2': x2}, {'output': y})
        f.close()

model.fit_generator(generate_arrays_from_file('/my_file.txt'),
        steps_per_epoch=10000, epochs=10)
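The generator above loops forever, and steps_per_epoch decides how many batches make up one epoch. A self-contained sketch of the same pattern in plain Python follows; generate_batches and the toy samples are hypothetical, and itertools.islice stands in for the training loop that draws batches.

```python
import itertools

def generate_batches(samples, batch_size):
    """Yield (inputs, targets) batches forever, restarting after each full pass."""
    while True:
        for start in range(0, len(samples), batch_size):
            chunk = samples[start:start + batch_size]
            inputs = [x for x, _ in chunk]
            targets = [y for _, y in chunk]
            yield inputs, targets

# Five toy (input, target) pairs
samples = [(i, i % 2) for i in range(5)]
gen = generate_batches(samples, batch_size=2)

# One "epoch" of 3 steps consumes exactly one pass over the 5 samples
steps_per_epoch = 3
first_epoch = list(itertools.islice(gen, steps_per_epoch))

# The generator never ends: the next batch starts the second pass
next_batch = next(gen)
```

Because the generator has no natural end, the consumer has to be told where an epoch stops, which is the role steps_per_epoch plays.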

predict_generator takes much the same arguments as fit_generator; it requires generator and steps.

predict_generator(self, generator, steps, max_queue_size=10, workers=1, use_multiprocessing=False, verbose=0)

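4.3 Lambda layer

Use keras.layers.core.Lambda to define a custom layer.

from keras import backend as K
from keras.layers.core import Lambda
from keras.models import Model

def lambda_ctc_func(x):
    y_pred, labels, pred_len, label_len = x
    # Keep index 0 of the third axis so y_pred is 3-D, as K.ctc_batch_cost expects
    y_pred = y_pred[:, :, 0, :]
    return K.ctc_batch_cost(labels, y_pred, pred_len, label_len)

# build_model and args are the author's own helpers; labels, pred_len and
# label_len are Input tensors assumed to be defined elsewhere.
input_tensor, y_pred = build_model(args.img_size[0], args.num_channels)
loss_out = Lambda(lambda_ctc_func, name="ctc_loss")([y_pred, labels, pred_len, label_len])

model = Model(inputs=[input_tensor, labels, pred_len, label_len], outputs=loss_out)
# compile() also requires an optimizer; 'adam' is an assumption, the original omits it.
model.compile(optimizer='adam', loss={'ctc_loss': lambda label, loss_out: loss_out})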

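The loss passed to compile() is a function of two arguments: the ground-truth label and the network output. A loss function normally computes the loss from these two, but here the loss has already been computed inside the Lambda layer, so the network output loss_out is itself the final loss value; the lambda expression simply returns it.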
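The pass-through idea can be seen without Keras. In this minimal NumPy sketch, passthrough_loss, dummy_labels and per_sample_ctc are hypothetical names with made-up values; the "loss function" ignores the labels and returns what the network already computed.

```python
import numpy as np

def passthrough_loss(y_true, y_pred):
    """Ignore y_true: the network output already is the per-sample loss."""
    return y_pred

# Stand-in for the values the Lambda layer would output for a batch of 4
per_sample_ctc = np.array([2.5, 0.7, 1.2, 3.1])

# The labels are irrelevant here, so any array of the right length will do
dummy_labels = np.zeros(4)

loss_values = passthrough_loss(dummy_labels, per_sample_ctc)
mean_loss = float(loss_values.mean())
```

This is also why the targets given to such a model during training are typically just a placeholder array: the real labels enter the network as an input instead.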

When the trained network is used afterwards, the loss_out part has no weights to load, so the new model only needs the y_pred output of the earlier network:

model = Model(inputs=input_tensor, outputs=y_pred)
model.load_weights('model.h5')

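4.4 BatchNormalization layer

This layer re-normalizes the activations of the previous layer on each batch, so that its output has a mean close to 0 and a standard deviation close to 1.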

keras.layers.normalization.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', moving_mean_initializer='zeros', moving_variance_initializer='ones', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None)
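What the layer computes on a training batch can be sketched in NumPy. This is a simplified illustration and not the Keras implementation: batch_norm_train and update_moving are hypothetical helpers, the input is random made-up data, and normalization is over the batch axis for a 2-D input (matching axis=-1 as the feature axis). The epsilon and momentum defaults mirror the signature above.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, epsilon=0.001):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + epsilon)
    return gamma * x_hat + beta, mean, var

def update_moving(moving, batch_value, momentum=0.99):
    """Exponential moving average of the batch statistics, used at inference time."""
    return momentum * moving + (1.0 - momentum) * batch_value

rng = np.random.RandomState(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))  # 256 samples, 4 features

# gamma initialized to ones and beta to zeros, as in the default initializers
out, mean, var = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))

# moving_mean starts at zeros and drifts slowly towards the batch mean
moving_mean = update_moving(np.zeros(4), mean)
```

After this step each column of out has a mean of about 0 and a standard deviation just under 1, the small gap coming from epsilon in the denominator.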

posted @ 2018-06-05 19:29 bairuiworld