Deep Learning 33: Reading the paper "Densely Connected Convolutional Networks": a simple take on DenseNet

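I. Before reading

1. The paper "Densely Connected Convolutional Networks" reports the best results of any CNN architecture I know of at the time of writing, better even than ResNet, so it is worth studying why it works so well.

2. Code: https://github.com/liuzhuang13/DenseNet

3. The paper draws mainly on Highway Networks, Residual Networks (ResNets) and GoogLeNet, so it is worth reading those papers first. "Very Deep Learning with Highway Networks" is also worth a look.

4. References: "ResNet && DenseNet (原理篇)" and "DenseNet模型".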

II. Reading notes

Abstract

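      Recent work has shown that convolutional networks can be substantially deeper, more accurate and more efficient to train if they contain shorter connections between layers close to the input and layers close to the output. Building on this observation, the paper proposes the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. A traditional convolutional network with L layers has only L connections, one between each layer and the next. In a DenseNet every layer is connected directly not just to its neighbour but to all subsequent layers, which gives L(L+1)/2 direct connections. The input to any layer is the feature maps of all preceding layers, and its own feature maps are used as input to all subsequent layers. DenseNet has several compelling advantages: 1. it alleviates the vanishing-gradient problem; 2. it strengthens feature propagation; 3. it substantially reduces the number of parameters. The architecture is evaluated on four highly competitive object recognition benchmarks: CIFAR-10, CIFAR-100, SVHN and ImageNet. On most of them DenseNet improves significantly on the previous state of the art.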
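The connection count is easy to check by brute force. The sketch below is only an illustration, not code from the paper or the repository: it treats the input as node 0 and the L layers as nodes 1 to L (the function names are made up for this note), enumerates every (earlier, later) pair, and confirms that the total is L(L+1)/2.

```python
from itertools import combinations


def plain_connections(num_layers: int) -> int:
    """A plain feed-forward chain: each layer receives one input."""
    return num_layers


def dense_connections(num_layers: int) -> int:
    """Count direct connections by enumerating (source, target) pairs.

    Node 0 is the input and nodes 1..num_layers are the layers;
    every node feeds every later node.
    """
    nodes = range(num_layers + 1)
    return sum(1 for _ in combinations(nodes, 2))


for L in (5, 40, 100):
    # The closed form quoted in the abstract.
    assert dense_connections(L) == L * (L + 1) // 2
    print(L, plain_connections(L), dense_connections(L))
```

Layer l receives l inputs (the network input plus the l-1 layers before it), which is where the sum 1+2+...+L comes from.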

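1. Introduction

      CNNs are a powerful machine learning approach for visual recognition. Although they were introduced more than 20 years ago, only in the last few years have improvements in computer hardware and network structure made it possible to train truly deep CNNs. The original LeNet5 had 5 layers, VGG had 19, and only last year did Highway Networks and ResNets pass the 100-layer mark.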

 

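III. Thoughts after reading

      Halfway through translating, I realised there was no real need to translate at all: the English original is perfectly readable as it is. The paper is written plainly, unlike papers by big names such as Hinton, Bengio or Yann LeCun, which I read straight through and come away only half understanding. After finishing this one I felt as if I had eaten honey and wanted to read it again. Even the analysis of the experimental results and the discussion at the end are very well written, especially the figure in the discussion. The idea is excellent and simple, and above all it comes about in the way I like best: look at why many earlier papers worked well, find what they have in common, then strengthen that common factor to get a better result.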

 

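IV. DenseNet structure

1. The DenseNet-BC structure used for training on CIFAR-10

With depth=40, growth_rate=12, bottleneck=True and reduction=0.5 (reduction=1-compression), the number of layers in each dense block is n_layers=((40-4)/3)//2=6, where //2 means dividing by 2 and rounding down.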
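The formula can be written as a small helper. This is a sketch under the assumption of 3 dense blocks, as in the CIFAR setup described here; `layers_per_block` is a name invented for this note, not a function from the repository.

```python
def layers_per_block(depth: int, bottleneck: bool = True, num_blocks: int = 3) -> int:
    """Number of layers in each dense block for a given total depth."""
    # Layers outside the dense blocks: the initial conv, one transition
    # between each pair of blocks, and the final classifier.
    outside = 1 + (num_blocks - 1) + 1
    convs_in_blocks = depth - outside
    if convs_in_blocks % num_blocks != 0:
        raise ValueError("depth does not split evenly across the dense blocks")
    convs_per_block = convs_in_blocks // num_blocks
    # With bottlenecks each layer is a 1x1 conv followed by a 3x3 conv,
    # so it uses up two convolutions of the depth budget.
    return convs_per_block // 2 if bottleneck else convs_per_block


print(layers_per_block(40))                    # the setting used in this post
print(layers_per_block(40, bottleneck=False))  # same depth without bottlenecks
```

For depth=40 with bottlenecks this gives 6 layers per block. Without bottlenecks the same depth gives 12, because each layer then contains a single convolution.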
Note: conv denotes an ordinary 2D convolution; CONV denotes BN-ReLU-conv.
The structure is as follows:
input:(32,32,3)
conv(24,3,3), % where conv(24,3,3)=conv(filters=2*growth_rate=24, kernel_size=(3,3))

#Dense block 1
CONV(48,1,1)-CONV(12,3,3)-merge(36)- % where CONV(48,1,1)=CONV(filters=inter_channel=4*growth_rate=48, 1, 1); after the merge nb_filter=24+12=36
CONV(48,1,1)-CONV(12,3,3)-merge(48)- % as above; after the merge nb_filter=36+12=48
CONV(48,1,1)-CONV(12,3,3)-merge(60)-
CONV(48,1,1)-CONV(12,3,3)-merge(72)-
CONV(48,1,1)-CONV(12,3,3)-merge(84)-
CONV(48,1,1)-CONV(12,3,3)-merge(96)- % nb_filter grows by growth_rate=12 with every layer; a dense block has 6 layers, so it grows by 72 and nb_filter=24+72=96

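#Transition layer 1
CONV(96,1,1) % with compression=0.5 this would be nb_filter*compression=96*0.5=48, but the Keras summary below keeps all 96 channels here (convolution2d_13), i.e. no compression was applied in that run, and the block sizes in this listing follow the summary
AveragePool(2,2,(2,2)) % pool_size=(2,2), strides=(2,2)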

#Dense block 2
CONV(48,1,1)-CONV(12,3,3)-merge(108)- % where CONV(48,1,1)=CONV(filters=inter_channel=4*growth_rate=48, 1, 1); after the merge nb_filter=96+12=108
CONV(48,1,1)-CONV(12,3,3)-merge(120)-
CONV(48,1,1)-CONV(12,3,3)-merge(132)-
CONV(48,1,1)-CONV(12,3,3)-merge(144)-
CONV(48,1,1)-CONV(12,3,3)-merge(156)-
CONV(48,1,1)-CONV(12,3,3)-merge(168)- % nb_filter grows by growth_rate=12 with every layer; a dense block has 6 layers, so it grows by 72 and nb_filter=96+72=168

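#Transition layer 2
CONV(168,1,1) % with compression=0.5 applied at both transitions, block 2 would end at 120 channels and this would be 120*0.5=60, but the Keras summary below keeps all 168 channels here (convolution2d_26)
AveragePool(2,2,(2,2)) % pool_size=(2,2), strides=(2,2)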

#Dense block 3
CONV(48,1,1)-CONV(12,3,3)-merge(180)- % where CONV(48,1,1)=CONV(filters=inter_channel=4*growth_rate=48, 1, 1)
CONV(48,1,1)-CONV(12,3,3)-merge(192)-
CONV(48,1,1)-CONV(12,3,3)-merge(204)-
CONV(48,1,1)-CONV(12,3,3)-merge(216)-
CONV(48,1,1)-CONV(12,3,3)-merge(228)-
CONV(48,1,1)-CONV(12,3,3)-merge(240)- % nb_filter grows by growth_rate=12 with every layer; a dense block has 6 layers, so it grows by 72 and nb_filter=168+72=240

Relu-GlobalAveragePool-softmax
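The channel bookkeeping in the listing can be replayed in a few lines. This is a rough sketch, not the repository code: `trace_channels` is a hypothetical helper, and it assumes the transition conv outputs int(nb_filter * compression) channels with compression = 1 - reduction.

```python
def trace_channels(depth=40, growth_rate=12, reduction=0.0):
    """Channel count after every merge in each dense block, plus the
    output width of each transition conv (3 dense blocks, bottleneck layers)."""
    num_blocks = 3
    n_layers = ((depth - 4) // 3) // 2
    compression = 1.0 - reduction
    nb_filter = 2 * growth_rate              # output of the initial conv
    blocks, transitions = [], []
    for block in range(num_blocks):
        merges = []
        for _ in range(n_layers):
            nb_filter += growth_rate         # each layer adds growth_rate maps
            merges.append(nb_filter)
        blocks.append(merges)
        if block < num_blocks - 1:           # no transition after the last block
            nb_filter = int(nb_filter * compression)
            transitions.append(nb_filter)
    return blocks, transitions


for reduction in (0.0, 0.5):
    blocks, transitions = trace_channels(reduction=reduction)
    print(reduction, [b[-1] for b in blocks], transitions)
```

With reduction=0.0 the blocks end at 96, 168 and 240 channels, which are the merge sizes that appear in the Keras summary in this post. With reduction=0.5 the transitions output 48 and 60 channels and the blocks end at 96, 120 and 132.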

To check the analysis above, here is the model summary printed with keras==1.2.0:

Model created
____________________________________________________________________________________________________
Layer (type)                      Output Shape         Param #  Connected to
====================================================================================================
input_1 (InputLayer)              (None, 32, 32, 3)    0
initial_conv2D (Convolution2D)    (None, 32, 32, 24)   648      input_1[0][0]
batchnormalization_1 (BatchNorma  (None, 32, 32, 24)   96       initial_conv2D[0][0]
activation_1 (Activation)         (None, 32, 32, 24)   0        batchnormalization_1[0][0]
convolution2d_1 (Convolution2D)   (None, 32, 32, 48)   1152     activation_1[0][0]
batchnormalization_2 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_1[0][0]
activation_2 (Activation)         (None, 32, 32, 48)   0        batchnormalization_2[0][0]
convolution2d_2 (Convolution2D)   (None, 32, 32, 12)   5184     activation_2[0][0]
merge_1 (Merge)                   (None, 32, 32, 36)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
batchnormalization_3 (BatchNorma  (None, 32, 32, 36)   144      merge_1[0][0]
activation_3 (Activation)         (None, 32, 32, 36)   0        batchnormalization_3[0][0]
convolution2d_3 (Convolution2D)   (None, 32, 32, 48)   1728     activation_3[0][0]
batchnormalization_4 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_3[0][0]
activation_4 (Activation)         (None, 32, 32, 48)   0        batchnormalization_4[0][0]
convolution2d_4 (Convolution2D)   (None, 32, 32, 12)   5184     activation_4[0][0]
merge_2 (Merge)                   (None, 32, 32, 48)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
batchnormalization_5 (BatchNorma  (None, 32, 32, 48)   192      merge_2[0][0]
activation_5 (Activation)         (None, 32, 32, 48)   0        batchnormalization_5[0][0]
convolution2d_5 (Convolution2D)   (None, 32, 32, 48)   2304     activation_5[0][0]
batchnormalization_6 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_5[0][0]
activation_6 (Activation)         (None, 32, 32, 48)   0        batchnormalization_6[0][0]
convolution2d_6 (Convolution2D)   (None, 32, 32, 12)   5184     activation_6[0][0]
merge_3 (Merge)                   (None, 32, 32, 60)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
batchnormalization_7 (BatchNorma  (None, 32, 32, 60)   240      merge_3[0][0]
activation_7 (Activation)         (None, 32, 32, 60)   0        batchnormalization_7[0][0]
convolution2d_7 (Convolution2D)   (None, 32, 32, 48)   2880     activation_7[0][0]
batchnormalization_8 (BatchNorma  (None, 32, 32, 48)   192      convolution2d_7[0][0]
activation_8 (Activation)         (None, 32, 32, 48)   0        batchnormalization_8[0][0]
convolution2d_8 (Convolution2D)   (None, 32, 32, 12)   5184     activation_8[0][0]
merge_4 (Merge)                   (None, 32, 32, 72)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
                                                                convolution2d_8[0][0]
batchnormalization_9 (BatchNorma  (None, 32, 32, 72)   288      merge_4[0][0]
activation_9 (Activation)         (None, 32, 32, 72)   0        batchnormalization_9[0][0]
convolution2d_9 (Convolution2D)   (None, 32, 32, 48)   3456     activation_9[0][0]
batchnormalization_10 (BatchNorm  (None, 32, 32, 48)   192      convolution2d_9[0][0]
activation_10 (Activation)        (None, 32, 32, 48)   0        batchnormalization_10[0][0]
convolution2d_10 (Convolution2D)  (None, 32, 32, 12)   5184     activation_10[0][0]
merge_5 (Merge)                   (None, 32, 32, 84)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
                                                                convolution2d_8[0][0]
                                                                convolution2d_10[0][0]
batchnormalization_11 (BatchNorm  (None, 32, 32, 84)   336      merge_5[0][0]
activation_11 (Activation)        (None, 32, 32, 84)   0        batchnormalization_11[0][0]
convolution2d_11 (Convolution2D)  (None, 32, 32, 48)   4032     activation_11[0][0]
batchnormalization_12 (BatchNorm  (None, 32, 32, 48)   192      convolution2d_11[0][0]
activation_12 (Activation)        (None, 32, 32, 48)   0        batchnormalization_12[0][0]
convolution2d_12 (Convolution2D)  (None, 32, 32, 12)   5184     activation_12[0][0]
merge_6 (Merge)                   (None, 32, 32, 96)   0        initial_conv2D[0][0]
                                                                convolution2d_2[0][0]
                                                                convolution2d_4[0][0]
                                                                convolution2d_6[0][0]
                                                                convolution2d_8[0][0]
                                                                convolution2d_10[0][0]
                                                                convolution2d_12[0][0]
batchnormalization_13 (BatchNorm  (None, 32, 32, 96)   384      merge_6[0][0]
activation_13 (Activation)        (None, 32, 32, 96)   0        batchnormalization_13[0][0]
convolution2d_13 (Convolution2D)  (None, 32, 32, 96)   9216     activation_13[0][0]
averagepooling2d_1 (AveragePooli  (None, 16, 16, 96)   0        convolution2d_13[0][0]
batchnormalization_14 (BatchNorm  (None, 16, 16, 96)   384      averagepooling2d_1[0][0]
activation_14 (Activation)        (None, 16, 16, 96)   0        batchnormalization_14[0][0]
convolution2d_14 (Convolution2D)  (None, 16, 16, 48)   4608     activation_14[0][0]
batchnormalization_15 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_14[0][0]
activation_15 (Activation)        (None, 16, 16, 48)   0        batchnormalization_15[0][0]
convolution2d_15 (Convolution2D)  (None, 16, 16, 12)   5184     activation_15[0][0]
merge_7 (Merge)                   (None, 16, 16, 108)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
batchnormalization_16 (BatchNorm  (None, 16, 16, 108)  432      merge_7[0][0]
activation_16 (Activation)        (None, 16, 16, 108)  0        batchnormalization_16[0][0]
convolution2d_16 (Convolution2D)  (None, 16, 16, 48)   5184     activation_16[0][0]
batchnormalization_17 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_16[0][0]
activation_17 (Activation)        (None, 16, 16, 48)   0        batchnormalization_17[0][0]
convolution2d_17 (Convolution2D)  (None, 16, 16, 12)   5184     activation_17[0][0]
merge_8 (Merge)                   (None, 16, 16, 120)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
batchnormalization_18 (BatchNorm  (None, 16, 16, 120)  480      merge_8[0][0]
activation_18 (Activation)        (None, 16, 16, 120)  0        batchnormalization_18[0][0]
convolution2d_18 (Convolution2D)  (None, 16, 16, 48)   5760     activation_18[0][0]
batchnormalization_19 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_18[0][0]
activation_19 (Activation)        (None, 16, 16, 48)   0        batchnormalization_19[0][0]
convolution2d_19 (Convolution2D)  (None, 16, 16, 12)   5184     activation_19[0][0]
merge_9 (Merge)                   (None, 16, 16, 132)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
batchnormalization_20 (BatchNorm  (None, 16, 16, 132)  528      merge_9[0][0]
activation_20 (Activation)        (None, 16, 16, 132)  0        batchnormalization_20[0][0]
convolution2d_20 (Convolution2D)  (None, 16, 16, 48)   6336     activation_20[0][0]
batchnormalization_21 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_20[0][0]
activation_21 (Activation)        (None, 16, 16, 48)   0        batchnormalization_21[0][0]
convolution2d_21 (Convolution2D)  (None, 16, 16, 12)   5184     activation_21[0][0]
merge_10 (Merge)                  (None, 16, 16, 144)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
                                                                convolution2d_21[0][0]
batchnormalization_22 (BatchNorm  (None, 16, 16, 144)  576      merge_10[0][0]
activation_22 (Activation)        (None, 16, 16, 144)  0        batchnormalization_22[0][0]
convolution2d_22 (Convolution2D)  (None, 16, 16, 48)   6912     activation_22[0][0]
batchnormalization_23 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_22[0][0]
activation_23 (Activation)        (None, 16, 16, 48)   0        batchnormalization_23[0][0]
convolution2d_23 (Convolution2D)  (None, 16, 16, 12)   5184     activation_23[0][0]
merge_11 (Merge)                  (None, 16, 16, 156)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
                                                                convolution2d_21[0][0]
                                                                convolution2d_23[0][0]
batchnormalization_24 (BatchNorm  (None, 16, 16, 156)  624      merge_11[0][0]
activation_24 (Activation)        (None, 16, 16, 156)  0        batchnormalization_24[0][0]
convolution2d_24 (Convolution2D)  (None, 16, 16, 48)   7488     activation_24[0][0]
batchnormalization_25 (BatchNorm  (None, 16, 16, 48)   192      convolution2d_24[0][0]
activation_25 (Activation)        (None, 16, 16, 48)   0        batchnormalization_25[0][0]
convolution2d_25 (Convolution2D)  (None, 16, 16, 12)   5184     activation_25[0][0]
merge_12 (Merge)                  (None, 16, 16, 168)  0        averagepooling2d_1[0][0]
                                                                convolution2d_15[0][0]
                                                                convolution2d_17[0][0]
                                                                convolution2d_19[0][0]
                                                                convolution2d_21[0][0]
                                                                convolution2d_23[0][0]
                                                                convolution2d_25[0][0]
batchnormalization_26 (BatchNorm  (None, 16, 16, 168)  672      merge_12[0][0]
activation_26 (Activation)        (None, 16, 16, 168)  0        batchnormalization_26[0][0]
convolution2d_26 (Convolution2D)  (None, 16, 16, 168)  28224    activation_26[0][0]
averagepooling2d_2 (AveragePooli  (None, 8, 8, 168)    0        convolution2d_26[0][0]
batchnormalization_27 (BatchNorm  (None, 8, 8, 168)    672      averagepooling2d_2[0][0]
activation_27 (Activation)        (None, 8, 8, 168)    0        batchnormalization_27[0][0]
convolution2d_27 (Convolution2D)  (None, 8, 8, 48)     8064     activation_27[0][0]
batchnormalization_28 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_27[0][0]
activation_28 (Activation)        (None, 8, 8, 48)     0        batchnormalization_28[0][0]
convolution2d_28 (Convolution2D)  (None, 8, 8, 12)     5184     activation_28[0][0]
merge_13 (Merge)                  (None, 8, 8, 180)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
batchnormalization_29 (BatchNorm  (None, 8, 8, 180)    720      merge_13[0][0]
activation_29 (Activation)        (None, 8, 8, 180)    0        batchnormalization_29[0][0]
convolution2d_29 (Convolution2D)  (None, 8, 8, 48)     8640     activation_29[0][0]
batchnormalization_30 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_29[0][0]
activation_30 (Activation)        (None, 8, 8, 48)     0        batchnormalization_30[0][0]
convolution2d_30 (Convolution2D)  (None, 8, 8, 12)     5184     activation_30[0][0]
merge_14 (Merge)                  (None, 8, 8, 192)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
batchnormalization_31 (BatchNorm  (None, 8, 8, 192)    768      merge_14[0][0]
activation_31 (Activation)        (None, 8, 8, 192)    0        batchnormalization_31[0][0]
convolution2d_31 (Convolution2D)  (None, 8, 8, 48)     9216     activation_31[0][0]
batchnormalization_32 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_31[0][0]
activation_32 (Activation)        (None, 8, 8, 48)     0        batchnormalization_32[0][0]
convolution2d_32 (Convolution2D)  (None, 8, 8, 12)     5184     activation_32[0][0]
merge_15 (Merge)                  (None, 8, 8, 204)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
batchnormalization_33 (BatchNorm  (None, 8, 8, 204)    816      merge_15[0][0]
activation_33 (Activation)        (None, 8, 8, 204)    0        batchnormalization_33[0][0]
convolution2d_33 (Convolution2D)  (None, 8, 8, 48)     9792     activation_33[0][0]
batchnormalization_34 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_33[0][0]
activation_34 (Activation)        (None, 8, 8, 48)     0        batchnormalization_34[0][0]
convolution2d_34 (Convolution2D)  (None, 8, 8, 12)     5184     activation_34[0][0]
merge_16 (Merge)                  (None, 8, 8, 216)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
                                                                convolution2d_34[0][0]
batchnormalization_35 (BatchNorm  (None, 8, 8, 216)    864      merge_16[0][0]
activation_35 (Activation)        (None, 8, 8, 216)    0        batchnormalization_35[0][0]
convolution2d_35 (Convolution2D)  (None, 8, 8, 48)     10368    activation_35[0][0]
batchnormalization_36 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_35[0][0]
activation_36 (Activation)        (None, 8, 8, 48)     0        batchnormalization_36[0][0]
convolution2d_36 (Convolution2D)  (None, 8, 8, 12)     5184     activation_36[0][0]
merge_17 (Merge)                  (None, 8, 8, 228)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
                                                                convolution2d_34[0][0]
                                                                convolution2d_36[0][0]
batchnormalization_37 (BatchNorm  (None, 8, 8, 228)    912      merge_17[0][0]
activation_37 (Activation)        (None, 8, 8, 228)    0        batchnormalization_37[0][0]
convolution2d_37 (Convolution2D)  (None, 8, 8, 48)     10944    activation_37[0][0]
batchnormalization_38 (BatchNorm  (None, 8, 8, 48)     192      convolution2d_37[0][0]
activation_38 (Activation)        (None, 8, 8, 48)     0        batchnormalization_38[0][0]
convolution2d_38 (Convolution2D)  (None, 8, 8, 12)     5184     activation_38[0][0]
merge_18 (Merge)                  (None, 8, 8, 240)    0        averagepooling2d_2[0][0]
                                                                convolution2d_28[0][0]
                                                                convolution2d_30[0][0]
                                                                convolution2d_32[0][0]
                                                                convolution2d_34[0][0]
                                                                convolution2d_36[0][0]
                                                                convolution2d_38[0][0]
batchnormalization_39 (BatchNorm  (None, 8, 8, 240)    960      merge_18[0][0]
activation_39 (Activation)        (None, 8, 8, 240)    0        batchnormalization_39[0][0]
globalaveragepooling2d_1 (Global  (None, 240)          0        activation_39[0][0]
dense_1 (Dense)                   (None, 10)           2410     globalaveragepooling2d_1[0][0]
====================================================================================================
Total params: 257,218
Trainable params: 249,946
Non-trainable params: 7,272
____________________________________________________________________________________________________
Finished compiling
Building model...
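The parameter counts in the summary can be reproduced by hand, which is a useful sanity check on the structure. The sketch below rests on three assumptions read off the summary itself: the convolutions have no bias (648 = 3*3*3*24), every BatchNormalization layer holds 4 values per channel of which 2 (the moving mean and variance) are non-trainable, and the two transition convolutions keep the channel count unchanged. The function names are invented for this note.

```python
def conv_params(c_in, c_out, kernel):
    """Weights of a kernel x kernel convolution without a bias term."""
    return kernel * kernel * c_in * c_out


def bn_params(channels):
    """gamma, beta, moving mean and moving variance: 4 values per channel."""
    return 4 * channels


def count_params(growth_rate=12, n_layers=6, num_blocks=3, num_classes=10):
    inter = 4 * growth_rate                       # width of the 1x1 bottleneck
    channels = 2 * growth_rate                    # output of the initial conv
    conv_total = conv_params(3, channels, 3)      # initial_conv2D on RGB input
    bn_total = 0
    for block in range(num_blocks):
        for _ in range(n_layers):
            # BN-ReLU-conv(1x1) followed by BN-ReLU-conv(3x3)
            bn_total += bn_params(channels) + bn_params(inter)
            conv_total += conv_params(channels, inter, 1)
            conv_total += conv_params(inter, growth_rate, 3)
            channels += growth_rate               # the merge widens the tensor
        if block < num_blocks - 1:
            # transition: BN-ReLU-conv(1x1) with unchanged width, then pooling
            bn_total += bn_params(channels)
            conv_total += conv_params(channels, channels, 1)
    bn_total += bn_params(channels)               # BN before the global pooling
    dense_total = channels * num_classes + num_classes   # Dense layer with bias
    total = conv_total + bn_total + dense_total
    non_trainable = bn_total // 2
    return total, total - non_trainable, non_trainable


print(count_params())
```

Under those assumptions the three numbers come out as 257218, 249946 and 7272, matching the totals at the bottom of the summary.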

 

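V. Questions

1. After running the Keras experiment I found a Merge after every CONV(48,1,1)-CONV(12,3,3)- pair, yet I had not noticed one in the code. Where does it come from? I must have missed something.

Answer: I had overlooked these lines in the definition of dense_block: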

 

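    for i in range(nb_layers):
        # one dense layer: BN-ReLU-conv(1x1) then BN-ReLU-conv(3x3) when bottleneck is True
        x = conv_block(x, growth_rate, bottleneck, dropout_rate, weight_decay)
        feature_list.append(x)
        # concatenate every feature map collected so far along the channel axis
        x = merge(feature_list, mode='concat', concat_axis=concat_axis)
        nb_filter += growth_rate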

 

That is, every such module is followed by a Merge, which concatenates the outputs of all the layers so far into a new tensor.
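To see what ends up in each Merge, the loop can be replayed with layer names standing in for tensors. This is a toy sketch, not the repository code: `dense_block_merges` is a hypothetical helper, and it assumes feature_list starts out holding the block input, which is what the "Connected to" column of the summary suggests.

```python
def dense_block_merges(block_input, conv_names):
    """Replay the dense_block loop, recording the inputs of every merge."""
    feature_list = [block_input]            # assumed initial state of the list
    merges = []
    for name in conv_names:
        feature_list.append(name)           # feature_list.append(x)
        merges.append(list(feature_list))   # x = merge(feature_list, ...)
    return merges


# Dense block 1: the 3x3 convs are the even-numbered convolution2d layers.
block1 = dense_block_merges(
    "initial_conv2D",
    ["convolution2d_%d" % i for i in range(2, 13, 2)],
)
for number, inputs in enumerate(block1, start=1):
    print("merge_%d <- %s" % (number, ", ".join(inputs)))
```

The third entry lists initial_conv2D, convolution2d_2, convolution2d_4 and convolution2d_6, the same inputs the summary shows for merge_3. Each merge has one more input than the previous one, so the channel count grows by growth_rate at every step.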

 

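2. Why is the number of layers in each dense block n_layers=((40-4)/3)//2=6, where //2 means dividing by 2 and rounding down? In other words, why subtract 4?

Answer: besides the many layers inside the dense blocks, the structure also has 1 initial convolution layer, 2 transition layers and 1 final classification layer, and 1+2+1=4. Note that the depth L in the paper does not count the input layer.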

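So the depth (depth, or L) in this paper is counted as follows:

a. The initial convolution conv counts as 1 layer;

b. Each transition layer counts as 1 layer;

c. Each CONV(48,1,1)-CONV(12,3,3) module in a dense block counts as 2 layers, i.e. each CONV counts as 1 layer;

d. The final output module Relu-GlobalAveragePool-softmax counts as 1 layer.

Put another way: the depth is the number of convolution layers plus 1 softmax layer (here 39+1=40).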
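Rules a to d can be added up directly. A minimal sketch, with `network_depth` as a made-up name and 3 dense blocks assumed:

```python
def network_depth(n_layers=6, bottleneck=True, num_blocks=3):
    """Total depth L following rules a to d above."""
    initial_conv = 1                                   # rule a
    transitions = num_blocks - 1                       # rule b
    convs_per_layer = 2 if bottleneck else 1
    dense = num_blocks * n_layers * convs_per_layer    # rule c
    classifier = 1                                     # rule d
    return initial_conv + transitions + dense + classifier


print(network_depth())
```

With 6 bottleneck layers per block this gives 1+2+36+1=40, the depth the section started from.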

 

posted @ 2017-02-22 16:27  夜空中最帅的星