When training a neural network, the learning rate has a large effect on both training speed and final accuracy. Gradually decreasing the learning rate has been shown in practice to help training converge. TensorFlow ships with two built-in decay methods covered here: exponential decay and polynomial decay.
tf.train.exponential_decay
In the TensorFlow r1.4 API it is defined as:
exponential_decay(
    learning_rate,
    global_step,
    decay_steps,
    decay_rate,
    staircase=False,
    name=None
)
Parameters:
learning_rate: the initial learning rate
global_step: the global step counter; the current step used in the decay computation
decay_steps: the number of steps over which the rate decays; with staircase=True it is also the interval between learning-rate updates
decay_rate: the exponential decay factor
staircase: whether to update the learning rate in discrete jumps; if True, global_step / decay_steps is an integer division, so the rate drops in stair-like steps instead of decaying smoothly
The learning rate is computed as:
decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
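To make the formula concrete, here is a plain-Python sketch of the computation (the function name mirrors the TF API but this is our own reimplementation, not TensorFlow itself):

```python
import math

def exponential_decay(learning_rate, global_step, decay_steps,
                      decay_rate, staircase=False):
    """Plain-Python sketch of the exponential-decay formula above."""
    exponent = global_step / decay_steps
    if staircase:
        # Integer division: the exponent only advances every decay_steps
        # steps, so the learning rate drops in discrete stairs.
        exponent = math.floor(exponent)
    return learning_rate * decay_rate ** exponent

# With staircase=True the rate is constant within each decay window:
print(exponential_decay(0.1, 50_000, 100_000, 0.96, staircase=True))   # 0.1
# With staircase=False the rate decays smoothly at every step:
print(exponential_decay(0.1, 50_000, 100_000, 0.96, staircase=False))  # ~0.098
```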
Official demo (with the API names corrected to tf.train.exponential_decay and tf.train.GradientDescentOptimizer):
...
global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
                                           100000, 0.96, staircase=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
optimizer.minimize(...my loss..., global_step=global_step)
tf.train.polynomial_decay
TensorFlow defines it as:
polynomial_decay(
    learning_rate,
    global_step,
    decay_steps,
    end_learning_rate=0.0001,
    power=1.0,
    cycle=False,
    name=None
)
Parameters:
learning_rate: the initial learning rate
global_step: the global step counter
decay_steps: the number of steps over which the rate decays; also the interval at which the schedule restarts when cycle=True
end_learning_rate: the final learning rate after decay
power: the exponent of the polynomial
cycle: whether to restart the decay cycle once global_step exceeds decay_steps
The learning rate is computed as follows. When cycle=True, decay_steps is first stretched to the next multiple that covers the current step:
decay_steps = decay_steps * ceil(global_step / decay_steps)
(when cycle=False, global_step is instead clamped to decay_steps, so the rate stays at end_learning_rate once decay finishes). Then:
decayed_learning_rate = (learning_rate - end_learning_rate) *
                        (1 - global_step / decay_steps) ^ (power) + end_learning_rate
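As with exponential decay, a plain-Python sketch of this formula (our own reimplementation mirroring the TF API, including the cycle/clamp branching):

```python
import math

def polynomial_decay(learning_rate, global_step, decay_steps,
                     end_learning_rate=0.0001, power=1.0, cycle=False):
    """Plain-Python sketch of the polynomial-decay formula above."""
    if cycle:
        # Restart the decay: stretch decay_steps to the next multiple
        # that covers the current global_step.
        decay_steps = decay_steps * math.ceil(global_step / decay_steps)
    else:
        # Without cycling, clamp the step so the rate stays at
        # end_learning_rate once decay_steps is reached.
        global_step = min(global_step, decay_steps)
    frac = 1 - global_step / decay_steps
    return (learning_rate - end_learning_rate) * frac ** power + end_learning_rate

# Linear decay (power=1.0) from 0.1 toward 0.0001 over 10_000 steps:
print(polynomial_decay(0.1, 5_000, 10_000))   # halfway: ~0.05005
print(polynomial_decay(0.1, 20_000, 10_000))  # past decay_steps, clamped: 0.0001
```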
Both functions return a scalar tensor of the same type as learning_rate.