Modifications and Optimization of the DOTA Model
Adjusting the learning rate
About warmup:
See "Why is the warmup strategy effective in neural network training? Is there a theoretical explanation?" - answer by Shannon.AI (香侬科技) on Zhihu: https://www.zhihu.com/question/338066667/answer/771252708
In short: at the start of training, a small learning rate is used for the first updates to prevent premature overfitting and keep optimization stable.
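As a minimal sketch of the idea (the function name and the constant-warmup form are illustrative; the defaults mirror the DOTA config values shown below):

```python
def lr_with_warmup(num_update, base_lr=0.0005, warmup_lr=0.00005, warmup_step=1000):
    """Constant warmup: use the small warmup_lr for the first warmup_step
    updates, then switch to the normal base_lr."""
    if num_update < warmup_step:
        return warmup_lr
    return base_lr

# During warmup the rate is 10x smaller than afterwards.
print(lr_with_warmup(0))     # warmup_lr
print(lr_with_warmup(1000))  # base_lr
```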
The learning-rate mechanism in DOTA is configured as follows:
lr: 0.0005
lr_step: '45,52' # the two epochs at which the learning rate is reduced
lr_factor: 0.1
warmup: true
warmup_lr: 0.00005
warmup_step: 1000
begin_epoch: 0
end_epoch: 60
base_lr = lr
lr_factor = config.TRAIN.lr_factor
lr_epoch = [float(epoch) for epoch in lr_step.split(',')]
lr_epoch_diff = [epoch - begin_epoch for epoch in lr_epoch if epoch > begin_epoch]  # begin_epoch = 0, so this equals lr_epoch
lr = base_lr * (lr_factor ** (len(lr_epoch) - len(lr_epoch_diff)))  # still base_lr here
lr_iters = [int(epoch * len(roidb) / batch_size) for epoch in lr_epoch_diff]  # two large integers, iteration counts
print('lr', lr, 'lr_epoch_diff', lr_epoch_diff, 'lr_iters', lr_iters)
lr_scheduler = WarmupMultiFactorScheduler(lr_iters, lr_factor, config.TRAIN.warmup,
                                          config.TRAIN.warmup_lr, config.TRAIN.warmup_step)
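To make the epoch-to-iteration conversion concrete, here is a worked example with assumed numbers (num_images = 10000 and batch_size = 4 are hypothetical stand-ins for len(roidb) and the real batch size, not values from the DOTA config):

```python
lr_step = '45,52'
begin_epoch = 0
num_images = 10000   # hypothetical dataset size, stands in for len(roidb)
batch_size = 4       # hypothetical

lr_epoch = [float(e) for e in lr_step.split(',')]                       # [45.0, 52.0]
lr_epoch_diff = [e - begin_epoch for e in lr_epoch if e > begin_epoch]  # [45.0, 52.0]
lr_iters = [int(e * num_images / batch_size) for e in lr_epoch_diff]
print(lr_iters)  # [112500, 130000]
```

So the lr_step epochs 45 and 52 become two update counts, and those are what the scheduler compares num_update against.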
import logging
from mxnet.lr_scheduler import LRScheduler  # the scheduler extends MXNet's base class

class WarmupMultiFactorScheduler(LRScheduler):
    """Reduce the learning rate by a factor at the steps specified in a list.

    Assume the weight has been updated n times; the learning rate will be

        base_lr * factor^(sum((step/n) <= 1))  # step is an array

    Parameters
    ----------
    step: list of int
        schedule the learning rate after n updates
    factor: float
        the factor by which the learning rate is reduced
    """
    def __init__(self, step, factor=1, warmup=False, warmup_lr=0, warmup_step=0):
        super(WarmupMultiFactorScheduler, self).__init__()
        assert isinstance(step, list) and len(step) >= 1
        for i, _step in enumerate(step):
            if i != 0 and step[i] <= step[i - 1]:
                raise ValueError("Schedule step must be an increasing integer list")
            if _step < 1:
                raise ValueError("Schedule step must be greater or equal than 1 round")
        if factor > 1.0:
            raise ValueError("Factor must be no more than 1 to make lr reduce")
        self.step = step
        self.cur_step_ind = 0
        self.factor = factor
        self.count = 0
        self.warmup = warmup
        self.warmup_lr = warmup_lr
        self.warmup_step = warmup_step

    def __call__(self, num_update):
        """Called to schedule the current learning rate.

        Parameters
        ----------
        num_update: int
            the maximal number of updates applied to a weight.
        """
        # NOTE: use while rather than if (for continuing training via load_epoch)
        if self.warmup and num_update < self.warmup_step:
            return self.warmup_lr
        while self.cur_step_ind <= len(self.step) - 1:
            if num_update > self.step[self.cur_step_ind]:
                self.count = self.step[self.cur_step_ind]
                self.cur_step_ind += 1
                self.base_lr *= self.factor
                logging.info("Update[%d]: Change learning rate to %0.5e",
                             num_update, self.base_lr)
            else:
                return self.base_lr
        return self.base_lr
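Putting warmup and the step decays together, the scheduler's behavior can be checked with a small stateless sketch of the same rule (the function and the step positions 112500/130000 are illustrative, taken from the worked lr_iters numbers assumed above, not from the source code):

```python
def scheduled_lr(num_update, base_lr=0.0005, factor=0.1,
                 steps=(112500, 130000), warmup=True,
                 warmup_lr=0.00005, warmup_step=1000):
    """Stateless equivalent of WarmupMultiFactorScheduler.__call__:
    warmup_lr during warmup, then base_lr decayed once per passed step."""
    if warmup and num_update < warmup_step:
        return warmup_lr
    passed = sum(1 for s in steps if num_update > s)  # how many steps are behind us
    return base_lr * factor ** passed

print(scheduled_lr(500))     # warmup phase     -> warmup_lr
print(scheduled_lr(50000))   # before m1        -> base_lr
print(scheduled_lr(120000))  # between m1, m2   -> base_lr * factor
print(scheduled_lr(140000))  # after m2         -> base_lr * factor**2
```

Because the class compares with `num_update > step`, the rate at exactly update m1 is still base_lr; the decay takes effect one update later.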
To summarize, the learning-rate rule is: for the first 1000 updates, the smaller warmup_lr = 0.00005 is used. After warmup, lr_step maps to two update counts [m1, m2] (lr_iters): before update m1 the rate is base_lr; between m1 and m2 it is base_lr * factor; after m2 the loop is exited and the rate stays at base_lr * factor * factor until the end of training. No matter which epoch training is resumed from, the learning rate in each stage is always determined this way.