Optimization Methods

1. Slowing Down the Weight Norm Increase in Momentum-based Optimizers

Paper: https://arxiv.org/pdf/2006.08217.pdf

GitHub: https://github.com/clovaai/AdamP

2. OD-SGD: One-Step Delay Stochastic Gradient Descent for Distributed Training

Paper: https://arxiv.org/pdf/2005.06728.pdf

Posted 2020-07-14 19:18 by 西西嘛呦