摘要: 目录概Gradient Noise Scale McCandlish S., Kaplan J., Amodei D. and OpenAI Dota Team. An empirical model of large-batch training. 2018. 概 本文讨论了随着 batch si 阅读全文
posted @ 2024-11-26 17:01 馒头and花卷 阅读(187) 评论(0) 推荐(0)