Summary: 1 Introduction GitHub: https://github.com/microsoft/DeepSpeed ZeRO: Memory Optimizations Toward Training Trillion Parameter Models ZeRO-Offload: Democ
posted @ 2024-09-07 05:53 ForHHeart