Summary:
Contents: Overview, Notation, GaLore. Zhao J., Zhang Z., Chen B., Wang Z., Anandkumar A. and Tian Y. GaLore: Memory-efficient LLM training by gradient low-rank projection. IC
posted @ 2024-08-27 16:05
馒头and花卷
Summary:
Contents: Overview, BAdam, Code. Luo Q., Yu H. and Li X. BAdam: A memory efficient full parameter optimization method for large language models. arXiv preprint, 2024. Overview: This paper introduces
posted @ 2024-08-27 10:12
馒头and花卷
