Post archive for August 27, 2024 - 馒头and花卷

August 27, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Summary (contents: Overview, Notation, GaLore): Zhao J., Zhang Z., Chen B., Wang Z., Anandkumar A. and Tian Y. GaLore: Memory-efficient LLM training by gradient low-rank projection. IC…

posted @ 2024-08-27 16:05 馒头and花卷 Views (146) Comments (0) Recommendations (0)

BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models

Summary (contents: Overview, BAdam, Code): Luo Q., Yu H. and Li X. BAdam: A memory efficient full parameter optimization method for large language models. arXiv preprint, 2024. Overview: This paper …

posted @ 2024-08-27 10:12 馒头and花卷 Views (234) Comments (0) Recommendations (0)
