摘要:
快捷键配置 workbench.view.explorer -> cmd + E workbench.action.previousEditor -> shift+ h workbench.action.nextEditor -> shift + l go to definition -> ctr+ 阅读全文
摘要:
目录TL;DR大规模高效训练算法优化通信重叠优化其他优化容错设计(Fault Tolerance)数据收集与分析诊断测试Fast Checkpointing and RecoveryExperiment相关链接 MegaScale: Scaling Large Language Model Trai 阅读全文
摘要:
目录Efficient Memory Management for Large Language Model Serving with PagedAttentionTL;DRMotivation现状:GPU显存是瓶颈具体浪费情况MethodvLLM Framework调度与抢占其它TrickExpe 阅读全文
摘要:
目录FlashAttention: Fast and Memory-Efficient Exact Attention with IO-AwarenessTL;DRMethodFlashAttention算法详解Sparse FlashAttentionExperimentQ&A总结与思考相关链接 阅读全文