Summary: Create the file $HOME/.tmux.conf and configure it using the following as a reference: bind h select-pane -L bind j select-pane -D bind k select-pane -U bind l select-pane -R setw -g mode-keys vi bind '"' sp Read more
posted @ 2025-06-30 13:52 fariver Views (7) Comments (0) Recommendations (0)
Summary: Contents: AWQ: ACTIVATION-AWARE WEIGHT QUANTIZATION FOR ON-DEVICE LLM COMPRESSION AND ACCELERATION | TL;DR | Method | Story | Why use a weight-only quantization scheme? | How to pick the import wei Read more
posted @ 2025-06-21 17:17 fariver Views (148) Comments (0) Recommendations (0)
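Summary: Contents: Name | TL;DR | Method | ZeRO-DP | ZeRO-R | Background | Optimization strategies | Recap | Experiment | Result visualization | Summary and reflections | Related links. Name: link. Date: 19.10. Affiliation: microsoft. Authors' related work: https://i.cnblogs.com/posts/edit;postId=18916963 dee Read more
posted @ 2025-06-21 14:36 fariver Views (29) Comments (0) Recommendations (0)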
Summary: Keyboard shortcut configuration: workbench.view.explorer -> cmd + E, workbench.action.previousEditor -> shift + h, workbench.action.nextEditor -> shift + l, go to definition -> ctr+ Read more
posted @ 2025-06-21 11:27 fariver Views (13) Comments (0) Recommendations (0)
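Summary: Overall architecture. Physical modules. The containment hierarchy is GPC > TPC > SM > CORE. GPC (Graphics Processing Cluster): a GPC handles graphics rendering and compute tasks. Each GPC contains multiple TPCs, along with their associated dedicated hardware units and caches. TPC (Texture Processing C Read more
posted @ 2025-06-19 21:29 fariver Views (225) Comments (0) Recommendations (0)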
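Summary: Contents: TL;DR | Efficient large-scale training | Algorithm optimizations | Communication overlap optimizations | Other optimizations | Fault tolerance design (Fault Tolerance) | Data collection and analysis | Diagnostic tests | Fast Checkpointing and Recovery | Experiment | Related links. MegaScale: Scaling Large Language Model Trai Read more
posted @ 2025-06-19 20:34 fariver Views (55) Comments (0) Recommendations (0)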
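Summary: Contents: Efficient Memory Management for Large Language Model Serving with PagedAttention | TL;DR | Motivation | Current state: GPU memory is the bottleneck | Where the waste occurs | Method | vLLM Framework | Scheduling and preemption | Other tricks | Expe Read more
posted @ 2025-06-12 22:06 fariver Views (94) Comments (0) Recommendations (0)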
Summary: Contents: PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel | TL;DR | Method | System Design | Model Initialization | Sharding Strategies | Full Sharding Read more
posted @ 2025-06-07 18:44 fariver Views (122) Comments (0) Recommendations (0)
Summary: Contents: DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | TL;DR | Inference optimization methods | Transformer kernel optimizations | DeepFusion | SBI-GeMM | Targeting D Read more
posted @ 2025-06-06 21:54 fariver Views (77) Comments (0) Recommendations (0)
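Summary: Contents: FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | TL;DR | Method | FlashAttention algorithm in detail | Sparse FlashAttention | Experiment | Q&A | Summary and reflections | Related links Read more
posted @ 2025-06-06 21:53 fariver Views (63) Comments (0) Recommendations (0)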