Summary:
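MLLM Survey: A Survey on Multimodal Large Language Models. https://hjfy.top/arxiv/2306.13549 TL;DR: This paper comprehensively surveys recent progress in multimodal large language models (MLLMs), focusing on how they handle multimodal tasks with a large model at the core. The article systematically reviews the architecture des…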
Summary:
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free. https://hjfy.top/arxiv/2505.06708 TL;DR: This paper proposes a Gated A…
Summary:
AI Infra Survey (Part 1): Efficient Training of Large Language Models on Distributed Infrastructures: A Survey. References: https://arxiv.org/abs/2407.20018 https://github.c…