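Summary: device_map. The following draws on the Hugging Face Accelerate documentation on inference with very large models. An important keyword in Hugging Face is device_map, which gives simple control over which hardware each model layer is placed on. With the parameter device_map="auto", Accelerate automatically detects on which ...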
posted @ 2025-09-11 17:23 阳光一生 Views (23) Comments (0) Recommendations (0)
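To make the idea of a device map concrete, here is a minimal toy sketch of a map from layer names to devices, filled greedily in the order GPU, then CPU, then disk. This is an illustration only and not Accelerate's actual placement algorithm: the function `toy_auto_device_map`, the layer names, and the size and budget numbers are all hypothetical.

```python
from typing import Dict, List, Tuple


def toy_auto_device_map(
    layer_sizes: Dict[str, int],
    budgets: List[Tuple[str, int]],
) -> Dict[str, str]:
    """Toy greedy placement (hypothetical, not Accelerate's real logic).

    Walks the layers in order and fills each device in order; once a layer
    does not fit on the current device, moves on to the next device and
    never goes back, so the layers stay in sequence across devices.
    """
    device_map: Dict[str, str] = {}
    idx = 0
    remaining = budgets[0][1]
    for name, size in layer_sizes.items():
        # Advance to the next device while this layer does not fit.
        while size > remaining:
            idx += 1
            if idx >= len(budgets):
                raise ValueError(f"no device has room for layer {name!r}")
            remaining = budgets[idx][1]
        device_map[name] = budgets[idx][0]
        remaining -= size
    return device_map


# Hypothetical layer sizes and device budgets, in arbitrary units.
layers = {"embed": 2, "block.0": 4, "block.1": 4, "block.2": 4, "lm_head": 2}
budgets = [("cuda:0", 8), ("cpu", 6), ("disk", 100)]

device_map = toy_auto_device_map(layers, budgets)
print(device_map)
```

The resulting dictionary has the same general shape as an explicit device map: one entry per module name, each pointing at the device that holds it.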
Summary: What is vLLM? vLLM is a high-performance library for large language model (LLM) inference and serving. It is optimized for speed, efficiency, and ease ...
posted @ 2025-09-11 17:21 阳光一生 Views (31) Comments (0) Recommendations (0)
Summary: Introduction to vLLM. vLLM is an efficient, high-performance inference and serving engine designed for large language models (LLMs). It is optimized fo...
posted @ 2025-09-11 17:21 阳光一生 Views (87) Comments (0) Recommendations (0)