Installing vLLM

Reference: https://docs.vllm.ai/en/latest/getting_started/installation/gpu/#pre-built-wheels

For example (my machine is an H100 with CUDA 12.6, i.e. cu126):

pip install vllm --extra-index-url https://download.pytorch.org/whl/cu126 -i https://pypi.tuna.tsinghua.edu.cn/simple

The cluster's connection to the Tsinghua mirror has been throttled recently, so I switched to the Aliyun mirror:

pip install vllm --extra-index-url https://download.pytorch.org/whl/cu126 -i https://mirrors.aliyun.com/pypi/simple/

vllm==0.12 currently supports Python 3.10 through 3.13; for older vLLM versions, Python 3.10 is the safer choice.
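A quick way to check an interpreter against the 3.10 to 3.13 range mentioned here is a small shell function. This is only a sketch: `py_ok_for_vllm` is a hypothetical helper name, and the range is hard-coded from this post rather than read from vLLM's package metadata.

```shell
# Hypothetical helper: succeed if a "major.minor[.patch]" Python version
# falls in the 3.10..3.13 range quoted in this post.
py_ok_for_vllm() {
  major="${1%%.*}"      # text before the first dot
  rest="${1#*.}"        # text after the first dot
  minor="${rest%%.*}"   # minor version, with any patch part dropped
  [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 13 ]
}

# Usage: check the interpreter currently on PATH.
# py_ok_for_vllm "$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')" && echo ok
```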

There is no need to install PyTorch first: pin the vLLM version, pick the matching CUDA wheel index, and run pip install directly.
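To keep the version, CUDA tag, and mirror in one place, the command can be assembled by a small function. This is a minimal sketch: `vllm_pip_cmd` is a hypothetical name, the defaults are simply the values used in this post, and the function only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print (not run) the pip command for a pinned vLLM version.
#   $1 = vLLM version, $2 = CUDA wheel tag, $3 = PyPI mirror URL
vllm_pip_cmd() {
  version="${1:-0.12}"
  cuda_tag="${2:-cu126}"
  mirror="${3:-https://mirrors.aliyun.com/pypi/simple/}"
  printf 'pip install vllm==%s --extra-index-url https://download.pytorch.org/whl/%s -i %s\n' \
    "$version" "$cuda_tag" "$mirror"
}

# Print the command for review; pipe the output to sh to actually run it.
vllm_pip_cmd 0.12 cu126
```

Passing a third argument swaps the mirror, for example `vllm_pip_cmd 0.12 cu126 https://pypi.tuna.tsinghua.edu.cn/simple`.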

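vllm==0.12 with pytorch==2.9.0 does not work well with FlashAttention 2 for Qwen2.5. Setting these environment variables in front of the launch command works around it: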

VLLM_USE_FLASH_ATTENTION=0 \
VLLM_ATTENTION_BACKEND=TORCH_SDPA \

Alternatively, downgrade to vllm==0.10.1 and pytorch==2.8.0.
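For the environment-variable route, one option is a wrapper that sets both variables for a single command, so they do not leak into the rest of the shell session. This is a sketch under assumptions: `with_sdpa_backend` is a hypothetical name, the two variable names are copied from this post as-is (which variables a given vLLM release honors has not been checked here), and the model name in the usage comment is only an example.

```shell
# Hypothetical wrapper: run any command with the two variables from this post set,
# scoped to that one command via env(1).
with_sdpa_backend() {
  env VLLM_USE_FLASH_ATTENTION=0 VLLM_ATTENTION_BACKEND=TORCH_SDPA "$@"
}

# Usage (model name is an assumed example):
# with_sdpa_backend vllm serve Qwen/Qwen2.5-7B-Instruct
```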

Installing from source:

# install PyTorch first, either from PyPI or from source
git clone https://github.com/vllm-project/vllm.git
cd vllm
python use_existing_torch.py
pip install -r requirements/build.txt
pip install --no-build-isolation . --verbose
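Since the source build expects PyTorch to be present already, a small guard can check for it before starting the build. This is a sketch: `can_import` is a hypothetical helper, and it only tests that the module imports, not that its CUDA build matches the driver.

```shell
# Hypothetical helper: succeed if interpreter $1 can import module $2.
can_import() {
  "$1" -c "import $2" >/dev/null 2>&1
}

# Usage before building from source:
# can_import python torch || echo "install PyTorch first" >&2
# Usage after the build finishes:
# can_import python vllm && echo "vllm is importable"
```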

posted @ 2025-12-16 00:45 Cold_Chair
