1、禁用自带驱动

# 修改配置
# vim /etc/modprobe.d/blacklist-nouveau.conf
......
blacklist nouveau
options nouveau modeset=0

# update-initramfs -u

# reboot

2、安装驱动并查看

# 安装驱动
apt -y install nvidia-driver-550-server

# 查看 
nvidia-smi

3、安装CUDA并验证

# 安装CUDA
apt -y install nvidia-cuda-toolkit

# 验证
nvcc --version

4、UV工具安装

curl -LsSf https://astral.sh/uv/install.sh | sh

5、安装并配置Python虚拟环境

# (Recommended) Create a new uv environment. Use `--seed` to install `pip` and `setuptools` in the environment.
uv venv vllm --python 3.12 --seed
source vllm/bin/activate

6、安装vllm

uv pip install vllm -i https://pypi.tuna.tsinghua.edu.cn/simple

7、下载模型

pip install modelscope
modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --local_dir ./deepseek-ai

8、加载模型

python -m vllm.entrypoints.openai.api_server \
  --model deepseek-ai \
  --host 0.0.0.0 \
  --port 8000 \
  --dtype float16 \
  --max-model-len 4096 \
  --tensor-parallel-size 1

9、验证服务

# curl http://localhost:8000/v1/models
{"object":"list","data":[{"id":"deepseek-ai","object":"model","created":1741872853,"owned_by":"vllm","root":"deepseek-ai","parent":null,"max_model_len":4096,"permission":[{"id":"modelperm-bec7c4cc2dfc4a558d7af56bb99b1cea","object":"model_permission","created":1741872853,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":false,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}

10、测试DeepSeek-R1 7B模型

# curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai",
    "messages": [{"role": "user", "content": "你好,请介绍一下自己<think>\n"}]
  }'
{"id":"chatcmpl-e75f6ac6ff494c868c93a038db80f69c","object":"chat.completion","created":1741872880,"model":"deepseek-ai","choices":[{"index":0,"message":{"role":"assistant","reasoning_content":null,"content":"好的,用户问我“你好,请介绍一下自己”,看来他们想了解我的性格、兴趣爱好和学习情况。我应该先耐心回答,尽量详细一些。\n\n首先,我需要解释我是如何获得这个请求的,这样能更准确地回答。这表明我在之前的对话中得到了用户的更多信息。\n\n接下来,我应该介绍一下自己,包括我的身份、位置和教育背景。这样可以让用户更全面地了解自己。\n\n然后,我应该介绍我的兴趣爱好,比如(lines艺术、音乐 、 photography等,这可以帮助用户更好地了解我的个性和生活风格。\n\n同时,我可以简要提到我的专业和兴趣,比如Rhino,这表明我可能从事室内视觉设计领域的工作。\n\n最后,我要对用户的互动保持友好,用一些表情符号和 placeholder 来回复,保持用户活力。\n\n整个思考过程要尽量详细,同时保持回答自然,用词口语化,让用户感觉被重视。\n</think>\n\n你好!我’m 乙sc材科学习生,目前主要在室内视觉设计领域。 我对艺术有着浓厚的兴趣,尤其喜欢 Lines Art和Experimental Photography。我还学习了Rhino软件,这对于我的工作至关重要。我对家庭和日常生活也有一定的热情, enjoy旅行和自然探索。希望能与你有任何有趣的话题可以讨论!😊","tool_calls":[]},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":11,"total_tokens":271,"completion_tokens":260,"prompt_tokens_details":null},"prompt_logprobs":null}

11、接口文档查看

http://部署主机IP:8000/docs

参考:

https://docs.astral.sh/uv/getting-started/installation/
https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html
https://www.modelscope.cn/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/

  

posted on 2025-03-13 21:54  a120608yby  阅读(1368)  评论(0)    收藏  举报