Deploying Ollama Models with Docker
Technical Background
I have previously written several tutorials on deploying the DeepSeek large model locally and on deploying OpenClaw locally with Docker. In those posts, however, Ollama was always installed directly on the bare-metal host for convenience. On reflection that is not ideal, so this post supplements them with a method for deploying Ollama models in a Docker environment on Ubuntu Linux.
Docker Deployment
First, check whether nvidia-container-toolkit is installed on the system:
$ dpkg -l | grep nvidia-container-toolkit
ii nvidia-container-toolkit 1.13.5-1 amd64 NVIDIA Container toolkit
ii nvidia-container-toolkit-base 1.13.5-1 amd64 NVIDIA Container Toolkit Base
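In case these packages are missing, here is a minimal sketch of the installation, following the steps in NVIDIA's official Container Toolkit documentation (commands and repository URL as documented at the time of writing; verify against the current docs before running):
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
    sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
$ curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo nvidia-ctk runtime configure --runtime=docker   # registers the nvidia runtime in /etc/docker/daemon.json
$ sudo systemctl restart docker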
Alternatively, my earlier blog post walks through the installation in more detail. Next, pull the Ollama image from DockerHub (configure a Docker registry mirror first if necessary):
$ docker pull ollama/ollama
Using default tag: latest
latest: Pulling from ollama/ollama
817807f3c64e: Pull complete
ae25ca5ada6c: Pull complete
2608ea1d5119: Pull complete
84d58e6813b6: Pull complete
Digest: sha256:
Status: Downloaded newer image for ollama/ollama:latest
docker.io/ollama/ollama:latest
Once the pull completes, the Ollama image is visible locally:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
ollama/ollama latest bc1c 3 hours ago 6.01GB
You can then start an Ollama container with a GPU-enabled environment (the -v mount persists downloaded models under ~/.ollama on the host, so they survive container re-creation):
$ docker run -d --name my-ollama.docker --runtime=nvidia -p xxx:11434 -v ~/.ollama:/root/.ollama --gpus all ollama/ollama
7ae5
Two of these parameters are essential: both --runtime=nvidia and --gpus all must be set, otherwise the container cannot access GPU compute. Which GPUs to use depends on your own environment; see the sketch below.
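If you want to expose only some of the GPUs rather than all of them, Docker's --gpus flag also accepts a device list. A sketch (the device indices here are just examples, and xxx remains a placeholder port):
$ docker run -d --name my-ollama.docker --runtime=nvidia -p xxx:11434 -v ~/.ollama:/root/.ollama --gpus '"device=0,1"' ollama/ollama
Either way, the container then shows up in Docker's container list (here xxx is the local listening port you configured):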
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7ae5 ollama/ollama "/bin/ollama serve" 7 seconds ago Up 6 seconds 0.0.0.0:xxx->11434/tcp, [::]:xxx->11434/tcp my-ollama.docker
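You can also glance at the container logs to confirm that Ollama detected the GPU at startup (the exact log lines vary by Ollama version):
$ docker logs my-ollama.docker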
You can then use the running container to pull a remote model directly; for example, here we pull a qwen3.5 model:
$ docker exec -it my-ollama.docker ollama pull qwen3.5:latest
$ docker exec -it my-ollama.docker ollama list
NAME ID SIZE MODIFIED
qwen3.5:latest 96fa 6.6 GB 18 hours ago
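As a quick sanity check, you can also chat with the model interactively inside the container (type /bye to exit the session):
$ docker exec -it my-ollama.docker ollama run qwen3.5:latest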
The model's detailed metadata can be viewed with ollama show:
$ docker exec -it my-ollama.docker ollama show qwen3.5:latest
Model
architecture qwen35
parameters 9.7B
context length 262144
embedding length 4096
quantization Q4_K_M
requires 0.17.1
Capabilities
completion
vision
tools
thinking
Parameters
presence_penalty 1.5
temperature 1
top_k 20
top_p 0.95
License
Apache License
Version 2.0, January 2004
...
As you can see, the model downloaded here is a Q4_K_M quantization. If needed, you can browse the model's tag list on the Ollama library and pick a quantization that better suits your hardware, for example:
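$ docker exec -it my-ollama.docker ollama pull qwen3.5:9b-q8_0   # hypothetical tag for illustration; check the model's library page for real tag names
Finally, confirm that the GPU can be accessed from inside the container: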
$ docker exec -it my-ollama.docker nvidia-smi
Mon Mar 23 06:27:21 2026
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.274.02 Driver Version: 535.274.02 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
If the GPU shows up correctly here, the environment is working. Note that nvidia-smi and the driver libraries are mounted into the container from the host by the NVIDIA container runtime; the Ollama image itself bundles the CUDA user-space libraries it needs, not the driver.
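As an additional sanity check, you can confirm that Docker has registered the nvidia runtime (the grep is just a convenience; the exact output format varies by Docker version):
$ docker info | grep -i runtime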
Testing the Ollama Model
With the Ollama Docker environment set up, you can now test it locally:
$ curl http://localhost:xxx/api/generate -d '{
"model": "qwen3.5:latest",
"prompt": "who are you?",
"stream": false
}'
{"model":"qwen3.5:latest","created_at":"2026-03-23T06:28:08.27341445Z","response":"I'm **Qwen3.5**, a large language model developed by Tongyi Lab. I'm here to help with tasks like answering questions, creating content, coding, analyzing data, and more. I support over 100 languages, handle long documents with full context, and can even visualize data or generate code. What would you like to do? 😊","thinking":"Okay, the user is asking \"who are you?\" This is a straightforward question about my identity. I need to provide a clear and accurate response. Since I'm Qwen3.5, I should mention that I'm a large language model developed by Tongyi Lab. I should also highlight some of my capabilities like multi-language support, logical reasoning, and long-context understanding. Keep the response friendly and concise. Make sure to avoid technical jargon to keep it accessible.","done":true,"done_reason":"stop","context":,"total_duration":31831260725,"load_duration":28675863540,"prompt_eval_count":14,"prompt_eval_duration":49629384,"eval_count":174,"eval_duration":278832
This successfully returns the conversation output. While the model is loaded, you can also check from the background how it is placed on the GPU:
$ docker exec -it my-ollama.docker ollama ps
NAME ID SIZE PROCESSOR CONTEXT UNTIL
qwen3.5:latest fa5f 23 GB 100% GPU 262144 3 minutes from now
If the model is not running 100% on the GPU here, inference is likely to be noticeably slower.
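Besides /api/generate, Ollama also exposes a chat-style endpoint, /api/chat, which takes a message list instead of a single prompt; a minimal sketch (xxx is still your configured port):
$ curl http://localhost:xxx/api/chat -d '{
  "model": "qwen3.5:latest",
  "messages": [{"role": "user", "content": "who are you?"}],
  "stream": false
}'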
Summary
Following the previous article on running OpenClaw with Ollama in Docker, this post adds a scheme for deploying Ollama models inside Docker as well. This establishes a fully containerized environment in which both Ollama and OpenClaw operate under relatively controlled conditions.
Copyright Notice
This article was first published at: https://www.cnblogs.com/dechinphy/p/docker-ollama.html
Author ID: DechinPhy
More original articles: https://www.cnblogs.com/dechinphy/
Buy the blogger a coffee: https://www.cnblogs.com/dechinphy/gallery/image/379634.html
