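A simple zero-downtime backend deployment setup (Gunicorn + Caddy blue-green deployment)
Project background:
The backend is a Python/Django application running on a single VPS instance.
Overview
By running two Gunicorn instances side by side (the blue-green deployment pattern), code updates cause no downtime:
- Start the new instance on the standby port
- Once its health check passes, have Caddy switch traffic to the new instance
- Gracefully shut down the old instance
At least one working instance is available throughout the deployment, so the public service is never interrupted.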
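The first step depends on knowing which port is currently on standby. A minimal sketch of that decision, assuming the upstream file /etc/caddy/upstream.txt used later in this post holds a single line of the form `to 127.0.0.1:<port>`; the function name `standby_port` is hypothetical and not taken from the original scripts:

```shell
#!/bin/bash
# Ports of the two instances, as used by the service files in this post.
BLUE_PORT=8000
GREEN_PORT=8100

# Hypothetical helper: given the current upstream line, print the standby port.
standby_port() {
    local upstream_line="$1"
    # Strip everything up to the last colon to get the active port.
    local active_port="${upstream_line##*:}"
    if [ "$active_port" = "$BLUE_PORT" ]; then
        echo "$GREEN_PORT"
    else
        # Covers both "green is active" and "no valid upstream yet".
        echo "$BLUE_PORT"
    fi
}
```

It could be called as `standby_port "$(cat /etc/caddy/upstream.txt)"` to decide where the new instance should start.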

Script configuration
app-server-blue.service
[Unit]
Description=Product server (Blue Instance - Port 8000)
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/app/test_project/app-server
ExecStart=/home/azureuser/miniconda3/envs/appenv/bin/gunicorn \
--preload \
--workers 8 \
--bind 127.0.0.1:8000 \
--timeout 120 \
--access-logfile /app/test_project/app-server/logs/gunicorn_access_blue.log \
--error-logfile /app/test_project/app-server/logs/gunicorn_error_blue.log \
app.wsgi:application
Restart=on-failure
RestartSec=10
# Gunicorn shuts down gracefully on SIGTERM; SIGQUIT is its quick shutdown.
KillSignal=SIGTERM
TimeoutStopSec=30
Environment="DB_PORT=3306"
Environment="OPENAI_BASE_URL=https://api.openai.com/v1"
Environment="OPENAI_API_KEY=***"
Environment="OPENAI_MODEL=gpt-5.1-chat-latest"
Environment="PERPLEXITY_API_KEY=***"
Environment="APPLE_SHARED_SECRET=***"
Environment="FEATURE_FLAGS_PATH=/app/keys/feature_flags.json"
Environment="APPLE_CLIENT_ID=com.example.app.dev"
Environment="APNS_BUNDLE_ID=com.example.app.dev"
[Install]
WantedBy=multi-user.target
app-server-green.service
[Unit]
Description=Product server (Green Instance - Port 8100)
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/app/test_project/app-server
ExecStart=/home/azureuser/miniconda3/envs/appenv/bin/gunicorn \
--preload \
--workers 8 \
--bind 127.0.0.1:8100 \
--timeout 120 \
--access-logfile /app/test_project/app-server/logs/gunicorn_access_green.log \
--error-logfile /app/test_project/app-server/logs/gunicorn_error_green.log \
app.wsgi:application
Restart=on-failure
RestartSec=10
# Gunicorn shuts down gracefully on SIGTERM; SIGQUIT is its quick shutdown.
KillSignal=SIGTERM
TimeoutStopSec=30
Environment="DB_PORT=3306"
Environment="OPENAI_BASE_URL=https://api.openai.com/v1"
Environment="OPENAI_API_KEY=***"
Environment="OPENAI_MODEL=gpt-5.1-chat-latest"
Environment="PERPLEXITY_API_KEY=***"
Environment="APPLE_SHARED_SECRET=***"
Environment="FEATURE_FLAGS_PATH=/app/keys/feature_flags.json"
Environment="APPLE_CLIENT_ID=com.example.app.dev"
Environment="APNS_BUNDLE_ID=com.example.app.dev"
[Install]
WantedBy=multi-user.target
Caddyfile.new
api.example.com {
    encode zstd gzip
    @static path /static/*
    handle @static {
        root * /app/test_project/app-server/staticfiles
        file_server
    }
    @media path /media/*
    handle @media {
        root * /app/test_project/app-server/media
        file_server
    }
    reverse_proxy {
        import /etc/caddy/upstream.txt
    }
}
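Because the reverse_proxy block imports /etc/caddy/upstream.txt, switching traffic comes down to rewriting that one-line file and reloading Caddy. A minimal sketch of the rewrite, assuming the `to 127.0.0.1:<port>` format; the function name `write_upstream` and the write-then-rename approach are illustrative choices, not taken from the original script:

```shell
#!/bin/bash
# Hypothetical helper: point the upstream file at the given local port.
# The new content is written to a temp file in the same directory and then
# renamed into place, so a reader never sees a half-written file.
write_upstream() {
    local upstream_file="$1"
    local port="$2"
    local tmp_file
    tmp_file="$(mktemp "${upstream_file}.XXXXXX")" || return 1
    echo "to 127.0.0.1:${port}" > "$tmp_file"
    # mktemp creates the file with mode 0600; make it readable for Caddy.
    chmod 644 "$tmp_file"
    mv "$tmp_file" "$upstream_file"
}
```

Run with enough privileges to write under /etc/caddy, this would be followed by `sudo systemctl reload caddy`. Running `caddy validate --config /etc/caddy/Caddyfile --adapter caddyfile` before the reload is an optional extra check.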
Blue-green deployment initialization script
setup_bluegreen.sh
#!/bin/bash
set -e
echo "Initializing blue-green deployment..."
# Install the systemd units for both instances
sudo cp /app/test_project/app-server-blue.service /etc/systemd/system/
sudo cp /app/test_project/app-server-green.service /etc/systemd/system/
# Point Caddy at the blue instance initially
echo "to 127.0.0.1:8000" | sudo tee /etc/caddy/upstream.txt > /dev/null
# Back up the existing Caddyfile, then install the new one
sudo cp /etc/caddy/Caddyfile /etc/caddy/Caddyfile.backup
sudo cp /app/test_project/Caddyfile.new /etc/caddy/Caddyfile
sudo systemctl daemon-reload
# Stop and disable the old single-instance service if it is running
if sudo systemctl is-active --quiet app-server-prod.service 2>/dev/null; then
    sudo systemctl stop app-server-prod.service
    sudo systemctl disable app-server-prod.service
fi
# Start blue, then reload Caddy so it picks up the new configuration
sudo systemctl start app-server-blue.service
sudo systemctl enable app-server-blue.service
sudo systemctl reload caddy
echo "Blue-green deployment initialized"
Zero-downtime deployment script
restart_app.sh
#!/bin/bash
set -e
BLUE_PORT=8000
GREEN_PORT=8100
BLUE_SERVICE="app-server-blue.service"
GREEN_SERVICE="app-server-green.service"
UPSTREAM_FILE="/etc/caddy/upstream.txt"
PROJECT_DIR="/app/test_project"
HEALTH_CHECK_URL="http://127.0.0.1"
HEALTH_CHECK_ENDPOINT="/v1/config/feature-flags/"
MAX_HEALTH_RETRIES=30
HEALTH_RETRY_INTERVAL=2
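The excerpt above shows only the script's configuration variables. As a rough illustration of how the retry settings could drive the health check step, here is a minimal sketch. It is not the original script's implementation: `probe` and `wait_for_healthy` are hypothetical names, and it assumes that an HTTP 200 from the health endpoint means the instance is ready.

```shell
#!/bin/bash
HEALTH_CHECK_URL="http://127.0.0.1"
HEALTH_CHECK_ENDPOINT="/v1/config/feature-flags/"
MAX_HEALTH_RETRIES=30
HEALTH_RETRY_INTERVAL=2

# Hypothetical probe: print the HTTP status code of the instance on a port.
# curl prints 000 when the connection itself fails.
probe() {
    local port="$1"
    curl -s -o /dev/null -w "%{http_code}" --max-time 5 \
        "${HEALTH_CHECK_URL}:${port}${HEALTH_CHECK_ENDPOINT}" || true
}

# Poll the instance until it answers 200 or the retry budget runs out.
# Returns 0 when healthy, 1 otherwise.
wait_for_healthy() {
    local port="$1"
    local attempt status
    for ((attempt = 1; attempt <= MAX_HEALTH_RETRIES; attempt++)); do
        status="$(probe "$port")"
        if [ "$status" = "200" ]; then
            return 0
        fi
        sleep "$HEALTH_RETRY_INTERVAL"
    done
    return 1
}
```

A deployment script could call `wait_for_healthy "$GREEN_PORT"` after starting the green service and switch the upstream only when it returns 0, leaving the old instance in place otherwise.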
First-time setup
cd /app/test_project
sudo ./setup_bluegreen.sh
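The script will:
- Copy the systemd service files
- Create /etc/caddy/upstream.txt
- Update the Caddyfile
- Start the Blue instance
- Disable the old single-instance service
Subsequent deployments / code updates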
cd /app/test_project
./restart_app.sh dev # replace dev with the name of the branch to deploy
Checking service status
sudo systemctl is-active app-server-blue.service
sudo systemctl is-active app-server-green.service
cat /etc/caddy/upstream.txt
Verifying that service is uninterrupted
Client-side test script:
while true; do
    result=$(curl -s -o /dev/null -w "%{http_code} %{time_total}s" https://api.example.com/v1/config/feature-flags/ 2>&1)
    echo "$(date '+%H:%M:%S') $result"
    sleep 0.5
done
Throughout the deployment, requests should keep returning 200 with no dropped connections.
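To judge a test run without reading every line, the loop's output can be saved and summarized. A minimal sketch, assuming each line has the form `HH:MM:SS <status> <time>s` as printed by the loop; the function name `summarize_probe_log` is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: count total and non-200 responses in a saved probe log.
# Expects lines such as "12:00:01 200 0.045s" on stdin.
summarize_probe_log() {
    awk '
        { total++ }
        $2 != "200" { failed++ }
        END { printf "total=%d failed=%d\n", total, failed }
    '
}
```

For example, if the loop's output was redirected to a file (say probe.log, a name chosen here for illustration), `summarize_probe_log < probe.log` prints the counts. A `failed` count of 0 matches the expectation above.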
