Kubernetes Controller Core Management Hands-On Guide - Practice
Table of Contents
- Kubernetes Controller Core Management Hands-On Guide
- I. Controller Core Concepts and Classification
- II. Deployment Controller in Practice (Stateless Applications)
- III. ReplicaSet Controller in Practice
- IV. Advanced Controller Practice
- V. StatefulSet Controller in Practice (Stateful Applications)
- VI. Summary
Kubernetes Controller Core Management Hands-On Guide
I. Controller Core Concepts and Classification
1. Role and Positioning of Controllers
A Controller is the core Kubernetes component for managing workloads. Its job is to continuously reconcile the actual state of Pods toward the user-defined desired state: maintaining the replica count, self-healing on failure, performing version updates, and so on. The full lifecycle of every Kubernetes application is managed through Controllers.
Official documentation: https://kubernetes.io/zh-cn/docs/concepts/workloads/controllers/
2. Complete Controller Classification
Kubernetes controllers fall into 7 categories by use case; their core functions and typical scenarios are as follows:
| Controller type | Core function | Typical use cases | Notes |
|---|---|---|---|
| Deployments | Deploys stateless applications; built-in rolling upgrade, rollback, and replica scaling; backed by ReplicaSet underneath | Stateless apps such as web services and API services | First choice in production; supports zero-downtime upgrades |
| ReplicaSet (RS) | Manages the Pod replica count only (scale out / in); supports set-based label selectors | Underlying component of Deployment; not recommended for standalone use | No upgrade / rollback capability |
| ReplicationController (RC) | Older predecessor of ReplicaSet; supports equality-based label selectors only | Legacy system compatibility | Superseded by Deployment + ReplicaSet |
| StatefulSets | Deploys stateful applications; provides stable network identity and persistent storage; ordered deployment / upgrade | Databases (MySQL), cache clusters (Redis) | Requires a Headless Service and PV/PVC |
| DaemonSet | Runs exactly one Pod on every (or selected) node; deploys / cleans up Pods automatically as nodes join / leave | Log collection (Filebeat), node monitoring (Node Exporter) | Replica count = node count; manual scaling is not supported |
| Jobs | Runs one-off, short-lived tasks; ensures the Pod terminates after the task completes successfully | Data backup, file compression, batch computation | Pod enters the Completed state when the task finishes |
| CronJob | Runs tasks periodically based on a Cron expression; essentially creates Jobs on a schedule | Scheduled backups, log cleanup, periodic statistics | Built on Job; supports concurrency settings |
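As a quick cross-check (not part of the original walkthrough), these controller types can be listed from the cluster itself; the commands below assume a working kubectl context:
kubectl api-resources --api-group=apps -o wide    # Deployment, ReplicaSet, StatefulSet, DaemonSet
kubectl api-resources --api-group=batch -o wide   # Job, CronJob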
II. Deployment Controller in Practice (Stateless Applications)
1. Core Features and Stateless Application Characteristics
(1) Deployment core capabilities
- Includes all ReplicaSet functionality and manages the replica count automatically;
- Supports rolling upgrades (old Pods are replaced gradually, with no service interruption);
- Supports rollback to any previous revision;
- Self-healing (Pods are recreated automatically when they crash or a node goes down).
(2) Characteristics of stateless applications
- All Pods are interchangeable (same image, same configuration);
- No startup-order dependency; Pods can be scheduled on any node;
- Can be scaled out / in freely (e.g. a static web service).
2. Creating a Deployment
Step 1: Get configuration help
[root@master deployment_dir]# kubectl explain deployment
GROUP: apps
KIND: Deployment
VERSION: v1
Step 2: Write the YAML configuration file (deployment-nginx.yml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-nginx          # Deployment name
spec:
  replicas: 1                 # desired number of Pod replicas (default 1)
  selector:                   # label selector that matches the Pods to manage
    matchLabels:
      app: nginx              # match Pods labeled app: nginx
  template:                   # Pod template
    metadata:
      labels:
        app: nginx            # Pod label, must match the selector
    spec:
      containers:             # container definitions
      - name: nginx
        image: nginx:1.26-alpine
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80   # container port (Nginx default 80)
Step 3: Apply the configuration and verify
# 1. Create the Deployment
[root@master deployment_dir]# kubectl apply -f deployment-nginx.yml
deployment.apps/deploy-nginx created
# 2. Check the Deployment status
[root@master deployment_dir]# kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
deploy-nginx 1/1 1 1 2m17s
# 3. Check the automatically created ReplicaSet
[root@master deployment_dir]# kubectl get replicaset
NAME DESIRED CURRENT READY AGE
deploy-nginx-66d785bdb5 1 1 1 8m58s
# 4. Check the Pod status
[root@master deployment_dir]# kubectl get pods
NAME READY STATUS RESTARTS AGE
deploy-nginx-66d785bdb5-nztrw 1/1 Running 0 106s
3. Accessing Pods Managed by the Deployment
Step 1: View Pod details
[root@master deployment_dir]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deploy-nginx-66d785bdb5-nztrw 1/1 Running 0 11m 10.244.166.145 node1 <none> <none>
# The Pod runs on node1 with IP 10.244.166.145
Step 2: Verify cluster network connectivity
# Check the cluster Pod network (all nodes are in the 10.244.0.0/16 range)
[root@master deployment_dir]# ifconfig tunl0 | head -2
tunl0: flags=193<UP,RUNNING,NOARP> mtu 1480
inet 10.244.219.64 netmask 255.255.255.255
# Access the Pod from any node
[root@master deployment_dir]# ping -c 4 10.244.166.145
PING 10.244.166.145 (10.244.166.145) 56(84) bytes of data.
64 bytes from 10.244.166.145: icmp_seq=1 ttl=63 time=0.543 ms
64 bytes from 10.244.166.145: icmp_seq=2 ttl=63 time=0.366 ms
64 bytes from 10.244.166.145: icmp_seq=3 ttl=63 time=0.435 ms
64 bytes from 10.244.166.145: icmp_seq=4 ttl=63 time=0.381 ms
# Access the Nginx service
[root@master deployment_dir]# curl http://10.244.166.145
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and working.</p>
</body>
</html>
4. Core Operation: Delete a Pod to Verify Self-Healing
# 1. Force-delete the Pod
[root@master deployment_dir]# kubectl delete pod deploy-nginx-66d785bdb5-nztrw --grace-period=0 --force
Warning: Immediate deletion does not wait for confirmation...
pod "deploy-nginx-66d785bdb5-nztrw" force deleted
# 2. Check the rebuilt Pod (the node and IP change, the replica count stays at 1)
[root@master deployment_dir]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deploy-nginx-66d785bdb5-zv2v4 1/1 Running 0 2m6s 10.244.104.29 node2 <none> <none>
Conclusion: Pod IPs are not fixed. After a cluster restart the Pods come back automatically, but with new IPs, so a Service is needed to provide a stable access endpoint.
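A minimal sketch of such a Service (not part of the original walkthrough; the name deploy-nginx-svc is illustrative, and it selects the app: nginx label used by the Deployment above):
apiVersion: v1
kind: Service
metadata:
  name: deploy-nginx-svc      # illustrative name
spec:
  selector:
    app: nginx                # selects the Pods managed by deploy-nginx
  ports:
  - port: 80                  # stable Service port
    targetPort: 80            # container port
The Service's ClusterIP (and DNS name) stays the same no matter how often the Pods behind it are rebuilt.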
5. Core Operation: Upgrading the Pod Version
Step 1: Check the version before the upgrade
[root@master ~]# kubectl describe pods deploy-nginx-66d785bdb5-zv2v4 | grep Image:
Image: nginx:1.26-alpine
[root@master ~]# kubectl exec deploy-nginx-66d785bdb5-zv2v4 -- nginx -v
nginx version: nginx/1.26.3
Step 2: Perform the upgrade (to version 1.29)
[root@master ~]# kubectl set image deployment deploy-nginx nginx=nginx:1.29-alpine --record
Flag --record has been deprecated, --record will be removed in the future
deployment.apps/deploy-nginx image updated
Step 3: Verify the upgrade
[root@master ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
deploy-nginx-b5d6b46fb-f56rg 1/1 Running 0 74s
[root@master ~]# kubectl exec deploy-nginx-b5d6b46fb-f56rg -- nginx -v
nginx version: nginx/1.29.0
# Check the rollout status (the success message means the upgrade is complete)
[root@master ~]# kubectl rollout status deployment deploy-nginx
deployment "deploy-nginx" successfully rolled out
Note:
In nginx=nginx:1.29-alpine, the first nginx is the container name; it can be looked up with kubectl describe pod <pod-name> or kubectl edit deployment <deployment-name>.
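As an alternative to describe/edit, the container name can also be read with a jsonpath query; a small sketch (the expected output is assumed):
kubectl get deployment deploy-nginx -o jsonpath='{.spec.template.spec.containers[*].name}'
# expected output: nginx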
6. Core Operation: Rolling Back a Version
# 1. View the revision history
[root@master ~]# kubectl rollout history deployment deploy-nginx
deployment.apps/deploy-nginx
REVISION CHANGE-CAUSE
1 <none>
2 kubectl set image deployment deploy-nginx nginx=nginx:1.29-alpine --record=true
# 2. View a specific revision (revision 1 corresponds to the 1.26 image)
[root@master ~]# kubectl rollout history deployment deploy-nginx --revision=1
deployment.apps/deploy-nginx with revision #1
Pod Template:
Labels: app=nginx
pod-template-hash=66d785bdb5
Containers:
nginx:
Image: nginx:1.26-alpine
Port: 80/TCP
Environment: <none>
# 3. Roll back to revision 1
[root@master ~]# kubectl rollout undo deployment deploy-nginx --to-revision=1
deployment.apps/deploy-nginx rolled back
# 4. Verify the rollback
[root@master ~]# kubectl exec deploy-nginx-66d785bdb5-8jjdw -- nginx -v
nginx version: nginx/1.26.3
7. Core Operation: Scaling Replicas Out and In
Step 1: Scale out
# Scale out to 2 replicas
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=2
deployment.apps/deploy-nginx scaled
Step 2: Troubleshooting (Pod stuck in ContainerCreating)
# Check the Pod status
[root@master ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
deploy-nginx-66d785bdb5-8jjdw 1/1 Running 0 19m
deploy-nginx-66d785bdb5-jfcxt 0/1 ContainerCreating 0 8m15s
# Check the events (points to a Calico network plugin failure)
[root@master ~]# kubectl describe pod deploy-nginx-66d785bdb5-jfcxt
Events:
Warning FailedCreatePodSandBox 8m53s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container ... networkPlugin cni failed to set up pod ... error getting ClusterInformation: connection is unauthorized: Unauthorized]
# Fix: restart the Calico components
[root@master ~]# kubectl rollout restart daemonset calico-node -n kube-system
Step 3: Continue scaling out / in
# Scale out to 4 replicas (the count may exceed the number of nodes)
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=4
deployment.apps/deploy-nginx scaled
# Scale in to 0 replicas (suspends the service but keeps the configuration)
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=0
deployment.apps/deploy-nginx scaled
[root@master ~]# kubectl get pods
No resources found in default namespace.
8. Core Operation: Rolling Update with Many Replicas
# 1. Scale out to 16 replicas
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=16
deployment.apps/deploy-nginx scaled
# 2. Perform the rolling update
[root@master ~]# kubectl set image deployment deploy-nginx nginx=nginx:1.29-alpine --record
# 3. Verify the update
[root@master ~]# kubectl rollout status deployment deploy-nginx
deployment "deploy-nginx" successfully rolled out
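How fast a 16-replica rollout proceeds is governed by the Deployment's rolling-update strategy. A sketch of the relevant fields (values are illustrative; if omitted, both default to 25%):
spec:
  replicas: 16
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%           # extra Pods allowed above the desired count during the update
      maxUnavailable: 25%     # Pods that may be unavailable at any moment during the update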
9. Deleting the Deployment
[root@master ~]# kubectl delete deployment deploy-nginx
deployment.apps "deploy-nginx" deleted
[root@master ~]# kubectl get pods
No resources found in default namespace.
III. ReplicaSet Controller in Practice
1. Core Positioning
- Manages only the Pod replica count; it is the underlying dependency of Deployment;
- Supports set-based label selectors (see the selector sketch after this list), but has no upgrade / rollback capability;
- Not recommended for standalone use; manage it indirectly through a Deployment instead.
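For reference, a set-based selector uses matchExpressions instead of matchLabels; a sketch with illustrative labels and values:
selector:
  matchExpressions:
  - key: app
    operator: In              # other operators: NotIn, Exists, DoesNotExist
    values:
    - nginx
    - nginx-canary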
2. Creating a ReplicaSet
Step 1: Write the YAML configuration file (rs-nginx.yml)
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs-nginx
  namespace: default
spec:
  replicas: 2                 # desired replica count
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx            # must match the selector labels
    spec:
      containers:
      - name: nginx
        image: nginx:1.26-alpine
        ports:
        - name: http
          containerPort: 80
Step 2: Apply the configuration and verify
# 1. Create the ReplicaSet
[root@master rep_dir]# kubectl apply -f rs-nginx.yml
replicaset.apps/rs-nginx created
# 2. Check the ReplicaSet status
[root@master rep_dir]# kubectl get rs
NAME DESIRED CURRENT READY AGE
rs-nginx 2 2 2 3m3s
# 3. Check the Pod status (no Deployment is involved)
[root@master rep_dir]# kubectl get pods
NAME READY STATUS RESTARTS AGE
rs-nginx-2ftcr 1/1 Running 0 44s
rs-nginx-lzcv7 1/1 Running 0 44s
[root@master rep_dir]# kubectl get deployment
No resources found in default namespace.
3. Core Operation: Scaling Replicas and the Upgrade Limitation
# 1. Scale out to 4 replicas
[root@master rep_dir]# kubectl scale replicaset rs-nginx --replicas=4
replicaset.apps/rs-nginx scaled
# 2. Attempt an image upgrade (no effect on running Pods)
[root@master rep_dir]# kubectl set image replicaset rs-nginx nginx=nginx:latest --record
replicaset.apps/rs-nginx image updated
# 3. Verify the version (old Pods are not updated)
[root@master rep_dir]# kubectl describe pods rs-nginx-2ftcr | grep Image:
Image: nginx:1.26-alpine
Conclusion: a ReplicaSet does not roll out image changes automatically; only newly created Pods use the new image.
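If a ReplicaSet's image has been changed, one manual way to move the existing Pods onto it is to delete them and let the ReplicaSet recreate them from the updated template (a workaround sketch, assuming the app: nginx label; unlike a Deployment rollout this briefly interrupts service):
# delete the Pods managed by the ReplicaSet; replacements are created with the new image
kubectl delete pod -l app=nginx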
IV. Advanced Controller Practice
1. DaemonSet Controller
(1) Core characteristics
- Runs exactly one Pod per node;
- Pods are deployed automatically when a node is added and removed when the node is removed;
- Requires tolerations to run on master nodes;
- Suited to log collection, node monitoring, and similar per-node services.
(2) Creating a DaemonSet
Step 1: Write the YAML configuration file (daemonset-nginx.yml)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-nginx
spec:
  selector:
    matchLabels:
      name: nginx-ds
  template:
    metadata:
      labels:
        name: nginx-ds
    spec:
      tolerations:            # tolerate the master node taint
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: nginx
        image: nginx:1.26-alpine
        imagePullPolicy: IfNotPresent
        resources:            # resource limits
          limits:
            memory: 100Mi
          requests:
            memory: 100Mi
Step 2: Apply the configuration and verify
# 1. Create the DaemonSet
[root@master rep_dir]# kubectl apply -f daemonset-nginx.yml
daemonset.apps/daemonset-nginx created
# 2. Check the DaemonSet status
[root@master rep_dir]# kubectl get daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE AGE
daemonset-nginx 2 2 2 2 2 4m
# 3. Check the Pod distribution (one replica per node)
[root@master rep_dir]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
daemonset-nginx-7795c 1/1 Running 0 5m19s 10.244.166.166 node1 <none> <none>
daemonset-nginx-l94ds 1/1 Running 0 5m19s 10.244.104.51 node2 <none> <none>
Step 3: Upgrade the version
# Upgrade to the latest tag
[root@master rep_dir]# kubectl set image daemonset daemonset-nginx nginx=nginx:latest
daemonset.apps/daemonset-nginx image updated
# Verify the upgrade
[root@master rep_dir]# kubectl describe pods daemonset-nginx-mpjfd | grep Image:
Image: nginx:latest
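The pace of a DaemonSet rollout can be tuned with updateStrategy; a sketch with illustrative values (RollingUpdate with maxUnavailable: 1 is the default behavior):
spec:
  updateStrategy:
    type: RollingUpdate       # alternative: OnDelete (Pods are replaced only when deleted manually)
    rollingUpdate:
      maxUnavailable: 1       # update at most one node's Pod at a time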
2. Job Controller (One-Off Tasks)
(1) Core characteristics
- For non-durable tasks; the Pod terminates once the task completes;
- Supports setting the number of completions (completions) and the concurrency (parallelism); see the sketch after this list;
- Difference from ReplicaSet: a ReplicaSet manages long-running Pods, while a Job manages one-off tasks.
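Besides completions and parallelism, a Job is usually bounded with a retry limit and a deadline; a sketch of the spec fields with illustrative values:
spec:
  completions: 10             # total successful runs required
  parallelism: 2              # Pods allowed to run at the same time
  backoffLimit: 4             # retries before the Job is marked Failed (default 6)
  activeDeadlineSeconds: 600  # hard time limit for the whole Job, in seconds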
(2) Example 1: Computing π to 2000 digits
Step 1: Write the YAML configuration file (job.yml)
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      name: pod-pi
    spec:
      nodeName: node2         # schedule onto a specific node
      containers:
      - name: c-pi
        image: perl           # requires the Perl image (~800 MB, pre-pulling is recommended)
        imagePullPolicy: IfNotPresent
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never    # do not restart after the task completes
Step 2: Apply the configuration and verify
# 1. Create the Job
[root@master job_dir]# kubectl apply -f job.yml
job.batch/pi created
# 2. Check the Job status
[root@master job_dir]# kubectl get job
NAME COMPLETIONS DURATION AGE
pi 1/1 4m43s 5m
# 3. Check the Pod status (Completed once the task finishes)
[root@master job_dir]# kubectl get pod
NAME READY STATUS RESTARTS AGE
pi-mjk9t 0/1 Completed 0 4m43s
# 4. View the task output (prints π to 2000 digits)
[root@master job_dir]# kubectl logs pi-mjk9t
3.141592653589793238462643383279502884197169399375105820974944592307816406286208...
(3) Example 2: Fixed-count task (run 10 times)
Step 1: Write the YAML configuration file (job2.yml)
apiVersion: batch/v1
kind: Job
metadata:
  name: busybox-job
spec:
  completions: 10             # number of successful runs required
  parallelism: 1              # concurrency (one task at a time)
  template:
    metadata:
      name: busybox-job-pod
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["echo", "hello"]
      restartPolicy: Never
Step 2: Apply the configuration and verify
[root@master job_dir]# kubectl apply -f job2.yml
job.batch/busybox-job created
# Check the Pod status (all 10 Pods are Completed)
[root@master job_dir]# kubectl get pod
NAME READY STATUS RESTARTS AGE
busybox-job-2x7kw 0/1 Completed 0 17s
busybox-job-4drnr 0/1 Completed 0 27s
... (8 more Pods omitted)
# View the task output
[root@master job_dir]# kubectl logs busybox-job-2x7kw
hello
(4) Example 3: One-off MySQL database backup
Step 1: Deploy the MySQL environment (mysqld.yml)
apiVersion: v1
kind: Service
metadata:
  name: mysql-test
spec:
  ports:
  - port: 3306
    name: mysql
  clusterIP: None             # headless service
  selector:
    app: mysql-dump
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  selector:
    matchLabels:
      app: mysql-dump
  serviceName: "mysql-test"
  template:
    metadata:
      labels:
        app: mysql-dump
    spec:
      nodeName: node1         # schedule onto node1
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "abc123"     # database root password
        ports:
        - containerPort: 3306
        volumeMounts:
        - mountPath: "/var/lib/mysql"
          name: mysql-data
      volumes:
      - name: mysql-data
        hostPath:
          path: /opt/mysqldata   # local storage directory
Step 2: Deploy MySQL and verify
[root@master job_dir]# kubectl apply -f mysqld.yml
service/mysql-test created
statefulset.apps/db created
# Check the MySQL Pod status
[root@master job_dir]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
db-0 1/1 Running 0 13m 10.244.166.170 node1 <none>
Step 3: Write the backup Job configuration (job_mysqld.yml)
apiVersion: batch/v1
kind: Job
metadata:
  name: mysql-dump
spec:
  template:
    metadata:
      name: mysql-dump
    spec:
      nodeName: node2         # schedule onto node2
      containers:
      - name: mysql-dump
        image: mysql:5.7
        command: ["/bin/sh", "-c", "mysqldump --host=mysql-test -uroot -pabc123 --databases mysql > /root/mysql_back.sql"]
        volumeMounts:
        - mountPath: "/root"
          name: mysql-data
      restartPolicy: Never
      volumes:
      - name: mysql-data
        hostPath:
          path: /opt/mysqldump   # backup directory on the node
Step 4: Run the backup and verify
[root@master job_dir]# kubectl apply -f job_mysqld.yml
job.batch/mysql-dump created
# Check the backup file (on node2)
[root@node2 ~]# ls /opt/mysqldump/
mysql_back.sql
3. CronJob Controller (Periodic Tasks)
(1) Core characteristics
- Scheduled by a Cron expression; essentially creates a Job on each trigger;
- Supports configuring the schedule, concurrency, and retry policy (see the sketch after this list);
- Suited to scheduled backups, log cleanup, and similar periodic work.
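Concurrency and history retention are controlled by fields next to schedule; a sketch with illustrative values (the jobTemplate has the same structure as in the examples below):
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid          # Allow (default) / Forbid / Replace
  startingDeadlineSeconds: 60        # skip a run that cannot start within 60s of its scheduled time
  successfulJobsHistoryLimit: 3      # finished Jobs kept for inspection (default 3)
  failedJobsHistoryLimit: 1          # failed Jobs kept (default 1)
  jobTemplate:
    # ... Job template as in the examples below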
(2) Example 1: Printing a message periodically
Step 1: Write the YAML configuration file (cronjob.yml)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cronjob1
spec:
  schedule: "* * * * *"       # Cron expression (run every minute)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo hello kubernetes
          restartPolicy: OnFailure   # restart on failure
Step 2: Apply the configuration and verify
[root@master cronjob_dir]# kubectl apply -f cronjob.yml
cronjob.batch/cronjob1 created
# Check the CronJob status
[root@master cronjob_dir]# kubectl get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cronjob1 * * * * * False 0 15s 91s
# Check the task Pods and their output
[root@master cronjob_dir]# kubectl get pod
NAME READY STATUS RESTARTS AGE
cronjob1-29236967-svqln 0/1 Completed 0 4s
[root@master cronjob_dir]# kubectl logs cronjob1-29236967-svqln
Sun Aug 3 10:47:00 UTC 2025
hello kubernetes
(3) Example 2: Periodic MySQL database backup
Step 1: Write the CronJob configuration (cronjob_mysqld.yml)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-dump
spec:
  schedule: "*/1 * * * *"     # run every minute
  jobTemplate:
    spec:
      template:
        spec:
          nodeName: node1
          containers:
          - name: c1
            image: mysql:5.7
            command: ["/bin/sh","-c","mysqldump --host=mysql-test -uroot -pabc123 --databases mysql > /root/mysql`date +%Y%m%d%H%M`.sql"]
            volumeMounts:
            - name: mysql-data
              mountPath: "/root"
          restartPolicy: Never
          volumes:
          - name: mysql-data
            hostPath:
              path: /opt/mysqldump
Step 2: Apply the configuration and verify
[root@master cronjob_dir]# kubectl apply -f cronjob_mysqld.yml
cronjob.batch/mysql-dump created
# Check the backup files (on node1)
[root@node1 ~]# ls /opt/mysqldump/
mysql202508031109.sql mysql202508031110.sql ...
V. StatefulSet Controller in Practice (Stateful Applications)
1. Core Concepts and Characteristics
(1) What a StatefulSet does
- Manages the deployment, scaling, and rolling updates of stateful applications;
- Provides stable network identity (via a Headless Service) and persistent storage;
- Deploys and scales in order (e.g. 0→1→2 on deployment, 2→1→0 on scale-in).
(2) Stateless vs. stateful applications
| Type | Example | Key characteristics |
|---|---|---|
| Stateless application | Nginx | Each request carries everything needed for the response; no data dependency; Pods can be replaced freely |
| Stateful application | MySQL | Requests depend on earlier ones; data must be persisted; each Pod has a unique identity and cannot be replaced arbitrarily |
(3) Components a StatefulSet depends on
- Headless Service: provides the stable network identity (no ClusterIP); see the DNS lookup sketch after this list;
- The StatefulSet resource itself: defines the application configuration;
- volumeClaimTemplates: a PVC template that automatically creates a PVC for each Pod.
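With a Headless Service, every StatefulSet Pod gets a stable DNS name of the form <pod-name>.<service-name>.<namespace>.svc.cluster.local. Using the names from the example later in this section (StatefulSet web, Service nginx-svc in the default namespace), an in-cluster lookup could be sketched as follows (a temporary busybox Pod; the exact output will vary):
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup web-0.nginx-svc.default.svc.cluster.local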
2. Preparation: NFS Storage Setup
Step 1: Deploy the NFS server (IP: 192.168.100.138)
# 1. Install the NFS service
[root@nfsserver ~]# yum install -y nfs-utils
# 2. Create the shared directory
[root@nfsserver ~]# mkdir -p /data/nfs
[root@nfsserver ~]# chmod 777 /data/nfs
# 3. Configure the export (/etc/exports)
[root@nfsserver ~]# echo "/data/nfs *(rw,no_root_squash,sync)" > /etc/exports
# 4. Start the service and disable the firewall
[root@nfsserver ~]# systemctl restart nfs-server
[root@nfsserver ~]# systemctl enable nfs-server
[root@nfsserver ~]# systemctl stop firewalld
[root@nfsserver ~]# setenforce 0
# Verify the export
[root@nfsserver ~]# showmount -e
Export list for nfsserver:
/data/nfs *
Step 2: Install the NFS client on all worker nodes
[root@node1 ~]# yum install -y nfs-utils
[root@node2 ~]# yum install -y nfs-utils
# Verify NFS connectivity
[root@node1 ~]# showmount -e 192.168.100.138
Export list for 192.168.100.138:
/data/nfs *
3. Static Storage: PV and PVC Configuration
Step 1: Create the PV (pv-nfs.yml)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
spec:
  capacity:
    storage: 1Gi              # storage capacity
  accessModes:
  - ReadWriteMany             # read-write from multiple nodes
  nfs:
    server: 192.168.100.138   # NFS server IP
    path: /data/nfs           # shared directory
Step 2: Create the PVC (pvc-nfs.yml)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nfs
spec:
  accessModes:
  - ReadWriteMany             # must match the PV
  resources:
    requests:
      storage: 1Gi            # must fit within the PV capacity
Step 3: Apply the configuration and verify
# 1. Create the PV and PVC
[root@master statefulset_dir]# kubectl apply -f pv-nfs.yml
[root@master statefulset_dir]# kubectl apply -f pvc-nfs.yml
# 2. Verify the binding status
[root@master statefulset_dir]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM AGE
pv-nfs 1Gi RWX Retain Bound default/pvc-nfs 11m
[root@master statefulset_dir]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES AGE
pvc-nfs Bound pv-nfs 1Gi RWX 91s
Step 4: Deploy an Nginx application using the PV/PVC
# dep-nginx-nfs.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-nginx-nfs
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.26-alpine
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html   # mount point
        ports:
        - containerPort: 80
      volumes:
      - name: www
        persistentVolumeClaim:
          claimName: pvc-nfs  # reference the PVC
# Apply the configuration and verify
[root@master statefulset_dir]# kubectl apply -f dep-nginx-nfs.yml
[root@master statefulset_dir]# kubectl get pod -o wide
# Create a test file on the NFS server
[root@nfsserver nfs]# echo "hello nfs storage" > index.html
# Access the Pod to verify
[root@master statefulset_dir]# curl http://10.244.166.133
hello nfs storage
4. Dynamic Storage: StorageClass Configuration
(1) Key advantages
- No need to create PVs by hand; a PV is generated automatically when a PVC is submitted;
- The administrator no longer has to pre-provision large numbers of PVs; storage is supplied on demand.
(2) Deploying the dynamic provisioner (NFS-Client Provisioner)
Step 1: Download and configure the StorageClass (storageclass-nfs.yml)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client            # StorageClass name
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner   # must match the provisioner name
parameters:
  archiveOnDelete: "false"    # do not archive data when the PVC is deleted
[root@master nfs-cli]# kubectl apply -f storageclass-nfs.yml
storageclass.storage.k8s.io/nfs-client created
Step 2: Configure RBAC permissions (allow the provisioner to manage PVs/PVCs)
# Download the RBAC configuration file
[root@master nfs-cli]# wget https://raw.githubusercontent.com/kubernetes-sigs/nfs-subdir-external-provisioner/master/deploy/rbac.yaml
# Apply the configuration
[root@master nfs-cli]# kubectl apply -f rbac.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
...
Step 3: Deploy the provisioner Deployment
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
      - name: nfs-client-provisioner
        image: registry.cn-beijing.aliyuncs.com/pylixm/nfs-subdir-external-provisioner:v4.0.0   # mirror hosted in China
        volumeMounts:
        - name: nfs-client-root
          mountPath: /persistentvolumes
        env:
        - name: PROVISIONER_NAME
          value: k8s-sigs.io/nfs-subdir-external-provisioner
        - name: NFS_SERVER
          value: 192.168.100.138   # NFS server IP
        - name: NFS_PATH
          value: /data/nfs         # shared directory
      volumes:
      - name: nfs-client-root
        nfs:
          server: 192.168.100.138
          path: /data/nfs
[root@master nfs-cli]# kubectl apply -f deployment.yaml
deployment.apps/nfs-client-provisioner created
# Verify the provisioner is running
[root@master nfs-cli]# kubectl get pod
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-54b9cb8bf9-zjbdz 1/1 Running 0 18s
Step 4: Use dynamic storage from a StatefulSet
# nginx-storageclass-nfs.yml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None             # headless service
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx-svc"    # reference the headless service
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-c
        image: nginx:1.26-alpine
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:       # PVC template (generates one PVC per Pod)
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "nfs-client"   # reference the StorageClass
      resources:
        requests:
          storage: 1Gi
# Apply the configuration and verify
[root@master nfs-cli]# kubectl apply -f nginx-storageclass-nfs.yml
service/nginx-svc created
statefulset.apps/web created
# View the automatically generated PVCs and PVs
[root@master nfs-cli]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES AGE
www-web-0 Bound pvc-91f70f53-f928-406c-afcc-bd47a83b0e56 1Gi RWO 111s
www-web-1 Bound pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f 1Gi RWO 108s
[root@master nfs-cli]# kubectl get pv
NAME CAPACITY ACCESS MODES STATUS CLAIM AGE
pvc-91f70f53-f928-406c-afcc-bd47a83b0e56 1Gi RWO Bound default/www-web-0 2m57s
pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f 1Gi RWO Bound default/www-web-1 2m54s
(3) Verify the dynamic storage
# List the directories created automatically on the NFS server
[root@nfsserver nfs]# ls
default-www-web-0-pvc-91f70f53-f928-406c-afcc-bd47a83b0e56
default-www-web-1-pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f
# Write test data
[root@nfsserver nfs]# echo "this is web0" > default-www-web-0-pvc-91f70f53-f928-406c-afcc-bd47a83b0e56/index.html
[root@nfsserver nfs]# echo "this is web1" > default-www-web-1-pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f/index.html
# Access the Pods to verify
[root@master nfs-cli]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
web-0 1/1 Running 0 9m 10.244.166.154 node1 <none>
web-1 1/1 Running 0 8m 10.244.104.62 node2 <none>
[root@master nfs-cli]# curl http://10.244.166.154
this is web0
[root@master nfs-cli]# curl http://10.244.104.62
this is web1
5. Canary Release in Practice (a StatefulSet Feature)
(1) How it works
The partition parameter limits the scope of an update: only Pods whose ordinal is ≥ partition are updated, which enables a gradual (canary) release.
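The same setting can also be written declaratively in the StatefulSet manifest instead of being patched at runtime; a sketch of the relevant fields:
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 1            # only Pods with ordinal >= 1 are updated; web-0 keeps the old version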
(2) Steps
Step 1: Set the partition parameter (only Pods with ordinal ≥ 1 will be updated)
[root@master nfs-cli]# kubectl patch sts web -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":1}}}}'
statefulset.apps/web patched
# Verify the parameter
[root@master nfs-cli]# kubectl get sts web -o yaml | grep partition
partition: 1
Step 2: Check the versions before the update
[root@master nfs-cli]# kubectl describe pod web-0 | grep Image:
Image: nginx:1.26-alpine
[root@master nfs-cli]# kubectl describe pod web-1 | grep Image:
Image: nginx:1.26-alpine
Step 3: Perform the update
[root@master nfs-cli]# kubectl set image sts/web nginx-c=nginx:1.29-alpine
statefulset.apps/web image updated
Step 4: Verify the result (only web-1 is updated)
[root@master nfs-cli]# kubectl describe pod web-1 | grep Image:
Image: nginx:1.29-alpine
[root@master nfs-cli]# kubectl describe pod web-0 | grep Image:
Image: nginx:1.26-alpine
Step 5: Scale out to verify (new Pods use the new version)
# Scale out to 4 replicas
[root@master nfs-cli]# kubectl scale sts web --replicas=4
statefulset.apps/web scaled
# Check the versions (web-2 and web-3 run 1.29)
[root@master nfs-cli]# kubectl get pods -o custom-columns=Name:metadata.name,Image:spec.containers[0].image
Name Image
web-0 nginx:1.26-alpine
web-1 nginx:1.29-alpine
web-2 nginx:1.29-alpine
web-3 nginx:1.29-alpine
Step 6: Full rollout (set partition=0)
[root@master nfs-cli]# kubectl patch sts web -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
statefulset.apps/web patched
# Verify the full rollout
[root@master nfs-cli]# kubectl get pods -o custom-columns=Name:metadata.name,Image:spec.containers[0].image
Name Image
web-0 nginx:1.29-alpine
web-1 nginx:1.29-alpine
web-2 nginx:1.29-alpine
web-3 nginx:1.29-alpine
VI. Summary
This article walked through the full hands-on workflow for the core Kubernetes controllers: Deployment (stateless applications), ReplicaSet, DaemonSet, Job, CronJob, and StatefulSet, plus the related storage topics (NFS, PV/PVC, dynamic provisioning). Every step keeps the key commands, configuration examples, and verification output from the original walkthrough, so it can be followed directly.
Key takeaways:
- For stateless applications, prefer Deployment; it supports upgrades and rollbacks;
- For stateful applications, use StatefulSet together with a Headless Service and persistent storage for stability;
- Use DaemonSet for per-node services, Job for one-off tasks, and CronJob for periodic tasks;
- For storage, prefer dynamic provisioning (StorageClass) to reduce manual configuration effort.
To go further, look into advanced controller parameters (rolling-update strategies, affinity-based scheduling) or cloud-native storage solutions such as Ceph and GlusterFS.