Kubernetes Controller Core Management Hands-On Guide - Practice


I. Controller Core Concepts and Classification

1. Role and Positioning of Controllers

A Controller is the core Kubernetes component for managing workloads. Its central responsibility is to continuously drive the actual state of Pods toward the user-defined "desired state", which covers maintaining the replica count, self-healing after failures, and version updates. The full lifecycle of every Kubernetes application is managed through controllers.

Official documentation: https://kubernetes.io/zh-cn/docs/concepts/workloads/controllers/

2. Complete Controller Classification

Kubernetes controllers fall into 7 categories by use case. Their core functions and applicable scenarios are as follows:

| Controller type | Core function | Typical scenarios | Notes |
|---|---|---|---|
| Deployment | Deploys stateless applications; built-in rolling upgrade, rollback, and replica scaling; built on ReplicaSet | Stateless apps such as web and API services | First choice in production; supports zero-downtime upgrades |
| ReplicaSet (RS) | Manages only the Pod replica count (scale up/down); supports set-based label selectors | Underlying component of Deployment; not recommended on its own | No upgrade/rollback capability |
| ReplicationController (RC) | Legacy predecessor of ReplicaSet; equality-based label selectors only | Legacy system compatibility | Superseded by Deployment + ReplicaSet |
| StatefulSet | Deploys stateful applications; provides stable network identity and persistent storage; ordered deployment/upgrade | Databases (MySQL), cache clusters (Redis) | Requires a Headless Service and PV/PVC |
| DaemonSet | Runs one Pod on every (or selected) node; auto-deploys/cleans up as nodes join/leave | Log collection (Filebeat), node monitoring (Node Exporter) | Replica count = node count; no manual scaling |
| Job | Runs one-off short tasks; ensures the Pod terminates after successful completion | Data backup, file compression, batch computation | Pod enters Completed when the task finishes |
| CronJob | Runs tasks periodically on a cron schedule; creates Jobs on a timer | Scheduled backups, log cleanup, periodic statistics | Built on Job; supports concurrency settings |

II. Deployment Controller Hands-On (Stateless Applications)

1. Core Features and Stateless Application Characteristics

(1) Deployment core capabilities
  • Includes all ReplicaSet functionality and manages the replica count automatically;
  • Rolling upgrades (old Pods replaced gradually, no service interruption);
  • Version rollback (restore any historical revision);
  • Self-healing (Pods rebuilt automatically after crashes or node failures).
(2) Characteristics of stateless applications
  • All Pods are identical (same image, same configuration);
  • No startup-order dependency; schedulable on any node;
  • Can be scaled up/down freely (e.g. a static web service).
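The rolling-upgrade behavior mentioned above is tuned through the Deployment's update strategy. A minimal sketch of the relevant fields — the `maxSurge`/`maxUnavailable` values here are illustrative defaults, not taken from the original setup:

```yaml
spec:
  strategy:
    type: RollingUpdate      # default strategy; replaces old Pods gradually
    rollingUpdate:
      maxSurge: 25%          # extra Pods allowed above the desired count during an update
      maxUnavailable: 25%    # Pods allowed to be unavailable during an update
```

Tightening `maxUnavailable` to 0 keeps full capacity during updates at the cost of slower rollouts.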

2. Creating a Deployment

Step 1: Get configuration help
[root@master deployment_dir]# kubectl explain deployment
GROUP:      apps
KIND:       Deployment
VERSION:    v1
Step 2: Write the YAML configuration file (deployment-nginx.yml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-nginx        # Deployment name
spec:
  replicas: 1               # desired Pod replica count (default 1)
  selector:                 # label selector matching the Pods to manage
    matchLabels:
      app: nginx            # match Pods labeled app: nginx
  template:                 # Pod template
    metadata:
      labels:
        app: nginx          # Pod label; must match the selector
    spec:
      containers:           # container definitions
      - name: nginx
        image: nginx:1.26-alpine
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80 # container port (Nginx default is 80)
Step 3: Apply and verify
# 1. Create the Deployment
[root@master deployment_dir]# kubectl apply -f deployment-nginx.yml
deployment.apps/deploy-nginx created
# 2. Check Deployment status
[root@master deployment_dir]# kubectl get deployment
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
deploy-nginx   1/1     1            1           2m17s
# 3. Check the automatically created ReplicaSet
[root@master deployment_dir]# kubectl get replicaset
NAME                      DESIRED   CURRENT   READY   AGE
deploy-nginx-66d785bdb5   1         1         1       8m58s
# 4. Check Pod status
[root@master deployment_dir]# kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
deploy-nginx-66d785bdb5-nztrw   1/1     Running   0          106s

3. Accessing Pods Managed by the Deployment

Step 1: View Pod details
[root@master deployment_dir]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE   READINESS GATES
deploy-nginx-66d785bdb5-nztrw   1/1     Running   0          11m   10.244.166.145   node1   <none>           <none>
  # the Pod runs on node1 with IP 10.244.166.145
Step 2: Verify cluster network connectivity
# check the cluster Pod network (all nodes are in the 10.244.0.0/16 range)
[root@master deployment_dir]# ifconfig tunl0 | head -2
tunl0: flags=193<UP,RUNNING,NOARP>  mtu 1480
  inet 10.244.219.64  netmask 255.255.255.255
# ping the Pod from any node
[root@master deployment_dir]# ping -c 4 10.244.166.145
PING 10.244.166.145 (10.244.166.145) 56(84) bytes of data.
64 bytes from 10.244.166.145: icmp_seq=1 ttl=63 time=0.543 ms
64 bytes from 10.244.166.145: icmp_seq=2 ttl=63 time=0.366 ms
64 bytes from 10.244.166.145: icmp_seq=3 ttl=63 time=0.435 ms
64 bytes from 10.244.166.145: icmp_seq=4 ttl=63 time=0.381 ms
# access the Nginx service
[root@master deployment_dir]# curl http://10.244.166.145
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and working.</p>
</body>
</html>

4. Core Operation: Delete a Pod to Verify Self-Healing

# 1. Force-delete the Pod
[root@master deployment_dir]# kubectl delete pod deploy-nginx-66d785bdb5-nztrw --grace-period=0 --force
Warning: Immediate deletion does not wait for confirmation...
pod "deploy-nginx-66d785bdb5-nztrw" force deleted
# 2. Check the rebuilt Pod (node and IP change; replica count stays at 1)
[root@master deployment_dir]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP            NODE    NOMINATED NODE   READINESS GATES
deploy-nginx-66d785bdb5-zv2v4   1/1     Running   0          2m6s   10.244.104.29   node2   <none>           <none>

Conclusion: Pod IPs are not fixed. After a cluster restart the Pod comes back automatically but with a different IP, so a Service is needed to provide a stable access endpoint.
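A minimal ClusterIP Service sketch that would give these Pods the stable endpoint described above — the Service name and port mapping are assumptions for illustration, not part of the original setup:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: deploy-nginx-svc   # hypothetical name
spec:
  selector:
    app: nginx             # matches the Deployment's Pod label
  ports:
  - port: 80               # stable virtual port on the Service's ClusterIP
    targetPort: 80         # container port on the backing Pods
```

Clients then reach the Pods via the Service's ClusterIP (or its DNS name), which survives Pod rescheduling.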

5. Core Operation: Pod Version Upgrade

Step 1: Check the version before the upgrade
[root@master ~]# kubectl describe pods deploy-nginx-66d785bdb5-zv2v4 | grep Image:
Image:          nginx:1.26-alpine
[root@master ~]# kubectl exec deploy-nginx-66d785bdb5-zv2v4 -- nginx -v
nginx version: nginx/1.26.3
Step 2: Perform the upgrade (to version 1.29)
[root@master ~]# kubectl set image deployment deploy-nginx nginx=nginx:1.29-alpine --record
Flag --record has been deprecated, --record will be removed in the future
deployment.apps/deploy-nginx image updated
Step 3: Verify the upgrade
[root@master ~]# kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
deploy-nginx-b5d6b46fb-f56rg   1/1     Running   0          74s
[root@master ~]# kubectl exec deploy-nginx-b5d6b46fb-f56rg -- nginx -v
nginx version: nginx/1.29.0
# check rollout status (a success message means it is done)
[root@master ~]# kubectl rollout status deployment deploy-nginx
deployment "deploy-nginx" successfully rolled out

Note: in nginx=nginx:1.29-alpine, the first nginx is the container name. You can look it up with kubectl describe pod <pod-name> or kubectl edit deployment <deployment-name>.

6. Core Operation: Version Rollback

# 1. View revision history
[root@master ~]# kubectl rollout history deployment deploy-nginx
deployment.apps/deploy-nginx
REVISION  CHANGE-CAUSE
1         <none>
2         kubectl set image deployment deploy-nginx nginx=nginx:1.29-alpine --record=true
# 2. View the details of a specific revision (revision 1 uses the 1.26 image)
[root@master ~]# kubectl rollout history deployment deploy-nginx --revision=1
deployment.apps/deploy-nginx with revision #1
Pod Template:
  Labels:  app=nginx
           pod-template-hash=66d785bdb5
  Containers:
   nginx:
    Image:        nginx:1.26-alpine
    Port:         80/TCP
    Environment:  <none>
# 3. Roll back to revision 1
[root@master ~]# kubectl rollout undo deployment deploy-nginx --to-revision=1
deployment.apps/deploy-nginx rolled back
# 4. Verify the rollback
[root@master ~]# kubectl exec deploy-nginx-66d785bdb5-8jjdw -- nginx -v
nginx version: nginx/1.26.3

7. Core Operation: Scaling Replicas Up and Down

Step 1: Scale up
# scale up to 2 replicas
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=2
deployment.apps/deploy-nginx scaled
Step 2: Troubleshooting (Pod stuck in ContainerCreating)
# check Pod status
[root@master ~]# kubectl get pods
NAME                            READY   STATUS              RESTARTS   AGE
deploy-nginx-66d785bdb5-8jjdw   1/1     Running             0          19m
deploy-nginx-66d785bdb5-jfcxt   0/1     ContainerCreating   0          8m15s
# check events (they point to a Calico network plugin failure)
[root@master ~]# kubectl describe pod deploy-nginx-66d785bdb5-jfcxt
Events:
Warning  FailedCreatePodSandBox  8m53s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container ... networkPlugin cni failed to set up pod ... error getting ClusterInformation: connection is unauthorized: Unauthorized]
# fix: restart the Calico components
[root@master ~]# kubectl rollout restart daemonset calico-node -n kube-system
Step 3: Continue scaling up / down
# scale up to 4 replicas (may exceed the node count)
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=4
deployment.apps/deploy-nginx scaled
# scale down to 0 replicas (pause the service, keep the configuration)
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=0
deployment.apps/deploy-nginx scaled
[root@master ~]# kubectl get pods
No resources found in default namespace.

8. Core Operation: Multi-Replica Rolling Update

# 1. Scale up to 16 replicas
[root@master ~]# kubectl scale deployment deploy-nginx --replicas=16
deployment.apps/deploy-nginx scaled
# 2. Perform the rolling update
[root@master ~]# kubectl set image deployment deploy-nginx nginx=nginx:1.29-alpine --record
# 3. Verify the update
[root@master ~]# kubectl rollout status deployment deploy-nginx
deployment "deploy-nginx" successfully rolled out

9. Deleting the Deployment

[root@master ~]# kubectl delete deployment deploy-nginx
deployment.apps "deploy-nginx" deleted
[root@master ~]# kubectl get pods
No resources found in default namespace.

III. ReplicaSet Controller Hands-On

1. Core Positioning

  • Manages only the Pod replica count; it is the underlying dependency of Deployment;
  • Supports set-based label selectors; has no upgrade/rollback capability;
  • Not recommended on its own; manage it indirectly through a Deployment instead.
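The set-based selectors mentioned above (which the older ReplicationController lacks) are written with matchExpressions. A hypothetical fragment — the key and values are illustrative:

```yaml
selector:
  matchExpressions:            # set-based selector (ReplicaSet and newer only)
  - key: app
    operator: In               # also supported: NotIn, Exists, DoesNotExist
    values: ["nginx", "web"]   # matches Pods whose app label is any listed value
```

An equality-based `matchLabels` block can be combined with `matchExpressions`; a Pod must satisfy both.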

2. Creating a ReplicaSet

Step 1: Write the YAML configuration file (rs-nginx.yml)
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: rs-nginx
  namespace: default
spec:
  replicas: 2        # desired replica count
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx   # must match the selector labels
    spec:
      containers:
      - name: nginx
        image: nginx:1.26-alpine
        ports:
        - name: http
          containerPort: 80
Step 2: Apply and verify
# 1. Create the ReplicaSet
[root@master rep_dir]# kubectl apply -f rs-nginx.yml
replicaset.apps/rs-nginx created
# 2. Check ReplicaSet status
[root@master rep_dir]# kubectl get rs
NAME       DESIRED   CURRENT   READY   AGE
rs-nginx   2         2         2       3m3s
# 3. Check Pod status (no associated Deployment)
[root@master rep_dir]# kubectl get pods
NAME             READY   STATUS    RESTARTS   AGE
rs-nginx-2ftcr   1/1     Running   0          44s
rs-nginx-lzcv7   1/1     Running   0          44s
[root@master rep_dir]# kubectl get deployment
No resources found in default namespace.

3. Core Operation: Scaling and Version Limitations

# 1. Scale up to 4 replicas
[root@master rep_dir]# kubectl scale replicaset rs-nginx --replicas=4
replicaset.apps/rs-nginx scaled
# 2. Attempt a version upgrade (has no effect)
[root@master rep_dir]# kubectl set image replicaset rs-nginx nginx=nginx:latest --record
replicaset.apps/rs-nginx image updated
# 3. Verify the version (existing Pods are not updated)
[root@master rep_dir]# kubectl describe pods rs-nginx-2ftcr | grep Image:
Image:          nginx:1.26-alpine

Conclusion: a ReplicaSet does not support automatic version upgrades; only newly created Pods use the new image.

IV. Advanced Controller Hands-On

1. DaemonSet Controller (Daemon Set)

(1) Core characteristics
  • Runs exactly one Pod per node;
  • Deploys automatically when a node is added and cleans up when a node is removed;
  • Needs tolerations configured to run on master nodes;
  • Suited to log collection, node monitoring, and similar scenarios.
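Note that the taint key on control-plane nodes changed in newer Kubernetes releases (v1.24+ uses `node-role.kubernetes.io/control-plane`). A hedged toleration sketch covering both keys, in case the cluster version differs from the one used here:

```yaml
tolerations:
- key: node-role.kubernetes.io/master          # taint key on older clusters
  effect: NoSchedule
- key: node-role.kubernetes.io/control-plane   # taint key on v1.24+ clusters
  effect: NoSchedule
```

Listing both is harmless; a toleration for a taint that does not exist is simply ignored.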
(2) Creating a DaemonSet
Step 1: Write the YAML configuration file (daemonset-nginx.yml)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-nginx
spec:
  selector:
    matchLabels:
      name: nginx-ds
  template:
    metadata:
      labels:
        name: nginx-ds
    spec:
      tolerations:            # tolerate the master node taint
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: nginx
        image: nginx:1.26-alpine
        imagePullPolicy: IfNotPresent
        resources:            # resource limits
          limits:
            memory: 100Mi
          requests:
            memory: 100Mi
Step 2: Apply and verify
# 1. Create the DaemonSet
[root@master rep_dir]# kubectl apply -f daemonset-nginx.yml
daemonset.apps/daemonset-nginx created
# 2. Check DaemonSet status
[root@master rep_dir]# kubectl get daemonset
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   AGE
daemonset-nginx   2         2         2       2            2           4m
# 3. Check Pod placement (one replica per node)
[root@master rep_dir]# kubectl get pod -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP              NODE    NOMINATED NODE   READINESS GATES
daemonset-nginx-7795c   1/1     Running   0          5m19s   10.244.166.166  node1   <none>           <none>
daemonset-nginx-l94ds   1/1     Running   0          5m19s   10.244.104.51   node2   <none>           <none>
Step 3: Version upgrade
# upgrade to the latest tag
[root@master rep_dir]# kubectl set image daemonset daemonset-nginx nginx=nginx:latest
daemonset.apps/daemonset-nginx image updated
# verify the upgrade
[root@master rep_dir]# kubectl describe pods daemonset-nginx-mpjfd | grep Image:
Image:          nginx:latest

2. Job Controller (One-Off Tasks)

(1) Core characteristics
  • For non-durable tasks; the Pod terminates once the task completes;
  • Supports configuring the number of completions (completions) and the concurrency (parallelism);
  • Difference from ReplicaSet: a ReplicaSet manages long-running Pods, while a Job manages one-off tasks.
(2) Case 1: Compute pi to 2000 digits
Step 1: Write the YAML configuration file (job.yml)
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      name: pod-pi
    spec:
      nodeName: node2        # pin to a specific node
      containers:
      - name: c-pi
        image: perl          # needs the Perl image (~800 MB; pre-pulling is recommended)
        imagePullPolicy: IfNotPresent
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never   # do not restart after completion
Step 2: Apply and verify
# 1. Create the Job
[root@master job_dir]# kubectl apply -f job.yml
job.batch/pi created
# 2. Check Job status
[root@master job_dir]# kubectl get job
NAME   COMPLETIONS   DURATION   AGE
pi     1/1           4m43s      5m
# 3. Check Pod status (Completed once the task finishes)
[root@master job_dir]# kubectl get pod
NAME         READY   STATUS      RESTARTS   AGE
pi-mjk9t     0/1     Completed   0          4m43s
# 4. View the result (pi printed to 2000 digits)
[root@master job_dir]# kubectl logs pi-mjk9t
3.141592653589793238462643383279502884197169399375105820974944592307816406286208...
(3) Case 2: Fixed-count task (run 10 times)
Step 1: Write the YAML configuration file (job2.yml)
apiVersion: batch/v1
kind: Job
metadata:
  name: busybox-job
spec:
  completions: 10   # number of completions
  parallelism: 1    # concurrency (one task at a time)
  template:
    metadata:
      name: busybox-job-pod
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["echo", "hello"]
      restartPolicy: Never
Step 2: Apply and verify
[root@master job_dir]# kubectl apply -f job2.yml
job.batch/busybox-job created
# check Pod status (all 10 Pods are Completed)
[root@master job_dir]# kubectl get pod
NAME                READY   STATUS      RESTARTS   AGE
busybox-job-2x7kw   0/1     Completed   0          17s
busybox-job-4drnr   0/1     Completed   0          27s
...(remaining 8 Pods omitted)
# view the task output
[root@master job_dir]# kubectl logs busybox-job-2x7kw
hello
(4) Case 3: One-off MySQL database backup
Step 1: Deploy the MySQL environment (mysqld.yml)
apiVersion: v1
kind: Service
metadata:
  name: mysql-test
spec:
  ports:
  - port: 3306
    name: mysql
  clusterIP: None      # headless service
  selector:
    app: mysql-dump
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  selector:
    matchLabels:
      app: mysql-dump
  serviceName: "mysql-test"
  template:
    metadata:
      labels:
        app: mysql-dump
    spec:
      nodeName: node1          # pin to node1
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "abc123"      # database password
        ports:
        - containerPort: 3306
        volumeMounts:
        - mountPath: "/var/lib/mysql"
          name: mysql-data
      volumes:
      - name: mysql-data
        hostPath:
          path: /opt/mysqldata # local storage directory
Step 2: Deploy MySQL and verify
[root@master job_dir]# kubectl apply -f mysqld.yml
service/mysql-test created
statefulset.apps/db created
# check MySQL status
[root@master job_dir]# kubectl get pod -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE
db-0   1/1     Running   0          13m   10.244.166.170  node1   <none>
Step 3: Write the backup Job configuration (job_mysqld.yml)
apiVersion: batch/v1
kind: Job
metadata:
  name: mysql-dump
spec:
  template:
    metadata:
      name: mysql-dump
    spec:
      nodeName: node2          # run on node2
      containers:
      - name: mysql-dump
        image: mysql:5.7
        command: ["/bin/sh", "-c", "mysqldump --host=mysql-test -uroot -pabc123 --databases mysql > /root/mysql_back.sql"]
        volumeMounts:
        - mountPath: "/root"
          name: mysql-data
      restartPolicy: Never
      volumes:
      - name: mysql-data
        hostPath:
          path: /opt/mysqldump # backup directory
Step 4: Run the backup and verify
[root@master job_dir]# kubectl apply -f job_mysqld.yml
job.batch/mysql-dump created
# check the backup file (on node2)
[root@node2 ~]# ls /opt/mysqldump/
mysql_back.sql

3. CronJob Controller (Periodic Tasks)

(1) Core characteristics
  • Scheduled by a cron expression; under the hood it creates Jobs on a timer;
  • Supports configuring the schedule, concurrency, and retry policy;
  • Suited to scheduled backups, log cleanup, and similar scenarios.
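The concurrency and history knobs mentioned above are set on the CronJob spec. A sketch with illustrative values (schedule and limits are assumptions, not from the original examples):

```yaml
spec:
  schedule: "0 2 * * *"           # 02:00 every day
  concurrencyPolicy: Forbid       # Allow (default) | Forbid | Replace
  startingDeadlineSeconds: 120    # skip a run that cannot start within 120 s
  successfulJobsHistoryLimit: 3   # finished Jobs to keep for inspection
  failedJobsHistoryLimit: 1
```

`Forbid` skips a new run while the previous one is still active, which is usually what backup jobs want.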
(2) Case 1: Print text periodically
Step 1: Write the YAML configuration file (cronjob.yml)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cronjob1
spec:
  schedule: "* * * * *"    # cron expression (every minute)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo hello kubernetes
          restartPolicy: OnFailure   # restart on failure
Step 2: Apply and verify
[root@master cronjob_dir]# kubectl apply -f cronjob.yml
cronjob.batch/cronjob1 created
# check CronJob status
[root@master cronjob_dir]# kubectl get cronjob
NAME       SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob1   * * * * *   False      0        15s             91s
# check the task Pod and its output
[root@master cronjob_dir]# kubectl get pod
NAME                      READY   STATUS      RESTARTS   AGE
cronjob1-29236967-svqln   0/1     Completed   0          4s
[root@master cronjob_dir]# kubectl logs cronjob1-29236967-svqln
Sun Aug  3 10:47:00 UTC 2025
hello kubernetes
(3) Case 2: Periodic MySQL database backup
Step 1: Write the CronJob configuration (cronjob_mysqld.yml)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-dump
spec:
  schedule: "*/1 * * * *"   # run every minute
  jobTemplate:
    spec:
      template:
        spec:
          nodeName: node1
          containers:
          - name: c1
            image: mysql:5.7
            command: ["/bin/sh","-c","mysqldump --host=mysql-test -uroot -pabc123 --databases mysql > /root/mysql`date +%Y%m%d%H%M`.sql"]
            volumeMounts:
            - name: mysql-data
              mountPath: "/root"
          restartPolicy: Never
          volumes:
          - name: mysql-data
            hostPath:
              path: /opt/mysqldump
Step 2: Apply and verify
[root@master cronjob_dir]# kubectl apply -f cronjob_mysqld.yml
cronjob.batch/mysql-dump created
# check the backup files (on node1)
[root@node1 ~]# ls /opt/mysqldump/
mysql202508031109.sql mysql202508031110.sql ...
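The backtick substitution `date +%Y%m%d%H%M` in the command above is what produces one file per minute. A quick local sketch of the naming pattern (the path is just the backup directory used above; nothing here touches MySQL):

```shell
# build a per-minute backup filename the same way the CronJob command does
ts=$(date +%Y%m%d%H%M)                  # e.g. 202508031109
backup="/opt/mysqldump/mysql${ts}.sql"
echo "$backup"
```

Because the timestamp has minute granularity, two runs within the same minute would overwrite each other; add seconds (`%S`) if that matters.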

V. StatefulSet Controller Hands-On (Stateful Applications)

1. Core Concepts and Characteristics

(1) What StatefulSet does
  • Manages deployment, scaling, and rolling updates of stateful applications;
  • Provides stable network identity (via a Headless Service) and persistent storage;
  • Supports ordered deployment/scale-down (e.g. deploy 0→1→2, scale down 2→1→0).
(2) Stateless vs stateful applications

| Type | Example | Key characteristics |
|---|---|---|
| Stateless | Nginx | Each request carries all the information needed for a response; no data dependency; Pods are freely replaceable |
| Stateful | MySQL | Requests depend on earlier ones; data must be persisted; each Pod has a unique identity and cannot be replaced arbitrarily |
(3) Components a StatefulSet relies on
  1. Headless Service: provides stable network identity (no ClusterIP);
  2. The StatefulSet resource itself: defines the application configuration;
  3. VolumeClaimTemplate: a storage template that automatically creates a PVC for each Pod.

2. Deployment Prep: NFS Storage Setup

Step 1: Deploy the NFS server (IP: 192.168.100.138)
# 1. Install the NFS service
[root@nfsserver ~]# yum install -y nfs-utils
# 2. Create the shared directory
[root@nfsserver ~]# mkdir -p /data/nfs
[root@nfsserver ~]# chmod 777 /data/nfs
# 3. Configure the export (/etc/exports)
[root@nfsserver ~]# echo "/data/nfs *(rw,no_root_squash,sync)" > /etc/exports
# 4. Start the service and disable the firewall
[root@nfsserver ~]# systemctl restart nfs-server
[root@nfsserver ~]# systemctl enable nfs-server
[root@nfsserver ~]# systemctl stop firewalld
[root@nfsserver ~]# setenforce 0
# verify the export
[root@nfsserver ~]# showmount -e
Export list for nfsserver:
/data/nfs *
Step 2: Install the NFS client on all worker nodes
[root@node1 ~]# yum install -y nfs-utils
[root@node2 ~]# yum install -y nfs-utils
# verify NFS connectivity
[root@node1 ~]# showmount -e 192.168.100.138
Export list for 192.168.100.138:
/data/nfs *

3. Static Storage: PV and PVC Configuration

Step 1: Create the PV (pv-nfs.yml)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
spec:
  capacity:
    storage: 1Gi             # storage capacity
  accessModes:
  - ReadWriteMany            # multi-node read/write
  nfs:
    server: 192.168.100.138  # NFS server IP
    path: /data/nfs          # shared directory
Step 2: Create the PVC (pvc-nfs.yml)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nfs
spec:
  accessModes:
  - ReadWriteMany    # must match the PV
  resources:
    requests:
      storage: 1Gi   # matches the PV capacity
Step 3: Apply and verify
# 1. Create the PV and PVC
[root@master statefulset_dir]# kubectl apply -f pv-nfs.yml
[root@master statefulset_dir]# kubectl apply -f pvc-nfs.yml
# 2. Verify the binding
[root@master statefulset_dir]# kubectl get pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             AGE
pv-nfs   1Gi        RWX            Retain           Bound    default/pvc-nfs   11m
[root@master statefulset_dir]# kubectl get pvc
NAME        STATUS   VOLUME   CAPACITY   ACCESS MODES   AGE
pvc-nfs     Bound     pv-nfs   1Gi        RWX            91s
Step 4: Deploy an Nginx application using the PV/PVC
# dep-nginx-nfs.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-nginx-nfs
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.26-alpine
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html   # mount point
        ports:
        - containerPort: 80
      volumes:
      - name: www
        persistentVolumeClaim:
          claimName: pvc-nfs   # bind to the PVC
# apply and verify
[root@master statefulset_dir]# kubectl apply -f dep-nginx-nfs.yml
[root@master statefulset_dir]# kubectl get pod -o wide
# create a test file on the NFS server
[root@nfsserver nfs]# echo "hello nfs storage" > index.html
# access the Pod to verify
[root@master statefulset_dir]# curl http://10.244.166.133
hello nfs storage

4. Dynamic Storage: StorageClass Configuration

(1) Core advantages
  • No need to create PVs by hand; a PV is generated automatically when a PVC requests it;
  • Administrators no longer pre-provision large numbers of PVs; supply is on demand.
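Once the provisioner below is running, any standalone PVC that names the StorageClass gets a PV automatically; it does not have to come from a StatefulSet's volumeClaimTemplates. A minimal sketch (the PVC name is a hypothetical example):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-auto                 # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: nfs-client   # triggers dynamic provisioning
  resources:
    requests:
      storage: 1Gi
```

Applying this PVC alone should produce a matching `pvc-…` PV and a backing directory on the NFS share.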
(2) Deploy the dynamic provisioner (NFS-Client Provisioner)
Step 1: Download and configure the StorageClass (storageclass-nfs.yml)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client    # StorageClass name
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner    # provisioner name
parameters:
  archiveOnDelete: "false"    # do not archive data when the PVC is deleted
[root@master nfs-cli]# kubectl apply -f storageclass-nfs.yml
storageclass.storage.k8s.io/nfs-client created
Step 2: Configure RBAC (authorize the provisioner to manage PVs/PVCs)
# download the RBAC manifest
[root@master nfs-cli]# wget https://raw.githubusercontent.com/kubernetes-sigs/nfs-subdir-external-provisioner/master/deploy/rbac.yaml
# apply it
[root@master nfs-cli]# kubectl apply -f rbac.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
...
Step 3: Deploy the provisioner Deployment
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
      - name: nfs-client-provisioner
        image: registry.cn-beijing.aliyuncs.com/pylixm/nfs-subdir-external-provisioner:v4.0.0   # China mirror
        volumeMounts:
        - name: nfs-client-root
          mountPath: /persistentvolumes
        env:
        - name: PROVISIONER_NAME
          value: k8s-sigs.io/nfs-subdir-external-provisioner
        - name: NFS_SERVER
          value: "192.168.100.138"   # NFS server IP
        - name: NFS_PATH
          value: /data/nfs           # shared directory
      volumes:
      - name: nfs-client-root
        nfs:
          server: 192.168.100.138
          path: /data/nfs
[root@master nfs-cli]# kubectl apply -f deployment.yaml
deployment.apps/nfs-client-provisioner created
# verify the provisioner
[root@master nfs-cli]# kubectl get pod
NAME                                      READY   STATUS    RESTARTS   AGE
nfs-client-provisioner-54b9cb8bf9-zjbdz   1/1     Running   0          18s
Step 4: Use dynamic storage from a StatefulSet
# nginx-storageclass-nfs.yml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None   # headless service
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx-svc"   # bind to the headless service
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-c
        image: nginx:1.26-alpine
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:      # PVC template (PVCs generated automatically)
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "nfs-client"   # bind to the StorageClass
      resources:
        requests:
          storage: 1Gi
# apply and verify
[root@master nfs-cli]# kubectl apply -f nginx-storageclass-nfs.yml
service/nginx-svc created
statefulset.apps/web created
# view the auto-generated PVCs and PVs
[root@master nfs-cli]# kubectl get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   AGE
www-web-0   Bound    pvc-91f70f53-f928-406c-afcc-bd47a83b0e56   1Gi        RWO            111s
www-web-1   Bound    pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f   1Gi        RWO            108s
[root@master nfs-cli]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   STATUS   CLAIM               AGE
pvc-91f70f53-f928-406c-afcc-bd47a83b0e56   1Gi        RWO            Bound    default/www-web-0   2m57s
pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f   1Gi        RWO            Bound    default/www-web-1   2m54s
(3) Verify dynamic storage
# view the auto-created directories on the NFS server
[root@nfsserver nfs]# ls
default-www-web-0-pvc-91f70f53-f928-406c-afcc-bd47a83b0e56
default-www-web-1-pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f
# write test data
[root@nfsserver nfs]# echo "this is web0" > default-www-web-0-pvc-91f70f53-f928-406c-afcc-bd47a83b0e56/index.html
[root@nfsserver nfs]# echo "this is web1" > default-www-web-1-pvc-e3339644-0385-4e6b-8aac-155a6c2b3e3f/index.html
# access the Pods to verify
[root@master nfs-cli]# kubectl get pod -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE
web-0   1/1     Running   0          9m    10.244.166.154   node1   <none>
web-1   1/1     Running   0          8m    10.244.104.62    node2   <none>
[root@master nfs-cli]# curl http://10.244.166.154
this is web0
[root@master nfs-cli]# curl http://10.244.104.62
this is web1

5. Canary Release in Practice (StatefulSet Feature)

(1) Core principle

The partition parameter limits the update scope: only Pods whose ordinal is ≥ partition are updated, which enables a gradual (canary) rollout.
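The same partition can be declared directly in the StatefulSet manifest instead of patched at runtime; a sketch of the relevant fields:

```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 1   # only Pods with ordinal >= 1 are updated
```

Lowering the partition step by step (2 → 1 → 0) widens the rollout one ordinal at a time.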

(2) Steps
Step 1: Set the partition parameter (only Pods with ordinal ≥ 1 are updated)
[root@master nfs-cli]# kubectl patch sts web -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":1}}}}'
statefulset.apps/web patched
# verify the parameter
[root@master nfs-cli]# kubectl get sts web -o yaml | grep partition
partition: 1
Step 2: Check the versions before the update
[root@master nfs-cli]# kubectl describe pod web-0 | grep Image:
Image:          nginx:1.26-alpine
[root@master nfs-cli]# kubectl describe pod web-1 | grep Image:
Image:          nginx:1.26-alpine
Step 3: Perform the update
[root@master nfs-cli]# kubectl set image sts/web nginx-c=nginx:1.29-alpine
statefulset.apps/web image updated
Step 4: Verify the result (only web-1 is updated)
[root@master nfs-cli]# kubectl describe pod web-1 | grep Image:
Image:          nginx:1.29-alpine
[root@master nfs-cli]# kubectl describe pod web-0 | grep Image:
Image:          nginx:1.26-alpine
Step 5: Scale up to verify (new Pods use the new version)
# scale up to 4 replicas
[root@master nfs-cli]# kubectl scale sts web --replicas=4
statefulset.apps/web scaled
# check versions (web-2 and web-3 run 1.29)
[root@master nfs-cli]# kubectl get pods -o custom-columns=Name:metadata.name,Image:spec.containers[0].image
Name      Image
web-0     nginx:1.26-alpine
web-1     nginx:1.29-alpine
web-2     nginx:1.29-alpine
web-3     nginx:1.29-alpine
Step 6: Full rollout (set partition=0)
[root@master nfs-cli]# kubectl patch sts web -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
statefulset.apps/web patched
# verify the full rollout
[root@master nfs-cli]# kubectl get pods -o custom-columns=Name:metadata.name,Image:spec.containers[0].image
Name      Image
web-0     nginx:1.29-alpine
web-1     nginx:1.29-alpine
web-2     nginx:1.29-alpine
web-3     nginx:1.29-alpine

VI. Summary

This article covers the complete hands-on workflow for Kubernetes' core controllers: Deployment (stateless applications), ReplicaSet, DaemonSet, Job, CronJob, StatefulSet, and the related storage pieces (NFS, PV/PVC, dynamic provisioning). Every step keeps the original key commands, configuration examples, and verification output, so it can be followed along directly.

Key takeaways:

  1. Prefer Deployment for stateless applications; it supports upgrades and rollbacks;
  2. Use StatefulSet for stateful applications, together with a Headless Service and storage, to guarantee stability;
  3. Use DaemonSet for per-node services, Job for one-off tasks, and CronJob for periodic tasks;
  4. Prefer dynamic provisioning (StorageClass) for storage to cut down on manual configuration.

For further study, dig into advanced controller parameters (e.g. rolling-update strategies, affinity scheduling) or cloud-native storage solutions (e.g. Ceph, GlusterFS).

posted on 2026-02-02 16:34  ljbguanli