Rook Ceph Deployment Manual (Helm Edition)
📚 Complete Rook Ceph Deployment Guide (1 Master + 2 Worker Architecture)
Overview: This document targets a 1 Master + 2 Worker Kubernetes architecture and walks through deploying a Rook Ceph storage cluster from scratch, covering prerequisites, Operator installation, cluster creation, and verification and testing.
📋 Deployment Flow at a Glance
The deployment is split into four main stages, each with a clear goal and its own verification steps:
✅ Stage 0: Pre-flight Checks and Preparation
Work through every item below before deploying; everything that follows depends on it.
0.1 Check the Kubernetes cluster
kubectl get nodes
Expected output (example):
NAME STATUS ROLES AGE VERSION
master Ready control-plane 15d v1.28.0
worker1 Ready worker 15d v1.28.0
worker2 Ready worker 15d v1.28.0
- You should see 3 nodes: 1 master (control-plane) node plus 2 worker nodes, all in Ready state
- Note down the names of the two worker nodes (e.g. worker1, worker2); they are needed in later configuration
0.2 Check the storage devices on all nodes
Run the following on the master node and on each of the two worker nodes:
sudo lsblk -f
Expected state (simplified example; lsblk -f additionally shows an FSTYPE column):
On worker1 and worker2:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 50G 0 disk
├─sda1 8:1 0 500M 0 part /boot/efi
└─sda2 8:2 0 49.5G 0 part /
sdb 8:16 0 100G 0 disk ← unformatted, to be used for the OSD
On the master node:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 50G 0 disk
├─sda1 8:1 0 500M 0 part /boot/efi
└─sda2 8:2 0 49.5G 0 part /
- Each of the two worker nodes has a disk like /dev/sdb whose FSTYPE column is empty (a quick signature check is sketched below)
- The master node does not need a storage device; it is fine for it to have no sdb
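As an extra sanity check on each worker, wipefs can list any existing signatures on the spare disk. This is a minimal sketch that assumes the device really is /dev/sdb; adjust the path if yours differs. Empty output means the disk carries no filesystem or partition-table signature and is safe for Rook to consume:
# Run on worker1 and worker2; /dev/sdb is an assumption
sudo lsblk -no NAME,FSTYPE /dev/sdb   # the FSTYPE field should be empty
sudo wipefs /dev/sdb                  # no output = no existing signatures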
0.3 Check the Helm version
helm version
Requirement: Helm 3.x
0.4 Install lvm2 on the OSD nodes (recommended)
Run on both worker nodes (the nodes that will host OSDs):
# OpenEuler / CentOS / RHEL
sudo yum install -y lvm2
# Ubuntu/Debian
sudo apt-get install -y lvm2
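Rook's prerequisites also expect the RBD kernel module on nodes that will mount block volumes. The check below is only a sketch; most recent distribution kernels already ship the module, and the modules-load.d path is a common convention rather than a requirement:
# Check whether the rbd module is available, and load it if necessary
lsmod | grep -w rbd || sudo modprobe rbd
# Optionally make it load on every boot
echo rbd | sudo tee /etc/modules-load.d/rbd.conf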
🚀 Stage 1: Install the Rook Ceph Operator
1.1 Add the Helm repository and update it
helm repo add rook-release https://charts.rook.io/release
helm repo update
1.2 Create the namespace
kubectl create namespace rook-ceph
1.3 Prepare the values file for the Ceph Operator
Create a file named rook-ceph-values.yaml with the following content:
# CSI plugin configuration (CSI sidecar images pulled from a Chinese mirror registry)
csi:
  cephcsi:
    # -- Ceph CSI image repository
    repository: quay.io/cephcsi/cephcsi
    # -- Ceph CSI image tag
    tag: v3.14.1
  registrar:
    # -- Kubernetes CSI registrar image repository
    repository: registry.aliyuncs.com/google_containers/csi-node-driver-registrar
    # -- Registrar image tag
    tag: v2.13.0
  provisioner:
    # -- Kubernetes CSI provisioner image repository
    repository: registry.aliyuncs.com/google_containers/csi-provisioner
    # -- Provisioner image tag
    tag: v5.2.0
  snapshotter:
    # -- Kubernetes CSI snapshotter image repository
    repository: registry.aliyuncs.com/google_containers/csi-snapshotter
    # -- Snapshotter image tag
    tag: v8.2.1
  attacher:
    # -- Kubernetes CSI attacher image repository
    repository: registry.aliyuncs.com/google_containers/csi-attacher
    # -- Attacher image tag
    tag: v4.8.1
  resizer:
    # -- Kubernetes CSI resizer image repository
    repository: registry.aliyuncs.com/google_containers/csi-resizer
    # -- Resizer image tag
    tag: v1.13.2
# Resource limits for the Operator (adjust for production)
resources:
  limits:
    memory: 1Gi
  requests:
    cpu: 100m
    memory: 128Mi
1.4 Install the Operator chart
helm upgrade --install rook-ceph-operator rook-release/rook-ceph \
--namespace rook-ceph \
--create-namespace \
--version v1.17.6 \
-f rook-ceph-values.yaml
1.5 Verify the Operator is running
# Wait a moment until the Pod status becomes Running
kubectl -n rook-ceph get pods -l app=rook-ceph-operator
Expected output:
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-xxxxxxxx-yyyyy 1/1 Running 0 2m
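If the Pod does not reach Running, the Operator log usually points at the cause (image pull failures, RBAC problems, and so on):
kubectl -n rook-ceph logs deploy/rook-ceph-operator --tail=50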
🗂️ Stage 2: Create the Ceph Cluster (Core Configuration)
2.1 Label the Kubernetes nodes with their roles
The scheduling policy below relies on node labels, so add them first:
# 1. Inspect the existing node labels
kubectl get nodes --show-labels
# 2. Label the two worker nodes with the worker role
# Assuming your nodes are named worker1 and worker2; adjust to your environment
kubectl label nodes worker1 node-role.kubernetes.io/worker=true
kubectl label nodes worker2 node-role.kubernetes.io/worker=true
# 3. Confirm the labels again
kubectl get nodes --show-labels
Expected output (example):
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 15d v1.28.0
worker1 Ready worker 15d v1.28.0
worker2 Ready worker 15d v1.28.0
- worker1 and worker2 should carry the label node-role.kubernetes.io/worker=true (the selector check below confirms this)
- the master node should show the control-plane,master roles
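To confirm the label really landed where the placement rules expect it, a label-selector query should return exactly the two workers:
kubectl get nodes -l node-role.kubernetes.io/worker=true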
2.2 Prepare the values file for the Ceph cluster
Create a file named rook-ceph-cluster-values.yaml with the following content:
# ============================================
# Rook Ceph cluster Helm chart customized values
# Version: v1.17.6
# Target environment: Kubernetes cluster with 1 master + 2 worker nodes
# Storage layout: each worker node contributes /dev/sdb as a Ceph OSD disk
# Enabled features: block storage (RBD) + shared filesystem (CephFS)
# Note: replace every <placeholder> with an actual value
# ============================================
# -- Namespace of the main Rook Operator (must match the operator installation)
operatorNamespace: rook-ceph
# -- Name of the CephCluster custom resource
clusterName: rook-ceph
# Enable the Ceph toolbox for cluster administration and troubleshooting
toolbox:
  enabled: true
  image: quay.io/ceph/ceph:v19.2.2 # keep in sync with the cluster version
# Configure an Ingress so the Ceph Dashboard can be reached via a hostname
ingress:
  dashboard:
    enabled: true
    host:
      name: <hostname> # make sure this hostname resolves to your Ingress controller
      path: /
      pathType: Prefix
    ingressClassName: <ingressClassName> # must match the Ingress controller deployed in the cluster, e.g. nginx
    annotations:
      nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" # the Ceph Dashboard serves HTTPS
# ============================================
# Core Ceph cluster configuration (cephClusterSpec)
# Defines the basic deployment parameters of the Ceph daemons
# ============================================
cephClusterSpec:
  # Ceph container image version
  cephVersion:
    image: quay.io/ceph/ceph:v19.2.2 # stable Squid release
    allowUnsupported: false
  # Host path where Rook stores its configuration and data
  dataDirHostPath: /var/lib/rook
  # Monitor configuration: deploy 2 mons to form a quorum (one on each worker)
  # Note: 3 or more mons (an odd number: 3, 5, ...) is the usual minimum; this Kubernetes
  # cluster only has 2 worker nodes, so only 2 mons are deployed
  mon:
    count: 2
    allowMultiplePerNode: false # never run more than one mon on the same node
  # Manager configuration: deploy 2 mgrs for high availability
  mgr:
    count: 2
    allowMultiplePerNode: false
  # Enable the Ceph Dashboard
  dashboard:
    enabled: true
    ssl: true # serve the Dashboard over SSL
  # Network configuration: host networking for best performance
  network:
    provider: host
    # Important: host networking requires the Ceph ports to be reachable between nodes
    # Open these ports: 6789 (mon), 6800-7300 (osd), 8443/7000 (dashboard)
  # Placement: control how components are spread across nodes
  placement:
    # Monitors: prefer the worker nodes, allow the master as a fallback
    mon:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
                - key: node-role.kubernetes.io/worker
                  operator: In
                  values: ["true"]
      tolerations: # tolerate the control-plane taint
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
    # OSDs: must run on nodes labelled as workers
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: node-role.kubernetes.io/worker
                  operator: In
                  values: ["true"]
  # Storage configuration: control exactly which disks on which nodes become OSDs
  storage:
    useAllNodes: false # do not automatically use every node
    useAllDevices: false # do not automatically use every device
    deviceFilter: "sdb" # global device filter
    nodes:
      # Important: replace these with the actual hostnames of your worker nodes
      - name: "<worker-node-1-hostname>" # replace with the first worker node name
        devices:
          - name: "sdb"
      - name: "<worker-node-2-hostname>" # replace with the second worker node name
        devices:
          - name: "sdb"
    config:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
  # Resource limits (adjust to the actual capacity of your nodes)
  resources:
    osd:
      limits:
        memory: "4Gi"
      requests:
        cpu: "200m"
        memory: "1Gi"
    mgr:
      limits:
        memory: "1Gi"
      requests:
        cpu: "200m"
        memory: "512Mi"
    mon:
      limits:
        memory: "2Gi"
      requests:
        cpu: "200m"
        memory: "512Mi"
  cleanupPolicy:
    confirmation: "" # leave empty so the cluster data is not destroyed
    sanitizeDisks:
      method: complete # changed from quick to complete
      dataSource: zero
      iteration: 1
    allowUninstallWithVolumes: false
# ============================================
# Block storage (RBD) configuration
# Provides RWO (ReadWriteOnce) volumes, suitable for databases and similar workloads
# ============================================
# To disable it, set cephBlockPools: [] instead
cephBlockPools:
  - name: ceph-blockpool
    spec:
      failureDomain: host # spread the replicas across nodes
      replicated:
        size: 2 # key setting: 2 replicas to match the 2 OSD nodes (default is 3)
        # Warning: size=2 means every write must land on 2 OSDs.
        # If one node fails, the data stays available but the pool runs degraded.
    storageClass:
      enabled: true
      isDefault: false # recommended not to make this the default StorageClass
      name: rook-ceph-block
      reclaimPolicy: Retain
      allowVolumeExpansion: true
      volumeBindingMode: Immediate
      parameters:
        # (optional) mapOptions is a comma-separated list of map options.
        # For krbd options refer
        # https://docs.ceph.com/docs/latest/man/8/rbd/#kernel-rbd-krbd-options
        # For nbd options refer
        # https://docs.ceph.com/docs/latest/man/8/rbd-nbd/#options
        # mapOptions: lock_on_read,queue_depth=1024
        # (optional) unmapOptions is a comma-separated list of unmap options.
        # For krbd options refer
        # https://docs.ceph.com/docs/latest/man/8/rbd/#kernel-rbd-krbd-options
        # For nbd options refer
        # https://docs.ceph.com/docs/latest/man/8/rbd-nbd/#options
        # unmapOptions: force
        # RBD image format. Defaults to "2".
        imageFormat: "2"
        # RBD image features, equivalent to OR'd bitfield value: 63
        # Available for imageFormat: "2". Older releases of CSI RBD
        # support only the `layering` feature. The Linux kernel (KRBD) supports the
        # full feature complement as of 5.4
        imageFeatures: layering
        # These secrets contain Ceph admin credentials.
        csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/provisioner-secret-namespace: "{{ .Release.Namespace }}"
        csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/controller-expand-secret-namespace: "{{ .Release.Namespace }}"
        csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
        csi.storage.k8s.io/node-stage-secret-namespace: "{{ .Release.Namespace }}"
        # Specify the filesystem type of the volume. If not specified, csi-provisioner
        # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
        # in hyperconverged settings where the volume is mounted on the same node as the osds.
        csi.storage.k8s.io/fstype: ext4
# ============================================
# Shared filesystem (CephFS) configuration
# Provides RWX (ReadWriteMany) volumes for shared-storage use cases
# ============================================
cephFileSystems:
  - name: ceph-filesystem
    spec:
      # Metadata pool: stores the filesystem metadata (directory tree, permissions, ...)
      metadataPool:
        failureDomain: host
        replicated:
          size: 2 # key setting: 2 replicas to match the 2 OSD nodes (default is 3)
      # Data pools: store the actual file contents
      dataPools:
        - name: data-pool # explicitly named, easier to manage
          failureDomain: host
          replicated:
            size: 2 # 2 replicas
      # Metadata server (MDS) configuration
      metadataServer:
        activeCount: 1 # number of active MDS daemons (1 is recommended for a 2-node cluster)
        activeStandby: true # keep a standby MDS for automatic failover
        # Resource limits
        resources:
          limits:
            memory: "4Gi"
          requests:
            cpu: "200m"
            memory: "1Gi"
        priorityClassName: system-cluster-critical
    # StorageClass for CephFS
    storageClass:
      enabled: true
      isDefault: false
      name: rook-cephfs # StorageClass name
      pool: data-pool # uses the data-pool defined above
      reclaimPolicy: Retain
      allowVolumeExpansion: true
      volumeBindingMode: Immediate
      parameters:
        # The secrets contain Ceph admin credentials.
        csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
        csi.storage.k8s.io/provisioner-secret-namespace: "{{ .Release.Namespace }}"
        csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
        csi.storage.k8s.io/controller-expand-secret-namespace: "{{ .Release.Namespace }}"
        csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
        csi.storage.k8s.io/node-stage-secret-namespace: "{{ .Release.Namespace }}"
        # Specify the filesystem type of the volume. If not specified, csi-provisioner
        # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
        # in hyperconverged settings where the volume is mounted on the same node as the osds.
        csi.storage.k8s.io/fstype: ext4
# ============================================
# Object storage (RGW) configuration - not enabled in this deployment
# ============================================
cephObjectStores: [] # empty array means disabled
# ============================================
# Monitoring - enable if needed
# ============================================
monitoring:
  enabled: false # set to true if Prometheus is already deployed
# ============================================
# Other advanced settings
# ============================================
# Disable PodSecurityPolicy (unless your cluster requires it)
pspEnable: false
# CSI driver name prefix (normally no need to change)
csiDriverNamePrefix:
Important notes:
- The name fields in the configuration (worker1, worker2) must match your nodes' hostnames exactly
- Check the node names with kubectl get nodes and make sure they match
- If your node names are not worker1/worker2, adjust the configuration file accordingly (the dry-run render sketched below catches structural mistakes early)
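Before installing, it can help to render the chart locally so that indentation or structural mistakes in the values file surface early (this does not catch forgotten <...> placeholders, so double-check those by hand). A sketch, assuming the rook-release repo added in step 1.1:
helm template rook-ceph-cluster rook-release/rook-ceph-cluster \
  --namespace rook-ceph \
  --version v1.17.6 \
  -f rook-ceph-cluster-values.yaml > /dev/null && echo "values file renders cleanly"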
2.3 Install the Ceph cluster chart
helm upgrade --install rook-ceph-cluster rook-release/rook-ceph-cluster \
--create-namespace \
--namespace rook-ceph \
--version v1.17.6 \
-f rook-ceph-cluster-values.yaml
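The CephCluster resource reports overall progress, and watching its PHASE column (Progressing → Ready) is often more informative than the raw Pod list; the resource name below matches the clusterName set in the values file:
kubectl -n rook-ceph get cephcluster rook-ceph -w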
2.4 Watch the cluster come up
watch -n 5 'kubectl -n rook-ceph get pods'
This is a key step and needs patience (usually 5-15 minutes).
Expect all of the following Pods to become ready (for the 1 Master + 2 Worker layout):
NAME READY STATUS RESTARTS AGE
rook-ceph-mon-a-xxxxxxxx-yyyyy 1/1 Running 0 5m
rook-ceph-mon-b-xxxxxxxx-yyyyy 1/1 Running 0 5m
rook-ceph-mgr-a-xxxxxxxx-yyyyy 1/1 Running 0 4m
rook-ceph-mgr-b-xxxxxxxx-yyyyy 1/1 Running 0 4m
rook-ceph-osd-0-xxxxxxxx-yyyyy 1/1 Running 0 6m # on worker1
rook-ceph-osd-1-xxxxxxxx-yyyyy 1/1 Running 0 6m # on worker2
rook-ceph-tools-xxxxxxxx-yyyyy 1/1 Running 0 3m
- 2 Monitor Pods: rook-ceph-mon-* Running (spread across the two worker nodes)
- 2 Manager Pods: rook-ceph-mgr-* Running
- 2 OSD Pods: rook-ceph-osd-* Running (one on each worker node; the wide listing below confirms the placement)
- 1 Tools Pod: rook-ceph-tools-* Running
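A wide listing of the OSD Pods confirms that one OSD ended up on each worker:
kubectl -n rook-ceph get pods -l app=rook-ceph-osd -o wide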
✅ Stage 3: Verification and Access
3.1 Verify the cluster health
Once the cluster Pods are ready, check the health through the toolbox:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
What success looks like:
cluster:
id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
health: HEALTH_OK
services:
mon: 2 daemons, quorum a,b (age 5m)
mgr: a(active, since 5m), standbys: b
osd: 2 osds: 2 up, 2 in
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 2.0 GiB used, 198 GiB / 200 GiB avail
pgs: 1 active+clean
Things to check (two extra toolbox views follow below):
- health: HEALTH_OK (it may show HEALTH_WARN at first and should settle after a few minutes)
- osd: 2 osds: 2 up, 2 in
- mon: 2 daemons
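The OSD tree shows which host each OSD lives on, and ceph df shows how the raw capacity maps onto the pools:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df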
3.2 Test the StorageClasses
Create a test PVC to verify that dynamic provisioning works:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: test-pvc
spec:
storageClassName: rook-ceph-block
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
EOF
# Check the PVC status
kubectl get pvc test-pvc
Expected output: STATUS should be Bound (a CephFS RWX test is sketched below)
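The CephFS StorageClass can be exercised the same way. The sketch below (the PVC and Pod names are arbitrary) requests an RWX volume from rook-cephfs and mounts it in a throwaway busybox Pod, which verifies provisioning and attachment end to end:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-cephfs-pvc
spec:
  storageClassName: rook-cephfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-cephfs-pod
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo hello > /data/hello && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-cephfs-pvc
EOF
# Both should succeed: the PVC binds and the Pod runs
kubectl get pvc test-cephfs-pvc
kubectl get pod test-cephfs-pod
# Clean up the test resources afterwards
kubectl delete pod test-cephfs-pod && kubectl delete pvc test-cephfs-pvc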
3.3 Access the Ceph Dashboard
# Look up the dashboard Service (ClusterIP by default; with this values file it is published through the Ingress configured above)
kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard
# Get the login password
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
How to access it:
- Browse to https://<the hostname configured in the Ingress>, or https://<any-node-IP>:<NodePort> if you exposed the dashboard via a NodePort Service (a port-forward alternative is sketched below)
- Username: admin
- Password: the output of the command above
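If neither an Ingress nor a NodePort is in place yet, a temporary port-forward is enough for a first login (the local port 8443 is arbitrary):
kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 8443:8443
# then browse to https://localhost:8443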
⚠️ Key Considerations
Security
- This guide focuses on getting the functionality deployed
- For production, be sure to add network security policies
- Consider a more restricted location for dataDirHostPath
- Keep the Dashboard password safe
Resource requirements (for the 1 Master + 2 Worker layout)
- Master node: runs the control plane only; no extra resources needed
- Worker nodes: each hosts several Ceph daemons
- worker1: typically 1 Monitor + 1 Manager + 1 OSD
- worker2: typically 1 Monitor + 1 Manager + 1 OSD
- Totals: 2 Mons, 2 Mgrs and 2 OSDs
- Recommended: at least 4 GB of free memory and 2 CPU cores per worker node
Troubleshooting
If a Pod stays unhealthy for a long time, investigate with:
# Show detailed Pod information
kubectl -n rook-ceph describe pod <pod-name>
# Show the Pod logs
kubectl -n rook-ceph logs <pod-name>
# Show the CephCluster resource status
kubectl -n rook-ceph get cephcluster -o yaml
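Two further views are frequently useful: recent events in the namespace, and the logs of the osd-prepare jobs, which explain why a disk was skipped (existing filesystem, device too small, and so on):
# Events often reveal scheduling or image-pull problems
kubectl -n rook-ceph get events --sort-by=.lastTimestamp | tail -n 20
# The osd-prepare jobs log why a device was or was not turned into an OSD
kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare --tail=100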
📋 Quick Checklists
Pre-deployment checks (1 Master + 2 Worker)
- All 3 nodes are Ready and the two worker hostnames are known
- Each worker node has an empty, unformatted /dev/sdb
- Helm 3.x is available; lvm2 is installed on the worker nodes
Deployment checks
- The Operator Pod is Running
- Both worker nodes carry the node-role.kubernetes.io/worker=true label
- The node and device names in rook-ceph-cluster-values.yaml match the real cluster
Verification checks
- ceph status reports HEALTH_OK with 2 mons, 2 mgrs and 2 OSDs
- A test PVC on rook-ceph-block (and rook-cephfs) reaches Bound
- The Dashboard is reachable and the admin password works
🔄 Uninstall Procedure
To uninstall Rook Ceph, work through the steps below in order:
Part 1: Important safety warning
#!/bin/bash
# ============================================================================
# Rook Ceph cluster safe uninstall script
# WARNING: this permanently deletes ALL data in the Ceph cluster and cannot be undone!
# Before running, make sure that:
# 1. All important data has been backed up
# 2. Every application that uses Rook storage has been stopped
# 3. You are authorized to run this against the target environment
# ============================================================================
Part 2: Uninstall script
Cleanup order recommended by Rook upstream:
application PVs/PVCs → CephCluster (with cleanupPolicy) → Operator → CRDs → namespace → on-node data
2.1 Define variables (adjust to your environment)
# ========================== configurable variables ==========================
NAMESPACE="rook-ceph"
CLUSTER_NAME="rook-ceph"
# If your Helm release names differ from the two below, change them accordingly:
HELM_RELEASE_CLUSTER="${NAMESPACE}-cluster"
HELM_RELEASE_OPERATOR="${NAMESPACE}-operator"
# If you changed dataDirHostPath in the cluster CR, set the matching path here; the default is /var/lib/rook
DATA_DIR_HOSTPATH_BASE="/var/lib/rook"
# ========================== configurable variables ==========================
2.2 Clean up application resources (PVs / PVCs)
echo "=== 检查是否还有使用 Rook 存储的 PVC / PV ==="
PVC_COUNT=$(kubectl get pvc -A 2>/dev/null | grep -c -E "${NAMESPACE}|rook" || true)
PV_COUNT=$(kubectl get pv 2>/dev/null | grep -c -E "${NAMESPACE}|rook" || true)
echo "发现 $PVC_COUNT 个相关PVC, $PV_COUNT 个相关PV"
if [ "$PVC_COUNT" -gt 0 ] || [ "$PV_COUNT" -gt 0 ]; then
echo "=== 详细列表 ==="
[ "$PVC_COUNT" -gt 0 ] && kubectl get pvc -A | grep -E "${NAMESPACE}|rook"
[ "$PV_COUNT" -gt 0 ] && kubectl get pv | grep -E "${NAMESPACE}|rook"
echo ""
echo "❌ 警告: 发现未清理的存储资源!正在自动清理..."
# --- 自动删除 PVC ---
if [ "$PVC_COUNT" -gt 0 ]; then
echo "1. 删除所有 rook 相关 PVC..."
kubectl get pvc -A --no-headers 2>/dev/null | \
grep -E "${NAMESPACE}|rook" | \
while read -r line; do
ns=$(echo "$line" | awk '{print $1}')
name=$(echo "$line" | awk '{print $2}')
echo " → kubectl delete pvc $name -n $ns"
kubectl delete pvc "$name" -n "$ns" --ignore-not-found
done
fi
# --- 自动删除 PV ---
if [ "$PV_COUNT" -gt 0 ]; then
echo "2. 删除所有 rook 相关 PV..."
kubectl get pv --no-headers 2>/dev/null | \
grep -E "${NAMESPACE}|rook" | \
while read -r line; do
pv_name=$(echo "$line" | awk '{print $1}')
echo " → kubectl delete pv $pv_name"
kubectl delete pv "$pv_name" --ignore-not-found
done
fi
sleep 2
echo "✅ 自动清理完成"
fi
echo ""
echo "✅ 存储资源检查通过"
echo ""
2.3 Delete the CephCluster
echo "=== Setting cleanupPolicy on the CephCluster ==="
kubectl -n "${NAMESPACE}" patch cephcluster "${CLUSTER_NAME}" \
  --type=merge \
  -p '{"spec":{"cleanupPolicy":{"confirmation":"yes-really-destroy-data"}}}'
if [ $? -eq 0 ]; then
  echo "✅ cleanupPolicy set"
  echo "Note: the CephCluster can now be deleted safely; its data will be cleaned up automatically"
else
  echo "❌ Failed to set cleanupPolicy"
  echo "Possible cause: the CephCluster does not exist or the name is wrong"
  kubectl -n "${NAMESPACE}" get cephcluster
  exit 1
fi
echo ""
echo "=== Deleting the CephCluster CR ==="
kubectl -n "${NAMESPACE}" delete cephcluster "${CLUSTER_NAME}" --wait=false
echo ""
echo "=== Waiting for the CephCluster to be deleted ==="
echo "Watching for up to 300 seconds (Ctrl+C only stops the watch, not the following steps)"
timeout 300 kubectl -n "${NAMESPACE}" get cephcluster --watch 2>/dev/null || true
# Check whether it is really gone
echo ""
echo "=== Confirming the CephCluster deletion ==="
if kubectl -n "${NAMESPACE}" get cephcluster "${CLUSTER_NAME}" 2>/dev/null; then
  echo "❌ The CephCluster still exists and may be stuck in Deleting"
  echo "You can wait for the automatic cleanup to finish, or fall back to forcible cleanup"
else
  echo "✅ CephCluster deleted"
fi
2.4 Check for and force-clean leftover Ceph CRs
echo ""
echo "=== Checking for and force-cleaning leftover Ceph CRs ==="
# First, list the state of all Ceph resources
echo "1. Checking the current Ceph resources..."
CEPH_RESOURCES=$(kubectl get cephcluster,cephfilesystem,cephblockpool,cephobjectstore,cephnfs,cephrbdmirror -n "${NAMESPACE}" 2>&1 || true)
# Corrected detection logic
if echo "$CEPH_RESOURCES" | grep -q -E "(No resources found|not found)" || [ -z "$(echo "$CEPH_RESOURCES" | grep -v "^NAME")" ]; then
  echo "✅ All Ceph CRs are already gone"
else
  echo "Leftover Ceph resources found:"
  echo "$CEPH_RESOURCES"
  echo ""
  echo "2. Analysing their deletion state..."
  # Fetch all resources as JSON
  RESOURCE_JSON=$(kubectl get cephcluster,cephfilesystem,cephblockpool,cephobjectstore -n "${NAMESPACE}" -o json 2>/dev/null || echo '{"items":[]}')
  # Resources with a deletionTimestamp are being deleted but may be stuck
  DELETING_RESOURCES=$(echo "$RESOURCE_JSON" | \
    jq -r '.items[] | select(.metadata.deletionTimestamp != null) | "\(.kind)/\(.metadata.name)"' 2>/dev/null || true)
  if [ -n "$DELETING_RESOURCES" ] && [ "$DELETING_RESOURCES" != "" ]; then
    echo "The following resources are stuck in deletion:"
    echo "$DELETING_RESOURCES"
    echo ""
    echo "3. Removing finalizers from the stuck resources..."
    for resource in $DELETING_RESOURCES; do
      echo "  cleaning: $resource"
      kubectl patch $resource -n "${NAMESPACE}" \
        -p '{"metadata":{"finalizers":[]}}' \
        --type=merge 2>/dev/null || true
    done
  else
    echo "3. No resources are currently being deleted"
  fi
  # Resources whose deletion has not started yet
  NOT_DELETING_RESOURCES=$(echo "$RESOURCE_JSON" | \
    jq -r '.items[] | select(.metadata.deletionTimestamp == null) | "\(.kind)/\(.metadata.name)"' 2>/dev/null || true)
  if [ -n "$NOT_DELETING_RESOURCES" ] && [ "$NOT_DELETING_RESOURCES" != "" ]; then
    echo "The following resources have not started deleting yet:"
    echo "$NOT_DELETING_RESOURCES"
    echo ""
    echo "4. Deleting them first..."
    for resource in $NOT_DELETING_RESOURCES; do
      echo "  deleting: $resource"
      kubectl delete $resource -n "${NAMESPACE}" --wait=false 2>/dev/null || true
    done
    echo "Waiting 5 seconds for deletion to start..."
    sleep 5
    echo "5. Checking for stuck resources and clearing finalizers..."
    for resource in $NOT_DELETING_RESOURCES; do
      # Only touch resources that still exist
      if kubectl get $resource -n "${NAMESPACE}" 2>/dev/null | grep -q "Terminating" || \
         kubectl get $resource -n "${NAMESPACE}" 2>/dev/null | grep -q "Deleting"; then
        echo "  $resource is stuck, clearing its finalizers..."
        kubectl patch $resource -n "${NAMESPACE}" \
          -p '{"metadata":{"finalizers":[]}}' \
          --type=merge 2>/dev/null || true
      fi
    done
  else
    echo "4. No resources are waiting to be deleted"
  fi
  echo ""
  echo "6. Final check..."
  sleep 3
  # Corrected final-check logic
  FINAL_OUTPUT=$(kubectl get cephcluster,cephfilesystem,cephblockpool,cephobjectstore -n "${NAMESPACE}" 2>&1 || true)
  # Count the effective lines (ignore the header line and error messages)
  if echo "$FINAL_OUTPUT" | grep -q -E "(No resources found|not found)"; then
    FINAL_COUNT=0
  else
    # Count non-header lines
    FINAL_COUNT=$(echo "$FINAL_OUTPUT" | grep -v "^NAME" | grep -v "^error" | grep -c -E "^[a-zA-Z]" || true)
  fi
  if [ "$FINAL_COUNT" -eq 0 ]; then
    echo "✅ Ceph resources cleaned up"
  else
    echo "⚠️ $FINAL_COUNT resources still remain:"
    echo "$FINAL_OUTPUT"
    echo ""
    read -p "Force-continue anyway? (type 'FORCE-CONTINUE' to confirm): " CONFIRM
    if [[ "$CONFIRM" = "FORCE-CONTINUE" ]]; then
      echo "Continuing with the remaining steps..."
    else
      echo "❌ Aborted by user"
      exit 1
    fi
  fi
fi
echo ""
2.5 Uninstall the Helm releases
echo ""
echo "=== Uninstalling the cluster Helm release ==="
if helm list -n "${NAMESPACE}" | grep -q "${HELM_RELEASE_CLUSTER}"; then
  helm uninstall "${HELM_RELEASE_CLUSTER}" -n "${NAMESPACE}" --wait
  echo "✅ Cluster release uninstalled"
else
  echo "ℹ️ Cluster release not found (perhaps already uninstalled)"
fi
sleep 10
echo ""
echo "=== Uninstalling the operator Helm release ==="
if helm list -n "${NAMESPACE}" | grep -q "${HELM_RELEASE_OPERATOR}"; then
  helm uninstall "${HELM_RELEASE_OPERATOR}" -n "${NAMESPACE}" --wait
  echo "✅ Operator release uninstalled"
else
  echo "ℹ️ Operator release not found (perhaps already uninstalled)"
fi
echo ""
echo "=== Waiting for operator-related resources to be cleaned up ==="
sleep 30
echo ""
echo "=== Deleting the Rook Ceph CRDs ==="
CRD_COUNT=$(kubectl get crds 2>/dev/null | grep -c '\.ceph\.rook\.io' || true)
if [ "$CRD_COUNT" -gt 0 ]; then
  echo "Found $CRD_COUNT related CRDs"
  kubectl get crds | awk '/\.ceph\.rook\.io/ {print $1}' | xargs -r kubectl delete crd --wait=false
  echo "✅ CRD delete commands issued"
else
  echo "ℹ️ No Rook Ceph CRDs found"
fi
echo ""
echo "=== Removing finalizers from the Rook Ceph configmap / secret ==="
kubectl -n "${NAMESPACE}" patch configmap rook-ceph-mon-endpoints --type merge -p '{"metadata":{"finalizers": []}}'
kubectl -n "${NAMESPACE}" patch secrets rook-ceph-mon --type merge -p '{"metadata":{"finalizers": []}}'
echo ""
echo "=== Deleting the namespace ==="
if kubectl get namespace "${NAMESPACE}" 2>/dev/null; then
  kubectl delete namespace "${NAMESPACE}" --wait=false
  echo "✅ Namespace delete command issued"
  # Watch the deletion
  echo "Waiting for the namespace to disappear (up to 120 seconds)..."
  for i in {1..24}; do
    if ! kubectl get namespace "${NAMESPACE}" 2>/dev/null; then
      echo "✅ Namespace deleted"
      break
    fi
    echo "waiting... ($((i*5))s)"
    sleep 5
  done
else
  echo "ℹ️ Namespace ${NAMESPACE} does not exist"
fi
Part 3: Clean up leftover data on the nodes (hardened version)
3.1 Safety warning
echo ""
echo "========================================================"
echo "           Node data cleanup (manual steps)             "
echo "========================================================"
echo "Note: the following steps must be run by hand on every node that ever ran Rook"
echo "It is recommended to validate them on one test node before rolling out to all nodes"
echo "========================================================"
echo ""
3.2 Clean up the dataDirHostPath directory (safe version)
# Save as a standalone script and copy it to every node
cat > /tmp/cleanup_rook_node.sh << 'EOF'
#!/bin/bash
set -e
# Configuration
DATA_DIR_HOSTPATH_BASE="/var/lib/rook"
DATA_DIR="${DATA_DIR_HOSTPATH_BASE}"
echo "=== Node cleanup script starting ==="
echo "Hostname: $(hostname)"
echo "Current user: $(whoami)"
echo ""
# Check whether the directory exists
if [ -d "${DATA_DIR}" ]; then
  echo "Found Rook data directory: ${DATA_DIR}"
  echo "Contents:"
  ls -la "${DATA_DIR}" || true
  echo ""
  # Confirm the deletion
  read -p "Delete this directory and everything in it? (type 'DELETE-NOW' to confirm): " NODE_CONFIRM
  if [[ "$NODE_CONFIRM" = "DELETE-NOW" ]]; then
    echo "Deleting ${DATA_DIR} ..."
    sudo rm -rf "${DATA_DIR}"
    echo "✅ Directory removed"
  else
    echo "❌ Skipped directory removal"
  fi
else
  echo "ℹ️ Directory ${DATA_DIR} not found"
fi
EOF
chmod +x /tmp/cleanup_rook_node.sh
echo "Node cleanup script written to /tmp/cleanup_rook_node.sh"
echo "Copy this script to every node and run it there"
echo ""
3.3 Clean up leftover device mappings (safe version)
cat >> /tmp/cleanup_rook_node.sh << 'EOF'
echo ""
echo "=== Cleaning up leftover device mappings ==="
# Check for and safely remove /dev/mapper/ceph-* entries
CEPH_MAPPER_COUNT=$(ls /dev/mapper/ceph-* 2>/dev/null | wc -l || echo "0")
if [ "$CEPH_MAPPER_COUNT" -gt 0 ]; then
  echo "Found $CEPH_MAPPER_COUNT Ceph device mappings"
  echo "List:"
  ls -l /dev/mapper/ceph-* 2>/dev/null || true
  read -p "Remove these device mappings? (type 'REMOVE-MAPPER' to confirm): " MAPPER_CONFIRM
  if [[ "$MAPPER_CONFIRM" = "REMOVE-MAPPER" ]]; then
    for device in /dev/mapper/ceph-*; do
      if [ -e "$device" ]; then
        echo "  removing: $device"
        sudo dmsetup remove "$(basename $device)" 2>/dev/null || \
          echo "  warning: removal failed (it may already be gone)"
      fi
    done
    echo "✅ Device-mapping cleanup finished"
  fi
else
  echo "ℹ️ No Ceph device mappings found"
fi
# Clean up leftover symlinks
echo ""
echo "=== Cleaning up leftover symlinks ==="
for dir in /dev/ceph-* /dev/mapper/ceph--*; do
  if [ -e "$dir" ]; then
    echo "deleting: $dir"
    sudo rm -rf "$dir" 2>/dev/null || true
  fi
done
EOF
chmod +x /tmp/cleanup_rook_node.sh
echo "Device-mapping cleanup appended to /tmp/cleanup_rook_node.sh"
echo "Copy this script to every node and run it there"
echo ""
3.4 Zapping devices (optional, safe version)
echo "Appending the disk-zapping section to the node cleanup script..."
cat >> /tmp/cleanup_rook_node.sh << 'EOF'
echo ""
echo "=== Disk zapping (optional, for disk reuse) ==="
echo "⚠️ Note: the steps below wipe ALL data on the selected disk!"
echo ""
echo "Current disks:"
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE,MODEL | grep -v "loop"
echo ""
# Use -t 0 to check whether we are attached to an interactive terminal
if [ -t 0 ]; then
  read -p "Do you need to zap a specific disk so it can be reused? (yes/NO): " ZAP_CONFIRM
else
  echo "Not an interactive terminal, skipping disk zapping"
  ZAP_CONFIRM="no"
fi
if [[ "$ZAP_CONFIRM" = "yes" ]]; then
  echo "Enter the path of the disk to zap (e.g. /dev/sdb): "
  read DISK
  if [ ! -b "$DISK" ]; then
    echo "❌ Error: $DISK is not a valid block device"
    exit 1
  fi
  echo ""
  echo "⚠️ About to zap disk: $DISK"
  echo "Disk info:"
  sudo fdisk -l "$DISK" | head -20
  read -p "Really wipe ALL data on this disk? (type 'WIPE-DISK' to confirm): " WIPE_CONFIRM
  if [[ "$WIPE_CONFIRM" = "WIPE-DISK" ]]; then
    echo "Step 1: wiping filesystem signatures..."
    sudo wipefs -a "$DISK" 2>/dev/null || true
    echo "Step 2: zapping the partition table..."
    sudo sgdisk --zap-all "$DISK" 2>/dev/null || true
    echo "Step 3: clearing possible leftover LVM/Ceph metadata (upstream-documented method)..."
    # Clear the start of the disk (possible MBR / partition-table remnants)
    sudo dd if=/dev/zero of="$DISK" bs=1K count=200 oflag=direct,dsync seek=0 2>/dev/null || true
    # Clear possible LVM metadata at the 1 GB offset
    sudo dd if=/dev/zero of="$DISK" bs=1K count=200 oflag=direct,dsync seek=$((1 * 1024**2)) 2>/dev/null || true
    # Clear possible LVM metadata at the 10 GB offset
    sudo dd if=/dev/zero of="$DISK" bs=1K count=200 oflag=direct,dsync seek=$((10 * 1024**2)) 2>/dev/null || true
    # Clear possible LVM metadata at the 100 GB offset
    sudo dd if=/dev/zero of="$DISK" bs=1K count=200 oflag=direct,dsync seek=$((100 * 1024**2)) 2>/dev/null || true
    # Clear possible LVM metadata at the 1000 GB offset
    sudo dd if=/dev/zero of="$DISK" bs=1K count=200 oflag=direct,dsync seek=$((1000 * 1024**2)) 2>/dev/null || true
    echo "Step 4: trying an SSD discard..."
    sudo blkdiscard "$DISK" 2>/dev/null || echo "blkdiscard not supported (normal for HDDs)"
    echo "Step 5: re-reading the partition table..."
    sudo partprobe "$DISK" 2>/dev/null || true
    echo "✅ Disk zapping finished"
    echo "State after zapping:"
    sudo fdisk -l "$DISK" 2>/dev/null | head -5 || true
  else
    echo "❌ Disk zapping cancelled"
  fi
else
  echo "ℹ️ Skipping disk zapping"
fi
echo ""
echo "=== Node cleanup finished ==="
echo "Tip: for a fully clean slate, reboot the node so any remaining kernel state is released"
EOF
# Make it executable
chmod +x /tmp/cleanup_rook_node.sh
echo "Disk-zapping section appended to the script"
echo "✅ Execute permission set"
echo ""
echo "========================================================"
echo "                Uninstall script summary                "
echo "========================================================"
echo "✅ Parts 1-2: in-cluster Kubernetes resources cleaned up"
echo "📋 Part 3: node data cleanup must be done manually"
echo ""
echo "Next steps:"
echo "1. Copy /tmp/cleanup_rook_node.sh to every node"
echo "2. On each node run: sudo bash /tmp/cleanup_rook_node.sh"
echo "3. Confirm each action when prompted"
echo "4. Reboot all nodes once finished (recommended)"
echo "========================================================"
Tip: if the disk is still reported as busy, rebooting the node usually releases the device-mapper locks.
3.5 Optional: zero the entire disk (very thorough, but very slow)
# Extremely slow; only use this if the steps above still do not make ceph-volume raw list treat the disk as an empty device
# sudo dd if=/dev/zero of="$DISK" bs=1M status=progress
# sync
Part 4: Verify the uninstall is complete
# 1. Check that the namespace is gone
kubectl get namespace "${NAMESPACE}"
# 2. Check that no related CRDs remain
kubectl get crds | grep ceph
# 3. Check each node for leftover files
echo "=== Checking the node for leftover files ==="
ls -la /var/lib/rook 2>/dev/null || echo "no /var/lib/rook directory"
ls -la /dev/mapper/ceph-* 2>/dev/null || echo "no /dev/mapper/ceph-* devices"
ls -la /dev/ceph-* 2>/dev/null || echo "no /dev/ceph-* devices"
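On the OSD nodes it is also worth confirming that no Ceph LVM volumes survived (ceph-volume creates each OSD as a logical volume inside a volume group named ceph-<uuid>), and that Helm no longer lists any Rook releases; these are plain lvm2 and helm commands:
sudo vgs | grep ceph || echo "no ceph volume groups"
sudo lvs | grep osd || echo "no osd logical volumes"
helm list -n rook-ceph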
📞 Support
If you hit a problem during deployment, collect the following to make diagnosis easier:
- The error message and command output
- The cluster state: kubectl -n rook-ceph get all
- Pod details: kubectl -n rook-ceph describe pod <pod-name>
- The configuration files used (with sensitive values redacted)
📝 Version Information
Document version: v1.0
Last updated: 2026
Applies to:
- Kubernetes: v1.28-v1.33 (v1.28+ recommended)
- Architecture: 1 Master + 2 Worker
- Rook Ceph: v1.17.6
- Helm: 3.x
