Rook Cloud-Native Storage
1:Kubernetes Data Persistence
1:volume: requires knowing the details of the backend storage
2:PV/PVC: an administrator creates/defines the PV; users consume the PV's storage through a PVC
3:storageClass: static + dynamic; capacity is requested through a PVC, and the PV plus the wiring to the backend storage driver are created automatically
1.1:Volumes
1:Data a container depends on at startup
1:configmap
2:secret
2:Temporary data storage
1:emptyDir
2:hostPath
3:Persistent data storage
1:nfs
2:cephfs
3:GlusterFS (the in-tree driver has been removed in newer versions)
4:Cloud Storage
......
Summary: this approach has no standalone resource object; the volume shares the Pod's lifecycle
1.2:PV/PVC
Creating a PV essentially turns the backend storage into a Kubernetes resource, so that the cluster can talk to the storage itself. That is what a PV does: an NFS PV, for example, connects to the NFS service and manages details such as the access modes.
A PVC then claims storage against a PV: if an NFS PV offers 100G, a PVC might request 10G of it, and the claim also declares the access mode it needs (read-only, read-write, and so on). Note that a given PV can be bound by only one PVC at a time.
1.3:StorageClass
By declaring a storage provisioner, the StorageClass handles the integration with the backend storage for you, including creating the PV. All of this is driven by the StorageClass; we only need to reference its name in our objects.
1.4:Volume Integration
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
  - name: html
    nfs:
      server: 10.0.0.16
      path: /data
1.5:PV/PVC Integration
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-storage
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: 10.0.0.16
    path: /data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx
spec:
  volumeName: nfs-storage
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
  - name: html
    persistentVolumeClaim:
      claimName: nginx
1.6:StorageClass Integration
# Static PVC (kubernetes.io/no-provisioner, no dynamic provisioning)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: local
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx
spec:
  storageClassName: local-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    volumeMounts:
    - name: html
      mountPath: /usr/share/nginx/html
  volumes:
  - name: html
    persistentVolumeClaim:
      claimName: nginx
# Dynamically created PVCs via volumeClaimTemplates
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  replicas: 3
  serviceName: mysql
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: password
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-persistent-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: local-storage
      resources:
        requests:
          storage: 1Gi
2:Getting Started with Rook
2.1:Ways to Deploy Ceph
1:ceph-deploy
2:cephadm
3:Manual deployment
4:Rook
2.2:What is Rook
Official site: https://rook.io
Rook is a storage solution built specifically for cloud-native environments, and in particular for Kubernetes. It covers file, block, and object storage. You can think of Rook as a storage orchestration engine: it integrates with storage backends, the most common being Ceph, and provides the drivers Kubernetes needs to consume Ceph. That is Rook's core. It works through an Operator, and anyone familiar with Operators knows how convenient that model is; Rook uses it to take the pain out of managing Ceph for us.
2.3:Storage Types Supported by Rook
1:Ceph
2:NFS
3:Cassandra
......
2.4:Rook Features
1:Simple, reliable, automated resource management
2:Hyper-converged storage solution
3:Efficient data distribution and placement to keep data available
4:Compatible with multiple storage types
5:Manages multiple storage solutions
6:Easily enables elastic storage in your data center
7:Open-source software released under the Apache 2.0 license
8:Runs on commodity hardware
2.5:Rook with Kubernetes
1:Rook bootstraps and manages the Ceph cluster
1:monitor cluster
2:mgr cluster
3:osd cluster
4:pool management
5:object storage
6:file storage
7:monitoring and maintaining cluster health
2:Rook provides the drivers needed to access the storage
1:Flex driver (legacy, not recommended)
2:CSI driver
3:RBD block storage
4:CephFS file storage
5:S3/Swift-style object storage
2.6:Rook Architecture

1:All of the objects below run on top of the Kubernetes cluster
1:MON
2:RGW
3:MDS
4:MGR
5:OSD
6:Agent
1:csi-rbdplugin
2:csi-cephfsplugin
2:Abstractions managed by Rook
1:pool
2:volumes
3:filesystems
4:buckets
3:Deploying Rook
3.1:Environment
| Node | IP | K8s Role | Device | OS | K8s Version |
|---|---|---|---|---|---|
| cce-kubernetes-master-1 | 10.0.0.16 | master | /dev/sdb | CentOS7.9 | 1.24.3 |
| cce-kubernetes-worker-1 | 10.0.0.17 | worker | /dev/sdb | CentOS7.9 | 1.24.3 |
3.2:Deploying Rook
Prerequisites
1:A running Kubernetes cluster, v1.11+
2:OSD nodes need raw disks with no filesystem on them
3:Basic Linux knowledge
4:Basic Ceph knowledge
1:mon
2:mds
3:rgw
4:osd
5:Basic Kubernetes knowledge
1:Node, Pod
2:Deployment, Services, StatefulSet
3:Volume, PV, PVC, StorageClass
3.3:Fetch the Source and Install
# If you want the master node to participate, remove its taints first, otherwise it cannot host Ceph components; alternatively, configure tolerations for those taints when creating the cluster
[root@cce-kubernetes-master-1 ~]# git clone --single-branch --branch v1.10.12 https://github.com/rook/rook.git
[root@cce-kubernetes-master-1 ~]# cd rook/deploy/examples
[root@cce-kubernetes-master-1 examples]# kubectl create -f crds.yaml -f common.yaml -f operator.yaml
[root@cce-kubernetes-master-1 examples]# kubectl create -f cluster.yaml
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-h4csc 2/2 Running 0 17m
csi-cephfsplugin-provisioner-7c594f8cf-97fbd 5/5 Running 0 17m
csi-cephfsplugin-provisioner-7c594f8cf-fprzt 5/5 Running 0 17m
csi-cephfsplugin-z5bs4 2/2 Running 0 17m
csi-rbdplugin-6xtzq 2/2 Running 0 17m
csi-rbdplugin-85k2n 2/2 Running 0 17m
csi-rbdplugin-provisioner-99dd6c4c6-7q9rr 5/5 Running 0 17m
csi-rbdplugin-provisioner-99dd6c4c6-qrmtx 5/5 Running 0 17m
rook-ceph-crashcollector-cce-kubernetes-master-1-6b7c4c7885xbbw 1/1 Running 0 3m22s
rook-ceph-crashcollector-cce-kubernetes-worker-1-665df7874tgtwg 1/1 Running 0 3m23s
rook-ceph-mgr-a-f8b75cdb4-vs5mv 3/3 Running 0 7m28s
rook-ceph-mgr-b-6957b9b96f-xsbqc 3/3 Running 0 7m28s
rook-ceph-mon-a-5749c8659b-jn2cd 2/2 Running 0 13m
rook-ceph-mon-b-74d87b779f-8496k 2/2 Running 0 12m
rook-ceph-mon-c-7d7747cb6b-h65tv 2/2 Running 0 12m
rook-ceph-operator-644954fb4b-lfqnn 1/1 Running 0 92m
rook-ceph-osd-0-98876657-mqqlg 2/2 Running 0 3m23s
rook-ceph-osd-1-5c7675c47b-hhn58 2/2 Running 0 3m22s
rook-ceph-osd-prepare-cce-kubernetes-master-1-rrcpb 0/1 Completed 0 2m57s
rook-ceph-osd-prepare-cce-kubernetes-worker-1-jr5f6 0/1 Completed 0 2m54s
# If the images cannot be pulled, point them at a local mirror/registry or go through a proxy.
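If pulling the upstream images is not possible, one possible workaround (a sketch, not taken from this deployment) is to override the CSI image settings in operator.yaml so they point at a registry you can reach. The keys below are the ones used by the rook-ceph-operator-config settings in Rook v1.10, but verify them against your copy of operator.yaml; the registry name and tags are placeholders.
# Example overrides in operator.yaml (values are illustrative placeholders):
#   ROOK_CSI_CEPH_IMAGE: "registry.example.com/cephcsi/cephcsi:<tag-from-operator.yaml>"
#   ROOK_CSI_REGISTRAR_IMAGE: "registry.example.com/sig-storage/csi-node-driver-registrar:<tag>"
#   ROOK_CSI_PROVISIONER_IMAGE: "registry.example.com/sig-storage/csi-provisioner:<tag>"
#   ROOK_CSI_ATTACHER_IMAGE: "registry.example.com/sig-storage/csi-attacher:<tag>"
#   ROOK_CSI_RESIZER_IMAGE: "registry.example.com/sig-storage/csi-resizer:<tag>"
#   ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.example.com/sig-storage/csi-snapshotter:<tag>"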
4:Ceph Cluster Management
4.1:Ceph Resource Objects
Ceph consists of the following components:
1:mon: monitor, manages the cluster
2:mgr: manager, monitoring and management
3:mds: CephFS metadata management
4:rgw: object storage gateway
5:osd: storage daemons
4.2:Deploying the Toolbox
The toolbox is essentially a Ceph client; deploying it gives us a place to manage Ceph from.
[root@cce-kubernetes-master-1 ~]# cd /root/rook/deploy/examples
[root@cce-kubernetes-master-1 examples]# kubectl apply -f toolbox.yaml
deployment.apps/rook-ceph-tools created
# I switched clusters partway through: I was on CentOS 7, but the latest Ceph no longer ships client packages for it, so I moved to a release 9 based system. Not a big deal, it does not affect anything here.
[root@cce-kubernetes-master-1 ~]# kubectl exec -it -n rook-ceph rook-ceph-tools-7857bc9568-m5rwl ceph status
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
cluster:
id: 60586a3f-6348-4bac-bd6b-868ae18a4dbe
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 11m)
mgr: a(active, since 10m), standbys: b
osd: 3 osds: 3 up (since 10m), 3 in (since 11m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 577 KiB
usage: 63 MiB used, 150 GiB / 150 GiB avail
pgs: 1 active+clean
4.3:Accessing Ceph from K8s Nodes
[root@cce-kubernetes-master-1 ~]# kubectl exec -it -n rook-ceph rook-ceph-tools-7857bc9568-m5rwl cat /etc/ceph/ceph.conf
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[global]
mon_host = 10.96.3.63:6789,10.96.0.138:6789,10.96.3.71:6789
[client.admin]
keyring = /etc/ceph/keyring
[root@cce-kubernetes-master-1 ~]# kubectl exec -it -n rook-ceph rook-ceph-tools-7857bc9568-m5rwl cat /etc/ceph/keyring
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[client.admin]
key = AQBiNyhk46mRCRAA5SSOvLmSeapr55zjm/J3yA==
# Copy these settings as-is to any node that needs to access the Ceph cluster
# This is what a Ceph client configuration looks like; from Kubernetes we basically just talk to the monitor Services' IPs and ports.
[root@cce-kubernetes-master-1 ~]# kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr ClusterIP 10.96.0.155 <none> 9283/TCP 7h6m
rook-ceph-mgr-dashboard ClusterIP 10.96.2.106 <none> 8443/TCP 7h6m
rook-ceph-mon-a ClusterIP 10.96.1.106 <none> 6789/TCP,3300/TCP 7h12m
rook-ceph-mon-b ClusterIP 10.96.1.209 <none> 6789/TCP,3300/TCP 7h12m
rook-ceph-mon-c ClusterIP 10.96.2.234 <none> 6789/TCP,3300/TCP 7h11m
# The configuration can go on any Ceph client, as long as the Ceph package repository is configured
[root@cce-kubernetes-master-1 ~]# cat /etc/yum.repos.d/ceph.repo
[ceph]
name=ceph
baseurl=https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-17.2.5/el8/x86_64/
enabled=1
gpgcheck=0
[root@cce-kubernetes-master-1 ~]# yum install -y ceph-common.x86_64
[root@cce-kubernetes-master-1 ~]# ceph status
cluster:
id: 60586a3f-6348-4bac-bd6b-868ae18a4dbe
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 12m)
mgr: a(active, since 11m), standbys: b
osd: 3 osds: 3 up (since 11m), 3 in (since 11m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 577 KiB
usage: 63 MiB used, 150 GiB / 150 GiB avail
pgs: 1 active+clean
4.4:Using RBD
# Create a pool
[root@cce-kubernetes-master-1 ~]# ceph osd pool create rook 16 16
pool 'rook' created
# List the pools
[root@cce-kubernetes-master-1 ~]# ceph osd lspools
1 .mgr
2 rook
# Create an RBD image
[root@cce-kubernetes-master-1 ~]# rbd create -p rook --image rook-rbd.img --size 1G
# List RBD images
[root@cce-kubernetes-master-1 ~]# rbd ls -p rook
rook-rbd.img
# Image details
[root@cce-kubernetes-master-1 ~]# rbd info rook/rook-rbd.img
rbd image 'rook-rbd.img':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 3d213141932
block_name_prefix: rbd_data.3d213141932
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags:
create_timestamp: Sat Apr 1 22:12:51 2023
access_timestamp: Sat Apr 1 22:12:51 2023
modify_timestamp: Sat Apr 1 22:12:51 2023
# Map and use the image
[root@cce-kubernetes-master-1 ~]# rbd map rook/rook-rbd.img
/dev/rbd0
[root@cce-kubernetes-master-1 ~]# rbd showmapped
id pool namespace image snap device
0 rook rook-rbd.img - /dev/rbd0
# /dev/rbd0 can now be treated like a local disk: format it and use it as usual.
[root@cce-kubernetes-master-1 ~]# mkfs.xfs /dev/rbd0
meta-data=/dev/rbd0 isize=512 agcount=8, agsize=32768 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=1 inobtcount=1
data = bsize=4096 blocks=262144, imaxpct=25
= sunit=16 swidth=16 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=16 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
[root@cce-kubernetes-master-1 ~]# mkdir /data
[root@cce-kubernetes-master-1 ~]# mount -t xfs /dev/rbd0 /data
[root@cce-kubernetes-master-1 ~]# df | grep /data
/dev/rbd0 1038336 40500 997836 4% /data
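When the test is finished, unmount and unmap the image before handing it to anything else. A small cleanup sketch (not captured from the session above):
umount /data
rbd unmap /dev/rbd0
rbd showmapped   # the mapping should no longer be listed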
5:Customizing the Rook Cluster
5.1:Placement and Scheduling Overview
Rook relies on Kubernetes' default scheduling: the Ceph components run as Pods on K8s nodes. However, nodes differ in role and hardware. For example, mds is CPU-hungry but needs little disk, while OSD nodes need plenty of disk and memory and care less about CPU. So when planning the cluster we should assign nodes according to their role.
Rook offers several placement strategies:
1:nodeAffinity: node affinity, selects suitable nodes by label.
2:podAffinity: Pod affinity, schedules Pods onto nodes that host Pods of a related kind
3:podAntiAffinity: Pod anti-affinity, schedules Pods away from nodes that host certain other Pods
4:topologySpreadConstraints: topology-aware spreading
5:tolerations: allows scheduling onto nodes that carry certain taints
Components that accept placement rules:
1:mon
2:mgr
3:osd
4:cleanup
[root@cce-kubernetes-master-1 ~]# kubectl get pod -n rook-ceph -owide | awk '{print $1,$3,$7}'
NAME STATUS NODE
csi-cephfsplugin-j2r8d Running cce-kubernetes-master-1
csi-cephfsplugin-provisioner-7c594f8cf-m6dpg Running cce-kubernetes-worker-1
csi-cephfsplugin-provisioner-7c594f8cf-rkb9p Running cce-kubernetes-worker-2
csi-cephfsplugin-sfr57 Running cce-kubernetes-worker-2
csi-cephfsplugin-x55ds Running cce-kubernetes-worker-1
csi-rbdplugin-4j7s2 Running cce-kubernetes-worker-2
csi-rbdplugin-7phpk Running cce-kubernetes-worker-1
csi-rbdplugin-ftznl Running cce-kubernetes-master-1
csi-rbdplugin-provisioner-99dd6c4c6-xknw8 Running cce-kubernetes-worker-1
csi-rbdplugin-provisioner-99dd6c4c6-z4x4s Running cce-kubernetes-worker-2
rook-ceph-crashcollector-cce-kubernetes-master-1-6b7c4c788jk2rv Running cce-kubernetes-master-1
rook-ceph-crashcollector-cce-kubernetes-worker-1-665df7874r5twr Running cce-kubernetes-worker-1
rook-ceph-crashcollector-cce-kubernetes-worker-2-854d69f5ch9v4j Running cce-kubernetes-worker-2
rook-ceph-mgr-a-5dbb7d78fb-rvwfv Running cce-kubernetes-worker-2
rook-ceph-mgr-b-7cc79bc7fc-dppfc Running cce-kubernetes-worker-1
rook-ceph-mon-a-cb6b6cfc-t8djd Running cce-kubernetes-worker-2
rook-ceph-mon-b-7bfff49bd9-csjxz Running cce-kubernetes-worker-1
rook-ceph-mon-c-6f5b6466b4-4qz24 Running cce-kubernetes-master-1
rook-ceph-operator-644954fb4b-g4jq4 Running cce-kubernetes-worker-1
rook-ceph-osd-0-6bcf58f497-sm5q5 Running cce-kubernetes-worker-2
rook-ceph-osd-1-7599b8fdd-qdq7w Running cce-kubernetes-worker-1
rook-ceph-osd-2-84cf6cbcdb-whjsn Running cce-kubernetes-master-1
rook-ceph-osd-prepare-cce-kubernetes-master-1-q64s2 Completed cce-kubernetes-master-1
rook-ceph-osd-prepare-cce-kubernetes-worker-1-xjwjk Completed cce-kubernetes-worker-1
rook-ceph-osd-prepare-cce-kubernetes-worker-2-tq7p6 Completed cce-kubernetes-worker-2
rook-ceph-tools-7857bc9568-m5rwl Running cce-kubernetes-worker-1
[root@cce-kubernetes-master-1 examples]# vim cluster.yaml
......
# placement:
# all:
# nodeAffinity:
# requiredDuringSchedulingIgnoredDuringExecution:
# nodeSelectorTerms:
# - matchExpressions:
# - key: role
# operator: In
# values:
# - storage-node
# podAffinity:
# podAntiAffinity:
# topologySpreadConstraints:
# tolerations:
# - key: storage-node
# operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
# mon:
# Monitor deployments may contain an anti-affinity rule for avoiding monitor
# collocation on the same node. This is a required rule when host network is used
# or when AllowMultiplePerNode is false. Otherwise this anti-affinity rule is a
# preferred rule with weight: 50.
# osd:
# prepareosd:
# mgr:
# cleanup:
......
5.2:Tearing Down the Rook Cluster
Reference: https://www.rook.io/docs/rook/v1.11/Getting-Started/ceph-teardown/
[root@cce-kubernetes-master-1 ~]# kubectl delete -f rook/deploy/examples/cluster.yaml
[root@cce-kubernetes-master-1 ~]# kubectl delete -f rook/deploy/examples/toolbox.yaml
[root@cce-kubernetes-master-1 ~]# kubectl delete -f rook/deploy/examples/operator.yaml
[root@cce-kubernetes-master-1 ~]# kubectl delete -f rook/deploy/examples/crds.yaml
[root@cce-kubernetes-master-1 ~]# kubectl delete -f rook/deploy/examples/common.yaml
# This can take a long time. Deleting in the proper order works fine, but here we removed everything at once, so some API dependencies disappear before their dependents and deletion gets stuck.
# Force-deleting the namespace is simple: get the namespace, export it as JSON, strip the contents of spec (the finalizers), then call the finalize API with curl; you just need kubectl proxy running on the chosen port first.
curl -k -H "Content-Type: application/json" -X PUT --data-binary @rook.json http://127.0.0.1:8888/api/v1/namespaces/rook-ceph/finalize
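Putting those steps together, a minimal sketch of the whole force-delete flow (the port 8888 and the rook.json file name follow the curl example above):
kubectl get namespace rook-ceph -o json > rook.json
# edit rook.json and remove the contents of "spec" (the finalizers), then:
kubectl proxy --port=8888 &
curl -k -H "Content-Type: application/json" -X PUT --data-binary @rook.json \
  http://127.0.0.1:8888/api/v1/namespaces/rook-ceph/finalize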
5.3:Customizing mon Scheduling
# If you run Rook in production, definitely customize these parameters so that dedicated machines carry the Ceph components; since this is all Kubernetes scheduling, the native scheduling mechanisms apply.
This uses node affinity: a mon can only be scheduled onto nodes carrying the `ceph-mon: enabled` label. If you want finer control, combine it with the other scheduling mechanisms.
...
placement:
mon:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-mon
operator: In
values:
- enabled
...
storage:
useAllNodes: false
useAllDevices: false
# Rebuild Rook with this configuration
[root@cce-kubernetes-master-1 examples]# kubectl apply -f common.yaml -f crds.yaml -f operator.yaml
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-b4b45 2/2 Running 0 2m16s
csi-cephfsplugin-cn67w 2/2 Running 0 2m16s
csi-cephfsplugin-mgn98 2/2 Running 0 2m16s
csi-cephfsplugin-provisioner-7c594f8cf-7nn78 5/5 Running 0 2m16s
csi-cephfsplugin-provisioner-7c594f8cf-k4tmd 5/5 Running 0 2m16s
csi-rbdplugin-9bx8j 2/2 Running 0 2m16s
csi-rbdplugin-gjzc5 2/2 Running 0 2m16s
csi-rbdplugin-provisioner-99dd6c4c6-97xm4 5/5 Running 0 2m16s
csi-rbdplugin-provisioner-99dd6c4c6-l5q4w 5/5 Running 0 2m16s
csi-rbdplugin-z26d7 2/2 Running 0 2m16s
rook-ceph-detect-version-6bwww 0/1 Pending 0 2m20s
rook-ceph-operator-644954fb4b-ph7jw 1/1 Running 0 2m40s
# Because of the customization, one Pod cannot be scheduled: no node carries the matching label yet. Label the chosen nodes; here I label the master.
[root@cce-kubernetes-master-1 examples]# kubectl label nodes cce-kubernetes-master-1 ceph-mon=enabled
node/cce-kubernetes-master-1 labeled
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-b4b45 2/2 Running 0 5m33s
csi-cephfsplugin-cn67w 2/2 Running 0 5m33s
csi-cephfsplugin-mgn98 2/2 Running 0 5m33s
csi-cephfsplugin-provisioner-7c594f8cf-7nn78 5/5 Running 0 5m33s
csi-cephfsplugin-provisioner-7c594f8cf-k4tmd 5/5 Running 0 5m33s
csi-rbdplugin-9bx8j 2/2 Running 0 5m33s
csi-rbdplugin-gjzc5 2/2 Running 0 5m33s
csi-rbdplugin-provisioner-99dd6c4c6-97xm4 5/5 Running 0 5m33s
csi-rbdplugin-provisioner-99dd6c4c6-l5q4w 5/5 Running 0 5m33s
csi-rbdplugin-z26d7 2/2 Running 0 5m33s
rook-ceph-mon-a-canary-6765fd8d4f-9fknh 0/2 Pending 0 84s
rook-ceph-mon-b-canary-8449df56cd-fqdfz 2/2 Running 0 84s
rook-ceph-mon-c-canary-79f866997f-jjm5t 0/2 Pending 0 84s
rook-ceph-operator-644954fb4b-ph7jw 1/1 Running 0 5m57s
# Now the mons start coming up, but some are still Pending. Why? They still cannot satisfy the placement rules: cluster.yaml has a setting called `allowMultiplePerNode`, which forbids running more than one mon on a single node. The fix is simple: in production, just label every node planned to carry a mon with `ceph-mon: enabled`. Shortly afterwards the Operator watches the change and reschedules the mons onto the matching nodes.
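For example (a sketch; worker-2 appears in the pod listings above even though the environment table only lists two nodes), labeling the remaining planned mon nodes looks like this:
kubectl label nodes cce-kubernetes-worker-1 ceph-mon=enabled
kubectl label nodes cce-kubernetes-worker-2 ceph-mon=enabled
# For a lab with fewer labelled nodes than mons, cluster.yaml can instead set
#   mon:
#     count: 3
#     allowMultiplePerNode: true
# which is not something you would do in production.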
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-hhg7x 2/2 Running 0 114s
csi-cephfsplugin-provisioner-7c594f8cf-qpnwg 5/5 Running 0 114s
csi-cephfsplugin-provisioner-7c594f8cf-qs2t9 5/5 Running 0 114s
csi-cephfsplugin-pw785 2/2 Running 0 114s
csi-cephfsplugin-w6wnc 2/2 Running 0 114s
csi-rbdplugin-provisioner-99dd6c4c6-7v5lq 5/5 Running 0 114s
csi-rbdplugin-provisioner-99dd6c4c6-x5jkf 5/5 Running 0 114s
csi-rbdplugin-s94x4 2/2 Running 0 114s
csi-rbdplugin-sqhd6 2/2 Running 0 114s
csi-rbdplugin-t6xzv 2/2 Running 0 114s
rook-ceph-mon-a-7744687d9c-x5vgl 2/2 Running 0 105s
rook-ceph-mon-b-5b6546b457-v6xl5 2/2 Running 0 78s
rook-ceph-mon-c-598bf66c9-gx2bz 2/2 Running 0 67s
rook-ceph-operator-644954fb4b-ph7jw 1/1 Running 2 (7m29s ago) 29m
5.4:Customizing mgr Scheduling
...
mgr:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-mgr
operator: In
values:
- enabled
...
The principle is the same: enable the placement policy for the component in question.
[root@cce-kubernetes-master-1 examples]# kubectl label nodes cce-kubernetes-worker-1 ceph-mgr=enabled
node/cce-kubernetes-worker-1 labeled
[root@cce-kubernetes-master-1 examples]# kubectl label nodes cce-kubernetes-worker-2 ceph-mgr=enabled
node/cce-kubernetes-worker-2 labeled
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-hhg7x 2/2 Running 0 60m
csi-cephfsplugin-provisioner-7c594f8cf-qpnwg 5/5 Running 0 60m
csi-cephfsplugin-provisioner-7c594f8cf-qs2t9 5/5 Running 0 60m
csi-cephfsplugin-pw785 2/2 Running 0 60m
csi-cephfsplugin-w6wnc 2/2 Running 0 60m
csi-rbdplugin-provisioner-99dd6c4c6-7v5lq 5/5 Running 0 60m
csi-rbdplugin-provisioner-99dd6c4c6-x5jkf 5/5 Running 0 60m
csi-rbdplugin-s94x4 2/2 Running 0 60m
csi-rbdplugin-sqhd6 2/2 Running 0 60m
csi-rbdplugin-t6xzv 2/2 Running 0 60m
rook-ceph-crashcollector-cce-kubernetes-master-1-74b86f89bwpprg 1/1 Running 0 53m
rook-ceph-crashcollector-cce-kubernetes-worker-1-7557f44975vz6x 1/1 Running 0 54m
rook-ceph-crashcollector-cce-kubernetes-worker-2-868b99f7cssvhj 1/1 Running 0 54m
rook-ceph-mgr-a-5f5bbbc559-hnbzz 3/3 Running 0 65s
rook-ceph-mgr-b-7d67888786-2z8f8 3/3 Running 0 40s
rook-ceph-mon-a-7744687d9c-x5vgl 2/2 Running 0 60m
rook-ceph-mon-b-5b6546b457-v6xl5 2/2 Running 0 60m
rook-ceph-mon-d-6bb98ddb88-4775f 2/2 Running 0 53m
rook-ceph-operator-644954fb4b-ph7jw 1/1 Running 2 (66m ago) 88m
# The node column is hard to read here, so filter it with awk
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph -owide | awk '{print $1,$2,$3,$7}'
NAME READY STATUS NODE
......
rook-ceph-mgr-a-5f5bbbc559-hnbzz 3/3 Running cce-kubernetes-worker-2
rook-ceph-mgr-b-7d67888786-2z8f8 3/3 Running cce-kubernetes-worker-1
rook-ceph-mon-a-7744687d9c-x5vgl 2/2 Running cce-kubernetes-worker-1
rook-ceph-mon-b-5b6546b457-v6xl5 2/2 Running cce-kubernetes-worker-2
rook-ceph-mon-d-6bb98ddb88-4775f 2/2 Running cce-kubernetes-master-1
......
# Scheduling followed our customization. Of course, label a few extra nodes for redundancy, so that losing one node does not leave the component with nowhere to run.
5.5:Customizing OSD Storage Scheduling
Remember that when customizing the mons we changed the two parameters under storage to `false`, which means Rook no longer consumes every disk on every node by default. Now we use the rest of the storage section to spell out exactly which nodes and devices to use.
...
storage: # cluster level storage configuration and selection
useAllNodes: false
useAllDevices: false
#deviceFilter:
config:
# crushRoot: "custom-root" # specify a non-default root label for the CRUSH map
# metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
# databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
# journalSizeMB: "1024" # uncomment if the disks are 20 GB or smaller
# osdsPerDevice: "1" # this value can be overridden at the node or device level
# encryptedDevice: "true" # the default value for this option is "false"
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources. Each node's 'name' field should match their 'kubernetes.io/hostname' label.
nodes:
- name: "cce-kubernetes-master-1"
devices: # specific devices to use for storage can be specified for each node
- name: "sda"
config:
journalSizeMB: "4096"
- name: "cce-kubernetes-worker-1"
devices: # specific devices to use for storage can be specified for each node
- name: "sda"
config:
journalSizeMB: "4096"
- name: "cce-kubernetes-worker-2"
devices: # specific devices to use for storage can be specified for each node
- name: "sda"
config:
journalSizeMB: "4096"
...
# Note: each disk on each node can carry its own parameters; anything not set is inherited from the config block.
# Apply it, but first remember to wipe the Ceph signatures from the disks, otherwise the change will not take effect
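A hedged sketch of wiping a previously used disk, following the Rook teardown docs (here /dev/sdb as in the environment table; double-check the device name, this is destructive):
sgdisk --zap-all /dev/sdb
dd if=/dev/zero of=/dev/sdb bs=1M count=100 oflag=direct,dsync
rm -rf /var/lib/rook   # default dataDirHostPath; only when rebuilding the whole cluster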
[root@cce-kubernetes-master-1 examples]# kubectl apply -f cluster.yaml
cephcluster.ceph.rook.io/rook-ceph configured
[root@cce-kubernetes-master-1 examples]# kubectl get pod -n rook-ceph -owide | awk '{print $1,$2,$3,$7}'
NAME READY STATUS NODE
......
rook-ceph-osd-0-6b594b854c-5mkqw 2/2 Running cce-kubernetes-master-1
rook-ceph-osd-1-6d7ccd76f4-kcvxb 2/2 Running cce-kubernetes-worker-1
rook-ceph-osd-2-5bbf555979-w9r9b 2/2 Running cce-kubernetes-worker-2
# The OSDs landed on the intended nodes as expected; from the client side the cluster now shows 3 OSDs
[root@cce-kubernetes-master-1 examples]# ceph status
cluster:
id: 4093033c-5fdd-4b57-87a5-681a50e5a9e6
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,d (age 3h)
mgr: b(active, since 2h), standbys: a
osd: 3 osds: 3 up (since 2m), 3 in (since 2m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 577 KiB
usage: 63 MiB used, 150 GiB / 150 GiB avail
pgs: 1 active+clean
# Later, to scale out, join the new node to the Kubernetes cluster, add it to cluster.yaml in the same way, and re-apply
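For example, bringing in one more node would just mean appending another entry under nodes: and re-applying; the node name below is a hypothetical placeholder:
    - name: "cce-kubernetes-worker-3"   # hypothetical new node
      devices:
      - name: "sdb"
        config:
          journalSizeMB: "4096"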
5.6:Customizing Resource Limits
So far we have run without any resource limits, even though cluster.yaml has a resources section. Configure limits according to your needs; it really is worth reserving a fixed amount of resources so the core Ceph components are guaranteed what they need.
1:mon: (recommended memory: 128G)
2:mds
3:osd: roughly 4G of memory per 1T of storage is recommended
# This is a test environment, so the values below are arbitrary
...
resources:
mon:
limits:
cpu: "1000m"
memory: "204Mi"
requests:
cpu: "1000m"
memory: "2048Mi"
mgr:
limits:
cpu: "1000m"
memory: "2048Mi"
requests:
cpu: "1000m"
memory: "2048Mi"
osd:
limits:
cpu: "1000m"
memory: "2048Mi"
requests:
cpu: "1000m"
memory: "2048Mi"
...
My machines do not actually have enough resources for this, so I will not apply it here, but the procedure is exactly as shown. One caveat: if you give the OSDs too little memory they will error out with stack/heap problems inside the Pod; by normal logic a configuration like the one above would not even start.
5.7:Health Checks
# We also need to understand how these components are health-checked. If a component becomes unhealthy and we cannot find out right away, that is a problem, so the health-check configuration matters; it also lives in cluster.yaml.
Components with health checks:
1:mon
2:osd
3:mgr
# Older 15.x versions did not support startupProbe, but the version used here does, and all of these checks are enabled by default.
...
healthCheck:
# Daemon health checks, all enabled by default
daemonHealth:
mon:
disabled: false
interval: 45s
osd:
disabled: false
interval: 60s
status:
disabled: false
interval: 60s
# Change pod liveness probe timing or threshold values. Works for all mon,mgr,osd daemons.
livenessProbe:
mon:
disabled: false
mgr:
disabled: false
osd:
disabled: false
# Change pod startup probe timing or threshold values. Works for all mon,mgr,osd daemons.
startupProbe:
mon:
disabled: false
mgr:
disabled: false
osd:
disabled: false
...
When a container fails, for example when a process inside it dies, it is restarted quickly. If we manually trigger a failure by killing the ceph-mon process, the Pod goes unhealthy and these probes kick in: they restart the faulty container and recover automatically. You can see the probe configuration when you `describe` the Pod. All supported components work the same way, with the same probes and the same underlying mechanism.
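A quick way to watch this happen (a sketch; the pod name is a placeholder, use one from your own cluster, and the exact container layout may differ):
# list the mon pods
kubectl -n rook-ceph get pod -l app=rook-ceph-mon
# kill the ceph-mon process in one of them, then watch the liveness probe restart the container
kubectl -n rook-ceph exec rook-ceph-mon-a-xxxxxxxxxx-yyyyy -- bash -c 'kill 1'
kubectl -n rook-ceph get pod -l app=rook-ceph-mon -w
kubectl -n rook-ceph describe pod rook-ceph-mon-a-xxxxxxxxxx-yyyyy | grep -A3 Liveness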
6:Cloud-Native RBD Storage
6.1:What is Block Storage
Block storage is a storage technology that writes data to disks or other media in units of blocks. It is widely used in servers, data centers, and cloud platforms to store large datasets.
A block storage system consists of one or more block devices, which can be hard disks, SSDs, or other media. Block devices offer fast reads and writes and high reliability; each device has a controller that manages I/O and can replicate data blocks across multiple devices for fault tolerance and redundancy.
The main advantages of block storage are fast I/O, high reliability, and data redundancy. Block devices read and write quickly and can handle replication, backup, and recovery automatically. They are also highly fault-tolerant: data is replicated across devices, and when one device fails the system switches to another automatically.
Block storage is typically used for large datasets such as databases, filesystems, and cloud platforms, as well as for critical data like server operating systems, applications, and data warehouses.
1:Alibaba Cloud: EBS
2:Tencent Cloud: CBS
3:Ceph: RBD
......
If you have used cloud disks: whatever a cloud disk can do, RBD can do as well, including snapshots, incremental backup, kernel-driver access, and more
1:Thin-provisioned (space is allocated only as it is used and can grow over time)
2:Images up to 16 exabytes (a single image can be up to 16 EB)
3:Configurable striping
4:In-memory caching
5:Snapshots
6:Copy-on-write cloning (clones created from snapshots)
7:Kernel driver support
8:KVM/libvirt support
9:Back-end for cloud solutions
10:Incremental backup
11:Disaster recovery (multisite asynchronous replication)

6.2:RBD and StorageClass

# The three ways to consume the storage
1:volume: the in-line volume approach, supporting many drivers such as FC, EBS, Ceph
2:PV/PVC: Persistent Volume and Persistent Volume Claim
3:StorageClass: static + dynamic
An administrator defines the provisioner
End users consume it through PVCs
Ceph RBD with Kubernetes reference: https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/
6.3:Ceph CSI Drivers
Wiring Ceph up to Kubernetes by hand involves creating pools, handling Ceph credentials and configuration files, deploying the CSI driver, creating the StorageClass, and so on. It is not trivial, especially if you are new to Ceph. Rook simplifies all of this and does the integration the cloud-native way: the drivers are already integrated, so you only need to create a StorageClass in Kubernetes.
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-j9vq7 2/2 Running 0 13h
csi-cephfsplugin-nrp7f 2/2 Running 0 13h
csi-cephfsplugin-pl5vt 2/2 Running 0 13h
csi-cephfsplugin-provisioner-7c594f8cf-s7s9n 5/5 Running 0 13h
csi-cephfsplugin-provisioner-7c594f8cf-w6h47 5/5 Running 0 13h
csi-rbdplugin-2cwbl 2/2 Running 0 13h
csi-rbdplugin-4wpd5 2/2 Running 0 13h
csi-rbdplugin-dknj9 2/2 Running 0 13h
csi-rbdplugin-provisioner-99dd6c4c6-9vhnt 5/5 Running 0 13h
csi-rbdplugin-provisioner-99dd6c4c6-zqcwt 5/5 Running 0 13h
rook-ceph-crashcollector-cce-k8s-m-1-7499dbfff4-dcqg9 1/1 Running 0 13h
rook-ceph-crashcollector-cce-k8s-w-1-b58c96cb4-p7kx9 1/1 Running 0 13h
rook-ceph-crashcollector-cce-k8s-w-2-7db46448d5-xd25b 1/1 Running 0 13h
rook-ceph-mgr-a-c9b6d5b69-jfd4s 3/3 Running 0 13h
rook-ceph-mgr-b-6b7fcf6bd-kzng5 3/3 Running 0 13h
rook-ceph-mon-a-5cd956c46f-9b4ms 2/2 Running 0 13h
rook-ceph-mon-b-7bd996578f-f626r 2/2 Running 0 13h
rook-ceph-mon-c-67dcbbff4d-p85n5 2/2 Running 0 13h
rook-ceph-operator-644954fb4b-pm9cl 1/1 Running 0 13h
rook-ceph-osd-0-7bddbd7847-cj9lf 2/2 Running 0 13h
rook-ceph-osd-1-d4589dfc8-842s4 2/2 Running 0 13h
rook-ceph-osd-2-59547758c7-vk995 2/2 Running 0 13h
rook-ceph-osd-prepare-cce-k8s-m-1-7bf62 0/1 Completed 0 13h
rook-ceph-osd-prepare-cce-k8s-w-1-p7nn6 0/1 Completed 0 13h
rook-ceph-osd-prepare-cce-k8s-w-2-6k4cm 0/1 Completed 0 13h
rook-ceph-tools-7857bc9568-q2zgv 1/1 Running 0 13h
# Driver details
1:Both the rbd and cephfs drivers are present: csi-cephfsplugin and csi-rbdplugin
2:Each driver consists of a provisioner and a per-node plugin
6.4:Creating the Storage Class
# RBD block StorageClass
1:The RBD storage driver and provisioner were installed along with Rook, so all that is left is to hook into them. Rook offers two ways to do this:
1:FlexVolume
2:CSI
Flex is the legacy approach: its driver is not installed by default, it is being phased out, and it is not recommended; use CSI instead
[root@cce-k8s-m-1 rbd]# pwd
/root/rook/deploy/examples/csi/rbd
[root@cce-k8s-m-1 rbd]# cat storageclass.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
name: replicapool
namespace: rook-ceph # namespace:cluster
spec:
failureDomain: host
replicated:
size: 3
# Disallow setting pool with replica 1, this could lead to data loss without recovery.
# Make sure you're *ABSOLUTELY CERTAIN* that is what you want
requireSafeReplicaSize: true
# gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
# for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
#targetSizeRatio: .5
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
# clusterID is the namespace where the rook cluster is running
# If you change this namespace, also change the namespace below where the secret namespaces are defined
clusterID: rook-ceph # namespace:cluster
# If you want to use erasure coded pool with RBD, you need to create
# two pools. one erasure coded and one replicated.
# You need to specify the replicated pool here in the `pool` parameter, it is
# used for the metadata of the images.
# The erasure coded pool must be set as the `dataPool` parameter below.
#dataPool: ec-data-pool
pool: replicapool
# (optional) mapOptions is a comma-separated list of map options.
# For krbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
# For nbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
# mapOptions: lock_on_read,queue_depth=1024
# (optional) unmapOptions is a comma-separated list of unmap options.
# For krbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
# For nbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
# unmapOptions: force
# (optional) Set it to true to encrypt each volume with encryption keys
# from a key management system (KMS)
# encrypted: "true"
# (optional) Use external key management system (KMS) for encryption key by
# specifying a unique ID matching a KMS ConfigMap. The ID is only used for
# correlation to configmap entry.
# encryptionKMSID: <kms-config-id>
# RBD image format. Defaults to "2".
imageFormat: "2"
# RBD image features
# Available for imageFormat: "2". Older releases of CSI RBD
# support only the `layering` feature. The Linux kernel (KRBD) supports the
# full complement of features as of 5.4
# `layering` alone corresponds to Ceph's bitfield value of "2" ;
# `layering` + `fast-diff` + `object-map` + `deep-flatten` + `exclusive-lock` together
# correspond to Ceph's OR'd bitfield value of "63". Here we use
# a symbolic, comma-separated format:
# For 5.4 or later kernels:
#imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
# For 5.3 or earlier kernels:
imageFeatures: layering
# The secrets contain Ceph admin credentials. These are generated automatically by the operator
# in the same namespace as the cluster.
csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
# Specify the filesystem type of the volume. If not specified, csi-provisioner
# will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock
# in hyperconverged settings where the volume is mounted on the same node as the osds.
csi.storage.k8s.io/fstype: ext4
# uncomment the following to use rbd-nbd as mounter on supported nodes
# **IMPORTANT**: CephCSI v3.4.0 onwards a volume healer functionality is added to reattach
# the PVC to application pod if nodeplugin pod restart.
# Its still in Alpha support. Therefore, this option is not recommended for production use.
#mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete
[root@cce-k8s-m-1 rbd]# kubectl apply -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
[root@cce-k8s-m-1 rbd]# ceph status
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 13h)
mgr: b(active, since 13h), standbys: a
osd: 3 osds: 3 up (since 13h), 3 in (since 13h)
data:
pools: 2 pools, 2 pgs
objects: 3 objects, 577 KiB
usage: 64 MiB used, 150 GiB / 150 GiB avail
pgs: 2 active+clean
[root@cce-k8s-m-1 rbd]# ceph osd lspools
1 .mgr
2 replicapool
[root@cce-k8s-m-1 rbd]# ceph osd pool get replicapool size
size: 3
[root@cce-k8s-m-1 rbd]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 118s
# The Pool and the StorageClass resources are now created.
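Before wiring it into a real application, a minimal standalone PVC (a sketch, the name is arbitrary) is an easy way to confirm that dynamic provisioning works:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-test-pvc      # arbitrary test name
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Applying it should leave the PVC Bound with a matching PV; deleting it afterwards also removes the backing RBD image, because the reclaimPolicy is Delete.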
6.5:Consuming Storage via PVC
With the StorageClass in place, we request capacity through PVCs. The PVC works with the StorageClass to provision storage automatically: the PV is created for us, and the PV in turn creates the RBD block on the backend. None of this needs our attention; the StorageClass and the driver handle it, and we just consume the result. Below is an example of a cloud-native application: a WordPress blog backed by a MySQL database.
[root@cce-k8s-m-1 examples]# cat wordpress.yaml mysql.yaml
apiVersion: v1
kind: Service
metadata:
name: wordpress
labels:
app: wordpress
spec:
ports:
- port: 80
selector:
app: wordpress
tier: frontend
type: LoadBalancer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: wp-pv-claim
labels:
app: wordpress
spec:
storageClassName: rook-ceph-block
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: wordpress
labels:
app: wordpress
tier: frontend
spec:
selector:
matchLabels:
app: wordpress
tier: frontend
strategy:
type: Recreate
template:
metadata:
labels:
app: wordpress
tier: frontend
spec:
containers:
- image: wordpress:4.6.1-apache
name: wordpress
env:
- name: WORDPRESS_DB_HOST
value: wordpress-mysql
- name: WORDPRESS_DB_PASSWORD
value: changeme
ports:
- containerPort: 80
name: wordpress
volumeMounts:
- name: wordpress-persistent-storage
mountPath: /var/www/html
volumes:
- name: wordpress-persistent-storage
persistentVolumeClaim:
claimName: wp-pv-claim
---
apiVersion: v1
kind: Service
metadata:
name: wordpress-mysql
labels:
app: wordpress
spec:
ports:
- port: 3306
selector:
app: wordpress
tier: mysql
clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pv-claim
labels:
app: wordpress
spec:
storageClassName: rook-ceph-block
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: wordpress-mysql
labels:
app: wordpress
tier: mysql
spec:
selector:
matchLabels:
app: wordpress
tier: mysql
strategy:
type: Recreate
template:
metadata:
labels:
app: wordpress
tier: mysql
spec:
containers:
- image: mysql:5.6
name: mysql
env:
- name: MYSQL_ROOT_PASSWORD
value: changeme
ports:
- containerPort: 3306
name: mysql
volumeMounts:
- name: mysql-persistent-storage
mountPath: /var/lib/mysql
volumes:
- name: mysql-persistent-storage
persistentVolumeClaim:
claimName: mysql-pv-claim
[root@cce-k8s-m-1 examples]# kubectl apply -f mysql.yaml -f wordpress.yaml
service/wordpress-mysql created
persistentvolumeclaim/mysql-pv-claim created
deployment.apps/wordpress-mysql created
service/wordpress created
persistentvolumeclaim/wp-pv-claim created
deployment.apps/wordpress created
[root@cce-k8s-m-1 examples]# kubectl get pod
NAME READY STATUS RESTARTS AGE
wordpress-7964897bd9-fkzcc 1/1 Running 0 119s
wordpress-mysql-776b4f56c4-29jlt 1/1 Running 0 119s
6.6:How the PVC Request Flows
The PVC kicks off a chain of steps: PVC -> StorageClass requests capacity -> PV is created -> an RBD block is requested from Ceph, completing the integration with Ceph
[root@cce-k8s-m-1 examples]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mysql-pv-claim Bound pvc-0c40943e-875f-4893-9ee9-a696d15ecc90 20Gi RWO rook-ceph-block 5m41s
wp-pv-claim Bound pvc-302ef2cf-b236-4915-ab12-dd28d8614262 20Gi RWO rook-ceph-block 5m41s
[root@cce-k8s-m-1 examples]# kubectl get pvc mysql-pv-claim -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
......
finalizers:
- kubernetes.io/pvc-protection
labels:
app: wordpress
name: mysql-pv-claim
namespace: default
resourceVersion: "131911"
uid: 0c40943e-875f-4893-9ee9-a696d15ecc90
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: rook-ceph-block
volumeMode: Filesystem
volumeName: pvc-0c40943e-875f-4893-9ee9-a696d15ecc90
status:
accessModes:
- ReadWriteOnce
capacity:
storage: 20Gi
phase: Bound
# The PVC automatically creates a PV
[root@cce-k8s-m-1 examples]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-0c40943e-875f-4893-9ee9-a696d15ecc90 20Gi RWO Delete Bound default/mysql-pv-claim rook-ceph-block 5m28s
pvc-302ef2cf-b236-4915-ab12-dd28d8614262 20Gi RWO Delete Bound default/wp-pv-claim rook-ceph-block 5m28s
[root@cce-k8s-m-1 examples]# kubectl get pv pvc-0c40943e-875f-4893-9ee9-a696d15ecc90 -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
......
finalizers:
- kubernetes.io/pv-protection
name: pvc-0c40943e-875f-4893-9ee9-a696d15ecc90
resourceVersion: "131908"
uid: 08d5b820-dc3d-4042-a736-e2671bd12e97
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 20Gi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: mysql-pv-claim
namespace: default
resourceVersion: "131874"
uid: 0c40943e-875f-4893-9ee9-a696d15ecc90
csi:
controllerExpandSecretRef:
name: rook-csi-rbd-provisioner
namespace: rook-ceph
driver: rook-ceph.rbd.csi.ceph.com
fsType: ext4
nodeStageSecretRef:
name: rook-csi-rbd-node
namespace: rook-ceph
volumeAttributes:
clusterID: rook-ceph
imageFeatures: layering
imageFormat: "2"
imageName: csi-vol-bb18cc8d-d1d8-11ed-9a54-f63b7b5ac9a4
journalPool: replicapool
pool: replicapool
storage.kubernetes.io/csiProvisionerIdentity: 1680447203456-8081-rook-ceph.rbd.csi.ceph.com
volumeHandle: 0001-0009-rook-ceph-0000000000000002-bb18cc8d-d1d8-11ed-9a54-f63b7b5ac9a4
persistentVolumeReclaimPolicy: Delete
storageClassName: rook-ceph-block
volumeMode: Filesystem
status:
phase: Bound
# The PV completes the integration with Ceph: the RBD block is created automatically, with the plugin driver doing the work
# RBD details
[root@cce-k8s-m-1 examples]# rbd -p replicapool ls
csi-vol-bb146437-d1d8-11ed-9a54-f63b7b5ac9a4
csi-vol-bb18cc8d-d1d8-11ed-9a54-f63b7b5ac9a4
# The RBD image used by mysql
[root@cce-k8s-m-1 examples]# rbd -p replicapool info csi-vol-bb146437-d1d8-11ed-9a54-f63b7b5ac9a4
rbd image 'csi-vol-bb146437-d1d8-11ed-9a54-f63b7b5ac9a4':
size 20 GiB in 5120 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 131ac1b0c0ad1
block_name_prefix: rbd_data.131ac1b0c0ad1
format: 2
features: layering
op_features:
flags:
create_timestamp: Mon Apr 3 12:33:50 2023
access_timestamp: Mon Apr 3 12:33:50 2023
modify_timestamp: Mon Apr 3 12:33:50 2023
6.7:Verifying WordPress
To reach WordPress from outside, modify its Service: switch it to NodePort or put an Ingress in front of it
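For example (a sketch), switching the Service type with kubectl patch:
kubectl patch svc wordpress -p '{"spec":{"type":"NodePort"}}'
kubectl get svc wordpress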
[root@cce-k8s-m-1 examples]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 14h
wordpress NodePort 10.96.2.110 <none> 80:32141/TCP 10m
wordpress-mysql ClusterIP None <none> 3306/TCP 10m



Everything checks out so far.
6.8:Per-Pod Storage with volumeClaimTemplates
The plain PVC pattern suits a single set of Pods sharing one claim. When every Pod needs its own storage, use the volumeClaimTemplates feature of StatefulSet so that each Pod gets its own PVC
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
  namespace: default
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "rook-ceph-block"
      resources:
        requests:
          storage: 10Gi
[root@cce-k8s-m-1 ~]# kubectl get pv,pvc,pod
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-28491a54-25b1-4156-9c31-c43720dab0d6 10Gi RWO Delete Bound default/www-nginx-0 rook-ceph-block 3m20s
persistentvolume/pvc-496ca442-3add-4e21-948b-6e8a4154f9e5 10Gi RWO Delete Bound default/www-nginx-1 rook-ceph-block 2m35s
persistentvolume/pvc-d6794286-edb6-4c5a-b471-c930448b4c76 10Gi RWO Delete Bound default/www-nginx-2 rook-ceph-block 111s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/www-nginx-0 Bound pvc-28491a54-25b1-4156-9c31-c43720dab0d6 10Gi RWO rook-ceph-block 3m20s
persistentvolumeclaim/www-nginx-1 Bound pvc-496ca442-3add-4e21-948b-6e8a4154f9e5 10Gi RWO rook-ceph-block 2m35s
persistentvolumeclaim/www-nginx-2 Bound pvc-d6794286-edb6-4c5a-b471-c930448b4c76 10Gi RWO rook-ceph-block 111s
NAME READY STATUS RESTARTS AGE
pod/nginx-0 1/1 Running 0 3m20s
pod/nginx-1 1/1 Running 0 2m35s
pod/nginx-2 1/1 Running 0 111s
# Each Pod maps one-to-one to its own PVC and PV, each with its own capacity.
7:CephFS File Storage
7.1:CephFS Overview
1:Characteristics of RBD:
An RBD block device can be used by only a single VM or a single Pod at a time; it cannot be "simultaneously" shared by multiple VMs or Pods. When multiple consumers need shared access to the same storage, use CephFS.
2:NAS, network-attached storage: many clients access it at once
1:EFS
2:NAS
3:CFS
3:CephFS characteristics:
1:POSIX-compliant semantics
2:Separates metadata from data (data goes into the data pool, metadata into the metadata pool)
3:Dynamic rebalancing (self-healing)
4:Subdirectory snapshots
5:Configurable striping
6:Kernel driver support (kernel-level mounts)
7:FUSE support (userspace mounts)
8:NFS/CIFS deployable (can be re-exported over NFS/CIFS)
9:Use with Hadoop (can replace HDFS)
7.2:MDS Architecture
Most file storage systems look up metadata through a metadata service and use it to reach the actual data. In Ceph, the MDS provides this metadata service. It has built-in high availability: it can be deployed single-active with standby, or with multiple actives, and it keeps metadata consistent by exchanging journals.
7.3:Deploying MDS and the Filesystem
[root@cce-k8s-m-1 examples]# cat filesystem.yaml
#################################################################################################################
# Create a filesystem with settings with replication enabled for a production environment.
# A minimum of 3 OSDs on different nodes are required in this example.
# If one mds daemon per node is too restrictive, see the podAntiAffinity below.
# kubectl create -f filesystem.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
name: myfs
namespace: rook-ceph # namespace:cluster
spec:
# The metadata pool spec. Must use replication.
metadataPool:
replicated:
size: 3
requireSafeReplicaSize: true
parameters:
# Inline compression mode for the data pool
# Further reference: https://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression
compression_mode:
none
# gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
# for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
#target_size_ratio: ".5"
# The list of data pool specs. Can use replication or erasure coding.
dataPools:
- name: replicated
failureDomain: host
replicated:
size: 3
# Disallow setting pool with replica 1, this could lead to data loss without recovery.
# Make sure you're *ABSOLUTELY CERTAIN* that is what you want
requireSafeReplicaSize: true
parameters:
# Inline compression mode for the data pool
# Further reference: https://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression
compression_mode:
none
# gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
# for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
#target_size_ratio: ".5"
# Whether to preserve filesystem after CephFilesystem CRD deletion
preserveFilesystemOnDelete: true
# The metadata service (mds) configuration
metadataServer:
# The number of active MDS instances
activeCount: 1
# Whether each active MDS instance will have an active standby with a warm metadata cache for faster failover.
# If false, standbys will be available, but will not have a warm cache.
activeStandby: true
# The affinity rules to apply to the mds deployment
placement:
# nodeAffinity:
# requiredDuringSchedulingIgnoredDuringExecution:
# nodeSelectorTerms:
# - matchExpressions:
# - key: role
# operator: In
# values:
# - mds-node
# topologySpreadConstraints:
# tolerations:
# - key: mds-node
# operator: Exists
# podAffinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-mds
## Add this if you want to allow mds daemons for different filesystems to run on one
## node. The value in "values" must match .metadata.name.
# - key: rook_file_system
# operator: In
# values:
# - myfs
# topologyKey: kubernetes.io/hostname will place MDS across different hosts
topologyKey: kubernetes.io/hostname
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-mds
# topologyKey: */zone can be used to spread MDS across different AZ
# Use <topologyKey: failure-domain.beta.kubernetes.io/zone> in k8s cluster if your cluster is v1.16 or lower
# Use <topologyKey: topology.kubernetes.io/zone> in k8s cluster is v1.17 or upper
topologyKey: topology.kubernetes.io/zone
# A key/value list of annotations
# annotations:
# key: value
# A key/value list of labels
# labels:
# key: value
# resources:
# The requests and limits set here, allow the filesystem MDS Pod(s) to use half of one CPU core and 1 gigabyte of memory
# limits:
# cpu: "500m"
# memory: "1024Mi"
# requests:
# cpu: "500m"
# memory: "1024Mi"
priorityClassName: system-cluster-critical
livenessProbe:
disabled: false
startupProbe:
disabled: false
# Filesystem mirroring settings
# mirroring:
# enabled: true
# list of Kubernetes Secrets containing the peer token
# for more details see: https://docs.ceph.com/en/latest/dev/cephfs-mirroring/#bootstrap-peers
# Add the secret name if it already exists else specify the empty list here.
# peers:
#secretNames:
#- secondary-cluster-peer
# specify the schedule(s) on which snapshots should be taken
# see the official syntax here https://docs.ceph.com/en/latest/cephfs/snap-schedule/#add-and-remove-schedules
# snapshotSchedules:
# - path: /
# interval: 24h # daily snapshots
# The startTime should be mentioned in the format YYYY-MM-DDTHH:MM:SS
# If startTime is not specified, then by default the start time is considered as midnight UTC.
# see usage here https://docs.ceph.com/en/latest/cephfs/snap-schedule/#usage
# startTime: 2022-07-15T11:55:00
# manage retention policies
# see syntax duration here https://docs.ceph.com/en/latest/cephfs/snap-schedule/#add-and-remove-retention-policies
# snapshotRetention:
# - path: /
# duration: "h 24"
# Set the labels
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-m-1 app=rook-ceph-mds
node/cce-k8s-m-1 labeled
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-w-1 app=rook-ceph-mds
node/cce-k8s-w-1 labeled
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-w-2 app=rook-ceph-mds
node/cce-k8s-w-2 labeled
[root@cce-k8s-m-1 examples]# kubectl apply -f filesystem.yaml
cephfilesystem.ceph.rook.io/myfs created
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get pods -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-myfs-a-6d98c9c4d8-hh95q 2/2 Running 0 6m49s
rook-ceph-mds-myfs-b-db75d54bf-4vbvl 2/2 Running 0 6m48s
[root@cce-k8s-m-1 examples]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 15h)
mgr: b(active, since 15h), standbys: a
mds: 1/1 daemons up, 1 hot standby
osd: 3 osds: 3 up (since 15h), 3 in (since 15h)
data:
volumes: 1/1 healthy
pools: 4 pools, 81 pgs
objects: 29 objects, 579 KiB
usage: 306 MiB used, 150 GiB / 150 GiB avail
pgs: 81 active+clean
io:
client: 1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
# Pool info
[root@cce-k8s-m-1 examples]# ceph osd lspools
1 .mgr
2 replicapool
3 myfs-metadata # metadata pool
4 myfs-replicated # data pool
[root@cce-k8s-m-1 examples]# ceph fs ls
name: myfs, metadata pool: myfs-metadata, data pools: [myfs-replicated ]
7.4:MDS High Availability
MDS supports a multi-active deployment: a single filesystem can have several groups of metadata servers, each running active-standby. Change `activeCount` for a dual-active configuration
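The relevant fragment of filesystem.yaml (only activeCount changes compared with the earlier deployment):
...
  metadataServer:
    activeCount: 2      # two active MDS ranks, each backed by a hot standby
    activeStandby: true
...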
[root@cce-k8s-m-1 examples]# kubectl apply -f filesystem.yaml
cephfilesystem.ceph.rook.io/myfs configured
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-myfs-a-657b6ccb9b-pbdhx 2/2 Running 0 2m57s
rook-ceph-mds-myfs-b-5d7bd967df-lpw5b 2/2 Running 0 119s
rook-ceph-mds-myfs-c-76fb898c78-gsxg2 2/2 Running 0 86s
rook-ceph-mds-myfs-d-849f46b5cf-v8khj 2/2 Running 0 55s
[root@cce-k8s-m-1 examples]# ceph status
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 15h)
mgr: b(active, since 15h), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 3 osds: 3 up (since 15h), 3 in (since 15h)
data:
volumes: 1/1 healthy
pools: 4 pools, 81 pgs
objects: 48 objects, 581 KiB
usage: 307 MiB used, 150 GiB / 150 GiB avail
pgs: 81 active+clean
io:
client: 1.7 KiB/s rd, 3 op/s rd, 0 op/s wr
# I ran into label-rule issues here; commenting out the affinity block made it work. Worth digging into if you are interested.
7.5:Advanced MDS Scheduling
MDS provides a scheduling mechanism via placement, supporting node affinity, Pod affinity, Pod anti-affinity, node tolerations, and topology spread constraints. First, node scheduling: the ceph-mds=enabled label selects the nodes that qualify
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph -l app=rook-ceph-mds -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
rook-ceph-mds-myfs-a-657b6ccb9b-pbd... 2/2 Running 0 66m 100.101.104.157 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-b-5d7bd967df-lpw5b 2/2 Running 0 66m 100.101.104.158 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-c-76fb898c78-gsxg2 2/2 Running 0 65m 100.101.104.159 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-d-849f46b5cf-v8khj 2/2 Running 0 64m 100.118.108.219 cce-k8s-w-1 <none> <none>
...
metadataServer:
# The number of active MDS instances
activeCount: 2
# Whether each active MDS instance will have an active standby with a warm metadata cache for faster failover.
# If false, standbys will be available, but will not have a warm cache.
activeStandby: true
# The affinity rules to apply to the mds deployment
placement:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-mds
operator: In
values:
- enabled
...
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-m-1 ceph-mds=enabled
node/cce-k8s-m-1 labeled
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-w-1 ceph-mds=enabled
node/cce-k8s-w-1 labeled
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-w-2 ceph-mds=enabled
node/cce-k8s-w-2 labeled
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph -l app=rook-ceph-mds -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
rook-ceph-mds-myfs-a-86f58... 2/2 Running 0 2m12s 100.118.108.222 cce-k8s-w-1 <none> <none>
rook-ceph-mds-myfs-b-6c4b4... 2/2 Running 0 103s 100.101.104.160 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-c-69477... 2/2 Running 0 69s 100.101.104.161 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-d-75f79... 2/2 Running 0 38s 100.101.104.162 cce-k8s-w-2 <none> <none>
# The Pods are spread out, but not across all nodes. Node affinity alone is fairly coarse: it only makes it likely that Pods land on the labelled nodes, which gives some redundancy but no guarantee. The bad case is when both active MDS ranks happen to land on the same node; then the high availability does nothing. So we need anti-affinity in addition to affinity
...
placement:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-mds
operator: In
values:
- enabled
# topologySpreadConstraints:
# tolerations:
# - key: mds-node
# operator: Exists
# podAffinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-mds
topologyKey: kubernetes.io/hostname
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-mds
topologyKey: topology.kubernetes.io/zone
...
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph -l app=rook-ceph-mds -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
rook-ceph-mds-myfs-a-556bf87cc7-wq427 1/2 Running 0 11s 100.118.108.223 cce-k8s-w-1 <none> <none>
rook-ceph-mds-myfs-b-6c4b4b79c6-gf5vv 2/2 Running 0 20m 100.101.104.160 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-c-6947798bdc-xsc9v 2/2 Running 0 20m 100.101.104.161 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-d-75f7974874-4lc2r 2/2 Running 0 19m 100.101.104.162 cce-k8s-w-2 <none> <none>
# Enabling the anti-affinity rule is a problem for me: it needs 4 nodes to carry the Pods but I only have 3, so one Pod stays Pending. As long as you have enough nodes with matching labels it works fine; I will skip the anti-affinity rule here.
7.6:Deploying the CephFS Storage Class
Rook installs the CephFS storage drivers by default; all that is left is to consume them through a StorageClass
[root@cce-k8s-m-1 cephfs]# cat storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com # driver:namespace:operator
parameters:
# clusterID is the namespace where the rook cluster is running
# If you change this namespace, also change the namespace below where the secret namespaces are defined
clusterID: rook-ceph # namespace:cluster
# CephFS filesystem name into which the volume shall be created
fsName: myfs
# Ceph pool into which the volume shall be created
# Required for provisionVolume: "true"
pool: myfs-replicated
# The secrets contain Ceph admin credentials. These are generated automatically by the operator
# in the same namespace as the cluster.
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
# (optional) The driver can use either ceph-fuse (fuse) or ceph kernel client (kernel)
# If omitted, default volume mounter will be used - this is determined by probing for ceph-fuse
# or by setting the default mounter explicitly via --volumemounter command-line argument.
# mounter: kernel
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
# uncomment the following line for debugging
#- debug
[root@cce-k8s-m-1 cephfs]# kubectl apply -f storageclass.yaml
storageclass.storage.k8s.io/rook-cephfs created
[root@cce-k8s-m-1 cephfs]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 7h12m
rook-cephfs rook-ceph.cephfs.csi.ceph.com Delete Immediate true 2s
7.7:Consuming CephFS from Containers
[root@cce-k8s-m-1 cephfs]# cat kube-registry.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: cephfs-pvc
namespace: kube-system
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
# matches the StorageClass created above
storageClassName: rook-cephfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kube-registry
namespace: kube-system
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
spec:
replicas: 3
selector:
matchLabels:
k8s-app: kube-registry
template:
metadata:
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: registry
image: registry:2
imagePullPolicy: Always
resources:
limits:
cpu: 100m
memory: 100Mi
env:
- name: REGISTRY_HTTP_ADDR
value: :5000
- name: REGISTRY_HTTP_SECRET
value: "Ple4seCh4ngeThisN0tAVerySecretV4lue"
- name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
value: /var/lib/registry
volumeMounts:
- name: image-store
mountPath: /var/lib/registry
ports:
- containerPort: 5000
name: registry
protocol: TCP
livenessProbe:
httpGet:
path: /
port: registry
readinessProbe:
httpGet:
path: /
port: registry
volumes:
- name: image-store
persistentVolumeClaim:
claimName: cephfs-pvc
readOnly: false
[root@cce-k8s-m-1 cephfs]# kubectl get pod,pv,pvc -n kube-system
NAME READY STATUS RESTARTS AGE
pod/coredns-6d4b75cb6d-4frh4 1/1 Running 0 21h
pod/coredns-6d4b75cb6d-9tmpr 1/1 Running 0 21h
pod/etcd-cce-k8s-m-1 1/1 Running 0 21h
pod/kube-apiserver-cce-k8s-m-1 1/1 Running 0 21h
pod/kube-controller-manager-cce-k8s-m-1 1/1 Running 0 21h
pod/kube-proxy-96l5q 1/1 Running 0 21h
pod/kube-proxy-mxpjw 1/1 Running 0 21h
pod/kube-proxy-r4l7k 1/1 Running 0 21h
pod/kube-registry-74d7b9999c-dptn2 1/1 Running 0 37s
pod/kube-registry-74d7b9999c-thcwl 1/1 Running 0 37s
pod/kube-registry-74d7b9999c-vp9l8 1/1 Running 0 37s
pod/kube-scheduler-cce-k8s-m-1 1/1 Running 0 21h
pod/kube-sealos-lvscare-cce-k8s-w-1 1/1 Running 0 21h
pod/kube-sealos-lvscare-cce-k8s-w-2 1/1 Running 0 21h
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-... 1Gi RWX Delete Bound kube-system/cephfs-pvc rook-cephfs 37s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/cephfs-pvc Bound pvc-3118d092-914e-4728-bc04-f816f571fc46 1Gi RWX rook-cephfs 37s
# Test data sharing across the replicas
[root@cce-k8s-m-1 cephfs]# kubectl exec -it -n kube-system kube-registry-74d7b9999c-dptn2 /bin/sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
/ # cd /var/lib/registry/
/var/lib/registry # echo "This is CephFS" > index.html
/var/lib/registry # cat index.html
This is CephFS
/var/lib/registry # ls -lh
total 1K
-rw-r--r-- 1 root root 15 Apr 3 11:39 index.html
[root@cce-k8s-m-1 cephfs]# kubectl exec -it -n kube-system kube-registry-74d7b9999c-thcwl -- /bin/cat /var/lib/registry/index.html
This is CephFS
[root@cce-k8s-m-1 cephfs]# kubectl exec -it -n kube-system kube-registry-74d7b9999c-vp9l8 -- /bin/cat /var/lib/registry/index.html
This is CephFS
# I will not test the registry itself; exposing the Pods is all it would take to use it.
7.8:Accessing CephFS Externally
# To access CephFS from outside we need the connection details, i.e. the Ceph authentication information
[root@cce-k8s-m-1 cephfs]# cat /etc/ceph/ceph.conf
[global]
mon_host = 10.96.2.87:6789,10.96.0.229:6789,10.96.1.11:6789
[client.admin]
keyring = /etc/ceph/keyring
[root@cce-k8s-m-1 cephfs]# cat /etc/ceph/keyring
[client.admin]
key = AQBLlSlkTTxiCRAAfjLe95fOv8iYcFduO0BLXg==
# Mount CephFS locally to see the effect
[root@cce-k8s-m-1 cephfs]# mkdir /data
[root@cce-k8s-m-1 cephfs]# mount -t ceph -o name=admin,secret=AQBLlSlkTTxiCRAAfjLe95fOv8iYcFduO0BLXg==,mds_namespace=myfs 10.96.2.87:6789,10.96.0.229:6789,10.96.1.11:6789:/ /data
[root@cce-k8s-m-1 cephfs]# df -Th
......
10.96.2.87:6789,10.96.0.229:6789,10.96.1.11:6789:/ 49696768 0 49696768 0% /data
[root@cce-k8s-m-1 cephfs]# cd /data/
[root@cce-k8s-m-1 data]# ls
volumes
[root@cce-k8s-m-1 data]# cd volumes/
[root@cce-k8s-m-1 volumes]# ls
csi _csi:csi-vol-a47a2b0b-d213-11ed-b070-161e0144d679.meta
[root@cce-k8s-m-1 volumes]# cd csi/
[root@cce-k8s-m-1 csi]# ls
csi-vol-a47a2b0b-d213-11ed-b070-161e0144d679
[root@cce-k8s-m-1 csi]# cd csi-vol-a47a2b0b-d213-11ed-b070-161e0144d679/
[root@cce-k8s-m-1 csi-vol-a47a2b0b-d213-11ed-b070-161e0144d679]# ls
28c47060-8dcf-40c9-8def-0251b325a206
[root@cce-k8s-m-1 csi-vol-a47a2b0b-d213-11ed-b070-161e0144d679]# cd 28c47060-8dcf-40c9-8def-0251b325a206/
[root@cce-k8s-m-1 28c47060-8dcf-40c9-8def-0251b325a206]# ls
index.html
[root@cce-k8s-m-1 28c47060-8dcf-40c9-8def-0251b325a206]# pwd
/data/volumes/csi/csi-vol-a47a2b0b-d213-11ed-b070-161e0144d679/28c47060-8dcf-40c9-8def-0251b325a206
[root@cce-k8s-m-1 28c47060-8dcf-40c9-8def-0251b325a206]# cat index.html
This is CephFS
# This is the data written by our kube-registry Pods (a persistent-mount sketch follows below)
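To make this mount survive a reboot, a hedged /etc/fstab sketch could be used; the secretfile path /etc/ceph/admin.secret is an assumption and that file must contain only the raw key string:
# hypothetical /etc/fstab entry for the CephFS mount above
10.96.2.87:6789,10.96.0.229:6789,10.96.1.11:6789:/  /data  ceph  name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=myfs,_netdev  0 0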
7.9:CephFS Cluster Maintenance
1:Pod status
2:Ceph status
3:filesystem status
4:log inspection
1:driver logs
2:service logs
# Check the Pod status: the MDS daemons run as Pods, so make sure those Pods are healthy
[root@cce-k8s-m-1 ~]# kubectl -n rook-ceph get pods -l app=rook-ceph-mds
NAME READY STATUS RESTARTS AGE
rook-ceph-mds-myfs-a-86f58d679c-2pdk8 2/2 Running 0 67m
rook-ceph-mds-myfs-b-6c4b4b79c6-s72c2 2/2 Running 0 66m
rook-ceph-mds-myfs-c-6947798bdc-2zbg8 2/2 Running 0 66m
rook-ceph-mds-myfs-d-75f7974874-4lc2r 2/2 Running 0 92m
# Check the Ceph cluster status
[root@cce-k8s-m-1 ~]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 21h)
mgr: b(active, since 21h), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 3 osds: 3 up (since 21h), 3 in (since 21h)
data:
volumes: 1/1 healthy
pools: 4 pools, 81 pgs
objects: 52 objects, 643 KiB
usage: 311 MiB used, 150 GiB / 150 GiB avail
pgs: 81 active+clean
io:
client: 2.5 KiB/s rd, 4 op/s rd, 0 op/s wr
# List the CephFS filesystems
[root@cce-k8s-m-1 ~]# ceph fs ls
name: myfs, metadata pool: myfs-metadata, data pools: [myfs-replicated ]
# Container logs include the MDS logs and the CSI driver logs; when a service misbehaves, combine both while troubleshooting. The provisioner Pod contains several different containers (a per-container example follows the log output below)
1:csi-attacher      attach/detach
2:csi-snapshotter   snapshots
3:csi-resizer       resizing
4:csi-provisioner   provisioning
5:csi-cephfsplugin  driver agent
# View the driver logs
[root@cce-k8s-m-1 ~]# kubectl logs -f -n rook-ceph csi-cephfsplugin-provisioner-7c594f8cf-s7s9n
Defaulted container "csi-attacher" out of: csi-attacher, csi-snapshotter, csi-resizer, csi-provisioner, csi-cephfsplugin
I0402 14:52:10.339745 1 main.go:94] Version: v4.1.0
W0402 14:52:20.340846 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:52:30.341189 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:52:40.343364 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:52:50.341415 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:53:00.340881 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:53:10.341314 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:53:20.341893 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:53:30.341447 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:53:40.342075 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:53:50.341314 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:54:00.340977 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:54:10.341170 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
W0402 14:54:20.341209 1 connection.go:173] Still connecting to unix:///csi/csi-provisioner.sock
I0402 14:54:24.255787 1 common.go:111] Probing CSI driver for readiness
I0402 14:54:24.288670 1 leaderelection.go:248] attempting to acquire leader lease rook-ceph/external-attacher-leader-rook-ceph-cephfs-csi-ceph-com...
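Since kubectl defaulted to the csi-attacher container above, a specific container inside the provisioner Pod can be selected with -c; for example, to follow the csi-provisioner container instead:
[root@cce-k8s-m-1 ~]# kubectl logs -f -n rook-ceph csi-cephfsplugin-provisioner-7c594f8cf-s7s9n -c csi-provisioner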
# View the MDS container logs
[root@cce-k8s-m-1 ~]# kubectl logs -f -n rook-ceph rook-ceph-mds-myfs-a-86f58d679c-2pdk8
8:RGW Object Storage
8.1:Object Storage Overview
Object storage generally refers to storage systems used for uploading (put) and downloading (get) objects, typically static files such as images, video and audio. Once uploaded, an object cannot be modified in place; to change it you must download it, edit it locally and re-upload it. Object storage was first popularized by AWS S3, and Ceph's object storage supports two interface styles:
1:S3-style interface
2:Swift-style interface
# Ceph object storage has the following characteristics
1:RESTful Interface (objects are uploaded, downloaded and managed through a RESTful API)
2:S3- and Swift-compliant APIs (two API styles are provided: S3 and Swift)
3:S3-style subdomains
4:Unified S3/Swift namespace (a flat S3/Swift namespace)
5:User management (security: user authentication)
6:Usage tracking
7:Striped objects (multipart upload and reassembly)
8:Cloud solution integration (integration with cloud platforms)
9:Multi-site deployment
10:Multi-site replication
8.2:Deploying an RGW Cluster
Deploying an RGW object storage cluster with Rook is very straightforward: a yaml file is provided out of the box that creates the pools required by object storage and the RGW cluster instances
[root@cce-k8s-m-1 examples]# kubectl apply -f object.yaml
cephobjectstore.ceph.rook.io/my-store created
# RGW is now deployed. By default a single RGW instance is created, and ceph -s confirms that RGW has joined the Ceph cluster
[root@cce-k8s-m-1 examples]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 12m)
mgr: b(active, since 22h), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 3 osds: 3 up (since 22h), 3 in (since 22h)
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 191 pgs
objects: 259 objects, 719 KiB
usage: 302 MiB used, 150 GiB / 150 GiB avail
pgs: 191 active+clean
io:
client: 2.2 KiB/s rd, 255 B/s wr, 4 op/s rd, 0 op/s wr
progress:
# RGW creates a number of pools by default to store object-storage data, including metadata pools and a data pool
[root@cce-k8s-m-1 examples]# ceph osd lspools
1 .mgr
2 replicapool
3 myfs-metadata
4 myfs-replicated
5 .rgw.root
6 my-store.rgw.buckets.non-ec
7 my-store.rgw.otp
8 my-store.rgw.meta
9 my-store.rgw.buckets.index
10 my-store.rgw.control
11 my-store.rgw.log
12 my-store.rgw.buckets.data
8.3:RGW High-Availability Cluster
RGW is a stateless HTTP server that handles HTTP put/get requests on port 80. In production, multiple RGW instances must be deployed to satisfy high availability. In the 《Ceph入门到实战》 course we built an HA RGW cluster with haproxy + keepalived:
1:deploy multiple RGW instances in the Ceph cluster
2:HAproxy provides load balancing
3:keepalived provides the VIP and keeps haproxy itself highly available
An RGW cluster built by Rook deploys a single RGW instance by default, which does not meet the high-availability requirement, so multiple instances need to be deployed.
We only need to raise the instance count in object.yaml, as the sketch below shows.
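A minimal sketch of the relevant part of object.yaml, assuming the rest of the CephObjectStore spec stays as shipped and only the instance count changes:
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 2    # raise from the default 1 to run two RGW daemons
Re-applying the file with kubectl apply -f object.yaml rolls out the second instance, as the Pod listing below shows.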
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-j9vq7 2/2 Running 0 22h
csi-cephfsplugin-nrp7f 2/2 Running 0 22h
csi-cephfsplugin-pl5vt 2/2 Running 0 22h
csi-cephfsplugin-provisioner-7c594f8cf-s7s9n 5/5 Running 0 22h
csi-cephfsplugin-provisioner-7c594f8cf-w6h47 5/5 Running 0 22h
csi-rbdplugin-2cwbl 2/2 Running 0 22h
csi-rbdplugin-4wpd5 2/2 Running 0 22h
csi-rbdplugin-dknj9 2/2 Running 0 22h
csi-rbdplugin-provisioner-99dd6c4c6-9vhnt 5/5 Running 0 22h
csi-rbdplugin-provisioner-99dd6c4c6-zqcwt 5/5 Running 0 22h
rook-ceph-crashcollector-cce-k8s-m-1-7499dbfff4-52zh2 1/1 Running 0 124m
rook-ceph-crashcollector-cce-k8s-w-1-67b77df854-kcm6x 1/1 Running 0 45m
rook-ceph-crashcollector-cce-k8s-w-2-6fdb7f546c-s9cr2 1/1 Running 0 45m
rook-ceph-mds-myfs-a-86f58d679c-qj8jc 2/2 Running 0 45m
rook-ceph-mds-myfs-b-6c4b4b79c6-nlh5j 2/2 Running 0 45m
rook-ceph-mds-myfs-c-6947798bdc-fmhdx 2/2 Running 0 45m
rook-ceph-mds-myfs-d-75f7974874-n6rzs 2/2 Running 0 45m
rook-ceph-mgr-a-c9b6d5b69-jfd4s 3/3 Running 0 22h
rook-ceph-mgr-b-6b7fcf6bd-kzng5 3/3 Running 0 22h
rook-ceph-mon-a-5cd956c46f-9b4ms 2/2 Running 0 22h
rook-ceph-mon-b-7bd996578f-f626r 2/2 Running 0 22h
rook-ceph-mon-c-67dcbbff4d-p85n5 2/2 Running 0 22h
rook-ceph-operator-644954fb4b-pm9cl 1/1 Running 0 23h
rook-ceph-osd-0-7bddbd7847-cj9lf 2/2 Running 0 22h
rook-ceph-osd-1-d4589dfc8-842s4 2/2 Running 0 22h
rook-ceph-osd-2-59547758c7-vk995 2/2 Running 0 22h
rook-ceph-osd-prepare-cce-k8s-m-1-stsgr 0/1 Completed 0 77m
rook-ceph-osd-prepare-cce-k8s-w-1-pkx9l 0/1 Completed 0 77m
rook-ceph-osd-prepare-cce-k8s-w-2-tpwn8 0/1 Completed 0 77m
rook-ceph-rgw-my-store-a-765f5dc457-5svb9 2/2 Running 0 30m
rook-ceph-rgw-my-store-a-765f5dc457-zdmpk 2/2 Running 0 34s
rook-ceph-tools-7857bc9568-q2zgv 1/1 Running 0 22h
# Thanks to the anti-affinity rules the two Pods are not scheduled onto the same node, which keeps the multi-replica RGW setup highly available.
[root@cce-k8s-m-1 examples]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 46m)
mgr: b(active, since 22h), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 3 osds: 3 up (since 22h), 3 in (since 22h)
rgw: 2 daemons active (2 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 403 objects, 740 KiB
usage: 335 MiB used, 150 GiB / 150 GiB avail
pgs: 169 active+clean
io:
client: 1.7 KiB/s rd, 3 op/s rd, 0 op/s wr
# The service is exposed through the Service VIP, which maps port 80 to the two backend Pods
[root@cce-k8s-m-1 examples]# kubectl describe svc -n rook-ceph rook-ceph-rgw-my-store
Name: rook-ceph-rgw-my-store
Namespace: rook-ceph
Labels: app=rook-ceph-rgw
app.kubernetes.io/component=cephobjectstores.ceph.rook.io
app.kubernetes.io/created-by=rook-ceph-operator
app.kubernetes.io/instance=my-store
app.kubernetes.io/managed-by=rook-ceph-operator
app.kubernetes.io/name=ceph-rgw
app.kubernetes.io/part-of=my-store
ceph_daemon_id=my-store
ceph_daemon_type=rgw
rgw=my-store
rook.io/operator-namespace=rook-ceph
rook_cluster=rook-ceph
rook_object_store=my-store
Annotations: <none>
Selector: app=rook-ceph-rgw,ceph_daemon_id=my-store,rgw=my-store,rook_cluster=rook-ceph,rook_object_store=my-store
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.96.1.56
IPs: 10.96.1.56
Port: http 80/TCP
TargetPort: 8080/TCP
# the addresses of the two RGW Pods
Endpoints: 100.101.104.176:8080,100.118.108.235:8080
Session Affinity: None
Events: <none>
8.4:Advanced RGW Scheduling
Like mon and mds earlier, rgw supports advanced scheduling: nodeAffinity, podAntiAffinity, podAffinity and tolerations can be used to pin the RGW Pods to specific nodes.
...
placement:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: ceph-rgw
operator: In
values:
- enabled
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-rgw
topologyKey: kubernetes.io/hostname
...
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-m-1 ceph-rgw=enabled
node/cce-k8s-m-1 labeled
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-w-1 ceph-rgw=enabled
node/cce-k8s-w-1 labeled
[root@cce-k8s-m-1 examples]# kubectl label nodes cce-k8s-w-2 ceph-rgw=enabled
node/cce-k8s-w-2 labeled
[root@cce-k8s-m-1 examples]# kubectl get pod -n rook-ceph -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
csi-cephfsplugin-j9vq7 2/2 Running 0 22h 10.0.0.11 cce-k8s-m-1 <none> <none>
csi-cephfsplugin-nrp7f 2/2 Running 0 22h 10.0.0.13 cce-k8s-w-2 <none> <none>
csi-cephfsplugin-pl5vt 2/2 Running 0 22h 10.0.0.12 cce-k8s-w-1 <none> <none>
csi-cephfsplugin-provisioner-7c594f8cf-s7s9n 5/5 Running 0 22h 100.118.108.198 cce-k8s-w-1 <none> <none>
csi-cephfsplugin-provisioner-7c594f8cf-w6h47 5/5 Running 0 22h 100.101.104.136 cce-k8s-w-2 <none> <none>
csi-rbdplugin-2cwbl 2/2 Running 0 22h 10.0.0.13 cce-k8s-w-2 <none> <none>
csi-rbdplugin-4wpd5 2/2 Running 0 22h 10.0.0.12 cce-k8s-w-1 <none> <none>
csi-rbdplugin-dknj9 2/2 Running 0 22h 10.0.0.11 cce-k8s-m-1 <none> <none>
csi-rbdplugin-provisioner-99dd6c4c6-9vhnt 5/5 Running 0 22h 100.118.108.197 cce-k8s-w-1 <none> <none>
csi-rbdplugin-provisioner-99dd6c4c6-zqcwt 5/5 Running 0 22h 100.101.104.135 cce-k8s-w-2 <none> <none>
rook-ceph-crashcollector-cce-k8s-m-1-6675c67d88-54xl6 1/1 Running 0 40s 100.110.251.25 cce-k8s-m-1 <none> <none>
rook-ceph-crashcollector-cce-k8s-w-1-67b77df854-kcm6x 1/1 Running 0 55m 100.118.108.232 cce-k8s-w-1 <none> <none>
rook-ceph-crashcollector-cce-k8s-w-2-6fdb7f546c-s9cr2 1/1 Running 0 55m 100.101.104.173 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-a-86f58d679c-qj8jc 2/2 Running 0 55m 100.101.104.172 cce-k8s-w-2 <none> <none>
rook-ceph-mds-myfs-b-6c4b4b79c6-nlh5j 2/2 Running 0 55m 100.118.108.231 cce-k8s-w-1 <none> <none>
rook-ceph-mds-myfs-c-6947798bdc-fmhdx 2/2 Running 0 55m 100.118.108.233 cce-k8s-w-1 <none> <none>
rook-ceph-mds-myfs-d-75f7974874-n6rzs 2/2 Running 0 55m 100.118.108.234 cce-k8s-w-1 <none> <none>
rook-ceph-mgr-a-c9b6d5b69-jfd4s 3/3 Running 0 22h 100.101.104.137 cce-k8s-w-2 <none> <none>
rook-ceph-mgr-b-6b7fcf6bd-kzng5 3/3 Running 0 22h 100.118.108.200 cce-k8s-w-1 <none> <none>
rook-ceph-mon-a-5cd956c46f-9b4ms 2/2 Running 0 22h 100.101.104.147 cce-k8s-w-2 <none> <none>
rook-ceph-mon-b-7bd996578f-f626r 2/2 Running 0 22h 100.118.108.207 cce-k8s-w-1 <none> <none>
rook-ceph-mon-c-67dcbbff4d-p85n5 2/2 Running 0 22h 100.110.251.12 cce-k8s-m-1 <none> <none>
rook-ceph-operator-644954fb4b-pm9cl 1/1 Running 0 23h 100.101.104.131 cce-k8s-w-2 <none> <none>
rook-ceph-osd-0-7bddbd7847-cj9lf 2/2 Running 0 22h 100.110.251.9 cce-k8s-m-1 <none> <none>
rook-ceph-osd-1-d4589dfc8-842s4 2/2 Running 0 22h 100.118.108.203 cce-k8s-w-1 <none> <none>
rook-ceph-osd-2-59547758c7-vk995 2/2 Running 0 22h 100.101.104.141 cce-k8s-w-2 <none> <none>
rook-ceph-osd-prepare-cce-k8s-m-1-stsgr 0/1 Completed 0 86m 100.110.251.24 cce-k8s-m-1 <none> <none>
rook-ceph-osd-prepare-cce-k8s-w-1-pkx9l 0/1 Completed 0 86m 100.118.108.230 cce-k8s-w-1 <none> <none>
rook-ceph-osd-prepare-cce-k8s-w-2-tpwn8 0/1 Completed 0 86m 100.101.104.169 cce-k8s-w-2 <none> <none>
rook-ceph-rgw-my-store-a-5c9d68f44d-ngdtg 1/2 Running 0 20s 100.101.104.177 cce-k8s-w-2 <none> <none>
rook-ceph-rgw-my-store-a-5c9d68f44d-w54qm 2/2 Running 0 40s 100.110.251.26 cce-k8s-m-1 <none> <none>
rook-ceph-tools-7857bc9568-q2zgv 1/1 Running 0 22h 100.101.104.145 cce-k8s-w-2 <none> <none>
8.5:Connecting to an External Cluster
Rook can also manage an external RGW object store. The CephObjectStore custom resource creates a Service that points at the external object store, so workloads inside the Kubernetes cluster can reach it through the Service VIP.
[root@cce-k8s-m-1 examples]# cat object-external.yaml
#################################################################################################################
# Create an object store with settings for replication in a production environment. A minimum of 3 hosts with
# OSDs are required in this example.
# kubectl create -f object.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
name: external-store
namespace: rook-ceph # namespace:cluster
spec:
gateway:
# The port on which **ALL** the gateway(s) are listening on.
# Passing a single IP from a load-balancer is also valid.
port: 80
externalRgwEndpoints:
- ip: 192.168.39.182
# Multiple external RGW addresses can be listed here, so Rook effectively wraps a VIP in front of the external RGWs and forwards traffic to them. Since I do not have an external RGW, I will not deploy this.
8.6:Creating a Bucket
Once the Ceph RGW cluster is up, object data can be stored in it. Before storing data a bucket must be created; a bucket is the logical storage space of object storage. Buckets can be managed with Ceph's native tooling such as radosgw-admin, but in a cloud-native environment the recommended approach is the cloud-native one: abstract the resources as Kubernetes-style objects and avoid native commands as much as possible, e.g. for creating pools and buckets.
First, create the StorageClass that provisions buckets
[root@cce-k8s-m-1 examples]# grep -v "[*^#]" storageclass-bucket-delete.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-ceph-delete-bucket
reclaimPolicy: Delete
parameters:
objectStoreName: my-store
[root@cce-k8s-m-1 examples]# kubectl apply -f storageclass-bucket-delete.yaml
storageclass.storage.k8s.io/rook-ceph-delete-bucket created
[root@cce-k8s-m-1 examples]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
rook-ceph-block rook-ceph.rbd.csi.ceph.com Delete Immediate true 9h
rook-ceph-delete-bucket rook-ceph.ceph.rook.io/bucket Delete Immediate false 4s
rook-cephfs rook-ceph.cephfs.csi.ceph.com Delete Immediate true 146m
# Request a bucket from the StorageClass through an ObjectBucketClaim; this creates a bucket whose name is generated from the ceph-bkt prefix
[root@cce-k8s-m-1 examples]# grep -v "[*^#]" object-bucket-claim-delete.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
name: ceph-delete-bucket
spec:
generateBucketName: ceph-bkt
storageClassName: rook-ceph-delete-bucket
additionalConfig:
[root@cce-k8s-m-1 examples]# kubectl apply -f object-bucket-claim-delete.yaml
objectbucketclaim.objectbucket.io/ceph-delete-bucket created
[root@cce-k8s-m-1 examples]# kubectl get objectbucketclaim.objectbucket.io
NAME AGE
ceph-delete-bucket 19s
[root@cce-k8s-m-1 examples]# radosgw-admin bucket list
[
"rook-ceph-bucket-checker-5672a514-f2b3-4e15-b286-e441005d8b4d",
"ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0"
]
8.7:Accessing Object Storage from Inside the Cluster
After the ObjectBucketClaim is created, the operator automatically generates the access information for the bucket, including the endpoint and the keys required for authentication. They are stored in a ConfigMap and a Secret and can be retrieved as follows
# Get the access endpoint
[root@cce-k8s-m-1 examples]# kubectl get configmaps ceph-delete-bucket -o yaml -o jsonpath='{.data.BUCKET_HOST}'
rook-ceph-rgw-my-store.rook-ceph.svc
# Get the ACCESS_KEY and SECRET_ACCESS_KEY; the Secret values are base64-encoded, so decode them to get the plain strings
[root@cce-k8s-m-1 examples]# kubectl get secrets ceph-delete-bucket -o yaml -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d
8GLBZSWMCF0T4I8TKL80
[root@cce-k8s-m-1 examples]# kubectl get secrets ceph-delete-bucket -o yaml -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d
zyBtINZZAgDLs1svoGpqddAdzbMRPMAA8WBaBq5P
# Install s3cmd directly on this host
[root@cce-k8s-m-1 examples]# yum install -y s3cmd.noarch
# Configure s3cmd
[root@cce-k8s-m-1 examples]# s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: 8GLBZSWMCF0T4I8TKL80
Secret Key: zyBtINZZAgDLs1svoGpqddAdzbMRPMAA8WBaBq5P
Default Region [US]:
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: 10.96.1.56:80
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: 10.96.1.56:80/%(bucket)
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: no
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: 8GLBZSWMCF0T4I8TKL80
Secret Key: zyBtINZZAgDLs1svoGpqddAdzbMRPMAA8WBaBq5P
Default Region: US
S3 Endpoint: 10.96.1.56:80
DNS-style bucket+hostname:port template for accessing a bucket: 10.96.1.56:80/%(bucket)
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: False
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Not configured. Never mind.
Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'
[root@cce-k8s-m-1 examples]# s3cmd ls
2023-04-03 14:01 s3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0
# Our bucket is visible; next, test uploading and downloading
# Upload
[root@cce-k8s-m-1 examples]# s3cmd put /etc/passwd s3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0
upload: '/etc/passwd' -> 's3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0/passwd' [1 of 1]
1119 of 1119 100% in 0s 51.19 KB/s done
[root@cce-k8s-m-1 examples]# s3cmd ls s3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0
2023-04-03 14:17 1119 s3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0/passwd
# Download a file
[root@cce-k8s-m-1 ~]# ls
anaconda-ks.cfg rook
[root@cce-k8s-m-1 ~]# s3cmd get s3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0/passwd
download: 's3://ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0/passwd' -> './passwd' [1 of 1]
1119 of 1119 100% in 0s 153.87 KB/s done
[root@cce-k8s-m-1 ~]# ls
anaconda-ks.cfg passwd rook
# If clients outside the cluster need to reach the bucket, expose the RGW with a NodePort or an Ingress (a hedged Ingress sketch follows below); external clients can then use s3cmd or any other S3 tool to access the bucket
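As a hedged sketch, an Ingress in the style used elsewhere in this document could look like the following; the hostname is an assumption:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rook-ceph-rgw
  namespace: rook-ceph
spec:
  ingressClassName: nginx
  rules:
  - host: rook-ceph-rgw.kudevops.cn    # hypothetical hostname
    http:
      paths:
      - pathType: Prefix
        path: /
        backend:
          service:
            name: rook-ceph-rgw-my-store
            port:
              name: http    # the Service port shown in the describe output above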
8.8:Creating a Cluster Access User
# Purpose: the key created earlier through the BucketClaim can only operate on that bucket; it cannot create new buckets. If broader permissions are needed to manage buckets, a dedicated user and key must be created
# On the client side, user-related operations can also be handled with this command
[root@cce-k8s-m-1 ~]# radosgw-admin -h
# The cloud-native way, using a custom resource
[root@cce-k8s-m-1 examples]# cat object-user.yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
name: my-user
namespace: rook-ceph # namespace:cluster
spec:
store: my-store
displayName: "my display name"
# Quotas set on the user
# quotas:
# maxBuckets: 100
# maxSize: 10G
# maxObjects: 10000
# Additional permissions given to the user
# capabilities:
# user: "*"
# bucket: "*"
# metadata: "*"
# usage: "*"
# zone: "*"
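The manifest still has to be applied before the Secret shown below exists; assuming the file is used as-is:
[root@cce-k8s-m-1 examples]# kubectl apply -f object-user.yaml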
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user
NAME TYPE DATA AGE
rook-ceph-object-user-my-store-my-user kubernetes.io/rook 3 52s
# Get the new user's access key and secret key
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user -o yaml -o jsonpath='{.data.AccessKey}' | base64 -d
HCPYBS2NG38UUHEO22ZL
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user -o yaml -o jsonpath='{.data.SecretKey}' | base64 -d
se0XCl31VFiDPqAaYS8sNmhvA5NgCKEwE4HQanmK
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get secrets rook-ceph-object-user-my-store-my-user -o yaml -o jsonpath='{.data.Endpoint}' | base64 -d
http://rook-ceph-rgw-my-store.rook-ceph.svc:80
# Update /root/.s3cfg on this host to use the new user's access key and secret key
[root@cce-k8s-m-1 examples]# cat /root/.s3cfg
[default]
access_key = HCPYBS2NG38UUHEO22ZL
#access_key = 8GLBZSWMCF0T4I8TKL80
......
secret_key = se0XCl31VFiDPqAaYS8sNmhvA5NgCKEwE4HQanmK
#secret_key = zyBtINZZAgDLs1svoGpqddAdzbMRPMAA8WBaBq5P
......
# Verify the new user's permissions
[root@cce-k8s-m-1 examples]# s3cmd mb s3://layzer
Bucket 's3://layzer/' created
[root@cce-k8s-m-1 examples]# s3cmd ls
2023-04-03 15:00 s3://layzer
# Upload
[root@cce-k8s-m-1 examples]# s3cmd put /etc/passwd s3://layzer
upload: '/etc/passwd' -> 's3://layzer/passwd' [1 of 1]
1119 of 1119 100% in 0s 57.00 KB/s done
[root@cce-k8s-m-1 examples]# s3cmd ls s3://layzer
2023-04-03 15:00 1119 s3://layzer/passwd
# Download
[root@cce-k8s-m-1 ~]# s3cmd get s3://layzer/passwd
download: 's3://layzer/passwd' -> './passwd' [1 of 1]
1119 of 1119 100% in 0s 161.46 KB/s done
[root@cce-k8s-m-1 ~]# ls
anaconda-ks.cfg passwd rook
# List the buckets
[root@cce-k8s-m-1 ~]# radosgw-admin bucket list
[
"layzer",
"rook-ceph-bucket-checker-5672a514-f2b3-4e15-b286-e441005d8b4d",
"ceph-bkt-ef8f0ce4-b429-40d3-944b-a8bbc5bd33a0"
]
8.9:RGW Cluster Maintenance
# RGW runs as Pods inside the cluster
[root@cce-k8s-m-1 ~]# kubectl get pods -n rook-ceph -l app=rook-ceph-rgw
NAME READY STATUS RESTARTS AGE
rook-ceph-rgw-my-store-a-5c9d68f44d-ngdtg 2/2 Running 0 80m
rook-ceph-rgw-my-store-a-5c9d68f44d-w54qm 2/2 Running 0 81m
# RGW cluster status
[root@cce-k8s-m-1 ~]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 2h)
mgr: b(active, since 24h), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 3 osds: 3 up (since 24h), 3 in (since 24h)
rgw: 2 daemons active (2 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 440 objects, 744 KiB
usage: 354 MiB used, 150 GiB / 150 GiB avail
pgs: 169 active+clean
io:
client: 1.7 KiB/s rd, 3 op/s rd, 0 op/s wr
# View the RGW logs
[root@cce-k8s-m-1 ~]# kubectl -n rook-ceph logs -f rook-ceph-rgw-my-store-a-5c9d68f44d-ngdtg
9:Day-to-Day OSD Management
9.1:OSD Health
[root@cce-k8s-m-1 ~]# ceph status
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 2h)
mgr: b(active, since 24h), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 3 osds: 3 up (since 24h), 3 in (since 24h)
rgw: 2 daemons active (2 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 440 objects, 744 KiB
usage: 363 MiB used, 150 GiB / 150 GiB avail
pgs: 169 active+clean
io:
client: 2.5 KiB/s rd, 4 op/s rd, 0 op/s wr
[root@cce-k8s-m-1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.14639 root default
-5 0.04880 host cce-k8s-m-1
0 hdd 0.04880 osd.0 up 1.00000 1.00000
-3 0.04880 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.04880 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
[root@cce-k8s-m-1 ~]# ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cce-k8s-m-1 114M 49.8G 0 5 3 771 exists,up
1 cce-k8s-w-1 122M 49.8G 1 0 7 772 exists,up
2 cce-k8s-w-2 126M 49.8G 0 0 2 42 exists,up
[root@cce-k8s-m-1 ~]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.04880 1.00000 50 GiB 114 MiB 4.9 MiB 0 B 110 MiB 50 GiB 0.22 0.94 169 up
1 hdd 0.04880 1.00000 50 GiB 122 MiB 4.9 MiB 0 B 118 MiB 50 GiB 0.24 1.01 169 up
2 hdd 0.04880 1.00000 50 GiB 126 MiB 4.9 MiB 0 B 122 MiB 50 GiB 0.25 1.04 169 up
TOTAL 150 GiB 363 MiB 15 MiB 0 B 348 MiB 150 GiB 0.24
MIN/MAX VAR: 0.94/1.04 STDDEV: 0.01
[root@cce-k8s-m-1 ~]# ceph osd utilization
avg 169
stddev 0 (expected baseline 10.6145)
min osd.0 with 169 pgs (1 * mean)
max osd.0 with 169 pgs (1 * mean)
9.2:Scaling OSDs Out
When the Ceph cluster runs out of space it needs to be expanded. Ceph supports dynamic horizontal scaling, usually in one of two ways:
1:add more OSDs
2:add additional hosts
By default Rook uses "all disks on all nodes"; with that default policy, any newly added disk or host is picked up automatically at the interval set by ROOK_DISCOVER_DEVICES_INTERVAL. We changed that policy during installation
[root@cce-k8s-m-1 examples]# cat cluster.yaml
......
217 storage: # cluster level storage configuration and selection
218 useAllNodes: false
219 useAllDevices: false
i.e. both are disabled, so the nodes list has to be maintained by hand and the disks to be added must be listed explicitly. Here I add an extra 20G disk to the master and to each worker node
[root@cce-k8s-m-1 examples]# cat cluster.yaml
...
nodes:
- name: "cce-k8s-m-1"
devices:
- name: "sda"
config:
journalSizeMB: "4096"
- name: "sdb"
config:
journalSizeMB: "4096"
- name: "cce-k8s-w-1"
devices:
- name: "sda"
config:
journalSizeMB: "4096"
- name: "sdb"
config:
journalSizeMB: "4096"
- name: "cce-k8s-w-2"
devices:
- name: "sda"
config:
journalSizeMB: "4096"
- name: "sdb"
config:
journalSizeMB: "4096"
...
[root@cce-k8s-m-1 examples]# kubectl apply -f cluster.yaml
[root@cce-k8s-m-1 examples]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 102m)
mgr: b(active, since 102m), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 6 osds: 6 up (since 16s), 6 in (since 41s); 12 remapped pgs
rgw: 2 daemons active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 440 objects, 759 KiB
usage: 244 MiB used, 210 GiB / 210 GiB avail
pgs: 119/1320 objects misplaced (9.015%)
150 active+clean
12 active+recovering+undersized+remapped
3 active+remapped+backfilling
3 active+recovering
1 active+undersized+remapped
io:
client: 2.5 KiB/s rd, 4 op/s rd, 0 op/s wr
recovery: 238 B/s, 0 keys/s, 10 objects/s
# The cluster has now grown to 6 OSDs
9.3:Troubleshooting a Failed Expansion - 1
If the new OSDs do not join the cluster after an expansion, why? Chapter 5 defined the OSD scheduling policy: OSD Pods are only scheduled onto nodes labeled ceph-osd=enabled, and a node without that label cannot satisfy the scheduling requirement. Earlier only a single node was labeled, so only that node qualified; the remaining nodes need the label set as well
1:check the current OSDs
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get pods -l app=rook-ceph-osd
NAME READY STATUS RESTARTS AGE
rook-ceph-osd-0-7bddbd7847-cj9lf 2/2 Running 2 (105m ago) 26h
rook-ceph-osd-1-d4589dfc8-842s4 2/2 Running 2 (105m ago) 26h
rook-ceph-osd-2-59547758c7-vk995 2/2 Running 2 (105m ago) 26h
rook-ceph-osd-3-86dbf55594-tqdd4 2/2 Running 0 102m
rook-ceph-osd-4-7cbbf75fd4-4gx6k 2/2 Running 0 2m16s
rook-ceph-osd-5-dcfc7cc8-954d6 2/2 Running 0 2m16s
# In short, it comes down to the missing labels (a labeling sketch follows below)
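Assuming the placement from chapter 5 keys on ceph-osd=enabled, any node that is still missing the label can be added to the candidate set like this (replace the placeholder with the actual node name):
[root@cce-k8s-m-1 examples]# kubectl label nodes <node-name> ceph-osd=enabled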
9.4:Troubleshooting a Failed Expansion - 2
After the label-based scheduling is in place, re-checking the Pods may show an OSD Pod stuck in Pending; describing it reveals an "Insufficient memory" event, i.e. the node does not have enough resources.
This problem rarely appears in production, where the hardware is sized and selected up front; if it does occur, adjusting the component resource quotas in cluster.yaml is enough (a hedged sketch follows below).
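A hedged sketch of such a quota adjustment in cluster.yaml; resources.osd is part of the CephCluster spec, but the request/limit values here are only placeholders to be sized for the actual hardware:
spec:
  resources:
    osd:
      requests:
        cpu: "500m"       # placeholder values, tune to the node capacity
        memory: "1Gi"
      limits:
        memory: "2Gi"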
# Check the Ceph cluster
[root@cce-k8s-m-1 examples]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 107m)
mgr: b(active, since 107m), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 6 osds: 6 up (since 5m), 6 in (since 5m)
rgw: 2 daemons active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 440 objects, 759 KiB
usage: 252 MiB used, 210 GiB / 210 GiB avail
pgs: 169 active+clean
io:
client: 2.5 KiB/s rd, 4 op/s rd, 0 op/s wr
# Check the OSD information
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.20485 root default
-5 0.06828 host cce-k8s-m-1
0 hdd 0.04880 osd.0 up 1.00000 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.06828 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
5 hdd 0.01949 osd.5 up 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
[root@cce-k8s-m-1 examples]# ceph osd status
ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE
0 cce-k8s-m-1 47.2M 49.9G 0 0 2 0 exists,up
1 cce-k8s-w-1 38.5M 49.9G 0 0 4 212 exists,up
2 cce-k8s-w-2 48.0M 49.9G 0 0 1 0 exists,up
3 cce-k8s-w-2 51.0M 19.9G 0 0 0 0 exists,up
4 cce-k8s-m-1 33.4M 19.9G 0 0 0 0 exists,up
5 cce-k8s-w-1 33.6M 19.9G 0 0 0 0 exists,up
[root@cce-k8s-m-1 examples]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.04880 1.00000 50 GiB 47 MiB 6.5 MiB 35 KiB 41 MiB 50 GiB 0.09 0.79 121 up
4 hdd 0.01949 1.00000 20 GiB 33 MiB 5.3 MiB 0 B 28 MiB 20 GiB 0.16 1.40 48 up
1 hdd 0.04880 1.00000 50 GiB 38 MiB 6.0 MiB 35 KiB 32 MiB 50 GiB 0.08 0.64 124 up
5 hdd 0.01949 1.00000 20 GiB 34 MiB 5.8 MiB 0 B 28 MiB 20 GiB 0.16 1.40 45 up
2 hdd 0.04880 1.00000 50 GiB 48 MiB 6.4 MiB 35 KiB 42 MiB 50 GiB 0.09 0.80 123 up
3 hdd 0.01949 1.00000 20 GiB 51 MiB 5.4 MiB 0 B 46 MiB 20 GiB 0.25 2.13 46 up
TOTAL 210 GiB 252 MiB 35 MiB 106 KiB 216 MiB 210 GiB 0.12
MIN/MAX VAR: 0.64/2.13 STDDEV: 0.06
# When a Pod fails to run, two routine ways to investigate are
1:kubectl describe pods to inspect the events
2:kubectl logs to inspect the container logs
9.5:Configuring Bluestore Acceleration
Ceph supports two storage engines
1:Filestore: an SSD is used as the journal
2:Bluestore: WAL+DB on SSD
Bluestore is now the mainstream engine; to accelerate it, place the WAL+DB on an SSD. The configuration is outlined below without going into a full setup
Reference: https://rook.io/docs/rook/v1.11/CRDs/Cluster/ceph-cluster-crd/#cluster-settings
# Check the host's disks
[root@cce-k8s-m-1 examples]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 50G 0 disk
sdb 8:16 0 20G 0 disk
sdc 8:32 0 20G 0 disk
sr0 11:0 1 8.3G 0 rom
nvme0n1 259:0 0 20G 0 disk
├─nvme0n1p1 259:1 0 1G 0 part /boot
└─nvme0n1p2 259:2 0 19G 0 part
├─cs-root 253:0 0 17G 0 lvm /
└─cs-swap 253:1 0 2G 0 lvm
[root@cce-k8s-m-1 examples]# cat cluster.yaml
...
nodes:
- name: "cce-k8s-m-1"
devices:
- name: "sda"
config:
journalSizeMB: "4096"
devices:
- name: "sdb"
config:
metadataDevice: "/dev/sdc"
databaseSizeMB: "4096"
walSizeMB: "4096"
...
# However, I do not have an SSD here, so the effect cannot be demonstrated; the device will not be added.
9.6:Removing an OSD the Cloud-Native Way
If the disk behind an OSD fails or its configuration needs to be changed, the OSD has to be removed from the cluster. Things to keep in mind when removing OSDs:
1:make sure the cluster still has enough capacity after the removal
2:make sure the PGs return to a healthy state after the removal
3:avoid removing too many OSDs in a single operation
4:when removing several OSDs, wait for the data to finish rebalancing between removals
Simulate a failure of osd.5: when an OSD fails its Pod goes into CrashLoopBackOff or Error, and the OSD shows as down in Ceph. This can be simulated as follows
[root@cce-k8s-m-1 examples]# kubectl scale deployment -n rook-ceph rook-ceph-osd-5 --replicas=0
deployment.apps/rook-ceph-osd-5 scaled
[root@cce-k8s-m-1 examples]# kubectl get deployments.apps -n rook-ceph
NAME READY UP-TO-DATE AVAILABLE AGE
......
rook-ceph-osd-5 0/0 0 0 80m
......
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.20485 root default
-5 0.06828 host cce-k8s-m-1
0 hdd 0.04880 osd.0 up 1.00000 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.06828 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
5 hdd 0.01949 osd.5 down 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
# We can see that osd.5 is now down.
# Remove it with the job Rook provides: edit the osd-purge.yaml manifest and purge the failed OSD the cloud-native way
[root@cce-k8s-m-1 examples]# cat osd-purge.yaml
#################################################################################################################
# We need many operations to remove OSDs as written in Documentation/Storage-Configuration/Advanced/ceph-osd-mgmt.md.
# This job can automate some of that operations: mark OSDs as `out`, purge these OSDs,
# and delete the corresponding resources like OSD deployments, OSD prepare jobs, and PVCs.
#
# Please note the following.
#
# - This job only works for `down` OSDs.
# - This job doesn't wait for backfilling to be completed.
#
# If you want to remove `up` OSDs and/or want to wait for backfilling to be completed between each OSD removal,
# please do it by hand.
#################################################################################################################
apiVersion: batch/v1
kind: Job
metadata:
name: rook-ceph-purge-osd
namespace: rook-ceph # namespace:cluster
labels:
app: rook-ceph-purge-osd
spec:
template:
metadata:
labels:
app: rook-ceph-purge-osd
spec:
serviceAccountName: rook-ceph-purge-osd
containers:
- name: osd-removal
image: rook/ceph:v1.10.12
args:
- "ceph"
- "osd"
- "remove"
- "--preserve-pvc"
- "false"
- "--force-osd-removal"
- "false"
- "--osd-ids"
# the ID of the OSD to remove
- "5"
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: ROOK_MON_ENDPOINTS
valueFrom:
configMapKeyRef:
key: data
name: rook-ceph-mon-endpoints
- name: ROOK_CEPH_USERNAME
valueFrom:
secretKeyRef:
key: ceph-username
name: rook-ceph-mon
- name: ROOK_CONFIG_DIR
value: /var/lib/rook
- name: ROOK_CEPH_CONFIG_OVERRIDE
value: /etc/rook/config/override.conf
- name: ROOK_FSID
valueFrom:
secretKeyRef:
key: fsid
name: rook-ceph-mon
- name: ROOK_LOG_LEVEL
value: DEBUG
volumeMounts:
- mountPath: /etc/ceph
name: ceph-conf-emptydir
- mountPath: /var/lib/rook
name: rook-config
- name: ceph-admin-secret
mountPath: /var/lib/rook-ceph-mon
volumes:
- name: ceph-admin-secret
secret:
secretName: rook-ceph-mon
optional: false
items:
- key: ceph-secret
path: secret.keyring
- emptyDir: {}
name: ceph-conf-emptydir
- emptyDir: {}
name: rook-config
restartPolicy: Never
# Run the job to remove the OSD
[root@cce-k8s-m-1 examples]# kubectl apply -f osd-purge.yaml
job.batch/rook-ceph-purge-osd created
# Ceph now resynchronizes the data automatically; wait for the synchronization to finish
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.18536 root default
-5 0.06828 host cce-k8s-m-1
0 hdd 0.04880 osd.0 up 1.00000 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.04880 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
[root@cce-k8s-m-1 examples]# ceph osd crush dump | grep devices -A 50
"devices": [
{
"id": 0,
"name": "osd.0",
"class": "hdd"
},
{
"id": 1,
"name": "osd.1",
"class": "hdd"
},
{
"id": 2,
"name": "osd.2",
"class": "hdd"
},
{
"id": 3,
"name": "osd.3",
"class": "hdd"
},
{
"id": 4,
"name": "osd.4",
"class": "hdd"
}
],
# Delete the now-unneeded Deployment
[root@cce-k8s-m-1 examples]# kubectl delete deployments.apps -n rook-ceph rook-ceph-osd-5
deployment.apps "rook-ceph-osd-5" deleted
# Because useAllNodes and useAllDevices are disabled in cluster.yaml, the OSD's entry must also be removed from the nodes list so that the disk is not added back on the next apply (see the sketch below).
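A hedged sketch of that edit, assuming osd.5 lived on the sdb disk of cce-k8s-w-1 (which matches the ceph osd status output earlier): keep only the remaining disk for that node in cluster.yaml so the purged disk is not rediscovered:
nodes:
- name: "cce-k8s-w-1"
  devices:
  - name: "sda"                 # sdb removed together with osd.5
    config:
      journalSizeMB: "4096"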
[root@cce-k8s-m-1 examples]# ceph status
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_WARN
clock skew detected on mon.b, mon.c
services:
mon: 3 daemons, quorum a,b,c (age 78m)
mgr: b(active, since 78m), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 5 osds: 5 up (since 12m), 5 in (since 8m)
rgw: 2 daemons active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 440 objects, 777 KiB
usage: 184 MiB used, 190 GiB / 190 GiB avail
pgs: 169 active+clean
io:
client: 4.7 KiB/s rd, 426 B/s wr, 7 op/s rd, 1 op/s wr
9.7:Removing an OSD Manually
# Besides the cloud-native approach, an OSD can also be removed the standard Ceph way, as follows
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.18536 root default
-5 0.06828 host cce-k8s-m-1
0 hdd 0.04880 osd.0 up 1.00000 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.04880 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
# Mark the OSD as out
[root@cce-k8s-m-1 examples]# ceph osd out osd.0
marked out osd.0.
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.18536 root default
-5 0.06828 host cce-k8s-m-1
0 hdd 0.04880 osd.0 up 0 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.04880 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
[root@cce-k8s-m-1 examples]# ceph -s
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_WARN
Degraded data redundancy: 1/1317 objects degraded (0.076%), 1 pg degraded
services:
mon: 3 daemons, quorum a,b,c (age 100m)
mgr: b(active, since 100m), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 5 osds: 5 up (since 37s), 4 in (since 11s); 25 remapped pgs
rgw: 2 daemons active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 439 objects, 777 KiB
usage: 275 MiB used, 140 GiB / 140 GiB avail
pgs: 0.592% pgs not active
1/1317 objects degraded (0.076%)
101/1317 objects misplaced (7.669%)
126 active+clean
28 active+recovering+undersized+remapped
11 active+recovering
2 active+remapped+backfilling
1 peering
1 active+recovering+undersized+degraded+remapped
io:
client: 11 KiB/s rd, 1.2 KiB/s wr, 14 op/s rd, 4 op/s wr
recovery: 5.3 KiB/s, 3 keys/s, 20 objects/s
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.18536 root default
-5 0.06828 host cce-k8s-m-1
0 hdd 0.04880 osd.0 down 0 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.04880 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
[root@cce-k8s-m-1 examples]# ceph osd purge 0
purged osd.0
[root@cce-k8s-m-1 examples]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.13657 root default
-5 0.01949 host cce-k8s-m-1
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-3 0.04880 host cce-k8s-w-1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.06828 host cce-k8s-w-2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
[root@cce-k8s-m-1 examples]# ceph status
cluster:
id: 5d33f341-bf25-46bd-8a17-53952cda0eee
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 109m)
mgr: b(active, since 109m), standbys: a
mds: 2/2 daemons up, 2 hot standby
osd: 4 osds: 4 up (since 26s), 4 in (since 37s)
rgw: 2 daemons active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 12 pools, 169 pgs
objects: 440 objects, 777 KiB
usage: 263 MiB used, 140 GiB / 140 GiB avail
pgs: 169 active+clean
io:
client: 3.4 KiB/s rd, 170 B/s wr, 5 op/s rd, 1 op/s wr
# After the OSD is removed, the data is resynchronized (backfilling and rebalancing) to complete the migration
[root@cce-k8s-m-1 examples]# ceph osd crush dump | grep devices -A 50
"devices": [
{
"id": 0,
"name": "device0"
},
{
"id": 1,
"name": "osd.1",
"class": "hdd"
},
{
"id": 2,
"name": "osd.2",
"class": "hdd"
},
{
"id": 3,
"name": "osd.3",
"class": "hdd"
},
{
"id": 4,
"name": "osd.4",
"class": "hdd"
}
],
# Also remove the corresponding Deployment and the matching entry in cluster.yaml
9.8:Replacing an OSD
1:The idea behind a replacement is:
1:remove the OSD from the Ceph cluster, either the cloud-native way or manually
2:once the data has finished resynchronizing after the removal, add the disk back through the expansion procedure
3:when adding it back, remember to clean up the old LVM volumes on the disk first (see the sketch after this list)
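A hedged cleanup sketch for the node that held the replaced OSD; the device name and LV name below are placeholders and must be double-checked before wiping anything:
# run on the node that owned the old OSD
[root@cce-k8s-w-1 ~]# ls /dev/mapper/ | grep ceph                     # find the ceph--*--osd--block--* logical volume of the old OSD
[root@cce-k8s-w-1 ~]# dmsetup remove <ceph--...--osd--block--...>     # placeholder LV name
[root@cce-k8s-w-1 ~]# wipefs -a /dev/sdb                              # placeholder device
[root@cce-k8s-w-1 ~]# sgdisk --zap-all /dev/sdb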
10:Dashboard Overview
10.1:Dashboard Web Overview
Day to day, Ceph can be managed with the native Ceph CLI or with the cloud-native approach Rook provides, but both have a learning curve. Ceph offers a simpler way to use and monitor the cluster: the Ceph Dashboard (see the official Ceph Dashboard documentation). It is a WebUI that covers two areas:
1:Ceph management: day-to-day management of Pools, RBD, CephFS, etc.
2:Performance monitoring: monitoring the health of Ceph components such as Mon, OSD and Mgr
10.2:Enabling the Dashboard Component
Rook enables the Ceph Dashboard component in cluster.yaml by default; it is embedded in the mgr and works without any extra configuration (no package installation, module enabling, SSL certificates or port setup required), which makes it very easy to use
...
mgr:
# When higher availability of the mgr is needed, increase the count to 2.
# In that case, one mgr will be active and one in standby. When Ceph updates which
# mgr is active, Rook will update the mgr services to match the active mgr.
count: 2
allowMultiplePerNode: false
modules:
# Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
# are already enabled by other settings in the cluster CR.
- name: pg_autoscaler
enabled: true
# enable the ceph dashboard for viewing cluster status
dashboard:
enabled: true
urlPrefix: /ceph-dashboard
# serve the dashboard at the given port.
# port: 8443
# serve the dashboard using SSL
ssl: false
...
The Ceph Dashboard is exposed through a Service by default, reachable over HTTPS on port 8443, as shown below
[root@cce-k8s-m-1 examples]# kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr ClusterIP 10.96.1.168 <none> 9283/TCP 64m
rook-ceph-mgr-dashboard ClusterIP 10.96.1.173 <none> 8443/TCP 64m
rook-ceph-mon-a ClusterIP 10.96.0.92 <none> 6789/TCP,3300/TCP 66m
rook-ceph-mon-b ClusterIP 10.96.2.93 <none> 6789/TCP,3300/TCP 65m
rook-ceph-mon-c ClusterIP 10.96.1.40 <none> 6789/TCP,3300/TCP 65m
# Expose the Dashboard
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rook-ceph-mgr-dashboard
namespace: rook-ceph
spec:
ingressClassName: nginx
rules:
- host: rook-ceph.kudevops.cn
http:
paths:
- pathType: Prefix
path: /ceph-dashboard
backend:
service:
name: rook-ceph-mgr-dashboard
port:
name: http-dashboard
# Since I am using an Ingress here, the following manifests can be used as well
[root@cce-k8s-m-1 examples]# ls | grep dashboard
dashboard-external-https.yaml
dashboard-external-http.yaml
dashboard-ingress-https.yaml
dashboard-loadbalancer.yaml
# Any of these four works; note that because SSL was disabled earlier, the Service now listens on port 7000

Username: admin; the password is read from the Secret
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get secrets rook-ceph-dashboard-password -o yaml -o jsonpath={.data.password} | base64 -d
R}9I?$ih=K'bqs8|>rM_

10.3:Monitoring Ceph with the Dashboard
The Dashboard provides a graphical interface that covers the monitoring Ceph needs, including:
1:cluster monitoring, covering each component such as MON, MGR, RGW, OSD and HOST
2:capacity monitoring: usage, object counts, Pool status, PG status, etc.
3:performance monitoring: client read/write performance, throughput, recovery traffic, scrubbing, etc.



10.4:Managing Ceph with the Dashboard
The Dashboard also offers some management functions for Ceph, such as Pools, RBD block storage and RGW object storage
# Create a Pool

# Change the Pool type

# View Pool details

# Compare with the CLI
[root@cce-k8s-m-1 examples]# ceph osd lspools
1 .mgr
2 myfs-metadata
3 myfs-replicated


11:Prometheus Monitoring
11.1:Prometheus and Ceph
The Dashboard covers part of Ceph monitoring, but fine-grained and custom monitoring is beyond it. Prometheus is a modern monitoring system that provides richer metrics and alerting. A complete Prometheus setup usually consists of:
1:exporter: the monitoring agent that reports metrics; the mgr already ships one with built-in metrics
2:prometheus: the server side, which stores the metrics and provides querying, monitoring, alerting and display
3:grafana: pulls metrics from Prometheus and renders them with dashboard templates

11.2:The exporter Client
Rook enables the exporter by default: the rook-ceph-mgr Service exposes port 9283 as the agent endpoint. It is a ClusterIP by default and can be switched to NodePort for external access, as follows
[root@cce-k8s-m-1 examples]# kubectl -n rook-ceph get svc -l app=rook-ceph-mgr
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr ClusterIP 10.96.1.168 <none> 9283/TCP 178m
rook-ceph-mgr-dashboard ClusterIP 10.96.1.173 <none> 7000/TCP 178m
# We can edit this Service so that Ceph metrics can be scraped from outside the cluster
[root@cce-k8s-m-1 examples]# kubectl edit svc -n rook-ceph rook-ceph-mgr
apiVersion: v1
kind: Service
metadata:
creationTimestamp: "2023-04-04T04:42:55Z"
labels:
app: rook-ceph-mgr
ceph_daemon_id: b
rook_cluster: rook-ceph
name: rook-ceph-mgr
namespace: rook-ceph
ownerReferences:
- apiVersion: ceph.rook.io/v1
blockOwnerDeletion: true
controller: true
kind: CephCluster
name: rook-ceph
uid: cdc98920-f581-4acb-a0a1-27a320987d2a
resourceVersion: "39576"
uid: d015099f-880c-4a54-a03a-0ca871dfe33e
spec:
clusterIP: 10.96.1.168
clusterIPs:
- 10.96.1.168
externalTrafficPolicy: Cluster
internalTrafficPolicy: Cluster
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
ports:
- name: http-metrics
nodePort: 31976
port: 9283
protocol: TCP
targetPort: 9283
selector:
app: rook-ceph-mgr
ceph_daemon_id: b
rook_cluster: rook-ceph
sessionAffinity: None
type: NodePort
status:
loadBalancer: {}
[root@cce-k8s-m-1 examples]# kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mgr ClusterIP 10.96.1.168 <none> 9283/TCP 3h
rook-ceph-mgr-dashboard ClusterIP 10.96.1.173 <none> 7000/TCP 3h
rook-ceph-mon-a ClusterIP 10.96.0.92 <none> 6789/TCP,3300/TCP 3h1m
rook-ceph-mon-b ClusterIP 10.96.2.93 <none> 6789/TCP,3300/TCP 3h
rook-ceph-mon-c ClusterIP 10.96.1.40 <none> 6789/TCP,3300/TCP 3h
rook-ceph-rgw-my-store ClusterIP 10.96.1.169 <none> 80/TCP 52m
[root@cce-k8s-m-1 ~]# cat ceph-metrics-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rook-ceph-metrics
namespace: rook-ceph
spec:
ingressClassName: nginx
rules:
- host: rook-ceph-metrics.kudevops.cn
http:
paths:
- pathType: Prefix
path: /metrics
backend:
service:
name: rook-ceph-mgr
port:
name: http-metrics
[root@cce-k8s-m-1 ~]# kubectl get ingress -n rook-ceph
NAME CLASS HOSTS ADDRESS PORTS AGE
rook-ceph-metrics nginx rook-ceph-metrics.kudevops.cn 80 4s
rook-ceph-mgr-dashboard nginx rook-ceph.kudevops.cn 10.0.0.12 80 99m
[root@cce-k8s-m-1 ~]# kubectl apply -f ceph-metrics-ingress.yaml
ingress.networking.k8s.io/rook-ceph-metrics created

11.3:Deploying the Prometheus Operator
[root@cce-k8s-m-1 monitoring]# kubectl create -f https://raw.githubusercontent.com/coreos/prometheus-operator/v0.64.0/bundle.yaml
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusagents.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
serviceaccount/prometheus-operator created
service/prometheus-operator created
[root@cce-k8s-m-1 monitoring]# kubectl get pod,svc
NAME READY STATUS RESTARTS AGE
pod/prometheus-operator-745b9c6b7f-s46qf 1/1 Running 0 17s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 9h
service/prometheus-operator ClusterIP None <none> 8080/TCP 17s
11.4:Deploying Prometheus
[root@cce-k8s-m-1 ~]# cd rook/deploy/examples/monitoring/
[root@cce-k8s-m-1 monitoring]# ls
csi-metrics-service-monitor.yaml keda-rgw.yaml prometheus-service.yaml rbac.yaml
externalrules.yaml localrules.yaml prometheus.yaml service-monitor.yaml
[root@cce-k8s-m-1 monitoring]# kubectl apply -f prometheus.yaml
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus-rules created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
prometheus.monitoring.coreos.com/rook-prometheus created
[root@cce-k8s-m-1 monitoring]# kubectl apply -f prometheus-service.yaml
service/rook-prometheus created
[root@cce-k8s-m-1 monitoring]# kubectl apply -f service-monitor.yaml
servicemonitor.monitoring.coreos.com/rook-ceph-mgr created
[root@cce-k8s-m-1 monitoring]# kubectl get pod,svc -n rook-ceph
NAME READY STATUS RESTARTS AGE
pod/csi-cephfsplugin-g654n 2/2 Running 2 (69m ago) 9h
pod/csi-cephfsplugin-provisioner-75b9f74d7b-5z57g 5/5 Running 5 (69m ago) 9h
pod/csi-cephfsplugin-provisioner-75b9f74d7b-r58zz 5/5 Running 5 (69m ago) 9h
pod/csi-cephfsplugin-ttw8h 2/2 Running 2 (69m ago) 9h
pod/csi-cephfsplugin-zdqmq 2/2 Running 2 (69m ago) 9h
pod/csi-rbdplugin-4kkp8 2/2 Running 2 (69m ago) 9h
pod/csi-rbdplugin-5cbxq 2/2 Running 2 (69m ago) 9h
pod/csi-rbdplugin-provisioner-66d48ddf89-4dn6n 5/5 Running 5 (69m ago) 9h
pod/csi-rbdplugin-provisioner-66d48ddf89-gr6tb 5/5 Running 5 (69m ago) 9h
pod/csi-rbdplugin-tppmh 2/2 Running 2 (69m ago) 9h
pod/prometheus-rook-prometheus-0 2/2 Running 0 74s
pod/rook-ceph-crashcollector-cce-k8s-m-1-5c977d69f5-fpctk 1/1 Running 1 (69m ago) 8h
pod/rook-ceph-crashcollector-cce-k8s-w-1-5f8979948c-xj22j 1/1 Running 1 (69m ago) 8h
pod/rook-ceph-crashcollector-cce-k8s-w-2-7cdd5bfdf4-bwv2c 1/1 Running 1 (69m ago) 5h34m
pod/rook-ceph-mds-myfs-a-fc84f6679-6x2kp 2/2 Running 2 (69m ago) 8h
pod/rook-ceph-mds-myfs-b-9cd4f99fb-xwctq 2/2 Running 2 (69m ago) 8h
pod/rook-ceph-mgr-a-5f8c5f4d97-rnkq5 3/3 Running 4 (68m ago) 4h2m
pod/rook-ceph-mgr-b-5f547b498-pgt5r 3/3 Running 4 (67m ago) 4h1m
pod/rook-ceph-mon-a-6c9999c8cf-64wlv 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-mon-b-cb878d55-vtvqr 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-mon-c-644cdff6c5-h2l9l 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-operator-cc59dfcb9-dd4hx 1/1 Running 1 (69m ago) 8h
pod/rook-ceph-osd-0-78cf58899c-tkgd6 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-osd-1-65d869f84-5bwfq 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-osd-2-6f5995d478-5qpzr 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-osd-3-879d78967-444lh 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-osd-4-5fc974dbbb-skr8v 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-osd-5-54f5b874cf-6rxfs 2/2 Running 2 (69m ago) 9h
pod/rook-ceph-osd-prepare-cce-k8s-m-1-mfbxv 0/1 Completed 0 66m
pod/rook-ceph-osd-prepare-cce-k8s-w-1-qff8s 0/1 Completed 0 66m
pod/rook-ceph-osd-prepare-cce-k8s-w-2-lwt72 0/1 Completed 0 66m
pod/rook-ceph-tools-54bdbfc7b7-t2vld 1/1 Running 1 (69m ago) 9h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus-operated ClusterIP None <none> 9090/TCP 74s
service/rook-ceph-mgr ClusterIP 10.96.1.168 <none> 9283/TCP 9h
service/rook-ceph-mgr-dashboard ClusterIP 10.96.1.173 <none> 7000/TCP 9h
service/rook-ceph-mon-a ClusterIP 10.96.0.92 <none> 6789/TCP,3300/TCP 9h
service/rook-ceph-mon-b ClusterIP 10.96.2.93 <none> 6789/TCP,3300/TCP 9h
service/rook-ceph-mon-c ClusterIP 10.96.1.40 <none> 6789/TCP,3300/TCP 9h
service/rook-prometheus NodePort 10.96.3.152 <none> 9090:30900/TCP 56s
11.5:The Prometheus Console
Prometheus can be reached directly through the Service, or exposed with an Ingress



11.6:Installing the Grafana Server
Prometheus stores the metrics reported by the exporters; the data can be queried in Prometheus with PromQL, but its built-in UI is rather bare and does not meet day-to-day needs. Fortunately grafana excels at visualization, so we use grafana to display the data; grafana supports Prometheus as a backend data source. Installation:
[root@cce-k8s-m-1 monitoring]# yum install -y https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/rpm/Packages/grafana-9.4.7-1.x86_64.rpm
[root@cce-k8s-m-1 monitoring]# systemctl enable grafana-server --now
Synchronizing state of grafana-server.service with SysV service script with /usr/lib/systemd/systemd-sysv-install.
Executing: /usr/lib/systemd/systemd-sysv-install enable grafana-server
Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.
[root@cce-k8s-m-1 monitoring]# ss -lnt | grep 3000
LISTEN 0 4096 *:3000 *:*
11.7:Configuring the Grafana Data Source
grafana is an open-source visualization tool whose data comes from backend data sources, so a data source must be configured to integrate it with Prometheus; Prometheus is reached on port 9090 (exposed above via the NodePort Service), as sketched below
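As a rough example, assuming the rook-prometheus NodePort (30900, shown in the Service listing above) is used and 10.0.0.11 is a node IP reachable from the Grafana host, the data source settings would look like:
Type: Prometheus
URL:  http://10.0.0.11:30900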

Username: admin
Password: admin



11.8:Configuring Grafana Dashboards
How does grafana display Prometheus data? In principle by writing PromQL, but plenty of ready-made JSON dashboard templates exist; simply download a JSON template and import it into grafana






11.9:Configuring Prometheus Alerts

[root@cce-k8s-m-1 monitoring]# kubectl apply -f rbac.yaml
role.rbac.authorization.k8s.io/rook-ceph-monitor created
rolebinding.rbac.authorization.k8s.io/rook-ceph-monitor created
role.rbac.authorization.k8s.io/rook-ceph-metrics created
rolebinding.rbac.authorization.k8s.io/rook-ceph-metrics created
role.rbac.authorization.k8s.io/rook-ceph-monitor-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-monitor-mgr created
[root@cce-k8s-m-1 monitoring]# vim ../cluster.yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
name: rook-ceph
namespace: rook-ceph
[...]
spec:
[...]
monitoring:
enabled: true # enable monitoring; the default is false
rulesNamespace: "rook-ceph"
[...]
[root@cce-k8s-m-1 monitoring]# kubectl apply -f localrules.yaml
prometheusrule.monitoring.coreos.com/prometheus-ceph-rules created
# Once configured, the alert rules show up in the Prometheus console under Alerts, as shown below

12:Scaling Container Storage
12.1:Expanding Cloud-Native Storage
When a container runs out of space its storage needs to be expanded. Rook provides two drivers by default:
1:RBD
2:CephFS
Both provide storage to containers through a StorageClass provisioner, and both already support online expansion. The flow is:
the client requests more capacity by updating the PVC against the StorageClass, the driver adjusts the PV capacity and the underlying volume (the RBD image, or the CephFS subvolume), and the expansion is achieved end to end. The procedure is as follows:
[root@cce-k8s-m-1 monitoring]# cd ../
[root@cce-k8s-m-1 examples]# kubectl apply -f mysql.yaml
service/wordpress-mysql created
persistentvolumeclaim/mysql-pv-claim created
deployment.apps/wordpress-mysql created
[root@cce-k8s-m-1 examples]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-... 20Gi RWO Delete Bound default/mysql-pv-claim rook-cephfs 44s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mysql-... Bound pvc-31b8fe4d-3cee-442b-b55b-8d8923a9cf72 20Gi RWO rook-cephfs 45s
[root@cce-k8s-m-1 examples]# cat mysql.yaml
apiVersion: v1
kind: Service
metadata:
name: wordpress-mysql
labels:
app: wordpress
spec:
ports:
- port: 3306
selector:
app: wordpress
tier: mysql
clusterIP: None
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pv-claim
labels:
app: wordpress
spec:
storageClassName: rook-cephfs
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
---
......
# Update the manifest
[root@cce-k8s-m-1 examples]# kubectl apply -f mysql.yaml
service/wordpress-mysql unchanged
persistentvolumeclaim/mysql-pv-claim configured
deployment.apps/wordpress-mysql unchanged
[root@cce-k8s-m-1 examples]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-31b... 30Gi RWO Delete Bound default/mysql-pv-claim rook-cephfs 3h23m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mysql-... Bound pvc-31b8fe4d-3cee-442b-b55b-8d8923a9cf72 30Gi RWO rook-cephfs 3h23m
# The PVC has grown to 30G; check the storage size inside the container
[root@cce-k8s-m-1 examples]# kubectl exec -it wordpress-mysql-6f99c59595-895bn -- df -Th
Filesystem Type Size Used Avail Use% Mounted on
overlay overlay 17G 8.1G 9.0G 48% /
tmpfs tmpfs 64M 0 64M 0% /dev
/dev/mapper/cs-root xfs 17G 8.1G 9.0G 48% /etc/hosts
shm tmpfs 64M 0 64M 0% /dev/shm
10.96.0.92:6789,10.96.2.93:6789,10.96.1.40:6789:/volumes/csi/csi-vol-00cee7d1-d2f5-11ed-bfce-0a700b708e03/b490e28f-3789-4e83-9c6f-c137a3d27547 ceph 30G 112M 30G 1% /var/lib/mysql
tmpfs tmpfs 3.5G 12K 3.5G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs tmpfs 1.8G 0 1.8G 0% /proc/acpi
tmpfs tmpfs 1.8G 0 1.8G 0% /proc/scsi
tmpfs tmpfs 1.8G 0 1.8G 0% /sys/firmware
# The mount inside the container is 30G as well
12.2:How RBD Expansion Works
# PVC -> PV -> RBD image -> filesystem expansion
# Check the RBD image that backs the PV
[root@cce-k8s-m-1 examples]# kubectl get pv pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
pv.kubernetes.io/provisioned-by: rook-ceph.rbd.csi.ceph.com
volume.kubernetes.io/provisioner-deletion-secret-name: rook-csi-rbd-provisioner
volume.kubernetes.io/provisioner-deletion-secret-namespace: rook-ceph
creationTimestamp: "2023-04-04T18:00:54Z"
finalizers:
- kubernetes.io/pv-protection
name: pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc
resourceVersion: "117698"
uid: 10402b2b-cd8f-4ca5-8bad-dd8dd40ea0c4
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 30Gi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: mysql-pv-claim
namespace: default
resourceVersion: "117681"
uid: c7740a06-df1d-4c23-bf3b-4a79d024f9fc
csi:
controllerExpandSecretRef:
name: rook-csi-rbd-provisioner
namespace: rook-ceph
driver: rook-ceph.rbd.csi.ceph.com
fsType: ext4
nodeStageSecretRef:
name: rook-csi-rbd-node
namespace: rook-ceph
volumeAttributes:
clusterID: rook-ceph
imageFeatures: layering
imageFormat: "2"
imageName: csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104 # the RBD image backing this PV
journalPool: replicapool # the pool that holds the RBD image
pool: replicapool
storage.kubernetes.io/csiProvisionerIdentity: 1680612058363-8081-rook-ceph.rbd.csi.ceph.com
volumeHandle: 0001-0009-rook-ceph-000000000000000d-a4a7b890-d312-11ed-8457-b6ca8b376104
persistentVolumeReclaimPolicy: Delete
storageClassName: rook-ceph-block
volumeMode: Filesystem
status:
phase: Bound
# Check the RBD image size
[root@cce-k8s-m-1 examples]# rbd -p replicapool info csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104
rbd image 'csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104':
size 30 GiB in 7680 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 1121597d1eef6
block_name_prefix: rbd_data.1121597d1eef6
format: 2
features: layering
op_features:
flags:
create_timestamp: Wed Apr 5 02:00:54 2023
access_timestamp: Wed Apr 5 02:00:54 2023
modify_timestamp: Wed Apr 5 02:00:54 2023
# Check the disk space inside the container
[root@cce-k8s-m-1 examples]# kubectl exec -it wordpress-mysql-6f99c59595-hfq7q -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 17G 8.1G 9.0G 48% /
tmpfs 64M 0 64M 0% /dev
/dev/mapper/cs-root 17G 8.1G 9.0G 48% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/rbd0 30G 116M 30G 1% /var/lib/mysql
tmpfs 3.5G 12K 3.5G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 1.8G 0 1.8G 0% /proc/acpi
tmpfs 1.8G 0 1.8G 0% /proc/scsi
tmpfs 1.8G 0 1.8G 0% /sys/firmware
# Grow the RBD image directly with rbd resize (note: this bypasses the Kubernetes/CSI expansion flow)
[root@cce-k8s-m-1 examples]# rbd -p replicapool resize csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104 --size 40G
Resizing image: 100% complete...done.
[root@cce-k8s-m-1 examples]# rbd -p replicapool info csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104
rbd image 'csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104':
size 40 GiB in 10240 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 1121597d1eef6
block_name_prefix: rbd_data.1121597d1eef6
format: 2
features: layering
op_features:
flags:
create_timestamp: Wed Apr 5 02:00:54 2023
access_timestamp: Wed Apr 5 02:00:54 2023
modify_timestamp: Wed Apr 5 02:00:54 2023
# Check whether the PVC/PV capacity was updated
[root@cce-k8s-m-1 examples]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-... 30Gi RWO Delete Bound default/mysql-pv-claim rook-ceph-block 5m2s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mysql-... Bound pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc 30Gi RWO rook-ceph-block 5m2s
# Log in to the node where the Pod is running and grow the filesystem
[root@cce-k8s-m-1 examples]# kubectl get pod -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
prometheus-operator-745b9c6b7f-s46qf 1/1 Running 0 4h27m 100.101.104.168 cce-k8s-w-2 <none> <none>
wordpress-mysql-6f99c59595-hfq7q 1/1 Running 0 6m48s 100.118.108.209 cce-k8s-w-1 <none> <none>
[root@cce-k8s-w-1 ~]# rbd showmapped
id pool namespace image snap device
0 replicapool csi-vol-a4a7b890-d312-11ed-8457-b6ca8b376104 - /dev/rbd0
[root@cce-k8s-w-1 ~]# df -h | grep pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc
/dev/rbd0 30G 116M 30G 1% /var/lib/kubelet/pods/36ae784d-5f75-4123-84e1-27174cfe6cfc/volumes/kubernetes.io~csi/pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc/mount
[root@cce-k8s-w-1 ~]# df -h | grep pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc | awk '{print $1}'
/dev/rbd0
[root@cce-k8s-w-1 ~]# resize2fs /dev/rbd0
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/rbd0 is mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/f50ce3b26dd2c62f281e30d83d0ed62f6dc99bbc0f35f00d0642a44cd1671fe5/globalmount/0001-0009-rook-ceph-000000000000000d-a4a7b890-d312-11ed-8457-b6ca8b376104; on-line resizing required
old_desc_blocks = 4, new_desc_blocks = 5
The filesystem on /dev/rbd0 is now 10485760 (4k) blocks long.
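# Note: resize2fs only applies to ext2/3/4 filesystems; the PV above was created with fsType: ext4.
# Hypothetical alternative for an xfs-formatted RBD PV: xfs is grown online via its mount point,
# not the block device.
MOUNTPOINT=$(df | grep /dev/rbd0 | awk '{print $NF}' | head -1)
xfs_growfs "$MOUNTPOINT"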
# Verify the expanded capacity on the node
[root@cce-k8s-w-1 ~]# df -h | grep pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc
/dev/rbd0 40G 116M 40G 1% /var/lib/kubelet/pods/36ae784d-5f75-4123-84e1-27174cfe6cfc/volumes/kubernetes.io~csi/pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc/mount
# Check inside the container again; the mount now shows the new 40G capacity
[root@cce-k8s-m-1 examples]# kubectl exec -it wordpress-mysql-6f99c59595-hfq7q -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 17G 8.2G 8.8G 49% /
tmpfs 64M 0 64M 0% /dev
/dev/mapper/cs-root 17G 8.2G 8.8G 49% /etc/hosts
shm 64M 0 64M 0% /dev/shm
/dev/rbd0 40G 116M 40G 1% /var/lib/mysql
tmpfs 3.5G 12K 3.5G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 1.8G 0 1.8G 0% /proc/acpi
tmpfs 1.8G 0 1.8G 0% /proc/scsi
tmpfs 1.8G 0 1.8G 0% /sys/firmware
# Remaining issue: the PVC/PV objects were not updated and still report 30Gi. Resizing the RBD
# image with the rbd CLI bypasses the CSI external-resizer, so Kubernetes never learns that the
# capacity changed.
[root@cce-k8s-m-1 examples]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc- 30Gi RWO Delete Bound default/mysql-pv-claim rook-ceph-block 12m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
per.../mysql-pv-claim Bound pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc 30Gi RWO rook-ceph-block 12m
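# A cleaner approach is to drive the whole chain from Kubernetes instead of resizing the RBD image
# by hand: patch the PVC's requested size and let the CSI external-resizer grow the RBD image and
# the filesystem, which also updates the PVC/PV capacity fields. The sketch below assumes the
# rook-ceph-block StorageClass was created with allowVolumeExpansion: true (the default in the
# Rook example storageclass.yaml).
kubectl patch pvc mysql-pv-claim -p '{"spec":{"resources":{"requests":{"storage":"40Gi"}}}}'
# Once the expansion finishes, the PVC, the PV and the mount inside the Pod all report 40Gi
kubectl get pvc mysql-pv-claim
kubectl get pv pvc-c7740a06-df1d-4c23-bf3b-4a79d024f9fc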
