Deploying a Ceph v17 Cluster with cephadm
Software environment:
- OS: Rocky Linux 8.10
- cephadm: cephadm v17.2.8 (quincy)
- Podman: 4.9.4
- Python: 3.6.8
- Kernel: 4.18.0-553.53.1.el8_10.x86_64
Network plan:
Public Network: 172.25.55.0/24  Cluster Network: 172.16.57.0/24
Server plan:
Hostname | PublicNET (client-facing) | ClusterNET (internal cluster traffic) | Roles | Disks |
ceph-node01 | 172.25.55.200 | 172.16.57.100 | bootstrap,mon,mgr,osd | sda(20G),sdb(10G) |
ceph-node02 | 172.25.55.201 | 172.16.57.101 | mon,mgr,osd | sda(20G),sdb(10G) |
ceph-node03 | 172.25.55.202 | 172.16.57.102 | osd | sda(20G),sdb(10G) |
1. Base configuration on all nodes
1.1 Disable the firewall, SELinux, and swap on all nodes
# systemctl stop firewalld && systemctl disable firewalld
# sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# setenforce 0
# sestatus
# swapoff -a
# sed -ri 's/.*swap.*/#&/' /etc/fstab
If you need to keep the firewall enabled instead, open the following services and ports (example firewall-cmd commands follow the listing below):
# firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: ens33 ens34
sources:
services: ceph ceph-mon cockpit ssh
ports: 9283/tcp 8443/tcp 9093/tcp 9094/tcp 3000/tcp 9100/tcp 9095/tcp
protocols:
forward: no
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
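If the firewall stays on, the services and ports shown above can be opened with firewall-cmd. A minimal sketch, to be run on every node (the port list simply mirrors the dashboard and monitoring ports in the listing above):
# firewall-cmd --permanent --add-service=ceph --add-service=ceph-mon
# firewall-cmd --permanent --add-port=8443/tcp --add-port=9283/tcp --add-port=3000/tcp --add-port=9093/tcp --add-port=9094/tcp --add-port=9095/tcp --add-port=9100/tcp
# firewall-cmd --reload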
1.2 Update packages and configure the hosts file on all nodes
# dnf -y upgrade
cat >> /etc/hosts <<EOF
172.25.55.200 ceph-node01
172.25.55.201 ceph-node02
172.25.55.202 ceph-node03
172.16.57.100 ceph-node01
172.16.57.101 ceph-node02
172.16.57.102 ceph-node03
EOF
1.3 Configure passwordless SSH from ceph-node01 to the other nodes
# ssh-keygen -t rsa -P ''
# for i in `tail -n 3 /etc/hosts | awk '{print $1}'`; do ssh-copy-id $i;done
1.4 Install the required packages on all nodes
Run on ceph-node01:
# dnf -y install net-tools lrzsz nmap tcpdump lsof python3 chrony
# for i in `tail -n 2 /etc/hosts | awk '{print $1}'`; do ssh $i exec dnf -y install net-tools lrzsz nmap tcpdump lsof python3 chrony;done
1.5 Configure time synchronization
On ceph-node01:
# vi /etc/chrony.conf
pool ntp1.aliyun.com iburst
pool s1e.time.edu.cn iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 172.25.55.0/24
local stratum 10
keyfile /etc/chrony.keys
leapsectz right/UTC
logdir /var/log/chrony
On the other nodes:
# vi /etc/chrony.conf
pool ceph-node01 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
local stratum 10
keyfile /etc/chrony.keys
leapsectz right/UTC
logdir /var/log/chrony
# systemctl enable chronyd && systemctl start chronyd
1.6 Configure a registry mirror to speed up image pulls
Run on ceph-node01:
# cat > /etc/containers/registries.conf << EOF
unqualified-search-registries = ["docker.io"]
[[registry]]
prefix = "docker.io"
location = "docker.m.daocloud.io"
EOF
Copy registries.conf to the other hosts:
# for i in `tail -n 2 /etc/hosts | awk '{print $1}'`; do scp /etc/containers/registries.conf $i:/etc/containers/ ;done
1.7 Install cephadm on all nodes
Run on ceph-node01:
# curl -L -o /usr/bin/cephadm https://mirrors.chenby.cn/https://github.com/ceph/ceph/raw/quincy/src/cephadm/cephadm
# chmod a+x /usr/bin/cephadm
# cephadm add-repo --release 17.2.7
Copy cephadm to the other nodes and make it executable there:
# for i in $(tail -n 2 /etc/hosts | awk '{print $1}'); do \
scp /usr/bin/cephadm $i:/usr/bin/ && \
ssh $i "chmod a+x /usr/bin/cephadm && cephadm add-repo --release 17.2.7 && cephadm install"; \
done
1.8 Check that every node meets the installation requirements
# cephadm check-host --expect-hostname `hostname`
podman (/usr/bin/podman) version 4.9.4 is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Hostname "ceph-node01" matches what is expected.
Host looks OK
1.9 Switch the Ceph repo to a domestic (China) mirror
Run on ceph-node01:
# sed -i 's#download.ceph.com#mirrors.ustc.edu.cn/ceph#' /etc/yum.repos.d/ceph.repo
Sync ceph.repo to the other nodes:
# for i in `tail -n 2 /etc/hosts | awk '{print $1}'`; do scp /etc/yum.repos.d/ceph.repo $i:/etc/yum.repos.d/ ;done
2. Bootstrap the cluster (first MON node)
2.1 Run the cephadm bootstrap on the bootstrap node
Run only on ceph-node01:
# cephadm bootstrap --mon-ip 172.25.55.200 --cluster-network 172.16.57.0/24 --log-to-file
During this step cephadm automatically pulls the container images, creates the keyrings and a minimal configuration, and deploys the first mon and mgr daemons:
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman (/usr/bin/podman) version 4.9.4 is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 672de82c-3e17-11f0-a36a-000c2912c8ce
Verifying IP 172.25.55.200 port 3300 ...
Verifying IP 172.25.55.200 port 6789 ...
Mon IP `172.25.55.200` is in CIDR network `172.25.55.0/24`
Mon IP `172.25.55.200` is in CIDR network `172.25.55.0/24`
Pulling container image quay.io/ceph/ceph:v17...
Ceph version: ceph version 17.2.8 (f817ceb7f187defb1d021d6328fa833eb8e943b3) quincy (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 172.25.55.0/24
Setting cluster_network to 172.16.57.0/24
Ceph Dashboard is now available at:
URL: https://ceph-node01:8443/
User: admin
Password: 17xvyrqdxd
Enabling client.admin keyring and conf on hosts with "admin" label
Saving cluster configuration to /var/lib/ceph/672de82c-3e17-11f0-a36a-000c2912c8ce/config directory
Enabling autotune for osd_memory_target
You can access the Ceph CLI as following in case of multi-cluster or non-default config:
sudo /usr/bin/cephadm shell --fsid 672de82c-3e17-11f0-a36a-000c2912c8ce -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Or, if you are only running a single cluster on this host:
sudo /usr/bin/cephadm shell
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/master/mgr/telemetry/
Bootstrap complete.
Log in to the Ceph Dashboard at the ceph-node01 host IP (https://ceph-node01:8443/) with the initially generated User and Password.
2.2 Check the Ceph version and orchestrator status
# cephadm shell -- ceph -v
ceph version 17.2.8 (f817ceb7f187defb1d021d6328fa833eb8e943b3) quincy (stable)
# cephadm shell -- ceph orch status
Backend: cephadm
Available: Yes
Paused: No
2.3 List the orchestrator services
# cephadm shell -- ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
alertmanager ?:9093,9094 1/1 4m ago 2h count:1
crash 1/1 4m ago 2h *
grafana ?:3000 1/1 4m ago 2h count:1
mgr 1/2 4m ago 2h count:2
mon 1/5 4m ago 111m <unmanaged>
node-exporter ?:9100 1/1 4m ago 2h *
prometheus ?:9095 1/1 4m ago 2h count:1
As shown, seven containerized services were deployed:
1. alertmanager: Prometheus alerting component
2. crash: ceph-crash crash-report collector
3. grafana: dashboard for visualizing monitoring data
4. mgr: ceph-manager (the Ceph manager daemon, which also serves the Dashboard)
5. mon: ceph-monitor (the Ceph monitor)
6. node-exporter: Prometheus node metrics collector
7. prometheus: the Prometheus monitoring component
Note:
The mon service defaults to a placement of 5 daemons; only one node exists so far, hence 1/5.
The mgr service defaults to 2 daemons, currently 1/2.
2.4 List the container daemons
# cephadm shell -- ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
alertmanager.ceph-node01 ceph-node01 *:9093,9094 running (2h) 2s ago 2h 23.2M - 0.25.0 c8568f914cd2 221a53c313f8
crash.ceph-node01 ceph-node01 running (2h) 2s ago 2h 6891k - 17.2.8 259b35566514 087d43c233b6
grafana.ceph-node01 ceph-node01 *:3000 running (2h) 2s ago 2h 87.2M - 9.4.7 954c08fa6188 55a30fad3c0d
mgr.ceph-node01.brmiab ceph-node01 *:9283 running (2h) 2s ago 2h 436M - 17.2.8 259b35566514 222d9ada1ad8
mon.ceph-node01 ceph-node01 running (2h) 2s ago 2h 59.4M 2048M 17.2.8 259b35566514 96bbddd4dd5c
node-exporter.ceph-node01 ceph-node01 *:9100 running (2h) 2s ago 2h 12.9M - 1.5.0 0da6a335fe13 1901851b7a73
prometheus.ceph-node01 ceph-node01 *:9095 running (2h) 2s ago 2h 53.5M - 2.43.0 a07b618ecd1d 1be635eee047
2.5 Check the overall Ceph status
# cephadm shell -- ceph status
cluster:
id: 672de82c-3e17-11f0-a36a-000c2912c8ce
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph-node01 (age 2h)
mgr: ceph-node01.brmiab(active, since 2h)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
The cluster is HEALTH_WARN because no OSDs have been added yet; "OSD count 0 < osd_pool_default_size 3" shows that at least 3 OSDs are expected by default.
2.6 List the container images that were pulled
# podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
quay.io/ceph/ceph v17 259b35566514 6 months ago 1.25 GB
quay.io/ceph/ceph-grafana 9.4.7 954c08fa6188 18 months ago 647 MB
quay.io/prometheus/prometheus v2.43.0 a07b618ecd1d 2 years ago 235 MB
quay.io/prometheus/alertmanager v0.25.0 c8568f914cd2 2 years ago 66.5 MB
quay.io/prometheus/node-exporter v1.5.0 0da6a335fe13 2 years ago 23.9 MB
2.7 Copy the Ceph public key to the other nodes
Run on ceph-node01. If this step is skipped, the bootstrap node cannot reach the other nodes, and after running ceph orch host ls their STATUS will show as Offline.
# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-node01
# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-node02
# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-node03
3. Deploy the MON nodes
3.1 Add the hosts
First disable automatic mon placement; otherwise cephadm will deploy mon and mgr daemons on every host that gets added.
# cephadm shell -- ceph orch apply mon --unmanaged
Scheduled mon update...
# cephadm shell -- ceph orch host add ceph-node01
Added host 'ceph-node01' with addr '172.25.55.200'
# cephadm shell -- ceph orch host add ceph-node02
Added host 'ceph-node02' with addr '172.25.55.201'
# cephadm shell -- ceph orch host add ceph-node03
Added host 'ceph-node03' with addr '172.25.55.202'
# cephadm shell -- ceph orch host ls
HOST ADDR LABELS STATUS
ceph-node01 172.25.55.200 _admin
ceph-node02 172.25.55.201
ceph-node03 172.25.55.202
3 hosts in cluster
3.2 Label the nodes that should run a mon daemon
# cephadm shell -- ceph orch host label add ceph-node01 mon
Added label mon to host ceph-node01
# cephadm shell -- ceph orch host label add ceph-node02 mon
Added label mon to host ceph-node02
3.3 Deploy mon daemons on the nodes carrying the mon label
# cephadm shell -- ceph orch apply mon label:mon
Scheduled mon update...
# cephadm shell -- ceph status
cluster:
id: 672de82c-3e17-11f0-a36a-000c2912c8ce
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 2 daemons, quorum ceph-node01,ceph-node02 (age 75s)
mgr: ceph-node02.zdvwto(active, since 11m), standbys: ceph-node01.brmiab
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
# cephadm shell -- ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
alertmanager ?:9093,9094 0/1 32m ago 3h count:1
crash 2/3 32m ago 3h *
grafana ?:3000 0/1 32m ago 3h count:1
mgr 1/2 32m ago 3h count:2
mon 1/2 32m ago 10m label:mon
node-exporter ?:9100 2/3 32m ago 3h *
prometheus ?:9095 0/1 32m ago 3h count:1
1. A PLACEMENT of * means the service is deployed on every node; <unmanaged> (as shown for mon above) means cephadm is not scheduling that service automatically.
2. count:1 or count:2 is an upper bound on the number of daemons; use ceph orch ps to see where they actually run.
3. label:mon means the service runs only on nodes carrying that label (see the examples just below).
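A few illustrative placement examples with ceph orch apply (only the label-based form is actually used for mon in this guide; the others are shown merely to clarify the syntax):
# cephadm shell -- ceph orch apply node-exporter '*'
# cephadm shell -- ceph orch apply mgr 2
# cephadm shell -- ceph orch apply mon label:mon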
4. Deploy the OSDs
Adding an OSD requires all of the following (a cleanup sketch follows the list):
- The device must have no partitions or formatting.
- The device must have no LVM state.
- The device must not contain a file system.
- The device must not contain a Ceph BlueStore OSD.
- The device must be larger than 5 GiB; SSDs are recommended.
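If a disk was used before and fails one of these checks, it can usually be reset with the orchestrator's zap command. A sketch (this destroys all data on the device; adjust the host and device path):
# cephadm shell -- ceph orch device zap ceph-node01 /dev/sdb --force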
4.1 List the available devices on every node (the sdb disks will be used)
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 19G 0 part
├─rl-root 253:0 0 17G 0 lvm /
└─rl-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 10G 0 disk
sr0 11:0 1 2.5G 0 rom
# cephadm shell -- ceph orch device ls
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
ceph-node01 /dev/sdb hdd 10.0G Yes 47m ago
ceph-node01 /dev/sr0 hdd VMware_IDE_CDR10_10000000000000000001 2569M No 47m ago Has a FileSystem, Insufficient space (<5GB)
ceph-node02 /dev/sdb hdd 10.0G Yes 13m ago
ceph-node02 /dev/sr0 hdd VMware_IDE_CDR10_10000000000000000001 2569M No 13m ago Has a FileSystem, Insufficient space (<5GB)
ceph-node03 /dev/sdb hdd 10.0G Yes 11m ago
ceph-node03 /dev/sr0 hdd VMware_IDE_CDR10_10000000000000000001 2569M No 11m ago Has a FileSystem, Insufficient space (<5GB)
4.2 Add the OSDs
# cephadm shell -- ceph orch daemon add osd ceph-node01:/dev/sdb
Created osd(s) 0 on host 'ceph-node01'
# cephadm shell -- ceph orch daemon add osd ceph-node02:/dev/sdb
Created osd(s) 0 on host 'ceph-node02'
# cephadm shell -- ceph orch daemon add osd ceph-node03:/dev/sdb
Created osd(s) 0 on host 'ceph-node03'
4.3 Verify the OSD deployment
# cephadm shell -- ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.01959 root default
-3 0.00980 host ceph-node01
0 hdd 0.00980 osd.0 up 1.00000 1.00000
-5 0.00980 host ceph-node02
1 hdd 0.00980 osd.1 up 1.00000 1.00000
-7 0.00980 host ceph-node03
2 hdd 0.00980 osd.2 up 1.00000 1.00000
# cephadm shell -- ceph status
cluster:
id: 672de82c-3e17-11f0-a36a-000c2912c8ce
health: HEALTH_OK
services:
mon: 2 daemons, quorum ceph-node01,ceph-node02 (age 58m)
mgr: ceph-node02.zdvwto(active, since 68m), standbys: ceph-node01.brmiab
osd: 3 osds: 3 up (since 8m), 3 in (since 8m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 449 KiB
usage: 873 MiB used, 29 GiB / 30 GiB avail
pgs: 1 active+clean
The cluster status is now HEALTH_OK, so the core cluster is up.
Note: a Ceph cluster is essentially complete once ceph-mon and ceph-osd are running. To use its distributed storage features (object, block, and file storage) the corresponding additional components still have to be deployed.
5. Create a CephFS file system
5.1 Create the metadata and data pools for a file system named myfilesys
Create the metadata pool:
# cephadm shell -- ceph osd pool create myfilesys_metadata 32
pool 'myfilesys_metadata' created
Create the data pool:
# cephadm shell -- ceph osd pool create myfilesys_data 64
pool 'myfilesys_data' created
To delete the pools again (only if needed), first allow pool deletion:
# cephadm shell -- ceph config set mon mon_allow_pool_delete true
# cephadm shell -- ceph config get mon mon_allow_pool_delete
true
# cephadm shell -- ceph osd pool delete myfilesys_metadata myfilesys_metadata --yes-i-really-really-mean-it
# cephadm shell -- ceph osd pool delete myfilesys_data myfilesys_data --yes-i-really-really-mean-it
Note:
32 and 64 are the PG counts (tune them in production based on the number of OSDs). Rough sizing guidance by cluster size:
- Small cluster: metadata 32, data 64
- Medium cluster: metadata 64, data 128
- Large cluster: metadata 512, data 8192
Formula: Total PGs ≈ (Total OSDs * 100) / Replication Factor
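As a worked example for this lab cluster (3 OSDs, replication factor 3): Total PGs ≈ (3 * 100) / 3 = 100, so the 32 + 64 = 96 PGs created above stay within that budget.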
5.2 Enable the cephfs application on both pools
# cephadm shell -- ceph osd pool application enable myfilesys_metadata cephfs
enabled application 'cephfs' on pool 'myfilesys_metadata'
# cephadm shell -- ceph osd pool application enable myfilesys_data cephfs
enabled application 'cephfs' on pool 'myfilesys_data'
The metadata pool must be a replicated pool and is used as the metadata pool of the file system.
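To double-check the pool's replication setting (a quick check; 3 is the default osd_pool_default_size in this cluster):
# cephadm shell -- ceph osd pool get myfilesys_metadata size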
5.3 Create the file system
# cephadm shell -- ceph fs new myfilesys myfilesys_metadata myfilesys_data
Pool 'myfilesys_data' (id '3') has pg autoscale mode 'on' but is not marked as bulk.
Consider setting the flag by running
# ceph osd pool set myfilesys_data bulk true
new fs with metadata pool 2 and data pool 3
Verify the configuration:
# cephadm shell -- ceph fs ls
name: myfilesys, metadata pool: myfilesys_metadata, data pools: [myfilesys_data ]
# cephadm shell -- ceph fs status myfilesys
myfilesys - 0 clients
=========
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 active myfilesys.ceph-node01.tkhclk Reqs: 0 /s 10 13 12 0
POOL TYPE USED AVAIL
myfilesys_metadata metadata 96.0k 9432M
myfilesys_data data 0 9432M
STANDBY MDS
myfilesys.ceph-node03.skhhek
myfilesys.ceph-node02.frrqti
MDS version: ceph version 17.2.8 (f817ceb7f187defb1d021d6328fa833eb8e943b3) quincy (stable)
6. Deploy the MDS (metadata servers)
6.1 Deploy MDS daemons for myfilesys on the specified nodes
# cephadm shell -- ceph orch apply mds myfilesys --placement="3 ceph-node01 ceph-node02 ceph-node03"
Scheduled mds.myfilesys update...
Verify the configuration:
# cephadm shell -- ceph orch ls --service_type mds
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
mds.myfilesys 3/3 2m ago 12m ceph-node01;ceph-node02;ceph-node03;count:3
# cephadm shell -- ceph orch ps --daemon_type mds
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
mds.myfilesys.ceph-node01.tkhclk ceph-node01 running (22m) 28s ago 22m 19.4M - 17.2.8 259b35566514 b01444f003e7
mds.myfilesys.ceph-node02.frrqti ceph-node02 running (22m) 30s ago 22m 18.5M - 17.2.8 259b35566514 af1a675ecc61
mds.myfilesys.ceph-node03.skhhek ceph-node03 running (22m) 94s ago 22m 16.6M - 17.2.8 259b35566514 4ca620b8cece
To remove the MDS service again (only if needed):
# cephadm shell -- ceph orch rm mds.myfilesys
After the MDS daemons are deployed you can raise the number of active ranks. Here 3 MDS daemons were deployed, allowing 2 active and 1 standby (multiple active MDS ranks are only needed for high concurrent access):
# cephadm shell -- ceph fs set myfilesys max_mds 2
6.2 Enable MDS standby-replay (hot standby)
# cephadm shell -- ceph fs set myfilesys allow_standby_replay true
Verify the configuration:
# cephadm shell -- ceph fs get myfilesys
Filesystem 'myfilesys' (1)
fs_name myfilesys
epoch 15
flags 32 joinable allow_snaps allow_multimds_snaps allow_standby_replay // allow_standby_replay here shows the feature is enabled
created 2025-06-04T14:51:38.284680+0000
modified 2025-06-04T14:52:41.713104+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 54
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {0=54155}
failed
damaged
stopped
data_pools [3]
metadata_pool 2
inline_data disabled
balancer
standby_count_wanted 1
[mds.myfilesys.ceph-node03.pankkx{0:54155} state up:active seq 5 join_fscid=1 addr [v2:172.25.55.202:6808/2245027536,v1:172.25.55.202:6809/2245027536] compat {c=[1],r=[1],i=[7ff]}]
[mds.myfilesys.ceph-node02.fjcwol{0:54159} state up:standby-replay seq 2 join_fscid=1 addr [v2:172.25.55.201:6810/747937919,v1:172.25.55.201:6811/747937919] compat {c=[1],r=[1],i=[7ff]}] // up:standby-replay shows the standby MDS is replaying the journal to keep its metadata in sync
# cephadm shell -- ceph mds stat
myfilesys:1 {0=myfilesys.ceph-node03.pankkx=up:active} 1 up:standby-replay 1 up:standby
Note:
- Enable allow_standby_replay true when fast failover and high availability are required.
- For ordinary workloads without strict HA requirements, keep the default or set it to false (a sketch for turning it off follows).
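If standby-replay turns out not to be needed, it can be switched back off with the same command used above:
# cephadm shell -- ceph fs set myfilesys allow_standby_replay false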
7. Mount CephFS on a client
7.1 Create the client mount credentials (run on the ceph bootstrap node)
# cephadm shell -- ceph auth get-or-create client.myclient \
mon 'allow r' \
mds 'allow rw' \
osd 'allow rw tag cephfs data=myfilesys' \
osd 'allow rw tag cephfs metadata=myfilesys' > /etc/ceph/ceph.client.myclient.keyring
# cephadm shell -- ceph auth get-key client.myclient > /etc/ceph/myclient.secret
# ll /etc/ceph/
-rw------- 1 root root 151 Jun 6 17:22 ceph.client.admin.keyring
-rw-r--r-- 1 root root 66 Jun 6 18:16 ceph.client.myclient.keyring
-rw-r--r-- 1 root root 227 Jun 6 17:22 ceph.conf
-rw-r--r-- 1 root root 595 May 31 20:06 ceph.pub
-rw-r--r-- 1 root root 40 Jun 6 18:17 myclient.secret
Copy the two files
ceph.client.myclient.keyring
myclient.secret
to the client and set their permissions with chmod 600 (a sketch follows).
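A minimal sketch of copying the credentials to the client and restricting their permissions (the client hostname script-test is taken from the mount prompt later in this section; adjust it and the SSH access to your environment):
# scp /etc/ceph/ceph.client.myclient.keyring /etc/ceph/myclient.secret root@script-test:/etc/ceph/
# ssh root@script-test "chmod 600 /etc/ceph/ceph.client.myclient.keyring /etc/ceph/myclient.secret"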
To update the existing client's caps later (this example restricts client.myclient to read-only access to the MONs (required), the MDS, and the myfilesys data and metadata pools):
# cephadm shell -- ceph auth caps client.myclient \
mon 'allow r' \
mds 'allow r' \
osd 'allow r tag cephfs data=myfilesys' \
osd 'allow r tag cephfs metadata=myfilesys'
7.2 Install the ceph-common package on the client
# dnf install -y https://download.ceph.com/rpm-17.2.7/el8/noarch/ceph-release-1-1.el8.noarch.rpm
# sed -i 's/rpm-17.2.7/rpm-quincy/' /etc/yum.repos.d/ceph.repo
# dnf install -y ceph-common
7.3 Create the mount point and mount the CephFS file system
# mkdir /ceph_data
[root@script-test ceph]# mount -t ceph 172.25.55.200:6789,172.25.55.201:6789:/ /ceph_data \
-o name=myclient,secretfile=/etc/ceph/myclient.secret,fs=myfilesys,noatime,_netdev,conf=/dev/null
unable to get monitor info from DNS SRV with service name: ceph-mon
2025-06-07T02:18:10.998+0800 7f11c12ec0c0 -1 failed for service _ceph-mon._tcp
The DNS SRV messages above are only warnings (with conf=/dev/null the mount helper cannot read MON addresses from a config file and falls back to a DNS lookup before using the addresses given on the command line). Verify the mount:
# df -h /ceph_data
Filesystem Size Used Avail Use% Mounted on
172.25.55.200:6789,172.25.55.201:6789:/ 9.3G 0 9.3G 0% /ceph_data
7.4 Add the mount to /etc/fstab
# echo '172.25.55.200:6789,172.25.55.201:6789:/ /ceph_data ceph name=myclient,secretfile=/etc/ceph/myclient.secret,fs=myfilesys,noatime,_netdev,conf=/dev/null 0 0' >> /etc/fstab
Verify the configuration:
# mount -a
If there is no error output and /ceph_data shows up as mounted, the fstab entry is correct (see the quick check below).
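One quick way to confirm the mount is active (a sketch):
# findmnt /ceph_data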
7.5 Verify that writes work
On the client:
# touch /ceph_data/test.txt
On the ceph bootstrap node:
# cephadm shell -- ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 30 GiB 29 GiB 890 MiB 890 MiB 2.90
TOTAL 30 GiB 29 GiB 890 MiB 890 MiB 2.90
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 1 1 449 KiB 2 1.3 MiB 0 9.2 GiB
myfilesys_metadata 2 32 28 KiB 22 170 KiB 0 9.2 GiB
myfilesys_data 3 64 0 B 0 0 B 0 9.2 GiB
Here only myfilesys_metadata gained objects, because an empty file stores no data objects; once files with actual content are written, OBJECTS, STORED, and USED for myfilesys_data will also become non-zero (see the sketch below).
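To see the data-pool counters move, write some real content from the client and then re-run ceph df on the bootstrap node (a sketch):
# dd if=/dev/zero of=/ceph_data/bigfile bs=1M count=10
# cephadm shell -- ceph df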
8. Provide a highly available MON endpoint with HAProxy + Keepalived
8.1 Architecture
Client
  |
  mounts VIP:6789
  v
HAProxy + Keepalived VIP (172.25.55.100:6789)
  |
  +--> ceph-node01:6789 (172.25.55.200:6789)
  +--> ceph-node02:6789 (172.25.55.201:6789)
8.2 HAProxy configuration (listen on 6789 and proxy to the MONs)
# vi /etc/haproxy/haproxy.cfg
global
log /dev/log local0
log /dev/log local1 notice
daemon
maxconn 20480
defaults
log global
mode tcp
option tcplog
timeout connect 5s
timeout client 60s
timeout server 60s
frontend ceph_mon_frontend
bind *:6789
default_backend ceph_mon_backend
backend ceph_mon_backend
option tcp-check
tcp-check connect port 6789
tcp-check send-binary 01
tcp-check expect binary 02
server mon1 172.25.55.200:6789 check
server mon2 172.25.55.201:6789 check
8.3 Keepalived configuration (HA for the VIP)
Master node configuration:
# cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_instance VI_1 {
state MASTER
interface eth0 # change to match your server's NIC name
virtual_router_id 60
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 6666
}
virtual_ipaddress {
172.25.55.100 # the VIP exposed to clients
}
track_script {
chk_haproxy
}
}
vrrp_script chk_haproxy {
script "/etc/keepalived/check_haproxy.sh"
interval 2
weight -5
fall 2
rise 1
}
EOF
Backup node configuration:
# cat > /etc/keepalived/keepalived.conf <<EOF
vrrp_instance VI_1 {
state BACKUP
interface eth0 # change to match your server's NIC name
virtual_router_id 60
priority 90 # lower than the master node's priority
advert_int 1
authentication {
auth_type PASS
auth_pass 6666 # must match the master node
}
virtual_ipaddress {
172.25.55.100 # same VIP as on the master node
}
track_script {
chk_haproxy
}
}
vrrp_script chk_haproxy {
script "/etc/keepalived/check_haproxy.sh"
interval 2
weight -5
fall 2
rise 1
}
EOF
# cat > /etc/keepalived/check_haproxy.sh <<EOF
#!/bin/bash
pgrep haproxy > /dev/null
EOF
# chmod +x /etc/keepalived/check_haproxy.sh
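This guide assumes haproxy and keepalived are already installed and running on the nodes that hold the VIP; if not, a minimal sketch:
# dnf -y install haproxy keepalived
# systemctl enable --now haproxy keepalived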
8.4 Mount CephFS on the client through the VIP
# mount -t ceph 172.25.55.100:6789:/ /ceph_data \
-o name=myclient,secretfile=/etc/ceph/myclient.secret,fs=myfilesys,noatime,_netdev,conf=/dev/null
# echo '172.25.55.100:6789:/ /ceph_data ceph name=myclient,secretfile=/etc/ceph/myclient.secret,fs=myfilesys,noatime,_netdev,conf=/dev/null 0 0' >> /etc/fstab