I. Deployment Environment
Node 1: node1 192.168.0.196, single gigabit NIC, 3 OSDs, deploy node
Node 2: node2 192.168.0.197, single gigabit NIC, 3 OSDs
Node 3: node3 192.168.0.198, single gigabit NIC, 3 OSDs
The firewall and SELinux are disabled on all nodes (a command sketch follows the role list below).
Mon: node1 node2 node3
MDS: node1 node2 node3
OSD: node1 node2 node3
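A minimal sketch of disabling the firewall and SELinux (assuming CentOS 7 with firewalld; run on every node):
[root@node1 ~]# systemctl stop firewalld && systemctl disable firewalld
[root@node1 ~]# setenforce 0
[root@node1 ~]# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0 takes effect immediately; the sed edit keeps SELinux disabled after a reboot.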
II. Preliminary Preparation
1. Configure /etc/hosts on every node
[root@node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.196 node1
192.168.0.197 node2
192.168.0.198 node3
2. Configure passwordless SSH from the deploy node to the other nodes
[root@node1 ~]# ssh-keygen -t rsa
Just press Enter three times to accept the defaults.
[root@node1 ~]# ssh-copy-id -i node2
Enter the node's root password when prompted.
[root@node1 ~]# ssh-copy-id -i node3
Enter the node's root password when prompted.
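Optionally (a hedged suggestion, not part of the original steps), copy the key to node1 itself as well, so that later ceph-deploy operations that connect back to node1 do not prompt for a password:
[root@node1 ~]# ssh-copy-id -i node1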
3. Configure the yum repositories on all nodes
Run the following on every node:
[root@node1 yum.repos.d]# pwd
/etc/yum.repos.d
[root@node1 yum.repos.d]# vi ceph.repo
with the following content:
[root@node1 yum.repos.d]# cat ceph.repo
[ceph]
name=ceph
baseurl=http://mirrors.163.com/ceph/rpm-luminous/el7/x86_64/
gpgcheck=0
gpgkey=http://mirrors.163.com/ceph/keys/release.asc
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.163.com/ceph/rpm-luminous/el7/noarch/
gpgcheck=0
gpgkey=http://mirrors.163.com/ceph/keys/release.asc
[root@node1 yum.repos.d]# yum makecache
After this succeeds, install the latest EPEL repository:
[root@node1 yum.repos.d]# yum install epel-release -y
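The repository setup can be sanity-checked before moving on (a hedged example):
[root@node1 yum.repos.d]# yum repolist enabled | grep -i ceph
Both the ceph and ceph-noarch repositories should appear in the output.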
III. Installing Ceph
1. Install ceph-deploy and ceph on node1 with yum
[root@node1 ~]# yum install -y ceph-deploy
[root@node1 ~]# yum install -y ceph
Both commands should complete successfully.
2. Install ceph on node2 and node3 with yum
[root@node2 ~]# yum install -y ceph
[root@node3 ~]# yum install ceph -y
Run ceph -v on every node; output like the following indicates that Ceph was installed successfully:
[root@node1 yum.repos.d]# ceph -v
ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
[root@node2 ~]# ceph -v
ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
[root@node3 ~]# ceph -v
ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
3. Operations on the deploy node
Create the directory /etc/ceph on all nodes (if it does not already exist).
Change into /etc/ceph and perform the following steps there.
1. Create the cluster configuration
[root@node1 ceph]# ceph-deploy new node1 node2 node3
(part of the output omitted)
[ceph_deploy.new][DEBUG ] Resolving host node3
[ceph_deploy.new][DEBUG ] Monitor node3 at 192.168.0.198
[ceph_deploy.new][DEBUG ] Monitor initial members are ['node1', 'node2', 'node3']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.0.196', '192.168.0.197', '192.168.0.198']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
After it succeeds, check the directory:
[root@node1 ceph]# ll
total 20
-rw-r--r-- 1 root root 238 Jun 29 18:37 ceph.conf
-rw-r--r-- 1 root root 4852 Jun 29 18:37 ceph-deploy-ceph.log
-rw------- 1 root root 73 Jun 29 18:37 ceph.mon.keyring
-rw-r--r-- 1 root root 92 Jun 23 00:14 rbdmap
[root@node1 ceph]#
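The generated ceph.conf typically looks roughly like the following (the fsid and addresses shown are the ones from this cluster; exact contents may vary by version):
[global]
fsid = b13504e0-ed55-4ca2-b424-836c9bd318ae
mon_initial_members = node1, node2, node3
mon_host = 192.168.0.196,192.168.0.197,192.168.0.198
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx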
2. Deploy the monitors
[root@node1 ceph]# ceph-deploy --overwrite-conf mon create-initial
(part of the output omitted)
[node1][INFO ] Running command: /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-node1/keyring auth get-or-create client.bootstrap-rgw mon allow profile bootstrap-rgw
[ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring
[ceph_deploy.gatherkeys][INFO ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmptJ7fhE
[root@node1 ceph]#
Run ceph -s to check the cluster status:
[root@node1 ceph]# ceph -s
cluster:
id: b13504e0-ed55-4ca2-b424-836c9bd318ae
health: HEALTH_ERR
clock skew detected on mon.node2, mon.node3
no osds
Monitor clock skew detected
services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node2(active), standbys: node1, node3
osd: 0 osds: 0 up, 0 in
data:
pools: 1 pools, 64 pgs
objects: 0 objects, 0 bytes
usage: 0 kB used, 0 kB / 0 kB avail
pgs: 100.000% pgs unknown
64 unknown
In the services section you can see 3 mons running: node1, node2, and node3.
3. Deploy the OSDs
Deploy the OSDs on node1:
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.196:/dev/sdb
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.196:/dev/sdc
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.196:/dev/sdd
You will be prompted for the root password twice while these commands run.
Deploy the OSDs on node2:
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.197:/dev/sdb
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.197:/dev/sdc
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.197:/dev/sdd
No password is required.
Deploy the OSDs on node3:
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.198:/dev/sdb
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.198:/dev/sdc
[root@node1 ceph]# ceph-deploy --overwrite-conf osd prepare 192.168.0.198:/dev/sdd
No password is needed here either; the commands return successfully.
Check the cluster status:
[root@node1 ceph]# ceph -s
cluster:
id: b13504e0-ed55-4ca2-b424-836c9bd318ae
health: HEALTH_WARN
clock skew detected on mon.node2, mon.node3
too few PGs per OSD (21 < min 30)
Monitor clock skew detected
services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node2(active), standbys: node1, node3
osd: 9 osds: 9 up, 9 in
data:
pools: 1 pools, 64 pgs
objects: 0 objects, 0 bytes
usage: 9541 MB used, 439 GB / 449 GB avail
pgs: 64 active+clean
The services section now shows 9 OSDs in total, all of them up and in.
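The OSD layout can also be verified with ceph osd tree (a hedged extra check, not in the original steps); it should list three hosts with three OSDs each, all with status up:
[root@node1 ceph]# ceph osd tree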
4. Deploy the MDS
[root@node1 ceph]# ceph-deploy --overwrite-conf mds create node1 node2 node3
Create the filesystem named ceph (with 9 OSDs, 256 PGs are used per pool; the reasoning is sketched after the commands below):
[root@node1 ceph]# ceph osd pool create data 256 256
pool 'data' created
[root@node1 ceph]# ceph osd pool create metadata 256 256
pool 'metadata' created
[root@node1 ceph]# ceph fs new ceph metadata data
new fs with metadata pool 2 and data pool 1
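A common rule of thumb (an assumption here, not stated in the original) is total PGs ≈ (number of OSDs × 100) / replica size, rounded to a power of two. With 9 OSDs and the default replica size of 3, that gives 9 × 100 / 3 = 300, and the nearest power of two is 256, hence pg_num = pgp_num = 256 for each pool.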
Check the cluster status:
[root@node1 ceph]# ceph -s
cluster:
id: b13504e0-ed55-4ca2-b424-836c9bd318ae
health: HEALTH_WARN
clock skew detected on mon.node2, mon.node3
Monitor clock skew detected
services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node2(active), standbys: node1, node3
mds: 1/1/1 up {0=node2=up:active}, 2 up:standby
osd: 9 osds: 9 up, 9 in
data:
pools: 3 pools, 576 pgs
objects: 21 objects, 2246 bytes
usage: 9559 MB used, 439 GB / 449 GB avail
pgs: 576 active+clean
The data section now shows 3 pools.
At this point the cluster warning is about unsynchronized clocks, so install the time-synchronization software ntp on all nodes.
5. Install ntp on all nodes
[root@node1 ceph]# yum install ntp -y
[root@node2 ceph]# yum install ntp -y
[root@node3 ceph]# yum install ntp -y
Use node1 as the clock-synchronization source:
[root@node1 ceph]# cat /etc/ntp.conf
restrict 127.0.0.1
restrict ::1
server time.windows.com
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
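For node2 and node3 to synchronize from node1, node1's ntp.conf usually also needs a restrict line permitting the cluster subnet (a hedged addition; the subnet is taken from this deployment):
restrict 192.168.0.0 mask 255.255.255.0 nomodify notrap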
Start the ntpd service:
[root@node1 ceph]# systemctl start ntpd
The current time:
[root@node1 ceph]# date
Thu Jun 29 19:02:43 CST 2017
After about 5 seconds, check the time again:
[root@node1 ceph]# date
Tue Jul 4 12:31:07 CST 2017
The clock is now synchronized with the actual current time.
On the other two nodes, node2 and node3:
[root@node2 ceph]# cat /etc/ntp.conf
server 192.168.0.196
[root@node3 ceph]# cat /etc/ntp.conf
server 192.168.0.196
Start the ntpd service:
[root@node2 ceph]# systemctl start ntpd
[root@node3 ceph]# systemctl start ntpd
Wait about 10 minutes for the clocks to synchronize automatically; if they do not, synchronize manually:
[root@node2 ceph]# ntpdate 192.168.0.196
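The synchronization status can be checked with ntpq (a hedged check); an asterisk in front of the 192.168.0.196 entry means the node has locked onto node1 as its time source:
[root@node2 ceph]# ntpq -p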
Repeat until the clocks on all nodes are synchronized; otherwise the cluster will keep reporting warnings of various kinds, for example:
[root@node1 ceph]# ceph -s
2017-07-04 12:47:20.304417 7fdfe2ffd700 0 -- 192.168.0.196:0/898621246 >> 192.168.0.197:6800/17590 conn(0x7fdfcc00a3d0 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
2017-07-04 12:47:20.305990 7fdfe2ffd700 0 -- 192.168.0.196:0/898621246 >> 192.168.0.197:6800/17590 conn(0x7fdfcc00a3d0 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
cluster:
id: b13504e0-ed55-4ca2-b424-836c9bd318ae
health: HEALTH_ERR
clock skew detected on mon.node2
372 pgs are stuck inactive for more than 300 seconds
576 pgs peering
372 pgs stuck inactive
575 pgs stuck unclean
Monitor clock skew detected
services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node2(active), standbys: node1, node3
mds: 1/1/1 up {0=node2=up:active}, 2 up:standby
osd: 9 osds: 9 up, 9 in; 376 remapped pgs
data:
pools: 3 pools, 576 pgs
objects: 21 objects, 2246 bytes
usage: 9579 MB used, 439 GB / 449 GB avail
pgs: 100.000% pgs not active
376 remapped+peering
200 peering
6. Check the cluster status
Once clock synchronization has completed, check the cluster status:
[root@node1 ceph]# ceph -s
cluster:
id: b13504e0-ed55-4ca2-b424-836c9bd318ae
health: HEALTH_OK
services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node2(active), standbys: node1, node3
mds: 1/1/1 up {0=node2=up:active}, 2 up:standby
osd: 9 osds: 9 up, 9 in
data:
pools: 3 pools, 576 pgs
objects: 21 objects, 2246 bytes
usage: 9592 MB used, 439 GB / 449 GB avail
pgs: 576 active+clean
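As a final check (a hedged example, not part of the original steps), the CephFS instance and its pools can be listed:
[root@node1 ceph]# ceph fs ls
The expected output is roughly: name: ceph, metadata pool: metadata, data pools: [data ]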