Building a Highly Available Kubernetes Cluster with Ansible
This guide uses Ansible to build a highly available Kubernetes cluster. The operating system is Ubuntu 16.04.6 LTS, and the relevant component versions are:
- k8s: v1.17.2
- etcd: v3.4.3
- docker: 19.03.5
- coredns: v1.6.6
- kube-ovn: 0.9.1
- dashboard: v2.0.0-rc3
Preparing the Environment
1. Hostname resolution
~# cat >> /etc/hosts << EOF
192.168.124.211 k8s-master01.linux.io k8s-master01
192.168.124.212 k8s-master02.linux.io k8s-master02
192.168.124.213 k8s-master03.linux.io k8s-master03
192.168.124.214 k8s-ha01.linux.io k8s-ha01
192.168.124.215 k8s-ha02.linux.io k8s-ha02
192.168.124.216 k8s-node01.linux.io k8s-node01
192.168.124.217 k8s-node02.linux.io k8s-node02
192.168.124.218 k8s-node03.linux.io k8s-node03
192.168.124.231 k8s-etcd01.linux.io k8s-etcd01
192.168.124.232 k8s-etcd02.linux.io k8s-etcd02
192.168.124.233 k8s-etcd03.linux.io k8s-etcd03
# VIP
192.168.124.188 k8s-api.linux.io k8s-api
192.168.124.250 reg.linux.io harbor
EOF
2. Update the package sources
~# cat > /etc/apt/sources.list << EOF
deb http://mirrors.aliyun.com/ubuntu/ xenial main
deb-src http://mirrors.aliyun.com/ubuntu/ xenial main
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates main
deb http://mirrors.aliyun.com/ubuntu/ xenial universe
deb-src http://mirrors.aliyun.com/ubuntu/ xenial universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-security main
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security main
deb http://mirrors.aliyun.com/ubuntu/ xenial-security universe
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security universe
EOF
~# apt list --upgradable
3. Disable the firewall (Ubuntu uses AppArmor rather than SELinux, so only ufw needs to be turned off)
~# ufw disable
4. Disable swap
~# swapoff -a && sed -i 's/.*swap.*/#&/' /etc/fstab
5. Tune kernel parameters
~# cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.ip_nonlocal_bind = 1
EOF
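These settings do not take effect until they are loaded, and the `bridge-nf-call-*` keys exist only once the br_netfilter module is present. A minimal sketch of applying them:

```shell
# load the bridge netfilter module so the bridge-nf sysctl keys exist
modprobe br_netfilter
# apply all sysctl drop-in files, including the one written above
sysctl --system
```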
6. Time synchronization
~# apt install -y chrony
~# cat > /etc/chrony/chrony.conf << EOF
server ntp.aliyun.com iburst
stratumweight 0
driftfile /var/lib/chrony/drift
rtcsync
makestep 10 3
bindcmdaddress 127.0.0.1
bindcmdaddress ::1
keyfile /etc/chrony/chrony.keys
commandkey 1
generatecommandkey
logchange 0.5
logdir /var/log/chrony
EOF
~# service chrony restart
~# chronyc sources
Install Dependency Tools on Every Node
~# apt-get update && apt-get upgrade -y && apt-get dist-upgrade -y
~# apt-get install python2.7 -y
~# ln -s /usr/bin/python2.7 /usr/bin/python
Install and Prepare Ansible on the Control Node
1. Install Ansible
~# apt-get install git python-pip -y
~# pip install pip --upgrade -i https://mirrors.aliyun.com/pypi/simple/
~# pip install ansible==2.6.18 netaddr==0.7.19 -i https://mirrors.aliyun.com/pypi/simple/
2. Configure passwordless SSH trust between hosts
~# ssh-keygen -t rsa -N ''
~# apt-get install sshpass
~# bash -x ssh_copy.sh
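The ssh_copy.sh script itself is not shown above. A minimal sketch of what it might look like, assuming all nodes share one root password (the password handling and host list are hypothetical; adapt them to your environment):

```shell
#!/bin/bash
# Hypothetical ssh_copy.sh: push the controller's public key to every node.
# Assumes every node accepts the same root password, read by sshpass from
# the environment. Do NOT hardcode real credentials like this in production.
export SSHPASS='your-root-password'   # placeholder

HOSTS="
192.168.124.211 192.168.124.212 192.168.124.213
192.168.124.214 192.168.124.215
192.168.124.216 192.168.124.217 192.168.124.218
192.168.124.231 192.168.124.232 192.168.124.233
"

for host in $HOSTS; do
    # -e reads the password from $SSHPASS; skip the host-key prompt on first contact
    sshpass -e ssh-copy-id -o StrictHostKeyChecking=no "root@${host}"
done
```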
Highly Available Load Balancing
1. Configure Keepalived
root@k8s-ha01:~# apt install -y haproxy keepalived
root@k8s-ha02:~# apt install -y haproxy keepalived
~# cp /usr/share/doc/keepalived/samples/keepalived.conf.vrrp /etc/keepalived/keepalived.conf
~# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state MASTER                # BACKUP on k8s-ha02
    interface ens33
    garp_master_delay 10
    smtp_alert
    virtual_router_id 88
    priority 100                # 98 on k8s-ha02
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.124.188 dev ens33 label ens33:1   # VIP for the apiserver
        192.168.124.189 dev ens33 label ens33:2   # service/business IP
    }
}
~# systemctl enable keepalived
~# systemctl restart keepalived
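A quick way to verify that the VIP landed where expected (commands are illustrative; interface and addresses as configured above):

```shell
# on the MASTER (k8s-ha01) the VIP should be bound to ens33 with label ens33:1
ip addr show ens33 | grep 192.168.124.188

# failover test: stop keepalived on the MASTER; within a few advert
# intervals the VIP should appear on the BACKUP (k8s-ha02)
systemctl stop keepalived    # run on k8s-ha01, then re-check on k8s-ha02
systemctl start keepalived   # restore; the higher-priority node takes the VIP back
```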
2. Configure HAProxy
~# vim /etc/haproxy/haproxy.cfg
listen stats
    mode http
    bind 0.0.0.0:9999
    stats enable
    log global
    stats uri /haproxy-status
    stats auth haadmin:123456

listen k8s_api_nodes_6443
    bind 192.168.124.188:6443
    mode tcp
    server 192.168.124.211 192.168.124.211:6443 check inter 2000 fall 3 rise 5
    server 192.168.124.212 192.168.124.212:6443 check inter 2000 fall 3 rise 5
    #server 192.168.124.213 192.168.124.213:6443 check inter 2000 fall 3 rise 5
~# systemctl stop haproxy
~# systemctl start haproxy # make sure the ports are being listened on; requires `net.ipv4.ip_nonlocal_bind = 1`
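Because `net.ipv4.ip_nonlocal_bind = 1` is set, haproxy can bind the VIP frontend even on the BACKUP node where the address is not currently present. A quick sanity check (credentials from the config above):

```shell
ss -tnl | grep 6443      # the 192.168.124.188:6443 frontend should be listed
ss -tnl | grep 9999      # the stats listener
# fetch the stats page locally
curl -su haadmin:123456 http://127.0.0.1:9999/haproxy-status | head
```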
Prepare the Installation Files
1. Download the required files with easzup
~# wget https://github.com/easzlab/kubeasz/releases/download/2.2.0/easzup
~# chmod +x easzup
~# ./easzup -D
The easzup script downloads the files needed by steps 4.0/4.1/4.2. After it completes successfully, everything (the kubeasz code, binaries, and offline images) is laid out under the /etc/ansible directory.
2. Configure cluster parameters
~# cp /etc/ansible/example/hosts.multi-node /etc/ansible/hosts
~# vim /etc/ansible/hosts
[etcd]
192.168.124.231 NODE_NAME=k8s-etcd01
192.168.124.232 NODE_NAME=k8s-etcd02
192.168.124.233 NODE_NAME=k8s-etcd03
[kube-master]
192.168.124.211
192.168.124.212
#192.168.124.213
[kube-node]
192.168.124.216
192.168.124.217
#192.168.124.218
[ex-lb]
192.168.124.214 LB_ROLE=master EX_APISERVER_VIP=192.168.124.188 EX_APISERVER_PORT=6443
192.168.124.215 LB_ROLE=backup EX_APISERVER_VIP=192.168.124.188 EX_APISERVER_PORT=6443
Verify the Ansible setup: `ansible all -m ping`
Each node should return SUCCESS.
Step-by-Step Installation
1. Initialize the environment
~# ansible-playbook /etc/ansible/01.prepare.yml
2. Install the etcd cluster
~# ansible-playbook /etc/ansible/02.etcd.yml
~# bash etcd_cluster_check.sh
https://192.168.124.231:2379 is healthy: successfully committed proposal: took = 10.330327ms
https://192.168.124.232:2379 is healthy: successfully committed proposal: took = 7.273319ms
https://192.168.124.233:2379 is healthy: successfully committed proposal: took = 7.382855ms
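The etcd_cluster_check.sh script is not included above. Given its output, a minimal sketch of what it likely does, assuming kubeasz's default binary and certificate paths (verify these paths on your etcd nodes):

```shell
#!/bin/bash
# Hypothetical etcd_cluster_check.sh: query the health of every etcd member.
# Certificate and binary paths below are kubeasz defaults and may differ.
export ETCDCTL_API=3
for ip in 192.168.124.231 192.168.124.232 192.168.124.233; do
    /opt/kube/bin/etcdctl \
        --endpoints="https://${ip}:2379" \
        --cacert=/etc/kubernetes/ssl/ca.pem \
        --cert=/etc/etcd/ssl/etcd.pem \
        --key=/etc/etcd/ssl/etcd-key.pem \
        endpoint health
done
```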
3. Install Docker
~# ansible-playbook /etc/ansible/03.docker.yml
4. Deploy the master nodes
~# ansible-playbook /etc/ansible/04.kube-master.yml
~# /opt/kube/bin/kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.124.211 Ready,SchedulingDisabled master 3m31s v1.17.2
192.168.124.212 Ready,SchedulingDisabled master 3m31s v1.17.2
5. Deploy the worker nodes
~# ansible-playbook /etc/ansible/05.kube-node.yml
~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.124.211 Ready,SchedulingDisabled master 18m v1.17.2
192.168.124.212 Ready,SchedulingDisabled master 18m v1.17.2
192.168.124.216 Ready node 14m v1.17.2
192.168.124.217 Ready node 14m v1.17.2
6. Deploy the network plugin
The supported network plugins are calico, flannel, kube-router, cilium, and kube-ovn; the choice is specified in the /etc/ansible/hosts file. Here we install flannel.
~# ansible-playbook /etc/ansible/06.network.yml
# start pods to verify that networking works
~# kubectl run net-test --image=busybox --replicas=4 sleep 3600
~# kubectl exec -it net-test-c4d86d548-8dztn -- /bin/sh
# ping 172.20.3.3 # verify pod-to-pod networking
# ping 114.114.114.114 # verify external connectivity
~# kubectl run app --image=ikubernetes/myapp:v1 --replicas=2
~# kubectl expose deployment/app --port=80 --target-port=80
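With the deployment exposed, the service can be checked from any node via its ClusterIP (DNS is not installed until the next step, so use the IP; the placeholder below must be filled in from the `kubectl get svc` output):

```shell
~# kubectl get svc app          # note the ClusterIP assigned to the service
~# curl http://<ClusterIP>/     # replace <ClusterIP>; should return the myapp page
```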
7. Deploy kube-dns
~# kubectl apply -f kube-dns.yaml
# verify
~# kubectl run busybox --image=busybox --replicas=4 sleep 3600
~# kubectl exec -it busybox-594d48545b-kczfn -- sh
/ # wget -O - -q myapp
Hello MyApp | Version: v1 | <a href="hostname.html">Pod Name</a>
8. Deploy CoreDNS
~# git clone https://github.com/coredns/deployment.git
~# cd deployment/kubernetes && ./deploy.sh > coredns.yaml.template # running this script requires kube-dns to already be deployed
~# cp coredns.yaml.template coredns.yaml
~# vim coredns.yaml # adjust the cluster domain
~# kubectl delete -f kube-dns.yaml # remove kube-dns
~# kubectl apply -f coredns.yaml # deploy CoreDNS
Cluster Management
Adding a kube-master Node
First configure passwordless SSH login to the new node, then run (assuming the node to add is 192.168.124.213):
~# easzctl add-master 192.168.124.213
[INFO] Action successed : add-master 192.168.124.213
- [optional] install chrony time synchronization on the new node
- run the prepare step on the new node
- install the docker service on the new node
- install the kube-master services on the new node
- install the kube-node services on the new node
- install the network plugin on the new node
- disable scheduling of business pods onto the new master node
- update the haproxy load balancing on the worker nodes and restart it
Adding a kube-node Node
First configure passwordless SSH login to the new node, then run (assuming the node to add is 192.168.124.218):
~# easzctl add-node 192.168.124.218
- [optional] install chrony time synchronization on the new node
- run the prepare step on the new node
- install the docker service on the new node
- install the kube-node services on the new node
- install the network plugin on the new node
Upgrading the Cluster
Upgrading a cluster carries some risk; proceed with caution.
A cluster installed from the project's master branch can be upgraded to any patch release within the supported Kubernetes minor version. For example, a cluster installed at 1.17.0 can easily be upgraded to any 1.17.x release.
- Back up the etcd data:
ETCDCTL_API=3 etcdctl snapshot save backup.db
or: ~# ansible-playbook /etc/ansible/23.backup.yml
- Inspect the backup:
ETCDCTL_API=3 etcdctl --write-out=table snapshot status backup.db
- Back up the current binaries:
~# cp /etc/ansible/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubelet,kube-proxy,kubectl} /bak/v1.17.2/
- Put the new binaries in place:
~# tar -xf kubernetes-server-linux-amd64.tar.gz
~# cp kubernetes/server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubelet,kube-proxy,kubectl} /etc/ansible/bin/
- Run the upgrade:
~# easzctl upgrade
~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.124.211 Ready,SchedulingDisabled master 64m v1.17.4
192.168.124.212 Ready,SchedulingDisabled master 64m v1.17.4
192.168.124.213 Ready,SchedulingDisabled master 38m v1.17.4
192.168.124.216 Ready node 60m v1.17.4
192.168.124.217 Ready node 60m v1.17.4
192.168.124.218 Ready node 33m v1.17.4
If the cluster is not very large, it can also be upgraded manually, one node at a time:
- remove the node from the load balancer
- stop the services on the node
- replace the binaries
- start the services
- add the node back to the load balancer
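For a worker node, the manual procedure above can be sketched as follows (a sketch, not the project's official method; node name and binary paths are assumptions based on this install, and drain/uncordon stand in for the load-balancer steps):

```shell
# Manual upgrade of one worker node, run from the control node.
NODE=192.168.124.216

# 1. take the node out of service and evict its pods
kubectl drain ${NODE} --ignore-daemonsets --delete-local-data

# 2. stop the node services
ssh ${NODE} 'systemctl stop kubelet kube-proxy'

# 3. replace the binaries (path assumes the kubeasz layout used above)
scp kubernetes/server/bin/{kubelet,kube-proxy} ${NODE}:/opt/kube/bin/

# 4. start the services again
ssh ${NODE} 'systemctl start kubelet kube-proxy'

# 5. return the node to service
kubectl uncordon ${NODE}
```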
Reference: https://github.com/easzlab/kubeasz