Deploying and Configuring a Highly Available Kubernetes Cluster

1. Basic environment

192.168.10.20 master1 ubuntu 24.04
192.168.10.21 master2 ubuntu 24.04
192.168.10.22 master3 ubuntu 24.04
192.168.10.24 node1 ubuntu 24.04
192.168.10.23 virtual ip
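
For reference (the actual preparation is in the omitted section 2), a matching /etc/hosts on every node might look like this; the cka-* hostnames are taken from the node names that appear in later command output:

# /etc/hosts (assumed)
192.168.10.20 cka-master1
192.168.10.21 cka-master2
192.168.10.22 cka-master3
192.168.10.24 cka-node1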

2. Preliminary preparation (omitted)

3. Install the Kubernetes trio of kubeadm, kubelet and kubectl (omitted)

4. Install and configure keepalived

Reference: kubeadm/docs/ha-considerations.md at main · kubernetes/kubeadm · GitHub

sudo apt install -y keepalived

# Write the configuration
cat <<EOF | sudo tee /etc/keepalived/keepalived.conf
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface enp1s0
    virtual_router_id 51
    priority 100
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.10.23
    }
    track_script {
        check_apiserver
    }
}
EOF
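
The file above is for the master that should initially hold the VIP. Per the referenced kubeadm document, the other two masters use the same configuration with two changes, for example:

# on the other two masters (illustrative values)
    state BACKUP
    priority 99    # e.g. 99 on master2, 98 on master3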

# Health-check script (the EOF delimiter is quoted so that $* is not expanded when the file is written)
cat <<'EOF' | sudo tee /etc/keepalived/check_apiserver.sh
#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl -sfk --max-time 2 https://localhost:6443/healthz -o /dev/null || errorExit "Error GET https://localhost:6443/healthz"
EOF

# Make the script executable
sudo chmod +x /etc/keepalived/check_apiserver.sh
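
With the configuration and check script in place, enable and restart the service so the changes take effect:

sudo systemctl enable keepalived
sudo systemctl restart keepalived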

Verify the keepalived installation

# Ping 192.168.10.23 from any of the servers
64 bytes from 192.168.10.23: icmp_seq=1 ttl=64 time=1.06 ms
64 bytes from 192.168.10.23: icmp_seq=2 ttl=64 time=0.648 ms
64 bytes from 192.168.10.23: icmp_seq=3 ttl=64 time=0.775 ms

# Use ip addr to see which node currently holds the virtual IP (here it is on 192.168.10.21)
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:b6:73:c8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.21/24 brd 192.168.10.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet 192.168.10.23/32 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb6:73c8/64 scope link 
       valid_lft forever preferred_lft forever

# Reboot 192.168.10.21 and verify that the virtual IP fails over (here it moved to 192.168.10.20)
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:a3:ee:a8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.20/24 brd 192.168.10.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet 192.168.10.23/32 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fea3:eea8/64 scope link 
       valid_lft forever preferred_lft forever

5. Install and configure HAProxy

Reference: kubeadm/docs/ha-considerations.md at main · kubernetes/kubeadm · GitHub

# Install
sudo apt install -y haproxy

# Back up the original configuration file
sudo cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak

# Write the new configuration
cat <<EOF | sudo tee /etc/haproxy/haproxy.cfg
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log stdout format raw local0
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          35s
    timeout server          35s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxies to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend apiserverbackend

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserverbackend
    option httpchk

    http-check connect ssl
    http-check send meth GET uri /healthz
    http-check expect status 200

    mode tcp
    balance     roundrobin

    server cka-master1 192.168.10.20:6443 check verify none
    server cka-master2 192.168.10.21:6443 check verify none
    server cka-master3 192.168.10.22:6443 check verify none
    # [...]
EOF
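
One caveat: the referenced kubeadm document notes that when HAProxy is co-located with the apiservers (as here, alongside keepalived on the masters), its frontend must listen on a port other than the apiserver's 6443, e.g. bind *:8443, with controlPlaneEndpoint and the join commands adjusted to match, since two processes cannot bind the same port. With the configuration written, enable and restart HAProxy:

sudo systemctl enable haproxy
sudo systemctl restart haproxy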

6. Modify the kubeadm init configuration file

Manually generate the default kubeadm configuration:

kubeadm config print init-defaults --component-configs KubeletConfiguration | tee kubeadm-init.yaml

Review the following settings, which need to be changed:

grep -E 'advertiseAddress|bindPort|name|controlPlaneEndpoint|kubernetesVersion|serviceSubnet|cgroupDriver|imageRepository' kubeadm-init.yaml -n -C 1
11-localAPIEndpoint:
12:  advertiseAddress: 192.168.10.20     # change to this master's own IP
13:  bindPort: 6443
14-nodeRegistration:
--
17-  imagePullSerial: true
18:  name: cka-master1
19-  taints: null
--
34-clusterName: kubernetes
35:controlPlaneEndpoint: "192.168.10.23:6443"    # set to the virtual IP
36-controllerManager: {}
--
41-    dataDir: /var/lib/etcd
42:imageRepository: registry.aliyuncs.com/google_containers    # switch to the Aliyun container registry mirror
43-kind: ClusterConfiguration
44:kubernetesVersion: 1.32.2    # set to the version being deployed
45-networking:
46-  dnsDomain: cluster.local
47:  serviceSubnet: 10.96.0.0/12    # adjust if needed
48-proxy: {}
--
52-apiVersion: kubelet.config.k8s.io/v1beta1
53:cgroupDriver: systemd    # set the cgroup driver to systemd
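
Pieced together, the edited parts of kubeadm-init.yaml look roughly as follows. This is a sketch reconstructed from the grep output above; the podSubnet line is an assumed addition, included so that the Calico pool configured in section 9 (10.244.0.0/16) has a matching Pod CIDR:

apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.10.20            # this master's own IP
  bindPort: 6443
nodeRegistration:
  name: cka-master1
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.10.23:6443"   # the keepalived virtual IP
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: 1.32.2
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16                   # assumed addition, matches Calico's pool
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd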

7. Run the initialization

First run the initialization with the --dry-run flag as a test:

sudo kubeadm init --v=5 --config=kubeadm-init.yaml --upload-certs --dry-run

Repeat until there are no errors, then remove --dry-run and start the real initialization:

sudo kubeadm init --v=5 --config=kubeadm-init.yaml --upload-certs


Once the initialization succeeds, the output ends like this:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes running the following command on each as root:

  kubeadm join 192.168.10.23:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:e927585d374438d2a3ba25e55256a0136fb7053322ba081069460f341dbafb46 \
        --control-plane --certificate-key 50f836e0de98125b363373c2600a3065ee827780f782978773450b9d33c4ed35

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.10.23:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:e927585d374438d2a3ba25e55256a0136fb7053322ba081069460f341dbafb46  

Verify:

kubectl get po -n kube-system -l tier=control-plane

NAME                                  READY   STATUS    RESTARTS   AGE
etcd-cka-master1                      1/1     Running   0          25m
kube-apiserver-cka-master1            1/1     Running   0          25m
kube-controller-manager-cka-master1   1/1     Running   0          25m
kube-scheduler-cka-master1            1/1     Running   0          25m
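
Optionally, confirm that the apiserver also answers through the virtual IP (-k skips certificate verification since the request is made by IP; a healthy apiserver returns ok):

curl -k https://192.168.10.23:6443/healthz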

8. Add the remaining master nodes

First add the --dry-run flag to test whether the join would report any errors:

sudo kubeadm join 192.168.10.23:6443 --dry-run --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:e927585d374438d2a3ba25e55256a0136fb7053322ba081069460f341dbafb46 \
        --control-plane --certificate-key 50f836e0de98125b363373c2600a3065ee827780f782978773450b9d33c4ed35

If there are no errors, run:

sudo kubeadm join 192.168.10.23:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:e927585d374438d2a3ba25e55256a0136fb7053322ba081069460f341dbafb46 \
        --control-plane --certificate-key 50f836e0de98125b363373c2600a3065ee827780f782978773450b9d33c4ed35

On success the output looks like the following; add the third master to the cluster the same way:

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Configure kubectl access:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Check the nodes (NotReady is expected at this point, since no CNI network plugin has been installed yet):

kubectl get nodes
--------------------------------------------------------
NAME          STATUS     ROLES           AGE     VERSION
cka-master1   NotReady   control-plane   50m     v1.32.2
cka-master2   NotReady   control-plane   9m59s   v1.32.2
cka-master3   NotReady   control-plane   34s     v1.32.2
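
To confirm that the stacked etcd cluster now has three members, etcdctl can be run inside one of the etcd Pods; a sketch assuming the default kubeadm certificate layout:

kubectl -n kube-system exec etcd-cka-master1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list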

Check the static Pods:

kubectl get pods -n kube-system -l tier=control-plane
----------------------------------------------------------------
NAME                                  READY   STATUS    RESTARTS   AGE
etcd-cka-master1                      1/1     Running   0          50m
etcd-cka-master2                      1/1     Running   0          10m
etcd-cka-master3                      1/1     Running   0          57s
kube-apiserver-cka-master1            1/1     Running   0          50m
kube-apiserver-cka-master2            1/1     Running   0          10m
kube-apiserver-cka-master3            1/1     Running   0          57s
kube-controller-manager-cka-master1   1/1     Running   0          50m
kube-controller-manager-cka-master2   1/1     Running   0          10m
kube-controller-manager-cka-master3   1/1     Running   0          57s
kube-scheduler-cka-master1            1/1     Running   0          50m
kube-scheduler-cka-master2            1/1     Running   0          10m
kube-scheduler-cka-master3            1/1     Running   0          57s

9. Download and install Calico

Download calico.yaml:

wget https://docs.projectcalico.org/manifests/calico.yaml

Modify calico.yaml:

# Replace the image registry with the Huawei Cloud mirror
sed -i 's/docker.io/swr.cn-north-4.myhuaweicloud.com\/ddn-k8s\/docker.io/g' calico.yaml

# Change the IP pool to the Pod CIDR (example: 10.244.0.0/16); in calico.yaml,
# uncomment and edit the CALICO_IPV4POOL_CIDR environment variable:
- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"

Install Calico:

kubectl apply -f calico.yaml
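
Rolling the DaemonSet out on every node can take a few minutes; one way to wait for it (calico-node is the DaemonSet name in the standard manifest):

kubectl -n kube-system rollout status ds/calico-node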

# Verify the installation: the Calico Pods should all be Running, and the nodes then turn Ready
kubectl get pods -n kube-system -l k8s-app=calico-node

Additional notes:

If the Calico images fail to pull, work through the following in order of priority:

  1. Replace the image source: use a domestic mirror registry such as Aliyun.
  2. Configure a registry mirror: speed up image pulls in containerd.
  3. Pull images manually: for strict intranet environments (see the sketch after this list).
  4. Check version compatibility: make sure Calico matches the Kubernetes version.
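
For step 3, images can be pulled manually into containerd's k8s.io namespace on each node and retagged to the name the manifest expects; a sketch (the image tag is illustrative and must match the version referenced by your calico.yaml):

# pull from a reachable mirror, then retag to the original name
sudo ctr -n k8s.io images pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/node:v3.25.0
sudo ctr -n k8s.io images tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/calico/node:v3.25.0 docker.io/calico/node:v3.25.0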

After completing the steps above, the Calico components start normally:

kubectl get po -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS       AGE
kube-system   calico-kube-controllers-6b7cfc4d75-7769f   1/1     Running   1 (2d7h ago)   7d15h
kube-system   calico-node-cj6n9                          1/1     Running   1 (2d7h ago)   7d15h
kube-system   calico-node-dcpr2                          1/1     Running   1 (2d7h ago)   7d15h
kube-system   calico-node-dw579                          1/1     Running   1 (2d7h ago)   7d15h
kube-system   calico-node-rztlr                          1/1     Running   6 (2d7h ago)   7d2h
kube-system   coredns-6766b7b6bb-9s45j                   1/1     Running   1              7d18h
kube-system   coredns-6766b7b6bb-mbk5b                   1/1     Running   1 (2d7h ago)   7d18h
kube-system   etcd-cka-master1                           1/1     Running   1 (2d7h ago)   7d18h
kube-system   etcd-cka-master2                           1/1     Running   1 (2d7h ago)   7d17h
kube-system   etcd-cka-master3                           1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-apiserver-cka-master1                 1/1     Running   2 (2d7h ago)   7d18h
kube-system   kube-apiserver-cka-master2                 1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-apiserver-cka-master3                 1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-controller-manager-cka-master1        1/1     Running   1 (2d7h ago)   7d18h
kube-system   kube-controller-manager-cka-master2        1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-controller-manager-cka-master3        1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-proxy-45dqn                           1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-proxy-9kf94                           1/1     Running   1 (2d7h ago)   7d18h
kube-system   kube-proxy-k86bx                           1/1     Running   6 (2d7h ago)   7d2h
kube-system   kube-proxy-mgwkn                           1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-scheduler-cka-master1                 1/1     Running   1 (2d7h ago)   7d18h
kube-system   kube-scheduler-cka-master2                 1/1     Running   1 (2d7h ago)   7d17h
kube-system   kube-scheduler-cka-master3                 1/1     Running   1 (2d7h ago)   7d17h

The node status changes to Ready:

kubectl get nodes
NAME          STATUS     ROLES           AGE     VERSION
cka-master1   Ready      control-plane   7d18h   v1.32.2
cka-master2   Ready      control-plane   7d17h   v1.32.2
cka-master3   Ready      control-plane   7d17h   v1.32.2

10. Add worker nodes

Run kubeadm join, again going through the virtual IP (the controlPlaneEndpoint) rather than an individual master's address:

sudo kubeadm join 192.168.10.23:6443 --token 5ealr1.c4ozr01xcwtjji8h --discovery-token-ca-cert-hash sha256:25393deeda6473b58b70cb5f8d7ebb5cbe9633f370815af643641d844aaf4a69
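
The bootstrap token printed by kubeadm init expires after 24 hours, which presumably explains the different token used here; if needed, a fresh join command can be generated on any master:

sudo kubeadm token create --print-join-command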

Verify the node status:

kubectl get nodes
NAME          STATUS     ROLES           AGE     VERSION
cka-master1   Ready      control-plane   7d18h   v1.32.2
cka-master2   Ready      control-plane   7d17h   v1.32.2
cka-master3   Ready      control-plane   7d17h   v1.32.2
cka-node1     Ready      <none>          7d2h    v1.32.2

This completes the deployment of the highly available Kubernetes cluster. To add more worker nodes, simply repeat the steps above.
