Docker Basics (19) - Kubernetes (II) | Deploying a K8s Cluster (One Master, One Worker Node)


Kubernetes, also known as K8s or Kube, is the industry's most popular container management/operations tool (container orchestrator), originally developed at Google. It is an open-source platform that automates container management and operations, including deployment, scheduling, and scaling of node clusters.

For a detailed introduction to Kubernetes, see "System Architecture and Design (6) - Kubernetes (K8s)".

This article deploys a K8s cluster with one master and one worker node.


1. Deployment Environment

    Virtual machine: VirtualBox 6.1.30 (Windows edition)
    Operating system: Linux CentOS 7.9, 64-bit
    Docker version: 20.10.7
    Docker Compose version: 2.6.1
    Kubernetes version: 1.23.0

    Working directory: /home/k8s
    Linux user: a non-root user (any name you like; xxx is used here as a placeholder) that belongs to the docker group

    Host list:

    Hostname     IP             Role     OS
    k8s-master   192.168.0.10   master   CentOS 7.9
    k8s-node01   192.168.0.11   node     CentOS 7.9

 

    1) Set the hostnames

        Run the following command on the Master host:

            $ sudo hostnamectl set-hostname k8s-master

        Run the following command on the Node host:

            $ sudo hostnamectl set-hostname k8s-node01

        Edit the /etc/hosts file on both the Master and Node hosts (a quick name-resolution check follows):

            $ sudo vim /etc/hosts

                # add the following entries
                192.168.0.10 k8s-master
                192.168.0.11 k8s-node01
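
        To confirm that the new names resolve on both machines (a quick check using the entries just added):

            $ ping -c 1 k8s-master    # should answer from 192.168.0.10
            $ ping -c 1 k8s-node01    # should answer from 192.168.0.11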

    2) Disable SELinux

        (1) Temporary: run the following commands

            $ sudo setenforce 0
            $ getenforce

                Permissive

            Note: in permissive mode SELinux no longer enforces its policy, so it is effectively disabled, but the change is lost after a reboot. The corresponding command to temporarily turn enforcement back on is:

            $ sudo setenforce 1
            $ getenforce

                Enforcing                   

        (2) Permanent: run the following commands

            $ sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
            $ getenforce

                Enforcing

            Note: the mode still shows Enforcing right after the command, because the permanent change only takes effect after a reboot. After rebooting, run:

            $ getenforce

                Disabled

            The corresponding command to permanently re-enable SELinux is:

                $ sudo sed -i 's/^SELINUX=.*/SELINUX=enforcing/' /etc/selinux/config

            It also requires a reboot to take effect.
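
        You can also check the current mode and the mode configured in /etc/selinux/config in one command:

            $ sestatus    # shows "Current mode" and "Mode from config file"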

    3) Disable swap

        (1) Temporary: run the following commands

            $ sudo swapoff -a
            $ free -m

                        total        used        free      shared  buff/cache   available
                Mem:    1837         132        1588           8         116        1569
                Swap:    0           0           0

            Note: swap is only disabled temporarily and comes back after a reboot. The corresponding command to temporarily re-enable it is:

            $ sudo swapon -a
            $ free -m

                        total        used        free      shared  buff/cache   available
                Mem:    1837         133        1587           8         116        1568
                Swap:   2047           0        2047                        

 

        (2) Permanent: run the following commands

            $ sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab # comment out the swap entry in /etc/fstab
            $ free -m

                        total        used        free      shared  buff/cache   available
                Mem:    1837         133        1587           8         117        1568
                Swap:   2047           0        2047

            Note: swap is still enabled right after the command, because the permanent change only takes effect after a reboot. After rebooting, run:

            $ free -m

                        total        used        free      shared  buff/cache   available
                Mem:    1837         129        1546           8         161        1562
                Swap:    0           0           0

            To permanently re-enable swap, restore the /etc/fstab entry that the sed command commented out; this also requires a reboot to take effect.
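
        A quick way to confirm that swap will stay off across reboots is to check both the live state and /etc/fstab:

            $ swapon -s                     # lists active swap devices; empty when swap is off
            $ grep ' swap ' /etc/fstab      # the swap entry should now start with '#'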

    4) Configure the firewall

        Stop and disable the firewall with the following command.

            $ sudo systemctl stop firewalld && sudo systemctl disable firewalld

        To re-enable and start the firewall later:

            $ sudo systemctl enable firewalld && sudo systemctl start firewalld
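
        If you would rather keep firewalld running instead of disabling it, an alternative (not used in the rest of this article) is to open the ports Kubernetes needs. The list below follows the official "Ports and Protocols" page, plus UDP 8472 for Flannel's default VXLAN backend:

            # on the Master
            $ sudo firewall-cmd --permanent --add-port={6443,2379-2380,10250,10257,10259}/tcp
            $ sudo firewall-cmd --permanent --add-port=8472/udp
            $ sudo firewall-cmd --reload

            # on the Node
            $ sudo firewall-cmd --permanent --add-port={10250,30000-32767}/tcp
            $ sudo firewall-cmd --permanent --add-port=8472/udp
            $ sudo firewall-cmd --reload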

    5) Synchronize time

        $ sudo yum -y install ntpdate
        $ sudo ntpdate time.windows.com
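
        ntpdate performs a one-shot sync. To keep the clocks aligned over time, one simple option (an extra step, not part of the original setup) is a periodic cron job:

            $ sudo crontab -e

                # sync the clock every hour
                0 * * * * /usr/sbin/ntpdate time.windows.com > /dev/null 2>&1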

    6) Install Docker and Compose

        For Docker installation and configuration, see "Docker Basics (1) - Docker Architecture, Docker Installation, Docker Registry Mirrors".

        For Docker Compose installation and configuration, see "Docker Basics (4) - Docker Compose".

    Note: all of the steps above (permanently disabling SELinux, permanently disabling swap, configuring the firewall, time synchronization, and installing Docker and Compose) must be performed on both the Master and the Node.
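
    The official kubeadm prerequisites additionally recommend loading the br_netfilter module and letting bridged traffic pass through iptables on every node; a minimal sketch of that extra step:

        $ sudo modprobe br_netfilter
        $ echo br_netfilter | sudo tee /etc/modules-load.d/k8s.conf
        $ printf 'net.bridge.bridge-nf-call-iptables = 1\nnet.bridge.bridge-nf-call-ip6tables = 1\nnet.ipv4.ip_forward = 1\n' | sudo tee /etc/sysctl.d/k8s.conf
        $ sudo sysctl --system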


2. Install Kubernetes on the Master and the Node

    1) Configure the YUM repository

        $ sudo vim /etc/yum.repos.d/kubernetes.repo

            [kubernetes]
            name=Kubernetes
            baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
            enabled=1
            gpgcheck=0
            repo_gpgcheck=0
            gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
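
        After saving the repo file, you can refresh the YUM cache and confirm that the pinned 1.23.0 packages are available (a quick optional check):

            $ sudo yum makecache fast
            $ yum list kubelet kubeadm kubectl --showduplicates | grep 1.23.0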


    2) Install the Kubernetes components (pinned to a specific version)

        $ sudo yum install -y kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0

        $ kubelet --version

            Kubernetes v1.23.0

        $ kubeadm version

            kubeadm version: &version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:15:11Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}

        $ kubectl version

            Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:16:20Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
            The connection to the server localhost:8080 was refused - did you specify the right host or port?
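
        Note: the "connection to the server localhost:8080 was refused" message from kubectl version is expected at this point, since no cluster exists yet. The official kubeadm installation guide also suggests enabling kubelet so that it starts on boot (it will crash-loop until kubeadm init/join configures it, which is normal):

            $ sudo systemctl enable --now kubelet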


3. Initialize the Cluster on the Master

    1) Initialize the Master host

        Either of the following two approaches can be used to initialize the Master host:

        (1) Using command-line flags

            $ sudo kubeadm init \
                --apiserver-advertise-address=192.168.0.10 \
                --image-repository=registry.aliyuncs.com/google_containers \
                --kubernetes-version=v1.23.0 \
                --service-cidr=10.96.0.0/12 \
                --pod-network-cidr=10.244.0.0/16 \
                --ignore-preflight-errors=all

                Flag descriptions:

                    --apiserver-advertise-address: the address the Master's API server advertises and listens on;
                    --image-repository: the registry to pull the control-plane images from;
                    --kubernetes-version: the K8s version, which must match the version you installed;
                    --service-cidr: the cluster-internal virtual network (Service IP range) that provides the unified entry point to Pods;
                    --pod-network-cidr: the Pod network CIDR, which must match the one used in the CNI plugin's deployment YAML;
                    --ignore-preflight-errors: a list of preflight checks whose errors are shown as warnings; the value 'all' ignores errors from every check;

                    More kubeadm init flags are documented at: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/

        (2) Using a configuration file

            $ cd /home/k8s
            $ kubeadm config print init-defaults > init.default.yaml
            $ vim init.default.yaml

                apiVersion: kubeadm.k8s.io/v1beta3
                bootstrapTokens:
                - groups:
                  - system:bootstrappers:kubeadm:default-node-token
                  token: abcdef.0123456789abcdef
                  ttl: 24h0m0s
                  usages:
                  - signing
                  - authentication
                kind: InitConfiguration
                localAPIEndpoint:
                  advertiseAddress: 192.168.0.10
                  bindPort: 6443
                nodeRegistration:
                  criSocket: /var/run/dockershim.sock
                  name: k8s-master
                  taints: null
                ---
                apiServer:
                  timeoutForControlPlane: 4m0s
                apiVersion: kubeadm.k8s.io/v1beta3
                certificatesDir: /etc/kubernetes/pki
                clusterName: kubernetes
                controllerManager: {}
                dns:
                  type: CoreDNS
                etcd:
                  local:
                    dataDir: /var/lib/etcd
                imageRepository: registry.aliyuncs.com/google_containers
                kind: ClusterConfiguration
                kubernetesVersion: v1.23.0
                networking:
                  dnsDomain: cluster.local
                  serviceSubnet: 10.96.0.0/12
                  podSubnet: 10.244.0.0/16
                scheduler: {}

                Note: imageRepository: by default images are pulled from k8s.gcr.io, which is hard to reach from mainland China; change it to registry.aliyuncs.com/google_containers.

                    localAPIEndpoint.advertiseAddress: change it to 192.168.0.10.

                    networking.podSubnet: add this field with the value 10.244.0.0/16.

            $ sudo kubeadm init --config=init.default.yaml

                Note: before running kubeadm init --config, you can pre-pull the required images with kubeadm config images pull --config=init.default.yaml, as shown below.
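
                Pre-pulling looks like this; both commands read the image repository and version from the config file:

                    $ kubeadm config images list --config=init.default.yaml
                    $ sudo kubeadm config images pull --config=init.default.yaml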

    2) kubeadm init results

        (1) Failure

            [init] Using Kubernetes version: v1.23.0

            ...

            [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
            [kubelet-check] Initial timeout of 40s passed.
            [kubelet-check] It seems like the kubelet isn't running or healthy.
            [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

            Note: when kubeadm init fails, check the kubelet status:

                $ sudo journalctl -f -u kubelet

                    ...

                    Nov 16 01:56:05 k8s-master kubelet[8634]: E1116 01:56:05.490013    8634 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
                    Nov 16 01:56:05 k8s-master systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
                    Nov 16 01:56:05 k8s-master systemd[1]: Unit kubelet.service entered failed state.
                    Nov 16 01:56:05 k8s-master systemd[1]: kubelet.service failed.

                The log shows a cgroup driver mismatch: Docker's cgroup driver is different from the one kubelet is configured to use.

                Check Docker's cgroup driver:

                    $ docker info | grep Cgroup

                        Cgroup Driver: cgroupfs

                Check kubelet's cgroup driver:

                    $ sudo cat /var/lib/kubelet/config.yaml | grep cgroup

                        cgroupDriver: systemd

                To change Docker's driver, edit /etc/docker/daemon.json (create it if it does not exist) and add the following:

                    {
                        ...

                        "exec-opts":["native.cgroupdriver=systemd"]
                    }

                Restart Docker and reset the Master node:

                    $ sudo systemctl daemon-reload
                    $ sudo systemctl restart docker    

                    $ sudo kubeadm reset

                After running the commands above, re-run the Master initialization.

        (2) Success

            ...

            Your Kubernetes control-plane has initialized successfully!

            To start using your cluster, you need to run the following as a regular user:

            mkdir -p $HOME/.kube
            sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
            sudo chown $(id -u):$(id -g) $HOME/.kube/config

            Alternatively, if you are the root user, you can run:

            export KUBECONFIG=/etc/kubernetes/admin.conf

            You should now deploy a pod network to the cluster.
            Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
            https://kubernetes.io/docs/concepts/cluster-administration/addons/

            Then you can join any number of worker nodes by running the following on each as root:

            kubeadm join 192.168.0.10:6443 --token 67lwvx.7nhlvr3y74g7yccg \
            --discovery-token-ca-cert-hash sha256:c62bfdbb2a65c5ad5bdee19596f0130b92c93d12ececb8898deaeb2b54b1e7eb

            Note: the two-line kubeadm join command is what the Node runs to connect to the Master. Every successful kubeadm init generates a different token and sha256 hash, so make sure the command you run on the Node is the most recently generated one.

                You can regenerate the kubeadm join command at any time by running:

                    $ kubeadm token create --print-join-command

    3) Create the kubectl authentication file

        $ mkdir -p $HOME/.kube
        $ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        $ sudo chown $(id -u):$(id -g) $HOME/.kube/config
        $ kubectl get nodes        # check the node status

            NAME         STATUS     ROLES                  AGE   VERSION
            k8s-master   NotReady   control-plane,master   52s   v1.23.0

        Note: the node is NotReady because no network plugin has been deployed yet.
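
        Until a CNI plugin is deployed, the CoreDNS pods also stay in Pending, which is an easy way to confirm the cause:

            $ kubectl get pods -n kube-system -o wide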

    4) Add the Flannel network plugin

        Flannel is a simple, easy-to-configure Pod network plugin designed for Kubernetes. Among the many open-source CNI (Container Network Interface) plugins, it is one of the easier ones to deploy and one of the better documented.

        https://github.com/flannel-io/flannel

        Install the Flannel plugin on the Master host as follows.

        (1) Apply the kube-flannel.yml manifest

            $ cd /home/k8s
            $ wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml   # download the resource manifest
            $ kubectl apply -f kube-flannel.yml

                namespace/kube-flannel created
                clusterrole.rbac.authorization.k8s.io/flannel created
                clusterrolebinding.rbac.authorization.k8s.io/flannel created
                serviceaccount/flannel created
                configmap/kube-flannel-cfg created
                daemonset.apps/kube-flannel-ds created


        (2) Restart networking

            $ sudo systemctl restart network
            $ sudo systemctl restart kubelet

            $ kubectl get nodes

                NAME         STATUS   ROLES                  AGE     VERSION
                k8s-master   Ready    control-plane,master   4m      v1.23.0

                Note: the Flannel network may take a little while to come up; you can check whether the flannel pods are in the Running state with:

                    $ kubectl get pod -n kube-flannel


4. Join the Cluster from the Node

    1) Create the kubectl authentication file

        Copy the /etc/kubernetes/admin.conf file from the Master host to the local /etc/kubernetes/admin.conf.

        $ sudo scp root@192.168.0.10:/etc/kubernetes/admin.conf /etc/kubernetes/admin.conf
        $ sudo chmod +r /etc/kubernetes/admin.conf
        $ echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
        $ source ~/.bash_profile

    2) Join the Node host to the cluster

        On the Node host, run the kubeadm join command that kubeadm init printed on the Master; the command has the following form:

            $ sudo kubeadm join 192.168.0.10:6443 --token 67lwvx.7nhlvr3y74g7yccg \
                    --discovery-token-ca-cert-hash sha256:c62bfdbb2a65c5ad5bdee19596f0130b92c93d12ececb8898deaeb2b54b1e7eb

            Flag descriptions:

                --token: the bootstrap token issued by the cluster's Master
                --discovery-token-ca-cert-hash: verifies that the root CA public key matches this hash

                More kubeadm join flags are documented at: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-join/#join-workflow

    3) kubeadm join results

        (1) Failure

            [preflight] Running pre-flight checks
            [preflight] Reading configuration from the cluster...
            [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
            [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
            [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
            [kubelet-start] Starting the kubelet
            [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
            [kubelet-check] Initial timeout of 40s passed.
            [kubelet-check] It seems like the kubelet isn't running or healthy.
            [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

            Note: when kubeadm join fails, check the kubelet status:

                $ sudo journalctl -f -u kubelet

                    ...

                    Nov 16 04:49:27 k8s-node01 kubelet[30432]: E1116 04:49:27.087250   30432 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
                    Nov 16 04:49:27 k8s-node01 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
                    Nov 16 04:49:27 k8s-node01 systemd[1]: Unit kubelet.service entered failed state.
                    Nov 16 04:49:27 k8s-node01 systemd[1]: kubelet.service failed.

                The log shows the same cgroup driver mismatch between Docker and kubelet. Fix it the same way as on the Master node above, then run the kubeadm join command again.

        (2) Success

            [preflight] Running pre-flight checks
            [preflight] Reading configuration from the cluster...
            [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
            [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
            [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
            [kubelet-start] Starting the kubelet
            [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

            This node has joined the cluster:
            * Certificate signing request was sent to apiserver and a response was received.
            * The Kubelet was informed of the new secure connection details.

            Run 'kubectl get nodes' on the control-plane to see this node join the cluster.


            # check the nodes
            $ kubectl get nodes

                NAME         STATUS     ROLES                  AGE     VERSION
                k8s-master   Ready      control-plane,master   129m    v1.23.0
                k8s-node01   NotReady   <none>                 7m15s   v1.23.0

            Note: k8s-node01 shows NotReady because the network plugin has not been set up yet.

    4) Add the Flannel network plugin

        Install the Flannel plugin on the Node host as follows.

        (1) Apply the kube-flannel.yml manifest

            $ cd /home/k8s
            $ wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml   # download the resource manifest
            $ kubectl apply -f kube-flannel.yml

                namespace/kube-flannel created
                clusterrole.rbac.authorization.k8s.io/flannel created
                clusterrolebinding.rbac.authorization.k8s.io/flannel created
                serviceaccount/flannel created
                configmap/kube-flannel-cfg created
                daemonset.apps/kube-flannel-ds created


        (2) Restart networking

            $ sudo systemctl restart network
            $ sudo systemctl restart kubelet

            $ kubectl get nodes

                NAME         STATUS   ROLES                  AGE    VERSION
                k8s-master   Ready    control-plane,master   140m   v1.23.0
                k8s-node01   Ready    <none>                 18m    v1.23.0
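
    With both nodes Ready, a quick smoke test confirms that scheduling and networking work end to end (a minimal sketch; the nginx deployment name and port are just examples):

        $ kubectl create deployment nginx --image=nginx
        $ kubectl expose deployment nginx --port=80 --type=NodePort
        $ kubectl get pods,svc -o wide             # wait for the pod to reach Running and note the NodePort
        $ curl http://192.168.0.11:<NodePort>      # replace <NodePort> with the port assigned above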

 


5. Uninstall Kubernetes (on the Master/Node)

    1) Clean up the Flannel plugin

        $ cd /home/k8s
        $ kubectl delete -f kube-flannel.yml
        $ sudo systemctl restart kubelet

    2) Reset the node

        $ sudo kubeadm reset

            [reset] Reading configuration from the cluster...
            [reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
            [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
            [reset] Are you sure you want to proceed? [y/N]: y
             ...
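
        Note: kubeadm reset does not clean up everything; its output points out that CNI configuration and iptables rules are left behind. For a fully clean host, the following extra cleanup is commonly done (a sketch; adjust to your environment):

            $ sudo rm -rf /etc/cni/net.d
            $ sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X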


    3) Uninstall the components

        # remove the three core k8s packages
        $ sudo yum -y remove kubelet kubeadm kubectl

            Loaded plugins: fastestmirror
            Resolving Dependencies
            --> Running transaction check
            ---> Package kubeadm.x86_64 0:1.23.0-0 will be erased
            ---> Package kubectl.x86_64 0:1.23.0-0 will be erased
            ---> Package kubelet.x86_64 0:1.23.0-0 will be erased
            --> Processing Dependency: kubelet for package: kubernetes-cni-0.8.7-0.x86_64
            --> Running transaction check
            ---> Package kubernetes-cni.x86_64 0:0.8.7-0 will be erased
            --> Finished Dependency Resolution
            ...    

        # delete the kubectl configuration directory
        $ rm -rf ~/.kube/

    4) Clean up the k8s-related Docker images

        If the host is used only for k8s, you can clean up with the following commands; note that they remove all containers and all images on the host, not just the k8s ones:

            $ docker rm $(docker ps -a -q)

            $ docker rmi $(docker images -q)

        Alternatively, go through the output of docker images and delete the k8s images one by one with docker rmi:

            $ sudo docker images

                ...

            $ sudo docker rmi [IMAGE ID]

                ...

        Restart Docker

            $ sudo systemctl daemon-reload
            $ sudo systemctl restart docker

