Deploying Kubernetes v1.21.3 on CentOS 7 with kubeadm | a tested, working walkthrough

Centos7

  • CentOS Linux release 7.9.2009 (Core)
  • Linux 5.4.138-1.el7.elrepo.x86_64

docker

  • 20.10.8

Kubernetes

  • v1.21.3
  • flannel v0.14.0

nodes

  • master | 192.168.181.135 | 2core2g
  • node1 | 192.168.181.136 | 2core4g
  • node2 | 192.168.181.137 | 2core4g

Operating user and directory

  • root
  • /root/

I. CentOS system optimization -> all nodes

1. Switch the yum repos

First download a CentOS 7 minimal ISO from a domestic mirror such as Huawei's; after installing, switch the system to Alibaba Cloud's yum mirror.

# Back up the system's original yum repos
cd /etc/yum.repos.d
mkdir bak
mv *.repo bak/

# Download the Alibaba Cloud repo file
yum install -y wget
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# or
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all
yum makecache
yum update
yum upgrade

# a fresh minimal install has no ifconfig command
yum install -y net-tools

2. Configure a static IP

This lab runs three VMs on NAT networking; the master node is shown here, and the worker nodes are configured the same way.

# find the NIC -> usually ens33
ifconfig   # or: ip addr

# edit the NIC config file
cp /etc/sysconfig/network-scripts/ifcfg-ens33 /etc/sysconfig/network-scripts/ifcfg-ens33.bak
vi /etc/sysconfig/network-scripts/ifcfg-ens33
---after editing---
BOOTPROTO="static"
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
IPADDR=192.168.181.135
NETMASK=255.255.255.0
GATEWAY=192.168.181.2
DNS1=114.114.114.114
-----------

# restart the network service
service network restart
ifconfig   # or: ip addr

3. Configure hostnames

Set the hostname permanently; the master is shown here, the other nodes are the same.

# on the master; set the hostname on node1 and node2 the same way
hostnamectl set-hostname master

# exit and log in again for the new hostname to take effect

# Append the cluster entries to the hosts file
cat >> /etc/hosts << EOF
192.168.181.135 master
192.168.181.136 node1
192.168.181.137 node2
EOF

# distribute the hosts file; confirm and enter the password when prompted
scp /etc/hosts  node1:/etc/hosts 
scp /etc/hosts  node2:/etc/hosts 

4. Set up passwordless SSH

Configure this on the master, then push the key out to the workers.

# generate a key pair; press Enter through all the prompts
ssh-keygen

# distribute the key; /root/.ssh/id_rsa.pub is used when no key is specified
ssh-copy-id root@node1
ssh-copy-id root@node2

# verify passwordless ssh; answer yes at the first connection
ssh node1
ssh node2
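
As a quick sanity check, the hostnames can be collected in one loop (a small sketch, assuming the /etc/hosts entries from step 3):

# each worker should answer without a password prompt
for h in node1 node2; do
    echo -n "$h -> "
    ssh -o BatchMode=yes $h hostname
done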

5. Upgrade the kernel

CentOS 7 ships kernel 3.10, which runs into problems with Docker and k8s, so upgrade to a newer kernel.

# check the current kernel
uname -rs 

# import the elrepo public key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# install the elrepo release package; pick the version you need
yum install https://www.elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm

# list available kernel versions
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available

# install the kernel; lt = long-term support branch
yum --disablerepo='*' --enablerepo=elrepo-kernel install kernel-lt

# list the grub menu entries
awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
# 0 : CentOS Linux (4.4.186-1.el7.elrepo.x86_64) 7 (Core)
# 1 : CentOS Linux (3.10.0-957.27.2.el7.x86_64) 7 (Core)
# 2 : CentOS Linux (3.10.0-957.21.3.el7.x86_64) 7 (Core)

# boot the newer kernel by default and reboot
grub2-set-default 0 && reboot

6. Synchronize cluster time

# check the time on every node, then sync it; a cron job is recommended to keep the nodes aligned
date
yum install -y ntpdate
ntpdate cn.pool.ntp.org
crontab -e
---
*/20 * * * * /usr/bin/ntpdate -u cn.pool.ntp.org
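
With the SSH trust from step 4 in place, the times can be compared in one shot (a quick sketch):

# every node should print (almost) the same timestamp
for h in master node1 node2; do echo -n "$h: "; ssh $h date; done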

7. Run it as a script -> set_env.sh

#!/bin/bash

# 1. Configure the domestic mirror
cd /etc/yum.repos.d
mkdir bak
mv *.repo bak/

yum install -y wget
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

yum clean all
yum makecache
yum update
yum upgrade

yum install -y net-tools

# 2. Configure a static IP
cp /etc/sysconfig/network-scripts/ifcfg-ens33 /etc/sysconfig/network-scripts/ifcfg-ens33.bak

cat > /etc/sysconfig/network-scripts/ifcfg-ens33 << EOF
BOOTPROTO="static"
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
IPADDR=192.168.181.135
NETMASK=255.255.255.0
GATEWAY=192.168.181.2
DNS1=114.114.114.114
EOF

service network restart

# 3. Configure the hostname; distribute the hosts file manually
hostnamectl set-hostname master


cat >> /etc/hosts << EOF
192.168.181.135 master
192.168.181.136 node1
192.168.181.137 node2
EOF

# 4. SSH trust: see the manual steps above, same as the following

# 5. Kernel upgrade

# 6. Time sync

II. Docker installation -> all nodes

1. Clean up the Docker environment

# remove any docker packages already on the system
yum remove -y docker*

# docker repo prerequisites
yum install -y yum-utils device-mapper-persistent-data lvm2

# official repo (slow from China):
# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# refresh the yum cache
yum makecache fast

2. List available Docker versions

yum list docker-ce --showduplicates | sort -r

3. Install Docker

# install a specific docker version
yum install -y docker-ce-20.10.8

# start now and enable at boot
systemctl start docker
systemctl enable docker


# verify docker, then clean up (find the container name with: docker ps -a)
docker run hello-world
docker rm <hello-world-container-name>
docker rmi hello-world:latest

4. Configure the Docker daemon

Because of network conditions in China, a registry mirror is needed to speed up image pulls.
Docker reads its daemon configuration from /etc/docker/daemon.json; edit it to add the mirror and to set the cgroup driver to systemd (the driver kubelet expects).

vi /etc/docker/daemon.json
---begin---
{
  "registry-mirrors": ["http://hub-mirror.c.163.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
---end---

systemctl daemon-reload
systemctl restart docker

# check
docker info | grep Cgroup 
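
If the config took effect, the check should print something like the following (exact lines vary with the Docker version):
---
 Cgroup Driver: systemd
---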

III. Kubernetes installation

1. Configure the k8s environment -> all nodes

I wrote this step as a script, k8s_env.sh:

#!/bin/bash
# Disable the firewall
systemctl status firewalld
systemctl disable firewalld
systemctl stop firewalld

# Disable selinux
# temporarily
setenforce 0
# permanently, via the config files
sed -i 's/SELINUX=permissive/SELINUX=disabled/' /etc/sysconfig/selinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

# Disable swap
swapoff -a
# permanently: comment out the swap line in /etc/fstab
sed -i 's/.*swap.*/#&/' /etc/fstab

# Set the kernel parameters k8s needs
cat <<EOF >/etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sysctl --system # or: sysctl -p

To run it:

chmod 755 k8s_env.sh
./k8s_env.sh
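
A quick read-only check that the settings stuck (a sketch; note the bridge sysctls only resolve once the br_netfilter kernel module is loaded, which the script above does not load explicitly):

# swap should report 0B; SELinux should be Permissive (Disabled after a reboot)
free -h | grep -i swap
getenforce

# both values should print 1
lsmod | grep br_netfilter || modprobe br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables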

2. Install the k8s components on the master -> master node

List the available versions first and pick one to install; a recent or the latest release is recommended:

# any one of the three commands is enough
yum list kubelet --showduplicates | sort -r
yum list kubeadm --showduplicates | sort -r
yum list kubectl --showduplicates | sort -r

Again in script form, k8s_install.sh:

#!/bin/bash

K8S_BASEURL=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
K8S_GPGKEY="https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg"
POD_NETWORK=10.244.0.0

# write the Alibaba Cloud k8s repo
cat <<EOF >/etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=${K8S_BASEURL}
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=${K8S_GPGKEY}
EOF

# refresh the yum metadata
yum clean all
yum -y makecache

# install kubeadm, kubectl and kubelet
yum install -y kubectl-1.21.3-0 kubeadm-1.21.3-0 kubelet-1.21.3-0

# enable and start kubelet
systemctl enable kubelet && systemctl start kubelet

To run it:

chmod 755 k8s_install.sh
./k8s_install.sh
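
At this point kubelet restarts in a loop because it has no cluster configuration yet; that is expected and resolves itself after kubeadm init. To confirm the installed versions (a quick check):

kubeadm version -o short
kubectl version --client --short
kubelet --version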

3. Install on the workers -> all worker nodes

The same steps in script form, k8s_install_node.sh:

#!/bin/bash

K8S_BASEURL=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
K8S_GPGKEY="https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg"

cat <<EOF >/etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=${K8S_BASEURL}
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=${K8S_GPGKEY}
EOF

# refresh the yum metadata
yum clean all
yum -y makecache

# install kubeadm, kubectl and kubelet, version v1.21.3
yum install -y kubectl-1.21.3-0 kubeadm-1.21.3-0 kubelet-1.21.3-0

# enable and start kubelet
systemctl enable kubelet && systemctl start kubelet

To run it:

chmod 755 k8s_install_node.sh
./k8s_install_node.sh

4. Pull the k8s images with Docker

kubeadm init needs the core k8s images; fetch them as follows:

# list the images and versions k8s needs; run this on the master (from here on, commands run on the master unless stated otherwise)
kubeadm config images list --kubernetes-version=v1.21.3
---begin---
k8s.gcr.io/kube-apiserver:v1.21.3
k8s.gcr.io/kube-controller-manager:v1.21.3
k8s.gcr.io/kube-scheduler:v1.21.3
k8s.gcr.io/kube-proxy:v1.21.3
k8s.gcr.io/pause:3.4.1
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns/coredns:v1.8.0
-----------

# Network restrictions make these k8s.gcr.io images impossible to pull directly; skip this step if your network allows direct pulls. Otherwise fetch them via Alibaba Cloud or Docker Hub.

# Option 1: Alibaba Cloud
# images.sh:
---begin---
#!/bin/bash

kubeadm config images list --kubernetes-version=v1.21.3

set -e

KUBE_VERSION=v1.21.3
KUBE_PAUSE_VERSION=3.4.1
ETCD_VERSION=3.4.13-0
CORE_DNS_VERSION=v1.8.0 # aliyun has no image with this tag; Docker Hub has the matching version

GCR_URL=k8s.gcr.io
ALIYUN_URL=registry.cn-hangzhou.aliyuncs.com/google_containers

images=(kube-proxy:${KUBE_VERSION}
kube-scheduler:${KUBE_VERSION}
kube-controller-manager:${KUBE_VERSION}
kube-apiserver:${KUBE_VERSION}
pause:${KUBE_PAUSE_VERSION}
etcd:${ETCD_VERSION}
coredns:${CORE_DNS_VERSION})

# pull each image, retag it to the k8s.gcr.io name kubeadm expects, then drop the temporary tag
for imageName in ${images[@]} ;
do
    if [ "$imageName" != "coredns:${CORE_DNS_VERSION}" ];
    then
        docker pull $ALIYUN_URL/$imageName
        docker tag $ALIYUN_URL/$imageName $GCR_URL/$imageName
        docker rmi $ALIYUN_URL/$imageName
    else
        # aliyun has no coredns v1.8.0, so pull it from Docker Hub instead
        docker pull coredns/coredns:1.8.0
        docker tag coredns/coredns:1.8.0 $GCR_URL/coredns/coredns:${CORE_DNS_VERSION}
        docker rmi coredns/coredns:1.8.0
    fi
done

echo
echo "docker pull finished..."
-----------

# Option 2: while hunting for mirrors I found a Docker Hub user hosting the complete k8s v1.21.3 set; thanks to him, this pulls his images
# images.sh:
---begin---
#!/bin/bash

kubeadm config images list --kubernetes-version=v1.21.3

set -e

KUBE_VERSION=v1.21.3
KUBE_PAUSE_VERSION=3.4.1
ETCD_VERSION=3.4.13-0
CORE_DNS_VERSION=v1.8.0 

GCR_URL=k8s.gcr.io
DOCKER_HUB=xwjh # https://hub.docker.com/u/xwjh

images=(kube-proxy:${KUBE_VERSION}
kube-scheduler:${KUBE_VERSION}
kube-controller-manager:${KUBE_VERSION}
kube-apiserver:${KUBE_VERSION}
pause:${KUBE_PAUSE_VERSION}
etcd:${ETCD_VERSION}
coredns:${CORE_DNS_VERSION})

# pull each image, retag it to the k8s.gcr.io name, then remove the original tag
for imageName in ${images[@]} ;
do
    docker pull $DOCKER_HUB/$imageName
    if [ "$imageName" != "coredns:v1.8.0" ];
    then
        docker tag $DOCKER_HUB/$imageName $GCR_URL/$imageName
    else
        docker tag $DOCKER_HUB/$imageName $GCR_URL/coredns/$imageName
    fi
    docker rmi $DOCKER_HUB/$imageName
done

echo
echo "docker pull finished..."
-----------

# pick either option and run it

# pull the flannel image; v0.14.0 is the flannel release matching k8s v1.21.3 (pulled from Docker Hub for network reasons)
docker pull xwjh/flannel:v0.14.0
docker tag xwjh/flannel:v0.14.0 quay.io/coreos/flannel:v0.14.0
docker rmi xwjh/flannel:v0.14.0

# distribute the images to the workers
# my local images, shown as a packaging reference
---
[root@master ~]# docker images | grep io
k8s.gcr.io/kube-apiserver                                            v1.21.3    3d174f00aa39   3 weeks ago     126MB
k8s.gcr.io/kube-scheduler                                            v1.21.3    6be0dc1302e3   3 weeks ago     50.6MB
k8s.gcr.io/kube-proxy                                                v1.21.3    adb2816ea823   3 weeks ago     103MB
k8s.gcr.io/kube-controller-manager                                   v1.21.3    bc2bb319a703   3 weeks ago     120MB
quay.io/coreos/flannel                                               v0.14.0    8522d622299c   2 months ago    67.9MB
k8s.gcr.io/pause                                                     3.4.1      0f8457a4c2ec   7 months ago    683kB
k8s.gcr.io/coredns/coredns                                           v1.8.0     296a6d5035e2   9 months ago    42.5MB
k8s.gcr.io/etcd                                                      3.4.13-0   0369cf4303ff   11 months ago   253MB
---

docker save -o k8s.tar `docker images | grep io| awk -v  OFS=":" '{print $1,$2}'`
scp k8s.tar node1:/root/ 
scp k8s.tar node2:/root/ 

# on the workers: load the images
cd /root
ls
docker load -i k8s.tar
docker images
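
An optional completeness check on any node (a small sketch): every image kubeadm expects should now exist locally.

for img in $(kubeadm config images list --kubernetes-version=v1.21.3); do
    docker image inspect $img > /dev/null 2>&1 && echo "OK   $img" || echo "MISS $img"
done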

5. Install the flannel network plugin -> master node

Note: kubectl only works once the cluster exists, so apply these manifests after kubeadm init in step 6 succeeds. POD_SUBNET must match the --pod-network-cidr passed to kubeadm init (10.244.0.0/16 in this cluster). The first apply installs the calico/tigera operator from kuboard, which is why a tigera-operator pod shows up in the pod listing later.

export POD_SUBNET=10.244.0.0/16
kubectl apply -f https://kuboard.cn/install-script/v1.21.x/calico-operator.yaml
wget https://kuboard.cn/install-script/flannel/flannel-v0.14.0.yaml
sed -i "s#10.244.0.0/16#${POD_SUBNET}#" flannel-v0.14.0.yaml
kubectl apply -f ./flannel-v0.14.0.yaml
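
To verify flannel came up (the DaemonSet name comes from the manifest above):

kubectl -n kube-system get daemonset kube-flannel-ds
kubectl -n kube-system get pods | grep flannel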

6. Initialize the control plane -> master node

This is scripted too, k8s_init.sh; I used the images pulled via the aliyun option.

#!/bin/bash

version=v1.21.3
master_ip=192.168.181.135
POD_NETWORK=10.244.0.0

# kubeadm init flags are documented at https://kubernetes.io/zh/docs/reference/setup-tools/kubeadm/kubeadm-init/

# initialize this cluster
kubeadm init --apiserver-advertise-address=$master_ip \
 --image-repository registry.aliyuncs.com/google_containers \
 --kubernetes-version $version \
 --service-cidr=10.1.0.0/16 \
 --pod-network-cidr=$POD_NETWORK/16

# steps required after kubeadm init so kubectl can reach the cluster
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

After initialization, if kubectl get cs shows the controller-manager or scheduler as unhealthy, edit kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests, find the line containing port=0 in each, and comment it out; check again shortly afterwards and the status changes to healthy on its own. A sketch of that edit follows.
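
A minimal sed version of the fix (editing the two manifests by hand works just as well):

# comment out the "- --port=0" argument in both static pod manifests
sed -i 's/^\( *\)- --port=0/\1# - --port=0/' /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -i 's/^\( *\)- --port=0/\1# - --port=0/' /etc/kubernetes/manifests/kube-scheduler.yaml

# kubelet watches this directory and recreates the static pods; re-check shortly after
kubectl get cs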

7. Join the workers to the cluster

On a worker:
kubectl get pods
# fails with "connection refused" until the kubeconfig below is in place

On the master:
# copy admin.conf to the workers (run this on the master)
scp /etc/kubernetes/admin.conf node1:/etc/kubernetes/admin.conf
scp /etc/kubernetes/admin.conf node2:/etc/kubernetes/admin.conf

On the workers:
# point kubectl at the copied kubeconfig
export KUBECONFIG=/etc/kubernetes/admin.conf
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile

On the master:
kubeadm token create --print-join-command
---
kubeadm join 192.168.181.135:6443 --token f317g6.t340kbufmaaj76n4 --discovery-token-ca-cert-hash sha256:eeb0954da014595564feb700c38a3e8a6fa7c74260be4b047b5f00e3d60e58c7
---

On the workers:
# run the kubeadm join command printed above on each worker
kubeadm join 192.168.181.135:6443 --token f317g6.t340kbufmaaj76n4 --discovery-token-ca-cert-hash sha256:eeb0954da014595564feb700c38a3e8a6fa7c74260be4b047b5f00e3d60e58c7
---
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
---

# check cluster status from any node
kubectl get nodes
---
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   23h   v1.21.3
node1    Ready    <none>                 19h   v1.21.3
node2    Ready    <none>                 19h   v1.21.3
---

# list all pods from any node
kubectl get pods --all-namespaces
---
NAMESPACE         NAME                              READY   STATUS    RESTARTS   AGE
kube-system       coredns-59d64cd4d4-6ggnb          1/1     Running   0          23h
kube-system       coredns-59d64cd4d4-cn27q          1/1     Running   0          23h
kube-system       etcd-master                       1/1     Running   0          23h
kube-system       kube-apiserver-master             1/1     Running   0          23h
kube-system       kube-controller-manager-master    1/1     Running   0          20h
kube-system       kube-flannel-ds-68xtw             1/1     Running   0          20h
kube-system       kube-flannel-ds-mlbg5             1/1     Running   0          19h
kube-system       kube-flannel-ds-xh9c5             1/1     Running   0          19h
kube-system       kube-proxy-52lwk                  1/1     Running   0          19h
kube-system       kube-proxy-99zwv                  1/1     Running   0          19h
kube-system       kube-proxy-dffm5                  1/1     Running   0          23h
kube-system       kube-scheduler-master             1/1     Running   0          20h
tigera-operator   tigera-operator-cf6b69777-lcm89   1/1     Running   0          20h
---
# output like the above means the cluster is up and healthy

8. Common problems

  • Image pull failures: use a domestic source; aliyun or Docker Hub both work
  • coredns image pull failure: get it from Docker Hub
  • coredns stuck in Pending: install the flannel plugin
  • kubeadm join fails on a worker: check the firewall, time sync, and the basic settings first
  • kubectl get pods on a worker returns "connection refused": scp the master's admin.conf over
  • kubectl get cs shows unhealthy: comment out the port=0 line in the manifests (see step 6)
  • flannel stuck in Init:ImagePullBackOff: the flannel image must be present on master and workers
  • Worker stuck in NotReady: restart docker/kubelet on it, remove its containers, and rejoin the cluster (a sketch follows this list)
  • For more details and more issues, see the companion post k8s部署问题记录-v1.21.3
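
A minimal recovery sketch for the NotReady case (assumes the worker is rejoined from scratch; replace the token and hash with the output of kubeadm token create --print-join-command on the master):

# on the NotReady worker
systemctl restart docker kubelet

# if it stays NotReady: wipe the worker's k8s state and rejoin
kubeadm reset -f
kubeadm join 192.168.181.135:6443 --token <token> --discovery-token-ca-cert-hash <hash>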
