Deploying Kubernetes 1.32.8 with kubeadm (containerd edition)
一、Kubernetes Cluster Deployment Options
There are many ways to deploy Kubernetes:
Local learning/testing: Minikube, Kind, K3s
Enterprise production: kubeadm + self-managed, OpenShift, Rancher, Tanzu
Large-scale/multi-cloud: kOps, kubespray, Rancher
1. Binary and kubeadm
https://mp.weixin.qq.com/s/P90Z_rudmI548aOfD9aRtA (binary deployment walkthrough)
https://github.com/easzlab/kubeasz (binary deployment tool)
https://github.com/k8sre/k8s (deploy a production-ready highly available Kubernetes cluster from binaries)
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md#client-binaries
2. sealer and sealos: convenient one-click installs
https://github.com/sealerio/sealer — use Sealer if you want to package K8s + middleware + your application into a single reusable image.
Use Sealos (sealyun.com) if you just want the fastest path to a production-ready K8s.
The fastest deployment approach I have tried so far:
sealos init --passwd '123456' \
--master 192.168.40.130 \
--node 192.168.40.131 \
--node 192.168.40.132 \
--pkg-url /root/kube1.22.0.tar.gz \
--without-cni \
--version v1.22.0
3. KubeKey and kubespray
https://github.com/kubesphere/kubekey (Release v2.0.0)
KubeKey is an efficient cluster deployment tool open-sourced by the KubeSphere community. It uses Docker as the runtime by default but can also work with CRI runtimes such as containerd, CRI-O, and iSula. The etcd cluster runs independently and can be deployed separately from Kubernetes, which makes deployments more flexible. It offers a flexible, fast, and convenient way to install only Kubernetes/K3s, or Kubernetes/K3s together with KubeSphere and other cloud-native add-ons. It is also an effective tool for scaling and upgrading clusters.
https://github.com/kubernetes-sigs/kubespray (deploy a production-ready Kubernetes cluster)
kubespray: an Ansible-based multi-node deployment tool that supports multi-cloud and bare metal.
4. Kind (Kubernetes IN Docker)
Runs a K8s cluster using Docker containers as nodes; officially supported by the CNCF.
5. Rancher and OpenShift
Rancher: a platform for full-lifecycle management of K8s clusters; supports importing existing clusters.
OpenShift: Red Hat's enterprise K8s platform with built-in CI/CD and an image registry.
二、kubeadm Deployment (Docker Edition)
https://v1-32.docs.kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
https://kuboard.cn/install/install-kubernetes.html (building a multi-master cluster with kubeadm)
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ (high availability)
1. Environment Preparation (Docker runtime)
hostnamectl set-hostname k8s-master

cat /etc/hosts
192.168.40.137 k8s-master
192.168.40.138 k8s-node1
192.168.40.139 k8s-node2

systemctl disable firewalld
systemctl stop firewalld
sed -i 's/SELINUX=permissive/SELINUX=disabled/' /etc/sysconfig/selinux
sed -i 's/.*swap.*/#&/' /etc/fstab
swapoff -a
update-alternatives --set iptables /usr/sbin/iptables-legacy

yum install -y yum-utils device-mapper-persistent-data lvm2 && \
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo && \
yum makecache && \
yum install -y docker-ce && \
systemctl enable docker.service && \
systemctl start docker
Install kubeadm and kubelet on all machines.
Configure the Aliyun yum repository:
# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Install the latest kubeadm:
yum makecache
yum install -y kubelet kubeadm kubectl ipvsadm
Note: to install a specific version of kubeadm instead:
yum install kubelet-1.16.0-0.x86_64 kubeadm-1.16.0-0.x86_64 kubectl-1.16.0-0.x86_64
Configure kernel parameters:
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
modprobe br_netfilter    # must be loaded before the bridge sysctls can apply
sysctl --system
sysctl -p /etc/sysctl.d/k8s.conf
Load the IPVS kernel modules. They must be reloaded after every reboot (put the commands in /etc/rc.local to load them at boot, or use the systemd alternative sketched after the script below).
vim 1.sh
#!/bin/bash
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4

Verify the modules loaded:
lsmod | grep ip_vs
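A systemd-native alternative to rc.local (a sketch; note that on kernels 4.19 and newer, nf_conntrack_ipv4 was merged into nf_conntrack, so use that name there):
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
# systemd-modules-load reads this directory on every boot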
2. Pull the Images
Notes:
- All three nodes need to download the images.
- Adjust the version tags to the current release; even the latest download may not match, so follow the versions reported in any error messages.
- Versions change with every release; if initialization fails, the error output tells you which versions are required.
- The kubeadm version and the image versions must match.
Check which image versions the current kubeadm expects:
# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.17.2
k8s.gcr.io/kube-controller-manager:v1.17.2
k8s.gcr.io/kube-scheduler:v1.17.2
k8s.gcr.io/kube-proxy:v1.17.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.5
Pull the corresponding images from the Aliyun mirror and retag them (one image shown here; the loop sketched below handles the full list):
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.2 k8s.gcr.io/kube-apiserver:v1.17.2
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.2   # remove the mirror tag
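A minimal sketch that repeats this for every image in the list above (assumes the Aliyun mirror hosts all of them under google_containers):
for img in kube-apiserver:v1.17.2 kube-controller-manager:v1.17.2 kube-scheduler:v1.17.2 \
           kube-proxy:v1.17.2 pause:3.1 etcd:3.4.3-0 coredns:1.6.5; do
  # pull from the mirror, retag to the name kubeadm expects, drop the mirror tag
  docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$img
  docker tag  registry.cn-hangzhou.aliyuncs.com/google_containers/$img k8s.gcr.io/$img
  docker rmi  registry.cn-hangzhou.aliyuncs.com/google_containers/$img
done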
Alternatively, point kubeadm at a mirror registry with --image-repository:
kubeadm init \
  --apiserver-advertise-address=192.168.31.61 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.20.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --ignore-preflight-errors=all
3. Configure and Start kubelet on All Nodes
Configure kubelet to use a domestic pause image.
Get Docker's cgroup driver:
DOCKER_CGROUPS=$(docker info | grep 'Cgroup' | awk -F':' '{print $2}')
echo $DOCKER_CGROUPS
cgroupfs
Configure the kubelet cgroup driver:
# cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=k8s.gcr.io/pause:3.2"
EOF
Pay close attention to the pause image version and the cgroup driver here; they must match your environment.
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
Note: if you run systemctl status kubelet at this point, you will see errors:
Oct 11 00:26:43 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Oct 11 00:26:43 node1 systemd[1]: Unit kubelet.service entered failed state.
Oct 11 00:26:43 node1 systemd[1]: kubelet.service failed.
Running journalctl -xefu kubelet to check the systemd journal reveals the real error:
unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
This error resolves itself once kubeadm init generates the CA certificate, so it can be ignored for now. In short, kubelet keeps restarting until kubeadm init has run.
4. Initialize the Cluster
Be sure to record the join command printed at the end of a successful initialization.
# kubeadm init \
    --kubernetes-version=v1.17.4 \
    --pod-network-cidr=10.244.0.0/16 \
    --apiserver-advertise-address=192.168.40.137 \
    --ignore-preflight-errors=Swap
--pod-network-cidr: if specified at cluster creation, kube-proxy's cluster-cidr is set to this value (when omitted, cluster-cidr defaults to empty). cluster-cidr is how kube-proxy distinguishes internal from external traffic: when it is empty, kube-proxy treats all traffic as internal and performs no SNAT (MASQUERADE); when it is set, traffic originating from the cluster-cidr (i.e. the Pod network) is treated as internal and reaches Services without SNAT, while traffic from any other network is treated as external and is SNATed when it accesses a Service. kube-proxy also has a standalone masquerade-all flag; when set, all traffic to Services is SNATed, with no internal/external distinction. The sketch below shows where these rules land in iptables.
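A quick way to see this on a node (a sketch, assuming kube-proxy runs in iptables mode and uses its standard chain names):
# With cluster-cidr set, look for rules that mark traffic NOT coming from the
# Pod network for masquerade, e.g.:
#   -A KUBE-SERVICES ! -s 10.244.0.0/16 -d <service-ip>/32 ... -j KUBE-MARK-MASQ
iptables -t nat -S KUBE-SERVICES | head
# The SNAT itself happens in POSTROUTING for packets marked by KUBE-MARK-MASQ
iptables -t nat -S KUBE-POSTROUTING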
To change the --pod-network-cidr=10.244.0.0/16 range after the fact:
kubectl -n kube-system edit cm kubeadm-config
vim /etc/kubernetes/manifests/kube-controller-manager.yaml
Check that the change took effect with:
kubectl cluster-info dump | grep -m 1 cluster-cidr
If initialization fails, run kubeadm reset and try again.
Error: The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused. Fix:
#vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false"
#systemctl daemon-reload
#systemctl restart kubelet
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 192.168.1.200:6443 --token wip0ux.19q3dpudrnyc6q7i --discovery-token-ca-cert-hash sha256:e41c201f32d7aa6c57254cd78c13a5aa7242979f7152bf33ec25dde13c1dcc9a
The key items in this output:
[kubelet] generates the kubelet config file /var/lib/kubelet/config.yaml
[certificates] generates the various certificates
[kubeconfig] generates the kubeconfig files
[bootstraptoken] generates the bootstrap token; record it, as it is needed later when adding nodes with kubeadm join
Configure kubectl on the master node:
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
To use kubectl on the worker nodes:
scp /etc/kubernetes/admin.conf node1:/etc/kubernetes/
vim /etc/profile
export KUBECONFIG=/etc/kubernetes/admin.conf
source /etc/profile
kubectl get nodes
NAME     STATUS     ROLES    AGE     VERSION
master   NotReady   master   6m19s   v1.13.0
Dump the default init configuration for inspection:
kubeadm config print init-defaults > kubeadm-init.yaml
5. Install the Network Plugin (every node needs the flannel image)
Download the YAML manifest on the master node.
Note: the manifest is updated frequently; once your setup works, fetch the latest version manually from https://raw.githubusercontent.com/coreos/flannel/master/Documentation/
cd ~ && mkdir flannel && cd flannel
curl -O https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Edit kube-flannel.yml (several image references need changing).
Note: the default image is quay.io/coreos/flannel:v0.10.0-amd64. If you can pull it, leave the image references alone; otherwise change every flannel image reference in the YAML (there are several) to the Aliyun mirror:
image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64
Specify the network interface:
containers:
- name: kube-flannel
image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=ens33
- --iface=eth0
The value of --iface=ens33 must be your actual NIC name; you can also list several interfaces, as above. One way to find the right NIC is sketched below.
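A helper to discover which NIC carries the default route (assumes a single default route and the iproute2 tools):
ip route get 8.8.8.8 | awk '{for (i=1; i<=NF; i++) if ($i == "dev") print $(i+1)}'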
Apply it:
# kubectl apply -f ~/flannel/kube-flannel.yml
Check (all pods reach READY 1/1 only once the cluster is fully built):
# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-66bff467f8-jzt7q 1/1 Running 3 3d23h 10.244.0.10 master <none> <none>
coredns-66bff467f8-rq87t 1/1 Running 3 3d23h 10.244.0.11 master <none> <none>
etcd-master 1/1 Running 3 3d23h 192.168.40.137 master <none> <none>
kube-apiserver-master 1/1 Running 3 3d23h 192.168.40.137 master <none> <none>
kube-controller-manager-master 1/1 Running 3 3d23h 192.168.40.137 master <none> <none>
kube-flannel-ds-amd64-76tjs 1/1 Running 3 3d21h 192.168.40.139 node2 <none> <none>
kube-flannel-ds-amd64-h84dp 1/1 Running 4 3d20h 192.168.40.138 node1 <none> <none>
kube-flannel-ds-amd64-mnhbn 1/1 Running 3 3d21h 192.168.40.137 master <none> <none>
kube-proxy-5jjzr 1/1 Running 2 3d21h 192.168.40.139 node2 <none> <none>
kube-proxy-85hz4 1/1 Running 3 3d20h 192.168.40.138 node1 <none> <none>
kube-proxy-gbtgm 1/1 Running 3 3d23h 192.168.40.137 master <none> <none>
kube-scheduler-master 1/1 Running 7 3d23h 192.168.40.137 master <none> <none>
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-5bf87f5f59-js7nc 0/1 ImagePullBackOff 0 12m 10.244.5.12 node1 <none> <none>
nginx-deployment-5bf87f5f59-nsb84 0/1 ImagePullBackOff 0 12m 10.244.5.13 node1 <none> <none>
Use kubectl describe to inspect the error:
# kubectl describe pod nginx-deployment-5bf87f5f59-js7nc
..... most output omitted; the relevant part:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned default/nginx-deployment-5bf87f5f59-js7nc to node1
Warning Failed 12m kubelet, node1 Failed to pull image "nginx:1.7.9": rpc error: code = Unknown desc = context canceled
Warning Failed 12m kubelet, node1 Error: ErrImagePull
Normal BackOff 12m kubelet, node1 Back-off pulling image "nginx:1.7.9"
Warning Failed 12m kubelet, node1 Error: ImagePullBackOff
Normal Pulling 12m (x2 over 17m) kubelet, node1 Pulling image "nginx:1.7.9"
# kubectl get service
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   14m
# kubectl get svc --namespace kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   15m
6. Join All Worker Nodes to the Cluster
On every worker node, run the command that the successful master initialization printed:
# kubeadm join 192.168.1.200:6443 --token ccxrk8.myui0xu4syp99gxu --discovery-token-ca-cert-hash sha256:e3c90ace969aa4d62143e7da6202f548662866dfe33c140095b020031bff2986
Verify the cluster:
# kubectl get pods -n kube-system
NAME                            READY   STATUS             RESTARTS   AGE
coredns-6c66ffc55b-l76bq        1/1     Running            0          16m
coredns-6c66ffc55b-zlsvh        1/1     Running            0          16m
etcd-node1                      1/1     Running            0          16m
kube-apiserver-node1            1/1     Running            0          16m
kube-controller-manager-node1   1/1     Running            0          15m
kube-flannel-ds-sr6tq           0/1     CrashLoopBackOff   6          7m12s
kube-flannel-ds-ttzhv           1/1     Running            0          9m24s
kube-proxy-nfbg2                1/1     Running            0          7m12s
kube-proxy-r4g7b                1/1     Running            0          16m
kube-scheduler-node1            1/1     Running            0          16m
If a pod stays at 0/1 for a long time, you can delete it and let the cluster create a replacement:
# kubectl delete pod kube-flannel-ds-sr6tq -n kube-system
pod "kube-flannel-ds-sr6tq" deleted
Checking again afterwards, the status is back to normal.
Check the node status once more:
[root@master flannel]# kubectl get nodes
NAME     STATUS   ROLES    AGE     VERSION
master   Ready    master   19m     v1.17.2
node1    Ready    <none>   3m16s   v1.17.2
node2    Ready    <none>   103s    v1.17.2
The cluster is now fully configured.
7. Test the Kubernetes Cluster
$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --port=80 --type=NodePort
$ kubectl get pod,svc
A quick reachability check is sketched below.
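A minimal reachability check (a sketch; substitute one of your node IPs, e.g. 192.168.40.137 from this setup):
NODE_PORT=$(kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}')
curl -I http://192.168.40.137:$NODE_PORT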
8. kubelet (the Pod Manager)
Every node in a Kubernetes cluster runs the kubelet process. It carries out the tasks the master assigns to its node and manages the Pods and containers running there. kubelet registers the node with the API Server, periodically reports the node's resource usage to the master, and monitors containers and node resources through cAdvisor. Think of kubelet as the agent in a server-agent architecture: the Pod manager on each node. See kubelet --help for the full set of configuration flags.
When a node goes NotReady, inspect the kubelet logs:
# journalctl -f -u kubelet
-f: follow the log in real time; -u: restrict output to a single unit.
Example of a node-side problem:
# journalctl -f -u kubelet
Unable to update cni config: no networks found in /etc/cni/net.d
Fix:
vi /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
Add:
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/ --cni-bin-dir=/opt/cni/bin"
三、kubeadm Deployment (containerd Edition)
1. MAC addresses and product_uuid must be unique
Verify that the MAC address and product_uuid are unique on every node.
# cat /sys/class/dmi/id/product_uuid
4c391742-60ea-8e34-896a-cae6a9d724ca
/sys/class/dmi/id/product_uuid is a virtual file in the sysfs filesystem. It exposes the hardware system UUID (Universally Unique Identifier), i.e. the unique identifier of the motherboard or machine.
DMI: Desktop Management Interface, a BIOS/UEFI standard for describing hardware (motherboard, BIOS, serial numbers, etc.).
UUID: a 128-bit number, usually written as a hex string in 8-4-4-4-12 format, such as the value shown above.
kubelet uses the machine's product_uuid to identify the node. Kubernetes requires each node to have a unique UUID; duplicate values can make nodes collide with one another and cause the installation to fail. A quick check is sketched below.
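Run this on each node and compare; every value must differ between nodes:
ip link show | awk '/link\/ether/ {print $2}'    # MAC addresses
sudo cat /sys/class/dmi/id/product_uuid          # system UUID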
2. Disable swap and configure kernel parameters
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

sudo tee /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
# Load the kernel modules now
sudo modprobe overlay
sudo modprobe br_netfilter

sudo tee /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
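Quick verification that the settings above took effect:
swapon --show                                    # no output means swap is off
lsmod | grep -E 'overlay|br_netfilter'           # both modules should be listed
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables   # both should be 1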
3. Install containerd (and configure the systemd cgroup driver)
# Install dependencies
sudo apt update
sudo apt install -y apt-transport-https ca-certificates curl gnupg lsb-release
# Add the containerd GPG key (Aliyun mirror)
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker.gpg
# Add the containerd repository (Aliyun mirror of the official repo)
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker.gpg] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y containerd.io
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
Edit the config: switch to the systemd cgroup driver and configure Aliyun registry mirrors.
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo sed -i 's|registry.k8s.io|registry.aliyuncs.com/google_containers|g' /etc/containerd/config.toml
# Find this section
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://fz5yth0r.mirror.aliyuncs.com"]
# and change it to
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://fz5yth0r.mirror.aliyuncs.com"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.aliyuncs.com"]
endpoint = ["https://registry.aliyuncs.com"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.cn-hangzhou.aliyuncs.com"]
endpoint = ["https://registry.cn-hangzhou.aliyuncs.com"]
sudo systemctl enable containerd
sudo systemctl restart containerd
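Verify the cgroup driver change stuck (a sketch; containerd config dump prints the merged runtime config):
sudo systemctl is-active containerd
containerd config dump | grep -i SystemdCgroup   # expect: SystemdCgroup = true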
4. Install kubeadm, kubelet, kubectl (from the Aliyun mirror)
# Add the Aliyun Kubernetes APT repository
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.32/deb/keyring.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.32/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update

# If the repository misbehaves, refresh the key and source entry
sudo rm /etc/apt/keyrings/kubernetes-apt-keyring.gpg
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.32/deb/ ./
EOF

# Install a pinned minor version
sudo apt install -y kubeadm=1.32.* kubelet=1.32.* kubectl=1.32.*
# Hold the packages so they are not upgraded automatically (important)
sudo apt-mark hold kubelet kubeadm kubectl
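Confirm what actually got installed:
kubeadm version -o short
kubelet --version
kubectl version --client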
5. Prepare the Images
# List the required images
# kubeadm config images list --kubernetes-version=v1.32.0 \
--image-repository registry.aliyuncs.com/google_containers
registry.aliyuncs.com/google_containers/kube-apiserver:v1.32.0
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.32.0
registry.aliyuncs.com/google_containers/kube-scheduler:v1.32.0
registry.aliyuncs.com/google_containers/kube-proxy:v1.32.0
registry.aliyuncs.com/google_containers/coredns:v1.11.3
registry.aliyuncs.com/google_containers/pause:3.10
registry.aliyuncs.com/google_containers/etcd:3.5.16-0
# Export an offline bundle
sudo ctr -n k8s.io images export k8s-images-v1.32.tar \
registry.aliyuncs.com/google_containers/kube-apiserver:v1.32.0 \
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.32.0 \
registry.aliyuncs.com/google_containers/kube-scheduler:v1.32.0 \
registry.aliyuncs.com/google_containers/kube-proxy:v1.32.0 \
registry.aliyuncs.com/google_containers/pause:3.10 \
registry.aliyuncs.com/google_containers/etcd:3.5.16-0 \
registry.aliyuncs.com/google_containers/coredns:v1.11.3
# Import on the target node
sudo ctr -n k8s.io images import k8s-images-v1.32.tar
crictl images
IMAGE TAG IMAGE ID SIZE
registry.aliyuncs.com/google_containers/pause 3.8 4873874c08efc 311kB
registry.cn-hangzhou.aliyuncs.com/google_containers/coredns v1.11.3 c69fa2e9cbf5f 18.6MB
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd 3.5.16-0 a9e7e6b294baf 57.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver v1.32.0 c2e17b8d0f4a3 28.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager v1.32.0 8cab3d2a8bd0f 26.3MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy v1.32.0 040f9f8aac8cd 30.9MB
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler v1.32.0 a389e107f4ff1 20.7MB
registry.cn-hangzhou.aliyuncs.com/google_containers/pause 3.10 873ed75102791 320kB
Supplement: the role of sandbox_image
sandbox_image is the image the container runtime (e.g. containerd) uses to create the Pod sandbox.
In Kubernetes, every Pod has a corresponding "sandbox container". It runs no application; instead it:
- provides the Pod's network namespace
- provides the IPC namespace
- provides the PID namespace
- serves as the shared base environment for all containers in the Pod
The sandbox container is usually a tiny image containing nothing but a minimal pause program (it does nothing, essentially sleep(inf)), hence the name pause image. The sketch below shows where it is configured.
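Where this lives in the containerd config (a sketch for the containerd 1.x config layout used above):
grep sandbox_image /etc/containerd/config.toml
# Point it at the Aliyun-mirrored pause image to avoid pulls from registry.k8s.io
sudo sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.10"#' /etc/containerd/config.toml
sudo systemctl restart containerd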
6. Initialize the Master
sudo kubeadm init \
  --pod-network-cidr=10.244.0.0/16 \
  --kubernetes-version=v1.32.0 \
  --cri-socket=unix:///run/containerd/containerd.sock \
  --image-repository registry.aliyuncs.com/google_containers
7. Install the Cilium CNI
curl -LO https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
rm cilium-linux-amd64.tar.gz

# cilium version
cilium-cli: v0.18.6 compiled with go1.24.5 on linux/amd64
cilium image (default): v1.18.0
cilium image (stable): v1.18.0
cilium image (running): unknown. Unable to obtain cilium version. Reason: release: not found

# On a host with internet access, pull and package the images
docker pull quay.io/cilium/cilium:v1.18.0
docker pull quay.io/cilium/operator-generic:v1.18.0
docker pull quay.io/cilium/cilium-envoy:v1.34.4-1753677767-266d5a01d1d55bd1d60148f991b98dac0390d363

docker save quay.io/cilium/cilium:v1.18.0 -o cilium.tar
docker save quay.io/cilium/operator-generic:v1.18.0 -o operator-generic.tar
docker save quay.io/cilium/cilium-envoy:v1.34.4-1753677767-266d5a01d1d55bd1d60148f991b98dac0390d363 -o cilium-envoy.tar

# Import into containerd's k8s.io namespace
ctr -n k8s.io images import cilium.tar
ctr -n k8s.io images import operator-generic.tar
ctr -n k8s.io images import cilium-envoy.tar

# Install
cilium install \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=192.168.1.141 \
  --set k8sServicePort=6443 \
  --set image.pullPolicy=IfNotPresent \
  --set operator.image.pullPolicy=IfNotPresent \
  --set envoy.image.pullPolicy=IfNotPresent

# Check status
# kubectl get pods -n kube-system -l k8s-app=cilium-envoy -o wide
NAME                 READY   STATUS    RESTARTS   AGE   IP              NODE    NOMINATED NODE   READINESS GATES
cilium-envoy-kktzm   1/1     Running   0          66s   192.168.1.142   node2   <none>           <none>
cilium-envoy-lc44f   1/1     Running   0          66s   192.168.1.143   node3   <none>           <none>
cilium-envoy-zkh59   1/1     Running   0          66s   192.168.1.141   node1   <none>           <none>
# cilium status --wait
8. Fully Replace kube-proxy with Cilium
kubectl delete ds -n kube-system kube-proxy
daemonset.apps "kube-proxy" deleted
# Run on node1, node2, and node3 to flush kube-proxy's leftover rules
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
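To confirm Cilium has actually taken over Service handling (a sketch; on recent Cilium releases the in-pod binary is cilium-dbg, older releases ship it as cilium):
kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep KubeProxyReplacement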
# Deploy the hubble-relay and hubble-ui Pods
cilium hubble enable --relay --ui
# Check
kubectl -n kube-system get pods -l k8s-app=hubble-ui
kubectl -n kube-system get pods -l k8s-app=hubble-relay
# Expose the Hubble UI via NodePort
kubectl -n kube-system patch svc hubble-ui \
-p '{"spec": {"type": "NodePort", "ports": [{"port": 80, "nodePort": 30080, "protocol": "TCP", "targetPort": 8081}]}}'
# Then browse to http://<node-ip>:30080/
四、Tips and Troubleshooting
1. A flannel pod cannot be force-deleted
Investigation showed node2 had a problem (Docker was not enabled at boot; restart Docker). Once the node is healthy again, force-delete the pod:
# Find the stuck pod's name (replace NAMESPACE, e.g. kube-system)
kubectl get pods -n NAMESPACE | grep Terminating
# Force-delete it
kubectl delete pod podName -n NAMESPACE --force --grace-period=0
2. Error: The connection to the server localhost:8080 was refused - did you specify the right host or port?
# export KUBECONFIG=/etc/kubernetes/admin.conf
3. The join token was never recorded or has been forgotten
# kubeadm token list
# Get the sha256 hash of the CA certificate
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
2cc3029123db737f234186636330e87b5510c173c669f513a9c0e0da395515b0
# Create a new token and print the join command (--ttl=0 makes the token never expire)
kubeadm token create --print-join-command --ttl=0
# Join on the worker node
kubeadm join 10.167.11.153:6443 --token o4avtg.65ji6b778nyacw68 --discovery-token-ca-cert-hash sha256:2cc3029123db737f234186636330e87b5510c173c669f513a9c0e0da395515b0
4. kubeadm config images list (list the K8s images, then pull from Aliyun and retag as follows)
I0326 11:17:43.015885   47157 version.go:251] remote version is much newer: v1.18.0; falling back to: stable-1.17
W0326 11:17:46.061291   47157 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0326 11:17:46.061306   47157 validation.go:28] Cannot validate kubelet config - no validator is available
k8s.gcr.io/kube-apiserver:v1.17.4
k8s.gcr.io/kube-controller-manager:v1.17.4
k8s.gcr.io/kube-scheduler:v1.17.4
k8s.gcr.io/kube-proxy:v1.17.4
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.5
5. Containers created in the cluster can only be reached via curl on the node they run on; their ports are unreachable from any other host
Solution 1: IPv4 forwarding may be disabled on your system.
# vim /etc/sysctl.conf
# Uncomment the next line to enable packet forwarding for IPv4
net.ipv4.ip_forward=1
Reboot the host for the change to take effect (if a reboot is not possible, a sysctl reload, sketched below, often suffices).
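Reloading sysctl settings without a full reboot:
sysctl -p                     # re-read /etc/sysctl.conf
sysctl net.ipv4.ip_forward    # should now print 1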
Solution 2: open the path with iptables.
Starting with Docker 1.13, the iptables FORWARD chain's default policy may be set to DROP, which breaks pinging Pod IPs on other nodes. When that happens, set the policy back to ACCEPT manually:
# iptables -P FORWARD ACCEPT
Also write the command into /etc/rc.local so the FORWARD policy does not revert to DROP after a node reboot:
# vim /etc/rc.local
sleep 60 && /sbin/iptables -P FORWARD ACCEPT
chmod +x /etc/rc.d/rc.local
6. kubectl errors
To let a regular user run kubectl, run the following; it is also part of the kubeadm init output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Or, if you are root, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf

