Cilium Internals: A CNI Network Plugin for K8S Clusters

                                              Author: 尹正杰

Copyright notice: This is an original work. Reproduction without permission is prohibited; violations will be pursued legally.

I. Overview of BPF and eBPF

1. What is BPF

BPF stands for "Berkeley Packet Filter". It was introduced into the Linux kernel in 1997, starting with version 2.1.75.

It is a register-based virtual machine running in kernel space:
	- 1.Runs code injected from user space without requiring kernel programming (i.e., without developing kernel modules);
	- 2.Uses a custom 64-bit RISC instruction set;
	- 3.Can run natively compiled "BPF programs" inside the Linux kernel at runtime, with access to a subset of kernel functionality and memory;
	
When a packet is received, the driver not only passes it to the protocol stack but also hands a copy to BPF, which filters it in place according to the configured filters and delivers the result to a buffer.

BPF first became well known through tcpdump; today almost every Unix-like system uses BPF as its packet filtering technology.

In short: BPF is efficient because it works at the kernel level.
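For example, tcpdump compiles its filter expression into BPF bytecode and loads it into the kernel; a quick way to see this in action (any interface and filter expression will do):

tcpdump -d 'tcp port 80'       # print the compiled BPF instructions and exit, without capturing
tcpdump -i eth0 'tcp port 80'  # the same filter, now running in kernel space during a capture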

2. What is eBPF

eBPF ("extended BPF") was implemented by Alexei Starovoitov in 2014 and first appeared in the Linux 3.15 kernel.
	- 1.Optimized for modern hardware, it executes faster than classic BPF, reportedly nearly 4x faster;
	- 2.In June 2014, eBPF was extended toward user space, evolving into a general-purpose execution engine; its use cases are no longer limited to packet filtering but have gradually expanded to the various kernel subsystems, and it is now widely applied in observability, networking, and security;
	- 3.It has become a top-level Linux kernel subsystem, on the same level as memory, disk, and file-system management;

eBPF gives users an interface for extending or customizing kernel functionality with small programs: users write eBPF programs, associate them with target events in the kernel, and those events trigger their execution.
	
eBPF is more than a packet filtering system: through the "bpf()" system call, code written in user space can run directly inside the Linux kernel.
	
A popular analogy: eBPF is to the kernel what JavaScript is to the browser.
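To make the analogy concrete, here is a one-line "kernel script" (a minimal sketch, assuming the bpftrace package is installed); it attaches an eBPF program to the openat tracepoint and prints every file being opened, system-wide:

bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s -> %s\n", comm, str(args->filename)); }'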

3. eBPF Use Cases

eBPF has many application scenarios:
	- SDN Configuration: software-defined network configuration.
	- DDoS Mitigation: mitigating DDoS attacks.
	- Intrusion Detection: intrusion detection systems.
	- Container Security: container security.
	- Observability: observability.
	- Firewalls: firewall access control.
	- Device Drivers: enhancing device drivers.
	

Kernel subsystems related to eBPF:
	- CPU(Scheduling)
	- Memory
	- Disks
	- File Systems
	- Networking
	- Applications
	- kernel
	- Hypervisors
	- Containers
	
Tip:
	BCC ("BPF Compiler Collection") is an eBPF development toolkit, packaged as "bcc-tools"; developers interested in eBPF are encouraged to explore it.
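A quick taste of bcc-tools (a sketch; on Ubuntu the package is named bpfcc-tools and each tool carries a -bpfcc suffix, while RHEL-family systems ship it as bcc-tools with the tools under /usr/share/bcc/tools):

apt-get install -y bpfcc-tools   # Ubuntu package name
execsnoop-bpfcc                  # trace every new process executed on the system
opensnoop-bpfcc                  # trace open() calls system-wide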

4. eBPF and XDP

XDP ("eXpress Data Path") provides a programmable packet-processing mechanism at the kernel level.

When a packet arrives, the network driver executes the eBPF program attached to the XDP hook; there are no context switches or interrupts to handle, which greatly reduces kernel overhead.

XDP is the high-performance data path the Linux kernel provides on top of eBPF: it supports processing packets at the network-driver level, bypassing the kernel's network stack.
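To illustrate, this is how an XDP program is attached to a NIC with iproute2 (a sketch; xdp_drop.o is a hypothetical compiled eBPF object whose program lives in its "xdp" section):

ip link set dev eth0 xdpgeneric obj xdp_drop.o sec xdp   # attach in generic (skb) mode; 'xdpdrv' selects native driver mode
ip link show dev eth0                                    # the device now shows an 'xdp' flag
ip link set dev eth0 xdpgeneric off                      # detach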

5. eBPF vs iptables

In kube-proxy's iptables mode, the number of underlying iptables rules grows as the number of Pods grows; in a large cluster, traversing those iptables chains becomes painful.

eBPF's solution is hash-based rule lookup; the underlying approach is similar to ipvs.
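The difference is easy to observe on a node (a rough sanity check; the exact chain names depend on the kube-proxy version):

iptables-save | grep -c '^-A KUBE'    # iptables mode: the rule count grows with Services and Pods
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium bpf lb list    # eBPF mode: Service entries live in hash maps with O(1) lookup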

II. Cilium Overview

1. What is Cilium

Cilium is a high-performance Kubernetes network plugin built on eBPF and XDP; it supports observability and security mechanisms and provides a Service Mesh data plane.

Cilium supports joining multiple Kubernetes clusters into a single mesh.

Cluster Mesh supports modes including HA, Shared Services, Service Splitting, and Local/Remote Service Affinity.

Cilium's main features:
	Container networking and load balancing:
		- CNI:
			The network component that provides cross-host Pod connectivity.
		- Kubernetes Services:
			Implements the K8S Service machinery, which means that after deploying this component we do not even need to deploy kube-proxy.
		- Multi-Cluster:
			Multi-cluster network management.
		
	Network security:
		- Network Policy:
			Network policies for containers (a sample manifest follows this feature list).
		- Identity-based:
			End-to-end authenticated communication.
		- Encryption:
			Link-level (transport) encryption.
			
	Observability:
		- Metrics:
			Metrics export, which means Prometheus can scrape it later.
		- Flow Visibility:
			Flow visualization.
		- Service Dependency:
			Service dependency graphs.
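As the sample for the Network Policy item above, a minimal CiliumNetworkPolicy (a sketch with assumed labels app=frontend / app=backend; L3/L4 only, no L7 rules):

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP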
			
			
Tip:
	Cilium's advanced features depend on the Linux kernel version; a kernel of 5.10 or newer is recommended.
	For production, Ubuntu 22.04+ LTS is a good choice, as it ships a recent kernel.
	
	
Project URL:
	https://github.com/cilium/cilium

2. Cilium Components

Cilium:
	- Cilium Agent
		- 1.Runs on every node in the cluster, orchestrated by a DaemonSet.
		- 2.Receives configuration from Kubernetes or via API: networking, LB, Network Policy, etc.;
		- 3.Invokes the node's eBPF machinery (via its SDK) to implement those features;
		
	- Cilium Client
		- A command-line tool that talks to the local Cilium Agent over a REST API;
		- Commonly used to inspect the status of the local Cilium Agent;
		- Can also access the eBPF maps directly;
		- There is also a separate client tool named Cilium CLI, which manages the whole Cilium cluster rather than the local Cilium Agent.
		
	- Operator
		- Monitors and maintains the cluster as a whole;
		- It is not involved in Cilium's networking or network-policy machinery;
			- Its failure does not affect packet forwarding or network-policy enforcement;
			- It does, however, affect IPAM and KVStore data access;
			
	- CNI Plugin
		- Cilium is itself a Kubernetes CNI plugin, and it can also fully replace kube-proxy.
	
	
Hubble: a distributed network observability platform, built on Cilium and eBPF, that stores and displays data obtained from Cilium.
	- Server:
		Receives, analyzes, and processes data on the server side.
		
	- Relay:
		Collects data from the monitored endpoints.
		
	- Client:
		A component that communicates with the Server.
		
	- Graphical UI:
		The WebUI components that render data from the Server.
		
eBPF:
	eBPF is what Cilium uses under the hood.

Data Store:
	- Kubernetes CRDs (default, and the recommended choice)
	- Key-Value Store
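With the default CRD-backed datastore, Cilium's state is visible as ordinary Kubernetes resources; a quick check after installation:

kubectl get crd | grep cilium.io    # CiliumEndpoint, CiliumIdentity, CiliumNode, ...
kubectl get ciliumnodes             # per-node agent state stored as custom resources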

3. Network Modes Supported by Cilium

Encapsulation (tunneling, overlay networking):
	- 1.VXLAN:
		The default mode, 8472/UDP.
	- 2.Geneve:
		6081/UDP.
		
		
Native-Routing (native/direct routing; does not cross routers by default; underlay networking):
	- 1.All traffic to non-local destinations is handed to the kernel's routing subsystem and processed exactly like traffic generated by local processes;
	- 2.The node's routing table decides how Pod traffic is routed;
	- 3.Enabled with: "--set routingMode=native" (older charts used "--set tunnel=disabled");
	

4. Cilium Deployment Options

Option 1: CNI plugin only
	Service functionality is still implemented by kube-proxy.
	
	
Option 2: CNI plugin, replacing part of kube-proxy's functionality
	The remaining functionality is still provided by kube-proxy, so kube-proxy must still be deployed.
	
	
Option 3: full kube-proxy replacement
	Put simply, Cilium takes over all of kube-proxy's functionality.
	
	
Tip:
	When creating the K8S cluster with kubeadm, pass the "--skip-phases=addon/kube-proxy" option to skip deploying kube-proxy.
	 

5. Cilium Deployment Tools

- helm
	Deploy Cilium via Helm.
	
- cilium CLI:
	A deployment tool provided by the Cilium project.

6. How the Cilium Network Plugin Is Implemented

Cilium creates four virtual network interfaces on the host:
	- cilium_host and cilium_net
		- 1.A veth pair created by the cilium agent;
		- 2.cilium_host is assigned the first address of the PodCIDR allocated to this host and serves as the default gateway for that subnet;
		- 3.The CNI plugin installs BPF rules that wire up the two ends of the veth pair inside the kernel;
		
	- cilium_vxlan
		Encapsulates and decapsulates vxlan packets in vxlan mode.
		
	- lxc_health
		Node health checking.
		

Pod network interfaces:
	- Cilium creates a veth pair for every Pod:
		- One end becomes the network interface inside the Pod, with its gateway pointing at the cilium_host address;
		- The other end appears on the host as a virtual interface named like "lxcXXXXX";
	- The MAC address of the lxc interface answers ARP requests:
		- Run "cilium bpf tunnel list" to inspect the tunnel map; this command must be executed inside the cilium-agent container.
		
		
		
Verification:
	1.Inspect the network interfaces
[root@master241 ~]# ifconfig 
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.0.56  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::bc14:bfff:febc:e3df  prefixlen 64  scopeid 0x20<link>
        ether be:14:bf:bc:e3:df  txqueuelen 1000  (Ethernet)
        RX packets 206  bytes 17801 (17.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 28  bytes 3954 (3.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

cilium_net: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet6 fe80::5cca:d6ff:fe2b:255f  prefixlen 64  scopeid 0x20<link>
        ether 5e:ca:d6:2b:25:5f  txqueuelen 1000  (Ethernet)
        RX packets 28  bytes 3954 (3.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 206  bytes 17801 (17.8 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

cilium_vxlan: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::30d7:15ff:fea8:e671  prefixlen 64  scopeid 0x20<link>
        ether 32:d7:15:a8:e6:71  txqueuelen 1000  (Ethernet)
        RX packets 376  bytes 25277 (25.2 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 380  bytes 24461 (24.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

...

lxc_health: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ac24:a3ff:fe62:ca48  prefixlen 64  scopeid 0x20<link>
        ether ae:24:a3:62:ca:48  txqueuelen 1000  (Ethernet)
        RX packets 251  bytes 20300 (20.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 326  bytes 27674 (27.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@master241 ~]# 


	2.Inspect the vxlan tunnel information built from BPF maps
[root@master241 ~]# kubectl get pods -n kube-system -l app.kubernetes.io/name=cilium-agent -o wide
NAME           READY   STATUS    RESTARTS   AGE   IP           NODE        NOMINATED NODE   READINESS GATES
cilium-2c6st   1/1     Running   0          17m   10.0.0.243   worker243   <none>           <none>
cilium-9hf7b   1/1     Running   0          17m   10.0.0.242   worker242   <none>           <none>
cilium-fwcjw   1/1     Running   0          17m   10.0.0.241   master241   <none>           <none>
[root@master241 ~]# 
[root@master241 ~]# kubectl -n kube-system exec -it cilium-2c6st -c cilium-agent -- bash
root@worker243:/home/cilium# 
root@worker243:/home/cilium# cilium bpf tunnel list
TUNNEL       VALUE
10.100.1.0   10.0.0.242:0   
10.100.0.0   10.0.0.241:0   
root@worker243:/home/cilium# 
root@worker243:/home/cilium# cilium status  # Show cilium status; add the "--verbose" option for more detail.
KVStore:                 Disabled   
Kubernetes:              Ok         1.31 (v1.31.6) [linux/amd64]
Kubernetes APIs:         ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    True   [eth0   10.0.0.243 fe80::20c:29ff:fec0:3c55 (Direct Routing)]
Host firewall:           Disabled
SRv6:                    Disabled
CNI Chaining:            none
CNI Config file:         successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium:                  Ok   1.17.0 (v1.17.0-c2bbf787)
NodeMonitor:             Listening for events on 128 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok   
IPAM:                    IPv4: 5/254 allocated from 10.100.2.0/24, 
IPv4 BIG TCP:            Disabled
IPv6 BIG TCP:            Disabled
BandwidthManager:        Disabled
Routing:                 Network: Tunnel [vxlan]   Host: Legacy
Attach Mode:             Legacy TC
Device Mode:             veth
Masquerading:            IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status:       37/37 healthy
Proxy Status:            OK, ip 10.100.2.222, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range:   min 256, max 65535
Hubble:                  Ok              Current/Max Flows: 4095/4095 (100.00%), Flows/s: 9.31   Metrics: Disabled
Encryption:              Disabled        
Cluster health:          3/3 reachable   (2025-03-09T02:09:18Z)
Name                     IP              Node   Endpoints
Modules Health:          Stopped(0) Degraded(0) OK(60)
root@worker243:/home/cilium# 

III. Deploying a Kubernetes 1.31.6 Cluster

1. Installing the kubeadm Packages

1.1 Configure the package repository

	1.Install dependencies on all nodes
apt-get update && apt-get install -y apt-transport-https


	2.Add the repository on all nodes [to install a different version later, replace 1.31 with the desired version; this method only works for K8S 1.29+]
curl -fsSL https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb/Release.key |
    gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb/ /" |
    tee /etc/apt/sources.list.d/kubernetes.list


References:
	https://developer.aliyun.com/mirror/kubernetes
	https://www.cnblogs.com/yinzhengjie/p/18353027

1.2 List the installable kubeadm versions

	1.Refresh the package index
apt-get update
	
	
	2.List the installable kubeadm versions; the latest at the time of writing is "1.31.6-1.1"
[root@master241 ~]# apt-cache madison kubeadm
   kubeadm | 1.31.6-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
   kubeadm | 1.31.5-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
   kubeadm | 1.31.4-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
   kubeadm | 1.31.3-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
   kubeadm | 1.31.2-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
   kubeadm | 1.31.1-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
   kubeadm | 1.31.0-1.1 | https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.31/deb  Packages
[root@master241 ~]# 

1.3 Install a specific version of the kubeadm packages

	1.Install the packages
apt-get -y install kubelet=1.31.6-1.1 kubeadm=1.31.6-1.1 kubectl=1.31.6-1.1


	2.Check the installed versions
[root@master241 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"31", GitVersion:"v1.31.6", GitCommit:"6b3560758b37680cb713dfc71da03c04cadd657c", GitTreeState:"clean", BuildDate:"2025-02-12T21:31:09Z", GoVersion:"go1.22.12", Compiler:"gc", Platform:"linux/amd64"}
[root@master241 ~]# 
[root@master241 ~]# kubectl version
Client Version: v1.31.6
Kustomize Version: v5.4.2
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@master241 ~]# 
[root@master241 ~]# kubelet --version
Kubernetes v1.31.6
[root@master241 ~]# 

2. Installing containerd

2.1 Basic Linux tuning

	1.Disable swap
swapoff -a && sysctl -w vm.swappiness=0  # disable immediately (non-persistent)
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab  # disable persistently via /etc/fstab


	2.Set the time zone
ln -svf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime 


	3.Allow iptables to inspect bridged traffic
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system

2.2 Install containerd on all cluster nodes

	1.Unpack the installation bundle I prepared for you
tar xf yinzhengjie-autoinstall-containerd-v1.6.36.tar.gz 


	2.Install the containerd service
./install-containerd.sh i

3. Initializing the K8S Cluster with kubeadm

3.1 Initialize the master node

kubeadm init --control-plane-endpoint 10.0.0.241 \
  --kubernetes-version=v1.31.6 \
  --pod-network-cidr=10.100.0.0/16 \
  --service-cidr=10.200.0.0/12 \
  --upload-certs \
  --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
  --skip-phases=addon/kube-proxy
  
...
[addons] Applied essential addon: CoreDNS  # note: only CoreDNS is installed as an addon; kube-proxy is not installed

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 10.0.0.241:6443 --token bcgvtt.70kxqw5s57o4pg3o \
	--discovery-token-ca-cert-hash sha256:acd620b40bdcba9b382d85ceb4291d84ac04db82e2fda273b44833ec391ca5c6 \
	--control-plane --certificate-key 9d56d479b5b6a622cea587fec4365b9f98ea580b98c20eb7be452b5065b3e091

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.241:6443 --token bcgvtt.70kxqw5s57o4pg3o \
	--discovery-token-ca-cert-hash sha256:acd620b40bdcba9b382d85ceb4291d84ac04db82e2fda273b44833ec391ca5c6 
[root@master241 ~]# 

3.2 Join the worker nodes

[root@worker242 ~]# kubeadm join 10.0.0.241:6443 --token bcgvtt.70kxqw5s57o4pg3o \
	--discovery-token-ca-cert-hash sha256:acd620b40bdcba9b382d85ceb4291d84ac04db82e2fda273b44833ec391ca5c6 


[root@worker243 ~]# kubeadm join 10.0.0.241:6443 --token bcgvtt.70kxqw5s57o4pg3o \
	--discovery-token-ca-cert-hash sha256:acd620b40bdcba9b382d85ceb4291d84ac04db82e2fda273b44833ec391ca5c6 

3.3 Configure the admin node

	1.Prepare the kubeconfig file
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

	
	2.Check node status
[root@master241 ~]# kubectl get nodes -o wide
NAME        STATUS     ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
master241   NotReady   control-plane   7m57s   v1.31.6   10.0.0.241    <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.6.36
worker242   NotReady   <none>          22s     v1.31.6   10.0.0.242    <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.6.36
worker243   NotReady   <none>          8s      v1.31.6   10.0.0.243    <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.6.36
[root@master241 ~]# 

3.4 Configure shell auto-completion

kubectl completion bash > ~/.kube/completion.bash.inc
echo source '$HOME/.kube/completion.bash.inc' >> ~/.bashrc 
source ~/.bashrc

IV. Deploying the Cilium Network Plugin

1. Installing the Cilium Network Plugin

1.1 Install the cilium client

Reference:
	https://github.com/cilium/cilium-cli
	
	
Deploy the cilium components:
	1.Download the cilium-cli binary
[root@master241 ~]# wget https://github.com/cilium/cilium-cli/releases/download/v0.18.2/cilium-linux-amd64.tar.gz
 

	2.Unpack the binary
[root@master241 ~]# tar xf cilium-linux-amd64.tar.gz  -C /usr/local/bin/
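	(Optional) smoke-test the binary; "cilium version" prints the cilium-cli version (and, once Cilium is installed, the chart version detected in the cluster):
cilium version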


	3.List the Cilium versions available for installation
[root@master241 ~]# cilium install --list-versions
v1.17.0 (default)
v1.17.0-rc.2
v1.17.0-rc.1
v1.17.0-rc.0
v1.17.0-pre.3
v1.17.0-pre.2
v1.17.0-pre.1
v1.17.0-pre.0
...
v1.6.12
v1.6.11
v1.6.10
v1.6.9
v1.6.8
v1.6.7
v1.6.6
v1.6.5
[root@master241 ~]# 

1.2 Deploy Cilium in vxlan mode

	1.Install the specified version of cilium
[root@master241 ~]# cilium install \
      --set kubeProxyReplacement=true \
      --set ipam.mode=kubernetes \
      --set routingMode=tunnel \
      --set tunnelProtocol=vxlan \
      --set ipam.operator.clusterPoolIPv4PodCIDRList=10.100.0.0/16 \
      --set ipam.operator.clusterPoolIPv4MaskSize=24 \
      --set version=v1.17.0
...  # output shown below
ℹ️  Using Cilium version 1.17.0  # version being deployed
🔮 Auto-detected cluster name: kubernetes  # cluster name
🔮 Auto-detected kube-proxy has not been installed  # detected that kube-proxy is not installed
ℹ️  Cilium will fully replace all functionalities of kube-proxy  # cilium replaces kube-proxy's functionality
[root@master241 ~]# 


	2.Verify the installation
[root@master241 ~]# kubectl get pods -n kube-system -l app.kubernetes.io/part-of=cilium -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP           NODE        NOMINATED NODE   READINESS GATES
cilium-2c6st                       1/1     Running   0          2m46s   10.0.0.243   worker243   <none>           <none>
cilium-9hf7b                       1/1     Running   0          2m46s   10.0.0.242   worker242   <none>           <none>
cilium-envoy-d4kkx                 1/1     Running   0          2m46s   10.0.0.243   worker243   <none>           <none>
cilium-envoy-fp7s6                 1/1     Running   0          2m46s   10.0.0.241   master241   <none>           <none>
cilium-envoy-mhxcg                 1/1     Running   0          2m46s   10.0.0.242   worker242   <none>           <none>
cilium-fwcjw                       1/1     Running   0          2m46s   10.0.0.241   master241   <none>           <none>
cilium-operator-84f88cb595-lv4n5   1/1     Running   0          2m46s   10.0.0.243   worker243   <none>           <none>
[root@master241 ~]# 


	3.Check cilium status
[root@master241 ~]# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       disabled
    \__/       ClusterMesh:        disabled

DaemonSet              cilium                   Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet              cilium-envoy             Desired: 3, Ready: 3/3, Available: 3/3
Deployment             cilium-operator          Desired: 1, Ready: 1/1, Available: 1/1
Containers:            cilium                   Running: 3
                       cilium-envoy             Running: 3
                       cilium-operator          Running: 1
                       clustermesh-apiserver    
                       hubble-relay             
Cluster Pods:          2/2 managed by Cilium
Helm chart version:    1.17.0
Image versions         cilium             quay.io/cilium/cilium:v1.17.0@sha256:51f21bdd003c3975b5aaaf41bd21aee23cc08f44efaa27effc91c621bc9d8b1d: 3
                       cilium-envoy       quay.io/cilium/cilium-envoy:v1.31.5-1737535524-fe8efeb16a7d233bffd05af9ea53599340d3f18e@sha256:57a3aa6355a3223da360395e3a109802867ff635cb852aa0afe03ec7bf04e545: 3
                       cilium-operator    quay.io/cilium/operator-generic:v1.17.0@sha256:1ce5a5a287166fc70b6a5ced3990aaa442496242d1d4930b5a3125e44cccdca8: 1
[root@master241 ~]# 

1.3 Verify the worker nodes are Ready

[root@master241 ~]# kubectl get nodes -o wide
NAME        STATUS   ROLES           AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
master241   Ready    control-plane   10h   v1.31.6   10.0.0.241    <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.6.36
worker242   Ready    <none>          10h   v1.31.6   10.0.0.242    <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.6.36
worker243   Ready    <none>          10h   v1.31.6   10.0.0.243    <none>        Ubuntu 22.04.4 LTS   5.15.0-119-generic   containerd://1.6.36
[root@master241 ~]# 
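Optionally, cilium-cli ships an end-to-end connectivity suite; it deploys test workloads into a temporary cilium-test namespace, checks pod-to-pod, pod-to-service and egress paths, and cleans up afterwards (note that it needs to pull several public test images):

cilium connectivity test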

2. Verifying That Cilium Works Correctly

2.1 Verify that the CNI component works correctly

	1.Write the manifest
[root@master241 ~]# cat > yinzhengjie-network-cni-test.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-v1
spec:
  nodeName: worker242
  containers:
  - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v1 
    name: xiuxian

---

apiVersion: v1
kind: Pod
metadata:
  name: xiuxian-v2
spec:
  nodeName: worker243
  containers:
  - image: registry.cn-hangzhou.aliyuncs.com/yinzhengjie-k8s/apps:v2
    name: xiuxian
EOF
	
	2.Create the Pods
[root@master241 ~]# kubectl apply -f yinzhengjie-network-cni-test.yaml 
pod/xiuxian-v1 created
pod/xiuxian-v2 created
[root@master241 ~]# 
[root@master241 ~]# kubectl get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
xiuxian-v1   1/1     Running   0          6s    10.100.1.42    worker242   <none>           <none>
xiuxian-v2   1/1     Running   0          6s    10.100.2.195   worker243   <none>           <none>
[root@master241 ~]# 



	3.Access test
[root@master241 ~]# curl 10.100.1.42
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8"/>
    <title>yinzhengjie apps v1</title>
    <style>
       div img {
          width: 900px;
          height: 600px;
          margin: 0;
       }
    </style>
  </head>

  <body>
    <h1 style="color: green">凡人修仙传 v1 </h1>
    <div>
      <img src="1.jpg">
    <div>
  </body>

</html>
[root@master241 ~]# 
[root@master241 ~]# 
[root@master241 ~]# curl 10.100.2.195
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8"/>
    <title>yinzhengjie apps v2</title>
    <style>
       div img {
          width: 900px;
          height: 600px;
          margin: 0;
       }
    </style>
  </head>

  <body>
    <h1 style="color: red">凡人修仙传 v2 </h1>
    <div>
      <img src="2.jpg">
    <div>
  </body>

</html>
[root@master241 ~]# 

2.2 Capture vxlan device traffic with tcpdump

	1.Access test
[root@master241 ~]# kubectl get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
xiuxian-v1   1/1     Running   0          17m   10.100.1.42    worker242   <none>           <none>
xiuxian-v2   1/1     Running   0          17m   10.100.2.195   worker243   <none>           <none>
[root@master241 ~]# 
[root@master241 ~]# curl  10.100.2.195 
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8"/>
    <title>yinzhengjie apps v2</title>
    <style>
       div img {
          width: 900px;
          height: 600px;
          margin: 0;
       }
    </style>
  </head>

  <body>
    <h1 style="color: red">凡人修仙传 v2 </h1>
    <div>
      <img src="2.jpg">
    <div>
  </body>

</html>
[root@master241 ~]# 


	2.Capture the Pod traffic on the vxlan device
[root@worker243 ~]# tcpdump -i cilium_vxlan -nn host 10.100.2.195
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on cilium_vxlan, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:24:56.225340 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [S], seq 4009844643, win 64860, options [mss 1410,sackOK,TS val 29932480 ecr 0,nop,wscale 7], length 0
10:24:56.225494 IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [S.], seq 2178855995, ack 4009844644, win 64308, options [mss 1410,sackOK,TS val 1769040274 ecr 29932480,nop,wscale 7], length 0
10:24:56.225957 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 1, win 507, options [nop,nop,TS val 29932480 ecr 1769040274], length 0
10:24:56.225957 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [P.], seq 1:77, ack 1, win 507, options [nop,nop,TS val 29932480 ecr 1769040274], length 76: HTTP: GET / HTTP/1.1
10:24:56.226063 IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [.], ack 77, win 502, options [nop,nop,TS val 1769040274 ecr 29932480], length 0
10:24:56.226252 IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [P.], seq 1:239, ack 77, win 502, options [nop,nop,TS val 1769040274 ecr 29932480], length 238: HTTP: HTTP/1.1 200 OK
10:24:56.226348 IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [P.], seq 239:594, ack 77, win 502, options [nop,nop,TS val 1769040275 ecr 29932480], length 355: HTTP
10:24:56.226467 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 239, win 506, options [nop,nop,TS val 29932481 ecr 1769040274], length 0
10:24:56.226754 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 594, win 504, options [nop,nop,TS val 29932481 ecr 1769040275], length 0
10:24:56.227134 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [F.], seq 77, ack 594, win 504, options [nop,nop,TS val 29932482 ecr 1769040275], length 0
10:24:56.227285 IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [F.], seq 594, ack 78, win 502, options [nop,nop,TS val 1769040275 ecr 29932482], length 0
10:24:56.228144 IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 595, win 504, options [nop,nop,TS val 29932483 ecr 1769040275], length 0
...
	
	
	3.Capture on the corresponding node's physical interface
[root@worker243 ~]# tcpdump -i eth0 -nn host 10.0.0.241 
...  # the packets carry the "OTV (Overlay Transport Virtualization)" marker; under the hood they communicate in vxlan mode (port 8472)
10:24:56.225340 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [S], seq 4009844643, win 64860, options [mss 1410,sackOK,TS val 29932480 ecr 0,nop,wscale 7], length 0
10:24:56.225517 IP 10.0.0.243.53086 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 10307
IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [S.], seq 2178855995, ack 4009844644, win 64308, options [mss 1410,sackOK,TS val 1769040274 ecr 29932480,nop,wscale 7], length 0
10:24:56.225957 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 1, win 507, options [nop,nop,TS val 29932480 ecr 1769040274], length 0
10:24:56.225957 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [P.], seq 1:77, ack 1, win 507, options [nop,nop,TS val 29932480 ecr 1769040274], length 76: HTTP: GET / HTTP/1.1
10:24:56.226076 IP 10.0.0.243.53086 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 10307
IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [.], ack 77, win 502, options [nop,nop,TS val 1769040274 ecr 29932480], length 0
10:24:56.226264 IP 10.0.0.243.53086 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 10307
IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [P.], seq 1:239, ack 77, win 502, options [nop,nop,TS val 1769040274 ecr 29932480], length 238: HTTP: HTTP/1.1 200 OK
10:24:56.226356 IP 10.0.0.243.53086 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 10307
IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [P.], seq 239:594, ack 77, win 502, options [nop,nop,TS val 1769040275 ecr 29932480], length 355: HTTP
10:24:56.226467 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 239, win 506, options [nop,nop,TS val 29932481 ecr 1769040274], length 0
10:24:56.226754 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 594, win 504, options [nop,nop,TS val 29932481 ecr 1769040275], length 0
10:24:56.227134 IP 10.0.0.241.22 > 10.0.0.1.65159: Flags [P.], seq 417:829, ack 152, win 501, length 412
10:24:56.227134 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [F.], seq 77, ack 594, win 504, options [nop,nop,TS val 29932482 ecr 1769040275], length 0
10:24:56.227134 IP 10.0.0.1.65159 > 10.0.0.241.22: Flags [.], ack 829, win 4106, length 0
10:24:56.227327 IP 10.0.0.243.53086 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 10307
IP 10.100.2.195.80 > 10.100.0.56.57100: Flags [F.], seq 594, ack 78, win 502, options [nop,nop,TS val 1769040275 ecr 29932482], length 0
10:24:56.228144 IP 10.0.0.241.52043 > 10.0.0.243.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.0.56.57100 > 10.100.2.195.80: Flags [.], ack 595, win 504, options [nop,nop,TS val 29932483 ecr 1769040275], length 0
10:24:56.228893 IP 10.0.0.241.22 > 10.0.0.1.65159: Flags [P.], seq 829:929, ack 152, win 501, length 100
10:24:56.274039 IP 10.0.0.1.65159 > 10.0.0.241.22: Flags [.], ack 929, win 4105, length 0
10:24:56.506461 IP 10.0.0.241.59343 > 10.0.0.242.8472: OTV, flags [I] (0x08), overlay 0, instance 4
IP 10.100.0.63.4240 > 10.100.1.247.44970: Flags [.], ack 341364297, win 502, options [nop,nop,TS val 1169120106 ecr 1743764241], length 0
10:24:56.506641 IP 10.0.0.242.49250 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.1.247.44970 > 10.100.0.63.4240: Flags [.], ack 1, win 507, options [nop,nop,TS val 1743779346 ecr 1169089857], length 0
10:24:56.812831 IP 10.0.0.242.49250 > 10.0.0.241.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.100.1.247.44970 > 10.100.0.63.4240: Flags [.], ack 1, win 507, options [nop,nop,TS val 1743779651 ecr 1169089857], length 0
10:24:56.813344 IP 10.0.0.241.59343 > 10.0.0.242.8472: OTV, flags [I] (0x08), overlay 0, instance 4
IP 10.100.0.63.4240 > 10.100.1.247.44970: Flags [.], ack 1, win 502, options [nop,nop,TS val 1169120413 ecr 1743779346], length 0
...

2.3 Verify that Service functionality works

Because we use Cilium to replace kube-proxy, and a key job of kube-proxy is providing the underlying proxying for Services,

we need to verify that Services still work correctly. The steps are as follows:
	1.Label the Pods
[root@master241 ~]# kubectl get pods  --show-labels -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES   LABELS
xiuxian-v1   1/1     Running   0          34m   10.100.1.42    worker242   <none>           <none>            <none>
xiuxian-v2   1/1     Running   0          34m   10.100.2.195   worker243   <none>           <none>            <none>
[root@master241 ~]# 
[root@master241 ~]# 
[root@master241 ~]# kubectl label pod --all apps=xiuxian
pod/xiuxian-v1 labeled
pod/xiuxian-v2 labeled
[root@master241 ~]# 
[root@master241 ~]# kubectl get pods  --show-labels -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES   LABELS
xiuxian-v1   1/1     Running   0          35m   10.100.1.42    worker242   <none>           <none>            apps=xiuxian
xiuxian-v2   1/1     Running   0          35m   10.100.2.195   worker243   <none>           <none>            apps=xiuxian
[root@master241 ~]# 


	
	2.Create a Service to expose the Pods
[root@master241 ~]# kubectl expose pod xiuxian-v1 --name svc-xiuxian --selector apps=xiuxian --port 80
service/svc-xiuxian exposed
[root@master241 ~]# 

	
	3.Inspect the Service
[root@master241 ~]# kubectl get service svc-xiuxian -o wide
NAME          TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE   SELECTOR
svc-xiuxian   ClusterIP   10.204.89.73   <none>        80/TCP    53s   apps=xiuxian
[root@master241 ~]# 
[root@master241 ~]# kubectl describe service svc-xiuxian 
Name:                     svc-xiuxian
Namespace:                default
Labels:                   apps=xiuxian
Annotations:              <none>
Selector:                 apps=xiuxian
Type:                     ClusterIP
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.204.89.73
IPs:                      10.204.89.73
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
Endpoints:                10.100.1.42:80,10.100.2.195:80
Session Affinity:         None
Internal Traffic Policy:  Cluster
Events:                   <none>
[root@master241 ~]# 


	4.Access test; the Service works as expected
[root@master241 ~]# for i in `seq 10`; do curl 10.204.89.73;sleep 0.5;done
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8"/>
    <title>yinzhengjie apps v1</title>
    <style>
       div img {
          width: 900px;
          height: 600px;
          margin: 0;
       }
    </style>
  </head>

  <body>
    <h1 style="color: green">凡人修仙传 v1 </h1>
    <div>
      <img src="1.jpg">
    <div>
  </body>

</html>
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8"/>
    <title>yinzhengjie apps v2</title>
    <style>
       div img {
          width: 900px;
          height: 600px;
          margin: 0;
       }
    </style>
  </head>

  <body>
    <h1 style="color: red">凡人修仙传 v2 </h1>
    <div>
      <img src="2.jpg">
    <div>
  </body>

</html>
...
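We can additionally confirm that the Service is served from eBPF maps rather than iptables (a quick sanity check; chain names vary across kube-proxy versions):

iptables-save | grep -c 'KUBE-SVC'    # expect 0: no kube-proxy Service chains on the node
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium service list    # the ClusterIP and its backends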

3. Deploying Cilium in Native Routing Mode

3.1 Uninstall cilium

The Cilium deployed above works in vxlan mode. To switch to native routing mode, uninstall the existing Cilium first and then redeploy.


The concrete steps:
	1.Check each K8S node's routes before uninstalling
[root@master241 ~]# ifconfig cilium_host
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.0.56  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::bc14:bfff:febc:e3df  prefixlen 64  scopeid 0x20<link>
        ether be:14:bf:bc:e3:df  txqueuelen 1000  (Ethernet)
        RX packets 870  bytes 78405 (78.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1910  bytes 128174 (128.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@master241 ~]# 
[root@master241 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
...  # note the gateways: all Pod-subnet traffic is handed to the "cilium_host" device for tunnel encapsulation
10.100.0.0      10.100.0.56     255.255.255.0   UG    0      0        0 cilium_host
10.100.1.0      10.100.0.56     255.255.255.0   UG    0      0        0 cilium_host
10.100.2.0      10.100.0.56     255.255.255.0   UG    0      0        0 cilium_host
[root@master241 ~]# 



[root@worker242 ~]# ifconfig cilium_host
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.1.247  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::f451:8aff:fe9a:5e15  prefixlen 64  scopeid 0x20<link>
        ether f6:51:8a:9a:5e:15  txqueuelen 1000  (Ethernet)
        RX packets 756  bytes 59525 (59.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1910  bytes 128160 (128.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@worker242 ~]# 
[root@worker242 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
...  # note the gateways: all Pod-subnet traffic is handed to the "cilium_host" device for tunnel encapsulation
10.100.0.0      10.100.1.247    255.255.255.0   UG    0      0        0 cilium_host
10.100.1.0      10.100.1.247    255.255.255.0   UG    0      0        0 cilium_host
10.100.2.0      10.100.1.247    255.255.255.0   UG    0      0        0 cilium_host
[root@worker242 ~]# 



[root@worker243 ~]# ifconfig cilium_host
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.2.222  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::b85d:5bff:fe0e:1cb2  prefixlen 64  scopeid 0x20<link>
        ether ba:5d:5b:0e:1c:b2  txqueuelen 1000  (Ethernet)
        RX packets 771  bytes 60370 (60.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1910  bytes 128160 (128.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@worker243 ~]# 
[root@worker243 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
...  # note the gateways: all Pod-subnet traffic is handed to the "cilium_host" device for tunnel encapsulation
10.100.0.0      10.100.2.222    255.255.255.0   UG    0      0        0 cilium_host
10.100.1.0      10.100.2.222    255.255.255.0   UG    0      0        0 cilium_host
10.100.2.0      10.100.2.222    255.255.255.0   UG    0      0        0 cilium_host
[root@worker243 ~]# 

	
	
	2.Uninstall cilium
[root@master241 ~]# cilium uninstall
🔥 Deleting pods in cilium-test namespace...
🔥 Deleting cilium-test namespace...
⌛ Uninstalling Cilium
[root@master241 ~]# 


	3.Note that the Cilium resources are deleted as well
[root@master241 ~]# kubectl get pods -n kube-system -l app.kubernetes.io/part-of=cilium -o wide
No resources found in kube-system namespace.
[root@master241 ~]#  

3.2 Deploy Cilium in native routing mode

	1.Install cilium in native routing mode, which is more efficient
[root@master241 ~]# cilium install \
      --set kubeProxyReplacement=true \
      --set ipam.mode=kubernetes \
      --set routingMode=native \
      --set autoDirectNodeRoutes=true \
      --set ipam.operator.clusterPoolIPv4PodCIDRList=10.100.0.0/16 \
      --set ipam.operator.clusterPoolIPv4MaskSize=24 \
      --set ipv4NativeRoutingCIDR=10.100.0.0/16 \
      --set bpf.masquerade=true \
      --set version=v1.17.0
... # output shown below
ℹ️  Using Cilium version 1.17.0
🔮 Auto-detected cluster name: kubernetes
🔮 Auto-detected kube-proxy has not been installed
ℹ️  Cilium will fully replace all functionalities of kube-proxy
[root@master241 ~]# 
      
      
      2.Verify the deployment
[root@master241 ~]# kubectl get pods -n kube-system -l app.kubernetes.io/part-of=cilium -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE        NOMINATED NODE   READINESS GATES
cilium-envoy-d589r                 1/1     Running   0          61s   10.0.0.243   worker243   <none>           <none>
cilium-envoy-hxhst                 1/1     Running   0          61s   10.0.0.242   worker242   <none>           <none>
cilium-envoy-x695g                 1/1     Running   0          61s   10.0.0.241   master241   <none>           <none>
cilium-hxnxp                       1/1     Running   0          61s   10.0.0.242   worker242   <none>           <none>
cilium-operator-84f88cb595-rn5m9   1/1     Running   0          61s   10.0.0.242   worker242   <none>           <none>
cilium-qxm68                       1/1     Running   0          61s   10.0.0.241   master241   <none>           <none>
cilium-xszwf                       1/1     Running   0          61s   10.0.0.243   worker243   <none>           <none>
[root@master241 ~]# 


	3.Now re-check each K8S node's routes
[root@master241 ~]# ifconfig cilium_host
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.0.56  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::bc14:bfff:febc:e3df  prefixlen 64  scopeid 0x20<link>
        ether be:14:bf:bc:e3:df  txqueuelen 1000  (Ethernet)
        RX packets 870  bytes 78405 (78.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1910  bytes 128174 (128.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@master241 ~]# 
[root@master241 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
... # routes to other nodes' Pod subnets now go directly via the peer host; only the local Pod subnet is handled by the 'cilium_host' device
10.100.0.0      10.100.0.56     255.255.255.0   UG    0      0        0 cilium_host
10.100.1.0      10.0.0.242      255.255.255.0   UG    0      0        0 eth0
10.100.2.0      10.0.0.243      255.255.255.0   UG    0      0        0 eth0
[root@master241 ~]# 



[root@worker242 ~]# ifconfig cilium_host
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.1.247  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::f451:8aff:fe9a:5e15  prefixlen 64  scopeid 0x20<link>
        ether f6:51:8a:9a:5e:15  txqueuelen 1000  (Ethernet)
        RX packets 756  bytes 59525 (59.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1910  bytes 128160 (128.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@worker242 ~]# 
[root@worker242 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
... # routes to other nodes' Pod subnets now go directly via the peer host; only the local Pod subnet is handled by the 'cilium_host' device
10.100.0.0      10.0.0.241      255.255.255.0   UG    0      0        0 eth0
10.100.1.0      10.100.1.247    255.255.255.0   UG    0      0        0 cilium_host
10.100.2.0      10.0.0.243      255.255.255.0   UG    0      0        0 eth0
[root@worker242 ~]# 



[root@worker243 ~]# ifconfig cilium_host
cilium_host: flags=4291<UP,BROADCAST,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 10.100.2.222  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::b85d:5bff:fe0e:1cb2  prefixlen 64  scopeid 0x20<link>
        ether ba:5d:5b:0e:1c:b2  txqueuelen 1000  (Ethernet)
        RX packets 772  bytes 60440 (60.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1911  bytes 128230 (128.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@worker243 ~]# 
[root@worker243 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
... # routes to other nodes' Pod subnets now go directly via the peer host; only the local Pod subnet is handled by the 'cilium_host' device
10.100.0.0      10.0.0.241      255.255.255.0   UG    0      0        0 eth0
10.100.1.0      10.0.0.242      255.255.255.0   UG    0      0        0 eth0
10.100.2.0      10.100.2.222    255.255.255.0   UG    0      0        0 cilium_host
[root@worker243 ~]# 

3.3 Verify by capturing host interface traffic with tcpdump

	1.Capture on the physical interface, filtering on the Pod's IP address
[root@worker243 ~]# tcpdump -i eth0 -nn host 10.100.2.195
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
11:09:49.078853 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [S], seq 1526009510, win 64240, options [mss 1460,sackOK,TS val 4246647951 ecr 0,nop,wscale 7], length 0
11:09:49.078989 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [S.], seq 1715449442, ack 1526009511, win 65160, options [mss 1460,sackOK,TS val 2995384073 ecr 4246647951,nop,wscale 7], length 0
11:09:49.079421 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 4246647952 ecr 2995384073], length 0
11:09:49.079421 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [P.], seq 1:77, ack 1, win 502, options [nop,nop,TS val 4246647952 ecr 2995384073], length 76: HTTP: GET / HTTP/1.1
11:09:49.079519 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [.], ack 77, win 509, options [nop,nop,TS val 2995384074 ecr 4246647952], length 0
11:09:49.079756 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [P.], seq 1:239, ack 77, win 509, options [nop,nop,TS val 2995384074 ecr 4246647952], length 238: HTTP: HTTP/1.1 200 OK
11:09:49.079838 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [P.], seq 239:594, ack 77, win 509, options [nop,nop,TS val 2995384074 ecr 4246647952], length 355: HTTP
11:09:49.079999 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 4246647953 ecr 2995384074], length 0
11:09:49.080057 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 594, win 501, options [nop,nop,TS val 4246647953 ecr 2995384074], length 0
11:09:49.080311 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [F.], seq 77, ack 594, win 501, options [nop,nop,TS val 4246647953 ecr 2995384074], length 0
11:09:49.080703 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [F.], seq 594, ack 78, win 509, options [nop,nop,TS val 2995384075 ecr 4246647953], length 0
11:09:49.081063 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 595, win 501, options [nop,nop,TS val 4246647954 ecr 2995384075], length 0
...



	2.Capture on the physical interface, filtering on the node's IP address
[root@worker243 ~]# tcpdump -i eth0 -nn host 10.0.0.241
...  # clearly there is no tunnel encapsulation on the physical NIC any more; the packets are routed and returned directly
11:09:49.078853 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [S], seq 1526009510, win 64240, options [mss 1460,sackOK,TS val 4246647951 ecr 0,nop,wscale 7], length 0
11:09:49.078989 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [S.], seq 1715449442, ack 1526009511, win 65160, options [mss 1460,sackOK,TS val 2995384073 ecr 4246647951,nop,wscale 7], length 0
11:09:49.079421 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 1, win 502, options [nop,nop,TS val 4246647952 ecr 2995384073], length 0
11:09:49.079421 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [P.], seq 1:77, ack 1, win 502, options [nop,nop,TS val 4246647952 ecr 2995384073], length 76: HTTP: GET / HTTP/1.1
11:09:49.079519 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [.], ack 77, win 509, options [nop,nop,TS val 2995384074 ecr 4246647952], length 0
11:09:49.079756 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [P.], seq 1:239, ack 77, win 509, options [nop,nop,TS val 2995384074 ecr 4246647952], length 238: HTTP: HTTP/1.1 200 OK
11:09:49.079838 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [P.], seq 239:594, ack 77, win 509, options [nop,nop,TS val 2995384074 ecr 4246647952], length 355: HTTP
11:09:49.079999 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 239, win 501, options [nop,nop,TS val 4246647953 ecr 2995384074], length 0
11:09:49.080057 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 594, win 501, options [nop,nop,TS val 4246647953 ecr 2995384074], length 0
11:09:49.080311 IP 10.0.0.241.22 > 10.0.0.1.65159: Flags [P.], seq 53:457, ack 36, win 592, length 404
11:09:49.080311 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [F.], seq 77, ack 594, win 501, options [nop,nop,TS val 4246647953 ecr 2995384074], length 0
11:09:49.080406 IP 10.0.0.241.22 > 10.0.0.1.65159: Flags [P.], seq 457:501, ack 36, win 592, length 44
11:09:49.080406 IP 10.0.0.1.65159 > 10.0.0.241.22: Flags [.], ack 457, win 4100, length 0
11:09:49.080703 IP 10.100.2.195.80 > 10.0.0.241.38592: Flags [F.], seq 594, ack 78, win 509, options [nop,nop,TS val 2995384075 ecr 4246647953], length 0
11:09:49.081063 IP 10.0.0.241.38592 > 10.100.2.195.80: Flags [.], ack 595, win 501, options [nop,nop,TS val 4246647954 ecr 2995384075], length 0
...

V. Hubble and Cilium Metrics

1. Introduction to Hubble

Hubble is a distributed network observability platform, built on Cilium and eBPF, that stores and displays data obtained from Cilium.
	
Hubble's components:
	- Server:
		Receives, analyzes, and processes data on the server side.
		
	- Relay:
		Collects data from the monitored endpoints.
		
	- Client:
		A component that communicates with the Server.
		
	- Graphical UI:
		The WebUI components that render data from the Server.
		
Reference:
	https://github.com/cilium#hubble
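Once Hubble is enabled (next section), flows can also be inspected from the command line; a minimal sketch, assuming the standalone hubble CLI is installed:

cilium hubble port-forward &                   # expose hubble-relay on localhost:4245
hubble status                                  # confirm the relay is reachable
hubble observe --namespace default --follow    # stream flow events for the default namespace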

2. Enabling Hubble with the cilium Command

2.1 Enable the Hubble components

	1.Enable the Hubble components
[root@master241 ~]# cilium hubble enable --ui


	2.Check Hubble status
[root@master241 ~]# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       OK  # note: Hubble has been enabled successfully!
    \__/       ClusterMesh:        disabled

DaemonSet              cilium                   Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet              cilium-envoy             Desired: 3, Ready: 3/3, Available: 3/3
Deployment             cilium-operator          Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-relay             Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui                Desired: 1, Ready: 1/1, Available: 1/1
Containers:            cilium                   Running: 3
                       cilium-envoy             Running: 3
                       cilium-operator          Running: 1
                       clustermesh-apiserver    
                       hubble-relay             Running: 1
                       hubble-ui                Running: 1
Cluster Pods:          6/6 managed by Cilium
Helm chart version:    1.17.0
Image versions         cilium             quay.io/cilium/cilium:v1.17.0@sha256:51f21bdd003c3975b5aaaf41bd21aee23cc08f44efaa27effc91c621bc9d8b1d: 3
                       cilium-envoy       quay.io/cilium/cilium-envoy:v1.31.5-1737535524-fe8efeb16a7d233bffd05af9ea53599340d3f18e@sha256:57a3aa6355a3223da360395e3a109802867ff635cb852aa0afe03ec7bf04e545: 3
                       cilium-operator    quay.io/cilium/operator-generic:v1.17.0@sha256:1ce5a5a287166fc70b6a5ced3990aaa442496242d1d4930b5a3125e44cccdca8: 1
                       hubble-relay       quay.io/cilium/hubble-relay:v1.17.0@sha256:022c084588caad91108ac73e04340709926ea7fe12af95f57fcb794b68472e05: 1
                       hubble-ui          quay.io/cilium/hubble-ui-backend:v0.13.1@sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b: 1
                       hubble-ui          quay.io/cilium/hubble-ui:v0.13.1@sha256:e2e9313eb7caf64b0061d9da0efbdad59c6c461f6ca1752768942bfeda0796c6: 1
[root@master241 ~]# 

2.2 Access the Hubble WebUI

	1.Change the Service type
[root@master241 ~]# kubectl -n kube-system get svc hubble-ui 
NAME        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
hubble-ui   ClusterIP   10.197.82.62   <none>        80/TCP    5m46s
[root@master241 ~]# 
[root@master241 ~]# kubectl -n kube-system get svc hubble-ui -o yaml | sed '/type/s#ClusterIP#NodePort#' | kubectl apply -f -
Warning: resource services/hubble-ui is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
service/hubble-ui configured
[root@master241 ~]# 
[root@master241 ~]# kubectl -n kube-system get svc hubble-ui 
NAME        TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
hubble-ui   NodePort   10.197.82.62   <none>        80:32089/TCP   5m52s
[root@master241 ~]# 
[root@master241 ~]# 


	2.Access the Hubble WebUI
Browse to any node's IP on the NodePort assigned above (here 32089) and the Hubble WebUI comes up.
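Alternatively, cilium-cli can open the UI through a local port-forward, without changing the Service type:

cilium hubble ui    # port-forwards the hubble-ui Service and serves it on http://localhost:12000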

3. Enabling Hubble Directly at Cilium Install Time

3.1 Example: enabling Hubble

[root@master241 ~]# cilium install \
      --set kubeProxyReplacement=true \
      --set ipam.mode=kubernetes \
      --set routingMode=native \
      --set autoDirectNodeRoutes=true \
      --set ipam.operator.clusterPoolIPv4PodCIDRList=10.100.0.0/16 \
      --set ipam.operator.clusterPoolIPv4MaskSize=24 \
      --set ipv4NativeRoutingCIDR=10.100.0.0/16 \
      --set bpf.masquerade=true \
      --set version=v1.17.0 \
      --set hubble.enabled="true" \
      --set hubble.listenAddress=":4244" \
      --set hubble.relay.enabled="true" \
      --set hubble.ui.enabled="true"
... # output shown below
ℹ️  Using Cilium version 1.17.0
🔮 Auto-detected cluster name: kubernetes
🔮 Auto-detected kube-proxy has not been installed
ℹ️  Cilium will fully replace all functionalities of kube-proxy
[root@master241 ~]# 


3.2 Example: exposing Hubble metrics to Prometheus

[root@master241 ~]# cilium install \
      --set kubeProxyReplacement=true \
      --set ipam.mode=kubernetes \
      --set routingMode=native \
      --set autoDirectNodeRoutes=true \
      --set ipam.operator.clusterPoolIPv4PodCIDRList=10.100.0.0/16 \
      --set ipam.operator.clusterPoolIPv4MaskSize=24 \
      --set ipv4NativeRoutingCIDR=10.100.0.0/16 \
      --set bpf.masquerade=true \
      --set version=v1.17.0 \
      --set hubble.enabled="true" \
      --set hubble.listenAddress=":4244" \
      --set hubble.relay.enabled="true" \
      --set hubble.ui.enabled="true" \
      --set prometheus.enabled=true \
      --set operator.prometheus.enabled=true \
      --set hubble.metrics.port=9665 \
      --set hubble.metrics.enableOpenMetrics=true \
      --set hubble.metrics.enable="{dns,drop,tcp,flow,port-distribution,icmp,httpV2:exemplars=true;labelsContext=source_ip\,source_namespace\,source_workload\,destination_ip\,destination_namespace\,destination_workload\,traffic_direction}"
... # output shown below
ℹ️  Using Cilium version 1.17.0
🔮 Auto-detected cluster name: kubernetes
🔮 Auto-detected kube-proxy has not been installed
ℹ️  Cilium will fully replace all functionalities of kube-proxy
[root@master241 ~]# 
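To confirm the exporters are up (a quick check; the 9665 port comes from the hubble.metrics.port value above, and since cilium-agent runs with host networking the endpoint should be reachable on the node IP):

curl -s http://10.0.0.241:9665/metrics | head     # Hubble metrics in Prometheus format
kubectl -n kube-system get svc | grep -i metrics  # metrics Services created by the chart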