Geek Time Ops Advanced Training Camp: Week 16 Assignment

1. Summarize the differences between Underlay and Overlay networks, and their pros and cons

  • Underlay: pods attach directly to the host (physical) network, so there is no encapsulation/decapsulation and performance is good. Virtual interfaces (sub-interfaces) are created on top of the host's physical NIC; each one has a unique MAC address and can be assigned an IP from the physical subnet. It depends heavily on the underlying physical network.

    Cons: it consumes many IP addresses from the physical subnet, so the subnet must be planned large enough, and the broadcast (ARP) traffic grows with it.

  • Overlay: the container network is encapsulated inside the host network; the containers' MAC/IP addresses travel as payload in host-to-host packets, so the effect is like L2 Ethernet frames being forwarded within a single broadcast domain. It is one of the most widely used network types in private clouds. (A small host-level illustration follows below.) Pros: good compatibility with the physical network and no extra requirements on it; pods can communicate across host subnets; network plugins such as calico and flannel support overlay mode, so it is widely used in private clouds. Cons: the extra encapsulation/decapsulation adds performance overhead.
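A minimal host-level sketch of the two approaches (not from the course material; eth0, the VLAN ID and all addresses are placeholder assumptions):

# Underlay style: a VLAN sub-interface sits directly on the physical NIC and takes an IP from the physical subnet
ip link add link eth0 name eth0.100 type vlan id 100
ip addr add 172.31.8.10/24 dev eth0.100
ip link set eth0.100 up

# Overlay style: a VXLAN device carries L2 frames inside UDP (port 8472) between hosts,
# so the inner addresses never appear on the physical network
ip link add vxlan100 type vxlan id 100 dev eth0 dstport 8472
ip addr add 10.200.0.1/24 dev vxlan100
ip link set vxlan100 up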

 

2. Implement an underlay network in a Kubernetes cluster

# Run on all nodes
apt-get -y update
apt -y install apt-transport-https ca-certificates curl software-properties-common
# Install the GPG key
curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg |  apt-key add -
# Add the Docker apt repository
add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Update the package index
apt-get -y update
# List installable Docker versions
apt-cache madison docker-ce docker-ce-cli
apt install -y docker-ce=5:20.10.23~3-0~ubuntu-jammy docker-ce-cli=5:20.10.23~3-0~ubuntu-jammy

systemctl start docker && systemctl enable docker

mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
"exec-opts": ["native.cgroupdriver=systemd"],
"registry-mirrors": ["https://9916w1ow.mirror.aliyuncs.com"]
}
EOF
sudo systemctl daemon-reload && sudo systemctl restart docker
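A quick sanity check that the cgroup driver from daemon.json took effect (standard docker CLI, assuming Docker is running):

docker info --format '{{.CgroupDriver}}'   # should print: systemd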

# Install cri-dockerd
cd /usr/local/src/
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.1/cri-dockerd-0.3.1.amd64.tgz

tar xvf cri-dockerd-0.3.1.amd64.tgz
cp cri-dockerd/cri-dockerd /usr/local/bin

tee /lib/systemd/system/cri-docker.service << "EOF"
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target

EOF

tee /etc/systemd/system/cri-docker.socket << "EOF"
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
systemctl daemon-reload && systemctl restart cri-docker && systemctl enable cri-docker && systemctl enable --now cri-docker.socket
systemctl status cri-docker.service

# Check the CRI socket file
ls /var/run/cri-dockerd.sock

# Install kubeadm
## Configure the Kubernetes apt mirror
apt-get update && apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
## Install kubeadm, kubelet and kubectl:
apt-get update
apt-cache madison kubeadm
apt-get install -y kubelet=1.24.10-00 kubeadm=1.24.10-00 kubectl=1.24.10-00
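Optionally pin the versions so a routine apt upgrade cannot move the cluster components (a common practice, not part of the original notes):

apt-mark hold kubelet kubeadm kubectl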

kubeadm config images list --kubernetes-version v1.24.10

tee /opt/images-download.sh << "EOF"
#!/bin/bash
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.24.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.6-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.6
EOF
bash /opt/images-download.sh
'''
Scenario 1: pods can use either overlay or underlay, and SVCs use overlay (if SVCs were underlay, the service CIDR would have to come from the hosts' physical subnet).
The following init, for example, targets an overlay cluster: the pod CIDR will later serve overlay pods and the service CIDR will serve overlay SVCs.
# kubeadm init --apiserver-advertise-address=172.31.6.201 --apiserver-bind-port=6443 --kubernetes-version=v1.24.4 --pod-network-cidr=10.200.0.0/16 --service-cidr=10.100.0.0/16 --service-dns-domain=cluster.local --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --ignore-preflight-errors=swap --cri-socket unix:///var/run/cri-dockerd.sock
Scenario 2: pods can use either overlay or underlay, and SVCs use underlay.
Underlay-oriented init: --pod-network-cidr=10.200.0.0/16 will be used for the later overlay case (the underlay CIDR is configured separately afterwards, and overlay and underlay coexist), while --service-cidr=172.31.5.0/24 will be used for underlay SVCs, so pods can be reached directly through their SVC.
# Example underlay init command:
# kubeadm init --apiserver-advertise-address=172.31.6.201 --apiserver-bind-port=6443 --kubernetes-version=v1.24.10 --pod-network-cidr=10.200.0.0/16 --service-cidr=172.31.5.0/24 --service-dns-domain=cluster.local --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --ignore-preflight-errors=swap --cri-socket unix:///var/run/cri-dockerd.sock

Note: to reach such an SVC later, a static route must be configured on the network device, because the SVC exists only as iptables/IPVS rules and does not answer ARP broadcasts:
-A KUBE-SERVICES -d 172.31.5.148/32 -p tcp -m comment --comment "myserver/myserver-tomcat-app1-service-underlay:http cluster IP" -m tcp --dport 80 -j KUBE-SVC-DXPW2IL54XTPIKP5

-A KUBE-SVC-DXPW2IL54XTPIKP5 ! -s 10.200.0.0/16 -d 172.31.5.148/32 -p tcp -m comment --comment "myserver/myserver-tomcat-app1-service-underlay:http cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ


Chain KUBE-POSTROUTING (1 references)
pkts bytes target prot opt in out source destination
1260 83666 RETURN all -- * * 0.0.0.0/0 0.0.0.0/0 mark match ! 0x4000/0x4000
5 312 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 MARK xor 0x4000
5 312 MASQUERADE all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ rando
'''
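To see these rules on a node, standard iptables commands are enough (172.31.5.148 is the example ClusterIP from the rules above; substitute your own SVC IP):

iptables -t nat -S KUBE-SERVICES | grep 172.31.5.148
iptables -t nat -L KUBE-POSTROUTING -n -v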

## Initialize Kubernetes with the underlay-oriented settings
# Run on the master only
kubeadm init --apiserver-advertise-address=172.31.6.201 --apiserver-bind-port=6443 --kubernetes-version=v1.24.10 --pod-network-cidr=10.200.0.0/16 --service-cidr=172.31.5.0/24 --service-dns-domain=cluster.local --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --ignore-preflight-errors=swap --cri-socket unix:///var/run/cri-dockerd.sock
'''
Key output after initialization completes
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.31.6.201:6443 --token t0ruut.j6toxlfjte31sngo \
        --discovery-token-ca-cert-hash sha256:c78950990035913274f57d8e62f56c2502dab04ec2f578b6ccd58d788f3932c7
'''
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

root@k8s-master:~# kubectl  get nodes
NAME                     STATUS     ROLES           AGE    VERSION
k8s-master.iclinux.com   NotReady   control-plane   4m4s   v1.24.10

# Run on the worker nodes
## Join the worker nodes (run only on each node)
kubeadm join 172.31.6.201:6443 --token t0ruut.j6toxlfjte31sngo  --discovery-token-ca-cert-hash sha256:c78950990035913274f57d8e62f56c2502dab04ec2f578b6ccd58d788f3932c7 --cri-socket unix:///var/run/cri-dockerd.sock

root@k8s-master:~# kubectl  get nodes
NAME                     STATUS     ROLES           AGE   VERSION
k8s-master.iclinux.com   NotReady   control-plane   10m   v1.24.10
k8s-node1.iclinux.com    NotReady   <none>          20s   v1.24.10
k8s-node2.iclinux.com    NotReady   <none>          11s   v1.24.10
k8s-node3.iclinux.com    NotReady   <none>          8s    v1.24.10
## Distribute the kubeconfig (run on the master)
scp /root/.kube/config  172.31.6.204:/root/.kube

# Deploy the hybridnet network component with Helm (master node)
## Install helm
cd /usr/local/src && wget https://get.helm.sh/helm-v3.9.0-linux-amd64.tar.gz
tar xvf helm-v3.9.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/
# Add the helm repository
helm repo add hybridnet https://alibaba.github.io/hybridnet/
helm repo update
## Install the network component
helm install hybridnet hybridnet/hybridnet -n kube-system --set init.cidr=10.200.0.0/16
# Note: init.cidr must match the pod CIDR the cluster was initialized with (10.200.0.0/16)
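A quick check that the release was installed (standard Helm commands):

helm -n kube-system list
helm -n kube-system status hybridnet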
## Check the pods
root@k8s-master:/usr/local/src# kubectl get pod -A
NAMESPACE     NAME                                             READY   STATUS     RESTARTS   AGE
kube-system   calico-typha-6f55876f98-hr7sm                    1/1     Running    0          2m7s
kube-system   calico-typha-6f55876f98-kp47t                    1/1     Running    0          2m7s
kube-system   calico-typha-6f55876f98-sdzmz                    1/1     Running    0          2m7s
kube-system   coredns-7f74c56694-bfwqd                         0/1     Pending    0          85m
kube-system   coredns-7f74c56694-blmcv                         0/1     Pending    0          85m
kube-system   etcd-k8s-master.iclinux.com                      1/1     Running    0          85m
kube-system   hybridnet-daemon-5c2fh                           2/2     Running    0          2m7s
kube-system   hybridnet-daemon-5p4fn                           0/2     Init:0/1   0          2m7s
kube-system   hybridnet-daemon-6hvz6                           0/2     Init:0/1   0          2m7s
kube-system   hybridnet-daemon-tvb7x                           2/2     Running    0          2m7s
kube-system   hybridnet-manager-6574dcc5fb-2wm76               0/1     Pending    0          2m7s
kube-system   hybridnet-manager-6574dcc5fb-ghxn6               0/1     Pending    0          2m7s
kube-system   hybridnet-manager-6574dcc5fb-l6gx6               0/1     Pending    0          2m7s
kube-system   hybridnet-webhook-76dc57b4bf-cf9mj               0/1     Pending    0          2m10s
kube-system   hybridnet-webhook-76dc57b4bf-klqzt               0/1     Pending    0          2m10s
kube-system   hybridnet-webhook-76dc57b4bf-wbsnj               0/1     Pending    0          2m10s
kube-system   kube-apiserver-k8s-master.iclinux.com            1/1     Running    0          85m
kube-system   kube-controller-manager-k8s-master.iclinux.com   1/1     Running    0          85m
kube-system   kube-proxy-864vx                                 1/1     Running    0          75m
kube-system   kube-proxy-h585r                                 1/1     Running    0          75m
kube-system   kube-proxy-m5wd5                                 1/1     Running    0          75m
kube-system   kube-proxy-vctbh                                 1/1     Running    0          85m
kube-system   kube-scheduler-k8s-master.iclinux.com            1/1     Running    0          85m
# Label the nodes so hybridnet-manager/hybridnet-webhook can be scheduled (they select nodes carrying the master role label, which is why they are Pending above)
kubectl label node k8s-node1.iclinux.com node-role.kubernetes.io/master=
kubectl label node k8s-node2.iclinux.com node-role.kubernetes.io/master=
kubectl label node k8s-node3.iclinux.com node-role.kubernetes.io/master=
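Verify that the label landed on every node (the -L flag prints the label value as an extra column):

kubectl get nodes -L node-role.kubernetes.io/master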

# Make sure all pods are now Running

root@k8s-master:/usr/local/src#
root@k8s-master:/usr/local/src# kubectl  get pods -A
NAMESPACE     NAME                                             READY   STATUS    RESTARTS        AGE
kube-system   calico-typha-6f55876f98-hr7sm                    1/1     Running   0               5m37s
kube-system   calico-typha-6f55876f98-kp47t                    1/1     Running   0               5m37s
kube-system   calico-typha-6f55876f98-sdzmz                    1/1     Running   0               5m37s
kube-system   coredns-7f74c56694-bfwqd                         1/1     Running   0               89m
kube-system   coredns-7f74c56694-blmcv                         1/1     Running   0               89m
kube-system   etcd-k8s-master.iclinux.com                      1/1     Running   0               89m
kube-system   hybridnet-daemon-5c2fh                           2/2     Running   1 (3m2s ago)    5m37s
kube-system   hybridnet-daemon-5p4fn                           2/2     Running   1 (43s ago)     5m37s
kube-system   hybridnet-daemon-6hvz6                           2/2     Running   1 (2m44s ago)   5m37s
kube-system   hybridnet-daemon-tvb7x                           2/2     Running   1 (3m2s ago)    5m37s
kube-system   hybridnet-manager-6574dcc5fb-2wm76               1/1     Running   0               5m37s
kube-system   hybridnet-manager-6574dcc5fb-ghxn6               1/1     Running   0               5m37s
kube-system   hybridnet-manager-6574dcc5fb-l6gx6               1/1     Running   0               5m37s
kube-system   hybridnet-webhook-76dc57b4bf-cf9mj               1/1     Running   0               5m40s
kube-system   hybridnet-webhook-76dc57b4bf-klqzt               1/1     Running   0               5m40s
kube-system   hybridnet-webhook-76dc57b4bf-wbsnj               1/1     Running   0               5m40s
kube-system   kube-apiserver-k8s-master.iclinux.com            1/1     Running   0               89m
kube-system   kube-controller-manager-k8s-master.iclinux.com   1/1     Running   0               89m
kube-system   kube-proxy-864vx                                 1/1     Running   0               79m
kube-system   kube-proxy-h585r                                 1/1     Running   0               79m
kube-system   kube-proxy-m5wd5                                 1/1     Running   0               79m

# Inspect the node's network interfaces

root@k8s-node3:/usr/local/src# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:0e:b8:b1:9c  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.31.6.204  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:feda:8719  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:da:87:19  txqueuelen 1000  (Ethernet)
        RX packets 545289  bytes 695214333 (695.2 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 231360  bytes 42214413 (42.2 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0.vxlan4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.31.6.204  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:feda:8719  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:da:87:19  txqueuelen 0  (Ethernet)
        RX packets 27  bytes 2580 (2.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12  bytes 1188 (1.1 KB)
        TX errors 0  dropped 1 overruns 0  carrier 0  collisions 0

hybr2f5133a0152: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 0  (Ethernet)
        RX packets 124  bytes 12975 (12.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 134  bytes 33940 (33.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

hybrf4727d8f411: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 0  (Ethernet)
        RX packets 124  bytes 12904 (12.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 131  bytes 33766 (33.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 4030  bytes 374007 (374.0 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4030  bytes 374007 (374.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@k8s-node3:/usr/local/src#

root@k8s-node3:/usr/local/src# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.31.0.2      0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.31.0.0      0.0.0.0         255.255.248.0   U     0      0        0 eth0
root@k8s-node3:/usr/local/src#

# Configure the underlay network
## Label the nodes that will carry underlay traffic
kubectl label node k8s-node1.iclinux.com network=underlay-nethost
kubectl label node k8s-node2.iclinux.com network=underlay-nethost
kubectl label node k8s-node3.iclinux.com network=underlay-nethost
## If a label was applied incorrectly, overwrite it:
kubectl label --overwrite node k8s-node1.iclinux.com network=underlay-nethost
kubectl label --overwrite node k8s-node2.iclinux.com network=underlay-nethost
kubectl label --overwrite node k8s-node3.iclinux.com network=underlay-nethost
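Verify the network labels, which the Network object's nodeSelector below matches on:

kubectl get nodes -L network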

root@k8s-master:~/underlay-cases-files# cat 1.create-underlay-network.yaml
---
apiVersion: networking.alibaba.com/v1
kind: Network
metadata:
  name: underlay-network1
spec:
  netID: 0
  type: Underlay
  nodeSelector:
    network: "underlay-nethost"

---
apiVersion: networking.alibaba.com/v1
kind: Subnet
metadata:
  name: underlay-network1
spec:
  network: underlay-network1
  netID: 0
  range:
    version: "4"              # IPv4
    cidr: "172.31.0.0/21"     # network plan for the whole physical subnet
    gateway: "172.31.0.2"     # external gateway address
    start: "172.31.6.1"
    end: "172.31.6.254"
root@k8s-master:~/underlay-cases-files# kubectl apply -f 1.create-underlay-network.yaml
network.networking.alibaba.com/underlay-network1 created
subnet.networking.alibaba.com/underlay-network1 created

root@k8s-master:~/underlay-cases-files# kubectl  get network
NAME                NETID   TYPE       MODE   V4TOTAL   V4USED   V4AVAILABLE   LASTALLOCATEDV4SUBNET   V6TOTAL   V6USED   V6AVAILABLE   LASTALLOCATEDV6SUBNET
init                4       Overlay           65534     2        65532         init                    0         0        0
underlay-network1   0       Underlay          254       0        254           underlay-network1       0         0        0
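The Subnet object can be inspected the same way (hybridnet registers a Subnet CRD, as the apply output above shows):

kubectl get subnet
kubectl describe subnet underlay-network1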
# Verify

root@k8s-master:~/underlay-cases-files# kubectl  create ns myserver
namespace/myserver created
k8s-master:~/underlay-cases-files#
root@k8s-master:~/underlay-cases-files# cat 2.tomcat-app1-overlay.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    app: myserver-tomcat-app1-deployment-overlay-label
  name: myserver-tomcat-app1-deployment-overlay
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myserver-tomcat-app1-overlay-selector
  template:
    metadata:
      labels:
        app: myserver-tomcat-app1-overlay-selector
    spec:
      #nodeName: k8s-node2.example.com
      containers:
      - name: myserver-tomcat-app1-container
        #image: tomcat:7.0.93-alpine
        image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/tomcat-app1:v1
        imagePullPolicy: IfNotPresent
        ##imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP
          name: http
        env:
        - name: "password"
          value: "123456"
        - name: "age"
          value: "18"
#        resources:
#          limits:
#            cpu: 0.5
#            memory: "512Mi"
#          requests:
#            cpu: 0.5
#            memory: "512Mi"

---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: myserver-tomcat-app1-service-overlay-label
  name: myserver-tomcat-app1-service-overlay
  namespace: myserver
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
    nodePort: 30003
  selector:
    app: myserver-tomcat-app1-overlay-selector
root@k8s-master:~/underlay-cases-files# kubectl  apply -f 2.tomcat-app1-overlay.yaml
deployment.apps/myserver-tomcat-app1-deployment-overlay created
service/myserver-tomcat-app1-service-overlay created


root@k8s-master:~/underlay-cases-files# kubectl  get pod -n myserver
NAME                                                       READY   STATUS    RESTARTS   AGE
myserver-tomcat-app1-deployment-overlay-69dfff68d9-jjg45   1/1     Running   0          2m49s
root@k8s-master:~/underlay-cases-files#
root@k8s-master:~/underlay-cases-files# kubectl  get pod -n myserver -o wide
NAME                                                       READY   STATUS    RESTARTS   AGE     IP           NODE                    NOMINATED NODE   READINESS GATES
myserver-tomcat-app1-deployment-overlay-69dfff68d9-jjg45   1/1     Running   0          3m26s   10.200.0.3   k8s-node2.iclinux.com   <none>           <none>
# Note that the pod got an overlay address by default
root@k8s-master:~/underlay-cases-files# kubectl get svc  -n myserver -o wide
NAME                                   TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE    SELECTOR
myserver-tomcat-app1-service-overlay   NodePort   172.31.5.162   <none>        80:30003/TCP   5m4s   app=myserver-tomcat-app1-overlay-selector
root@k8s-master:~/underlay-cases-files#
# Verify by opening http://172.31.6.203:30003/myapp/
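The same check from the command line (any node IP with the NodePort works; 172.31.6.203 is taken from the comment above):

curl -I http://172.31.6.203:30003/myapp/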
# Create a pod on the underlay network

root@k8s-master:~/underlay-cases-files# cat 3.tomcat-app1-underlay.yaml
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
  labels:
    app: myserver-tomcat-app1-deployment-underlay-label
  name: myserver-tomcat-app1-deployment-underlay
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myserver-tomcat-app1-underlay-selector
  template:
    metadata:
      labels:
        app: myserver-tomcat-app1-underlay-selector
      annotations: # choose Underlay or Overlay networking for this pod
        networking.alibaba.com/network-type: Underlay
    spec:
      #nodeName: k8s-node2.example.com
      containers:
      - name: myserver-tomcat-app1-container
        #image: tomcat:7.0.93-alpine
        image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/tomcat-app1:v2
        imagePullPolicy: IfNotPresent
        ##imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP
          name: http
        env:
        - name: "password"
          value: "123456"
        - name: "age"
          value: "18"
#        resources:
#          limits:
#            cpu: 0.5
#            memory: "512Mi"
#          requests:
#            cpu: 0.5
#            memory: "512Mi"

---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: myserver-tomcat-app1-service-underlay-label
  name: myserver-tomcat-app1-service-underlay
  namespace: myserver
spec:
#  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
    #nodePort: 40003
  selector:
    app: myserver-tomcat-app1-underlay-selector
root@k8s-master:~/underlay-cases-files# kubectl  apply -f 3.tomcat-app1-underlay.yaml
deployment.apps/myserver-tomcat-app1-deployment-underlay created
service/myserver-tomcat-app1-service-underlay created

root@k8s-master:~/underlay-cases-files# kubectl get pod -n myserver -o wide
NAME                                                       READY   STATUS    RESTARTS   AGE   IP           NODE                    NOMINATED NODE   READINESS GATES
myserver-tomcat-app1-deployment-overlay-69dfff68d9-hgr2h   1/1     Running   0          26m   10.200.0.4   k8s-node2.iclinux.com   <none>           <none>
myserver-tomcat-app1-deployment-underlay-bd7cd59cf-nskp9   1/1     Running   0          54s   172.31.6.3   k8s-node1.iclinux.com   <none>           <none>
root@k8s-master:~/underlay-cases-files#
# Verification URL: http://172.31.6.4:8080/myapp
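Because the underlay pod holds an address from the physical subnet, it can also be reached directly by its pod IP from any machine in 172.31.0.0/21 (172.31.6.3 is the pod IP reported by the `kubectl get pod -o wide` output above):

curl -I http://172.31.6.3:8080/myapp/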
# Inspect the network on node1

root@k8s-node1:~# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:af:7b:06:5e  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.31.6.202  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:fe76:cc0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:76:cc:0a  txqueuelen 1000  (Ethernet)
        RX packets 1258413  bytes 1662158036 (1.6 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 495846  bytes 142946399 (142.9 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0.vxlan4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 172.31.6.202  netmask 255.255.248.0  broadcast 172.31.7.255
        inet6 fe80::20c:29ff:fe76:cc0a  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:76:cc:0a  txqueuelen 0  (Ethernet)
        RX packets 27  bytes 2594 (2.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 14  bytes 1394 (1.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

hybrcf33ee38b06: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 0  (Ethernet)
        RX packets 25  bytes 2961 (2.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 35  bytes 3182 (3.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 25208  bytes 2007678 (2.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 25208  bytes 2007678 (2.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@k8s-node1:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.31.0.2      0.0.0.0         UG    0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.31.0.0      0.0.0.0         255.255.248.0   U     0      0        0 eth0

# The pod address can be pinned (fixed)
# Access the pod through its service

root@k8s-node1:~# kubectl  get svc -n myserver -o wide
NAME                                    TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE   SELECTOR
myserver-tomcat-app1-service-overlay    NodePort    172.31.5.47   <none>        80:30003/TCP   36m   app=myserver-tomcat-app1-overlay-selector
myserver-tomcat-app1-service-underlay   ClusterIP   172.31.5.86   <none>        80/TCP         10m   app=myserver-tomcat-app1-underlay-selector

# The underlay SVC cannot be reached directly from outside by default; the path from the client to the SVC CIDR has to be routed. In production, have the network team add the route on the network device; in a test environment, add a local route on the test machine (see the sketch below).
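A minimal sketch for the test-machine case, assuming the client is a Linux host in 172.31.0.0/21 and one of the k8s nodes (172.31.6.202 here) is used as the next hop for the underlay service CIDR 172.31.5.0/24; in production the same route would be added on the router/switch instead:

# on the test client: route the underlay SVC CIDR via a cluster node
ip route add 172.31.5.0/24 via 172.31.6.202
# then the underlay SVC (172.31.5.86 above) is reachable directly
curl -I http://172.31.5.86/myapp/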

## Change hybridnet's default network type from Underlay to Overlay
helm upgrade hybridnet hybridnet/hybridnet -n kube-system --set defaultNetworkType=Overlay
Or edit the deployments directly:
kubectl edit deploy hybridnet-webhook -n kube-system
  env:
  - name: DEFAULT_NETWORK_TYPE
    value: Overlay
kubectl edit deploy hybridnet-manager -n kube-system
  env:
  - name: DEFAULT_NETWORK_TYPE
    value: Overlay
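After either change, the effective value can be confirmed on the deployments (assuming a single container per deployment):

kubectl -n kube-system get deploy hybridnet-manager -o jsonpath='{.spec.template.spec.containers[0].env}{"\n"}'
kubectl -n kube-system get deploy hybridnet-webhook -o jsonpath='{.spec.template.spec.containers[0].env}{"\n"}'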

3. Summarize the network communication flow of the flannel network plugin in VXLAN mode

1. The source pod sends the request. At this point the packet's source IP is the pod's eth0 IP, the source MAC is the pod's eth0 MAC, the destination IP is the destination pod's IP, and the destination MAC is the gateway's (cni0) MAC.
Capture: tcpdump -nn -vvv -i veth91d6f855 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53
2. The packet is delivered over the veth pair to the gateway cni0. cni0 sees that the destination MAC is its own and checks the destination IP: traffic for the same bridge is forwarded directly, otherwise it is handed to flannel.1, where the headers will be rewritten. At this stage the packet carries:
   source IP: pod IP, 10.100.2.2
   destination IP: pod IP, 10.100.1.6
   source MAC: source pod MAC
   destination MAC: cni0 MAC
  Capture: tcpdump -nn -vvv -i cni0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53

3. The packet reaches flannel.1, which sees the destination MAC is its own, looks up its routing/FDB tables and first builds the inner (overlay) frame: the destination MAC is rewritten to the peer flannel.1's MAC on the destination host, and the source MAC to the local host's flannel.1 MAC.
bridge fdb show dev flannel.1
Capture: tcpdump -nn -vvv -i flannel.1 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53
4. The source host encapsulates the frame in a VXLAN/UDP packet:
UDP source port: random
UDP destination port: 8472
source IP: physical NIC IP of the host running the source pod
destination IP: physical NIC IP of the host running the destination pod
source MAC: physical NIC MAC of the source pod's host
destination MAC: physical NIC MAC of the destination pod's host
Capture: tcpdump -nn -vvv -i eth0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53

5. The packet arrives at the destination host's physical NIC and is decapsulated.
   The outer destination IP is the local physical NIC; after stripping the outer headers there is another destination IP and MAC inside: the destination IP is 10.100.1.6 and the destination MAC is the local flannel.1's MAC, so the packet is handed to flannel.1.
Capture: tcpdump -nn -vvv -i eth0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53
6. The packet reaches flannel.1 on the destination host.
   flannel.1 checks the destination IP, sees that it belongs to the local cni0 subnet, and forwards the request to cni0.
   Capture: tcpdump -nn -vvv -i flannel.1 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53
   destination IP: 10.100.1.6 (destination pod)
   source IP: 10.100.2.2 (source pod)
   destination MAC: flannel.1 MAC on the destination pod's host
   source MAC: flannel.1 MAC on the source pod's host
7. The packet reaches cni0 on the destination host.
   cni0 looks up its MAC table by destination IP, rewrites the destination MAC to the destination pod's MAC, and forwards the request to the pod.
   source IP: source pod IP
   destination IP: destination pod IP
   source MAC: cni0 MAC
   destination MAC: destination pod MAC
   Capture: tcpdump -nn -vvv -i cni0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 -w 7.flannel-flannel-vxlan-cni0-in.pcap
8. The packet reaches the destination pod.
   cni0 sees the packet is for 10.100.1.6, finds the matching local bridge port in its MAC table, and delivers the packet to the pod through that port.
   destination IP: destination pod IP
   source IP: source pod IP
   destination MAC: destination pod MAC
   source MAC: cni0 MAC
   Capture: tcpdump -nn -vvv -i vethf38183ee -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 -w 8.flannel-vxlan-vethf38183ee-in.pcap
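The per-hop claims above can be double-checked on a flannel node with standard iproute2/bridge tools (interface names and MACs will differ per cluster):

ip -d link show flannel.1        # VXLAN ID, local VTEP address and UDP port 8472
ip route | grep flannel.1        # remote pod subnets routed via flannel.1
bridge fdb show dev flannel.1    # remote VTEP MAC -> remote host IP mappings
ip neigh show dev flannel.1      # ARP entries for the remote flannel.1 addresses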

 

4. Summarize the network communication flow of the calico network plugin in IPIP mode

1. The source pod sends the request; the packet leaves the pod towards the host-side interface that corresponds to the pod.
2. The packet arrives at the host-side (cali...) interface corresponding to the pod.
Capture: tcpdump -nn -vvv -i cali2b2e7c9e43e -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53
At this point the next-hop gateway is 169.254.1.1 and the destination MAC is ee:ee:ee:ee:ee:ee.
3. The packet reaches the host's tunl0 interface.
Capture: tcpdump -nn -vvv -i tunl0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53
4. The packet reaches the source host's eth0.
   Capture: # tcpdump -nn -vvv -i eth0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 and ! port 2380 and ! host 172.31.7.101 -w 3.eth0.pca
5. The packet reaches the destination host's eth0.
   What arrives is the IPinIP packet from the source host: the outer layer carries the source/destination host MACs and IPs, while the inner layer carries only the source and destination pod IPs (no inner MAC addresses are used). After decapsulation the host sees the packet is destined for 10.200.151.205.
6. The packet reaches the destination host's tunl0.
   Capture: # tcpdump -nn -vvv -i tunl0 -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 and ! port 2380 and ! host 172.31.7.101 -w 5-tunl0.pca
7. The packet reaches the host-side interface that corresponds to the destination pod.
   source IP: source pod IP
   source MAC: tunl0 MAC
   destination IP: destination pod IP
   destination MAC: destination pod MAC
   The packet is then forwarded to the destination MAC (the destination pod's MAC).
   Capture: tcpdump -nn -vvv -i cali32ecf57bfbe -vvv -nn ! port 22 and ! port 2379 and ! port 6443 and ! port 10250 and ! arp and ! port 53 and ! port 2380 and ! host 172.31.7.101 -w 6-cali32ecf57bfbe.pca
8. The packet reaches the destination pod.
   Capture: tcpdump -i eth0 -vvv -nn -w 7-dst-pod.pcap
   The destination pod accepts the request, builds the response, and the response returns to the source pod along the same path in reverse.
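The IPIP path can be checked on a calico node with standard tools (protocol number 4 is IP-in-IP, so the outer encapsulation is visible on the physical NIC):

ip -d link show tunl0            # the IPIP tunnel device used by calico
ip route | grep tunl0            # remote pod CIDRs routed via tunl0
tcpdump -nn -i eth0 ip proto 4   # shows the outer IPinIP packets between hosts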

 

 
