Docker Basics (6): Networking, Part 3: flannel
flannel
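flannel is a container networking solution developed by CoreOS. It assigns each host a subnet, and containers get their IPs from that subnet. These IPs are routable between hosts, so containers can communicate across hosts without NAT or port mapping.
Each subnet is carved out of a larger IP pool. flannel runs an agent called flanneld on every host, whose job is to allocate a subnet from the pool. To share information across hosts, flannel stores the network configuration, the allocated subnets, the host IPs and similar data in etcd (a distributed key-value store similar to consul).
Forwarding packets between hosts is handled by a backend.
flannel offers several backends; the most commonly used are vxlan and host-gw. For details see: https://github.com/coreos/flannel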
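Lab environment:
etcd is deployed on 10.1.1.73
flanneld runs on nas4 and nas5
Running etcd:
Releases: https://github.com/etcd-io/etcd/releases
Option 1: Linux
The flanneld release used here (v0.10.0) does not support etcd v3, so etcd v2 (v2.3.7) is used
Test etcd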
# start a local etcd server
etcd -listen-client-urls http://10.1.1.73:2379 -advertise-client-urls http://10.1.1.73:2379
# write to and read from etcd
etcdctl --endpoints=10.1.1.73:2379 set foo bar
etcdctl --endpoints=10.1.1.73:2379 get foo
Register etcd with systemd
vi /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/usr/local/src/etcd-v2.3.7-linux-amd64
EnvironmentFile=-/usr/local/src/etcd-v2.3.7-linux-amd64/etcd.conf
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "/usr/local/src/etcd-v2.3.7-linux-amd64/etcd \
--name infra0 \
--data-dir=/var/lib/etcd/ \
--initial-advertise-peer-urls 'http://10.1.1.73:2380' \
--listen-client-urls 'http://10.1.1.73:2379','http://127.0.0.1:2379' \
--listen-peer-urls 'http://10.1.1.73:2380' \
--advertise-client-urls 'http://10.1.1.73:2379' \
--initial-cluster-token etcd-cluster1 \
--initial-cluster infra0='http://10.1.1.73:2380' \
--initial-cluster-state new"
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
systemctl start etcd
systemctl enable etcd
flannel vxlan backend
Import the flannel configuration into etcd
{
"Network": "10.5.0.0/16",
"SubnetLen": 24,
"Backend": {
"Type": "vxlan"
}
}
etcdctl --endpoints=10.1.1.73:2379 set /coreos.com/network/config '{"Network": "10.5.0.0/16","SubnetLen": 24,"Backend": {"Type": "vxlan"}}'
etcdctl --endpoints=10.1.1.73:2379 get /coreos.com/network/config
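A stray comma or a missing quote in the JSON value is easy to miss in a one-line etcdctl set command, and flanneld only reports it later when it reads the key. The sketch below checks the string before it is written. The helper name validate_flannel_config and its required-key check are assumptions made for illustration, not part of flannel or etcdctl.

```python
import json


def validate_flannel_config(raw: str) -> dict:
    """Parse a flannel config string and check the keys used in this article.

    Hypothetical helper: the required-key list is an assumption of this sketch.
    """
    cfg = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    for key in ("Network", "SubnetLen", "Backend"):
        if key not in cfg:
            raise ValueError(f"missing key: {key}")
    return cfg


good = '{"Network": "10.5.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
# Same value with a doubled comma, which is not valid JSON.
bad = '{"Network": "10.5.0.0/16", "SubnetLen": 24, , "Backend": {"Type": "vxlan"}}'

print(validate_flannel_config(good)["Backend"]["Type"])
try:
    validate_flannel_config(bad)
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```

Running a check like this before etcdctl set keeps a malformed value out of /coreos.com/network/config in the first place.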
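1. Network defines the IP pool of the flannel network
2. SubnetLen sets the prefix length of the subnet assigned to each host to 24 bits (/24)
3. Backend is vxlan, meaning hosts communicate with each other over vxlan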
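To make the relationship between Network and SubnetLen concrete, here is a minimal sketch using Python's standard ipaddress module. It only enumerates the /24 blocks contained in the /16 pool; how flanneld actually picks one of them is not modelled here.

```python
import ipaddress

pool = ipaddress.ip_network("10.5.0.0/16")   # "Network"
subnet_len = 24                               # "SubnetLen"

# Every /24 block that can be carved out of the pool.
subnets = list(pool.subnets(new_prefix=subnet_len))

print(len(subnets))               # how many per-host subnets the pool can hold
print(subnets[4])                 # one candidate block
print(subnets[4].num_addresses)   # addresses available inside one block
```

With these values the pool holds 256 blocks of 256 addresses each, so SubnetLen trades the number of hosts against the number of containers per host.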
Download the flannel binary
wget https://github.com/coreos/flannel/releases/download/v0.10.0/flanneld-amd64 && chmod +x flanneld-amd64
flanneld options
-etcd-endpoints: the etcd URL
-iface: the interface used for traffic between hosts
-etcd-prefix: the etcd key prefix under which flannel stores its network configuration
Start flanneld
nas4
flanneld -etcd-endpoints=http://10.1.1.73:2379 -iface=ens33 -etcd-prefix=/coreos.com/network &
[1] 9335
[root@nas4 ~]# I0822 17:44:49.810596 9335 main.go:488] Using interface with name ens33 and address 10.1.1.14
I0822 17:44:49.810733 9335 main.go:505] Defaulting external address to interface address (10.1.1.14) # interface selected for communication with other hosts
I0822 17:44:49.810838 9335 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: None
I0822 17:44:49.810845 9335 main.go:238] Installing signal handlers
I0822 17:44:49.814651 9335 main.go:353] Found network config - Backend type: vxlan
I0822 17:44:49.814718 9335 vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
I0822 17:44:49.880898 9335 local_manager.go:234] Picking subnet in range 10.5.1.0 ... 10.5.255.0 # the flannel address pool is identified
I0822 17:44:49.884425 9335 local_manager.go:220] Allocated lease (10.5.4.0/24) to current node (10.1.1.14) # subnet 10.5.4.0/24 is allocated to this host
I0822 17:44:49.885385 9335 main.go:300] Wrote subnet file to /run/flannel/subnet.env
I0822 17:44:49.885402 9335 main.go:304] Running backend.
I0822 17:44:49.899071 9335 vxlan_network.go:60] watching for new subnet leases
I0822 17:44:49.900595 9335 main.go:396] Waiting for 46h59m58.780046466s to renew lease
I0822 17:44:49.922422 9335 iptables.go:115] Some iptables rules are missing; deleting and recreating rules
I0822 17:44:49.922442 9335 iptables.go:137] Deleting iptables rule: -s 10.5.0.0/16 -j ACCEPT
I0822 17:44:49.923929 9335 iptables.go:137] Deleting iptables rule: -d 10.5.0.0/16 -j ACCEPT
I0822 17:44:49.925485 9335 iptables.go:125] Adding iptables rule: -s 10.5.0.0/16 -j ACCEPT
I0822 17:44:49.934799 9335 iptables.go:125] Adding iptables rule: -d 10.5.0.0/16 -j ACCEPT
A new interface, flannel.1, is created and configured with the first address of the subnet (10.5.4.0/32)
[root@nas4 ~]# ip a show flannel.1
12: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 36:cf:d1:b6:90:19 brd ff:ff:ff:ff:ff:ff
inet 10.5.4.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::34cf:d1ff:feb6:9019/64 scope link
valid_lft forever preferred_lft forever
nas5
flanneld -etcd-endpoints=http://10.1.1.73:2379 -iface=ens33 -etcd-prefix=/coreos.com/network &
[1] 8431
You have new mail in /var/spool/mail/root
[root@nas5 ~]# I0822 17:48:37.488463 8431 main.go:488] Using interface with name ens33 and address 10.1.1.15
I0822 17:48:37.488567 8431 main.go:505] Defaulting external address to interface address (10.1.1.15)
I0822 17:48:37.488741 8431 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: None
I0822 17:48:37.488751 8431 main.go:238] Installing signal handlers
I0822 17:48:37.492385 8431 main.go:353] Found network config - Backend type: vxlan
I0822 17:48:37.492456 8431 vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
I0822 17:48:37.527213 8431 local_manager.go:234] Picking subnet in range 10.5.1.0 ... 10.5.255.0
I0822 17:48:37.531621 8431 local_manager.go:220] Allocated lease (10.5.56.0/24) to current node (10.1.1.15)
I0822 17:48:37.532732 8431 main.go:300] Wrote subnet file to /run/flannel/subnet.env
I0822 17:48:37.532750 8431 main.go:304] Running backend.
I0822 17:48:37.546585 8431 vxlan_network.go:60] watching for new subnet leases
I0822 17:48:37.547657 8431 iptables.go:115] Some iptables rules are missing; deleting and recreating rules
I0822 17:48:37.547679 8431 iptables.go:137] Deleting iptables rule: -s 10.5.0.0/16 -j ACCEPT
I0822 17:48:37.554245 8431 iptables.go:137] Deleting iptables rule: -d 10.5.0.0/16 -j ACCEPT
I0822 17:48:37.557171 8431 iptables.go:125] Adding iptables rule: -s 10.5.0.0/16 -j ACCEPT
I0822 17:48:37.566648 8431 main.go:396] Waiting for 46h59m59.274471105s to renew lease
I0822 17:48:37.567421 8431 iptables.go:125] Adding iptables rule: -d 10.5.0.0/16 -j ACCEPT
[root@nas5 ~]# ip a show flannel.1
8: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 1e:c8:f2:71:60:dc brd ff:ff:ff:ff:ff:ff
inet 10.5.56.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::1cc8:f2ff:fe71:60dc/64 scope link
valid_lft forever preferred_lft forever
When the second host joins the flannel network, each host adds a route to the other host's subnet
[root@nas4 ~]# route -n |grep flannel.1
10.5.56.0 10.5.56.0 255.255.255.0 UG 0 0 0 flannel.1
[root@nas5 ~]# route -n |grep flannel.1
10.5.4.0 10.5.4.0 255.255.255.0 UG 0 0 0 flannel.1
Using the flannel network in docker
Configure docker to connect to flannel
In the Docker unit file /usr/lib/systemd/system/docker.service, set --bip and --mtu. Note: the values must match those in /run/flannel/subnet.env
[root@nas4 ~]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.5.0.0/16
FLANNEL_SUBNET=10.5.4.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
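Copying the values by hand is error-prone, so the mapping from subnet.env to the two dockerd flags can be sketched as follows. The function names parse_subnet_env and docker_flags are hypothetical helpers written for this example; the sample text mirrors the file contents shown above.

```python
def parse_subnet_env(text: str) -> dict:
    """Turn KEY=VALUE lines (as in /run/flannel/subnet.env) into a dict."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and anything that is not KEY=VALUE
        key, value = line.split("=", 1)
        env[key] = value
    return env


def docker_flags(env: dict) -> str:
    """Build the dockerd flags that must agree with flannel's lease."""
    return f"--bip={env['FLANNEL_SUBNET']} --mtu={env['FLANNEL_MTU']}"


sample = """FLANNEL_NETWORK=10.5.0.0/16
FLANNEL_SUBNET=10.5.4.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
"""

print(docker_flags(parse_subnet_env(sample)))
```

On a real host the text would be read from /run/flannel/subnet.env instead of the inline sample.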
docker-18
[root@master ~]# cat /usr/lib/systemd/system/docker.service |grep -A 1 ExecStart
ExecStart=/usr/bin/dockerd-current \
--bip=10.5.4.1/24 --mtu=1450
docker-19
[root@nas4 ~]# cat /usr/lib/systemd/system/docker.service |grep "^ExecStart"
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://0.0.0.0:2376 --containerd=/run/containerd/containerd.sock --bip=10.5.4.1/24 --mtu=1450
systemctl daemon-reload
systemctl restart docker
[root@nas4 ~]# ip a |grep docker0
7: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
inet 10.5.4.1/24 brd 10.5.4.255 scope global docker0
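10.5.4.0/24 is now configured on the Linux bridge docker0, and a route for it is added
[root@nas4 ~]# route -n |grep '10.5.'
10.5.4.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.5.56.0 10.5.56.0 255.255.255.0 UG 0 0 0 flannel.1
nas5 is configured the same way:
--bip=10.5.56.1/24 --mtu=1450
Conclusion:
flannel does not create a new docker network; it uses the default bridge network directly. Containers on the same host are connected through docker0, and cross-host traffic is forwarded through flannel.1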
Connecting containers to the flannel network
Run the containers
[root@nas4 ~]# docker run -itd --name bbox1 busybox
[root@nas5 ~]# docker run -itd --name bbox2 busybox
Issue encountered:
This issue occurs with docker-18 but not with docker-19
The error is:
/usr/bin/docker-current: Error response from daemon: shim error: docker-runc not installed on system.
Fix
[root@master ~]# cd /usr/libexec/docker/
[root@master docker]# ls
docker-init-current docker-proxy-current docker-runc-current
[root@master docker]# ln -s docker-runc-current docker-runc
[root@nas4 ~]# docker exec bbox1 ip a
9: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 02:42:0a:05:04:02 brd ff:ff:ff:ff:ff:ff
inet 10.5.4.2/24 brd 10.5.4.255 scope global eth0
valid_lft forever preferred_lft forever
[root@nas5 ~]# docker exec bbox2 ip a
9: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 02:42:0a:05:38:02 brd ff:ff:ff:ff:ff:ff
inet 10.5.56.2/24 brd 10.5.56.255 scope global eth0
valid_lft forever preferred_lft forever
Test connectivity between bbox1 and bbox2
[root@nas4 ~]# docker exec bbox1 ping 10.5.56.2 -c 2
PING 10.5.56.2 (10.5.56.2): 56 data bytes
64 bytes from 10.5.56.2: seq=0 ttl=62 time=6.502 ms
64 bytes from 10.5.56.2: seq=1 ttl=62 time=0.538 ms
--- 10.5.56.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.538/3.520/6.502 ms
[root@nas4 ~]# docker exec bbox1 traceroute 10.5.56.2
traceroute to 10.5.56.2 (10.5.56.2), 30 hops max, 46 byte packets
1 10.5.4.1 (10.5.4.1) 0.015 ms 0.012 ms 0.008 ms
2 10.5.56.0 (10.5.56.0) 0.841 ms 0.907 ms 1.240 ms
3 10.5.56.2 (10.5.56.2) 1.015 ms 0.851 ms 0.387 ms
bbox1 and bbox2 are not in the same subnet, so the packet is sent to the default gateway 10.5.4.1 (docker0)
[root@nas4 ~]# ip a |grep -E "docker0|flannel.1"
7: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
inet 10.5.4.1/24 brd 10.5.4.255 scope global docker0
8: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
inet 10.5.4.0/32 scope global flannel.1
10: veth04d665a@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master docker0 state UP group default
[root@nas5 ~]# ip a |grep -E "docker0|flannel.1"
7: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
inet 10.5.56.1/24 brd 10.5.56.255 scope global docker0
8: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
inet 10.5.56.0/32 scope global flannel.1
10: veth089168a@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master docker0 state UP group default
According to the routing table on nas4 (below), the packet is handed to flannel.1
[root@nas4 ~]# route -n |grep -E "docker0|flannel.1"
10.5.4.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.5.56.0 10.5.56.0 255.255.255.0 UG 0 0 0 flannel.1
flannel.1 encapsulates the packet in vxlan and sends it to nas5 through ens33
nas5 receives the packet and decapsulates it on flannel.1. The destination address is 10.5.56.2, so according to the routing table (below) the packet is forwarded to docker0 and reaches bbox2
[root@nas5 ~]# route -n |grep -E "docker0|flannel.1"
10.5.4.0 10.5.4.0 255.255.255.0 UG 0 0 0 flannel.1
10.5.56.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
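The forwarding decision described above is an ordinary longest-prefix match on the host routing table. The following sketch replays it for nas4 with the two routes shown in this section; the lookup function and the NAS4_ROUTES table are illustrative constructs of this example, not something flannel or the kernel exposes.

```python
import ipaddress

# (destination, gateway, interface) taken from the nas4 `route -n` output.
NAS4_ROUTES = [
    ("10.5.4.0/24", None, "docker0"),
    ("10.5.56.0/24", "10.5.56.0", "flannel.1"),
]


def lookup(routes, dst):
    """Return (network, gateway, iface) of the most specific matching route, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for cidr, gateway, iface in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    return best


# bbox2 lives on nas5, so the packet leaves nas4 through the vxlan device.
print(lookup(NAS4_ROUTES, "10.5.56.2")[2])
# A container on nas4 itself is reached through the local bridge.
print(lookup(NAS4_ROUTES, "10.5.4.2")[2])
```

The same function applied to the nas5 table gives the mirror image: 10.5.56.2 resolves to docker0 and 10.5.4.2 to flannel.1.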
flannel provides no DNS service, so containers cannot reach each other by hostname
[root@nas4 ~]# docker exec bbox1 ping bbox2 -c 2
ping: bad address 'bbox2'
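flannel network isolation
flannel assigns each host its own subnet, but flannel.1 connects these subnets so that they can route to one another. In effect, flannel joins the otherwise independent docker0 container networks on each host into one large interconnected network. This enables cross-host container communication, but flannel provides no isolation
flannel and external connectivity
Because the flannel network relies on the default bridge network, containers reach the outside world the same way as on a bridge network:
1. Containers access external networks through NAT on docker0
2. External hosts access containers through port mappings on the host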
flannel host-gw backend
{
"Network": "10.5.0.0/16",
"SubnetLen": 24,
"Backend": {
"Type": "host-gw"
}
}
Replace the previous vxlan Type with host-gw and update etcd
etcdctl --endpoints=10.1.1.73:2379 set /coreos.com/network/config \
'{"Network": "10.5.0.0/16","SubnetLen": 24,"Backend": {"Type": "host-gw"}}'
etcdctl --endpoints=10.1.1.73:2379 get /coreos.com/network/config
Stop the previous flanneld processes on nas4 and nas5, then start flanneld again with the same command
flanneld -etcd-endpoints=http://10.1.1.73:2379 -iface=ens33 -etcd-prefix=/coreos.com/network
nas4 output
I0823 15:22:57.942519 17217 main.go:488] Using interface with name ens33 and address 10.1.1.14
I0823 15:22:57.942683 17217 main.go:505] Defaulting external address to interface address (10.1.1.14)
I0823 15:22:57.943011 17217 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: 10.5.4.0/24
I0823 15:22:57.943022 17217 main.go:238] Installing signal handlers
I0823 15:22:57.948033 17217 main.go:353] Found network config - Backend type: host-gw
I0823 15:22:57.953155 17217 local_manager.go:147] Found lease (10.5.4.0/24) for current IP (10.1.1.14), reusing # flanneld finds the previously allocated subnet 10.5.4.0/24 and reuses it
I0823 15:22:57.956278 17217 main.go:300] Wrote subnet file to /run/flannel/subnet.env
I0823 15:22:57.956297 17217 main.go:304] Running backend.
I0823 15:22:57.963752 17217 route_network.go:53] Watching for new subnet leases
I0823 15:22:57.971127 17217 main.go:396] Waiting for 46h59m59.494233086s to renew lease
I0823 15:22:57.971630 17217 route_network.go:85] Subnet added: 10.5.56.0/24 via 10.1.1.15
W0823 15:22:57.971658 17217 route_network.go:88] Ignoring non-host-gw subnet: type=vxlan # flanneld finds nas5's subnet in etcd, but its lease is still type=vxlan, so it is ignored
I0823 15:23:51.781100 17217 route_network.go:85] Subnet added: 10.5.56.0/24 via 10.1.1.15
W0823 15:23:51.781218 17217 route_network.go:102] Replacing existing route to 10.5.56.0/24 via 10.5.56.0 dev index 8 with 10.5.56.0/24 via 10.1.1.15 dev index 2. # once flanneld is restarted on nas5, its subnet is discovered and added to the routing table
nas5 output
I0823 15:23:52.700393 16685 main.go:488] Using interface with name ens33 and address 10.1.1.15
I0823 15:23:52.700521 16685 main.go:505] Defaulting external address to interface address (10.1.1.15)
I0823 15:23:52.700785 16685 main.go:235] Created subnet manager: Etcd Local Manager with Previous Subnet: 10.5.56.0/24
I0823 15:23:52.700795 16685 main.go:238] Installing signal handlers
I0823 15:23:52.704843 16685 main.go:353] Found network config - Backend type: host-gw
I0823 15:23:52.707794 16685 local_manager.go:147] Found lease (10.5.56.0/24) for current IP (10.1.1.15), reusing
I0823 15:23:52.709589 16685 main.go:300] Wrote subnet file to /run/flannel/subnet.env
I0823 15:23:52.709614 16685 main.go:304] Running backend.
I0823 15:23:52.720002 16685 route_network.go:53] Watching for new subnet leases
I0823 15:23:52.720829 16685 main.go:396] Waiting for 46h59m58.568489093s to renew lease
I0823 15:23:52.723572 16685 route_network.go:85] Subnet added: 10.5.4.0/24 via 10.1.1.14
W0823 15:23:52.723812 16685 route_network.go:102] Replacing existing route to 10.5.4.0/24 via 10.5.4.0 dev index 8 with 10.5.4.0/24 via 10.1.1.14 dev index 2.
In the Docker unit file /usr/lib/systemd/system/docker.service, set --bip and --mtu. Note: the values must match those in /run/flannel/subnet.env
[root@nas4 ~]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.5.0.0/16
FLANNEL_SUBNET=10.5.4.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
This differs from the vxlan value MTU=1450, so the docker startup option must be changed to --mtu=1500 and the docker daemon restarted
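The 50-byte difference between the two MTU values is the per-packet overhead of VXLAN encapsulation over IPv4. The sketch below spells out that arithmetic; the header sizes are the standard ones, while the 1500-byte underlay MTU of ens33 is an assumption consistent with the host-gw value shown above.

```python
# Bytes added in front of each container packet by VXLAN over IPv4.
OUTER_IPV4 = 20       # outer IP header
OUTER_UDP = 8         # outer UDP header
VXLAN_HEADER = 8      # VXLAN header carrying the VNI
INNER_ETHERNET = 14   # inner Ethernet header of the encapsulated frame


def vxlan_mtu(underlay_mtu: int) -> int:
    """MTU left for container traffic after VXLAN encapsulation."""
    overhead = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET
    return underlay_mtu - overhead


print(vxlan_mtu(1500))  # value to expect in FLANNEL_MTU with the vxlan backend
```

host-gw sends container packets unencapsulated, so nothing is subtracted and FLANNEL_MTU equals the MTU of the physical interface.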
[root@nas4 ~]# cat /usr/lib/systemd/system/docker.service |grep "^ExecStart"
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://0.0.0.0:2376 --containerd=/run/containerd/containerd.sock --bip=10.5.4.1/24 --mtu=1500
[root@nas4 ~]# systemctl daemon-reload
[root@nas4 ~]# systemctl restart docker
[root@nas4 ~]# route -n |grep "10.5"
10.5.4.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
10.5.56.0 10.1.1.15 255.255.255.0 UG 0 0 0 ens33
nas5 is similar
[root@nas5 ~]# route -n |grep "10.5"
10.5.4.0 10.1.1.14 255.255.255.0 UG 0 0 0 ens33
10.5.56.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
Conclusion: with host-gw, the gateway for each remote subnet is the IP of the host that owns it
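What host-gw does on each host can be approximated as: for every lease that belongs to another host and was published by the host-gw backend, install a route to that subnet via the owning host's IP. The sketch below reproduces this with the two leases from this lab. The Lease class and host_gw_routes function are hypothetical names for illustration and do not reflect flannel's internal code; the function only prints the equivalent ip route commands.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    subnet: str        # subnet leased to a host, e.g. "10.5.56.0/24"
    public_ip: str     # IP of the host that owns the lease
    backend_type: str  # backend that published the lease


def host_gw_routes(leases, local_ip, iface="ens33"):
    """Build the route commands a host would need for all remote host-gw leases."""
    cmds = []
    for lease in leases:
        if lease.public_ip == local_ip:
            continue  # the local subnet is reached through docker0
        if lease.backend_type != "host-gw":
            continue  # comparable to the "Ignoring non-host-gw subnet" warning
        cmds.append(f"ip route replace {lease.subnet} via {lease.public_ip} dev {iface}")
    return cmds


leases = [
    Lease("10.5.4.0/24", "10.1.1.14", "host-gw"),
    Lease("10.5.56.0/24", "10.1.1.15", "host-gw"),
]

for cmd in host_gw_routes(leases, local_ip="10.1.1.14"):
    print(cmd)
```

Called with local_ip="10.1.1.15" the same function yields the route to 10.5.4.0/24 via 10.1.1.14, matching the nas5 routing table.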
Test
[root@nas4 ~]# docker run -itd --name bbox1 busybox
[root@nas4 ~]# docker exec bbox1 ip a
17: eth0@if18: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether 02:42:0a:05:04:02 brd ff:ff:ff:ff:ff:ff
inet 10.5.4.2/24 brd 10.5.4.255 scope global eth0
valid_lft forever preferred_lft forever
[root@nas4 ~]# docker exec bbox1 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.5.4.1 0.0.0.0 UG 0 0 0 eth0
10.5.4.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
[root@nas5 ~]# docker run -itd --name bbox2 busybox
[root@nas5 ~]# docker exec bbox2 ip a
11: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 02:42:0a:05:38:02 brd ff:ff:ff:ff:ff:ff
inet 10.5.56.2/24 brd 10.5.56.255 scope global eth0
valid_lft forever preferred_lft forever
[root@nas5 ~]# docker exec bbox2 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.5.56.1 0.0.0.0 UG 0 0 0 eth0
10.5.56.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
[root@nas4 ~]# docker exec bbox1 ping 10.5.56.2 -c 2
PING 10.5.56.2 (10.5.56.2): 56 data bytes
64 bytes from 10.5.56.2: seq=0 ttl=62 time=4.430 ms
64 bytes from 10.5.56.2: seq=1 ttl=62 time=0.992 ms
--- 10.5.56.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.992/2.711/4.430 ms
[root@nas4 ~]# docker exec bbox1 traceroute 10.5.56.2
traceroute to 10.5.56.2 (10.5.56.2), 30 hops max, 46 byte packets
1 10.5.4.1 (10.5.4.1) 0.014 ms 0.013 ms 0.009 ms
2 10.1.1.15 (10.1.1.15) 0.848 ms 1.140 ms 0.832 ms
3 10.5.56.2 (10.5.56.2) 0.761 ms 0.458 ms 2.561 ms
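Comparing host-gw and vxlan
host-gw configures every host as a gateway: each host knows the subnets of the other hosts and the address to forward to. vxlan instead builds tunnels between hosts, and containers on different hosts all sit within one large network (for example 10.5.0.0/16)
Although vxlan and host-gw use different mechanisms to connect the hosts, nothing changes from the container's point of view: bbox1 can still communicate with bbox2
Because vxlan has to encapsulate and decapsulate every packet, its performance is slightly lower than that of host-gw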
