LVS(dr)+keepalived
LVS can do load balancing, but it has no health checking: if a real server (RS) fails, LVS will still forward requests to the failed RS and those requests will fail.
keepalived adds health checking and also makes LVS itself highly available, removing the LVS single point of failure. In fact, keepalived was originally written for LVS.
Test environment: four machines
192.168.75.61: LVS1 (BACKUP)
192.168.75.63: LVS2 (MASTER)
192.168.75.64: realserver1
192.168.75.65: realserver2
Install ipvsadm + keepalived on the two LVS nodes:
yum install -y ipvsadm keepalived    (in this article keepalived was actually compiled from source)
Install nginx on the realserver nodes:
yum install -y nginx
Configuration scripts:
Script for the two realserver nodes:
[root@VM-75-64 ~]# cat lvs_dr_rs.sh
#!/bin/bash
vip=192.168.75.55
set -x
ifconfig lo:0 $vip broadcast $vip netmask 255.255.255.255 up
route add -host $vip lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
Because a DR-mode LVS had already been configured on these machines, the lo:0 interface still exists, so bring it down manually first:
#ifconfig lo:0 down
Then run the script on both RSes.
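To double-check the result on each RS (a minimal sanity check, not part of the original procedure): the arp_ignore=1 / arp_announce=2 settings are what keep the RS from answering ARP queries for the VIP, so client traffic keeps landing on the director.
#ip addr show lo
#sysctl net.ipv4.conf.all.arp_ignore net.ipv4.conf.all.arp_announce
The first command should list 192.168.75.55/32 labelled lo:0, and the second should print 1 and 2.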
Configuration on the two LVS nodes: there is no need to configure ipvsadm by hand; everything is declared in the keepalived configuration file, which is exactly why keepalived is said to have been made for LVS.
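For reference, the virtual_server block used below replaces roughly these manual ipvsadm commands (a sketch matching rr scheduling and DR forwarding; keepalived issues the equivalent calls itself when it parses the file):
/sbin/ipvsadm -A -t 192.168.75.55:80 -s rr
/sbin/ipvsadm -a -t 192.168.75.55:80 -r 192.168.75.64:80 -g -w 1
/sbin/ipvsadm -a -t 192.168.75.55:80 -r 192.168.75.65:80 -g -w 1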
keepalived configuration files on the two nodes:
MASTER: 192.168.75.63
[root@VM-75-63 keepalived]# pwd
/usr/local/etc/keepalived
[root@VM-75-63 keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id 75_63
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.75.55
    }
}
virtual_server 192.168.75.55 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    protocol TCP
    real_server 192.168.75.64 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 10
        }
    }
    real_server 192.168.75.65 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
BACKUP:192.168.75.61
[root@VM-75-61 keepalived]# pwd
/usr/local/etc/keepalived
[root@VM-75-61 keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id 75_61
    vrrp_skip_check_adv_addr
    # vrrp_strict                # keep this commented out, otherwise the VIP cannot be pinged
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.75.55
    }
}
virtual_server 192.168.75.55 80 {       # equivalent to: /sbin/ipvsadm -A -t 192.168.75.55:80 -s rr
    delay_loop 6                        # health-check interval: poll the back ends every 6 seconds
    lb_algo rr                          # LVS scheduling algorithm
    lb_kind DR                          # LVS forwarding mode
    persistence_timeout 360             # persistence timeout (360 s here): connections from the same client within this window keep going to the same real server
    protocol TCP                        # protocol
    real_server 192.168.75.64 80 {      # equivalent to: /sbin/ipvsadm -a -t 192.168.75.55:80 -r 192.168.75.64:80 -g -w 1
        weight 1                        # weight
        TCP_CHECK {
            connect_port 80
            connect_timeout 3           # connection timeout for the health check
            nb_get_retry 3              # retry count; really an HTTP_GET checker field, usually unnecessary inside TCP_CHECK
            delay_before_retry 10       # seconds to wait before retrying (default 1 s)
        }
    }
    real_server 192.168.75.65 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
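A quick comparison of the two files: apart from the comments and the persistence_timeout line, they differ only in state (MASTER vs BACKUP), priority (150 vs 100) and router_id; virtual_router_id, the authentication block and the virtual_ipaddress must be identical on both nodes, otherwise they will not pair up as one VRRP instance.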
Once the keepalived configuration files are in place, start the keepalived processes.
Note: when starting keepalived, start the MASTER first, then the BACKUP!
[root@VM-75-63 init.d]# /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
[root@VM-75-63 init.d]# ps -ef |grep keep
root 27612 1 0 04:18 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 27613 27612 0 04:18 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 27614 27612 0 04:18 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 27619 8470 0 04:18 pts/0 00:00:00 grep keep
The processes are up; now check the interface state:
[root@VM-75-63 init.d]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:d1:74:a7 brd ff:ff:ff:ff:ff:ff
inet 192.168.75.63/24 brd 192.168.75.255 scope global eth0
inet 192.168.75.55/32 scope global eth0    # this line is the virtual IP (VIP)
inet6 fe80::20c:29ff:fed1:74a7/64 scope link
valid_lft forever preferred_lft forever
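If the VIP does not show up as expected, keepalived reports its VRRP state transitions through syslog; on a CentOS-style system (an assumption based on the yum/service commands used here) they normally end up in /var/log/messages:
#tail -f /var/log/messages | grep -i -e vrrp -e keepalived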
Start keepalived on the BACKUP node in the same way:
[root@VM-75-61 sysconfig]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:50:56:b3:00:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.75.61/24 brd 192.168.75.255 scope global eth0
inet6 fe80::250:56ff:feb3:1/64 scope link
valid_lft forever preferred_lft forever
There is no VIP here, because the VIP is on the MASTER node!
OK, now check the ipvsadm table on the MASTER node:
[root@VM-75-63 init.d]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.75.55:http rr
-> 192.168.75.64:http Route 1 0 0
-> 192.168.75.65:http Route 1 0 0
So the virtual server and the real servers never had to be configured by hand; the keepalived configuration file set them all up, which again shows how tightly the two are tied together.
Although the VIP is not on the BACKUP node, the ipvsadm configuration does exist there as well:
[root@VM-75-61 sysconfig]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.75.55:http rr
-> 192.168.75.64:http Route 1 0 0
-> 192.168.75.65:http Route 1 0 0
Now test access from a browser: http://192.168.75.55
It does not load (the connection times out). But when accessing from a Linux VM on the same subnet:
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.64
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.64
The requests succeed, and the scheduling is indeed round-robin (rr). So why can't an office PC on the same LAN reach the VIP?
The office PC is on a different subnet from the realservers, and traffic between subnets depends on correct routing. Check the routing table on a realserver:
[root@VM-75-64 ~]# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.75.55 * 255.255.255.255 UH 0 0 0 lo
192.168.75.0 * 255.255.255.0 U 0 0 0 eth1
link-local * 255.255.0.0 U 1002 0 0 eth1
default 192.168.75.55 0.0.0.0 UG 0 0 0 eth1
default 192.168.75.1 0.0.0.0 UG 0 0 0 eth1
The line "192.168.75.55 * 255.255.255.255 UH 0 0 0 lo" is the host route for the VIP, bound to the lo device, which is expected. But further down there is also a default route whose gateway is 75.55. As covered earlier, in LVS DR mode the realservers must not use the VIP as their default gateway, so delete that route:
[root@VM-75-64 ~]# route del default gw 192.168.75.55
[root@VM-75-64 ~]# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.75.55 * 255.255.255.255 UH 0 0 0 lo
192.168.75.0 * 255.255.255.0 U 0 0 0 eth1
link-local * 255.255.0.0 U 1002 0 0 eth1
default 192.168.75.1 0.0.0.0 UG 0 0 0 eth1
Now try the browser again:
This time it works. Think about why:
In DR mode the realserver returns the response packet directly to the client, here my office PC at 76.147. With the default gateway pointing at the VIP (75.55), the two could not communicate even though the host route was in place: the reply is built with the VIP on lo as its source address, goes out through eth0 (75.64), and then has to be routed back to the client, so the default gateway must be 75.1, the subnet's real gateway; otherwise 76.147 and 75.64 cannot reach each other.
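To keep the correct gateway across a reboot of the realservers (a sketch, assuming the RHEL/CentOS-style network scripts implied by the yum and service commands above), check the GATEWAY setting in the interface configuration and make sure it points at the real router rather than the VIP:
#grep -i gateway /etc/sysconfig/network-scripts/ifcfg-eth1 /etc/sysconfig/network
It should show GATEWAY=192.168.75.1, not 192.168.75.55.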
Test 1:
Manually stop the nginx process on 75.64:
[root@VM-75-64 ~]# service nginx stop
Stopping nginx: [ OK ]
Test the access again:
[root@Vm-75-60 ~]# curl http://192.168.75.55
curl: (7) couldn't connect to host
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
curl: (7) couldn't connect to host
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
curl: (7) couldn't connect to host
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
[root@Vm-75-60 ~]# curl http://192.168.75.55
75.65
At first, requests that get scheduled to 75.64 fail while the 75.65 node responds normally, but after roughly 10 seconds every request is handed to 75.65: keepalived has detected that one back-end node stopped responding and shifted all traffic to the healthy node!
Once nginx on 75.64 is started again, the realserver rejoins the pool immediately and handles requests.
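This can also be watched from the director side (not captured in the test above, but standard keepalived behaviour): run ipvsadm -ln on the MASTER while nginx on 75.64 is down and the 192.168.75.64:80 entry disappears from the virtual server; once the TCP_CHECK succeeds again, keepalived adds it back with its original weight.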
Test 2:
Now for keepalived's HA feature: kill the keepalived process on the MASTER and see whether the VIP fails over to the BACKUP node as it should:
[root@VM-75-63 keepalived]# ps -ef |grep keep
root 27612 1 0 04:18 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 27613 27612 0 04:18 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 27614 27612 0 04:18 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 27722 8470 0 06:40 pts/0 00:00:00 grep keep
[root@VM-75-63 keepalived]#
[root@VM-75-63 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:d1:74:a7 brd ff:ff:ff:ff:ff:ff
inet 192.168.75.63/24 brd 192.168.75.255 scope global eth0
inet 192.168.75.55/32 scope global eth0    # the VIP is on the MASTER
inet6 fe80::20c:29ff:fed1:74a7/64 scope link
valid_lft forever preferred_lft forever
[root@VM-75-63 keepalived]# pkill keepalived
[root@VM-75-63 keepalived]# ps -ef |grep keep
root 27726 8470 0 06:41 pts/0 00:00:00 grep keep
[root@VM-75-63 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:d1:74:a7 brd ff:ff:ff:ff:ff:ff
inet 192.168.75.63/24 brd 192.168.75.255 scope global eth0
inet6 fe80::20c:29ff:fed1:74a7/64 scope link    # the VIP is gone
valid_lft forever preferred_lft forever
The VIP is gone from the MASTER; now check the BACKUP:
[root@VM-75-61 sysconfig]# ps -fe | grep keep
root 16821 1 0 09:49 ? 00:00:00 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 16822 16821 0 09:49 ? 00:00:01 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 16823 16821 0 09:49 ? 00:00:01 /usr/local/sbin/keepalived -f /usr/local/etc/keepalived/keepalived.conf
root 17076 31656 0 15:01 pts/0 00:00:00 grep keep
[root@VM-75-61 sysconfig]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:50:56:b3:00:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.75.61/24 brd 192.168.75.255 scope global eth0
inet 192.168.75.55/32 scope global eth0
inet6 fe80::250:56ff:feb3:1/64 scope link
valid_lft forever preferred_lft forever
There it is, on 75.61! And the service is still available, so this is the active/standby high-availability setup working.
Note: early in the experiment, while testing VIP failover, I found that after killing keepalived on the MASTER the VIP did not move and stayed on the MASTER; stranger still, the VIP also showed up on the BACKUP, so both nodes held it at the same time. Why? A packet capture showed the following:
[root@VM-75-61 sysconfig]# tcpdump -i eth0 -p vrrp -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
15:05:11.078778 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:05:12.079844 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:05:13.080878 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:05:14.081893 IP 192.168.75.61 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
The VRRP advertisements are indeed being sent by 75.61, the BACKUP, i.e. the BACKUP is now the one providing service, and the service is still reachable!
After bringing keepalived on the master back up, the VIP on the backup disappears; another capture shows:
[root@VM-75-61 sysconfig]# tcpdump -i eth0 -p vrrp -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
15:05:44.108400 IP 192.168.75.63 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 20
15:05:45.109462 IP 192.168.75.63 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 20
15:05:46.110508 IP 192.168.75.63 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 1s, length 20
The master is serving again. So in that situation the VIP only appeared not to have moved; the high-availability function itself was still working. Why?
Following hints found online: I had killed the master's keepalived process with '-9', a forced kill rather than a normal shutdown. When I stopped it without '-9', the VIP moved over cleanly and the service stayed up. So be careful when testing:
Do not use 'kill -9'.
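In short: pkill keepalived (or kill without -9) delivers SIGTERM, and keepalived removes the VIP from eth0 before exiting; kill -9 delivers SIGKILL, so keepalived never gets a chance to clean up and the address stays configured, which is why both nodes ended up holding the VIP at the same time.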
That is about it for high availability of LVS in DR mode; more will be added here as new findings come up.
The LVS state can be collected through custom Zabbix monitoring items and graphed; the monitoring data mainly comes from the following commands:
[root@VM-75-63 bin]# cat /proc/net/ip_vs
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP C0A84B37:0050 rr
-> C0A84B41:0050 Route 1 0 0
-> C0A84B40:0050 Route 1 0 0
That output is not very readable, so look at it another way:
[root@VM-75-63 bin]# ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.75.55:http rr
-> 192.168.75.64:http Route 1 0 0
-> 192.168.75.65:http Route 1 0 0
This view shows the connection counts.
[root@VM-75-63 bin]# cat /proc/net/ip_vs_stats
Total Incoming Outgoing Incoming Outgoing
Conns Packets Packets Bytes Bytes
542 58DB 3E99 8C1065 344D49
Conns/s Pkts/s Pkts/s Bytes/s Bytes/s
1 7 0 214 0
The command above shows the cumulative traffic counters and the current rates; note that the totals in /proc/net/ip_vs_stats are printed in hexadecimal.
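A minimal sketch of such a custom item (the script name, item key and Zabbix wiring below are assumptions, not from the original setup; the totals are converted from hex to decimal before being handed to the agent):
#!/bin/bash
# lvs_stats.sh <conns|inpkts|outpkts|inbytes|outbytes>
# Print one cumulative LVS counter from /proc/net/ip_vs_stats in decimal,
# e.g. for a Zabbix UserParameter such as:
#   UserParameter=lvs.stats[*],/usr/local/bin/lvs_stats.sh $1
field=$1
# Line 3 of the file holds the totals: Conns InPkts OutPkts InBytes OutBytes (hexadecimal)
read -r conns inpkts outpkts inbytes outbytes < <(sed -n '3p' /proc/net/ip_vs_stats)
case "$field" in
    conns)    echo $((16#$conns)) ;;
    inpkts)   echo $((16#$inpkts)) ;;
    outpkts)  echo $((16#$outpkts)) ;;
    inbytes)  echo $((16#$inbytes)) ;;
    outbytes) echo $((16#$outbytes)) ;;
    *)        echo "usage: $0 conns|inpkts|outpkts|inbytes|outbytes" >&2; exit 1 ;;
esac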
The specifics of setting up the custom monitoring items are not covered here.
That's all; hope it helps!
