Introduction to keepalived & a case study: a highly available httpd cluster with keepalived
keepalived notes:
Introduction to keepalived:
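Cluster: a group of independent computers connected by a high-speed network to form a single computer system. Each cluster node (that is, each computer in the cluster) is an independent server running its own processes.
keepalived is a service used in cluster management to keep a cluster highly available. Its role is similar to that of heartbeat: it prevents single points of failure.
Keepalived is a lightweight high-availability solution for Linux. It offers functionality similar to HACMP and RoseHA, all of which can make services or networks highly available, but there are differences. HACMP is a professional, full-featured HA product that provides the basic functions an HA package needs, such as heartbeat detection and resource takeover, monitoring of system services in the cluster, and moving ownership of a shared IP address between cluster nodes. HACMP is powerful, but it is comparatively difficult to deploy and use, and it is commercial software. Keepalived, by contrast, achieves high availability mainly through virtual router redundancy. It is less powerful than HACMP, but it is very simple to deploy and use: everything is configured in a single configuration file.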
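What keepalived is used for:
Keepalived was originally designed for LVS, specifically to monitor the state of each service node in a cluster. It checks each node with health checks at layers 3, 4 and 5. If a node behaves abnormally or fails, Keepalived detects this and removes the node from the cluster; once the node recovers, Keepalived automatically adds it back. All of this happens without human intervention; the only manual work is repairing the failed node.
VRRP support was added to Keepalived later. VRRP stands for Virtual Router Redundancy Protocol; it was created to remove the single point of failure that comes with static routing, so that the network keeps running stably and without interruption. Keepalived therefore provides server health checking and fault isolation on the one hand, and HA cluster functionality on the other.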
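The VRRP protocol and how it works:
In real networks, hosts usually reach each other through a statically configured route (the default gateway). If the router between the hosts fails, communication fails, so in this model the router is a single point of failure. VRRP was introduced to solve this problem.
VRRP is an active/standby protocol. When a device fails, it switches to another device transparently, without affecting data traffic between hosts. Two concepts are involved: physical routers and the virtual router.
VRRP groups two or more physical routers into one virtual router, which serves clients through one or more virtual IPs. Inside the virtual router the physical routers cooperate, but only one of them serves traffic at any given time; it is called the master router (MASTER role). The MASTER is normally chosen by an election algorithm. It owns the virtual IP and provides the network functions, such as answering ARP requests, ICMP, and forwarding data. The other physical routers do not own the virtual IP and provide no external network functions; they only receive the VRRP advertisements sent by the MASTER. These are the backup routers (BACKUP role). When the master router fails, the backup routers hold a new election and the winner becomes MASTER and continues the service. The whole switchover is transparent to users.
Within a virtual router, only the MASTER keeps sending VRRP packets. A BACKUP only receives the MASTER's advertisements and uses them to monitor the MASTER's health, so a BACKUP does not preempt the MASTER unless its own priority is higher. When the MASTER becomes unavailable, the BACKUPs stop receiving its advertisements and conclude that it has failed. They then hold an election, and the BACKUP with the highest priority becomes the new MASTER. The election and role switch are very fast, which keeps the service continuously available.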
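The election rule described above can be sketched in shell. This only illustrates "the highest priority wins"; it is not keepalived's implementation. The function name `elect_master` and the node names are hypothetical, and ties (which VRRP breaks by comparing IP addresses) are ignored.

```shell
# Hypothetical helper: reads "name priority" pairs, one per line,
# and prints the name of the node with the highest priority.
elect_master() {
    awk 'NR == 1 || $2 + 0 > best { best = $2 + 0; name = $1 } END { print name }'
}

# Example: three backups compete after the master has disappeared.
printf '%s\n' 'node-a 90' 'node-b 100' 'node-c 80' | elect_master
```

With the sample input the function prints `node-b`, the node with priority 100.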
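Installing keepalived:
[root@crmn ~]# yum install -y keepalived
[root@crmn ~]# yum install -y ipvsadm # also needed if the LVS functionality is used (ipvsadm is the IPVS administration tool)
[root@crmn ~]# /etc/init.d/keepalived start
Configuring keepalived:
[root@crmn ~]# vim /etc/keepalived/keepalived.conf # all Keepalived settings live in this one file
[root@crmn ~]# vim /etc/init.d/keepalived
By function, the Keepalived configuration falls into three categories: global configuration, VRRPD configuration, and LVS configuration.
(1) keepalived global configuration:
The global configuration applies to Keepalived as a whole. The configuration file is organized into blocks, each enclosed in {}; lines beginning with # or ! are comments.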
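! Configuration File for keepalived
global_defs { # the global section is introduced by "global_defs"; every option inside it is global
notification_email { # addresses that receive alert mail, one per line; per these notes, the local sendmail service must be running before mail alerts are enabled
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc # sender address of the alert mail
smtp_server 192.168.200.1 # address of the SMTP server
smtp_connect_timeout 30 # timeout, in seconds, for connecting to the SMTP server
router_id LVS_DEVEL # an identifier for the machine running keepalived; it appears in the subject of alert mail
}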
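To see only the effective settings of a configuration file, the comment rule above can be applied with a small filter. This is a rough sketch: the function name `strip_conf_comments` is hypothetical, it treats everything from the first `#` or `!` on a line as a comment, and it does not handle those characters inside quoted strings.

```shell
# Hypothetical helper: drop comments (from "#" or "!" to end of line),
# trailing whitespace and empty lines from a keepalived-style config on stdin.
strip_conf_comments() {
    sed -e 's/[#!].*$//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Example with two lines in the style of the file above.
printf '%s\n' '! Configuration File for keepalived' 'router_id LVS_DEVEL # node id' \
    | strip_conf_comments
```

On a real system it could be used as `strip_conf_comments < /etc/keepalived/keepalived.conf`.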
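(2) keepalived VRRPD configuration parameters explained:
The VRRPD configuration is the core of the whole keepalived configuration; it implements keepalived's high-availability function.
A VRRP instance block configures the node role (master or backup), the network interface the instance binds to, the authentication between nodes, the cluster service IPs, and so on. For example: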
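vrrp_instance VI_1 { # vrrp_instance opens a VRRP instance and is followed by the instance name
state MASTER # initial role of this node: MASTER for the primary server, BACKUP for a standby server
interface eth0 # network interface the instance binds to and monitors, i.e. the NIC used by the application
virtual_router_id 51 # virtual router ID; it identifies the VRRP instance (the cluster), so MASTER and BACKUP of the same vrrp_instance must use the same value
priority 100 # node priority; a larger number means a higher priority (within one vrrp_instance the MASTER's priority must be higher than the BACKUP's)
advert_int 1 # interval, in seconds, between the VRRP advertisements exchanged between MASTER and BACKUP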
# ---- Optional: extra network interfaces to monitor; if any one of them fails, keepalived enters the FAULT state.
track_interface {
eth0
eth1
}
# ----
authentication { # authentication type and password used between nodes; the types are PASS and AH, and MASTER and BACKUP of one vrrp_instance must use the same password to communicate
auth_type PASS
auth_pass 1111
}
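virtual_ipaddress {
# Virtual IP addresses (VIPs), also called floating IPs. Several can be listed, one per line.
# They are called floating because keepalived adds them to the system when it switches to the MASTER state and removes them again when it switches to BACKUP. The effect is the same as running "ip address add"; use "ip addr" to see the VIPs currently configured.
# Entries in the "virtual_ipaddress" block can take several forms, e.g. "192.168.16.189/24 dev eth1", which has the effect of "ip addr add 192.168.16.189/24 dev eth1". The syntax here therefore follows the rules of the ip command.
192.168.200.16
192.168.200.17 dev eth1 # bind to eth1
192.168.200.18 dev eth2 # bind to eth2
}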
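# ----
nopreempt # disables preemption. With it set, a recovered primary node does not take the service back; the service stays on the standby node until that node itself fails.
# In an HA cluster, when the primary node goes down the standby takes over, and when the primary comes back it normally takes the service back automatically. Such back-and-forth switching is acceptable for systems with modest real-time and stability requirements. For systems with strict requirements it is not recommended, because every switchover carries some risk and instability; this option is meant for that case.
# Note: nopreempt can only be set on a node whose "state" is BACKUP, and that node must have a higher priority than the others. (Both nodes are configured as BACKUP; the one with the higher priority acts as primary.)
preempt_delay 300 # delay, in seconds, before a higher-priority node preempts the current MASTER after it starts or recovers; it has no effect when nopreempt is set
# ----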
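notify_master "/etc/keepalived/master.sh"
# Script run when keepalived enters the MASTER state. It can be an alerting script or a service-management script; keepalived allows arguments to be passed to the script, which makes this very flexible.
notify_backup "/etc/keepalived/backup.sh"
# Script run when keepalived enters the BACKUP state; again an alerting or service-management script.
notify_fault "/etc/keepalived/fault.sh"
# Script run when keepalived enters the FAULT state; same purpose as the two above.
notify_stop "/etc/keepalived/stop.sh"
# Script run when the keepalived process terminates.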
}
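Because keepalived passes arguments to notify scripts, one script can stand in for separate master, backup and fault scripts. The sketch below assumes it is registered with the generic `notify` keyword, in which case keepalived calls it with the type (`INSTANCE` or `GROUP`), the instance name and the new state as its first three arguments. The path `/etc/keepalived/notify.sh`, the function name `log_state` and the log line format are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical /etc/keepalived/notify.sh: log every state change to one file.
LOGFILE=${LOGFILE:-/var/log/keepalived-httpd-state.log}

log_state() {
    # $1 = "INSTANCE" or "GROUP", $2 = instance or group name, $3 = new state
    printf '[%s] %s %s %s\n' "$3" "$1" "$2" "$(date '+%F %T')" >> "$LOGFILE"
}

# Only log when the three arguments are actually present.
if [ "$#" -ge 3 ]; then
    log_state "$@"
fi
```

Inside a vrrp_instance block it would be referenced as `notify "/etc/keepalived/notify.sh"`; a transition to MASTER of an instance named HA_1 would then append a line starting with `[MASTER] INSTANCE HA_1`.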
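//4. Demonstration: a highly available httpd cluster with keepalived
The following sets up a keepalived cluster to demonstrate how high availability is achieved. The example uses CentOS release 6.9 and keepalived v1.2.13, deployed as follows:
Hostname            Host IP          Cluster role  Cluster service  Virtual IP
keepalived-master   172.16.135.132   MASTER        HTTPD            172.16.135.195
keepalived-backup   172.16.135.134   BACKUP        HTTPD            172.16.135.195
The goal is an HTTPD-based high-availability cluster.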
[root@crmn ~]# ls /etc/keepalived/
keepalived.conf
[root@crmn ~]# keepalived -v
Keepalived v1.2.13 (03/19,2015)
[root@crmn ~]# more /etc/issue
CentOS release 6.9 (Final)
Kernel \r on an \m
[root@crmn ~]# cd /etc/keepalived/
[root@crmn keepalived]# cp keepalived.conf keepalived.conf_bak
[root@crmn keepalived]# ls
backup.sh fault.sh keepalived.conf keepalived.conf_bak master.sh
//I. Analysis of the keepalived startup process:
[root@crmn keepalived]# gedit /etc/keepalived/keepalived.conf # primary node configuration (172.16.135.132)
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
# 2017.11.6 define a script that monitors the service
vrrp_script check_httpd {
#script "killall -0 httpd"
script "</dev/tcp/127.0.0.1/80"
interval 2
fall 2
rise 1
}
vrrp_instance HA_1 {
state MASTER
interface eth0
virtual_router_id 180
priority 100
advert_int 2
#nopreempt
authentication {
auth_type PASS
auth_pass 1111
}
notify_master "/etc/keepalived/master.sh"
notify_backup "/etc/keepalived/backup.sh"
notify_fault "/etc/keepalived/fault.sh"
# reference the check script defined above
track_script {
check_httpd
}
virtual_ipaddress {
172.16.135.195/24 dev eth0
}
}
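The `script "</dev/tcp/127.0.0.1/80"` line in check_httpd relies on a bash feature: opening the pseudo-path `/dev/tcp/HOST/PORT` attempts a TCP connection, and the exit status tells whether the connection succeeded. With `interval 2` and `fall 2`, two consecutive failed runs about 2 s apart are needed before the check is declared failed. The same test can be tried by hand with a sketch like the one below; the function name `port_open` is hypothetical, and the explicit `bash -c` is there because `/dev/tcp` is not available in every shell.

```shell
# Hypothetical helper: succeed if a TCP connection to host $1, port $2 can be opened.
port_open() {
    # /dev/tcp/HOST/PORT is interpreted by bash itself, not by the filesystem.
    bash -c 'exec 3<>"/dev/tcp/$1/$2"' port_open "$1" "$2" 2>/dev/null
}

# Example: report whether the local web server port answers.
if port_open 127.0.0.1 80; then
    echo "port 80 is open"
else
    echo "port 80 is closed"
fi
```

A connection attempt to a filtered port can hang for a while, so wrapping the call in `timeout` may be worth considering.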
[root@crmn keepalived]# cat master.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-httpd-state.log
echo "[Master]">>$LOGFILE
date >> $LOGFILE
[root@crmn keepalived]# cat backup.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-httpd-state.log
echo "[Backup]">>$LOGFILE
date >> $LOGFILE
[root@crmn keepalived]# cat fault.sh
#!/bin/bash
LOGFILE=/var/log/keepalived-httpd-state.log
echo "[Fault]">>$LOGFILE
date >> $LOGFILE
[root@crmn keepalived]# gedit /etc/keepalived/keepalived.conf # standby node configuration (172.16.135.134)
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
# 2017.11.6 define a script that monitors the service
vrrp_script check_httpd {
#script "killall -0 httpd"
script "</dev/tcp/127.0.0.1/80"
interval 2
fall 2
rise 1
}
vrrp_instance HA_1 {
state BACKUP
interface eth0
virtual_router_id 180
priority 90
advert_int 2
#nopreempt
authentication {
auth_type PASS
auth_pass 1111
}
notify_master "/etc/keepalived/master.sh"
notify_backup "/etc/keepalived/backup.sh"
notify_fault "/etc/keepalived/fault.sh"
# reference the check script defined above
track_script {
check_httpd
}
virtual_ipaddress {
172.16.135.195/24 dev eth0
}
}
//Steps on the primary and standby nodes:
Primary node 172.16.135.132:
[root@crmn keepalived]# ps -ef|grep httpd
root 21402 12707 0 14:18 pts/0 00:00:00 grep httpd
[root@crmn keepalived]# service httpd start
正在启动 httpd: [确定]
[root@crmn keepalived]# /etc/init.d/keepalived start
正在启动 keepalived: [确定]
[root@crmn keepalived]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:ab:01:c7 brd ff:ff:ff:ff:ff:ff
inet 172.16.135.132/24 brd 172.16.135.255 scope global eth0
inet 172.16.135.195/24 scope global secondary eth0 //
inet6 fe80::20c:29ff:feab:1c7/64 scope link
valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether 6a:73:e4:72:b5:de brd ff:ff:ff:ff:ff:ff
[root@crmn keepalived]#
//Open a second terminal on 172.16.135.132
[root@crmn keepalived]# tail -f /var/log/messages
Nov 8 14:04:51 crmn NetworkManager[1908]: <info> domain name 'localdomain'
Nov 8 14:17:44 crmn dhclient[12552]: DHCPREQUEST on eth0 to 172.16.135.254 port 67 (xid=0x12c8ed80)
Nov 8 14:17:44 crmn dhclient[12552]: DHCPACK from 172.16.135.254 (xid=0x12c8ed80)
Nov 8 14:17:44 crmn NetworkManager[1908]: <info> (eth0): DHCPv4 state changed renew -> renew
Nov 8 14:17:44 crmn NetworkManager[1908]: <info> address 172.16.135.132
Nov 8 14:17:44 crmn NetworkManager[1908]: <info> prefix 24 (255.255.255.0)
Nov 8 14:17:44 crmn NetworkManager[1908]: <info> gateway 172.16.135.2
Nov 8 14:17:44 crmn NetworkManager[1908]: <info> nameserver '172.16.135.2'
Nov 8 14:17:44 crmn NetworkManager[1908]: <info> domain name 'localdomain'
Nov 8 14:17:44 crmn dhclient[12552]: bound to 172.16.135.132 -- renewal in 842 seconds.
Nov 8 14:19:08 crmn Keepalived[21449]: Starting Keepalived v1.2.13 (03/19,2015)
Nov 8 14:19:08 crmn Keepalived[21450]: Starting Healthcheck child process, pid=21451
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Netlink reflector reports IP 172.16.135.132 added
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Netlink reflector reports IP fe80::20c:29ff:feab:1c7 added
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Registering Kernel netlink reflector
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Registering Kernel netlink command channel
Nov 8 14:19:08 crmn Keepalived[21450]: Starting VRRP child process, pid=21452
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Netlink reflector reports IP 172.16.135.132 added
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Netlink reflector reports IP fe80::20c:29ff:feab:1c7 added
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Registering Kernel netlink reflector
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Registering Kernel netlink command channel
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Registering gratuitous ARP shared channel
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Opening file '/etc/keepalived/keepalived.conf'. //
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Configuration is using : 7431 Bytes
Nov 8 14:19:08 crmn Keepalived_healthcheckers[21451]: Using LinkWatch kernel netlink reflector...
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Opening file '/etc/keepalived/keepalived.conf'. //
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Configuration is using : 39552 Bytes
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: Using LinkWatch kernel netlink reflector...
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Nov 8 14:19:08 crmn Keepalived_vrrp[21452]: VRRP_Script(check_httpd) succeeded //
Nov 8 14:19:10 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Transition to MASTER STATE //
Nov 8 14:19:12 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Entering MASTER STATE
Nov 8 14:19:12 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) setting protocol VIPs.
Nov 8 14:19:12 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195 //
Nov 8 14:19:12 crmn Keepalived_healthcheckers[21451]: Netlink reflector reports IP 172.16.135.195 added
Nov 8 14:19:12 crmn avahi-daemon[1923]: Registering new address record for 172.16.135.195 on eth0.IPv4.
Nov 8 14:19:13 crmn ntpd[9765]: Listen normally on 91 eth0 172.16.135.195 UDP 123
Nov 8 14:19:17 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195
^C
[root@crmn keepalived]# tail -f /var/log/keepalived-httpd-state.log
[Master]
2017年 11月 08日 星期三 14:19:12 CST
//Standby node 172.16.135.134:
[root@centos keepalived]# ps -ef|grep httpd
root 10942 10890 0 14:17 pts/1 00:00:00 grep httpd
[root@centos keepalived]# service httpd start
正在启动 httpd: [确定]
[root@centos keepalived]# ps -ef|grep httpd
root 11006 1 2 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11009 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11010 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11011 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11012 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11013 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11014 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11015 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
apache 11016 11006 0 14:23 ? 00:00:00 /usr/sbin/httpd
root 11018 10890 0 14:23 pts/1 00:00:00 grep httpd
[root@centos keepalived]# /etc/init.d/keepalived start
正在启动 keepalived: [确定]
//Open a second terminal on 172.16.135.134
[root@centos keepalived]# tail -f /var/log/messages
Nov 8 14:16:54 centos NetworkManager[1902]: <info> Activation (eth0) successful, device activated.
Nov 8 14:16:54 centos NetworkManager[1902]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.
Nov 8 14:16:55 centos ntpd[9760]: Listen normally on 5 eth0 172.16.135.134 UDP 123
Nov 8 14:16:57 centos ntpd_intres[9773]: DNS 0.centos.pool.ntp.org -> 61.216.153.106
Nov 8 14:16:57 centos ntpd_intres[9773]: DNS 1.centos.pool.ntp.org -> 120.25.115.19
Nov 8 14:16:57 centos ntpd_intres[9773]: DNS 2.centos.pool.ntp.org -> 163.172.177.158
Nov 8 14:16:57 centos ntpd_intres[9773]: DNS 3.centos.pool.ntp.org -> 115.28.122.198
Nov 8 14:17:04 centos ntpd[9760]: 0.0.0.0 c61c 0c clock_step -49.389641 s
Nov 8 14:16:15 centos ntpd[9760]: 0.0.0.0 c614 04 freq_mode
Nov 8 14:16:18 centos ntpd[9760]: 0.0.0.0 c618 08 no_sys_peer
Nov 8 14:23:43 centos Keepalived[11032]: Starting Keepalived v1.2.13 (03/19,2015)
Nov 8 14:23:43 centos Keepalived[11033]: Starting Healthcheck child process, pid=11034
Nov 8 14:23:43 centos Keepalived[11033]: Starting VRRP child process, pid=11035
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Netlink reflector reports IP 172.16.135.134 added
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Netlink reflector reports IP fe80::20c:29ff:feec:faf7 added
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Registering Kernel netlink reflector
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Registering Kernel netlink command channel
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Registering gratuitous ARP shared channel
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Opening file '/etc/keepalived/keepalived.conf'. //
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Configuration is using : 39554 Bytes
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: Using LinkWatch kernel netlink reflector...
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: VRRP_Instance(HA_1) Entering BACKUP STATE ////
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Nov 8 14:23:43 centos Keepalived_vrrp[11035]: VRRP_Script(check_httpd) succeeded //
Nov 8 14:23:43 centos kernel: type=1400 audit(1510122223.931:4): avc: denied { write } for pid=11040 comm="backup.sh" name="log" dev=sda1 ino=263280 scontext=unconfined_u:system_r:keepalived_t:s0 tcontext=system_u:object_r:var_log_t:s0 tclass=dir
Nov 8 14:23:43 centos kernel: type=1400 audit(1510122223.932:5): avc: denied { add_name } for pid=11040 comm="backup.sh" name="keepalived-httpd-state.log" scontext=unconfined_u:system_r:keepalived_t:s0 tcontext=system_u:object_r:var_log_t:s0 tclass=dir
Nov 8 14:23:43 centos kernel: type=1400 audit(1510122223.932:6): avc: denied { create } for pid=11040 comm="backup.sh" name="keepalived-httpd-state.log" scontext=unconfined_u:system_r:keepalived_t:s0 tcontext=unconfined_u:object_r:var_log_t:s0 tclass=file
Nov 8 14:23:43 centos kernel: IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
Nov 8 14:23:43 centos kernel: IPVS: Connection hash table configured (size=4096, memory=32Kbytes)
Nov 8 14:23:43 centos kernel: IPVS: ipvs loaded.
Nov 8 14:23:43 centos Keepalived_healthcheckers[11034]: Netlink reflector reports IP 172.16.135.134 added
Nov 8 14:23:43 centos Keepalived_healthcheckers[11034]: Netlink reflector reports IP fe80::20c:29ff:feec:faf7 added
Nov 8 14:23:43 centos Keepalived_healthcheckers[11034]: Registering Kernel netlink reflector
Nov 8 14:23:43 centos Keepalived_healthcheckers[11034]: Registering Kernel netlink command channel
Nov 8 14:23:43 centos Keepalived_healthcheckers[11034]: Opening file '/etc/keepalived/keepalived.conf'. //
Nov 8 14:23:43 centos Keepalived_healthcheckers[11034]: Configuration is using : 7433 Bytes
Nov 8 14:23:44 centos Keepalived_healthcheckers[11034]: Using LinkWatch kernel netlink reflector...
Nov 8 14:23:50 centos Keepalived_vrrp[11035]: VRRP_Instance(HA_1) Transition to MASTER STATE
Nov 8 14:23:52 centos Keepalived_vrrp[11035]: VRRP_Instance(HA_1) Entering MASTER STATE ////
Nov 8 14:23:52 centos Keepalived_vrrp[11035]: VRRP_Instance(HA_1) setting protocol VIPs.
Nov 8 14:23:52 centos Keepalived_vrrp[11035]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195
Nov 8 14:23:52 centos avahi-daemon[1917]: Registering new address record for 172.16.135.195 on eth0.IPv4. //
Nov 8 14:23:52 centos Keepalived_healthcheckers[11034]: Netlink reflector reports IP 172.16.135.195 added
Nov 8 14:23:54 centos ntpd[9760]: Listen normally on 6 eth0 172.16.135.195 UDP 123
Nov 8 14:23:57 centos Keepalived_vrrp[11035]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195
^C
[root@centos keepalived]# tail -f /var/log/keepalived-httpd-state.log
[Backup]
2017年 11月 08日 星期三 14:23:43 CST
[Master]
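2017年 11月 08日 星期三 14:23:52 CST # Problem reproduced: the standby switched to MASTER 9 s after starting, although the primary was still running as MASTER. This is not normal!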
----------------------------------------------------
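The timing in the standby log fits the VRRPv2 master-down timer. A BACKUP declares the MASTER dead after 3 × advert_int plus a skew of (256 − priority) / 256 seconds without receiving an advertisement. The sketch below computes that value for the standby's settings (advert_int 2, priority 90); the function name `master_down_interval` is hypothetical. The standby started at 14:23:43 and logged "Transition to MASTER STATE" at 14:23:50, roughly this interval later, which suggests (but does not prove) that it never received the advertisements from 172.16.135.132.

```shell
# Hypothetical helper: VRRPv2 master-down interval in seconds.
# $1 = advert_int in seconds, $2 = priority of the local node
master_down_interval() {
    awk -v a="$1" -v p="$2" 'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

# Standby node from the example: advert_int 2, priority 90.
master_down_interval 2 90
```

For these values the result is 6.648 seconds.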
//II. Analysis of the keepalived failover process
Primary node 172.16.135.132:
[root@crmn keepalived]# killall -9 httpd
[root@crmn keepalived]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:ab:01:c7 brd ff:ff:ff:ff:ff:ff
inet 172.16.135.132/24 brd 172.16.135.255 scope global eth0
inet6 fe80::20c:29ff:feab:1c7/64 scope link
valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether 6a:73:e4:72:b5:de brd ff:ff:ff:ff:ff:ff
[root@crmn keepalived]#
//Open a second terminal on 172.16.135.132
[root@crmn keepalived]# tail -f /var/log/messages
Nov 8 14:19:17 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195
Nov 8 14:31:46 crmn dhclient[12552]: DHCPREQUEST on eth0 to 172.16.135.254 port 67 (xid=0x12c8ed80)
Nov 8 14:31:46 crmn dhclient[12552]: DHCPACK from 172.16.135.254 (xid=0x12c8ed80)
Nov 8 14:31:46 crmn dhclient[12552]: bound to 172.16.135.132 -- renewal in 856 seconds.
Nov 8 14:31:46 crmn NetworkManager[1908]: <info> (eth0): DHCPv4 state changed renew -> renew
Nov 8 14:31:46 crmn NetworkManager[1908]: <info> address 172.16.135.132
Nov 8 14:31:46 crmn NetworkManager[1908]: <info> prefix 24 (255.255.255.0)
Nov 8 14:31:46 crmn NetworkManager[1908]: <info> gateway 172.16.135.2
Nov 8 14:31:46 crmn NetworkManager[1908]: <info> nameserver '172.16.135.2'
Nov 8 14:31:46 crmn NetworkManager[1908]: <info> domain name 'localdomain'
Nov 8 14:38:15 crmn Keepalived_vrrp[21452]: VRRP_Script(check_httpd) failed
Nov 8 14:38:17 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Entering FAULT STATE
Nov 8 14:38:17 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) removing protocol VIPs.
Nov 8 14:38:17 crmn Keepalived_healthcheckers[21451]: Netlink reflector reports IP 172.16.135.195 removed
Nov 8 14:38:17 crmn Keepalived_vrrp[21452]: VRRP_Instance(HA_1) Now in FAULT state
Nov 8 14:38:17 crmn avahi-daemon[1923]: Withdrawing address record for 172.16.135.195 on eth0.
Nov 8 14:38:18 crmn ntpd[9765]: Deleting interface #91 eth0, 172.16.135.195#123, interface stats: received=0, sent=0, dropped=0, active_time=1145 secs
^C
[root@crmn keepalived]# tail -f /var/log/keepalived-httpd-state.log
[Master]
2017年 11月 08日 星期三 14:19:12 CST
[Fault]
2017年 11月 08日 星期三 14:38:17 CST
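//Standby node 172.16.135.134: apparently nothing changes, which is not normal! (Its state log already shows [Master] since 14:23:52, so no new transition is logged here.)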
[root@centos keepalived]# tail -f /var/log/messages
Nov 8 14:29:51 centos dhclient[10698]: DHCPACK from 172.16.135.254 (xid=0x45f824f1)
Nov 8 14:29:51 centos dhclient[10698]: bound to 172.16.135.134 -- renewal in 802 seconds.
Nov 8 14:29:51 centos NetworkManager[1902]: <info> (eth0): DHCPv4 state changed reboot -> renew
Nov 8 14:29:51 centos NetworkManager[1902]: <info> address 172.16.135.134
Nov 8 14:29:51 centos NetworkManager[1902]: <info> prefix 24 (255.255.255.0)
Nov 8 14:29:51 centos NetworkManager[1902]: <info> gateway 172.16.135.2
Nov 8 14:29:51 centos NetworkManager[1902]: <info> nameserver '172.16.135.2'
Nov 8 14:29:51 centos NetworkManager[1902]: <info> domain name 'localdomain'
Nov 8 14:31:29 centos ntpd[9760]: 0.0.0.0 c612 02 freq_set kernel 112.264 PPM
Nov 8 14:31:29 centos ntpd[9760]: 0.0.0.0 c615 05 clock_sync
^C
[root@centos keepalived]# tail -f /var/log/keepalived-httpd-state.log
[Backup]
2017年 11月 08日 星期三 14:23:43 CST
[Master]
2017年 11月 08日 星期三 14:23:52 CST
//III. Analysis of switching back after recovery
......
=============================================
Resolving the avc: denied errors in the log:
[root@centos keepalived]# cat /etc/selinux/config # standby node
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Change it to:
[root@centos keepalived]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
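The avc: denied entries show SELinux stopping backup.sh (running in the keepalived_t domain) from creating its log file under /var/log. The edit to /etc/selinux/config can be scripted as sketched below; the function name `set_selinux_mode` is hypothetical. The notes above use `disabled`; `permissive`, which logs denials without enforcing them, is a milder alternative offered here only as a suggestion. A change to this file takes effect after a reboot.

```shell
# Hypothetical helper: set the SELINUX= line of an SELinux config file.
# $1 = enforcing | permissive | disabled, $2 = config file (defaults to /etc/selinux/config)
set_selinux_mode() {
    sed -i "s/^SELINUX=.*/SELINUX=$1/" "${2:-/etc/selinux/config}"
}

# Typical use on the node itself (commented out so the sketch has no side effects):
#   getenforce                  # show the current mode
#   setenforce 0                # switch to permissive until the next reboot
#   set_selinux_mode disabled   # persist the setting used in these notes
```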
Nov 8 15:11:09 centos Keepalived[3620]: Starting Keepalived v1.2.13 (03/19,2015)
Nov 8 15:11:09 centos Keepalived[3621]: Starting Healthcheck child process, pid=3622
Nov 8 15:11:09 centos Keepalived[3621]: Starting VRRP child process, pid=3623
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Netlink reflector reports IP 172.16.135.134 added
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Netlink reflector reports IP fe80::20c:29ff:feec:faf7 added
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Registering Kernel netlink reflector
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Registering Kernel netlink command channel
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Registering gratuitous ARP shared channel
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Configuration is using : 39554 Bytes
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: Using LinkWatch kernel netlink reflector...
Nov 8 15:11:09 centos kernel: IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
Nov 8 15:11:09 centos kernel: IPVS: Connection hash table configured (size=4096, memory=32Kbytes)
Nov 8 15:11:09 centos kernel: IPVS: ipvs loaded.
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: VRRP_Instance(HA_1) Entering BACKUP STATE //
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Netlink reflector reports IP 172.16.135.134 added
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Netlink reflector reports IP fe80::20c:29ff:feec:faf7 added
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Registering Kernel netlink reflector
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Registering Kernel netlink command channel
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Opening file '/etc/keepalived/keepalived.conf'.
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Configuration is using : 7433 Bytes
Nov 8 15:11:09 centos Keepalived_vrrp[3623]: VRRP_Script(check_httpd) succeeded
Nov 8 15:11:09 centos Keepalived_healthcheckers[3622]: Using LinkWatch kernel netlink reflector...
Nov 8 15:11:16 centos Keepalived_vrrp[3623]: VRRP_Instance(HA_1) Transition to MASTER STATE
Nov 8 15:11:18 centos Keepalived_vrrp[3623]: VRRP_Instance(HA_1) Entering MASTER STATE //
Nov 8 15:11:18 centos Keepalived_vrrp[3623]: VRRP_Instance(HA_1) setting protocol VIPs.
Nov 8 15:11:18 centos Keepalived_healthcheckers[3622]: Netlink reflector reports IP 172.16.135.195 added
Nov 8 15:11:18 centos avahi-daemon[1884]: Registering new address record for 172.16.135.195 on eth0.IPv4.
Nov 8 15:11:18 centos Keepalived_vrrp[3623]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195
Nov 8 15:11:20 centos ntpd[2381]: Listen normally on 6 eth0 172.16.135.195 UDP 123
Nov 8 15:11:23 centos Keepalived_vrrp[3623]: VRRP_Instance(HA_1) Sending gratuitous ARPs on eth0 for 172.16.135.195
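//5. Monitoring cluster resource state with vrrp_script
How is a switchover triggered? The next step simulates a failure to find out:
Configuring no-preempt mode:
Primary node configuration:
vrrp_instance HA_1 {
state BACKUP
interface eth0
virtual_router_id 180
priority 100
advert_int 2
nopreempt
Then restart the keepalived service.
Fault: the monitored service has a problem and the node cannot take over the service, so it enters this state.
Backup: all services are healthy, but because of its role the node is on standby and can take over the service as soon as the primary node fails.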
(1). Probing service state with the killall command:
[root@crmn keepalived]# killall -0 httpd # the exit status (0 or 1) tells whether the service is running
[root@crmn keepalived]# echo $?
0
[root@crmn keepalived]# killall -0 mysqld
mysqld: 没有进程被杀死
[root@crmn keepalived]# echo $?
1
(2). Checking port state
(3). Monitoring state with shell statements
(4). Monitoring service state with a script # recommended; this method is more precise.
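Methods (2) to (4) are listed without an example. As one possible reading of method (4), the sketch below goes a step beyond a process or port check and asks the web server for a response. It sends a HEAD request over bash's `/dev/tcp` and treats any 2xx, 3xx or 4xx status line as healthy (a stock httpd with no index page may answer 403). The function name `http_ok` and the script path mentioned afterwards are hypothetical.

```shell
# Hypothetical helper: succeed if the server at host $1, port $2 answers a
# HEAD request with a 2xx, 3xx or 4xx status line within about 2 seconds.
http_ok() {
    bash -c '
        {
            printf "HEAD / HTTP/1.0\r\n\r\n" >&3
            read -r -t 2 status <&3
        } 3<>"/dev/tcp/$1/$2" || exit 1
        case $status in
            HTTP/*" "[234][0-9][0-9]*) exit 0 ;;
            *) exit 1 ;;
        esac
    ' http_ok "$1" "$2" 2>/dev/null
}

# In a check script the call would be the last command, so that its exit
# status becomes the result of the check:
#   http_ok 127.0.0.1 80
```

Saved as, say, /etc/keepalived/check_httpd.sh with the call enabled, it could be referenced from a vrrp_script block through `script "/etc/keepalived/check_httpd.sh"`.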
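//6. Common problems when using keepalived and troubleshooting techniques
Ways to check whether two keepalived hosts can communicate:
(a) Stop one keepalived and check whether new entries appear in /var/log/messages on the other.
(b) Capture packets with a sniffer to judge whether the service works.
//1. Primary node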
[root@crmn keepalived]# /etc/init.d/keepalived start
正在启动 keepalived: [确定]
[root@crmn keepalived]# tcpdump -vvv -i eth0 host 224.0.0.18
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:05:27.987699 IP (tos 0xc0, ttl 255, id 2, offset 0, flags [none], proto VRRP (112), length 40)
172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100, authtype simple, intvl 2s, length 20, addrs: 172.16.135.195 auth "1111^@^@^@^@"
10:05:29.989771 IP (tos 0xc0, ttl 255, id 3, offset 0, flags [none], proto VRRP (112), length 40)
172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100, authtype simple, intvl 2s, length 20, addrs: 172.16.135.195 auth "1111^@^@^@^@"
10:05:31.992013 IP (tos 0xc0, ttl 255, id 4, offset 0, flags [none], proto VRRP (112), length 40)
172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100, authtype simple, intvl 2s, length 20, addrs: 172.16.135.195 auth "1111^@^@^@^@"
10:05:33.993077 IP (tos 0xc0, ttl 255, id 5, offset 0, flags [none], proto VRRP (112), length 40)
172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100, authtype simple, intvl 2s, length 20, addrs: 172.16.135.195 auth "1111^@^@^@^@"
^C
4 packets captured
5 packets received by filter
0 packets dropped by kernel
//2. Standby node: the capture shows that the multicast advertisements seen here are sent by 172.16.135.132, so 132 is the primary node.
[root@centos keepalived]# service httpd start
正在启动 httpd: [确定]
[root@centos keepalived]# vim keepalived.conf
[root@centos keepalived]# tcpdump -vvv -i eth0 host 224.0.0.18
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:10:58.329398 IP (tos 0xc0, ttl 255, id 167, offset 0, flags [none], proto VRRP (112), length 40)
172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100, authtype simple, intvl 2s, length 20, addrs: 172.16.135.195 auth "1111^@^@^@^@"
10:11:00.331317 IP (tos 0xc0, ttl 255, id 168, offset 0, flags [none], proto VRRP (112), length 40)
172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100, authtype simple, intvl 2s, length 20, addrs: 172.16.135.195 auth "1111^@^@^@^@"
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
//3. Capture again after stopping the service on the primary node
[root@crmn keepalived]# killall -9 httpd
[root@crmn keepalived]# tcpdump -vvv -i eth0 host 224.0.0.18
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
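//At this point the capture should show only multicast advertisements sent by 172.16.135.134, proving that 134 has taken over as MASTER (in VRRP only the MASTER sends advertisements; a BACKUP only receives them).
Instead the capture stays empty here, which indicates a problem!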
[root@centos keepalived]# tcpdump -vvv -i eth0 host 224.0.0.18
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
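When reading such captures it helps to reduce them to the list of hosts that are sending advertisements: one sender means a single MASTER, two senders mean both nodes believe they are MASTER, and none means nobody is advertising. The sketch below extracts the senders from verbose tcpdump output in the format shown above; the function name `vrrp_senders` is hypothetical. The notes do not identify the cause of the problem. A host firewall dropping VRRP traffic is one common cause, which is why an iptables rule is included as a commented-out assumption.

```shell
# Hypothetical helper: print the unique source addresses of VRRP
# advertisements found in "tcpdump -vvv" output read from stdin.
vrrp_senders() {
    awk '/VRRPv[0-9]+, Advertisement/ { print $1 }' | sort -u
}

# Example with one line copied from the capture above.
printf '%s\n' '    172.16.135.132 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 180, prio 100' \
    | vrrp_senders

# Live use (needs root); stops after 10 packets:
#   tcpdump -vvv -n -l -c 10 -i eth0 host 224.0.0.18 2>/dev/null | vrrp_senders
# Assumption, not confirmed by these notes: if a firewall is dropping VRRP,
# allowing IP protocol 112 on both nodes may help:
#   iptables -I INPUT -p 112 -j ACCEPT
```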