Redis Sentinel High Availability

How Sentinel works:

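Sentinel is installed on a separate host. A Sentinel host both monitors Redis and acts as a configuration provider. You only need to point Sentinel at the master Redis server (only the master is configured for monitoring); Sentinel reads the master-slave topology from the master and discovers the slaves on its own. Sentinel then watches the state of the whole master-slave setup. As soon as it finds that the master is offline, it picks one of the slaves and promotes it to master.

Once a slave has replaced the master, the master's IP address has changed, and clients that were connected to the old master's IP can no longer connect. They can query Sentinel instead, and Sentinel tells them the IP of the new master. Sentinel is therefore the high-availability solution for a Redis master-slave architecture.

To avoid false positives and a single point of failure, Sentinel itself should also run as a cluster, with several Sentinel nodes monitoring the same master-slave setup. When one Sentinel node finds that the master is unreachable, it asks the other Sentinel nodes whether they see the same thing. Only if enough Sentinel nodes agree that the master is offline is the master judged to be down. This avoids misjudgment and also removes the single point of failure.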
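The agreement step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Sentinel's actual implementation: the function name `is_objectively_down` and the `reports` dictionary are hypothetical. Each Sentinel's own opinion corresponds to what Redis calls "subjectively down" (sdown); the master is treated as "objectively down" (odown) once the number of Sentinels reporting it down reaches the quorum.

```python
def is_objectively_down(reports, quorum):
    """Return True when enough Sentinels agree that the master is down.

    reports: dict mapping a Sentinel identifier to True if that Sentinel
             currently sees the master as down (its subjective view).
    quorum:  number of agreeing Sentinels required.
    """
    down_votes = sum(1 for sees_down in reports.values() if sees_down)
    return down_votes >= quorum


# Three hypothetical Sentinels, quorum of 2.
one_complaint = {"s1": True, "s2": False, "s3": False}
two_complaints = {"s1": True, "s2": True, "s3": False}

print(is_objectively_down(one_complaint, 2))   # a single Sentinel is not enough
print(is_objectively_down(two_complaints, 2))  # two Sentinels reach the quorum
```

A single Sentinel losing its own network link therefore cannot trigger a failover by itself, which is the misjudgment the cluster is meant to prevent.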

Summary: 1. Manages multiple Redis services to provide HA  2. Monitors multiple Redis service nodes  3. Performs automatic failover

 

Common configuration parameters in redis-sentinel.conf:

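(1) sentinel monitor <master-name> <ip> <redis-port> <quorum> //This directive may appear more than once, so one Sentinel can monitor several Redis master-slave groups. It names the master to monitor: <master-name> is a name you choose for the group, <ip> is the master's IP address, <redis-port> is the master's port, and <quorum> is the number of Sentinels that must agree the master is unreachable before it is judged to be down. Run an odd number of Sentinels where possible, and set <quorum> to half the number of Sentinels plus 1: with 3 Sentinels use 2; with 4 Sentinels do not use 2, to avoid a split cluster in which each half reaches the quorum on its own. When several groups are monitored at the same time, each <master-name> must be unique.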
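The "half plus 1" rule for <quorum> is simple integer arithmetic. A minimal sketch, where `recommended_quorum` is a hypothetical helper name and not part of Redis:

```python
def recommended_quorum(sentinel_count):
    """Half the number of Sentinels plus 1, using integer division."""
    if sentinel_count < 1:
        raise ValueError("at least one Sentinel is required")
    return sentinel_count // 2 + 1


for n in (1, 3, 4, 5):
    print(n, "Sentinels -> quorum", recommended_quorum(n))
```

With 4 Sentinels the result is 3, not 2: a quorum of 2 would let two separated pairs of Sentinels each decide independently that the master is down.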

 

(2) sentinel down-after-milliseconds <master-name> <milliseconds>  //How long a node may stay unreachable before this Sentinel considers it down, in milliseconds (default 30000, i.e. 30 seconds)

 

(3) sentinel parallel-syncs <master-name> <numslaves>      //When a new master is promoted, how many slaves are allowed to synchronize with the new master at the same time

 

 

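(4) sentinel failover-timeout <master-name> <milliseconds>       //Failover timeout: if the failover does not complete within this time it is treated as failed, in milliseconds (default 180000, i.e. 180 seconds)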
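To see how the four directives fit together for one master group, here is a small Python sketch that renders them as text. The function `render_sentinel_conf` is a hypothetical helper written for illustration; the defaults are the values shown in this post (30000 ms, 1 parallel sync, 180000 ms).

```python
def render_sentinel_conf(master_name, ip, port, quorum,
                         down_after_ms=30000, parallel_syncs=1,
                         failover_timeout_ms=180000):
    """Return the four Sentinel directives for one monitored master group."""
    lines = [
        # "sentinel monitor" comes first: it defines the group name
        # that the other three directives refer to.
        f"sentinel monitor {master_name} {ip} {port} {quorum}",
        f"sentinel down-after-milliseconds {master_name} {down_after_ms}",
        f"sentinel parallel-syncs {master_name} {parallel_syncs}",
        f"sentinel failover-timeout {master_name} {failover_timeout_ms}",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the lab in this post: group "qunzu", one Sentinel.
print(render_sentinel_conf("qunzu", "192.168.1.6", 6379, 1), end="")
```

The output can be compared against /etc/redis-sentinel.conf by eye; the sketch does not write any file.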

 

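Lab setup:

  Prepare three machines.

192.168.1.5 is the sentinel

192.168.1.6 is the master

192.168.1.7 is the slave

Redis is already installed on all three hosts (it can be installed once the EPEL repository is configured)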

 

 Configuration on 192.168.1.6:

   [root@6 ~]# vim /etc/redis.conf 

 

bind 192.168.1.6

 

Configuration on 192.168.1.7:

  [root@7 ~]# vim /etc/redis.conf 

  

bind 192.168.1.7
......
slaveof 192.168.1.6 6379

 

Configuration on 192.168.1.5:

  [root@ml ~]# vim /etc/redis-sentinel.conf 

  

bind 192.168.1.5
sentinel monitor qunzu 192.168.1.6 6379 1

  [root@ml ~]# systemctl restart redis-sentinel

  [root@ml ~]# redis-cli -h 192.168.1.5 -p 26379

  192.168.1.5:26379>  info sentinel

# Sentinel
sentinel_masters:2
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=sdown,address=127.0.0.1:6379,slaves=0,sentinels=1
master1:name=qunzu,status=ok,address=192.168.1.6:6379,slaves=1,sentinels=1
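The `masterN:` lines in this output have a regular `key=value` layout, so a monitoring script can pick them apart easily. A minimal Python sketch, assuming the line format shown above; `parse_master_line` is a hypothetical helper, not part of any Redis client library:

```python
def parse_master_line(line):
    """Parse one 'masterN:name=...,status=...,address=ip:port,...' line."""
    # Split only on the first colon; the address field contains another one.
    label, _, body = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in body.split(","))
    host, _, port = fields["address"].rpartition(":")
    return {
        "label": label,
        "name": fields["name"],
        "status": fields["status"],
        "host": host,
        "port": int(port),
        "slaves": int(fields["slaves"]),
        "sentinels": int(fields["sentinels"]),
    }


line = "master1:name=qunzu,status=ok,address=192.168.1.6:6379,slaves=1,sentinels=1"
info = parse_master_line(line)
print(info["name"], info["status"], info["host"], info["port"])
```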

  192.168.1.5:26379> sentinel masters            //show the state of every monitored master

1)  1) "name"    #first monitored master
    2) "mymaster"
    3) "ip"
    4) "127.0.0.1"
    5) "port"
    6) "6379"
    7) "runid"
    8) ""
    9) "flags"
   10) "s_down,master,disconnected"
   11) "link-pending-commands"
   12) "2"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "636869"
   17) "last-ok-ping-reply"
   18) "636869"
   19) "last-ping-reply"
   20) "636869"
   21) "s-down-time"
   22) "606856"
   23) "down-after-milliseconds"
   24) "30000"
   25) "info-refresh"
   26) "1586413602435"
   27) "role-reported"
   28) "master"
   29) "role-reported-time"
   30) "636869"
   31) "config-epoch"
   32) "0"
   33) "num-slaves"
   34) "0"
   35) "num-other-sentinels"
   36) "0"
   37) "quorum"
   38) "2"
   39) "failover-timeout"
   40) "180000"
   41) "parallel-syncs"
   42) "1"
2)  1) "name"    #second monitored master
    2) "qunzu"
    3) "ip"
    4) "192.168.1.6"
    5) "port"
    6) "6379"
    7) "runid"
    8) "c8a9345764e1a44e56b62547efb3107b7b24fcdf"
    9) "flags"
   10) "master"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "312"
   19) "last-ping-reply"
   20) "312"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "4274"
   25) "role-reported"
   26) "master"
   27) "role-reported-time"
   28) "636869"
   29) "config-epoch"
   30) "0"
   31) "num-slaves"
   32) "1"
   33) "num-other-sentinels"
   34) "0"
   35) "quorum"
   36) "1"
   37) "failover-timeout"
   38) "180000"
   39) "parallel-syncs"
   40) "1"

   192.168.1.5:26379> sentinel slaves qunzu    //qunzu is the group (master) name; lists the slaves of that group

1)  1) "name"
    2) "192.168.1.7:6379"
    3) "ip"
    4) "192.168.1.7"
    5) "port"
    6) "6379"
    7) "runid"
    8) "9df62fb2d5252c62206dafc8554e630e87726cc2"
    9) "flags"
   10) "slave"
   11) "link-pending-commands"
   12) "0"
   13) "link-refcount"
   14) "1"
   15) "last-ping-sent"
   16) "0"
   17) "last-ok-ping-reply"
   18) "192"
   19) "last-ping-reply"
   20) "192"
   21) "down-after-milliseconds"
   22) "30000"
   23) "info-refresh"
   24) "9991"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "893674"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "ok"
   33) "master-host"
   34) "192.168.1.6"
   35) "master-port"
   36) "6379"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "58854"
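Replies like this one are flat lists that alternate field name and value. When such a reply is handled in a script, it is usually easier to turn it into a dictionary first. A minimal Python sketch, where `pairs_to_dict` is a hypothetical helper and the sample list is an excerpt of the reply above:

```python
def pairs_to_dict(flat):
    """Convert [name1, value1, name2, value2, ...] into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of elements")
    # Even positions are field names, odd positions are their values.
    return dict(zip(flat[0::2], flat[1::2]))


reply = [
    "name", "192.168.1.7:6379",
    "flags", "slave",
    "master-link-status", "ok",
    "master-host", "192.168.1.6",
    "slave-priority", "100",
]
slave = pairs_to_dict(reply)
print(slave["master-host"], slave["master-link-status"])
```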

Try shutting down the master

Current master:

  192.168.1.5:26379>  info sentinel

# Sentinel
sentinel_masters:2
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=sdown,address=127.0.0.1:6379,slaves=0,sentinels=1
master1:name=qunzu,status=ok,address=192.168.1.6:6379,slaves=1,sentinels=1

Stop the master:

[root@6 ~]# systemctl stop redis 

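Query again (the failover only starts after the master has been unreachable for down-after-milliseconds, 30 seconds here): the master has changed to 192.168.1.7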

192.168.1.5:26379>  info sentinel

# Sentinel
sentinel_masters:2
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=sdown,address=127.0.0.1:6379,slaves=0,sentinels=1
master1:name=qunzu,status=ok,address=192.168.1.7:6379,slaves=1,sentinels=1
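A client does not have to read `info sentinel` to find the new master; it can ask Sentinel directly with the `SENTINEL get-master-addr-by-name <master-name>` command. The sketch below uses only the Python standard library and speaks the Redis wire protocol (RESP) by hand. The helper names are hypothetical, the reply parser handles only the simple two-element reply this command returns, and a single `recv` is assumed to be enough for such a short reply. In practice a Redis client library with Sentinel support would normally be used instead.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*" + str(len(args)).encode() + b"\r\n"]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def parse_master_addr(reply):
    """Parse a two-element RESP array into (ip, port); None if no address."""
    lines = reply.split(b"\r\n")
    if not lines[0].startswith(b"*") or lines[0] in (b"*-1", b"*0"):
        return None  # null or empty reply, e.g. unknown master name
    # Layout: *2, $<len>, <ip>, $<len>, <port>
    return lines[2].decode(), int(lines[4])


def ask_sentinel(host, port, master_name, timeout=1.0):
    """Ask one Sentinel for the current master address of a group."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        return parse_master_addr(conn.recv(4096))


if __name__ == "__main__":
    # Addresses from this lab; requires the Sentinel at 192.168.1.5 to be running.
    print(ask_sentinel("192.168.1.5", 26379, "qunzu"))
```

Because clients look the address up by group name, they keep working after a failover without any change to their own configuration.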

Start the old master again

[root@6 ~]# systemctl restart redis

192.168.1.6:6379> info replication  #the former master has now become a slave

# Replication
role:slave
master_host:192.168.1.7
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:2626
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

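Lab summary:

  With Sentinel deployed, when the master goes down Sentinel selects one of the slaves and promotes it to master. When the old master comes back, it rejoins as a slave and does not become the master again.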

 

posted @ 2020-04-09 14:55  meml