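Redis Sentinel in Practice

Redis Sentinel Mode

Sentinel mode monitors every server in a master-replica deployment. When the master fails, the Sentinels vote to pick a new master and reconfigure all remaining replicas to replicate from it.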

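  • Monitoring: Sentinels watch the state of the master and the replicas.
  • Notification: when a monitored node fails, the Sentinels exchange that information with one another.
  • Automatic failover: once the master is detected as down, the replicas are detached from it, one replica is promoted to master, the other replicas are pointed at the new master, and clients are told the new master address.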

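A Sentinel is itself a Redis process, running in a special mode. Deploy at least three Sentinels. An odd number is the usual choice: a failover needs a majority of the Sentinels, so an even count adds a process without tolerating more failures.

In application code, connect to the Sentinels rather than to a fixed Redis address. The client asks a Sentinel for the current master and can find the new one after a failover.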
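As an illustration of connecting through Sentinel, here is a minimal sketch that assumes the redis-py client library is installed. The addresses, the master name local-master and the password 123456 come from the configuration files later in this article; the function name connect_via_sentinel is made up for this example.

```python
# Sentinel addresses from the redis-sentinel-*.conf files in this article.
SENTINELS = [("127.0.0.1", 26379), ("127.0.0.1", 26380), ("127.0.0.1", 26381)]
MASTER_NAME = "local-master"  # name given to "sentinel monitor"
PASSWORD = "123456"           # demo password used throughout this article


def connect_via_sentinel():
    """Return (master_client, replica_client) discovered through Sentinel.

    Hypothetical helper, assuming redis-py is available.
    """
    # Imported lazily so this file still loads where redis-py is missing.
    from redis.sentinel import Sentinel

    sentinel = Sentinel(
        SENTINELS,
        socket_timeout=0.5,
        # The Sentinels here set requirepass, so they need a password as well.
        sentinel_kwargs={"password": PASSWORD},
    )
    # master_for / slave_for resolve the current address on each new connection,
    # so the clients can follow a failover without a hard-coded master address.
    master = sentinel.master_for(MASTER_NAME, password=PASSWORD, socket_timeout=0.5)
    replica = sentinel.slave_for(MASTER_NAME, password=PASSWORD, socket_timeout=0.5)
    return master, replica


# Usage, once the containers from this article are running:
#   master, replica = connect_via_sentinel()
#   master.set("name", "daniel")
#   replica.get("name")
```

Writes go to the master client and reads can go to the replica client; neither is tied to port 6379.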

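Down states:

  • Subjectively down (SDOWN): a single Sentinel considers the master down.
  • Objectively down (ODOWN): at least quorum Sentinels consider the master down. The quorum is the last argument of sentinel monitor (2 in this article). Actually starting the failover also needs authorization from a majority of all Sentinels.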
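The two thresholds can be sketched in a few lines of Python. This is a simplified model, not Sentinel's actual implementation; the function names and the vote dictionary are invented for illustration.

```python
def is_objectively_down(votes, quorum):
    """votes maps a Sentinel name to True if it sees the master as down (SDOWN).

    The master is objectively down once at least `quorum` Sentinels agree.
    """
    return sum(1 for down in votes.values() if down) >= quorum


def can_authorize_failover(agreeing_sentinels, total_sentinels):
    """A failover needs a strict majority of all known Sentinels."""
    return agreeing_sentinels > total_sentinels // 2


# Three Sentinels with quorum 2, as configured later in this article.
votes = {"sentinel-1": True, "sentinel-2": True, "sentinel-3": False}
print(is_objectively_down(votes, quorum=2))    # True
print(can_authorize_failover(2, 3))            # True
print(can_authorize_failover(1, 3))            # False
```

With three Sentinels and a quorum of 2, both thresholds happen to be the same number; with five Sentinels and a quorum of 2 they would differ.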

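How the new master is chosen:

  • Health: replicas that do not answer the heartbeat (PING) in time are skipped.
  • Stability: replicas whose link to the old master was down for too long are skipped as well.
  • Priority: among the remaining replicas, the one with the lowest replica-priority is preferred (the default is 100; a priority of 0 means never promote).
  • Completeness: at equal priority, the replica with the largest replication offset, that is, the most complete copy of the data, is preferred.
  • If all of the above are equal, the replica with the lexicographically smallest run ID is chosen.
  • When the failed master restarts, it rejoins as a replica of the new master.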
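The selection order can be sketched as a sort key. This is a simplified model of the ordering Sentinel documents (priority, then replication offset, then run ID), not its source code; the Replica class and the run IDs below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    priority: int        # replica-priority; 0 means "never promote"
    repl_offset: int     # how much of the master's stream was received
    run_id: str          # assigned at startup
    healthy: bool = True # False if it failed the health/stability checks


def pick_new_master(replicas):
    """Return the preferred replica, or None if nobody is eligible."""
    candidates = [r for r in replicas if r.healthy and r.priority > 0]
    if not candidates:
        return None
    # Lower priority first, then higher offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Two replicas like the ones in this article; the run IDs are made up.
replicas = [
    Replica("127.0.0.1:6380", 100, 393650, "b" * 40),
    Replica("127.0.0.1:6381", 100, 393650, "a" * 40),
]
print(pick_new_master(replicas).name)  # 127.0.0.1:6381
```

Priority and offset tie in this sample, so the smaller run ID decides.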

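Data problems with Sentinel and replication

  • Split-brain: the network between the master and the Sentinels breaks, enough Sentinels consider the master down, and a replica is promoted to be the new master. Clients that can still reach the old master keep writing to it. When the network recovers, the old master is demoted to a replica of the new master and resynchronizes from it, so the writes it accepted during the partition are discarded.
  • Data loss from asynchronous replication: replication is asynchronous, so if the master really crashes before recent writes have reached a replica, the promoted replica never sees those writes and they are lost.
  • Sentinel cannot guarantee zero data loss in any case; it can only keep the loss small.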

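How can these problems be mitigated? Change the master's configuration file:

  1. Minimum number of replicas (min-replicas-to-write). The default, 0, disables the check. With a value of 1, a master that sees fewer than one healthy replica stops accepting writes, which limits what an isolated old master can take in during a split-brain.
  2. Maximum replication lag (min-replicas-max-lag), in seconds since the replica last acknowledged data. A replica lagging by more than this value does not count toward item 1, so when too few replicas are within the limit the master stops accepting writes.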
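The effect of the two settings can be modelled as below. The directive names min-replicas-to-write and min-replicas-max-lag are the real Redis ones; the function and the sample lag values are a simplified illustration, not Redis internals.

```python
def master_accepts_writes(replica_lags_seconds, min_replicas_to_write,
                          min_replicas_max_lag):
    """Model of the write check on the master.

    replica_lags_seconds: seconds since each connected replica last acknowledged.
    Corresponds to redis.conf lines such as:
        min-replicas-to-write 1
        min-replicas-max-lag 10
    """
    if min_replicas_to_write == 0:
        return True  # 0 disables the check (the default)
    good = sum(1 for lag in replica_lags_seconds if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write


# Healthy cluster: both replicas acknowledged within the last second.
print(master_accepts_writes([0, 0], 1, 10))   # True
# Isolated old master during split-brain: no replica is connected any more.
print(master_accepts_writes([], 1, 10))       # False
# Replicas connected but far behind.
print(master_accepts_writes([30, 45], 1, 10)) # False
```

The trade-off is availability: with the check enabled, a master that loses all its replicas rejects writes even when the master itself is healthy.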

redis servers

docker-compose.yml

version: '3'

services:
  # Master container
  redis-server-master:
    image: redis
    container_name: redis-server-master
    restart: always
    # Use the host network to avoid problems that
    # Docker port mapping can cause for Sentinel
    network_mode: host
    # Set the time zone so the time inside the container is correct
    environment:
      TZ: "Asia/Shanghai"
    volumes:
      # Mount the configuration file and the data directory
      - ./redis-master.conf:/usr/local/etc/redis/redis.conf
      - ./data/redis-master:/data
    command: ["redis-server", "/usr/local/etc/redis/redis.conf"]
  # Replica 1 container
  redis-server-slave-1:
    image: redis
    container_name: redis-server-slave-1
    restart: always
    network_mode: host
    depends_on:
      - redis-server-master
    environment:
      TZ: "Asia/Shanghai"
    volumes:
      - ./redis-slave1.conf:/usr/local/etc/redis/redis.conf
      - ./data/redis-slave-1:/data 
    command: ["redis-server", "/usr/local/etc/redis/redis.conf"]
  # Replica 2 container
  redis-server-slave-2:
    image: redis
    container_name: redis-server-slave-2
    restart: always
    network_mode: host
    depends_on:
      - redis-server-master
    environment:
      TZ: "Asia/Shanghai"
    volumes:
      - ./redis-slave2.conf:/usr/local/etc/redis/redis.conf
      - ./data/redis-slave-2:/data 
    command: ["redis-server", "/usr/local/etc/redis/redis.conf"]

redis-master.conf

# Listening port
port 6379

# Do not print the logo at startup
# Purely cosmetic; enable it if you want to see the logo
always-show-logo no

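# Require password authentication
requirepass 123456

# Password used to authenticate against a master. Needed when this node
# restarts after a failover and rejoins as a replica of the new master.
masterauth "123456"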

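# Disable the KEYS command
# KEYS * lists every key, which is a data-exposure risk.
# KEYS also blocks the server: on a large dataset it can run for a long time,
# and all other requests wait. When it finishes, the backlog of requests hits Redis at once and can overwhelm it.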
rename-command KEYS ""

redis-slave1.conf

# bind 127.0.0.1

# Listening port
port 6380
always-show-logo no
requirepass 123456

rename-command KEYS ""
slaveof 127.0.0.1 6379

# Password used to connect to the master
masterauth "123456"

redis-slave2.conf

# Listening port
port 6381
always-show-logo no

# Require password authentication
requirepass 123456
rename-command KEYS ""

slaveof 127.0.0.1 6379

# Password used to connect to the master
masterauth "123456"

Sentinel

docker-compose.yml

version: '3'

services:
  redis-sentinel-1:
    image: redis
    container_name: redis-sentinel-1
    restart: always
    # Use the host network to avoid problems that
    # Docker port mapping can cause for Sentinel
    network_mode: host
    volumes:
      - ./redis-sentinel-1.conf:/usr/local/etc/redis/redis-sentinel.conf
    # Set the time zone so the time inside the container is correct
    environment:
      TZ: "Asia/Shanghai" 
    command: ["redis-sentinel", "/usr/local/etc/redis/redis-sentinel.conf"]
  redis-sentinel-2:
    image: redis
    container_name: redis-sentinel-2
    restart: always
    network_mode: host
    volumes:
      - ./redis-sentinel-2.conf:/usr/local/etc/redis/redis-sentinel.conf
    environment:
      TZ: "Asia/Shanghai" 
    command: ["redis-sentinel", "/usr/local/etc/redis/redis-sentinel.conf"]
  redis-sentinel-3:
    image: redis
    container_name: redis-sentinel-3
    restart: always
    network_mode: host
    volumes:
      - ./redis-sentinel-3.conf:/usr/local/etc/redis/redis-sentinel.conf
    environment:
      TZ: "Asia/Shanghai"
    command: ["redis-sentinel", "/usr/local/etc/redis/redis-sentinel.conf"]

redis-sentinel-1.conf

port 26379
requirepass 123456
sentinel monitor local-master 127.0.0.1 6379 2
sentinel auth-pass local-master 123456
# If the master does not answer PING for this long, this Sentinel marks it subjectively down. Default: 30 seconds.
# Format: sentinel down-after-milliseconds <master-name> <milliseconds>
sentinel down-after-milliseconds local-master 30000

redis-sentinel-2.conf

port 26380
requirepass 123456
sentinel monitor local-master 127.0.0.1 6379 2
sentinel auth-pass local-master 123456
# If the master does not answer PING for this long, this Sentinel marks it subjectively down. Default: 30 seconds.
# Format: sentinel down-after-milliseconds <master-name> <milliseconds>
sentinel down-after-milliseconds local-master 30000

redis-sentinel-3.conf

port 26381
requirepass 123456
sentinel monitor local-master 127.0.0.1 6379 2
sentinel auth-pass local-master 123456
# If the master does not answer PING for this long, this Sentinel marks it subjectively down. Default: 30 seconds.
# Format: sentinel down-after-milliseconds <master-name> <milliseconds>
sentinel down-after-milliseconds local-master 30000
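The three Sentinel files differ only in the port, so they can be generated from one template. The sketch below reproduces the directives used above; the function names and the idea of generating the files are additions for illustration, not part of the original setup.

```python
SENTINEL_TEMPLATE = """\
port {port}
requirepass {password}
sentinel monitor {master_name} {master_host} {master_port} {quorum}
sentinel auth-pass {master_name} {password}
sentinel down-after-milliseconds {master_name} {down_after_ms}
"""


def render_sentinel_conf(port, master_name="local-master",
                         master_host="127.0.0.1", master_port=6379,
                         quorum=2, password="123456", down_after_ms=30000):
    """Render one Sentinel configuration file as text."""
    return SENTINEL_TEMPLATE.format(
        port=port, master_name=master_name, master_host=master_host,
        master_port=master_port, quorum=quorum, password=password,
        down_after_ms=down_after_ms,
    )


def render_all(ports=(26379, 26380, 26381)):
    """Map file name -> file content for every Sentinel port."""
    return {
        f"redis-sentinel-{i}.conf": render_sentinel_conf(port)
        for i, port in enumerate(ports, start=1)
    }


if __name__ == "__main__":
    for name, text in render_all().items():
        print(f"--- {name} ---")
        print(text)
```

Keeping the master name, quorum and password in one place makes it harder for the three files to drift apart.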

Start all servers and Sentinels

[root@localhost server]# docker-compose up -d
Creating redis-server-master ... done
Creating redis-server-slave-1 ... done
Creating redis-server-slave-2 ... done
[root@localhost server]# docker ps 
CONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS     NAMES
85b87c9abe11   redis     "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds             redis-server-slave-2
da2356613c03   redis     "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds             redis-server-slave-1
098d30c165ca   redis     "docker-entrypoint.s…"   8 seconds ago   Up 7 seconds             redis-server-master
[root@localhost server]# cd ..
[root@localhost sentinel]# cd sentinel/
[root@localhost sentinel]# docker-compose up -d
Creating redis-sentinel-1 ... done
Creating redis-sentinel-2 ... done
Creating redis-sentinel-3 ... done
[root@localhost sentinel]# docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED              STATUS              PORTS     NAMES
8e827b9e9eaa   redis     "docker-entrypoint.s…"   3 seconds ago        Up 3 seconds                  redis-sentinel-3
8bd1ec7c5041   redis     "docker-entrypoint.s…"   3 seconds ago        Up 3 seconds                  redis-sentinel-2
182d5d70f7f6   redis     "docker-entrypoint.s…"   3 seconds ago        Up 3 seconds                  redis-sentinel-1
85b87c9abe11   redis     "docker-entrypoint.s…"   About a minute ago   Up About a minute             redis-server-slave-2
da2356613c03   redis     "docker-entrypoint.s…"   About a minute ago   Up About a minute             redis-server-slave-1
098d30c165ca   redis     "docker-entrypoint.s…"   About a minute ago   Up About a minute             redis-server-master

Verify Sentinel mode

Open three terminals, enter the master and the two replicas, and inspect the replication info.

Enter the master

[root@localhost sentinel]# docker exec -it redis-server-master /bin/bash
root@localhost:/data# redis-cli
127.0.0.1:6379> info replication
NOAUTH Authentication required.
127.0.0.1:6379> auth 123456
OK
127.0.0.1:6379> info replication 
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=198306,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=198306,lag=0
master_failover_state:no-failover
master_replid:8149377844b979ed775e37147d2775d7eee9d267
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:198306
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:198306
127.0.0.1:6379> set name daniel
OK

Enter the first replica

[root@localhost ~]# docker exec -it redis-server-slave-1 /bin/bash
root@localhost:/data# redis-cli -a 123456 -p 6380
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:302441
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:8149377844b979ed775e37147d2775d7eee9d267
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:302441
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:302441
127.0.0.1:6380> get name
"daniel"
127.0.0.1:6380> 

Enter the second replica to verify

[root@localhost ~]# docker exec -it redis-server-slave-2 /bin/bash
root@localhost:/data# redis-cli -a 123456 -p 6381
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
127.0.0.1:6381> get name
"daniel"
127.0.0.1:6381> 

These checks show that master-replica replication is working.
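The same check can be scripted by parsing the text that info replication returns. This is a minimal sketch: the sample is an excerpt of the master output shown above, and parse_info and replicas_in_sync are helper names invented here.

```python
SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=198306,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=198306,lag=0
master_repl_offset:198306
"""


def parse_info(text):
    """Turn 'key:value' lines into a dict, skipping blanks and '# Section' lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key] = value
    return info


def replicas_in_sync(info):
    """True if every listed replica is online and at the master's offset.

    On a busy server a replica may briefly trail by a few bytes, so a real
    check would probably allow a small tolerance.
    """
    master_offset = int(info["master_repl_offset"])
    for i in range(int(info["connected_slaves"])):
        fields = dict(part.split("=", 1) for part in info[f"slave{i}"].split(","))
        if fields["state"] != "online" or int(fields["offset"]) != master_offset:
            return False
    return True


info = parse_info(SAMPLE)
print(info["role"], info["connected_slaves"])  # master 2
print(replicas_in_sync(info))                  # True
```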

Verify failover under Sentinel

Stop the master

[root@localhost ~]# docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS          PORTS     NAMES
8e827b9e9eaa   redis     "docker-entrypoint.s…"   31 minutes ago   Up 31 minutes             redis-sentinel-3
8bd1ec7c5041   redis     "docker-entrypoint.s…"   31 minutes ago   Up 31 minutes             redis-sentinel-2
182d5d70f7f6   redis     "docker-entrypoint.s…"   31 minutes ago   Up 31 minutes             redis-sentinel-1
85b87c9abe11   redis     "docker-entrypoint.s…"   32 minutes ago   Up 32 minutes             redis-server-slave-2
da2356613c03   redis     "docker-entrypoint.s…"   32 minutes ago   Up 32 minutes             redis-server-slave-1
098d30c165ca   redis     "docker-entrypoint.s…"   32 minutes ago   Up 32 minutes             redis-server-master
[root@localhost ~]# docker stop redis-server-master
redis-server-master

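Check the status of the other nodes

Once down-after-milliseconds (30 seconds here) has passed and the failover has completed, the node on port 6380 is still a replica, but it now replicates from port 6381: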

127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6381
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:394782
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:4f418fb2d0233b647ec82bdabc6582e7b104e4f3
master_replid2:8149377844b979ed775e37147d2775d7eee9d267
master_repl_offset:394782
second_repl_offset:393650
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:394782

The node on port 6381 has been promoted to master:

127.0.0.1:6381> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=396015,lag=0
master_failover_state:no-failover
master_replid:4f418fb2d0233b647ec82bdabc6582e7b104e4f3
master_replid2:8149377844b979ed775e37147d2775d7eee9d267
master_repl_offset:396303
second_repl_offset:393650
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:396303
127.0.0.1:6381> 
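The outputs above also show how the replication history is carried over: the new master gets a fresh master_replid and keeps the old master's ID in master_replid2. The sketch below checks this using values copied from the outputs in this article; the function name and the dictionary layout are assumptions made for the example.

```python
# Values copied from the "info replication" outputs in this article.
OLD_MASTER = {
    "role": "master",
    "master_replid": "8149377844b979ed775e37147d2775d7eee9d267",
}
NEW_MASTER = {  # port 6381 after the failover
    "role": "master",
    "master_replid": "4f418fb2d0233b647ec82bdabc6582e7b104e4f3",
    "master_replid2": "8149377844b979ed775e37147d2775d7eee9d267",
    "second_repl_offset": "393650",
}
REPLICA_6380 = {  # port 6380 after the failover
    "role": "slave",
    "master_port": "6381",
    "master_link_status": "up",
    "master_replid": "4f418fb2d0233b647ec82bdabc6582e7b104e4f3",
}


def failover_completed(old_master, new_master, replica, new_master_port):
    """Check that the promotion happened and the replica follows the new master."""
    took_over = (
        new_master["role"] == "master"
        and new_master["master_replid"] != old_master["master_replid"]
        # The previous replication ID is kept so replicas can resync partially.
        and new_master["master_replid2"] == old_master["master_replid"]
    )
    follows = (
        replica["role"] == "slave"
        and int(replica["master_port"]) == new_master_port
        and replica["master_link_status"] == "up"
        and replica["master_replid"] == new_master["master_replid"]
    )
    return took_over and follows


print(failover_completed(OLD_MASTER, NEW_MASTER, REPLICA_6380, 6381))  # True
```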
posted @ 2021-05-10 17:08  scogee