• 博客园logo
  • 会员
  • 众包
  • 新闻
  • 博问
  • 闪存
  • 赞助商
  • HarmonyOS
  • Chat2DB
    • 搜索
      所有博客
    • 搜索
      当前博客
  • 写随笔 我的博客 短消息 简洁模式
    用户头像
    我的博客 我的园子 账号设置 会员中心 简洁模式 ... 退出登录
    注册 登录
CAPF兵哥
博客园    首页    新随笔    联系   管理    订阅  订阅

基于MHA的MySQL做高可用

一、MHA 简介
MHA(Master High Availability)目前在 MySQL 高可用方面是一个相对成熟的解决方案, 它由日本 DeNA 公司的 youshimaton 员工(现就职于 Facebook 公司)开发,是一套优秀的作 为 MySQL 高可用性环境下故障切换和主从角色提升的高可用软件。在 MySQL 故障切换过程 中,MHA 能做到在 0~30 秒之内自动完成数据库的主从故障切换操作,并且在进行故障切换 的过程中,MHA 能在最大程度上保证数据的一致性,以达到真正意义上的高可用。

MHA 由两部分组成:MHA Manager(管理节点)和 MHA Node(数据节点)。MHA Manager 可以单独部署在一台独立的机器上管理多个 master-slave 集群,也可以部署在一台 slave 节 点上。MHA Node 运行在每台 MySQL 服务器及 Manager 服务器上,MHA Manager 会定时探 测集群中的 master 节点,当 master 出现故障时,它可以自动将拥有最新数据的 slave 提升 为新的 master,然后将所有其他的 slave 重新指向新提升的 master。整个故障转移过程对应 用程序层面完全透明。

在 MHA 自动故障切换过程中,MHA 会试图从宕机的主服务器上保存二进制日志,最大 程度的保证数据不丢失,但这种操作是有概率性的。例如,如果主服务器硬件故障或无法通 过 ssh 访问,MHA没法保存二进制日志,只进行故障转移从而丢失了最新的数据。使用MySQL 5.5 的半同步复制,可以降低数据丢失的风险。MHA 可以与半同步复制结合起来。如果只有 一个 slave 已经收到了最新的二进制日志,MHA 可以将最新的二进制日志应用于其他所有的 slave 服务器上,因此可以保证所有节点的数据一致性。

目前 MHA 主要支持一主多从的架构,要搭建 MHA,要求一个复制集群中必须最少有三 台数据库服务器,一主二从,即一台充当 master,一台充当备用 master,另外一台充当从 库,因为至少需要三台服务器,出于机器成本的考虑,淘宝也在该基础上进行了改造,目前 淘宝 TMHA 已经支持一主一从。另外对于想快速搭建的可以参考:MHA 快速搭建 我们自己使用其实也可以使用 1 主 1 从,但是 master 主机宕机后无法切换,以及无法补全 binlog。master 的 mysqld 进程 crash 后,还是可以切换成功,以及补全 binlog 的。

1.1、工作流程
(1)从宕机崩溃的 master 上尝试保存二进制日志事件(binlog events); (2)识别含有最新更新的 slave 服务器; (3)应用差异的中继日志(relay log)到其他的 slave; (4)应用从 master 保存的二进制日志事件(binlog events); (5)提升一个 slave 为新的 master 服务器; (6)将其他的 slave 连接指向新的 master 进行主从复制;
1.2、MHA 工具介绍
MHA 软件由两部分组成,Manager 工具包和 Node 工具包,具体的说明如下。

Manager 工具包主要包括以下几个工具: ➢ masterha_check_ssh 检查 MHA 的 SSH 配置状况 ➢ masterha_check_repl 检查 MySQL 复制状况 ➢ masterha_manger 启动 MHA ➢ masterha_check_status 检测当前 MHA 运行状态 ➢ masterha_master_monitor 检测 master 是否宕机 ➢ masterha_master_switch 控制故障转移(自动或者手动) ➢ masterha_conf_host 添加或删除配置的 server 信息

Node 工具包(这些工具通常由 MHA Manager 的脚本触发,无需人为操作)主要包括以下几 个工具: ➢ save_binary_logs 保存和复制 master 的二进制日志 ➢ apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的 slave ➢ filter_mysqlbinlog 去除不必要的 ROLLBACK 事件(MHA 已不再使用这个工具)
➢ purge_relay_logs 清除中继日志(不会阻塞 SQL 线程)


QQ:981343876 6 / 50 Tel:13651028933


注意:为了尽可能的减少主库硬件损坏宕机造成的数据丢失,因此在配置 MHA 的同时建议 配置成 MySQL 5.5 的半同步复制。关于半同步复制原理各位自己进行查阅。(不是必须)
1

 

实验环境准备五台虚拟机

192.168.200.111(master主从)

192.168.200.112(master主从)

192.168.200.113(slave1从)

192.168.200.114(slave2从)

192.168.200.115(manager监控)

所有服务器关闭防火墙和selinux
systemctl stop firewalld
iptables -F
setenforce 0

  

二、前期环境部署 
2.1、配置所有主机名称 
master1 主机: hostname server01 bash 
 
master2 主机: hostname server02 bash 
 
slave1 主机: hostname server03 bash 
 
slave2 主机: hostname server04 bash 
 
manager 主机: hostname server05 bas

  

2.2、配置所有主机名映射 
[root@server05 ~]# vim /etc/hosts 
192.168.200.111 server01 
192.168.200.112 server02 
192.168.200.113 server03 
192.168.200.114 server04 
192.168.200.115 server05 
 
scp /etc/hosts 192.168.200.111:/etc/ 
scp /etc/hosts 192.168.200.112:/etc/ 
scp /etc/hosts 192.168.200.113:/etc/ 
scp /etc/hosts 192.168.200.114:/etc/ 
scp /etc/hosts 192.168.200.115:/etc/

  三、安装 MHA node 

3.1、所有主机安装 MHA node 及相关 perl 依赖包

安装 epel 源: 

[root@server02 ~]# rpm -ivh epel-release-latest-7.noarch.rpm 
warning: epel-release-latest-7.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:epel-release-7-11                ################################# [100%]

perl 依赖包并检查依赖包是否都已安装

[root@server02 ~]# yum install -y perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker


[root@server02 ~]# rpm -q perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker 
perl-DBD-MySQL-4.023-6.el7.x86_64
perl-DBI-1.627-4.el7.x86_64
perl-CPAN-1.9800-294.el7_6.noarch
perl-ExtUtils-CBuilder-0.28.2.6-294.el7_6.noarch
perl-ExtUtils-MakeMaker-6.68-3.el7.noarch

3.2、所有主机上安装 MHA Node 

tar xf mha4mysql-node-0.56.tar.gz 
cd mha4mysql-node-0.56/ 
perl Makefile.PL make && make install 

3.3、MHA Node 安装完后会在 /usr/local/bin 生成以下脚本 

[root@server02 mha4mysql-node-0.56]# ls -l /usr/local/bin/ 
total 40
-r-xr-xr-x. 1 root root 16346 Oct 23 01:15 apply_diff_relay_logs
-r-xr-xr-x. 1 root root  4807 Oct 23 01:15 filter_mysqlbinlog
-r-xr-xr-x. 1 root root  7401 Oct 23 01:15 purge_relay_logs
-r-xr-xr-x. 1 root root  7395 Oct 23 01:15 save_binary_logs

四、安装 MHA Manger 

注意:安装 MHA Manger 之前也需要安装 MHA Node
4.1、首先安装 MHA Manger 依赖的 perl 模块(我这里使用 yum 安装)

yum install -y perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl
yum -y install perl-Config-Tiny-2.14-7.el7.noarch.rpm 

检查软件包是否安装

[root@server05 ~]# rpm -q perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-DBI perl-Time-HiRes perl-Config-Tiny 
perl-5.16.3-294.el7_6.x86_64
perl-Log-Dispatch-2.41-1.el7.1.noarch
perl-Parallel-ForkManager-1.18-2.el7.noarch
perl-DBD-MySQL-4.023-6.el7.x86_64
perl-DBI-1.627-4.el7.x86_64
perl-Time-HiRes-1.9725-3.el7.x86_64
perl-Config-Tiny-2.14-7.el7.noarch

4.2、安装 MHA Manger 软件包 

tar xf mha4mysql-manager-0.56.tar.gz  
cd mha4mysql-manager-0.56/ 
perl Makefile.PL make && make instal

4.3、安装完成后会有以下脚本文件 

[root@server05 ~]# ls -l /usr/local/bin/
total 80
-r-xr-xr-x. 1 root root 16346 Oct 23 01:15 apply_diff_relay_logs
-r-xr-xr-x. 1 root root  4807 Oct 23 01:15 filter_mysqlbinlog
-r-xr-xr-x. 1 root root  1995 Oct 23 01:35 masterha_check_repl
-r-xr-xr-x. 1 root root  1779 Oct 23 01:35 masterha_check_ssh
-r-xr-xr-x. 1 root root  1865 Oct 23 01:35 masterha_check_status
-r-xr-xr-x. 1 root root  3201 Oct 23 01:35 masterha_conf_host
-r-xr-xr-x. 1 root root  2517 Oct 23 01:35 masterha_manager
-r-xr-xr-x. 1 root root  2165 Oct 23 01:35 masterha_master_monitor
-r-xr-xr-x. 1 root root  2373 Oct 23 01:35 masterha_master_switch
-r-xr-xr-x. 1 root root  3879 Oct 23 01:35 masterha_secondary_check
-r-xr-xr-x. 1 root root  1739 Oct 23 01:35 masterha_stop
-rwxr-xr-x. 1 root root  2807 Oct 23 01:56 master_ip_failover
-r-xr-xr-x. 1 root root  7401 Oct 23 01:15 purge_relay_logs
-r-xr-xr-x. 1 root root  7395 Oct 23 01:15 save_binary_logs

五、配置 SSH 密钥对验证 

5.1、Server05(192.168.200.115)上:
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.111 
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.112 
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.113 
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.114 
 
注意:Server05 需要连接每个主机测试,因为第一次连接的时候需要输入 yes,影响后期故 障切换时,对于每个主机的 SSH 控制。   ssh server(01-04)
111-114四台配置密钥对不需要给115其他的服务器都要给

六、安装 mysql 

[root@server01 ~]# yum -y install mariadb-server mariadb-devel
[root@server02 ~]# yum -y install mariadb-server mariadb-devel
[root@server03 ~]# yum -y install mariadb-server mariadb-devel
[root@server04 ~]# yum -y install mariadb-server mariadb-devel
systemctl start mariadb

设置数据库初始密码(后续操作中使用) 

mysqladmin -u root password 123456

七、搭建主从复制环境
注意:binlog-do-db 和 replicate-ignore-db 设置必须相同。 MHA 在启动时候会检测过滤规 则,如果过滤规则不同,MHA 将不启动监控和故障转移功能。
7.1、修改 mysql 主机的配置文件

Primary Master(192.168.200.111): vim /etc/my.cnf 
[mysqld] 
server-id = 1     
log-bin=master-bin       
log-slave-updates=true     
relay_log_purge=0  
 
systemctl restart mariadb 
 
Secondary Master(192.168.200.112): vim /etc/my.cnf 
[mysqld] 
server-id=2               
log-bin=master-bin         
log-slave-updates=true      
relay_log_purge=0    
systemctl restart mariadb 
 
slave1(192.168.200.113): vim /etc/my.cnf 
[mysqld] server-id=3     
log-bin=mysql-bin        
relay-log=slave-relay-bin     
log-slave-updates=true      
relay_log_purge=0    
systemctl restart mariadb 
 
slave2(192.168.200.114): vim /etc/my.cnf 
[mysqld] 
server-id=4     
log-bin=mysql-bin        
relay-log=slave-relay-bin     
log-slave-updates=true      
relay_log_purge=0    
systemctl restart mariadb 

mysql 服务器创建复制授权用户(111-114)
grant replication slave on *.* to 'repl'@'192.168.200.%' identified by '123456'; flush privileges;

查看主库备份时的 binlog 名称和位置 

MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000001 |      474 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

(112-114)做授权

stop slave;
CHANGE MASTER TO MASTER_HOST='192.168.200.111', MASTER_USER='repl', MASTER_PASSWORD='123456', MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=474; 
start slave; 
show slave status\G    
 
# 检查 IO 和 SQL 线程是否为:yes             
Slave_IO_Running: Yes             
Slave_SQL_Running: Yes 

三台 slave 服务器设置 read_only 状态 

从库对外只提供读服务,只所以没有写进 mysql 配置文件,是因为随时 server02 会提升为 master 

[root@server02 ~]# mysql -uroot -p123456 -e 'set global read_only=1' 
[root@server03 ~]# mysql -uroot -p123456 -e 'set global read_only=1'  
[root@server04 ~]# mysql -uroot -p123456 -e 'set global read_only=1' 

创建监控用户(111-114 主机上的操作):

grant all privileges on *.* to 'root'@'192.168.200.%' identified  by '123456'; 
flush privileges; 
 
为自己的主机名授权: (server01-04)
grant all privileges on *.* to 'root'@'server01' identified  by '123456'; 
flush privileges; 

配置 MHA 环境 

创建 MHA 的工作目录及相关配置文件 

Server05(192.168.200.115):在软件包解压后的目录里面有样例配置文件 

mkdir /etc/masterha 
cp mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha 

修改 app1.cnf 配置文件 

[root@server05 ~]# vim /etc/masterha/app1.cnf 

[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/var/lib/mysql
master_ip_failover_script=/usr/local/bin/master_ip_failover
password=123456
ping_interval=1
remote_workdir=/tmp
repl_password=123456
repl_user=repl
user=root

[server1]
hostname=server01
port=3306

[server2]
hostname=server02
candidate_master=1
port=3306
check_repl_delay=0

[server3]
hostname=server03
port=3306

[server4]
hostname=server04
port=3306

配置故障转移脚本 

[root@server05 ~]# vim /usr/local/bin/master_ip_failover 
#!/usr/bin/env perl 

use strict; 
use warnings FATAL => 'all'; 
use Getopt::Long; 
my ( 
$command, $ssh_user, $orig_master_host, $orig_master_ip, 
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port, 
); 
my $vip = '192.168.200.100';          	# 写入VIP
my $key = "1"; 	#非keepalived方式切换脚本使用的
my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens32:$key down"; #那么这里写服务的开关命令
$ssh_user = "root"; 
GetOptions( 
'command=s' => \$command, 
'ssh_user=s' => \$ssh_user, 
'orig_master_host=s' => \$orig_master_host, 
'orig_master_ip=s' => \$orig_master_ip, 
'orig_master_port=i' => \$orig_master_port, 
'new_master_host=s' => \$new_master_host, 
'new_master_ip=s' => \$new_master_ip, 
'new_master_port=i' => \$new_master_port, 
); 
exit &main(); 
sub main { 
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; 
if ( $command eq "stop" || $command eq "stopssh" ) { 
# $orig_master_host, $orig_master_ip, $orig_master_port are passed. 
# If you manage master ip address at global catalog database, 
# invalidate orig_master_ip here. 
my $exit_code = 1; 
#eval { 
# print "Disabling the VIP on old master: $orig_master_host \n"; 
# &stop_vip(); 
# $exit_code = 0; 
#}; 
eval { 
print "Disabling the VIP on old master: $orig_master_host \n"; 
#my $ping=`ping -c 1 10.0.0.13 | grep "packet loss" | awk -F',' '{print $3}' | awk '{print $1}'`; 
#if ( $ping le "90.0%"&& $ping gt "0.0%" ){ 
#$exit_code = 0; 
#} 
#else { 
&stop_vip(); 
# updating global catalog, etc 
$exit_code = 0; 
#} 
}; 

if ($@) { 
warn "Got Error: $@\n"; 
exit $exit_code; 
} 
exit $exit_code; 
} 
elsif ( $command eq "start" ) {
# all arguments are passed. 
# If you manage master ip address at global catalog database, 
# activate new_master_ip here. 
# You can also grant write access (create user, set read_only=0, etc) here. 
my $exit_code = 10; 
eval { 
print "Enabling the VIP - $vip on the new master - $new_master_host \n"; 
&start_vip(); 
$exit_code = 0; 
}; 
if ($@) { 
warn $@; 
exit $exit_code; 
} 
exit $exit_code; 
} 
elsif ( $command eq "status" ) { 
print "Checking the Status of the script.. OK \n"; 
`ssh $ssh_user\@$orig_master_ip \" $ssh_start_vip \"`; 
exit 0; 
} 
else { 
&usage(); 
exit 1; 
} 
} 
# A simple system call that enable the VIP on the new master 
sub start_vip() { 
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; 
} 
# A simple system call that disable the VIP on the old_master 
sub stop_vip() { 
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; 
} 
sub usage { 
print 
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --
new_master_host=host --new_master_ip=ip --new_master_port=port\n"; }

设置从库 relay log 的清除方式(112-114): 

mysql -uroot -p123456 -e 'set global relay_log_purge=0;' 

检查 MHA ssh 通信状态 

[root@server05 ~]#  masterha_check_repl --conf=/etc/masterha/app1.cnf
Wed Oct 23 02:58:00 2019 - [info]   server01(192.168.200.111:3306)
Wed Oct 23 02:58:00 2019 - [info]   server02(192.168.200.112:3306)
Wed Oct 23 02:58:00 2019 - [info]   server03(192.168.200.113:3306)
Wed Oct 23 02:58:00 2019 - [info]   server04(192.168.200.114:3306)
Wed Oct 23 02:58:07 2019 - [info] 
server01 (current master)
 +--server02
 +--server03
 +--server04
Checking the Status of the script.. OK 
Wed Oct 23 02:58:08 2019 - [info]  OK.
Wed Oct 23 02:58:08 2019 - [warning] shutdown_script is not defined.
Wed Oct 23 02:58:08 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

检查整个集群的状态 

[root@server05 ~]#  masterha_check_repl --conf=/etc/masterha/app1.cnf 
Wed Oct 23 02:59:37 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Oct 23 02:59:37 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Wed Oct 23 02:59:37 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Wed Oct 23 02:59:37 2019 - [info] MHA::MasterMonitor version 0.56.
Wed Oct 23 02:59:38 2019 - [info] Dead Servers:
Wed Oct 23 02:59:38 2019 - [info] Alive Servers:
Wed Oct 23 02:59:38 2019 - [info]   server01(192.168.200.111:3306)
Wed Oct 23 02:59:38 2019 - [info]   server02(192.168.200.112:3306)
Wed Oct 23 02:59:38 2019 - [info]   server03(192.168.200.113:3306)
Wed Oct 23 02:59:38 2019 - [info]   server04(192.168.200.114:3306)
Wed Oct 23 02:59:38 2019 - [info] Alive Slaves:
Wed Oct 23 02:59:38 2019 - [info]   server02(192.168.200.112:3306)  Version=5.5.64-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Oct 23 02:59:38 2019 - [info]     Replicating from 192.168.200.111(192.168.200.111:3306)
Wed Oct 23 02:59:38 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Oct 23 02:59:38 2019 - [info]   server03(192.168.200.113:3306)  Version=5.5.64-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Oct 23 02:59:38 2019 - [info]     Replicating from 192.168.200.111(192.168.200.111:3306)
Wed Oct 23 02:59:38 2019 - [info]   server04(192.168.200.114:3306)  Version=5.5.64-MariaDB (oldest major version between slaves) log-bin:enabled
Wed Oct 23 02:59:38 2019 - [info]     Replicating from 192.168.200.111(192.168.200.111:3306)
Wed Oct 23 02:59:38 2019 - [info] Current Alive Master: server01(192.168.200.111:3306)
Wed Oct 23 02:59:38 2019 - [info] Checking slave configurations..
Wed Oct 23 02:59:38 2019 - [info]  read_only=1 is not set on slave server02(192.168.200.112:3306).
Wed Oct 23 02:59:38 2019 - [warning]  relay_log_purge=0 is not set on slave server02(192.168.200.112:3306).
Wed Oct 23 02:59:38 2019 - [info] Checking replication filtering settings..
Wed Oct 23 02:59:38 2019 - [info]  binlog_do_db= , binlog_ignore_db= 
Wed Oct 23 02:59:38 2019 - [info]  Replication filtering check ok.
Wed Oct 23 02:59:38 2019 - [info] Starting SSH connection tests..
Wed Oct 23 02:59:42 2019 - [info] All SSH connection tests passed successfully.
Wed Oct 23 02:59:42 2019 - [info] Checking MHA Node version..
Wed Oct 23 02:59:43 2019 - [info]  Version check ok.
Wed Oct 23 02:59:43 2019 - [info] Checking SSH publickey authentication settings on the current master..
Wed Oct 23 02:59:43 2019 - [info] HealthCheck: SSH to server01 is reachable.
Wed Oct 23 02:59:44 2019 - [info] Master MHA Node version is 0.56.
Wed Oct 23 02:59:44 2019 - [info] Checking recovery script configurations on the current master..
Wed Oct 23 02:59:44 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000002 
Wed Oct 23 02:59:44 2019 - [info]   Connecting to root@server01(server01).. 
  Creating /tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to master-bin.000002
Wed Oct 23 02:59:44 2019 - [info] Master setting check done.
Wed Oct 23 02:59:44 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Oct 23 02:59:44 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server02 --slave_ip=192.168.200.112 --slave_port=3306 --workdir=/tmp --target_version=5.5.64-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Wed Oct 23 02:59:44 2019 - [info]   Connecting to root@192.168.200.112(server02:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000003
    Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000003
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Oct 23 02:59:44 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server03 --slave_ip=192.168.200.113 --slave_port=3306 --workdir=/tmp --target_version=5.5.64-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Wed Oct 23 02:59:44 2019 - [info]   Connecting to root@192.168.200.113(server03:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to slave-relay-bin.000002
    Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Oct 23 02:59:45 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server04 --slave_ip=192.168.200.114 --slave_port=3306 --workdir=/tmp --target_version=5.5.64-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Wed Oct 23 02:59:45 2019 - [info]   Connecting to root@192.168.200.114(server04:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to slave-relay-bin.000002
    Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Oct 23 02:59:45 2019 - [info] Slaves settings check done.
Wed Oct 23 02:59:45 2019 - [info] 
server01 (current master)
 +--server02
 +--server03
 +--server04

Wed Oct 23 02:59:45 2019 - [info] Checking replication health on server02..
Wed Oct 23 02:59:45 2019 - [info]  ok.
Wed Oct 23 02:59:45 2019 - [info] Checking replication health on server03..
Wed Oct 23 02:59:45 2019 - [info]  ok.
Wed Oct 23 02:59:45 2019 - [info] Checking replication health on server04..
Wed Oct 23 02:59:45 2019 - [info]  ok.
Wed Oct 23 02:59:45 2019 - [info] Checking master_ip_failover_script status:
Wed Oct 23 02:59:45 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=server01 --orig_master_ip=192.168.200.111 --orig_master_port=3306 


IN SCRIPT TEST====/sbin/ifconfig ens32:1 down==/sbin/ifconfig ens32:1 192.168.200.100===

Checking the Status of the script.. OK 
Wed Oct 23 02:59:46 2019 - [info]  OK.
Wed Oct 23 02:59:46 2019 - [warning] shutdown_script is not defined.
Wed Oct 23 02:59:46 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@server05 ~]#  masterha_check_ssh --conf=/etc/masterha/app1.cnf 
Wed Oct 23 03:00:14 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Oct 23 03:00:14 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Wed Oct 23 03:00:14 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Wed Oct 23 03:00:14 2019 - [info] Starting SSH connection tests..
Wed Oct 23 03:00:17 2019 - [debug] 
Wed Oct 23 03:00:14 2019 - [debug]  Connecting via SSH from root@server01(192.168.200.111:22) to root@server02(192.168.200.112:22)..
Wed Oct 23 03:00:15 2019 - [debug]   ok.
Wed Oct 23 03:00:15 2019 - [debug]  Connecting via SSH from root@server01(192.168.200.111:22) to root@server03(192.168.200.113:22)..
Wed Oct 23 03:00:16 2019 - [debug]   ok.
Wed Oct 23 03:00:16 2019 - [debug]  Connecting via SSH from root@server01(192.168.200.111:22) to root@server04(192.168.200.114:22)..
Wed Oct 23 03:00:17 2019 - [debug]   ok.
Wed Oct 23 03:00:17 2019 - [debug] 
Wed Oct 23 03:00:15 2019 - [debug]  Connecting via SSH from root@server02(192.168.200.112:22) to root@server01(192.168.200.111:22)..
Wed Oct 23 03:00:15 2019 - [debug]   ok.
Wed Oct 23 03:00:15 2019 - [debug]  Connecting via SSH from root@server02(192.168.200.112:22) to root@server03(192.168.200.113:22)..
Wed Oct 23 03:00:16 2019 - [debug]   ok.
Wed Oct 23 03:00:16 2019 - [debug]  Connecting via SSH from root@server02(192.168.200.112:22) to root@server04(192.168.200.114:22)..
Wed Oct 23 03:00:17 2019 - [debug]   ok.
Wed Oct 23 03:00:18 2019 - [debug] 
Wed Oct 23 03:00:15 2019 - [debug]  Connecting via SSH from root@server03(192.168.200.113:22) to root@server01(192.168.200.111:22)..
Wed Oct 23 03:00:16 2019 - [debug]   ok.
Wed Oct 23 03:00:16 2019 - [debug]  Connecting via SSH from root@server03(192.168.200.113:22) to root@server02(192.168.200.112:22)..
Wed Oct 23 03:00:17 2019 - [debug]   ok.
Wed Oct 23 03:00:17 2019 - [debug]  Connecting via SSH from root@server03(192.168.200.113:22) to root@server04(192.168.200.114:22)..
Wed Oct 23 03:00:17 2019 - [debug]   ok.
Wed Oct 23 03:00:18 2019 - [debug] 
Wed Oct 23 03:00:16 2019 - [debug]  Connecting via SSH from root@server04(192.168.200.114:22) to root@server01(192.168.200.111:22)..
Wed Oct 23 03:00:17 2019 - [debug]   ok.
Wed Oct 23 03:00:17 2019 - [debug]  Connecting via SSH from root@server04(192.168.200.114:22) to root@server02(192.168.200.112:22)..
Wed Oct 23 03:00:17 2019 - [debug]   ok.
Wed Oct 23 03:00:17 2019 - [debug]  Connecting via SSH from root@server04(192.168.200.114:22) to root@server03(192.168.200.113:22)..
Wed Oct 23 03:00:18 2019 - [debug]   ok.
Wed Oct 23 03:00:18 2019 - [info] All SSH connection tests passed successfully.

VIP 配置管理
Master vip 配置有两种方式,一种是通过 keepalived 或者 heartbeat 类似的软件的方式管 理 VIP 的浮动,另一种为通过命令方式管理。

[root@server05 ~]# grep -n 'master_ip_failover_script' /etc/masterha/app1.cnf 9:master_ip_failover_script= /usr/local/bin/master_ip_failover 
 
Primary Master(192.168.200.111)  [root@server01 ~]# ip a | grep ens32 
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000     
inet 192.168.200.111/24 brd 192.168.200.255 scope global ens32     
inet 192.168.200.100/24 brd 192.168.200.255 scope global secondary ens32:1 Server05(192.168.200.115)修改故障转移脚本 
[root@server05 ~]# head -13 /usr/local/bin/master_ip_failover #!/usr/bin/env perl  
 use strict;  
use warnings FATAL => 'all';  use Getopt::Long;  my (  $command, $ssh_user, $orig_master_host, $orig_master_ip,  
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port,  );  
my $vip = '192.168.200.100';           # 写入 VIP 
my $key = "1";  #非 keepalived 方式切换脚本使用的 
my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip"; #若是使用 keepalived 
my $ssh_stop_vip = "/sbin/ifconfig ens32:$key down"; #那么这里写服务的开关命令 

Server05(192.168.200.115) 检查 manager 状态 

masterha_check_status --conf=/etc/masterha/app1.cnf app1 is stopped(2:NOT_RUNNING). 

Server05(192.168.200.115) 开启 manager 监控 

[root@server05 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dea d_master_conf --ignore_last_failover< /dev/null >/var/log/masterha/app1/manager.log 2>&1 & [1] 65837 

Server05(192.168.200.115)查看 Server05 监控是否正常: 

[root@monitor ~]# masterha_check_status --conf=/etc/masterha/app1.cnf 
app1 (pid:65837) is running(0:PING_OK), master:server01

Server05(192.168.200.115)查看启动日志 

[root@server05 ~]# cat /var/log/masterha/app1/manager.log

关闭 MHA manager 监控,忽略操作

masterha_stop --conf=/etc/masterha/app1.cnf   

Primary Master(192.168.200.111) 模拟主库故障 

[root@server01 ~]# systemctl stop mariadb 
[root@server01 ~]# netstat -lnpt | grep :3306 
[root@server01 ~]# ip a | grep ens32 2
slave1(192.168.200.113)状态: 
MariaDB [(none)]> show slave status\G
 *************************** 1. row ***************************                Slave_IO_State: Waiting for master to send event                  
Master_Host: 192.168.200.112                   
Master_User: repl 
slave2(192.168.200.114)状态: MariaDB [(none)]> show slave status\G *************************** 1. row ***************************                Slave_IO_State: Waiting for master to send event                   
Master_Host: 192.168.200.112                   
Master_User: repl   

Server05(192.168.200.115) 监控已经自动关闭:

[root@server05 ~]# 

[1]+  完成                  nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1  

Server05(192.168.200.115) 查看监控配置文件已经发生了变化(server01 的配置已被删除)

故障主库修复及 VIP 切回测试 

Primary Master(192.168.200.111):

[root@server01 ~]# systemctl start mariadb

[root@server01 ~]# netstat -lnpt | grep :3306 

Primary Master(192.168.200.111) 指向新的主库 

[root@server01 ~]# mysql -u root -p123456 
stop slave; 
CHANGE MASTER TO 
MASTER_HOST='192.168.200.112', MASTER_USER='repl', MASTER_PASSWORD='123456'; 
start slave;
show slave status\G 
*************************** 1. row ***************************                
Slave_IO_State: Waiting for master to send event                   Master_Host: 192.168.200.112                   
Master_User: repl                   
Master_Port: 3306                 Connect_Retry: 60               
Master_Log_File: master-bin.000001           
Read_Master_Log_Pos: 1372                
Relay_Log_File: mariadb-relay-bin.000002                 
Relay_Log_Pos: 1208         
Relay_Master_Log_File: master-bin.000001              
Slave_IO_Running: Yes             
Slave_SQL_Running: Yes 

Server05(192.168.200.115) 修改监控配置文件添加 server1 配置

[root@server05 ~]# vim /etc/masterha/app1.cnf 
[server01] hostname=server01 port=3306 
   

检查群集查看

 

  

 

  

  

  

  

  

  

  

  

 

 

 

 

posted @ 2019-10-23 15:11  CAPF兵哥  阅读(204)  评论(0)    收藏  举报
刷新页面返回顶部
博客园  ©  2004-2025
浙公网安备 33010602011771号 浙ICP备2021040463号-3