马哥博客作业第十四周

1.MariaDB主从复制原理
1>.主库DB的更新事件(update、insert、delete)被写到binlog日志。
2>.主库创建一个Bin log输出线程,把Bin log内容发送到从库。
3>.从库启动并发起连接,连接到主库。
4>.从库启动后,创建一个I/O线程,读取主库传过来的Bin log内容并写入到Relay log中。
5>.从库启动后,创建一个SQL线程,从Relay log里面读取内容,将更新的内容写到到从库的DB。
总结:对于每一个主从复制的连接,都有三个线程。
拥有多个从库的主库为每一个连接到主库的从库创建一个binlog输出线程,每一个从库都有它自己的I/O线程和SQL线程。

 

2.1MariaDB一主一从架构构建(新建主从)
1>.主节点配置:
1.启用二进制日志,为当前节点设置一个全局惟一的ID号。
[root@master ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=18
log-bin=/data/mysql/mysql-bin
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
2.重新启动mysql服务
systemctl restart mysqld
3.查看二进制日志的文件和位置开始进行复制
mysql> show master logs;
| Log_name | File_size | Encrypted |
+------------------+-----------+-----------+
| mysql-bin.000001 | 155 | No |
4.创建有复制权限的用户账号
mysql>create user repluser@'10.0.0.%' identified by 'magedu';
mysql>grant replication slave on *.* to repluser@'10.0.0.%';


2>.从节点配置:
1.为当前节点设置一个全局惟一的ID号,设置数据库只读。
[root@slave ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=28
read-only
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
2.重新启动mysql服务
systemctl restart mysqld
3.使用有复制权限的用户账号连接至主服务器,并启动复制线程
mysql>CHANGE MASTER TO
MASTER_HOST='10.0.0.18',
MASTER_USER='repluser',
MASTER_PASSWORD='magedu',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=155;
mysql> start slave;
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.0.18
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 155
Relay_Log_File: slave-relay-bin.000003
Relay_Log_Pos: 369
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 155
Relay_Log_Space: 1734
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 18
Master_UUID: c620a37b-eabd-11ea-a7be-000c29a591e6
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:

#建议优化主和从节点服务器的性能
mysql> set global innodb_flush_log_at_trx_commit=2;
mysql> set global sync_binlog=0;

2.2MariaDB一主一从架构构建(主节点运行一段时间后从节点后加入)
1.主节点配置
1>.在主服务器完全备份
[root@master ~]#mysqldump -A -F --single-transaction --master-data=1 > /data/fullbackup.sql
[root@master ~]#scp /data/fullbackup.sql 10.0.0.28:/data
2>.从节点配置:
1.为当前节点设置一个全局惟一的ID号,设置数据库只读。
[root@slave ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=28
read-only
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
2.重新启动mysql服务
systemctl restart mysqld
3.将完全备份还原到新的从节点
[root@slave ~]#grep '^CHANGE MASTER' /data/fullbackup.sql
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=155;
#修改备份文件,添加主机名、用户、密码和端口。
[root@slave ~]#vim /data/fullbackup.sql
CHANGE MASTER TO
MASTER_HOST='10.0.0.18',
MASTER_USER='repluser',
MASTER_PASSWORD='magedu',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000003',
MASTER_LOG_POS=155;
mysql> set sql_log_bin=0;
[root@slave ~]#mysql < /data/fullbackup.sql
mysql> start slave;
mysql> set sql_log_bin=1;

 

3.MariaDB级联复制
1>.主节点配置:
1.启用二进制日志,为当前节点设置一个全局惟一的ID号。
[root@master ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=18
log-bin=/data/mysql/mysql-bin
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
2.重新启动mysql服务
systemctl restart mysqld
3.创建有复制权限的用户账号
mysql>create user repluser@'10.0.0.%' identified by 'magedu';
mysql>grant replication slave on *.* to repluser@'10.0.0.%';
4.在主节点完全备份
[root@master ~]#mysqldump -A -F --single-transaction --master-data=1 > /data/all.sql
[root@master ~]#scp /data/all.sql 10.0.0.28:/data
[root@master ~]#scp /data/all.sql 10.0.0.38:/data
2>.中间级联slave配置
1.[root@slave ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=28
log-bin
read-only
log_slave_updates
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
2.重新启动mysql服务
[root@slave ~]#systemctl restart mysqld
3.还原数据库
[root@slave ~]#vim /data/all.sql
CHANGE MASTER TO
MASTER_HOST='10.0.0.18',
MASTER_USER='repluser',
MASTER_PASSWORD='magedu',
MASTER_PORT=3306,
MASTER_LOG_FILE='mysql-bin.000005',
MASTER_LOG_POS=155;
4.[root@slave ~]#mysql
mysql>set log_bin=0
mysql>source /data/all.sql;
mysql>show master logs;  #记录二进制位置,给第三个节点使用
binlog.000007 | 155
mysql>set log_bin=1;
mysql>start slave;
3>.在第二个从节点配置
1.[root@slave1 ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=38
read-only
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
2.重新启动mysql服务
[root@slave1 ~]#systemctl restart mysqld
3.还原数据库
[root@slave1 ~]#vim /data/all.sql
CHANGE MASTER TO
MASTER_HOST='10.0.0.28',
MASTER_USER='repluser',
MASTER_PASSWORD='magedu',
MASTER_PORT=3306,
MASTER_LOG_FILE='binlog.000007',
MASTER_LOG_POS=155;
4.[root@slave1 ~]#mysql
mysql>set log_bin=0
mysql>source /data/all.sql;
mysql>set log_bin=1;
mysql>start slave;

4.MariaDB半同步复制(主从复制已正常配置)
1>.主节点配置:
1.#安装主节点半同步插件
mysql>INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql>SHOW PLUGINS; #查看插件
2.#修改配置文件
[root@master ~]#cat /etc/my.cnf.d/mysql-server.cnf
server-id=18
log-bin
rpl_semi_sync_master_enabled=ON
rpl_semi_sync_master_timeout = 1000
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
3.重新启动mysql服务
systemctl restart mysqld
4.查看半同步状态
mysql>SHOW GLOBAL VARIABLES LIKE '%semi%';
mysql>SHOW GLOBAL STATUS LIKE '%semi%';
2>.其他所有从节点配置:
1.#安装从节点半同步插件
mysql>INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
2.#修改配置文件
[root@slave ~]#cat /etc/my.cnf.d/mysql-server.cnf
[mysqld]
server-id=28
read-only
rpl_semi_sync_slave_enabled=1
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/run/mysqld/mysqld.pid
#测试
#在master实现,创建数据库,立即成功
mysql> create database db3;
Query OK, 1 row affected (0.00 sec)

#在所有slave节点实现,停止复制线程
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)

#在master实现,创建数据库,等待1s才能成功
mysql> create database db4;
Query OK, 1 row affected (1.01 sec)

#在任意一个slave节点实现,恢复复制线程
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

#在master实现,创建数据库,立即成功
mysql> create database db5;
Query OK, 1 row affected (0.01 sec)

5.MariaDB高可用方案MHA

环境:四台主机
10.0.0.202 CentOS7 manager
10.0.0.17 CentOS7 Master
10.0.0.27 CentOS7 Slave1
10.0.0.37 CentOS7 Slave2

1.在管理节点上安装manager和node两个包(不支持CentOS8,只支持CentOS7 以下版本)
[root@manager ~]#yum -y install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
[root@manager ~]#yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm

2.在所有MySQL服务器上安装node包
[root@master ~]#yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@slave1 ~]#yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@slave2 ~]#yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm

3.通过管理节点的公钥和私钥在所有节点实现相互之间ssh key验证
[root@manager ~]ssh-keygen
[root@manager ~]ssh-copy-id 10.0.0.202
[root@manager ~]rsync -av .ssh 10.0.0.17:/root/
[root@manager ~]rsync -av .ssh 10.0.0.27:/root/
[root@manager ~]rsync -av .ssh 10.0.0.37:/root/

4.在管理节点建立配置文件
[root@manager ~]#mkdir /etc/mastermha/
[root@manager ~]#vim /etc/mastermha/app1.cnf
[server default]
user=mhauser #用于远程连接MySQL所有节点的用户,需要有管理员的权限
password=magedu
manager_workdir=/data/mastermha/app1/
manager_log=/data/mastermha/app1/manager.log
remote_workdir=/data/mastermha/app1/
ssh_user=root #用于实现远程ssh基于KEY的连接,访问二进制日志
repl_user=repluser #主从复制的用户信息
repl_password=magedu
ping_interval=1 #健康性检查的时间间隔
master_ip_failover_script=/usr/local/bin/master_ip_failover #切换VIP的perl脚本
report_script=/usr/local/bin/sendmail.sh #执行邮件报警脚本
#check_repl_delay=0 #默认如果slave中从库落后主库realylog 100M,主库不会选择这个从库为新的master,因为这个从库进行恢复需要很长的时间.通过这个参数,mha触发主从切换的时候会忽略复制的延时,这个参数对于设置了candidate_master=1的从库非常有用,这样确保这个从库一定能成为最新的master。
[server1]
hostname=10.0.0.17
[server2]
hostname=10.0.0.27
#candidate_master=1 #设置为优先候选master,即使不是集群中事件最新的slave,也会优先当master
[server3]
hostname=10.0.0.37

相关脚本配置
[root@manager ~]#cat /usr/local/bin/sendmail.sh
echo "MySQL is down" | mail -s "MHA Warning" 382314765@qq.com
[root@manager ~]#cat /etc/mail.rc
set from=382314765@qq.com
set smtp=smtp.qq.com
set smtp-auth-user=382314765@qq.com
set smtp-auth-password=ytayshjgebhucbai
[root@manager ~]#cat /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '10.0.0.200';#设置Virtual IP
my $gateway = '10.0.0.254';#网关Gateway IP
my $interface = 'eth0';
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip;/sbin/arping -I $interface -c 3 -s $vip $gateway >/dev/null 2>&1";
my $ssh_stop_vip = "/sbin/ifconfig $interface:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

5. 实现Master
[root@master ~]#cat /etc/yum.repos.d/mysql57.repo
[msyql57]
name=msyql57
baseurl=https://mirrors.tuna.tsinghua.edu.cn/mysql/yum/mysql-5.7-community-el7-x86_64/
gpgcheck=0
[root@master ~]#yum -y install mysql-server
[root@master ~]#systemctl start mysqld
[root@master ~]#grep "A temporary password" /var/log/mysqld.log
2020-09-07T02:49:49.452646Z 1 [Note] A temporary password is generated for root@localhost: iQmBa1#ouT&/
[root@master ~]#mysqladmin -uroot -p'NTMTfeVVw8!u' password '1qaz!QAZ'
[root@master ~]mysql -uroot -p'1qaz!QAZ'
#安装半同步复制插件
mysql>INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql>INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
mysql>grant replication slave on *.* to repluser@'10.0.0.%' identified by 'magedu'; #创建同步账户
mysql>grant all on *.* to mhauser@'10.0.0.%' identified by 'magedu'; #创建远程连接MySQL所有节点的用户,需要有管理员的权限
[root@master ~]#vim /etc/my.cnf
server_id=17
log-bin
skip_name_resolve=1 #禁止反向解析
general_log #启用通用日志,
rpl_semi_sync_master_enabled=1 #开启半同步复制
rpl_semi_sync_master_timeout = 1000 #半同步超时时间为1S
rpl_semi_sync_slave_enabled=1
gtid_mode=ON #开启GTID复制
enforce_gtid_consistency #开启GTID复制
validate_password=off #关闭密码策略
log-error=/var/log/mysqld.log
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
pid-file=/var/run/mysqld/mysqld.pid
[root@master ~]#systemctl restart mysqld
配置VIP
[root@master ~]#ifconfig eth0:1 10.0.0.200/24

6.实现slave1
[root@slave1 ~]#cat /etc/yum.repos.d/mysql57.repo
[msyql57]
name=msyql57
baseurl=https://mirrors.tuna.tsinghua.edu.cn/mysql/yum/mysql-5.7-community-el7-x86_64/
gpgcheck=0
[root@slave1 ~]#yum -y install mysql-server
[root@slave1 ~]#systemctl start mysqld
[root@slave1 ~]#grep "A temporary password" /var/log/mysqld.log
2020-09-07T02:49:54.830529Z 1 [Note] A temporary password is generated for root@localhost: fu*e9uGzQKoe
[root@slave1 ~]#mysqladmin -uroot -p'fu*e9uGzQKoe' password '1qaz!QAZ'
[root@slave1 ~]#mysql -uroot -p'1qaz!QAZ'
#安装半同步复制插件
mysql>INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql>INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
[root@slave1 ~]#vim /etc/my.cnf
server-id=27
log-bin
read_only #设置从库只读,避免主从不一致
relay_log_purge=0 #不自动清除relay_log日志
skip_name_resolve=1
rpl_semi_sync_master_enabled=1
rpl_semi_sync_master_timeout = 1000
rpl_semi_sync_slave_enabled=1
gtid_mode=ON
enforce_gtid_consistency
log-error=/var/log/mysqld.log
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
pid-file=/var/run/mysqld/mysqld.pid
[root@slave1 ~]#systemctl restart mysqld
[root@slave1 ~]#mysql -uroot -p'1qaz!QAZ'
CHANGE MASTER TO
MASTER_HOST='10.0.0.17',
MASTER_USER='repluser',
MASTER_PASSWORD='magedu',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1;
mysql>start slave;
mysql>show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.0.17
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000002
Read_Master_Log_Pos: 738
Relay_Log_File: slave1-relay-bin.000002
Relay_Log_Pos: 953
Relay_Master_Log_File: master-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 738
Relay_Log_Space: 1161
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 17
Master_UUID: e4557fa3-f0b5-11ea-9374-000c29424435
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: e4557fa3-f0b5-11ea-9374-000c29424435:1-2
Executed_Gtid_Set: e4557fa3-f0b5-11ea-9374-000c29424435:1-2
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
mysql> select user,host from mysql.user; #用户已从主节点同步过来
+---------------+-----------+
| user | host |
+---------------+-----------+
| mhauser | 10.0.0.% |
| repluser | 10.0.0.% |
| mysql.session | localhost |
| mysql.sys | localhost |
| root | localhost |
+---------------+-----------+
slave2和slave1配置相同

7.在管理节点检查Mha的环境
[root@manager ~]masterha_check_ssh --conf=/etc/mastermha/app1.cnf
Mon Sep 7 12:05:09 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 7 12:05:09 2020 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:05:09 2020 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:05:09 2020 - [info] Starting SSH connection tests..
Mon Sep 7 12:05:10 2020 - [debug]
Mon Sep 7 12:05:09 2020 - [debug] Connecting via SSH from root@10.0.0.17(10.0.0.17:22) to root@10.0.0.27(10.0.0.27:22)..
Warning: Permanently added '10.0.0.27' (ECDSA) to the list of known hosts.
Mon Sep 7 12:05:10 2020 - [debug] ok.
Mon Sep 7 12:05:10 2020 - [debug] Connecting via SSH from root@10.0.0.17(10.0.0.17:22) to root@10.0.0.37(10.0.0.37:22)..
Warning: Permanently added '10.0.0.37' (ECDSA) to the list of known hosts.
n Sep 7 12:05:10 2020 - [debug] ok.
Mon Sep 7 12:05:11 2020 - [debug]
Mon Sep 7 12:05:10 2020 - [debug] Connecting via SSH from root@10.0.0.27(10.0.0.27:22) to root@10.0.0.17(10.0.0.17:22)..
Mon Sep 7 12:05:10 2020 - [debug] ok.
Mon Sep 7 12:05:10 2020 - [debug] Connecting via SSH from root@10.0.0.27(10.0.0.27:22) to root@10.0.0.37(10.0.0.37:22)..
Warning: Permanently added '10.0.0.37' (ECDSA) to the list of known hosts.
Mon Sep 7 12:05:10 2020 - [debug] ok.
Mon Sep 7 12:05:12 2020 - [debug]
Mon Sep 7 12:05:10 2020 - [debug] Connecting via SSH from root@10.0.0.37(10.0.0.37:22) to root@10.0.0.17(10.0.0.17:22)..
Mon Sep 7 12:05:11 2020 - [debug] ok.
Mon Sep 7 12:05:11 2020 - [debug] Connecting via SSH from root@10.0.0.37(10.0.0.37:22) to root@10.0.0.27(10.0.0.27:22)..
Mon Sep 7 12:05:11 2020 - [debug] ok.
Mon Sep 7 12:05:12 2020 - [info] All SSH connection tests passed successfully.

[root@manager ~]#masterha_check_repl --conf=/etc/mastermha/app1.cnf
Mon Sep 7 12:22:03 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 7 12:22:03 2020 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:22:03 2020 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:22:03 2020 - [info] MHA::MasterMonitor version 0.58.
Mon Sep 7 12:22:05 2020 - [info] GTID failover mode = 1
Mon Sep 7 12:22:05 2020 - [info] Dead Servers:
Mon Sep 7 12:22:05 2020 - [info] Alive Servers:
Mon Sep 7 12:22:05 2020 - [info] 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:22:05 2020 - [info] 10.0.0.27(10.0.0.27:3306)
Mon Sep 7 12:22:05 2020 - [info] 10.0.0.37(10.0.0.37:3306)
Mon Sep 7 12:22:05 2020 - [info] Alive Slaves:
Mon Sep 7 12:22:05 2020 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:22:05 2020 - [info] GTID ON
Mon Sep 7 12:22:05 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:22:05 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Sep 7 12:22:05 2020 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:22:05 2020 - [info] GTID ON
Mon Sep 7 12:22:05 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:22:05 2020 - [info] Current Alive Master: 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:22:05 2020 - [info] Checking slave configurations..
Mon Sep 7 12:22:05 2020 - [info] read_only=1 is not set on slave 10.0.0.27(10.0.0.27:3306).
Mon Sep 7 12:22:05 2020 - [info] read_only=1 is not set on slave 10.0.0.37(10.0.0.37:3306).
Mon Sep 7 12:22:05 2020 - [info] Checking replication filtering settings..
Mon Sep 7 12:22:05 2020 - [info] binlog_do_db= , binlog_ignore_db=
Mon Sep 7 12:22:05 2020 - [info] Replication filtering check ok.
Mon Sep 7 12:22:05 2020 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon Sep 7 12:22:05 2020 - [info] Checking SSH publickey authentication settings on the current master..
Mon Sep 7 12:22:05 2020 - [info] HealthCheck: SSH to 10.0.0.17 is reachable.
Mon Sep 7 12:22:05 2020 - [info]
10.0.0.17(10.0.0.17:3306) (current master)
+--10.0.0.27(10.0.0.27:3306)
+--10.0.0.37(10.0.0.37:3306)

Mon Sep 7 12:22:05 2020 - [info] Checking replication health on 10.0.0.27..
Mon Sep 7 12:22:05 2020 - [info] ok.
Mon Sep 7 12:22:05 2020 - [info] Checking replication health on 10.0.0.37..
Mon Sep 7 12:22:05 2020 - [info] ok.
Mon Sep 7 12:22:05 2020 - [info] Checking master_ip_failover_script status:
Mon Sep 7 12:22:05 2020 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.0.0.17 --orig_master_ip=10.0.0.17 --orig_master_port=3306


IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 10.0.0.200;/sbin/arping -I eth0 -c 3 -s 10.0.0.200 10.0.0.254 >/dev/null 2>&1===

Checking the Status of the script.. OK
Mon Sep 7 12:22:08 2020 - [info] OK.
Mon Sep 7 12:22:08 2020 - [warning] shutdown_script is not defined.
Mon Sep 7 12:22:08 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

8. 在管理节点启动MHA
#开启MHA,默认是前台运行
[root@manager ~]#nohup masterha_manager --conf=/etc/mastermha/app1.cnf &> /dev/null &

#查看状态
[root@manager ~]#masterha_check_status --conf=/etc/mastermha/app1.cnf
app1 (pid:1872) is running(0:PING_OK), master:10.0.0.17

#查看到主节点的健康性检查
[root@master ~]#tail -f /var/lib/mysql/master.log
2020-09-07T04:27:08.684295Z 18 Query SELECT 1 As Value
2020-09-07T04:27:09.680138Z 18 Query SELECT 1 As Value
2020-09-07T04:27:10.680963Z 18 Query SELECT 1 As Value
2020-09-07T04:27:11.681818Z 18 Query SELECT 1 As Value
2020-09-07T04:27:12.684381Z 18 Query SELECT 1 As Value
2020-09-07T04:27:13.687269Z 18 Query SELECT 1 As Value

9.模拟主节点故障
[root@master ~]systemctl restart mysqld

#当 master down机后,mha自动退出
[root@manager bin]#masterha_manager --conf=/etc/mastermha/app1.cnf
Mon Sep 7 12:27:05 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 7 12:27:05 2020 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:27:05 2020 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:38:27 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 7 12:38:27 2020 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:38:27 2020 - [info] Reading server configuration from /etc/mastermha/app1.cnf..

10.排错日志
[root@manager ~]#tail -f /data/mastermha/app1/manager.log

IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 10.0.0.200;/sbin/arping -I eth0 -c 3 -s 10.0.0.200 10.0.0.254 >/dev/null 2>&1===

Checking the Status of the script.. OK
Mon Sep 7 12:27:09 2020 - [info] OK.
Mon Sep 7 12:27:09 2020 - [warning] shutdown_script is not defined.
Mon Sep 7 12:27:09 2020 - [info] Set master ping interval 1 seconds.
Mon Sep 7 12:27:09 2020 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Mon Sep 7 12:27:09 2020 - [info] Starting ping health check on 10.0.0.17(10.0.0.17:3306)..
Mon Sep 7 12:27:09 2020 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Mon Sep 7 12:38:23 2020 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Mon Sep 7 12:38:23 2020 - [info] Executing SSH check script: exit 0
Mon Sep 7 12:38:23 2020 - [info] HealthCheck: SSH to 10.0.0.17 is reachable.
Mon Sep 7 12:38:24 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.0.0.17' (111))
Mon Sep 7 12:38:24 2020 - [warning] Connection failed 2 time(s)..
Mon Sep 7 12:38:25 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.0.0.17' (111))
Mon Sep 7 12:38:25 2020 - [warning] Connection failed 3 time(s)..
Mon Sep 7 12:38:26 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.0.0.17' (111))
Mon Sep 7 12:38:26 2020 - [warning] Connection failed 4 time(s)..
Mon Sep 7 12:38:26 2020 - [warning] Master is not reachable from health checker!
Mon Sep 7 12:38:26 2020 - [warning] Master 10.0.0.17(10.0.0.17:3306) is not reachable!
Mon Sep 7 12:38:26 2020 - [warning] SSH is reachable.
Mon Sep 7 12:38:26 2020 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mastermha/app1.cnf again, and trying to connect to all servers to check server status..
Mon Sep 7 12:38:26 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 7 12:38:26 2020 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:38:26 2020 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
Mon Sep 7 12:38:27 2020 - [info] GTID failover mode = 1
Mon Sep 7 12:38:27 2020 - [info] Dead Servers:
Mon Sep 7 12:38:27 2020 - [info] 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:27 2020 - [info] Alive Servers:
Mon Sep 7 12:38:27 2020 - [info] 10.0.0.27(10.0.0.27:3306)
Mon Sep 7 12:38:27 2020 - [info] 10.0.0.37(10.0.0.37:3306)
Mon Sep 7 12:38:27 2020 - [info] Alive Slaves:
Mon Sep 7 12:38:27 2020 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:27 2020 - [info] GTID ON
Mon Sep 7 12:38:27 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:27 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Sep 7 12:38:27 2020 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:27 2020 - [info] GTID ON
Mon Sep 7 12:38:27 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:27 2020 - [info] Checking slave configurations..
Mon Sep 7 12:38:27 2020 - [info] Checking replication filtering settings..
Mon Sep 7 12:38:27 2020 - [info] Replication filtering check ok.
Mon Sep 7 12:38:27 2020 - [info] Master is down!
Mon Sep 7 12:38:27 2020 - [info] Terminating monitoring script.
Mon Sep 7 12:38:27 2020 - [info] Got exit code 20 (Master dead).
Mon Sep 7 12:38:27 2020 - [info] MHA::MasterFailover version 0.58.
Mon Sep 7 12:38:27 2020 - [info] Starting master failover.
Mon Sep 7 12:38:27 2020 - [info]
Mon Sep 7 12:38:27 2020 - [info] * Phase 1: Configuration Check Phase..
Mon Sep 7 12:38:27 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] GTID failover mode = 1
Mon Sep 7 12:38:28 2020 - [info] Dead Servers:
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] Checking master reachability via MySQL(double check)...
Mon Sep 7 12:38:28 2020 - [info] ok.
Mon Sep 7 12:38:28 2020 - [info] Alive Servers:
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.27(10.0.0.27:3306)
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.37(10.0.0.37:3306)
Mon Sep 7 12:38:28 2020 - [info] Alive Slaves:
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] Starting GTID based failover.
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] * Phase 2: Dead Master Shutdown Phase..
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] Forcing shutdown so that applications never connect to the current master..
Mon Sep 7 12:38:28 2020 - [info] Executing master IP deactivation script:
Mon Sep 7 12:38:28 2020 - [info] /usr/local/bin/master_ip_failover --orig_master_host=10.0.0.17 --orig_master_ip=10.0.0.17 --orig_master_port=3306 --command=stopssh --ssh_user=root


IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 10.0.0.200;/sbin/arping -I eth0 -c 3 -s 10.0.0.200 10.0.0.254 >/dev/null 2>&1===

Disabling the VIP on old master: 10.0.0.17
Mon Sep 7 12:38:28 2020 - [info] done.
Mon Sep 7 12:38:28 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon Sep 7 12:38:28 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] * Phase 3: Master Recovery Phase..
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] The latest binary log file/position on all slaves is master-bin.000002:738
Mon Sep 7 12:38:28 2020 - [info] Retrieved Gtid Set: e4557fa3-f0b5-11ea-9374-000c29424435:1-2
Mon Sep 7 12:38:28 2020 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] The oldest binary log file/position on all slaves is master-bin.000002:738
Mon Sep 7 12:38:28 2020 - [info] Retrieved Gtid Set: e4557fa3-f0b5-11ea-9374-000c29424435:1-2
Mon Sep 7 12:38:28 2020 - [info] Oldest slaves:
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] * Phase 3.3: Determining New Master Phase..
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] Searching new master from slaves..
Mon Sep 7 12:38:28 2020 - [info] Candidate masters from the configuration file:
Mon Sep 7 12:38:28 2020 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Mon Sep 7 12:38:28 2020 - [info] GTID ON
Mon Sep 7 12:38:28 2020 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Mon Sep 7 12:38:28 2020 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Sep 7 12:38:28 2020 - [info] Non-candidate masters:
Mon Sep 7 12:38:28 2020 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Mon Sep 7 12:38:28 2020 - [info] New master is 10.0.0.27(10.0.0.27:3306)
Mon Sep 7 12:38:28 2020 - [info] Starting master failover..
Mon Sep 7 12:38:28 2020 - [info]
From:
10.0.0.17(10.0.0.17:3306) (current master)
+--10.0.0.27(10.0.0.27:3306)
+--10.0.0.37(10.0.0.37:3306)

To:
10.0.0.27(10.0.0.27:3306) (new master)
+--10.0.0.37(10.0.0.37:3306)
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Sep 7 12:38:28 2020 - [info]
Mon Sep 7 12:38:28 2020 - [info] Waiting all logs to be applied..
Mon Sep 7 12:38:28 2020 - [info] done.
Mon Sep 7 12:38:28 2020 - [info] Getting new master's binlog name and position..
Mon Sep 7 12:38:28 2020 - [info] slave1-bin.000002:154
Mon Sep 7 12:38:28 2020 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.0.27', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repluser', MASTER_PASSWORD='xxx';
Mon Sep 7 12:38:28 2020 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: slave1-bin.000002, 154, e4557fa3-f0b5-11ea-9374-000c29424435:1-2
Mon Sep 7 12:38:28 2020 - [info] Executing master IP activate script:
Mon Sep 7 12:38:28 2020 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=10.0.0.17 --orig_master_ip=10.0.0.17 --orig_master_port=3306 --new_master_host=10.0.0.27 --new_master_ip=10.0.0.27 --new_master_port=3306 --new_master_user='mhauser' --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password


IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 10.0.0.200;/sbin/arping -I eth0 -c 3 -s 10.0.0.200 10.0.0.254 >/dev/null 2>&1===

Enabling the VIP - 10.0.0.200 on the new master - 10.0.0.27
Mon Sep 7 12:38:31 2020 - [info] OK.
Mon Sep 7 12:38:31 2020 - [info] Setting read_only=0 on 10.0.0.27(10.0.0.27:3306)..
Mon Sep 7 12:38:31 2020 - [info] ok.
Mon Sep 7 12:38:31 2020 - [info] ** Finished master recovery successfully.
Mon Sep 7 12:38:31 2020 - [info] * Phase 3: Master Recovery Phase completed.
Mon Sep 7 12:38:31 2020 - [info]
Mon Sep 7 12:38:31 2020 - [info] * Phase 4: Slaves Recovery Phase..
Mon Sep 7 12:38:31 2020 - [info]
Mon Sep 7 12:38:31 2020 - [info]
Mon Sep 7 12:38:31 2020 - [info] * Phase 4.1: Starting Slaves in parallel..
Mon Sep 7 12:38:31 2020 - [info]
Mon Sep 7 12:38:31 2020 - [info] -- Slave recovery on host 10.0.0.37(10.0.0.37:3306) started, pid: 5545. Check tmp log /data/mastermha/app1//10.0.0.37_3306_20200907123827.log if it takes time..
Mon Sep 7 12:38:34 2020 - [info]
Mon Sep 7 12:38:34 2020 - [info] Log messages from 10.0.0.37 ...
Mon Sep 7 12:38:34 2020 - [info]
Mon Sep 7 12:38:32 2020 - [info] Resetting slave 10.0.0.37(10.0.0.37:3306) and starting replication from the new master 10.0.0.27(10.0.0.27:3306)..
Mon Sep 7 12:38:32 2020 - [info] Executed CHANGE MASTER.
Mon Sep 7 12:38:33 2020 - [info] Slave started.
Mon Sep 7 12:38:33 2020 - [info] gtid_wait(e4557fa3-f0b5-11ea-9374-000c29424435:1-2) completed on 10.0.0.37(10.0.0.37:3306). Executed 0 events.
Mon Sep 7 12:38:34 2020 - [info] End of log messages from 10.0.0.37.
Mon Sep 7 12:38:34 2020 - [info] -- Slave on host 10.0.0.37(10.0.0.37:3306) started.
Mon Sep 7 12:38:34 2020 - [info] All new slave servers recovered successfully.
Mon Sep 7 12:38:34 2020 - [info]
Mon Sep 7 12:38:34 2020 - [info] * Phase 5: New master cleanup phase..
Mon Sep 7 12:38:34 2020 - [info]
Mon Sep 7 12:38:34 2020 - [info] Resetting slave info on the new master..
Mon Sep 7 12:38:34 2020 - [info] 10.0.0.27: Resetting slave info succeeded.
Mon Sep 7 12:38:34 2020 - [info] Master failover to 10.0.0.27(10.0.0.27:3306) completed successfully.
Mon Sep 7 12:38:34 2020 - [info]

----- Failover Report -----

app1: MySQL Master failover 10.0.0.17(10.0.0.17:3306) to 10.0.0.27(10.0.0.27:3306) succeeded

Master 10.0.0.17(10.0.0.17:3306) is down!

Check MHA Manager logs at manager:/data/mastermha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 10.0.0.17(10.0.0.17:3306)
Selected 10.0.0.27(10.0.0.27:3306) as a new master.
10.0.0.27(10.0.0.27:3306): OK: Applying all logs succeeded.
10.0.0.27(10.0.0.27:3306): OK: Activated master IP address.
10.0.0.37(10.0.0.37:3306): OK: Slave started, replicating from 10.0.0.27(10.0.0.27:3306)
10.0.0.27(10.0.0.27:3306): Resetting slave info succeeded.
Master failover to 10.0.0.27(10.0.0.27:3306) completed successfully.
Mon Sep 7 12:38:34 2020 - [info] Sending mail..
从日志中看到已经切换到10.0.0.27(slave1)节点

#查看MHA状态,已经退出

[root@manager ~]#masterha_check_status --conf=/etc/mastermha/app1.cnf
app1 is stopped(2:NOT_RUNNING).

#验证VIP漂移至新的Master上
[root@slave1 ~]#ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.27 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fe80::250:56ff:fe38:ecae prefixlen 64 scopeid 0x20<link>
ether 00:50:56:38:ec:ae txqueuelen 1000 (Ethernet)
RX packets 157117 bytes 230663851 (219.9 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 39655 bytes 2896034 (2.7 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.200 netmask 255.0.0.0 broadcast 10.255.255.255
ether 00:50:56:38:ec:ae txqueuelen 1000 (Ethernet)
查看邮箱收到报警邮件

#切换完成后需要删除管理节点的failover文件
[root@manager ~]#rm -rf /data/mastermha/app1/app1.failover.complete
重新启动MHA
[root@manager ~]#nohup masterha_manager --conf=/etc/mastermha/app1.cnf &> /dev/null &
把新的主节点(slave1)配置文件中的read_only去掉,避免重启后无法写入数据。

11.把宕机的主节点从新添加到新集群中充当从节点
1>.启动read_only
2>.重新连接数据库,变成新的从节点
CHANGE MASTER TO
MASTER_HOST='10.0.0.27',(新的主节点IP)
MASTER_USER='repluser',
MASTER_PASSWORD='magedu',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1;

posted @ 2020-09-04 17:39  小小小小石头  阅读(259)  评论(0)    收藏  举报