mha
http://azhuang.blog.51cto.com/9176790/1744269
http://www.ywnds.com/?p=8129
Log after adding the binlog1 parameter:
no -o BatchMode=yes -o ConnectTimeout=5, timeout 5 Monitoring server 172.16.12.10 is reachable, Master is not reachable from 172.16.12.10. OK. Fri Dec 8 19:04:14 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Fri Dec 8 19:04:14 2017 - [warning] Connection failed 2 time(s).. Monitoring server 172.16.12.13 is reachable, Master is not reachable from 172.16.12.13. OK. Fri Dec 8 19:04:14 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start. Fri Dec 8 19:04:14 2017 - [info] HealthCheck: SSH to 172.16.12.11 is reachable. Fri Dec 8 19:04:17 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Fri Dec 8 19:04:17 2017 - [warning] Connection failed 3 time(s).. Fri Dec 8 19:04:20 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Fri Dec 8 19:04:20 2017 - [warning] Connection failed 4 time(s).. Fri Dec 8 19:04:20 2017 - [warning] Master is not reachable from health checker! Fri Dec 8 19:04:20 2017 - [warning] Master 172.16.12.11(172.16.12.11:3306) is not reachable! Fri Dec 8 19:04:20 2017 - [warning] SSH is reachable. Fri Dec 8 19:04:20 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status.. Fri Dec 8 19:04:20 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Fri Dec 8 19:04:20 2017 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Fri Dec 8 19:04:20 2017 - [info] Reading server configuration from /etc/mha/app1.cnf.. Fri Dec 8 19:04:20 2017 - [debug] Skipping connecting to dead master 172.16.12.11(172.16.12.11:3306). 
Fri Dec 8 19:04:20 2017 - [debug] Connecting to servers.. Fri Dec 8 19:04:20 2017 - [debug] Connected to: 172.16.12.12(172.16.12.12:3306), user=root Fri Dec 8 19:04:20 2017 - [debug] Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2 Fri Dec 8 19:04:20 2017 - [debug] Connected to: 172.16.12.13(172.16.12.13:3306), user=root Fri Dec 8 19:04:20 2017 - [debug] Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2 Fri Dec 8 19:04:20 2017 - [debug] Comparing MySQL versions.. Fri Dec 8 19:04:20 2017 - [debug] Comparing MySQL versions done. Fri Dec 8 19:04:20 2017 - [debug] Connecting to servers done. Fri Dec 8 19:04:20 2017 - [info] GTID failover mode = 1 Fri Dec 8 19:04:20 2017 - [info] Dead Servers: Fri Dec 8 19:04:20 2017 - [info] 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:20 2017 - [info] Alive Servers: Fri Dec 8 19:04:20 2017 - [info] 172.16.12.12(172.16.12.12:3306) Fri Dec 8 19:04:20 2017 - [info] 172.16.12.13(172.16.12.13:3306) Fri Dec 8 19:04:20 2017 - [info] Alive Slaves: Fri Dec 8 19:04:20 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:20 2017 - [info] GTID ON Fri Dec 8 19:04:20 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:20 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:20 2017 - [info] Primary candidate for the new Master (candidate_master is set) Fri Dec 8 19:04:20 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:20 2017 - [info] GTID ON Fri Dec 8 19:04:20 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:20 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:20 2017 - [info] Not candidate for the new Master (no_master is set) Fri Dec 8 19:04:20 2017 - [info] Checking slave configurations.. 
Fri Dec 8 19:04:20 2017 - [info] read_only=1 is not set on slave 172.16.12.12(172.16.12.12:3306). Fri Dec 8 19:04:20 2017 - [info] read_only=1 is not set on slave 172.16.12.13(172.16.12.13:3306). Fri Dec 8 19:04:20 2017 - [info] Checking replication filtering settings.. Fri Dec 8 19:04:20 2017 - [info] Replication filtering check ok. Fri Dec 8 19:04:20 2017 - [info] Master is down! Fri Dec 8 19:04:20 2017 - [info] Terminating monitoring script. Fri Dec 8 19:04:20 2017 - [debug] Disconnected from 172.16.12.12(172.16.12.12:3306) Fri Dec 8 19:04:20 2017 - [debug] Disconnected from 172.16.12.13(172.16.12.13:3306) Fri Dec 8 19:04:20 2017 - [info] Got exit code 20 (Master dead). Fri Dec 8 19:04:20 2017 - [info] MHA::MasterFailover version 0.57. Fri Dec 8 19:04:20 2017 - [info] Starting master failover. Fri Dec 8 19:04:20 2017 - [info] Fri Dec 8 19:04:20 2017 - [info] * Phase 1: Configuration Check Phase.. Fri Dec 8 19:04:20 2017 - [info] Fri Dec 8 19:04:20 2017 - [debug] SSH connection test to 172.16.12.11, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5 Fri Dec 8 19:04:20 2017 - [info] HealthCheck: SSH to 172.16.12.11 is reachable. Fri Dec 8 19:04:21 2017 - [info] Binlog server 172.16.12.11 is reachable. Fri Dec 8 19:04:21 2017 - [debug] Skipping connecting to dead master 172.16.12.11. Fri Dec 8 19:04:21 2017 - [debug] Connecting to servers.. Fri Dec 8 19:04:21 2017 - [debug] Connected to: 172.16.12.12(172.16.12.12:3306), user=root Fri Dec 8 19:04:21 2017 - [debug] Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2 Fri Dec 8 19:04:21 2017 - [debug] Connected to: 172.16.12.13(172.16.12.13:3306), user=root Fri Dec 8 19:04:21 2017 - [debug] Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2 Fri Dec 8 19:04:21 2017 - [debug] Comparing MySQL versions.. Fri Dec 8 19:04:21 2017 - [debug] Comparing MySQL versions done. 
Fri Dec 8 19:04:21 2017 - [debug] Connecting to servers done. Fri Dec 8 19:04:21 2017 - [info] GTID failover mode = 1 Fri Dec 8 19:04:21 2017 - [info] Dead Servers: Fri Dec 8 19:04:21 2017 - [info] 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:21 2017 - [info] Checking master reachability via MySQL(double check)... Fri Dec 8 19:04:21 2017 - [info] ok. Fri Dec 8 19:04:21 2017 - [info] Alive Servers: Fri Dec 8 19:04:21 2017 - [info] 172.16.12.12(172.16.12.12:3306) Fri Dec 8 19:04:21 2017 - [info] 172.16.12.13(172.16.12.13:3306) Fri Dec 8 19:04:21 2017 - [info] Alive Slaves: Fri Dec 8 19:04:21 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:21 2017 - [info] GTID ON Fri Dec 8 19:04:21 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:21 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:21 2017 - [info] Primary candidate for the new Master (candidate_master is set) Fri Dec 8 19:04:21 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:21 2017 - [info] GTID ON Fri Dec 8 19:04:21 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:21 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:21 2017 - [info] Not candidate for the new Master (no_master is set) Fri Dec 8 19:04:22 2017 - [info] Starting GTID based failover. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] ** Phase 1: Configuration Check Phase completed. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] * Phase 2: Dead Master Shutdown Phase.. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] Forcing shutdown so that applications never connect to the current master.. 
Fri Dec 8 19:04:22 2017 - [info] Executing master IP deactivation script: Fri Dec 8 19:04:22 2017 - [info] /usr/local/bin/master_ip_failover --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --command=stopssh --ssh_user=root Fri Dec 8 19:04:22 2017 - [debug] Stopping IO thread on 172.16.12.13(172.16.12.13:3306).. Fri Dec 8 19:04:22 2017 - [debug] Stopping IO thread on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:04:22 2017 - [debug] Stop IO thread on 172.16.12.13(172.16.12.13:3306) done. Fri Dec 8 19:04:22 2017 - [debug] Stop IO thread on 172.16.12.12(172.16.12.12:3306) done. IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24=== Disabling the VIP on old master: 172.16.12.11 SIOCSIFFLAGS: 无法指定被请求的地址 Fri Dec 8 19:04:22 2017 - [info] done. Fri Dec 8 19:04:22 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master. Fri Dec 8 19:04:22 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] * Phase 3: Master Recovery Phase.. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase.. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [debug] Fetching current slave status.. Fri Dec 8 19:04:22 2017 - [debug] Fetching current slave status done. 
Fri Dec 8 19:04:22 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000007:4095683 Fri Dec 8 19:04:22 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:16087-21782 Fri Dec 8 19:04:22 2017 - [info] Latest slaves (Slaves that received relay log files to the latest): Fri Dec 8 19:04:22 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:22 2017 - [info] GTID ON Fri Dec 8 19:04:22 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:22 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:22 2017 - [info] Not candidate for the new Master (no_master is set) Fri Dec 8 19:04:22 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000007:1795676 Fri Dec 8 19:04:22 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:16087-18583 Fri Dec 8 19:04:22 2017 - [info] Oldest slaves: Fri Dec 8 19:04:22 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:22 2017 - [info] GTID ON Fri Dec 8 19:04:22 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:22 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:22 2017 - [info] Primary candidate for the new Master (candidate_master is set) Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] * Phase 3.3: Determining New Master Phase.. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [debug] Checking replication delay on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:04:22 2017 - [debug] ok. Fri Dec 8 19:04:22 2017 - [info] Searching new master from slaves.. 
Fri Dec 8 19:04:22 2017 - [info] Candidate masters from the configuration file: Fri Dec 8 19:04:22 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:22 2017 - [info] GTID ON Fri Dec 8 19:04:22 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:22 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:22 2017 - [info] Primary candidate for the new Master (candidate_master is set) Fri Dec 8 19:04:22 2017 - [info] Non-candidate masters: Fri Dec 8 19:04:22 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Fri Dec 8 19:04:22 2017 - [info] GTID ON Fri Dec 8 19:04:22 2017 - [debug] Relay log info repository: TABLE Fri Dec 8 19:04:22 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Fri Dec 8 19:04:22 2017 - [info] Not candidate for the new Master (no_master is set) Fri Dec 8 19:04:22 2017 - [info] Searching from candidate_master slaves which have received the latest relay log events.. Fri Dec 8 19:04:22 2017 - [info] Not found. Fri Dec 8 19:04:22 2017 - [info] Searching from all candidate_master slaves.. Fri Dec 8 19:04:22 2017 - [info] New master is 172.16.12.12(172.16.12.12:3306) Fri Dec 8 19:04:22 2017 - [info] Starting master failover.. Fri Dec 8 19:04:22 2017 - [info] From: 172.16.12.11(172.16.12.11:3306) (current master) +--172.16.12.12(172.16.12.12:3306) +--172.16.12.13(172.16.12.13:3306) To: 172.16.12.12(172.16.12.12:3306) (new master) +--172.16.12.13(172.16.12.13:3306) Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] * Phase 3.3: New Master Recovery Phase.. Fri Dec 8 19:04:22 2017 - [info] Fri Dec 8 19:04:22 2017 - [info] Waiting all logs to be applied.. Fri Dec 8 19:04:22 2017 - [info] done. Fri Dec 8 19:04:22 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:05:24 2017 - [debug] done. 
Fri Dec 8 19:05:24 2017 - [info] Replicating from the latest slave 172.16.12.13(172.16.12.13:3306) and waiting to apply.. Fri Dec 8 19:05:24 2017 - [info] Waiting all logs to be applied on the latest slave.. Fri Dec 8 19:05:24 2017 - [info] Resetting slave 172.16.12.12(172.16.12.12:3306) and starting replication from the new master 172.16.12.13(172.16.12.13:3306).. Fri Dec 8 19:05:24 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:05:24 2017 - [debug] done. Fri Dec 8 19:05:24 2017 - [info] Executed CHANGE MASTER. Fri Dec 8 19:05:24 2017 - [debug] Starting slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:05:24 2017 - [debug] done. Fri Dec 8 19:05:24 2017 - [info] Slave started. Fri Dec 8 19:05:24 2017 - [info] Waiting to execute all relay logs on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:06:12 2017 - [info] master_pos_wait(mysql-bin.000003:15419438) completed on 172.16.12.12(172.16.12.12:3306). Executed 156 events. Fri Dec 8 19:06:12 2017 - [info] done. Fri Dec 8 19:06:12 2017 - [debug] Stopping SQL thread on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:06:12 2017 - [debug] done. Fri Dec 8 19:06:12 2017 - [info] done. Fri Dec 8 19:06:12 2017 - [info] -- Saving binlog from host 172.16.12.11 started, pid: 2735 Fri Dec 8 19:06:18 2017 - [info] Fri Dec 8 19:06:18 2017 - [info] Log messages from 172.16.12.11 ... Fri Dec 8 19:06:18 2017 - [info] Fri Dec 8 19:06:12 2017 - [info] Fetching binary logs from binlog server 172.16.12.11.. Fri Dec 8 19:06:12 2017 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000007 --start_pos=4095618 --output_file=/tmp/saved_binlog_binlog1_20171208190420.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.57 --oldest_version=5.7.9-log --debug --binlog_dir=/home/data/mysql57/log Creating /tmp if not exists.. ok. 
Concat binary/relay logs from mysql-bin.000007 pos 4095618 to mysql-bin.000007 EOF into /tmp/saved_binlog_binlog1_20171208190420.binlog .. Executing command: mysqlbinlog --start-position=4095618 /home/data/mysql57/log/mysql-bin.000007 >> /tmp/saved_binlog_binlog1_20171208190420.binlog Concat succeeded. Fri Dec 8 19:06:18 2017 - [info] scp from root@172.16.12.11:/tmp/saved_binlog_binlog1_20171208190420.binlog to local:/var/log/masterha/app1/saved_binlog_172.16.12.11_binlog1_20171208190420.binlog succeeded. Fri Dec 8 19:06:18 2017 - [info] End of log messages from 172.16.12.11. Fri Dec 8 19:06:18 2017 - [info] Saved mysqlbinlog size from 172.16.12.11 is 61894698 bytes. Fri Dec 8 19:06:18 2017 - [info] Applying differential binlog /var/log/masterha/app1/saved_binlog_172.16.12.11_binlog1_20171208190420.binlog .. Fri Dec 8 19:10:15 2017 - [info] Differential log apply from binlog server succeeded. Fri Dec 8 19:10:15 2017 - [info] Getting new master's binlog name and position.. Fri Dec 8 19:10:15 2017 - [info] mysql-bin.000006:61413022 Fri Dec 8 19:10:15 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.12.12', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx'; Fri Dec 8 19:10:15 2017 - [info] Master Recovery succeeded. 
File:Pos:Exec_Gtid_Set: mysql-bin.000006, 61413022, 865e07c9-bae8-11e7-8aba-08002729e4f7:1-74770 Fri Dec 8 19:10:15 2017 - [info] Executing master IP activate script: Fri Dec 8 19:10:15 2017 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --new_master_host=172.16.12.12 --new_master_ip=172.16.12.12 --new_master_port=3306 --new_master_user='root' --new_master_password=xxx Unknown option: new_master_user Unknown option: new_master_password IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24=== Enabling the VIP - 172.16.12.100/24 on the new master - 172.16.12.12 Fri Dec 8 19:10:19 2017 - [info] OK. Fri Dec 8 19:10:19 2017 - [info] ** Finished master recovery successfully. Fri Dec 8 19:10:19 2017 - [info] * Phase 3: Master Recovery Phase completed. Fri Dec 8 19:10:19 2017 - [info] Fri Dec 8 19:10:19 2017 - [info] * Phase 4: Slaves Recovery Phase.. Fri Dec 8 19:10:19 2017 - [info] Fri Dec 8 19:10:19 2017 - [info] Fri Dec 8 19:10:19 2017 - [info] * Phase 4.1: Starting Slaves in parallel.. Fri Dec 8 19:10:19 2017 - [info] Fri Dec 8 19:10:19 2017 - [info] -- Slave recovery on host 172.16.12.13(172.16.12.13:3306) started, pid: 2751. Check tmp log /var/log/masterha/app1/172.16.12.13_3306_20171208190420.log if it takes time.. Fri Dec 8 19:25:16 2017 - [info] Fri Dec 8 19:25:16 2017 - [info] Log messages from 172.16.12.13 ... Fri Dec 8 19:25:16 2017 - [info] Fri Dec 8 19:10:19 2017 - [info] Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:10:19 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306).. Fri Dec 8 19:11:21 2017 - [debug] done. Fri Dec 8 19:11:21 2017 - [info] Executed CHANGE MASTER. Fri Dec 8 19:11:21 2017 - [debug] Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306).. 
Fri Dec 8 19:11:22 2017 - [debug] done. Fri Dec 8 19:11:22 2017 - [info] Slave started. Fri Dec 8 19:25:16 2017 - [info] gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-74770) completed on 172.16.12.13(172.16.12.13:3306). Executed 2677 events. Fri Dec 8 19:25:16 2017 - [info] End of log messages from 172.16.12.13. Fri Dec 8 19:25:16 2017 - [info] -- Slave on host 172.16.12.13(172.16.12.13:3306) started. Fri Dec 8 19:25:16 2017 - [info] All new slave servers recovered successfully. Fri Dec 8 19:25:16 2017 - [info] Fri Dec 8 19:25:16 2017 - [info] * Phase 5: New master cleanup phase.. Fri Dec 8 19:25:16 2017 - [info] Fri Dec 8 19:25:16 2017 - [info] Resetting slave info on the new master.. Fri Dec 8 19:25:16 2017 - [debug] Clearing slave info.. Fri Dec 8 19:25:16 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306).. Fri Dec 8 19:25:16 2017 - [debug] done. Fri Dec 8 19:25:16 2017 - [debug] SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK. Fri Dec 8 19:25:16 2017 - [info] 172.16.12.12: Resetting slave info succeeded. Fri Dec 8 19:25:16 2017 - [info] Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully. Fri Dec 8 19:25:16 2017 - [debug] Disconnected from 172.16.12.12(172.16.12.12:3306) Fri Dec 8 19:25:16 2017 - [debug] Disconnected from 172.16.12.13(172.16.12.13:3306) Fri Dec 8 19:25:16 2017 - [info] ----- Failover Report ----- app1: MySQL Master failover 172.16.12.11(172.16.12.11:3306) to 172.16.12.12(172.16.12.12:3306) succeeded Master 172.16.12.11(172.16.12.11:3306) is down! Check MHA Manager logs at db10:/var/log/masterha/app1/manager.log for details. Started automated(non-interactive) failover. Invalidated master IP address on 172.16.12.11(172.16.12.11:3306) Selected 172.16.12.12(172.16.12.12:3306) as a new master. 172.16.12.12(172.16.12.12:3306): OK: Applying all logs succeeded. 172.16.12.12(172.16.12.12:3306): OK: Activated master IP address. 
172.16.12.13(172.16.12.13:3306): OK: Slave started, replicating from 172.16.12.12(172.16.12.12:3306) 172.16.12.12(172.16.12.12:3306): Resetting slave info succeeded. Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
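To skim a failover run like the one above without reading every debug line, something along these lines may help. It is a minimal sketch: the function name `mha_phases` is made up for illustration, and it assumes the manager log on disk has one entry per line.

```shell
#!/bin/sh
# List the phase markers, the chosen master, and the final result lines
# from an MHA manager log. Assumes one log entry per line.
# Usage: mha_phases /var/log/masterha/app1/manager.log
mha_phases() {
    grep -E '\* Phase [0-9.]+:|New master is|completed successfully' "$1"
}
```

Comparing the timestamps of consecutive phase lines shows where the time went (here, most of it is in applying the differential binlog and in the slave catching up).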
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
[root@localhost tools]# rpm -ivh mha4mysql-manager-0.57-0.el7.noarch.rpm
error: Failed dependencies:
perl(Log::Dispatch) is needed by mha4mysql-manager-0.57-0.el7.noarch
perl(Log::Dispatch::File) is needed by mha4mysql-manager-0.57-0.el7.noarch
perl(Log::Dispatch::Screen) is needed by mha4mysql-manager-0.57-0.el7.noarch
perl(Parallel::ForkManager) is needed by mha4mysql-manager-0.57-0.el7.noarch
Some of these packages are not available on CentOS 6.8.
wget http://mirror.centos.org/centos/6/extras/x86_64/Packages/epel-release-6-8.noarch.rpm
rpm -ivh epel-release-6-8.noarch.rpm
################################
Clone the VirtualBox VM
Click to change the MAC address
vi /etc/udev/rules.d/70-persistent-net.rules
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Change the hostname:
vim /etc/hosts
vim /etc/sysconfig/network
###########################################
yum troubleshooting:
1. Check the network: ping www.baidu.com
2. Check the gateway: route -n
3. Check the firewall and SELinux
4. Check /etc/resolv.conf and /etc/hosts
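The checks above can be roughly scripted as follows. This is only a sketch: the function names are invented here, the firewall check is left out (run it by hand, e.g. `service iptables status` on CentOS 6), and the ping target is just the one used in these notes.

```shell
#!/bin/sh
# Count "nameserver" entries in a resolver file (default: /etc/resolv.conf).
# Commented-out entries are not counted.
count_nameservers() {
    grep -c '^[[:space:]]*nameserver[[:space:]]' "${1:-/etc/resolv.conf}"
}

# Run the network, gateway, SELinux and DNS checks in order and print findings.
# Nothing is changed on the system.
yum_net_check() {
    if ping -c 1 -W 2 www.baidu.com >/dev/null 2>&1; then
        echo "network: ok"
    else
        echo "network: FAILED"
    fi
    if route -n | grep -q '^0\.0\.0\.0'; then
        echo "gateway: ok"
    else
        echo "gateway: no default route"
    fi
    if command -v getenforce >/dev/null 2>&1; then
        echo "selinux: $(getenforce)"
    fi
    echo "nameservers: $(count_nameservers)"
}
```

Call `yum_net_check` on the node where yum fails; a nameserver count of 0 points at step 4.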
###########################################
Set up passwordless SSH between all nodes
[root@db13 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@db10
-bash: ssh-copy-id: command not found
yum -y install openssh-clients
ssh-copy-id -i ~/.ssh/id_rsa.pub root@IP
Check: cat ~/.ssh/authorized_keys
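Since the key has to be pushed from every node to every other node, a small loop saves typing. A sketch only: the host list is an assumption taken from the IPs that appear in the logs, and `push_ssh_key` / `DRY_RUN` are names made up here. By default it only prints the commands.

```shell
#!/bin/sh
# Hosts are an assumption (IPs seen in the manager logs); override MHA_HOSTS as needed.
MHA_HOSTS="${MHA_HOSTS:-172.16.12.10 172.16.12.11 172.16.12.12 172.16.12.13}"

# Print (DRY_RUN=1, the default) or run ssh-copy-id for every host in MHA_HOSTS.
push_ssh_key() {
    for host in $MHA_HOSTS; do
        cmd="ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@$host"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}
```

Run `DRY_RUN=0 push_ssh_key` on each node in turn. The list deliberately includes the local host as well, so the node's own key ends up in its own authorized_keys.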
###########################################
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server UUIDs; these UUIDs must be different for replication to work.
Cause: the host was cloned.
Delete auto.cnf and restart MySQL; a new file with a new UUID is generated.
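To confirm the clone problem before deleting anything, the UUIDs in two copies of auto.cnf can be compared. A minimal sketch with made-up function names; it assumes the usual auto.cnf layout (a `server-uuid=...` line under `[auto]`) and that the master's file has been copied over for comparison.

```shell
#!/bin/sh
# Print the server-uuid stored in a MySQL auto.cnf file.
read_server_uuid() {
    sed -n 's/^server-uuid[[:space:]]*=[[:space:]]*//p' "$1"
}

# Return 0 when two auto.cnf files carry the same UUID (the cloned-VM symptom).
same_server_uuid() {
    [ "$(read_server_uuid "$1")" = "$(read_server_uuid "$2")" ]
}
```

For example, `same_server_uuid master_auto.cnf slave_auto.cnf && echo "duplicate UUID"` (both file names are placeholders).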
Last_Error: Slave failed to initialize relay log info structure from the repository
Fix:
stop slave; reset slave; start slave; show slave status; # check the latest status; replication is back to normal
#############################################
masterha_check_ssh --conf=/etc/mha/app1.cnf fails with:
[error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from
On the manager, its own public key must also be added to authorized_keys.
##############################################
masterha_check_repl --conf=/etc/mha/app1.cnf fails; fix it by granting the missing privileges.
####################################################
masterha_check_ssh --conf=/etc/mha/app1.cnf
masterha_check_repl --conf=/etc/mha/app1.cnf
Start:
nohup /usr/bin/masterha_manager --conf=/etc/mha/app1.cnf > /var/log/masterha/app1/manager.log 2>&1 &
masterha_check_status --conf=/etc/mha/app1.cnf
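For reference, an /etc/mha/app1.cnf consistent with what the logs in these notes show might look roughly like the file written below. This is a reconstruction, not the actual config: the section names server1..server3 are assumptions, passwords are placeholders, and the values are inferred from paths and options that appear in the manager log. The script writes to a scratch path, not to /etc/mha.

```shell
#!/bin/sh
# Write an example MHA config to a scratch path (NOT to /etc/mha/app1.cnf).
# All values are inferred from the manager logs; passwords are placeholders.
out="${TMPDIR:-/tmp}/app1.cnf.example"
cat > "$out" <<'EOF'
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
remote_workdir=/tmp
log_level=debug
ssh_user=root
user=root
password=xxx
repl_user=repl
repl_password=xxx
ping_type=CONNECT
ping_interval=3
master_binlog_dir=/home/data/mysql57/log
master_ip_failover_script=/usr/local/bin/master_ip_failover
secondary_check_script=/usr/bin/masterha_secondary_check -s 172.16.12.10 -s 172.16.12.13

[server1]
hostname=172.16.12.11

[server2]
hostname=172.16.12.12
candidate_master=1

[server3]
hostname=172.16.12.13
no_master=1

[binlog1]
hostname=172.16.12.11
EOF
echo "wrote $out"
```

candidate_master=1 on 172.16.12.12 and no_master=1 on 172.16.12.13 match the "Primary candidate" / "Not candidate" lines in the log; the [binlog1] entry is what makes the manager fetch the dead master's binlog over SSH.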
###################################################
After the master was taken down, the slave hit error 1236 when connecting to the new master.
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
mysql> stop slave;
Query OK, 0 rows affected (0.06 sec)
mysql> reset slave;
Query OK, 0 rows affected (0.06 sec)
mysql> reset master;
Query OK, 0 rows affected (0.07 sec)
mysql> set global gtid_purged='865e07c9-bae8-11e7-8aba-08002729e4f7:128730';
Query OK, 0 rows affected (0.00 sec)
mysql> change master to master_host='172.xxxx',master_port=3306,master_user='root',master_password='root123',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
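Error 1236 is still there. Restarting the slave then produces error 1872.
If it still fails after CHANGE MASTER, check whether the GTIDs are aligned with the master: show master status;
Check select * from mysql.slave_relay_log_info; and check the relay log.
Run CHANGE MASTER again. Error 1236 generally means the master and slave GTIDs are not aligned.
Slaves must have read_only=on; otherwise the GTIDs may not be aligned at failover time, which leads to error 1236.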
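For a quick look at how far apart two GTID sets are, the last transaction numbers can be compared in the shell. A rough sketch with made-up function names; it only handles sets with a single UUID and a single interval, like the ones in these logs. For a real subset check, MySQL's own GTID_SUBSET() and GTID_SUBTRACT() functions are the better tool.

```shell
#!/bin/sh
# Print the last transaction number of a single-UUID, single-interval GTID set,
# e.g. "uuid:16087-21782" -> 21782 and "uuid:128730" -> 128730.
gtid_last_txn() {
    echo "${1##*[-:]}"
}

# Print how many transactions the second set trails the first one by.
gtid_gap() {
    echo $(( $(gtid_last_txn "$1") - $(gtid_last_txn "$2") ))
}
```

With the two Retrieved Gtid Set values from the Dec 8 log (…:16087-21782 on the latest slave, …:16087-18583 on the oldest), `gtid_gap` gives 3199, i.e. the candidate master was 3199 transactions behind at failover time.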
##############################
sysbench /usr/share/sysbench/tests/include/oltp_legacy/insert.lua --oltp-tables-count=4 --oltp-table-size=10000000 --oltp-dist-res=95 --mysql-user=root --mysql-password=root123 --mysql-db=sbtest --db-driver=mysql --mysql-socket=/home/data/mysql57/run/mysql.sock --num-threads=4 --max-requests=0 --max-time=300 --report-interval=3 run
Shut down the master database; MHA fails over automatically. The log is as follows:
################log1
Thu Nov 16 17:06:58 2017 - [debug] Set short wait_timeout on master: 3 seconds Thu Nov 16 17:06:58 2017 - [debug] Trying to get advisory lock.. Thu Nov 16 17:07:01 2017 - [debug] Connected on master. Thu Nov 16 17:07:01 2017 - [debug] Set short wait_timeout on master: 3 seconds Thu Nov 16 17:07:01 2017 - [debug] Trying to get advisory lock.. Thu Nov 16 17:07:04 2017 - [warning] Got error on MySQL connect ping: DBI connect(';host=172.16.12.11;port=3306;mysql_connect_timeout=1','root',...) failed: Lost connection to MySQL server at 'reading initial communication packet', system error: 111 at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 97 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Thu Nov 16 17:07:04 2017 - [info] Executing SSH check script: exit 0 Thu Nov 16 17:07:04 2017 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 172.16.12.11 -s 172.16.12.12 --user=root --master_host=172.16.12.10 --master_port=3306 --user=root --master_host=172.16.12.11 --master_ip=172.16.12.11 --master_port=3306 --master_user=root --master_password=root123 --ping_type=CONNECT Thu Nov 16 17:07:04 2017 - [debug] SSH connection test to 172.16.12.11, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5 Thu Nov 16 17:07:05 2017 - [info] HealthCheck: SSH to 172.16.12.11 is reachable. Monitoring server 172.16.12.11 is reachable, Master is not reachable from 172.16.12.11. OK. 
Thu Nov 16 17:07:07 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Thu Nov 16 17:07:07 2017 - [warning] Connection failed 2 time(s).. Monitoring server 172.16.12.12 is reachable, Master is not reachable from 172.16.12.12. OK. Thu Nov 16 17:07:07 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start. Thu Nov 16 17:07:10 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Thu Nov 16 17:07:10 2017 - [warning] Connection failed 3 time(s).. Thu Nov 16 17:07:13 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Thu Nov 16 17:07:13 2017 - [warning] Connection failed 4 time(s).. Thu Nov 16 17:07:13 2017 - [warning] Master is not reachable from health checker! Thu Nov 16 17:07:13 2017 - [warning] Master 172.16.12.11(172.16.12.11:3306) is not reachable! Thu Nov 16 17:07:13 2017 - [warning] SSH is reachable. Thu Nov 16 17:07:13 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status.. Thu Nov 16 17:07:13 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Nov 16 17:07:13 2017 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Thu Nov 16 17:07:13 2017 - [info] Reading server configuration from /etc/mha/app1.cnf.. Thu Nov 16 17:07:13 2017 - [debug] Skipping connecting to dead master 172.16.12.11(172.16.12.11:3306). Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers.. 
Thu Nov 16 17:07:13 2017 - [debug] Connected to: 172.16.12.12(172.16.12.12:3306), user=root Thu Nov 16 17:07:13 2017 - [debug] Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2 Thu Nov 16 17:07:13 2017 - [debug] Connected to: 172.16.12.13(172.16.12.13:3306), user=root Thu Nov 16 17:07:13 2017 - [debug] Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2 Thu Nov 16 17:07:13 2017 - [debug] Comparing MySQL versions.. Thu Nov 16 17:07:13 2017 - [debug] Comparing MySQL versions done. Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers done. Thu Nov 16 17:07:13 2017 - [info] GTID failover mode = 1 Thu Nov 16 17:07:13 2017 - [info] Dead Servers: Thu Nov 16 17:07:13 2017 - [info] 172.16.12.11(172.16.12.11:3306) Thu Nov 16 17:07:13 2017 - [info] Alive Servers: Thu Nov 16 17:07:13 2017 - [info] 172.16.12.12(172.16.12.12:3306) Thu Nov 16 17:07:13 2017 - [info] 172.16.12.13(172.16.12.13:3306) Thu Nov 16 17:07:13 2017 - [info] Alive Slaves: Thu Nov 16 17:07:13 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Thu Nov 16 17:07:13 2017 - [info] GTID ON Thu Nov 16 17:07:13 2017 - [debug] Relay log info repository: TABLE Thu Nov 16 17:07:13 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Thu Nov 16 17:07:13 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Nov 16 17:07:13 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Thu Nov 16 17:07:13 2017 - [info] GTID ON Thu Nov 16 17:07:13 2017 - [debug] Relay log info repository: TABLE Thu Nov 16 17:07:13 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Thu Nov 16 17:07:13 2017 - [info] Not candidate for the new Master (no_master is set) Thu Nov 16 17:07:13 2017 - [info] Checking slave configurations.. 
Thu Nov 16 17:07:13 2017 - [info] read_only=1 is not set on slave 172.16.12.12(172.16.12.12:3306). Thu Nov 16 17:07:13 2017 - [info] Checking replication filtering settings.. Thu Nov 16 17:07:13 2017 - [info] Replication filtering check ok. Thu Nov 16 17:07:13 2017 - [info] Master is down! Thu Nov 16 17:07:13 2017 - [info] Terminating monitoring script. Thu Nov 16 17:07:13 2017 - [debug] Disconnected from 172.16.12.12(172.16.12.12:3306) Thu Nov 16 17:07:13 2017 - [debug] Disconnected from 172.16.12.13(172.16.12.13:3306) Thu Nov 16 17:07:13 2017 - [info] Got exit code 20 (Master dead). Thu Nov 16 17:07:13 2017 - [info] MHA::MasterFailover version 0.57. Thu Nov 16 17:07:13 2017 - [info] Starting master failover. Thu Nov 16 17:07:13 2017 - [info] Thu Nov 16 17:07:13 2017 - [info] * Phase 1: Configuration Check Phase.. Thu Nov 16 17:07:13 2017 - [info] Thu Nov 16 17:07:13 2017 - [debug] Skipping connecting to dead master 172.16.12.11. Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers.. Thu Nov 16 17:07:13 2017 - [debug] Connected to: 172.16.12.12(172.16.12.12:3306), user=root Thu Nov 16 17:07:13 2017 - [debug] Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2 Thu Nov 16 17:07:13 2017 - [debug] Connected to: 172.16.12.13(172.16.12.13:3306), user=root Thu Nov 16 17:07:13 2017 - [debug] Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2 Thu Nov 16 17:07:13 2017 - [debug] Comparing MySQL versions.. Thu Nov 16 17:07:13 2017 - [debug] Comparing MySQL versions done. Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers done. Thu Nov 16 17:07:13 2017 - [info] GTID failover mode = 1 Thu Nov 16 17:07:13 2017 - [info] Dead Servers: Thu Nov 16 17:07:13 2017 - [info] 172.16.12.11(172.16.12.11:3306) Thu Nov 16 17:07:13 2017 - [info] Checking master reachability via MySQL(double check)... Thu Nov 16 17:07:13 2017 - [info] ok. 
Thu Nov 16 17:07:13 2017 - [info] Alive Servers: Thu Nov 16 17:07:13 2017 - [info] 172.16.12.12(172.16.12.12:3306) Thu Nov 16 17:07:13 2017 - [info] 172.16.12.13(172.16.12.13:3306) Thu Nov 16 17:07:13 2017 - [info] Alive Slaves: Thu Nov 16 17:07:13 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Thu Nov 16 17:07:13 2017 - [info] GTID ON Thu Nov 16 17:07:13 2017 - [debug] Relay log info repository: TABLE Thu Nov 16 17:07:13 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Thu Nov 16 17:07:13 2017 - [info] Primary candidate for the new Master (candidate_master is set) Thu Nov 16 17:07:13 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled Thu Nov 16 17:07:13 2017 - [info] GTID ON Thu Nov 16 17:07:13 2017 - [debug] Relay log info repository: TABLE Thu Nov 16 17:07:13 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306) Thu Nov 16 17:07:13 2017 - [info] Not candidate for the new Master (no_master is set) Thu Nov 16 17:07:13 2017 - [info] Starting GTID based failover. Thu Nov 16 17:07:13 2017 - [info] Thu Nov 16 17:07:13 2017 - [info] ** Phase 1: Configuration Check Phase completed. Thu Nov 16 17:07:13 2017 - [info] Thu Nov 16 17:07:13 2017 - [info] * Phase 2: Dead Master Shutdown Phase.. Thu Nov 16 17:07:13 2017 - [info] Thu Nov 16 17:07:13 2017 - [info] Forcing shutdown so that applications never connect to the current master.. Thu Nov 16 17:07:13 2017 - [info] Executing master IP deactivation script: Thu Nov 16 17:07:13 2017 - [info] /usr/local/bin/master_ip_failover --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --command=stopssh --ssh_user=root Thu Nov 16 17:07:13 2017 - [debug] Stopping IO thread on 172.16.12.12(172.16.12.12:3306).. Thu Nov 16 17:07:13 2017 - [debug] Stopping IO thread on 172.16.12.13(172.16.12.13:3306).. 
IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===
Disabling the VIP on old master: 172.16.12.11
Thu Nov 16 17:07:14 2017 - [debug] Stop IO thread on 172.16.12.13(172.16.12.13:3306) done.
Thu Nov 16 17:07:14 2017 - [debug] Stop IO thread on 172.16.12.12(172.16.12.12:3306) done.
SIOCSIFFLAGS: Cannot assign requested address
Thu Nov 16 17:07:14 2017 - [info] done.
Thu Nov 16 17:07:14 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Nov 16 17:07:14 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [info] * Phase 3: Master Recovery Phase..
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [debug] Fetching current slave status..
Thu Nov 16 17:07:14 2017 - [debug] Fetching current slave status done.
Thu Nov 16 17:07:14 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000022:4912402
Thu Nov 16 17:07:14 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:129029-135860
Thu Nov 16 17:07:14 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Nov 16 17:07:14 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info] GTID ON
Thu Nov 16 17:07:14 2017 - [debug] Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:14 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info] GTID ON
Thu Nov 16 17:07:14 2017 - [debug] Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info] Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:14 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000022:4912402
Thu Nov 16 17:07:14 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:129029-135860
Thu Nov 16 17:07:14 2017 - [info] Oldest slaves:
Thu Nov 16 17:07:14 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info] GTID ON
Thu Nov 16 17:07:14 2017 - [debug] Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:14 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info] GTID ON
Thu Nov 16 17:07:14 2017 - [debug] Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info] Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [info] * Phase 3.3: Determining New Master Phase..
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [info] Searching new master from slaves..
Thu Nov 16 17:07:14 2017 - [info] Candidate masters from the configuration file:
Thu Nov 16 17:07:14 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info] GTID ON
Thu Nov 16 17:07:14 2017 - [debug] Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:14 2017 - [info] Non-candidate masters:
Thu Nov 16 17:07:14 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info] GTID ON
Thu Nov 16 17:07:14 2017 - [debug] Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info] Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:14 2017 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Thu Nov 16 17:07:14 2017 - [info] New master is 172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:14 2017 - [info] Starting master failover..
Thu Nov 16 17:07:14 2017 - [info]
From:
172.16.12.11(172.16.12.11:3306) (current master)
+--172.16.12.12(172.16.12.12:3306)
+--172.16.12.13(172.16.12.13:3306)
To:
172.16.12.12(172.16.12.12:3306) (new master)
+--172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Thu Nov 16 17:07:14 2017 - [info]
Thu Nov 16 17:07:14 2017 - [info] Waiting all logs to be applied..
Thu Nov 16 17:07:14 2017 - [info] done.
Thu Nov 16 17:07:14 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:14 2017 - [debug] done.
Thu Nov 16 17:07:14 2017 - [info] Getting new master's binlog name and position..
Thu Nov 16 17:07:14 2017 - [info] mysql-bin.000001:4837210
Thu Nov 16 17:07:14 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.12.12', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Nov 16 17:07:14 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 4837210, 865e07c9-bae8-11e7-8aba-08002729e4f7:1-135860
Thu Nov 16 17:07:14 2017 - [info] Executing master IP activate script:
Thu Nov 16 17:07:14 2017 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --new_master_host=172.16.12.12 --new_master_ip=172.16.12.12 --new_master_port=3306 --new_master_user='root' --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password
IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===
Enabling the VIP - 172.16.12.100/24 on the new master - 172.16.12.12
Thu Nov 16 17:07:15 2017 - [info] OK.
Thu Nov 16 17:07:15 2017 - [info] ** Finished master recovery successfully.
Thu Nov 16 17:07:15 2017 - [info] * Phase 3: Master Recovery Phase completed.
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] * Phase 4: Slaves Recovery Phase..
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] -- Slave recovery on host 172.16.12.13(172.16.12.13:3306) started, pid: 6148. Check tmp log /var/log/masterha/app1/172.16.12.13_3306_20171116170713.log if it takes time..
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] Log messages from 172.16.12.13 ...
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:15 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Thu Nov 16 17:07:15 2017 - [debug] done.
Thu Nov 16 17:07:15 2017 - [info] Executed CHANGE MASTER.
Thu Nov 16 17:07:15 2017 - [debug] Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Thu Nov 16 17:07:15 2017 - [debug] done.
Thu Nov 16 17:07:15 2017 - [info] Slave started.
Thu Nov 16 17:07:15 2017 - [info] gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-135860) completed on 172.16.12.13(172.16.12.13:3306). Executed 0 events.
Thu Nov 16 17:07:15 2017 - [info] End of log messages from 172.16.12.13.
Thu Nov 16 17:07:15 2017 - [info] -- Slave on host 172.16.12.13(172.16.12.13:3306) started.
Thu Nov 16 17:07:15 2017 - [info] All new slave servers recovered successfully.
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] * Phase 5: New master cleanup phase..
Thu Nov 16 17:07:15 2017 - [info]
Thu Nov 16 17:07:15 2017 - [info] Resetting slave info on the new master..
Thu Nov 16 17:07:15 2017 - [debug] Clearing slave info..
Thu Nov 16 17:07:15 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:15 2017 - [debug] done.
Thu Nov 16 17:07:15 2017 - [debug] SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
Thu Nov 16 17:07:15 2017 - [info] 172.16.12.12: Resetting slave info succeeded.
Thu Nov 16 17:07:15 2017 - [info] Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
Thu Nov 16 17:07:15 2017 - [debug] Disconnected from 172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:15 2017 - [debug] Disconnected from 172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:16 2017 - [info]
----- Failover Report -----
app1: MySQL Master failover 172.16.12.11(172.16.12.11:3306) to 172.16.12.12(172.16.12.12:3306) succeeded
Master 172.16.12.11(172.16.12.11:3306) is down!
Check MHA Manager logs at db10:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.12.11(172.16.12.11:3306)
Selected 172.16.12.12(172.16.12.12:3306) as a new master.
172.16.12.12(172.16.12.12:3306): OK: Applying all logs succeeded.
172.16.12.12(172.16.12.12:3306): OK: Activated master IP address.
172.16.12.13(172.16.12.13:3306): OK: Slave started, replicating from 172.16.12.12(172.16.12.12:3306)
172.16.12.12(172.16.12.12:3306): Resetting slave info succeeded.
Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
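Flow: 1. select the new master; 2. move the VIP; 3. run CHANGE MASTER on the remaining slaves and start replication.
Manual graceful (online) switchover: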
masterha_master_switch --master_state=alive --conf=/etc/mha/app1.cnf --orig_master_is_new_slave
###################
Test: asynchronous (non-semi-sync) replication, master stopped with mysqladmin shutdown
set global rpl_semi_sync_master_enabled=OFF;
set global rpl_semi_sync_slave_enabled =OFF;
show variables like "rpl%";
Mon Nov 20 15:31:03 2017 - [debug] Set short wait_timeout on master: 3 seconds
Mon Nov 20 15:31:03 2017 - [debug] Trying to get advisory lock..
Mon Nov 20 15:31:06 2017 - [warning] Got error on MySQL connect ping: DBI connect(';host=172.16.12.11;port=3306;mysql_connect_timeout=1','root',...) failed: Lost connection to MySQL server at 'reading initial communication packet', system error: 111 at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 97
2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:06 2017 - [info] Executing SSH check script: exit 0
Mon Nov 20 15:31:06 2017 - [debug] SSH connection test to 172.16.12.11, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5
Mon Nov 20 15:31:06 2017 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 172.16.12.10 -s 172.16.12.13 --user=root --master_host=172.16.12.11 --master_port=3306 --user=root --master_host=172.16.12.11 --master_ip=172.16.12.11 --master_port=3306 --master_user=root --master_password=mysqlpass --ping_type=CONNECT
Monitoring server 172.16.12.10 is reachable, Master is not reachable from 172.16.12.10. OK.
Mon Nov 20 15:31:09 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:09 2017 - [warning] Connection failed 2 time(s)..
Mon Nov 20 15:31:11 2017 - [warning] HealthCheck: Got timeout on checking SSH connection to 172.16.12.11! at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 342.
Mon Nov 20 15:31:12 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:12 2017 - [warning] Connection failed 3 time(s)..
Monitoring server 172.16.12.13 is reachable, Master is not reachable from 172.16.12.13. OK.
Mon Nov 20 15:31:14 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Mon Nov 20 15:31:15 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:15 2017 - [warning] Connection failed 4 time(s)..
Mon Nov 20 15:31:15 2017 - [warning] Master is not reachable from health checker!
Mon Nov 20 15:31:15 2017 - [warning] Master 172.16.12.11(172.16.12.11:3306) is not reachable!
Mon Nov 20 15:31:15 2017 - [warning] SSH is NOT reachable.
Mon Nov 20 15:31:15 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
Mon Nov 20 15:31:15 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Nov 20 15:31:15 2017 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Nov 20 15:31:15 2017 - [info] Reading server configuration from /etc/mha/app1.cnf..
Mon Nov 20 15:31:15 2017 - [debug] Skipping connecting to dead master 172.16.12.11(172.16.12.11:3306).
Mon Nov 20 15:31:15 2017 - [debug] Connecting to servers..
Mon Nov 20 15:31:15 2017 - [debug] Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Mon Nov 20 15:31:15 2017 - [debug] Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Mon Nov 20 15:31:15 2017 - [debug] Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Mon Nov 20 15:31:15 2017 - [debug] Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Mon Nov 20 15:31:15 2017 - [debug] Comparing MySQL versions..
Mon Nov 20 15:31:15 2017 - [debug] Comparing MySQL versions done.
Mon Nov 20 15:31:15 2017 - [debug] Connecting to servers done.
Mon Nov 20 15:31:15 2017 - [info] GTID failover mode = 1
Mon Nov 20 15:31:15 2017 - [info] Dead Servers:
Mon Nov 20 15:31:15 2017 - [info] 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:15 2017 - [info] Alive Servers:
Mon Nov 20 15:31:15 2017 - [info] 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:15 2017 - [info] 172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:15 2017 - [info] Alive Slaves:
Mon Nov 20 15:31:15 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:15 2017 - [info] GTID ON
Mon Nov 20 15:31:15 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:15 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:15 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:15 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:15 2017 - [info] GTID ON
Mon Nov 20 15:31:15 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:15 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:15 2017 - [info] Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:15 2017 - [info] Checking slave configurations..
Mon Nov 20 15:31:15 2017 - [info] read_only=1 is not set on slave 172.16.12.12(172.16.12.12:3306).
Mon Nov 20 15:31:15 2017 - [info] read_only=1 is not set on slave 172.16.12.13(172.16.12.13:3306).
Mon Nov 20 15:31:15 2017 - [info] Checking replication filtering settings..
Mon Nov 20 15:31:15 2017 - [info] Replication filtering check ok.
Mon Nov 20 15:31:15 2017 - [info] Master is down!
Mon Nov 20 15:31:15 2017 - [info] Terminating monitoring script.
Mon Nov 20 15:31:15 2017 - [debug] Disconnected from 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:15 2017 - [debug] Disconnected from 172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:15 2017 - [info] Got exit code 20 (Master dead). # master confirmed dead
Mon Nov 20 15:31:15 2017 - [info] MHA::MasterFailover version 0.57.
Mon Nov 20 15:31:15 2017 - [info] Starting master failover.
Mon Nov 20 15:31:15 2017 - [info]
Mon Nov 20 15:31:15 2017 - [info] * Phase 1: Configuration Check Phase.. # configuration check
Mon Nov 20 15:31:15 2017 - [info]
Mon Nov 20 15:31:15 2017 - [debug] Skipping connecting to dead master 172.16.12.11.
Mon Nov 20 15:31:15 2017 - [debug] Connecting to servers..
Mon Nov 20 15:31:15 2017 - [debug] Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Mon Nov 20 15:31:15 2017 - [debug] Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Mon Nov 20 15:31:15 2017 - [debug] Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Mon Nov 20 15:31:16 2017 - [debug] Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Mon Nov 20 15:31:16 2017 - [debug] Comparing MySQL versions..
Mon Nov 20 15:31:16 2017 - [debug] Comparing MySQL versions done.
Mon Nov 20 15:31:16 2017 - [debug] Connecting to servers done.
Mon Nov 20 15:31:16 2017 - [info] GTID failover mode = 1
Mon Nov 20 15:31:16 2017 - [info] Dead Servers:
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Checking master reachability via MySQL(double check)...
Mon Nov 20 15:31:16 2017 - [info] ok.
Mon Nov 20 15:31:16 2017 - [info] Alive Servers:
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:16 2017 - [info] Alive Slaves:
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info] GTID ON
Mon Nov 20 15:31:16 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info] GTID ON
Mon Nov 20 15:31:16 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:16 2017 - [info] Starting GTID based failover.
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] * Phase 2: Dead Master Shutdown Phase.. # dead master shutdown
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Mon Nov 20 15:31:16 2017 - [info] Executing master IP deactivation script:
Mon Nov 20 15:31:16 2017 - [info] /usr/local/bin/master_ip_failover --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --command=stop
Mon Nov 20 15:31:16 2017 - [debug] Stopping IO thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:31:16 2017 - [debug] Stopping IO thread on 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:16 2017 - [debug] Stop IO thread on 172.16.12.12(172.16.12.12:3306) done.
IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===
Disabling the VIP on old master: 172.16.12.11
Mon Nov 20 15:31:16 2017 - [info] done.
Mon Nov 20 15:31:16 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon Nov 20 15:31:16 2017 - [debug] Stop IO thread on 172.16.12.13(172.16.12.13:3306) done.
Mon Nov 20 15:31:16 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] * Phase 3: Master Recovery Phase..
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [debug] Fetching current slave status..
Mon Nov 20 15:31:16 2017 - [debug] Fetching current slave status done.
Mon Nov 20 15:31:16 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000001:4362327
Mon Nov 20 15:31:16 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:138400-143340
Mon Nov 20 15:31:16 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info] GTID ON
Mon Nov 20 15:31:16 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:16 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000001:4081167
Mon Nov 20 15:31:16 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:138400-142948
Mon Nov 20 15:31:16 2017 - [info] Oldest slaves:
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info] GTID ON
Mon Nov 20 15:31:16 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] * Phase 3.3: Determining New Master Phase..
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] Searching new master from slaves..
Mon Nov 20 15:31:16 2017 - [info] Candidate masters from the configuration file:
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.12(172.16.12.12:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info] GTID ON
Mon Nov 20 15:31:16 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:16 2017 - [info] Non-candidate masters:
Mon Nov 20 15:31:16 2017 - [info] 172.16.12.13(172.16.12.13:3306) Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info] GTID ON
Mon Nov 20 15:31:16 2017 - [debug] Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info] Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:16 2017 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Mon Nov 20 15:31:16 2017 - [info] New master is 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:16 2017 - [info] Starting master failover..
Mon Nov 20 15:31:16 2017 - [info]
From:
172.16.12.11(172.16.12.11:3306) (current master)
+--172.16.12.12(172.16.12.12:3306)
+--172.16.12.13(172.16.12.13:3306)
To:
172.16.12.12(172.16.12.12:3306) (new master)
+--172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Nov 20 15:31:16 2017 - [info]
Mon Nov 20 15:31:16 2017 - [info] Waiting all logs to be applied..
Mon Nov 20 15:31:16 2017 - [info] done.
Mon Nov 20 15:31:16 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:16 2017 - [debug] done.
Mon Nov 20 15:31:16 2017 - [info] Getting new master's binlog name and position..
Mon Nov 20 15:31:16 2017 - [info] mysql-bin.000001:4295590
Mon Nov 20 15:31:16 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.12.12', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Mon Nov 20 15:31:16 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 4295590, 865e07c9-bae8-11e7-8aba-08002729e4f7:1-143340
Mon Nov 20 15:31:16 2017 - [info] Executing master IP activate script:
Mon Nov 20 15:31:16 2017 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --new_master_host=172.16.12.12 --new_master_ip=172.16.12.12 --new_master_port=3306 --new_master_user='root' --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password
IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===
Enabling the VIP - 172.16.12.100/24 on the new master - 172.16.12.12
Mon Nov 20 15:31:18 2017 - [info] OK.
Mon Nov 20 15:31:18 2017 - [info] ** Finished master recovery successfully.
Mon Nov 20 15:31:18 2017 - [info] * Phase 3: Master Recovery Phase completed.
Mon Nov 20 15:31:18 2017 - [info]
Mon Nov 20 15:31:18 2017 - [info] * Phase 4: Slaves Recovery Phase..
Mon Nov 20 15:31:18 2017 - [info]
Mon Nov 20 15:31:18 2017 - [info]
Mon Nov 20 15:31:18 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Mon Nov 20 15:31:18 2017 - [info]
Mon Nov 20 15:31:18 2017 - [info] -- Slave recovery on host 172.16.12.13(172.16.12.13:3306) started, pid: 3511. Check tmp log /var/log/masterha/app1/172.16.12.13_3306_20171120153115.log if it takes time..
Mon Nov 20 15:32:27 2017 - [info]
Mon Nov 20 15:32:27 2017 - [info] Log messages from 172.16.12.13 ...
Mon Nov 20 15:32:27 2017 - [info]
Mon Nov 20 15:31:18 2017 - [info] Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:18 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:32:21 2017 - [debug] done.
Mon Nov 20 15:32:21 2017 - [info] Executed CHANGE MASTER.
Mon Nov 20 15:32:21 2017 - [debug] Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:32:21 2017 - [debug] done.
Mon Nov 20 15:32:21 2017 - [info] Slave started.
Mon Nov 20 15:32:27 2017 - [info] gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-143340) completed on 172.16.12.13(172.16.12.13:3306). Executed 25 events.
Mon Nov 20 15:32:27 2017 - [info] End of log messages from 172.16.12.13.
Mon Nov 20 15:32:27 2017 - [info] -- Slave on host 172.16.12.13(172.16.12.13:3306) started.
Mon Nov 20 15:32:27 2017 - [info] All new slave servers recovered successfully.
Mon Nov 20 15:32:27 2017 - [info]
Mon Nov 20 15:32:27 2017 - [info] * Phase 5: New master cleanup phase..
Mon Nov 20 15:32:27 2017 - [info]
Mon Nov 20 15:32:27 2017 - [info] Resetting slave info on the new master..
Mon Nov 20 15:32:27 2017 - [debug] Clearing slave info..
Mon Nov 20 15:32:27 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:32:27 2017 - [debug] done.
Mon Nov 20 15:32:27 2017 - [debug] SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
Mon Nov 20 15:32:27 2017 - [info] 172.16.12.12: Resetting slave info succeeded.
Mon Nov 20 15:32:27 2017 - [info] Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
Mon Nov 20 15:32:27 2017 - [debug] Disconnected from 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:32:27 2017 - [debug] Disconnected from 172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:32:27 2017 - [info]
----- Failover Report -----
app1: MySQL Master failover 172.16.12.11(172.16.12.11:3306) to 172.16.12.12(172.16.12.12:3306) succeeded
Master 172.16.12.11(172.16.12.11:3306) is down!
Check MHA Manager logs at db10:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.12.11(172.16.12.11:3306)
Selected 172.16.12.12(172.16.12.12:3306) as a new master.
172.16.12.12(172.16.12.12:3306): OK: Applying all logs succeeded.
172.16.12.12(172.16.12.12:3306): OK: Activated master IP address.
172.16.12.13(172.16.12.13:3306): OK: Slave started, replicating from 172.16.12.12(172.16.12.12:3306)
172.16.12.12(172.16.12.12:3306): Resetting slave info succeeded.
Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
[root@db10 ~]#
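1. Check the configuration. 2. Select the new master. 3. Move the VIP. 4. Run CHANGE MASTER on the remaining slaves and start replication.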
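The "select the new master" step can be sketched as below. This is a minimal illustration, not MHA's code: the `Slave` class and `pick_new_master` are hypothetical names, and the replication position is reduced to a single transaction number. Only the first search ("candidate_master slaves which have received the latest relay log events") appears in the log above; the order of the fallbacks is my assumption about how MHA continues.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    host: str
    candidate_master: bool = False  # candidate_master=1 in the MHA config
    no_master: bool = False         # no_master=1 in the MHA config
    retrieved_gtid_end: int = 0     # last transaction number received (simplified)


def pick_new_master(slaves):
    """Return the first eligible slave matching the searches, tried in order."""
    eligible = [s for s in slaves if not s.no_master]
    if not eligible:
        raise RuntimeError("no slave is allowed to become master")
    latest = max(s.retrieved_gtid_end for s in slaves)
    searches = [
        # Seen in the log: candidate_master slaves with the latest events.
        lambda s: s.candidate_master and s.retrieved_gtid_end == latest,
        # Assumed fallbacks, in this order.
        lambda s: s.candidate_master,
        lambda s: s.retrieved_gtid_end == latest,
        lambda s: True,
    ]
    for matches in searches:
        for s in eligible:
            if matches(s):
                return s
    raise RuntimeError("unreachable: the last search matches any eligible slave")


# Positions taken from the "Retrieved Gtid Set" lines of the Nov 20 log.
slaves = [
    Slave("172.16.12.12", candidate_master=True, retrieved_gtid_end=143340),
    Slave("172.16.12.13", no_master=True, retrieved_gtid_end=142948),
]
print(pick_new_master(slaves).host)
```

With the two slaves from the log, 172.16.12.13 is excluded by no_master and 172.16.12.12 wins on the first search, which matches the "New master is 172.16.12.12" line.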
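Conclusion: with asynchronous (non-semi-sync) replication, data may be lost. After the crashed master is restarted, its GTID set does not match the new master's.
####################### Test: semi-sync + sysbench load + mysqladmin shutdown of the master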
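A GTID mismatch like this can be checked by comparing the two servers' executed GTID sets. The sketch below is a minimal pure-Python comparison; `parse_gtid_set` and `missing_from` are hypothetical helpers, and the old master's set is an invented value for illustration (the notes do not record it). The new master's set is the Exec_Gtid_Set from the log. On the server itself, MySQL's `GTID_SUBTRACT()` function does the same job.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: [(start, end), ...]}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        intervals = result.setdefault(uuid, [])
        for r in ranges:
            start, _, end = r.partition("-")
            intervals.append((int(start), int(end or start)))
    return result


def missing_from(a, b):
    """Return the transaction numbers present in set a but absent from set b."""
    missing = {}
    for uuid, intervals in a.items():
        have = set()
        for start, end in b.get(uuid, []):
            have.update(range(start, end + 1))
        # Expanding ranges is fine for a test setup; use interval math for huge sets.
        gone = [n for s, e in intervals for n in range(s, e + 1) if n not in have]
        if gone:
            missing[uuid] = gone
    return missing


uuid = "865e07c9-bae8-11e7-8aba-08002729e4f7"
old_master = uuid + ":1-143345"   # hypothetical: what the restarted master reports
new_master = uuid + ":1-143340"   # Exec_Gtid_Set from the failover log
lost = missing_from(parse_gtid_set(old_master), parse_gtid_set(new_master))
print(lost)
```

Any transaction listed in `lost` was committed on the old master but never reached the new one, which is the data-loss case described in the conclusion.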
set global rpl_semi_sync_master_enabled=ON;
set global rpl_semi_sync_slave_enabled =ON;
show variables like "rpl%";
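Conclusion: after the crashed master is restarted, its GTID set is still inconsistent.
Replication had probably fallen back to asynchronous mode.
############################ Test: semi-sync + sysbench load + mysqladmin shutdown of the master + no fallback to asynchronous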
set global rpl_semi_sync_master_timeout=100000000000;
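A note on units: rpl_semi_sync_master_timeout is expressed in milliseconds, and as far as I can tell from the MySQL 5.7 manual its documented maximum is 4294967295. The value used here is larger than that, so the server will not store it as written; whether it is clamped with a warning or rejected depends on the SQL mode, so confirm the effective value with SHOW VARIABLES. The small sketch below only does the arithmetic, and `MAX_TIMEOUT_MS` is my assumption taken from the manual.

```python
from datetime import timedelta

# Assumed documented maximum for rpl_semi_sync_master_timeout (2**32 - 1 ms).
MAX_TIMEOUT_MS = 4294967295

requested_ms = 100000000000                      # value used in the SET GLOBAL above
effective_ms = min(requested_ms, MAX_TIMEOUT_MS)  # what a clamping server would keep

print("requested:", timedelta(milliseconds=requested_ms))
print("effective:", timedelta(milliseconds=effective_ms))
```

Either way the timeout is on the order of weeks, which is long enough for this test: the master keeps waiting for a slave acknowledgement instead of falling back to asynchronous replication.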
Mon Nov 20 17:05:45 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Nov 20 17:05:45 2017 - [info]
Mon Nov 20 17:05:45 2017 - [info] Waiting all logs to be applied..
Mon Nov 20 17:05:45 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:46 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:47 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:48 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:49 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:50 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:51 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:52 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
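At this point the differential relay log is being applied to the new master. The data volume is not that large, yet the apply takes a long time. The cause is a problem with the GTID setup: the GTID set went backwards (became smaller). Also check the VIP.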
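To find where the time goes in a manager log, one option is to scan for the largest jump between consecutive timestamps. The sketch below is a minimal example; `parse_line` and `largest_gap` are hypothetical helpers, and the sample lines are copied from the Nov 20 failover log earlier in these notes. It assumes an English locale for the day and month names.

```python
import re
from datetime import datetime

# Matches lines such as "Mon Nov 20 15:31:18 2017 - [debug] done."
LINE_RE = re.compile(r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) - \[(\w+)\] ?(.*)$")


def parse_line(line):
    """Return (timestamp, level, message), or None for lines without a timestamp."""
    m = LINE_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
    return ts, m.group(2), m.group(3)


def largest_gap(lines):
    """Return (seconds, earlier_message, later_message) for the biggest forward jump."""
    best = (0.0, None, None)
    prev = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        if prev is not None:
            delta = (parsed[0] - prev[0]).total_seconds()
            # Backward jumps (replayed slave logs) are ignored because delta < 0.
            if delta > best[0]:
                best = (delta, prev[2], parsed[2])
        prev = parsed
    return best


sample = [
    "Mon Nov 20 15:31:18 2017 - [info] Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..",
    "Mon Nov 20 15:31:18 2017 - [debug] Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..",
    "Mon Nov 20 15:32:21 2017 - [debug] done.",
    "Mon Nov 20 15:32:21 2017 - [info] Executed CHANGE MASTER.",
    "Mon Nov 20 15:32:27 2017 - [info] Slave started.",
]
seconds, before, after = largest_gap(sample)
print(seconds, "|", before, "->", after)
```

On these lines the biggest jump is the 63 seconds spent stopping the slave threads on 172.16.12.13, so that step, not CHANGE MASTER itself, is where that failover stalled.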