mha

http://azhuang.blog.51cto.com/9176790/1744269

 http://www.ywnds.com/?p=8129

Manager log after adding the binlog1 parameter

no -o BatchMode=yes -o ConnectTimeout=5, timeout 5
Monitoring server 172.16.12.10 is reachable, Master is not reachable from 172.16.12.10. OK.
Fri Dec  8 19:04:14 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Fri Dec  8 19:04:14 2017 - [warning] Connection failed 2 time(s)..
Monitoring server 172.16.12.13 is reachable, Master is not reachable from 172.16.12.13. OK.
Fri Dec  8 19:04:14 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Fri Dec  8 19:04:14 2017 - [info] HealthCheck: SSH to 172.16.12.11 is reachable.
Fri Dec  8 19:04:17 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Fri Dec  8 19:04:17 2017 - [warning] Connection failed 3 time(s)..
Fri Dec  8 19:04:20 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Fri Dec  8 19:04:20 2017 - [warning] Connection failed 4 time(s)..
Fri Dec  8 19:04:20 2017 - [warning] Master is not reachable from health checker!
Fri Dec  8 19:04:20 2017 - [warning] Master 172.16.12.11(172.16.12.11:3306) is not reachable!
Fri Dec  8 19:04:20 2017 - [warning] SSH is reachable.
Fri Dec  8 19:04:20 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
Fri Dec  8 19:04:20 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Dec  8 19:04:20 2017 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Dec  8 19:04:20 2017 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Dec  8 19:04:20 2017 - [debug] Skipping connecting to dead master 172.16.12.11(172.16.12.11:3306).
Fri Dec  8 19:04:20 2017 - [debug] Connecting to servers..
Fri Dec  8 19:04:20 2017 - [debug]  Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Fri Dec  8 19:04:20 2017 - [debug]  Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Fri Dec  8 19:04:20 2017 - [debug]  Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Fri Dec  8 19:04:20 2017 - [debug]  Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Fri Dec  8 19:04:20 2017 - [debug]  Comparing MySQL versions..
Fri Dec  8 19:04:20 2017 - [debug]   Comparing MySQL versions done.
Fri Dec  8 19:04:20 2017 - [debug] Connecting to servers done.
Fri Dec  8 19:04:20 2017 - [info] GTID failover mode = 1
Fri Dec  8 19:04:20 2017 - [info] Dead Servers:
Fri Dec  8 19:04:20 2017 - [info]   172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:20 2017 - [info] Alive Servers:
Fri Dec  8 19:04:20 2017 - [info]   172.16.12.12(172.16.12.12:3306)
Fri Dec  8 19:04:20 2017 - [info]   172.16.12.13(172.16.12.13:3306)
Fri Dec  8 19:04:20 2017 - [info] Alive Slaves:
Fri Dec  8 19:04:20 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:20 2017 - [info]     GTID ON
Fri Dec  8 19:04:20 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:20 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:20 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Dec  8 19:04:20 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:20 2017 - [info]     GTID ON
Fri Dec  8 19:04:20 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:20 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:20 2017 - [info]     Not candidate for the new Master (no_master is set)
Fri Dec  8 19:04:20 2017 - [info] Checking slave configurations..
Fri Dec  8 19:04:20 2017 - [info]  read_only=1 is not set on slave 172.16.12.12(172.16.12.12:3306).
Fri Dec  8 19:04:20 2017 - [info]  read_only=1 is not set on slave 172.16.12.13(172.16.12.13:3306).
Fri Dec  8 19:04:20 2017 - [info] Checking replication filtering settings..
Fri Dec  8 19:04:20 2017 - [info]  Replication filtering check ok.
Fri Dec  8 19:04:20 2017 - [info] Master is down!
Fri Dec  8 19:04:20 2017 - [info] Terminating monitoring script.
Fri Dec  8 19:04:20 2017 - [debug]  Disconnected from 172.16.12.12(172.16.12.12:3306)
Fri Dec  8 19:04:20 2017 - [debug]  Disconnected from 172.16.12.13(172.16.12.13:3306)
Fri Dec  8 19:04:20 2017 - [info] Got exit code 20 (Master dead).
Fri Dec  8 19:04:20 2017 - [info] MHA::MasterFailover version 0.57.
Fri Dec  8 19:04:20 2017 - [info] Starting master failover.
Fri Dec  8 19:04:20 2017 - [info] 
Fri Dec  8 19:04:20 2017 - [info] * Phase 1: Configuration Check Phase..
Fri Dec  8 19:04:20 2017 - [info] 
Fri Dec  8 19:04:20 2017 - [debug] SSH connection test to 172.16.12.11, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5
Fri Dec  8 19:04:20 2017 - [info] HealthCheck: SSH to 172.16.12.11 is reachable.
Fri Dec  8 19:04:21 2017 - [info] Binlog server 172.16.12.11 is reachable.
Fri Dec  8 19:04:21 2017 - [debug] Skipping connecting to dead master 172.16.12.11.
Fri Dec  8 19:04:21 2017 - [debug] Connecting to servers..
Fri Dec  8 19:04:21 2017 - [debug]  Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Fri Dec  8 19:04:21 2017 - [debug]  Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Fri Dec  8 19:04:21 2017 - [debug]  Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Fri Dec  8 19:04:21 2017 - [debug]  Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Fri Dec  8 19:04:21 2017 - [debug]  Comparing MySQL versions..
Fri Dec  8 19:04:21 2017 - [debug]   Comparing MySQL versions done.
Fri Dec  8 19:04:21 2017 - [debug] Connecting to servers done.
Fri Dec  8 19:04:21 2017 - [info] GTID failover mode = 1
Fri Dec  8 19:04:21 2017 - [info] Dead Servers:
Fri Dec  8 19:04:21 2017 - [info]   172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:21 2017 - [info] Checking master reachability via MySQL(double check)...
Fri Dec  8 19:04:21 2017 - [info]  ok.
Fri Dec  8 19:04:21 2017 - [info] Alive Servers:
Fri Dec  8 19:04:21 2017 - [info]   172.16.12.12(172.16.12.12:3306)
Fri Dec  8 19:04:21 2017 - [info]   172.16.12.13(172.16.12.13:3306)
Fri Dec  8 19:04:21 2017 - [info] Alive Slaves:
Fri Dec  8 19:04:21 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:21 2017 - [info]     GTID ON
Fri Dec  8 19:04:21 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:21 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:21 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Dec  8 19:04:21 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:21 2017 - [info]     GTID ON
Fri Dec  8 19:04:21 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:21 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:21 2017 - [info]     Not candidate for the new Master (no_master is set)
Fri Dec  8 19:04:22 2017 - [info] Starting GTID based failover.
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Fri Dec  8 19:04:22 2017 - [info] Executing master IP deactivation script:
Fri Dec  8 19:04:22 2017 - [info]   /usr/local/bin/master_ip_failover --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --command=stopssh --ssh_user=root  
Fri Dec  8 19:04:22 2017 - [debug]  Stopping IO thread on 172.16.12.13(172.16.12.13:3306)..
Fri Dec  8 19:04:22 2017 - [debug]  Stopping IO thread on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:04:22 2017 - [debug]  Stop IO thread on 172.16.12.13(172.16.12.13:3306) done.
Fri Dec  8 19:04:22 2017 - [debug]  Stop IO thread on 172.16.12.12(172.16.12.12:3306) done.


IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===

Disabling the VIP on old master: 172.16.12.11 
SIOCSIFFLAGS: Cannot assign requested address
Fri Dec  8 19:04:22 2017 - [info]  done.
Fri Dec  8 19:04:22 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Fri Dec  8 19:04:22 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] * Phase 3: Master Recovery Phase..
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [debug] Fetching current slave status..
Fri Dec  8 19:04:22 2017 - [debug]  Fetching current slave status done.
Fri Dec  8 19:04:22 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000007:4095683
Fri Dec  8 19:04:22 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:16087-21782
Fri Dec  8 19:04:22 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Fri Dec  8 19:04:22 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:22 2017 - [info]     GTID ON
Fri Dec  8 19:04:22 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:22 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:22 2017 - [info]     Not candidate for the new Master (no_master is set)
Fri Dec  8 19:04:22 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000007:1795676
Fri Dec  8 19:04:22 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:16087-18583
Fri Dec  8 19:04:22 2017 - [info] Oldest slaves:
Fri Dec  8 19:04:22 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:22 2017 - [info]     GTID ON
Fri Dec  8 19:04:22 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:22 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:22 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] * Phase 3.3: Determining New Master Phase..
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [debug] Checking replication delay on 172.16.12.12(172.16.12.12:3306).. 
Fri Dec  8 19:04:22 2017 - [debug]  ok.
Fri Dec  8 19:04:22 2017 - [info] Searching new master from slaves..
Fri Dec  8 19:04:22 2017 - [info]  Candidate masters from the configuration file:
Fri Dec  8 19:04:22 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:22 2017 - [info]     GTID ON
Fri Dec  8 19:04:22 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:22 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:22 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Dec  8 19:04:22 2017 - [info]  Non-candidate masters:
Fri Dec  8 19:04:22 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Fri Dec  8 19:04:22 2017 - [info]     GTID ON
Fri Dec  8 19:04:22 2017 - [debug]    Relay log info repository: TABLE
Fri Dec  8 19:04:22 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Fri Dec  8 19:04:22 2017 - [info]     Not candidate for the new Master (no_master is set)
Fri Dec  8 19:04:22 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Fri Dec  8 19:04:22 2017 - [info]   Not found.
Fri Dec  8 19:04:22 2017 - [info]  Searching from all candidate_master slaves..
Fri Dec  8 19:04:22 2017 - [info] New master is 172.16.12.12(172.16.12.12:3306)
Fri Dec  8 19:04:22 2017 - [info] Starting master failover..
Fri Dec  8 19:04:22 2017 - [info] 
From:
172.16.12.11(172.16.12.11:3306) (current master)
 +--172.16.12.12(172.16.12.12:3306)
 +--172.16.12.13(172.16.12.13:3306)

To:
172.16.12.12(172.16.12.12:3306) (new master)
 +--172.16.12.13(172.16.12.13:3306)
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Fri Dec  8 19:04:22 2017 - [info] 
Fri Dec  8 19:04:22 2017 - [info]  Waiting all logs to be applied.. 
Fri Dec  8 19:04:22 2017 - [info]   done.
Fri Dec  8 19:04:22 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:05:24 2017 - [debug]   done.
Fri Dec  8 19:05:24 2017 - [info]  Replicating from the latest slave 172.16.12.13(172.16.12.13:3306) and waiting to apply..
Fri Dec  8 19:05:24 2017 - [info]  Waiting all logs to be applied on the latest slave.. 
Fri Dec  8 19:05:24 2017 - [info]  Resetting slave 172.16.12.12(172.16.12.12:3306) and starting replication from the new master 172.16.12.13(172.16.12.13:3306)..
Fri Dec  8 19:05:24 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:05:24 2017 - [debug]   done.
Fri Dec  8 19:05:24 2017 - [info]  Executed CHANGE MASTER.
Fri Dec  8 19:05:24 2017 - [debug]  Starting slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:05:24 2017 - [debug]   done.
Fri Dec  8 19:05:24 2017 - [info]  Slave started.
Fri Dec  8 19:05:24 2017 - [info]  Waiting to execute all relay logs on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:06:12 2017 - [info]  master_pos_wait(mysql-bin.000003:15419438) completed on 172.16.12.12(172.16.12.12:3306). Executed 156 events.
Fri Dec  8 19:06:12 2017 - [info]   done.
Fri Dec  8 19:06:12 2017 - [debug]  Stopping SQL thread on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:06:12 2017 - [debug]   done.
Fri Dec  8 19:06:12 2017 - [info]   done.
Fri Dec  8 19:06:12 2017 - [info] -- Saving binlog from host 172.16.12.11 started, pid: 2735
Fri Dec  8 19:06:18 2017 - [info] 
Fri Dec  8 19:06:18 2017 - [info] Log messages from 172.16.12.11 ...
Fri Dec  8 19:06:18 2017 - [info] 
Fri Dec  8 19:06:12 2017 - [info] Fetching binary logs from binlog server 172.16.12.11..
Fri Dec  8 19:06:12 2017 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000007  --start_pos=4095618 --output_file=/tmp/saved_binlog_binlog1_20171208190420.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.57 --oldest_version=5.7.9-log  --debug  --binlog_dir=/home/data/mysql57/log 
  Creating /tmp if not exists..    ok.
 Concat binary/relay logs from mysql-bin.000007 pos 4095618 to mysql-bin.000007 EOF into /tmp/saved_binlog_binlog1_20171208190420.binlog ..
Executing command: mysqlbinlog --start-position=4095618  /home/data/mysql57/log/mysql-bin.000007 >> /tmp/saved_binlog_binlog1_20171208190420.binlog
 Concat succeeded.
Fri Dec  8 19:06:18 2017 - [info] scp from root@172.16.12.11:/tmp/saved_binlog_binlog1_20171208190420.binlog to local:/var/log/masterha/app1/saved_binlog_172.16.12.11_binlog1_20171208190420.binlog succeeded.
Fri Dec  8 19:06:18 2017 - [info] End of log messages from 172.16.12.11.
Fri Dec  8 19:06:18 2017 - [info] Saved mysqlbinlog size from 172.16.12.11 is 61894698 bytes.
Fri Dec  8 19:06:18 2017 - [info] Applying differential binlog /var/log/masterha/app1/saved_binlog_172.16.12.11_binlog1_20171208190420.binlog ..
Fri Dec  8 19:10:15 2017 - [info] Differential log apply from binlog server succeeded.
Fri Dec  8 19:10:15 2017 - [info] Getting new master's binlog name and position..
Fri Dec  8 19:10:15 2017 - [info]  mysql-bin.000006:61413022
Fri Dec  8 19:10:15 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.12.12', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Fri Dec  8 19:10:15 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000006, 61413022, 865e07c9-bae8-11e7-8aba-08002729e4f7:1-74770
Fri Dec  8 19:10:15 2017 - [info] Executing master IP activate script:
Fri Dec  8 19:10:15 2017 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --new_master_host=172.16.12.12 --new_master_ip=172.16.12.12 --new_master_port=3306 --new_master_user='root'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password


IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===

Enabling the VIP - 172.16.12.100/24 on the new master - 172.16.12.12 
Fri Dec  8 19:10:19 2017 - [info]  OK.
Fri Dec  8 19:10:19 2017 - [info] ** Finished master recovery successfully.
Fri Dec  8 19:10:19 2017 - [info] * Phase 3: Master Recovery Phase completed.
Fri Dec  8 19:10:19 2017 - [info] 
Fri Dec  8 19:10:19 2017 - [info] * Phase 4: Slaves Recovery Phase..
Fri Dec  8 19:10:19 2017 - [info] 
Fri Dec  8 19:10:19 2017 - [info] 
Fri Dec  8 19:10:19 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Fri Dec  8 19:10:19 2017 - [info] 
Fri Dec  8 19:10:19 2017 - [info] -- Slave recovery on host 172.16.12.13(172.16.12.13:3306) started, pid: 2751. Check tmp log /var/log/masterha/app1/172.16.12.13_3306_20171208190420.log if it takes time..
Fri Dec  8 19:25:16 2017 - [info] 
Fri Dec  8 19:25:16 2017 - [info] Log messages from 172.16.12.13 ...
Fri Dec  8 19:25:16 2017 - [info] 
Fri Dec  8 19:10:19 2017 - [info]  Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:10:19 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Fri Dec  8 19:11:21 2017 - [debug]   done.
Fri Dec  8 19:11:21 2017 - [info]  Executed CHANGE MASTER.
Fri Dec  8 19:11:21 2017 - [debug]  Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Fri Dec  8 19:11:22 2017 - [debug]   done.
Fri Dec  8 19:11:22 2017 - [info]  Slave started.
Fri Dec  8 19:25:16 2017 - [info]  gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-74770) completed on 172.16.12.13(172.16.12.13:3306). Executed 2677 events.
Fri Dec  8 19:25:16 2017 - [info] End of log messages from 172.16.12.13.
Fri Dec  8 19:25:16 2017 - [info] -- Slave on host 172.16.12.13(172.16.12.13:3306) started.
Fri Dec  8 19:25:16 2017 - [info] All new slave servers recovered successfully.
Fri Dec  8 19:25:16 2017 - [info] 
Fri Dec  8 19:25:16 2017 - [info] * Phase 5: New master cleanup phase..
Fri Dec  8 19:25:16 2017 - [info] 
Fri Dec  8 19:25:16 2017 - [info] Resetting slave info on the new master..
Fri Dec  8 19:25:16 2017 - [debug]  Clearing slave info..
Fri Dec  8 19:25:16 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Fri Dec  8 19:25:16 2017 - [debug]   done.
Fri Dec  8 19:25:16 2017 - [debug]  SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
Fri Dec  8 19:25:16 2017 - [info]  172.16.12.12: Resetting slave info succeeded.
Fri Dec  8 19:25:16 2017 - [info] Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
Fri Dec  8 19:25:16 2017 - [debug]  Disconnected from 172.16.12.12(172.16.12.12:3306)
Fri Dec  8 19:25:16 2017 - [debug]  Disconnected from 172.16.12.13(172.16.12.13:3306)
Fri Dec  8 19:25:16 2017 - [info] 

----- Failover Report -----

app1: MySQL Master failover 172.16.12.11(172.16.12.11:3306) to 172.16.12.12(172.16.12.12:3306) succeeded

Master 172.16.12.11(172.16.12.11:3306) is down!

Check MHA Manager logs at db10:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.12.11(172.16.12.11:3306)
Selected 172.16.12.12(172.16.12.12:3306) as a new master.
172.16.12.12(172.16.12.12:3306): OK: Applying all logs succeeded.
172.16.12.12(172.16.12.12:3306): OK: Activated master IP address.
172.16.12.13(172.16.12.13:3306): OK: Slave started, replicating from 172.16.12.12(172.16.12.12:3306)
172.16.12.12(172.16.12.12:3306): Resetting slave info succeeded.
Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
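
To see where the failover spent its time, the phase markers can be pulled out of the manager log together with their timestamps. A minimal sketch, assuming the log layout shown above; the function name mha_phases is made up for illustration and is not part of MHA.

```shell
#!/usr/bin/env bash
# mha_phases: print the timestamp and title of every "Phase" line in an MHA manager log.
# Hypothetical helper; it relies only on the "<date> - [info] * Phase N: ..." layout seen above.
mha_phases() {
    local logfile="$1"
    # Keep lines such as "... - [info] * Phase 2: Dead Master Shutdown Phase.."
    # and drop the "- [info] *" part so only the date and the phase title remain.
    grep -E '\[info\] \*+ Phase [0-9.]+:' "$logfile" \
        | sed -E 's/^(.*) - \[info\] \*+ (Phase .*)$/\1  \2/'
}

# Usage (path taken from the failover report above):
# mha_phases /var/log/masterha/app1/manager.log
```

In the run above this would make the long gaps visible: Phase 3 ends at 19:10:19 and Phase 5 only starts at 19:25:16.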

  

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y

[root@localhost tools]# rpm -ivh mha4mysql-manager-0.57-0.el7.noarch.rpm
error: Failed dependencies:
perl(Log::Dispatch) is needed by mha4mysql-manager-0.57-0.el7.noarch
perl(Log::Dispatch::File) is needed by mha4mysql-manager-0.57-0.el7.noarch
perl(Log::Dispatch::Screen) is needed by mha4mysql-manager-0.57-0.el7.noarch
perl(Parallel::ForkManager) is needed by mha4mysql-manager-0.57-0.el7.noarch

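Some of these packages are not available in the stock CentOS 6.8 repositories; install the EPEL release package first: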
wget http://mirror.centos.org/centos/6/extras/x86_64/Packages/epel-release-6-8.noarch.rpm
rpm -ivh epel-release-6-8.noarch.rpm

################################

Clone the VirtualBox VM

Click to change the MAC address

 vi  /etc/udev/rules.d/70-persistent-net.rules

vi /etc/sysconfig/network-scripts/ifcfg-eth0

Change the hostname:

 vim /etc/hosts

vim /etc/sysconfig/network

###########################################

yum troubleshooting:

1. Check the network: ping www.baidu.com

2. Check the gateway: route -n

3. Check the firewall and SELinux

4. Check /etc/resolv.conf and /etc/hosts
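
The four checks can be strung together in one script. A rough sketch, assuming CentOS 6 tooling (route, getenforce, service iptables); the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# has_nameserver: succeed if the given resolver file lists at least one nameserver.
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "${1:-/etc/resolv.conf}"
}

# yum_precheck: run the checks from the list above and print one result line per check.
yum_precheck() {
    ping -c 1 -W 2 www.baidu.com >/dev/null 2>&1 \
        && echo "network: ok" || echo "network: FAILED"
    # A default route shows up with destination 0.0.0.0 in "route -n".
    route -n | grep -q '^0\.0\.0\.0' \
        && echo "gateway: ok" || echo "gateway: no default route"
    echo "selinux: $(getenforce 2>/dev/null || echo unknown)"
    service iptables status >/dev/null 2>&1 \
        && echo "iptables: running" || echo "iptables: stopped or absent"
    has_nameserver /etc/resolv.conf \
        && echo "dns: ok" || echo "dns: no nameserver in /etc/resolv.conf"
}

# Usage:
# yum_precheck
```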

###########################################

Set up passwordless SSH between all nodes

[root@db13 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@db10
-bash: ssh-copy-id: command not found

yum -y install openssh-clients

ssh-copy-id -i ~/.ssh/id_rsa.pub root@IP

Verify: cat ~/.ssh/authorized_keys
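
With four nodes the key has to be copied to every host from every host, so a loop saves typing. A sketch, assuming the host list below (taken from the logs in this file) and that ssh-keygen has already been run on each node; copy_key_cmds is a hypothetical helper that only prints the commands.

```shell
#!/usr/bin/env bash
# Hosts as they appear in the logs above; adjust to the real topology.
MHA_HOSTS="172.16.12.10 172.16.12.11 172.16.12.12 172.16.12.13"

# copy_key_cmds: print one ssh-copy-id command per host (dry run, nothing is executed).
copy_key_cmds() {
    local host
    for host in $MHA_HOSTS; do
        # "~" stays literal here and is expanded later by the shell that runs the command.
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@${host}"
    done
}

# Review the output first, then run it on every node:
# copy_key_cmds | sh
```

The list includes the local host on purpose, which also covers the case noted further down where the manager needs its own key in authorized_keys.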

###########################################

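                Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server UUIDs; these UUIDs must be different for replication to work.

Cause: the host was cloned, so both servers carry the same auto.cnf.

Fix: delete auto.cnf and restart MySQL; a new file with a new server UUID is generated.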
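
Before deleting anything it is easy to confirm that the clone really shares the UUID. A sketch, assuming the two auto.cnf files have been copied side by side; the helper names are made up, and the location of auto.cnf (the MySQL data directory) is not given in these notes.

```shell
#!/usr/bin/env bash
# server_uuid_of: print the server UUID stored in an auto.cnf file.
# Accepts both "server-uuid=" and "server_uuid=" spellings.
server_uuid_of() {
    sed -n 's/^server[-_]uuid=//p' "$1"
}

# same_uuid: succeed if two auto.cnf files carry the same UUID (the cloned-VM symptom).
same_uuid() {
    [ "$(server_uuid_of "$1")" = "$(server_uuid_of "$2")" ]
}

# Usage (file names are placeholders):
# same_uuid master_auto.cnf slave_auto.cnf && echo "duplicate UUID: remove auto.cnf on the clone"
```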

  Last_Error: Slave failed to initialize relay log info structure from the repository

Fix:

stop slave; reset slave; start slave; show slave status;  # check the latest status: replication is back to normal

#############################################

masterha_check_ssh --conf=/etc/mha/app1.cnf  fails with:

[error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from

On the manager, its own public key must also be added to authorized_keys.
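
Appending the key by hand tends to leave duplicates after a few attempts. A small sketch of an idempotent append; add_key_once is a hypothetical helper, and the key paths in the usage line are the defaults used elsewhere in these notes.

```shell
#!/usr/bin/env bash
# add_key_once: append a public key to an authorized_keys file unless it is already there.
add_key_once() {
    local pubkey="$1" authfile="$2"
    touch "$authfile" && chmod 600 "$authfile"
    # -x whole-line match, -F fixed string, -q quiet; append only when no line matches.
    grep -qxF -e "$(cat "$pubkey")" "$authfile" || cat "$pubkey" >> "$authfile"
}

# On the manager:
# add_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```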

##############################################

 masterha_check_repl --conf=/etc/mha/app1.cnf  reports an error; fixed by granting the missing privileges.

####################################################

masterha_check_ssh --conf=/etc/mha/app1.cnf 

 masterha_check_repl --conf=/etc/mha/app1.cnf 

Start:

nohup /usr/bin/masterha_manager --conf=/etc/mha/app1.cnf > /var/log/masterha/app1/manager.log 2>&1 &

masterha_check_status --conf=/etc/mha/app1.cnf 

###################################################

After the master was taken down, the slave hit error 1236 when connecting to the new master.

                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'

  

mysql> stop slave;
Query OK, 0 rows affected (0.06 sec)

mysql> reset slave;
Query OK, 0 rows affected (0.06 sec)

mysql> reset master;
Query OK, 0 rows affected (0.07 sec)

mysql> set global gtid_purged='865e07c9-bae8-11e7-8aba-08002729e4f7:128730';
Query OK, 0 rows affected (0.00 sec)

mysql> change master to master_host='172.xxxx',master_port=3306,master_user='root',master_password='root123',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)

mysql> start slave;
Query OK, 0 rows affected (0.02 sec)

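  Error 1236 persists. After restarting the slave, error 1872 appears.

       If it still fails after CHANGE MASTER, check whether the GTIDs are aligned with the master: show master status;

       Check select * from mysql.slave_relay_log_info; and check the relay log.

       Run CHANGE MASTER again. Error 1236 generally means the GTID sets of master and slave are not aligned.

Slaves must be set to read_only=on; otherwise the GTIDs can be out of alignment at failover time, which leads to error 1236.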
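
One thing worth checking in the session above: gtid_purged was set to 'uuid:128730', which in GTID set syntax names a single transaction. A set covering everything up to N is written 'uuid:1-N'. The sketch below only builds and prints the statements; the helper names are made up. It is valid only if the slave's data really contains every transaction up to N, which these notes do not confirm.

```shell
#!/usr/bin/env bash
# gtid_range: build the GTID set "uuid:1-N", i.e. every transaction of that server up to N.
gtid_range() {
    local uuid="$1" last="$2"
    echo "${uuid}:1-${last}"
}

# purged_sql: print (not execute) the statements that mark transactions 1..N as already applied.
# RESET MASTER comes first because MySQL 5.7 only accepts gtid_purged while gtid_executed is empty.
purged_sql() {
    local uuid="$1" last="$2"
    cat <<EOF
RESET MASTER;
SET GLOBAL gtid_purged='$(gtid_range "$uuid" "$last")';
EOF
}

# Usage (values from the session above; review before piping into mysql):
# purged_sql 865e07c9-bae8-11e7-8aba-08002729e4f7 128730
```

Whether read_only is actually on can be confirmed on each slave with SELECT @@global.read_only; the manager log above reports "read_only=1 is not set" for both slaves.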

 ##############################

sysbench /usr/share/sysbench/tests/include/oltp_legacy/insert.lua  --oltp-tables-count=4 --oltp-table-size=10000000 --oltp-dist-res=95  --mysql-user=root --mysql-password=root123 --mysql-db=sbtest --db-driver=mysql --mysql-socket=/home/data/mysql57/run/mysql.sock  --num-threads=4 --max-requests=0 --max-time=300 --report-interval=3 run

Shut down the master database while the load is running; MHA fails over automatically. The log is as follows:




################log1
Thu Nov 16 17:06:58 2017 - [debug] Set short wait_timeout on master: 3 seconds
Thu Nov 16 17:06:58 2017 - [debug] Trying to get advisory lock..
Thu Nov 16 17:07:01 2017 - [debug] Connected on master.
Thu Nov 16 17:07:01 2017 - [debug] Set short wait_timeout on master: 3 seconds
Thu Nov 16 17:07:01 2017 - [debug] Trying to get advisory lock..
Thu Nov 16 17:07:04 2017 - [warning] Got error on MySQL connect ping: DBI connect(';host=172.16.12.11;port=3306;mysql_connect_timeout=1','root',...) failed: Lost connection to MySQL server at 'reading initial communication packet', system error: 111 at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 97
2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Nov 16 17:07:04 2017 - [info] Executing SSH check script: exit 0
Thu Nov 16 17:07:04 2017 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 172.16.12.11 -s 172.16.12.12 --user=root --master_host=172.16.12.10 --master_port=3306  --user=root  --master_host=172.16.12.11  --master_ip=172.16.12.11  --master_port=3306 --master_user=root --master_password=root123 --ping_type=CONNECT
Thu Nov 16 17:07:04 2017 - [debug] SSH connection test to 172.16.12.11, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5
Thu Nov 16 17:07:05 2017 - [info] HealthCheck: SSH to 172.16.12.11 is reachable.
Monitoring server 172.16.12.11 is reachable, Master is not reachable from 172.16.12.11. OK.
Thu Nov 16 17:07:07 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Nov 16 17:07:07 2017 - [warning] Connection failed 2 time(s)..
Monitoring server 172.16.12.12 is reachable, Master is not reachable from 172.16.12.12. OK.
Thu Nov 16 17:07:07 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Thu Nov 16 17:07:10 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Nov 16 17:07:10 2017 - [warning] Connection failed 3 time(s)..
Thu Nov 16 17:07:13 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Nov 16 17:07:13 2017 - [warning] Connection failed 4 time(s)..
Thu Nov 16 17:07:13 2017 - [warning] Master is not reachable from health checker!
Thu Nov 16 17:07:13 2017 - [warning] Master 172.16.12.11(172.16.12.11:3306) is not reachable!
Thu Nov 16 17:07:13 2017 - [warning] SSH is reachable.
Thu Nov 16 17:07:13 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
Thu Nov 16 17:07:13 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Nov 16 17:07:13 2017 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Nov 16 17:07:13 2017 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Nov 16 17:07:13 2017 - [debug] Skipping connecting to dead master 172.16.12.11(172.16.12.11:3306).
Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers..
Thu Nov 16 17:07:13 2017 - [debug]  Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Thu Nov 16 17:07:13 2017 - [debug]  Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Thu Nov 16 17:07:13 2017 - [debug]  Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Thu Nov 16 17:07:13 2017 - [debug]  Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Thu Nov 16 17:07:13 2017 - [debug]  Comparing MySQL versions..
Thu Nov 16 17:07:13 2017 - [debug]   Comparing MySQL versions done.
Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers done.
Thu Nov 16 17:07:13 2017 - [info] GTID failover mode = 1
Thu Nov 16 17:07:13 2017 - [info] Dead Servers:
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:13 2017 - [info] Alive Servers:
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:13 2017 - [info] Alive Slaves:
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:13 2017 - [info]     GTID ON
Thu Nov 16 17:07:13 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:13 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:13 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:13 2017 - [info]     GTID ON
Thu Nov 16 17:07:13 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:13 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:13 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:13 2017 - [info] Checking slave configurations..
Thu Nov 16 17:07:13 2017 - [info]  read_only=1 is not set on slave 172.16.12.12(172.16.12.12:3306).
Thu Nov 16 17:07:13 2017 - [info] Checking replication filtering settings..
Thu Nov 16 17:07:13 2017 - [info]  Replication filtering check ok.
Thu Nov 16 17:07:13 2017 - [info] Master is down!
Thu Nov 16 17:07:13 2017 - [info] Terminating monitoring script.
Thu Nov 16 17:07:13 2017 - [debug]  Disconnected from 172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:13 2017 - [debug]  Disconnected from 172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:13 2017 - [info] Got exit code 20 (Master dead).
Thu Nov 16 17:07:13 2017 - [info] MHA::MasterFailover version 0.57.
Thu Nov 16 17:07:13 2017 - [info] Starting master failover.
Thu Nov 16 17:07:13 2017 - [info] 
Thu Nov 16 17:07:13 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Nov 16 17:07:13 2017 - [info] 
Thu Nov 16 17:07:13 2017 - [debug] Skipping connecting to dead master 172.16.12.11.
Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers..
Thu Nov 16 17:07:13 2017 - [debug]  Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Thu Nov 16 17:07:13 2017 - [debug]  Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Thu Nov 16 17:07:13 2017 - [debug]  Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Thu Nov 16 17:07:13 2017 - [debug]  Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Thu Nov 16 17:07:13 2017 - [debug]  Comparing MySQL versions..
Thu Nov 16 17:07:13 2017 - [debug]   Comparing MySQL versions done.
Thu Nov 16 17:07:13 2017 - [debug] Connecting to servers done.
Thu Nov 16 17:07:13 2017 - [info] GTID failover mode = 1
Thu Nov 16 17:07:13 2017 - [info] Dead Servers:
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:13 2017 - [info] Checking master reachability via MySQL(double check)...
Thu Nov 16 17:07:13 2017 - [info]  ok.
Thu Nov 16 17:07:13 2017 - [info] Alive Servers:
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:13 2017 - [info] Alive Slaves:
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:13 2017 - [info]     GTID ON
Thu Nov 16 17:07:13 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:13 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:13 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:13 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:13 2017 - [info]     GTID ON
Thu Nov 16 17:07:13 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:13 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:13 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:13 2017 - [info] Starting GTID based failover.
Thu Nov 16 17:07:13 2017 - [info] 
Thu Nov 16 17:07:13 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Nov 16 17:07:13 2017 - [info] 
Thu Nov 16 17:07:13 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Nov 16 17:07:13 2017 - [info] 
Thu Nov 16 17:07:13 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Nov 16 17:07:13 2017 - [info] Executing master IP deactivation script:
Thu Nov 16 17:07:13 2017 - [info]   /usr/local/bin/master_ip_failover --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --command=stopssh --ssh_user=root  
Thu Nov 16 17:07:13 2017 - [debug]  Stopping IO thread on 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:13 2017 - [debug]  Stopping IO thread on 172.16.12.13(172.16.12.13:3306)..


IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===

Disabling the VIP on old master: 172.16.12.11 
Thu Nov 16 17:07:14 2017 - [debug]  Stop IO thread on 172.16.12.13(172.16.12.13:3306) done.
Thu Nov 16 17:07:14 2017 - [debug]  Stop IO thread on 172.16.12.12(172.16.12.12:3306) done.
SIOCSIFFLAGS: Cannot assign requested address
Thu Nov 16 17:07:14 2017 - [info]  done.
Thu Nov 16 17:07:14 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Nov 16 17:07:14 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [info] * Phase 3: Master Recovery Phase..
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [debug] Fetching current slave status..
Thu Nov 16 17:07:14 2017 - [debug]  Fetching current slave status done.
Thu Nov 16 17:07:14 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000022:4912402
Thu Nov 16 17:07:14 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:129029-135860
Thu Nov 16 17:07:14 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Nov 16 17:07:14 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info]     GTID ON
Thu Nov 16 17:07:14 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:14 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info]     GTID ON
Thu Nov 16 17:07:14 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:14 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000022:4912402
Thu Nov 16 17:07:14 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:129029-135860
Thu Nov 16 17:07:14 2017 - [info] Oldest slaves:
Thu Nov 16 17:07:14 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info]     GTID ON
Thu Nov 16 17:07:14 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:14 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info]     GTID ON
Thu Nov 16 17:07:14 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [info] * Phase 3.3: Determining New Master Phase..
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [info] Searching new master from slaves..
Thu Nov 16 17:07:14 2017 - [info]  Candidate masters from the configuration file:
Thu Nov 16 17:07:14 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info]     GTID ON
Thu Nov 16 17:07:14 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Nov 16 17:07:14 2017 - [info]  Non-candidate masters:
Thu Nov 16 17:07:14 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Thu Nov 16 17:07:14 2017 - [info]     GTID ON
Thu Nov 16 17:07:14 2017 - [debug]    Relay log info repository: TABLE
Thu Nov 16 17:07:14 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Thu Nov 16 17:07:14 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Nov 16 17:07:14 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Thu Nov 16 17:07:14 2017 - [info] New master is 172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:14 2017 - [info] Starting master failover..
Thu Nov 16 17:07:14 2017 - [info] 
From:
172.16.12.11(172.16.12.11:3306) (current master)
 +--172.16.12.12(172.16.12.12:3306)
 +--172.16.12.13(172.16.12.13:3306)

To:
172.16.12.12(172.16.12.12:3306) (new master)
 +--172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Thu Nov 16 17:07:14 2017 - [info] 
Thu Nov 16 17:07:14 2017 - [info]  Waiting all logs to be applied.. 
Thu Nov 16 17:07:14 2017 - [info]   done.
Thu Nov 16 17:07:14 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:14 2017 - [debug]   done.
Thu Nov 16 17:07:14 2017 - [info] Getting new master's binlog name and position..
Thu Nov 16 17:07:14 2017 - [info]  mysql-bin.000001:4837210
Thu Nov 16 17:07:14 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.12.12', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Nov 16 17:07:14 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 4837210, 865e07c9-bae8-11e7-8aba-08002729e4f7:1-135860
Thu Nov 16 17:07:14 2017 - [info] Executing master IP activate script:
Thu Nov 16 17:07:14 2017 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --new_master_host=172.16.12.12 --new_master_ip=172.16.12.12 --new_master_port=3306 --new_master_user='root'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password


IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===

Enabling the VIP - 172.16.12.100/24 on the new master - 172.16.12.12 
Thu Nov 16 17:07:15 2017 - [info]  OK.
Thu Nov 16 17:07:15 2017 - [info] ** Finished master recovery successfully.
Thu Nov 16 17:07:15 2017 - [info] * Phase 3: Master Recovery Phase completed.
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] * Phase 4: Slaves Recovery Phase..
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] -- Slave recovery on host 172.16.12.13(172.16.12.13:3306) started, pid: 6148. Check tmp log /var/log/masterha/app1/172.16.12.13_3306_20171116170713.log if it takes time..
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] Log messages from 172.16.12.13 ...
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info]  Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:15 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Thu Nov 16 17:07:15 2017 - [debug]   done.
Thu Nov 16 17:07:15 2017 - [info]  Executed CHANGE MASTER.
Thu Nov 16 17:07:15 2017 - [debug]  Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Thu Nov 16 17:07:15 2017 - [debug]   done.
Thu Nov 16 17:07:15 2017 - [info]  Slave started.
Thu Nov 16 17:07:15 2017 - [info]  gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-135860) completed on 172.16.12.13(172.16.12.13:3306). Executed 0 events.
Thu Nov 16 17:07:15 2017 - [info] End of log messages from 172.16.12.13.
Thu Nov 16 17:07:15 2017 - [info] -- Slave on host 172.16.12.13(172.16.12.13:3306) started.
Thu Nov 16 17:07:15 2017 - [info] All new slave servers recovered successfully.
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] * Phase 5: New master cleanup phase..
Thu Nov 16 17:07:15 2017 - [info] 
Thu Nov 16 17:07:15 2017 - [info] Resetting slave info on the new master..
Thu Nov 16 17:07:15 2017 - [debug]  Clearing slave info..
Thu Nov 16 17:07:15 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Thu Nov 16 17:07:15 2017 - [debug]   done.
Thu Nov 16 17:07:15 2017 - [debug]  SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
Thu Nov 16 17:07:15 2017 - [info]  172.16.12.12: Resetting slave info succeeded.
Thu Nov 16 17:07:15 2017 - [info] Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
Thu Nov 16 17:07:15 2017 - [debug]  Disconnected from 172.16.12.12(172.16.12.12:3306)
Thu Nov 16 17:07:15 2017 - [debug]  Disconnected from 172.16.12.13(172.16.12.13:3306)
Thu Nov 16 17:07:16 2017 - [info] 

----- Failover Report -----

app1: MySQL Master failover 172.16.12.11(172.16.12.11:3306) to 172.16.12.12(172.16.12.12:3306) succeeded

Master 172.16.12.11(172.16.12.11:3306) is down!

Check MHA Manager logs at db10:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.12.11(172.16.12.11:3306)
Selected 172.16.12.12(172.16.12.12:3306) as a new master.
172.16.12.12(172.16.12.12:3306): OK: Applying all logs succeeded.
172.16.12.12(172.16.12.12:3306): OK: Activated master IP address.
172.16.12.13(172.16.12.13:3306): OK: Slave started, replicating from 172.16.12.12(172.16.12.12:3306)
172.16.12.12(172.16.12.12:3306): Resetting slave info succeeded.
Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.

 Workflow: 1. elect the new master; 2. switch the VIP; 3. run CHANGE MASTER on the remaining slave and start replication.
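
The flow summarized above can be traced in the manager log by pulling out the phase-start lines. Below is a minimal Python sketch, assuming the log format shown in this post (MHA 0.57). The function name `phase_timeline` and the regular expression are illustrative and not part of MHA; the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches phase-start lines such as
# "Thu Nov 16 17:07:13 2017 - [info] * Phase 1: Configuration Check Phase.."
# Lines ending in "completed." (single dot, "**" prefix) are not matched.
PHASE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) - \[info\] "
    r"\* (?P<name>Phase [\d.]+: .+?)\.\."
)


def phase_timeline(log_text):
    """Return [(datetime, phase name)] for every phase-start line."""
    timeline = []
    for line in log_text.splitlines():
        m = PHASE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
            timeline.append((ts, m.group("name")))
    return timeline


SAMPLE = """\
Thu Nov 16 17:07:13 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Nov 16 17:07:13 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Nov 16 17:07:13 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Nov 16 17:07:14 2017 - [info] * Phase 3: Master Recovery Phase..
Thu Nov 16 17:07:15 2017 - [info] * Phase 4: Slaves Recovery Phase..
Thu Nov 16 17:07:15 2017 - [info] * Phase 5: New master cleanup phase..
"""

for ts, name in phase_timeline(SAMPLE):
    print(ts.time(), name)
```

Feeding the whole `manager.log` to `phase_timeline` gives a quick view of how long each phase of a failover took.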

 

Manual graceful (online) switchover:

masterha_master_switch --master_state=alive  --conf=/etc/mha/app1.cnf --orig_master_is_new_slave 

###################

Asynchronous (non-semi-sync) replication, master stopped with mysqladmin shutdown

 set global rpl_semi_sync_master_enabled=OFF; 

set global rpl_semi_sync_slave_enabled =OFF;

show variables like "rpl%";

 

 

Mon Nov 20 15:31:03 2017 - [debug] Set short wait_timeout on master: 3 seconds
Mon Nov 20 15:31:03 2017 - [debug] Trying to get advisory lock..
Mon Nov 20 15:31:06 2017 - [warning] Got error on MySQL connect ping: DBI connect(';host=172.16.12.11;port=3306;mysql_connect_timeout=1','root',...) failed: Lost connection to MySQL server at 'reading initial communication packet', system error: 111 at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 97
2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:06 2017 - [info] Executing SSH check script: exit 0
Mon Nov 20 15:31:06 2017 - [debug] SSH connection test to 172.16.12.11, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5
Mon Nov 20 15:31:06 2017 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 172.16.12.10 -s 172.16.12.13 --user=root --master_host=172.16.12.11 --master_port=3306  --user=root  --master_host=172.16.12.11  --master_ip=172.16.12.11  --master_port=3306 --master_user=root --master_password=mysqlpass --ping_type=CONNECT
Monitoring server 172.16.12.10 is reachable, Master is not reachable from 172.16.12.10. OK.
Mon Nov 20 15:31:09 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:09 2017 - [warning] Connection failed 2 time(s)..
Mon Nov 20 15:31:11 2017 - [warning] HealthCheck: Got timeout on checking SSH connection to 172.16.12.11! at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 342.
Mon Nov 20 15:31:12 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:12 2017 - [warning] Connection failed 3 time(s)..
Monitoring server 172.16.12.13 is reachable, Master is not reachable from 172.16.12.13. OK.
Mon Nov 20 15:31:14 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Mon Nov 20 15:31:15 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Mon Nov 20 15:31:15 2017 - [warning] Connection failed 4 time(s)..
Mon Nov 20 15:31:15 2017 - [warning] Master is not reachable from health checker!
Mon Nov 20 15:31:15 2017 - [warning] Master 172.16.12.11(172.16.12.11:3306) is not reachable!
Mon Nov 20 15:31:15 2017 - [warning] SSH is NOT reachable.
Mon Nov 20 15:31:15 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
Mon Nov 20 15:31:15 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Nov 20 15:31:15 2017 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Nov 20 15:31:15 2017 - [info] Reading server configuration from /etc/mha/app1.cnf..
Mon Nov 20 15:31:15 2017 - [debug] Skipping connecting to dead master 172.16.12.11(172.16.12.11:3306).
Mon Nov 20 15:31:15 2017 - [debug] Connecting to servers..
Mon Nov 20 15:31:15 2017 - [debug]  Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Mon Nov 20 15:31:15 2017 - [debug]  Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Mon Nov 20 15:31:15 2017 - [debug]  Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Mon Nov 20 15:31:15 2017 - [debug]  Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Mon Nov 20 15:31:15 2017 - [debug]  Comparing MySQL versions..
Mon Nov 20 15:31:15 2017 - [debug]   Comparing MySQL versions done.
Mon Nov 20 15:31:15 2017 - [debug] Connecting to servers done.
Mon Nov 20 15:31:15 2017 - [info] GTID failover mode = 1
Mon Nov 20 15:31:15 2017 - [info] Dead Servers:
Mon Nov 20 15:31:15 2017 - [info]   172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:15 2017 - [info] Alive Servers:
Mon Nov 20 15:31:15 2017 - [info]   172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:15 2017 - [info]   172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:15 2017 - [info] Alive Slaves:
Mon Nov 20 15:31:15 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:15 2017 - [info]     GTID ON
Mon Nov 20 15:31:15 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:15 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:15 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:15 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:15 2017 - [info]     GTID ON
Mon Nov 20 15:31:15 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:15 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:15 2017 - [info]     Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:15 2017 - [info] Checking slave configurations..
Mon Nov 20 15:31:15 2017 - [info]  read_only=1 is not set on slave 172.16.12.12(172.16.12.12:3306).
Mon Nov 20 15:31:15 2017 - [info]  read_only=1 is not set on slave 172.16.12.13(172.16.12.13:3306).
Mon Nov 20 15:31:15 2017 - [info] Checking replication filtering settings..
Mon Nov 20 15:31:15 2017 - [info]  Replication filtering check ok.
Mon Nov 20 15:31:15 2017 - [info] Master is down!
Mon Nov 20 15:31:15 2017 - [info] Terminating monitoring script.
Mon Nov 20 15:31:15 2017 - [debug]  Disconnected from 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:15 2017 - [debug]  Disconnected from 172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:15 2017 - [info] Got exit code 20 (Master dead).        # master confirmed dead
Mon Nov 20 15:31:15 2017 - [info] MHA::MasterFailover version 0.57.
Mon Nov 20 15:31:15 2017 - [info] Starting master failover.
Mon Nov 20 15:31:15 2017 - [info] 
Mon Nov 20 15:31:15 2017 - [info] * Phase 1: Configuration Check Phase..       # configuration check
Mon Nov 20 15:31:15 2017 - [info] 
Mon Nov 20 15:31:15 2017 - [debug] Skipping connecting to dead master 172.16.12.11.
Mon Nov 20 15:31:15 2017 - [debug] Connecting to servers..
Mon Nov 20 15:31:15 2017 - [debug]  Connected to: 172.16.12.12(172.16.12.12:3306), user=root
Mon Nov 20 15:31:15 2017 - [debug]  Number of slave worker threads on host 172.16.12.12(172.16.12.12:3306): 2
Mon Nov 20 15:31:15 2017 - [debug]  Connected to: 172.16.12.13(172.16.12.13:3306), user=root
Mon Nov 20 15:31:16 2017 - [debug]  Number of slave worker threads on host 172.16.12.13(172.16.12.13:3306): 2
Mon Nov 20 15:31:16 2017 - [debug]  Comparing MySQL versions..
Mon Nov 20 15:31:16 2017 - [debug]   Comparing MySQL versions done.
Mon Nov 20 15:31:16 2017 - [debug] Connecting to servers done.
Mon Nov 20 15:31:16 2017 - [info] GTID failover mode = 1
Mon Nov 20 15:31:16 2017 - [info] Dead Servers:
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info] Checking master reachability via MySQL(double check)...
Mon Nov 20 15:31:16 2017 - [info]  ok.
Mon Nov 20 15:31:16 2017 - [info] Alive Servers:
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:16 2017 - [info] Alive Slaves:
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info]     GTID ON
Mon Nov 20 15:31:16 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info]     GTID ON
Mon Nov 20 15:31:16 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info]     Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:16 2017 - [info] Starting GTID based failover.
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] * Phase 2: Dead Master Shutdown Phase..  # shut down the failed master
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Mon Nov 20 15:31:16 2017 - [info] Executing master IP deactivation script:
Mon Nov 20 15:31:16 2017 - [info]   /usr/local/bin/master_ip_failover --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --command=stop 
Mon Nov 20 15:31:16 2017 - [debug]  Stopping IO thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:31:16 2017 - [debug]  Stopping IO thread on 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:16 2017 - [debug]  Stop IO thread on 172.16.12.12(172.16.12.12:3306) done.


IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===

Disabling the VIP on old master: 172.16.12.11 
Mon Nov 20 15:31:16 2017 - [info]  done.
Mon Nov 20 15:31:16 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon Nov 20 15:31:16 2017 - [debug]  Stop IO thread on 172.16.12.13(172.16.12.13:3306) done.
Mon Nov 20 15:31:16 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] * Phase 3: Master Recovery Phase..
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [debug] Fetching current slave status..
Mon Nov 20 15:31:16 2017 - [debug]  Fetching current slave status done.
Mon Nov 20 15:31:16 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000001:4362327
Mon Nov 20 15:31:16 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:138400-143340
Mon Nov 20 15:31:16 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info]     GTID ON
Mon Nov 20 15:31:16 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:16 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000001:4081167
Mon Nov 20 15:31:16 2017 - [info] Retrieved Gtid Set: 865e07c9-bae8-11e7-8aba-08002729e4f7:138400-142948
Mon Nov 20 15:31:16 2017 - [info] Oldest slaves:
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info]     GTID ON
Mon Nov 20 15:31:16 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info]     Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] * Phase 3.3: Determining New Master Phase..
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] Searching new master from slaves..
Mon Nov 20 15:31:16 2017 - [info]  Candidate masters from the configuration file:
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.12(172.16.12.12:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info]     GTID ON
Mon Nov 20 15:31:16 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Nov 20 15:31:16 2017 - [info]  Non-candidate masters:
Mon Nov 20 15:31:16 2017 - [info]   172.16.12.13(172.16.12.13:3306)  Version=5.7.9-log (oldest major version between slaves) log-bin:enabled
Mon Nov 20 15:31:16 2017 - [info]     GTID ON
Mon Nov 20 15:31:16 2017 - [debug]    Relay log info repository: TABLE
Mon Nov 20 15:31:16 2017 - [info]     Replicating from 172.16.12.11(172.16.12.11:3306)
Mon Nov 20 15:31:16 2017 - [info]     Not candidate for the new Master (no_master is set)
Mon Nov 20 15:31:16 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Mon Nov 20 15:31:16 2017 - [info] New master is 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:31:16 2017 - [info] Starting master failover..
Mon Nov 20 15:31:16 2017 - [info] 
From:
172.16.12.11(172.16.12.11:3306) (current master)
 +--172.16.12.12(172.16.12.12:3306)
 +--172.16.12.13(172.16.12.13:3306)

To:
172.16.12.12(172.16.12.12:3306) (new master)
 +--172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Nov 20 15:31:16 2017 - [info] 
Mon Nov 20 15:31:16 2017 - [info]  Waiting all logs to be applied.. 
Mon Nov 20 15:31:16 2017 - [info]   done.
Mon Nov 20 15:31:16 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:16 2017 - [debug]   done.
Mon Nov 20 15:31:16 2017 - [info] Getting new master's binlog name and position..
Mon Nov 20 15:31:16 2017 - [info]  mysql-bin.000001:4295590
Mon Nov 20 15:31:16 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.12.12', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Mon Nov 20 15:31:16 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 4295590, 865e07c9-bae8-11e7-8aba-08002729e4f7:1-143340
Mon Nov 20 15:31:16 2017 - [info] Executing master IP activate script:
Mon Nov 20 15:31:16 2017 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.16.12.11 --orig_master_ip=172.16.12.11 --orig_master_port=3306 --new_master_host=172.16.12.12 --new_master_ip=172.16.12.12 --new_master_port=3306 --new_master_user='root'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password


IN SCRIPT TEST====/sbin/ifconfig eth0:100 down==/sbin/ifconfig eth0:100 172.16.12.100/24===

Enabling the VIP - 172.16.12.100/24 on the new master - 172.16.12.12 
Mon Nov 20 15:31:18 2017 - [info]  OK.
Mon Nov 20 15:31:18 2017 - [info] ** Finished master recovery successfully.
Mon Nov 20 15:31:18 2017 - [info] * Phase 3: Master Recovery Phase completed.
Mon Nov 20 15:31:18 2017 - [info] 
Mon Nov 20 15:31:18 2017 - [info] * Phase 4: Slaves Recovery Phase..
Mon Nov 20 15:31:18 2017 - [info] 
Mon Nov 20 15:31:18 2017 - [info] 
Mon Nov 20 15:31:18 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Mon Nov 20 15:31:18 2017 - [info] 
Mon Nov 20 15:31:18 2017 - [info] -- Slave recovery on host 172.16.12.13(172.16.12.13:3306) started, pid: 3511. Check tmp log /var/log/masterha/app1/172.16.12.13_3306_20171120153115.log if it takes time..
Mon Nov 20 15:32:27 2017 - [info] 
Mon Nov 20 15:32:27 2017 - [info] Log messages from 172.16.12.13 ...
Mon Nov 20 15:32:27 2017 - [info] 
Mon Nov 20 15:31:18 2017 - [info]  Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:18 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:32:21 2017 - [debug]   done.
Mon Nov 20 15:32:21 2017 - [info]  Executed CHANGE MASTER.
Mon Nov 20 15:32:21 2017 - [debug]  Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:32:21 2017 - [debug]   done.
Mon Nov 20 15:32:21 2017 - [info]  Slave started.
Mon Nov 20 15:32:27 2017 - [info]  gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-143340) completed on 172.16.12.13(172.16.12.13:3306). Executed 25 events.
Mon Nov 20 15:32:27 2017 - [info] End of log messages from 172.16.12.13.
Mon Nov 20 15:32:27 2017 - [info] -- Slave on host 172.16.12.13(172.16.12.13:3306) started.
Mon Nov 20 15:32:27 2017 - [info] All new slave servers recovered successfully.
Mon Nov 20 15:32:27 2017 - [info] 
Mon Nov 20 15:32:27 2017 - [info] * Phase 5: New master cleanup phase..
Mon Nov 20 15:32:27 2017 - [info] 
Mon Nov 20 15:32:27 2017 - [info] Resetting slave info on the new master..
Mon Nov 20 15:32:27 2017 - [debug]  Clearing slave info..
Mon Nov 20 15:32:27 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:32:27 2017 - [debug]   done.
Mon Nov 20 15:32:27 2017 - [debug]  SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
Mon Nov 20 15:32:27 2017 - [info]  172.16.12.12: Resetting slave info succeeded.
Mon Nov 20 15:32:27 2017 - [info] Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
Mon Nov 20 15:32:27 2017 - [debug]  Disconnected from 172.16.12.12(172.16.12.12:3306)
Mon Nov 20 15:32:27 2017 - [debug]  Disconnected from 172.16.12.13(172.16.12.13:3306)
Mon Nov 20 15:32:27 2017 - [info] 

----- Failover Report -----

app1: MySQL Master failover 172.16.12.11(172.16.12.11:3306) to 172.16.12.12(172.16.12.12:3306) succeeded

Master 172.16.12.11(172.16.12.11:3306) is down!

Check MHA Manager logs at db10:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 172.16.12.11(172.16.12.11:3306)
Selected 172.16.12.12(172.16.12.12:3306) as a new master.
172.16.12.12(172.16.12.12:3306): OK: Applying all logs succeeded.
172.16.12.12(172.16.12.12:3306): OK: Activated master IP address.
172.16.12.13(172.16.12.13:3306): OK: Slave started, replicating from 172.16.12.12(172.16.12.12:3306)
172.16.12.12(172.16.12.12:3306): Resetting slave info succeeded.
Master failover to 172.16.12.12(172.16.12.12:3306) completed successfully.
[root@db10 ~]# 

 1. Check the configuration. 2. Select the new master. 3. Switch the VIP. 4. Run CHANGE MASTER and start the slave.
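
This run took about 72 seconds from "Starting master failover" (15:31:15) to completion (15:32:27), far longer than the first one. The timestamps show where the time went. Below is a minimal Python sketch that ranks the gaps between consecutive log lines; `slowest_steps` is an illustrative helper, not an MHA tool, and the sample lines are copied from the slave recovery part of the log above.

```python
import re
from datetime import datetime

# "Mon Nov 20 15:31:18 2017 - [info]  message"
TS_RE = re.compile(r"^(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) - (.*)$")


def slowest_steps(log_text, top=3):
    """Return the `top` largest (seconds, message) gaps between consecutive lines.

    Each gap is charged to the earlier line, i.e. the step that was running.
    Lines without a leading timestamp are ignored.
    """
    entries = []
    for line in log_text.splitlines():
        m = TS_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
            entries.append((ts, m.group(2).strip()))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]


SAMPLE = """\
Mon Nov 20 15:31:18 2017 - [info]  Resetting slave 172.16.12.13(172.16.12.13:3306) and starting replication from the new master 172.16.12.12(172.16.12.12:3306)..
Mon Nov 20 15:31:18 2017 - [debug]  Stopping slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:32:21 2017 - [debug]   done.
Mon Nov 20 15:32:21 2017 - [info]  Executed CHANGE MASTER.
Mon Nov 20 15:32:21 2017 - [debug]  Starting slave IO/SQL thread on 172.16.12.13(172.16.12.13:3306)..
Mon Nov 20 15:32:21 2017 - [debug]   done.
Mon Nov 20 15:32:21 2017 - [info]  Slave started.
Mon Nov 20 15:32:27 2017 - [info]  gtid_wait(865e07c9-bae8-11e7-8aba-08002729e4f7:1-143340) completed on 172.16.12.13(172.16.12.13:3306). Executed 25 events.
"""

for seconds, message in slowest_steps(SAMPLE, top=2):
    print(f"{seconds:>5.0f}s  {message}")
```

On this sample the largest gap is the 63 seconds spent stopping the IO/SQL threads on 172.16.12.13, followed by 6 seconds waiting for `gtid_wait` to finish. Note that the full manager log replays per-slave messages out of chronological order, so the helper is best applied to one contiguous section at a time.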

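 Conclusion: without semi-synchronous replication, data may be lost, because after the crashed master is started again its GTIDs are inconsistent with the rest of the cluster.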
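
One way to confirm such an inconsistency is to compare the `Executed_Gtid_Set` of the restarted old master with that of the new master. MySQL provides the built-in `GTID_SUBTRACT()` function for this; the Python sketch below reproduces the same set difference offline. The helper names are illustrative, and the old master's set is a made-up value for demonstration. Only the new master's set (`...:1-143340`) comes from the log above.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:8,uuid2:3' into {uuid: set of transaction numbers}.

    Expanding intervals into sets of ints is simple but memory-hungry;
    it is fine for sets of a few hundred thousand transactions.
    """
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        txns = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            txns.update(range(int(start), int(end or start) + 1))
    return result


def gtid_subtract(a, b):
    """Transactions present in GTID set `a` but missing from `b`, per UUID."""
    parsed_a, parsed_b = parse_gtid_set(a), parse_gtid_set(b)
    diff = {}
    for uuid, txns in parsed_a.items():
        missing = txns - parsed_b.get(uuid, set())
        if missing:
            diff[uuid] = sorted(missing)
    return diff


UUID = "865e07c9-bae8-11e7-8aba-08002729e4f7"
new_master = f"{UUID}:1-143340"   # Exec_Gtid_Set reported by MHA in the log
old_master = f"{UUID}:1-143345"   # hypothetical value, for illustration only

# Transactions the old master committed that never reached the new master.
print(gtid_subtract(old_master, new_master))
# Transactions the new master has that the old master lacks.
print(gtid_subtract(new_master, old_master))
```

A non-empty first result means the old master holds transactions that were never replicated, so it cannot simply be re-attached as a slave without reconciling them first.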

#######################Semi-sync + sysbench + mysqladmin shutdown of the master

 set global rpl_semi_sync_master_enabled=ON; 

set global rpl_semi_sync_slave_enabled =ON;

show variables like "rpl%";

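Conclusion: after the crashed master is started again, its GTIDs are inconsistent in this case too.

Replication had probably fallen back to asynchronous mode.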

############################Semi-sync + sysbench + mysqladmin shutdown of the master + no fallback to async

 set global rpl_semi_sync_master_timeout=100000000000;

Mon Nov 20 17:05:45 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Nov 20 17:05:45 2017 - [info] 
Mon Nov 20 17:05:45 2017 - [info]  Waiting all logs to be applied.. 
Mon Nov 20 17:05:45 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:46 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:47 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:48 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:49 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:50 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:51 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1
Mon Nov 20 17:05:52 2017 - [debug] Sql Thread Done: 0, Worker Thread done: 0, Ended workers: 1

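This step applies the differential relay log to the database. The data volume is not that large, yet it takes a long time to run. The cause is a problem with the GTID settings: the GTID went backwards (became smaller). Also check the VIP.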

  

 

posted on 2017-10-29 20:04  星期六男爵