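Background: at 15:58 on August 7 we received a fault report that the databases were out of sync. There are four database servers: 10.255.70.11, 10.255.70.12, 10.255.70.13 and 10.255.70.14 (the IPs are virtual IPs).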

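The database topology is: 70.11 and 70.13 replicate to each other (master-master), and 70.12 and 70.14 are slaves of 70.11.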

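At the time of the fault, the replication state was: (1) master-master replication between 70.11 and 70.13 was out of sync; (2) master-slave replication between 70.11 and 70.12 was out of sync; (3) master-slave replication between 70.11 and 70.14 was in sync.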

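(1) Because my.cnf contains slave-skip-errors=all, replication errors are skipped when they occur. The replication status therefore showed Slave_IO_Running: Yes and Slave_SQL_Running: Yes, even though the data had actually diverged.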
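
Because both replication threads can report Yes while the data differs, one way to detect the divergence is to compare table checksums taken on the two servers. The following is a minimal sketch, not the procedure used in this incident. It assumes the output of CHECKSUM TABLE has been saved from each host with mysql -N (one tab-separated "table, checksum" pair per line); the function name diff_checksums and the table names t1 and t2 are hypothetical.

```shell
# Print the names of tables whose checksum differs between two files.
# Each file is assumed to hold "table<TAB>checksum" lines, e.g. saved with
#   mysql -h HOST -uroot -p -N -e 'CHECKSUM TABLE db1.t1, db1.t2' > HOST.sum
# (t1 and t2 are hypothetical table names).
diff_checksums() {
    awk -F'\t' '
        NR == FNR { sum[$1] = $2; next }          # first file: remember checksums
        !($1 in sum) || sum[$1] != $2 { print $1 } # second file: report mismatches
    ' "$1" "$2"
}
```

Calling diff_checksums 7011.sum 7013.sum would then list the tables to inspect. Checksums are only comparable when taken while no writes are in flight on either server.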

(2) 70.11 and 70.12 were out of sync. The replication status showed Slave_IO_Running: NO/Slave_SQL_Running: Yes with error 1062 (a duplicate-key error), as in the screenshot below:

 

Run the following on 70.12:

mysql> stop slave;

mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;

mysql> start slave;

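Checking the replication status again shows that replication has resumed.

This resolves the master-slave problem between 70.11 and 70.12. Note, however, that after skipping the transaction that raised error 1062 the two servers replicate normally from that point on, but the data that diverged during the out-of-sync period is not re-synchronized, so the slave's data for that period is incomplete.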
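
The status check after restarting the slave can be scripted. This is a small sketch under the assumption that the text of show slave status\G is piped in on stdin; the function name slave_threads_running is hypothetical. It only reports whether both threads are running and says nothing about whether the data matches.

```shell
# Read "show slave status\G" output on stdin and succeed (exit 0) only
# when both Slave_IO_Running and Slave_SQL_Running are Yes.
slave_threads_running() {
    awk -F': ' '
        /^[[:space:]]*Slave_IO_Running:/  { io  = $2 }
        /^[[:space:]]*Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

For example: mysql -h 10.255.70.12 -uroot -p -e 'show slave status\G' | slave_threads_running && echo "threads running".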

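That leaves only the master-master divergence between 70.11 and 70.13. While investigating it we found that the db1 database on 70.11 had lost data. Because 70.11 is in sync with 70.12 and 70.14, the data was lost on all three servers.

The recovery plan was:

(1) First switch the live traffic to 70.13 to keep the impact to a minimum.

(2) Restore the data on 70.11 first: the most recent backup was taken at 03:30 that morning, and the changes between 03:30 and 15:58 are recovered from the binlog.

(3) Back up the restored 70.11 data, import it into 70.12/13/14, and set up master-master and master-slave replication again.

Detailed steps:

Restore the data up to 03:30 on August 7: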

[root@7011]# gunzip db1_20190807.sql.gz
[root@7011]# mysql -h 10.255.70.11 -uroot -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2039
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| db1 |
+--------------------+
11 rows in set (0.00 sec)

mysql> use db1;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed, 1 warning
mysql> source db1_20190807.sql;

Recover the data between 03:30 and 16:30 from the binlog:

(1) Convert the binlog events in that time window into SQL statements that mysql can execute:

[root@7011]# mysqlbinlog  --no-defaults  --database=db1 --start-datetime='2019-08-07 03:30:00' --stop-datetime='2019-08-07 16:30:00' mysql-bin.000034 >temp20190807.sql
[root@7011]# ll temp20190807.sql
-rw-r--r-- 1 root root 655391529 Aug  7 20:56 temp20190807.sql

(2) Delete the DROP statements from the extracted SQL:

[root@7011]# sed -i -e '/DROP/d' temp20190807.sql
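
The pattern /DROP/ removes every line that contains the word, including any data row whose text happens to contain "DROP". A narrower filter is sketched below as an alternative, not as what was run in this incident. It assumes each DROP statement starts at the beginning of its own line in the mysqlbinlog output; the function name strip_drop_statements is hypothetical.

```shell
# Copy stdin to stdout, dropping only lines that begin with a
# DROP TABLE / DROP DATABASE / DROP SCHEMA statement (case-insensitive).
# Lines that merely contain the word DROP inside data are kept.
strip_drop_statements() {
    grep -viE '^[[:space:]]*DROP[[:space:]]+(TABLE|DATABASE|SCHEMA)[[:space:]]'
}
```

Usage: strip_drop_statements < temp20190807.sql > temp20190807.filtered.sql, which also leaves the original extract untouched for later inspection.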

(3) Restore the data:

[root@7011]# mysql -h 10.255.70.11 -uroot -p
Enter password: 

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| db1 |
+--------------------+
11 rows in set (0.00 sec)

mysql> use db1;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed, 1 warning
mysql> source temp20190807.sql;

At this point the data on 70.11 is fully restored.

Next, synchronize the data on 70.12, 70.13 and 70.14:

(1) Back up the database on 70.11:

[root@7011 ~]# mysqldump -h 10.255.70.11 -uroot  -p  -R db1 | gzip > /tmp/db120190807.sql.gz

(2) Copy the 70.11 backup to 70.12/13/14:

[root@7011 tmp]# scp db120190807.sql.gz 10.255.70.12:/tmp/
The authenticity of host '10.255.70.12 (10.255.70.12)' can't be established.
RSA key fingerprint is aa:56:33:d3:aa:a8:af:a3:a9:c9:6e:26:6b:05:7f:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.255.70.12' (RSA) to the list of known hosts.
root@10.255.70.12's password: 
db120190807.sql.gz

[root@7011 tmp]# scp db120190807.sql.gz 10.255.70.13:/tmp/
The authenticity of host '10.255.70.13 (10.255.70.13)' can't be established.
RSA key fingerprint is aa:56:33:d3:aa:a8:af:a3:a9:c9:6e:26:6b:05:7f:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.255.70.13' (RSA) to the list of known hosts.
root@10.255.70.13's password: 
db120190807.sql.gz

[root@7011 tmp]# scp db120190807.sql.gz 10.255.70.14:/tmp/
The authenticity of host '10.255.70.14 (10.255.70.14)' can't be established.
RSA key fingerprint is aa:56:33:d3:aa:a8:af:a3:a9:c9:6e:26:6b:05:7f:b3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.255.70.14' (RSA) to the list of known hosts.
root@10.255.70.14's password: 
db120190807.sql.gz

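(3) Restore the data on 70.12/13/14 from the copied dump (decompress it with gunzip first). 70.12 is shown as the example; repeat the same steps on 70.13 and 70.14: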

[root@7012]# mysql -h 10.255.70.12 -uroot -p
Enter password: 

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| db1 |
+--------------------+
11 rows in set (0.00 sec)

mysql> use db1;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed, 1 warning
mysql> source /tmp/db120190807.sql;

(4) Record the binlog file and position on 70.11 and 70.13:

[root@7011]# mysql -h 10.255.70.11 -uroot -p
Enter password: 

mysql> show master status;
+------------------+-----------+--------------+------------------+-------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+-----------+--------------+------------------+-------------------+
| mysql-bin.000048 | 278032567 | db1          | mysql            |                   |
+------------------+-----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql>

[root@7013]# mysql -h 10.255.70.13 -uroot -p
Enter password: 

mysql> show master status;
+------------------+-----------+--------------+------------------+-------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+-----------+--------------+------------------+-------------------+
| mysql-bin.000055 | 622032567 | db1          | mysql            |                   |
+------------------+-----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql>

(5) Set up master-master replication between 70.11 and 70.13 as follows:

Run on 70.11:

mysql> change master to master_host='10.255.70.13',master_user='root',master_password='123456',master_log_file='mysql-bin.000055',master_log_pos=622032567;

Run on 70.13:

mysql> change master to master_host='10.255.70.11',master_user='root',master_password='123456',master_log_file='mysql-bin.000048',master_log_pos=278032567;
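
Copying the file name and position by hand between servers is easy to get wrong. The following sketch builds the statement from the master's own output. It assumes the text of show master status\G is piped in on stdin; the function name change_master_sql is hypothetical, and the master_password clause is deliberately left out and has to be added before the statement is run.

```shell
# Usage: ... 'show master status\G' | change_master_sql MASTER_HOST REPL_USER
# Prints a "change master to" statement built from the File and Position
# fields read on stdin. Exits non-zero if either field is missing.
change_master_sql() {
    awk -F': ' -v host="$1" -v user="$2" -v q="'" '
        /^[[:space:]]*File:/     { file = $2 }
        /^[[:space:]]*Position:/ { pos  = $2 }
        END {
            if (file == "" || pos == "") exit 1
            printf "change master to master_host=%s%s%s,master_user=%s%s%s,master_log_file=%s%s%s,master_log_pos=%s;\n",
                   q, host, q, q, user, q, q, file, q, pos
        }
    '
}
```

For example, on 70.13: mysql -h 10.255.70.11 -uroot -p -e 'show master status\G' | change_master_sql 10.255.70.11 root.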

With replication configured (and started with start slave; on both servers), check the replication status:

[root@7011]# mysql -h 10.255.70.11 -uroot -p
Enter password: 

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.255.70.13
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000049
          Read_Master_Log_Pos: 612239326
               Relay_Log_File: mysq-relay-bin.000014
                Relay_Log_Pos: 1301
        Relay_Master_Log_File: mysql-bin.000049
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: db1
          Replicate_Ignore_DB: mysql

Next, set up master-slave replication from 70.11 to 70.12 and 70.14:

Run on 70.12 and 70.14:

mysql> change master to master_host='10.255.70.11',master_user='root',master_password='123456',master_log_file='mysql-bin.000048',master_log_pos=278032567;

Check the replication status:

[root@7012]# mysql -h 10.255.70.12 -uroot -p
Enter password: 

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.255.70.11
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000049
          Read_Master_Log_Pos: 612239326
               Relay_Log_File: mysq-relay-bin.000014
                Relay_Log_Pos: 1301
        Relay_Master_Log_File: mysql-bin.000049
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: db1
          Replicate_Ignore_DB: mysql

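The databases are now restored. Switch the application back to the VIP address; the incident is resolved.

Errors encountered during synchronization and how they were handled: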

Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

Fix: on the master, run

flush logs;
show master status;

Note the File and Position values.

Then set up replication on the slave again using those values.
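
This 1236 message typically means that the file named in master_log_file is not in the master's binlog index. Before running change master to again, the name can be checked against the master's list. This is a sketch that assumes the tab-separated output of mysql -N -e 'show binary logs' is piped in on stdin; the function name binlog_listed is hypothetical.

```shell
# Usage: ... -N -e 'show binary logs' | binlog_listed mysql-bin.000048
# Succeeds (exit 0) only if the given file name appears in the first
# column of the master's binary log list read on stdin.
binlog_listed() {
    awk -F'\t' -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}
```

If the check fails, the file name recorded earlier is stale or mistyped, and the current values from show master status should be used instead.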
