Data Recovery in a GTID-Based Environment

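     Here is a data recovery case from a production environment. Mistakes in production can happen at any moment. An operations DBA might think: nobody would be foolish enough to wipe out a database, right? Or nobody would be bored enough to delete a few rows and then want them back? In fact, there are plenty of such people. Still, to err is human. When it happens, what matters is finding a way to fix it, and giving developers and junior DBAs more training on safe operations. Enough chatter: let's simulate production data being deleted by mistake and walk through how to restore it.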

 

Test environment:

1. The GTID-related options are enabled

2. The binlog format is ROW

 

The scenario to recover from:

delete from xxx (with no WHERE clause)

 

Here is the structure of table test1 and the rows it currently holds:

<test>(root@localhost) [xuanzhi]> show create table test1\G
*************************** 1. row ***************************
       Table: test1
Create Table: CREATE TABLE `test1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `age` int(11) DEFAULT NULL,
  `name` char(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

<test>(root@localhost) [xuanzhi]> select * from test1;
+----+------+------+
| id | age  | name |
+----+------+------+
|  1 |   20 | aa   |
|  2 |   22 | bb   |
|  3 |   23 | cc   |
|  4 |   24 | dd   |
+----+------+------+
4 rows in set (0.00 sec)

<test>(root@localhost) [xuanzhi]> 


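Now we take a full backup of this database. I won't go through the backup options here: why --master-data is needed is explained later, and for the reason behind -q see this post: http://17173ops.com/2015/03/21/mysql-faq-why-turn-on-quick-option.shtml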

[root@localhost mysql-5.6.10]# mysqldump -uroot -p123456 -S /data/mysql-5.6.10/mysql.sock --single-transaction  --master-data=2 -q xuanzhi > xuanzhi.sql
Warning: Using a password on the command line interface can be insecure.
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events. 
[root@localhost mysql-5.6.10]# 

After the backup, new data is written:

<test>(root@localhost) [xuanzhi]> insert into test1 (age,name) values (30,'MySQL');
Query OK, 1 row affected (0.01 sec)

<test>(root@localhost) [xuanzhi]> insert into test1 (age,name) values (40,'PYthon');
Query OK, 1 row affected (0.00 sec)

<test>(root@localhost) [xuanzhi]> select * from test1;
+----+------+--------+
| id | age  | name   |
+----+------+--------+
|  1 |   20 | aa     |
|  2 |   22 | bb     |
|  3 |   23 | cc     |
|  4 |   24 | dd     |
|  5 |   30 | MySQL  |
|  6 |   40 | PYthon |
+----+------+--------+
6 rows in set (0.00 sec)

<test>(root@localhost) [xuanzhi]> 

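There are now two additional rows, with id 5 and 6; both were written after the full backup. At this point someone makes a mistake and forgets the WHERE clause on a DELETE: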

<test>(root@localhost) [xuanzhi]> delete from test1;
Query OK, 6 rows affected (0.01 sec)

<test>(root@localhost) [xuanzhi]> select * from test1;
Empty set (0.00 sec)

<test>(root@localhost) [xuanzhi]> 

Disaster: every row in the table is gone. Stay calm! The only way back now is the full backup plus the binlog.

 

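Recovery plan:

1. Know exactly which binlog position the full backup corresponds to; this is where --master-data proves its worth

2. Find the position in the binlog just before the delete from test1 transaction, so that the replay stops there and does not run the DELETE again

3. Restore the full backup

4. Replay the binlog between those two positions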

 

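Let's follow the plan step by step:

(1) Find the binlog position recorded at backup time (this line is only present when the backup was taken with --master-data; see the official documentation for details on that option):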

[root@localhost ~]# cat xuanzhi.sql |grep -i "change"
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000023', MASTER_LOG_POS=1393;
[root@localhost ~]# 
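The two coordinates can also be pulled into shell variables so later commands do not depend on copying numbers by hand. This is a minimal sketch: the sample header line is written to a scratch file so that the snippet runs on its own, and the variable names are illustrative. Point dump at the real xuanzhi.sql in practice.

```shell
#!/usr/bin/env bash
# Sketch: read the binlog coordinates that --master-data=2 writes as a comment.
# The scratch file stands in for xuanzhi.sql (assumption for a standalone run).
dump=$(mktemp)
cat > "$dump" <<'EOF'
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000023', MASTER_LOG_POS=1393;
EOF

# Take the first CHANGE MASTER line and cut out the file name and position.
line=$(grep -m 1 'CHANGE MASTER TO' "$dump")
binlog_file=$(printf '%s\n' "$line" | sed -E "s/.*MASTER_LOG_FILE='([^']+)'.*/\1/")
start_pos=$(printf '%s\n' "$line" | sed -E 's/.*MASTER_LOG_POS=([0-9]+).*/\1/')

echo "file=$binlog_file start=$start_pos"
rm -f "$dump"
```

The values can then be passed on, for example as --start-position="$start_pos".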

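MASTER_LOG_POS=1393 tells us that binlog events from position 1393 onward are not contained in the full backup, so we can parse the binlog with --start-position=1393. That cuts down the amount of log output and makes the relevant operations easier to find: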

[root@localhost ~]# mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=1393 -d xuanzhi  /data/mysql-5.6.10/mysql-bin.000023 >test1.sql

(2) Use test1.sql to find the position just before the delete:

[root@localhost ~]# cat test1.sql 
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 1393
#150622 20:59:27 server id 1  end_log_pos 1441 CRC32 0x083aaf66         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:6'/*!*/;
# at 1441
#150622 20:59:27 server id 1  end_log_pos 1516 CRC32 0xfaaf6148         Query   thread_id=1     exec_time=0     error_code=0
SET TIMESTAMP=1434977967/*!*/;
SET @@session.pseudo_thread_id=1/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1075838976/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 1516
#150622 20:59:27 server id 1  end_log_pos 1571 CRC32 0x9feaa179         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 1571
#150622 20:59:27 server id 1  end_log_pos 1621 CRC32 0xa761e858         Write_rows: table id 74 flags: STMT_END_F
### INSERT INTO `xuanzhi`.`test1`
### SET
###   @1=5
###   @2=30
###   @3='MySQL'
# at 1621
#150622 20:59:27 server id 1  end_log_pos 1652 CRC32 0x1669add4         Xid = 112
COMMIT/*!*/;
# at 1652
#150622 20:59:29 server id 1  end_log_pos 1700 CRC32 0x0c99198b         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:7'/*!*/;
# at 1700
#150622 20:59:29 server id 1  end_log_pos 1775 CRC32 0x1492f3ed         Query   thread_id=1     exec_time=0     error_code=0
SET TIMESTAMP=1434977969/*!*/;
BEGIN
/*!*/;
# at 1775
#150622 20:59:29 server id 1  end_log_pos 1830 CRC32 0xea811740         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 1830
#150622 20:59:29 server id 1  end_log_pos 1881 CRC32 0x8d528f67         Write_rows: table id 74 flags: STMT_END_F
### INSERT INTO `xuanzhi`.`test1`
### SET
###   @1=6
###   @2=40
###   @3='PYthon'
# at 1881
#150622 20:59:29 server id 1  end_log_pos 1912 CRC32 0x0d0fc5fa         Xid = 113
COMMIT/*!*/;
# at 1912
#150622 21:02:01 server id 1  end_log_pos 1960 CRC32 0xa68bea7e         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:8'/*!*/;
# at 1960
#150622 21:02:01 server id 1  end_log_pos 2035 CRC32 0x4aeaf6e7         Query   thread_id=1     exec_time=0     error_code=0
SET TIMESTAMP=1434978121/*!*/;
BEGIN
/*!*/;
# at 2035
#150622 21:02:01 server id 1  end_log_pos 2090 CRC32 0x4bdb4429         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 2090
#150622 21:02:01 server id 1  end_log_pos 2204 CRC32 0x32c01bf1         Delete_rows: table id 74 flags: STMT_END_F
### DELETE FROM `xuanzhi`.`test1`
### WHERE
###   @1=1
###   @2=20
###   @3='aa'
### DELETE FROM `xuanzhi`.`test1`
### WHERE
###   @1=2
###   @2=22
###   @3='bb'
### DELETE FROM `xuanzhi`.`test1`
### WHERE
###   @1=3
###   @2=23
###   @3='cc'
### DELETE FROM `xuanzhi`.`test1`
### WHERE
###   @1=4
###   @2=24
###   @3='dd'
### DELETE FROM `xuanzhi`.`test1`
### WHERE
###   @1=5
###   @2=30
###   @3='MySQL'
### DELETE FROM `xuanzhi`.`test1`
### WHERE
###   @1=6
###   @2=40
###   @3='PYthon'
# at 2204
#150622 21:02:01 server id 1  end_log_pos 2235 CRC32 0x2d665ce6         Xid = 115
COMMIT/*!*/;
# at 2235
#150622 21:38:29 server id 1  end_log_pos 2283 CRC32 0x8df83114         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:9'/*!*/;
# at 2283
#150622 21:38:29 server id 1  end_log_pos 2357 CRC32 0x06b2ac66         Query   thread_id=5     exec_time=0     error_code=0
SET TIMESTAMP=1434980309/*!*/;
BEGIN
/*!*/;
# at 2357
# at 2409
# at 2452
#150622 21:38:29 server id 1  end_log_pos 2483 CRC32 0x484fe32f         Xid = 129
COMMIT/*!*/;
# at 2483
#150622 21:38:34 server id 1  end_log_pos 2531 CRC32 0x14c0ff08         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:10'/*!*/;
# at 2531
#150622 21:38:34 server id 1  end_log_pos 2605 CRC32 0x47664df5         Query   thread_id=5     exec_time=0     error_code=0
SET TIMESTAMP=1434980314/*!*/;
BEGIN
/*!*/;
# at 2605
# at 2657
# at 2700
#150622 21:38:34 server id 1  end_log_pos 2731 CRC32 0x3eb63a54         Xid = 130
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

The position just before the delete is 1912, which is where the GTID event that opens the DELETE transaction begins:

### INSERT INTO `xuanzhi`.`test1`
### SET
###   @1=6
###   @2=40
###   @3='PYthon'
# at 1881
#150622 20:59:29 server id 1  end_log_pos 1912 CRC32 0x0d0fc5fa         Xid = 113
COMMIT/*!*/;
# at 1912
#150622 21:02:01 server id 1  end_log_pos 1960 CRC32 0xa68bea7e         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:8'/*!*/;
# at 1960
#150622 21:02:01 server id 1  end_log_pos 2035 CRC32 0x4aeaf6e7         Query   thread_id=1     exec_time=0     error_code=0
SET TIMESTAMP=1434978121/*!*/;
BEGIN
/*!*/;
# at 2035
#150622 21:02:01 server id 1  end_log_pos 2090 CRC32 0x4bdb4429         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 2090
#150622 21:02:01 server id 1  end_log_pos 2204 CRC32 0x32c01bf1         Delete_rows: table id 74 flags: STMT_END_F
### DELETE FROM `xuanzhi`.`test1`
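On a long binlog, scanning the decoded output by eye is error-prone. The sketch below automates the lookup with awk: it remembers the offset of the most recent GTID event and prints it when the first Delete_rows event appears. It assumes the decoded output format shown above (MySQL 5.6 with GTID enabled); the function name and the shortened sample are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: print the start offset of the transaction that contains the first
# Delete_rows event in "mysqlbinlog -v --base64-output=DECODE-ROWS" output.
find_delete_txn_start() {
    awk '
        /^# at [0-9]+$/            { pos = $3 }          # offset of the next event
        /^#[0-9].*[ \t]GTID[ \t]/  { txn_start = pos }   # a GTID event opens a transaction
        /Delete_rows/              { print txn_start; exit }
    ' "$1"
}

# Shortened sample of the decoded output, so the snippet runs on its own.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# at 1881
#150622 20:59:29 server id 1  end_log_pos 1912 CRC32 0x0d0fc5fa         Xid = 113
COMMIT/*!*/;
# at 1912
#150622 21:02:01 server id 1  end_log_pos 1960 CRC32 0xa68bea7e         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:8'/*!*/;
# at 1960
#150622 21:02:01 server id 1  end_log_pos 2035 CRC32 0x4aeaf6e7         Query   thread_id=1
BEGIN
# at 2035
#150622 21:02:01 server id 1  end_log_pos 2090 CRC32 0x4bdb4429         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 2090
#150622 21:02:01 server id 1  end_log_pos 2204 CRC32 0x32c01bf1         Delete_rows: table id 74 flags: STMT_END_F
EOF

stop_pos=$(find_delete_txn_start "$sample")
echo "stop position: $stop_pos"
rm -f "$sample"
```

It only finds the first Delete_rows event in the file, so check that the reported transaction really is the accidental one before using the value as --stop-position.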

Parse the binlog again with both --start-position and --stop-position to confirm that the DELETE is no longer included:

[root@localhost ~]# mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=1393  --stop-position=1912  -d xuanzhi  /data/mysql-5.6.10/mysql-bin.000023 >test1.sql
[root@localhost ~]# cat test1.sql    
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 1393
#150622 20:59:27 server id 1  end_log_pos 1441 CRC32 0x083aaf66         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:6'/*!*/;
# at 1441
#150622 20:59:27 server id 1  end_log_pos 1516 CRC32 0xfaaf6148         Query   thread_id=1     exec_time=0     error_code=0
SET TIMESTAMP=1434977967/*!*/;
SET @@session.pseudo_thread_id=1/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1075838976/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 1516
#150622 20:59:27 server id 1  end_log_pos 1571 CRC32 0x9feaa179         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 1571
#150622 20:59:27 server id 1  end_log_pos 1621 CRC32 0xa761e858         Write_rows: table id 74 flags: STMT_END_F
### INSERT INTO `xuanzhi`.`test1`
### SET
###   @1=5
###   @2=30
###   @3='MySQL'
# at 1621
#150622 20:59:27 server id 1  end_log_pos 1652 CRC32 0x1669add4         Xid = 112
COMMIT/*!*/;
# at 1652
#150622 20:59:29 server id 1  end_log_pos 1700 CRC32 0x0c99198b         GTID [commit=yes]
SET @@SESSION.GTID_NEXT= 'fc77e38e-c8c1-11e4-a54c-000c2914208d:7'/*!*/;
# at 1700
#150622 20:59:29 server id 1  end_log_pos 1775 CRC32 0x1492f3ed         Query   thread_id=1     exec_time=0     error_code=0
SET TIMESTAMP=1434977969/*!*/;
BEGIN
/*!*/;
# at 1775
#150622 20:59:29 server id 1  end_log_pos 1830 CRC32 0xea811740         Table_map: `xuanzhi`.`test1` mapped to number 74
# at 1830
#150622 20:59:29 server id 1  end_log_pos 1881 CRC32 0x8d528f67         Write_rows: table id 74 flags: STMT_END_F
### INSERT INTO `xuanzhi`.`test1`
### SET
###   @1=6
###   @2=40
###   @3='PYthon'
# at 1881
#150622 20:59:29 server id 1  end_log_pos 1912 CRC32 0x0d0fc5fa         Xid = 113
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
[root@localhost ~]# 

The DELETE is no longer there!

(3) Restore the earlier backup

[root@localhost ~]# mysql -uroot  -p123456 -S /data/mysql-5.6.10/mysql.sock xuanzhi  <./xuanzhi.sql     
Warning: Using a password on the command line interface can be insecure.
ERROR 1840 (HY000) at line 24: GTID_PURGED can only be set when GTID_EXECUTED is empty.
[root@localhost ~]# 

Let's check whether the data has been restored:

<test>(root@localhost) [xuanzhi]> select * from test1;
Empty set (0.00 sec)

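Sadly, the data was not restored. Look back at the error: GTID_PURGED can only be set when GTID_EXECUTED is empty. The dump contains a SET @@GLOBAL.GTID_PURGED statement, but this server's GTID_EXECUTED already has a value, so the statement fails and the mysql client stops at that point without running the rest of the dump. We can force the restore with the client's -f option, which continues past the failing statement.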
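The forced restore would look roughly like the function below. This is a sketch: the sample dump line and its GTID set 1-5 are assumptions for illustration, and the function is only defined here, not executed. -p prompts for the password instead of putting it on the command line.

```shell
#!/usr/bin/env bash
# Sketch: locate the statement that aborts the restore, then define a forced
# restore. The sample dump content (including the GTID set 1-5) is assumed.
dump=$(mktemp)
cat > "$dump" <<'EOF'
-- GTID state at the beginning of the backup
SET @@GLOBAL.GTID_PURGED='fc77e38e-c8c1-11e4-a54c-000c2914208d:1-5';
DROP TABLE IF EXISTS `test1`;
EOF

# Line number of the statement that raises ERROR 1840 on a server whose
# gtid_executed is not empty.
purged_line=$(grep -n 'GTID_PURGED' "$dump" | cut -d: -f1)
echo "GTID_PURGED is set on line $purged_line of the sample dump"

# -f (--force) makes the mysql client continue after an SQL error, so the
# statements after the failing SET still run.
restore_dump_force() {
    mysql -uroot -p -S /data/mysql-5.6.10/mysql.sock -f xuanzhi < "$1"
}
rm -f "$dump"
```

With the real dump, the call would be restore_dump_force ./xuanzhi.sql. Another way to avoid the error, as the mysqldump warning earlier suggests, is to take the dump with --set-gtid-purged=OFF so that the statement is never written.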

Check again whether the backup data has been restored. It has:

<test>(root@localhost) [xuanzhi]> select * from test1;
+----+------+------+
| id | age  | name |
+----+------+------+
|  1 |   20 | aa   |
|  2 |   22 | bb   |
|  3 |   23 | cc   |
|  4 |   24 | dd   |
+----+------+------+
4 rows in set (0.00 sec)

The rows written after the backup are still missing, so the next step is to replay the binlog:

[root@localhost ~]#  mysqlbinlog --start-position=1393  --stop-position=1912 /data/mysql-5.6.10/mysql-bin.000023 | /usr/local/mysql-5.6.10/bin/mysql -uroot  -p123456 -S /data/mysql-5.6.10/mysql.sock xuanzhi
Warning: Using a password on the command line interface can be insecure.
[root@localhost ~]# 

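No errors, so you can be happy for a moment. Then you look at the result and don't know whether to laugh or cry. Let's check whether the data came back: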

<test>(root@localhost) [xuanzhi]> select * from test1;
+----+------+------+
| id | age  | name |
+----+------+------+
|  1 |   20 | aa   |
|  2 |   22 | bb   |
|  3 |   23 | cc   |
|  4 |   24 | dd   |
+----+------+------+
4 rows in set (0.00 sec)

<test>(root@localhost) [xuanzhi]>

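Painful, isn't it? The rows written after the full backup were not restored. The plan was clear and the steps were right, so why didn't the data come back? The only explanation I could think of was GTID: during the replay the server sees that the GTIDs of these transactions are already in its executed set (gtid_executed), so it skips them.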
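The suspicion can be made concrete with a simplified model of that check. The sketch below handles only a single UUID with a single interval, whereas real GTID sets can hold several of each. The executed set 1-10 is an assumption read off the binlog listing above, where GTIDs up to :10 appear; on a live server the actual value comes from SELECT @@GLOBAL.gtid_executed.

```shell
#!/usr/bin/env bash
# Simplified model of the server-side check: is GTID "uuid:n" inside an
# executed set written as one interval "uuid:a-b"? Not a full GTID set parser.
gtid_already_executed() {
    local gtid="$1" executed="$2"
    local uuid="${gtid%:*}" seq="${gtid##*:}"
    [ "$uuid" = "${executed%:*}" ] || return 1   # other UUID, or empty set
    local range="${executed##*:}"
    local lo="${range%-*}" hi="${range#*-}"
    [ "$seq" -ge "$lo" ] && [ "$seq" -le "$hi" ]
}

uuid='fc77e38e-c8c1-11e4-a54c-000c2914208d'

# The two INSERT transactions in the replayed range carry GTIDs :6 and :7.
for n in 6 7; do
    if gtid_already_executed "$uuid:$n" "$uuid:1-10"; then
        echo "$uuid:$n is already executed, so the replay skips it"
    fi
done

# With an empty executed set (the state after RESET MASTER) nothing is skipped.
gtid_already_executed "$uuid:6" "" || echo "$uuid:6 would be applied"
```

mysqlbinlog also has a --skip-gtids option that leaves the SET @@SESSION.GTID_NEXT statements out of its output, so replayed transactions get new GTIDs. That may be a way to replay without touching the server's GTID state, though it was not tried in this walkthrough.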

 

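(4) Run reset master. Remember: do this only in a test or virtual environment, never in production, because it deletes all binlogs, starts a new one, and empties gtid_executed. If you are recovering into a brand-new environment, the reset master step below is not needed.

Before running reset master, be sure to back up the binlog. Otherwise the binlog you need for the recovery will be gone. So we copy it to /root: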

[root@localhost ~]# cp -a /data/mysql-5.6.10/mysql-bin.000023 /root/
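Since reset master removes every binlog, not just the current one, it may be safer to copy all of them. The sketch below copies each file listed in the binlog index and verifies the copy with cmp. The function name is illustrative, and it assumes the index holds one file name per line, either absolute or relative to the data directory.

```shell
#!/usr/bin/env bash
# Sketch: copy every binlog listed in the index file and verify each copy.
# Usage: backup_binlogs DATADIR INDEX_FILE DEST_DIR
backup_binlogs() {
    local datadir="$1" index="$2" dest="$3" entry src
    mkdir -p "$dest" || return 1
    while IFS= read -r entry; do
        [ -n "$entry" ] || continue              # skip blank lines
        case "$entry" in
            /*) src="$entry" ;;                  # absolute path in the index
            *)  src="$datadir/${entry#./}" ;;    # relative to the data directory
        esac
        # Copy with attributes preserved, then compare byte for byte.
        cp -a "$src" "$dest/" && cmp -s "$src" "$dest/$(basename "$src")" || return 1
    done < "$index"
}
```

For the layout in this post the call might be backup_binlogs /data/mysql-5.6.10 /data/mysql-5.6.10/mysql-bin.index /root/binlog-backup, where the index path is an assumption based on the binlog file name.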

Then run reset master:

<test>(root@localhost) [xuanzhi]> reset master;
Query OK, 0 rows affected (0.00 sec)

Replay the binlog once more:

[root@localhost ~]# mysqlbinlog --start-position=1393  --stop-position=1912 mysql-bin.000023 | /usr/local/mysql-5.6.10/bin/mysql -uroot  -p123456 -S /data/mysql-5.6.10/mysql.sock xuanzhi
Warning: Using a password on the command line interface can be insecure.
[root@localhost ~]#

Check the data again to see whether it has been restored:

<test>(root@localhost) [xuanzhi]> select * from test1;
+----+------+--------+
| id | age  | name   |
+----+------+--------+
|  1 |   20 | aa     |
|  2 |   22 | bb     |
|  3 |   23 | cc     |
|  4 |   24 | dd     |
|  5 |   30 | MySQL  |
|  6 |   40 | PYthon |
+----+------+--------+
6 rows in set (0.00 sec)

<test>(root@localhost) [xuanzhi]> 

Now you can smile for a while!

 

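So how do we guard against a delete from tb_name with no condition? There is a way, but it only helps when an operations DBA or DBA is running statements by hand. For PHP and Java applications, given how they connect and operate, it cannot prevent the mistake.

 set sql_safe_updates=on, or set global sql_safe_updates=on; (the global setting only applies to connections opened afterwards)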

<test>(root@localhost) [xuanzhi]> set sql_safe_updates=on;       
Query OK, 0 rows affected (0.00 sec)

<test>(root@localhost) [xuanzhi]> select * from test1;
Empty set (0.00 sec)

<test>(root@localhost) [xuanzhi]> delete from test1;  
ERROR 1175 (HY000): You are using safe update mode and you tried to update a table without a WHERE that uses a KEY column
<test>(root@localhost) [xuanzhi]> 

As shown, a DELETE without a WHERE clause is rejected.
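For people who work in the mysql command-line client, the same protection can be switched on by default through the client's safe-updates option instead of typing SET in every session. The sketch below writes the option-file fragment to a scratch file (the path is an assumption, chosen so nothing existing is overwritten); after reviewing it, merge it into ~/.my.cnf by hand.

```shell
#!/usr/bin/env bash
# Sketch: option-file fragment that enables safe-updates for the mysql
# command-line client only. Written to a scratch file, not to ~/.my.cnf.
cnf_fragment=$(mktemp)
cat > "$cnf_fragment" <<'EOF'
[mysql]
safe-updates
EOF

cat "$cnf_fragment"
```

This affects only sessions started with the mysql client, which matches the limitation described above: application connections from PHP or Java are not covered.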

 

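Summary:

       1. When restoring a single table, or a few rows of a single table, do the work in a test environment first, then load the recovered data back into the production database

       2. Don't parse a newer server's binary log with an older mysqlbinlog; it may fail. Watch out for this especially on servers that run multiple instances of different versions

       3. Mistakes will always happen, so take extra care. Restoring data is painful: it takes time, and you also have to worry about whether everything was restored completely

       4. Have a solid backup plan. When something goes wrong it is your lifeline, and it is something a DBA must get right.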

      

 

 

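Author: 陆炫志

Source: xuanzhi's blog, http://www.cnblogs.com/xuanzhi201111

Your support is the greatest encouragement to the author; thank you for reading. Copyright belongs to the author. Reposting is welcome, but please keep this notice.

Posted 2015-07-01 15:08 by GoogSQL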