校验主从数据并修复

1.先安装pt工具

[root@bogon ~]# yum -y install percona-toolkit

2.主库数据:

root@localhost:mysql3316.sock  16:18:03 [zhangshuo]>select * from zs;
+----+--------------+
| id | name         |
+----+--------------+
|  1 | zhangshuo    |
|  2 | jie          |
|  3 | zhangsanfeng |
+----+--------------+
3 rows in set (0.00 sec)

3.从库数据:

root@localhost:mysql3317.sock  16:20:48 [zhangshuo]>select * from zs;
+----+-----------------+
| id | name            |
+----+-----------------+
|  1 | zhangshuo       |
|  2 | jie             |
|  3 | zhangsanfeng    |
|  4 | 从入门进阶      |
|  5 | 到删库跑路      |
+----+-----------------+
5 rows in set (0.00 sec)

4.很明显数据不一致,我们使用工具来检测:

首先要在从库配置文件中定义:

report_host=slave_ip

report_port=slave_port

root@localhost:mysql3316.sock  08:51:35 [percona]>grant all privileges on *.* to 'repl'@'%' identified by'repl';
root@localhost:mysql3316.sock  08:53:05 [percona]>flush privileges;

[root@bogon ~]# pt-table-checksum -S /tmp/mysql3316.sock -P 3316 --user=repl --password=repl --host=192.168.1.113 --databases=zhangshuo --recursion-method='hosts' --no-check-binlog-format --create-replicate-table --replicate=zhangshuo.checksums

            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
07-30T09:02:36      0      1        3       1       0   0.055 zhangshuo.zs
TS            #检测完成时间。
ERRORS        #检查时候发生错误的数量。
DIFFS         #0表示一致,1表示不一致。
ROWS          #表的行数。
CHUNKS        #被划分到表中的块的数目。
SKIPPED       #过多错误而跳过块的数目。
TIME          #执行时间
TABLE         #检查的表名

参数解析:

 -S /tmp/mysql3316.sock    #指定sock

-P 3316    #指定端口

--user=repl    #指定用户

--password=repl    #指定密码

--host=192.168.1.113    #指定检测主机

 --databases=zhangshuo    #指定检测库

 --recursion-method='hosts'    #发现从库方式

--no-check-binlog-forma    #如果是row格式会报错,需跳过。

--create-replicate-table --replicate=zhangshuo.checksums     #指定检测结果输出表

 5.通过DIFFS=1,确定主从数据不一致。因指定了参数--create-replicate-table --replicate=zhangshuo.checksums。进slave查看检测信息:

root@localhost:mysql3317.sock  10:08:28 [zhangshuo]>select * from checksums\G
*************************** 1. row ***************************
            db: zhangshuo
           tbl: zs
         chunk: 1
    chunk_time: 0.000992
   chunk_index: NULL
lower_boundary: NULL
upper_boundary: NULL
      this_crc: 855a644a
      this_cnt: 5          #从库
    master_crc: abfa5704
    master_cnt: 3          #主库         发现从库比主库多了两条数据
            ts: 2016-07-30 09:54:04
1 row in set (0.00 sec)

6.使用pt-table-sync来修复:

(1)可以使用--print参数将修复语句打印出来。  #修复语句也可以使用source来执行。

[root@bogon ~]# pt-table-sync  --replicate=zhangshuo.checksums --socket=/tmp/mysql3316.sock --port=3316 h=192.168.1.113,u=repl,p=repl --socket=/tmp/mysql3317.sock,--port=3317,h=192.168.1.113,u=repl,p=repl --print
DELETE FROM `zhangshuo`.`zs` WHERE `id`='4' LIMIT 1 /*percona-toolkit src_db:zhangshuo src_tbl:zs src_dsn:P=3316,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl dst_db:zhangshuo dst_tbl:zs dst_dsn:P=3317,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl lock:1 transaction:1 changing_src:zhangshuo.checksums replicate:zhangshuo.checksums bidirectional:0 pid:8659 user:root host:bogon*/;
DELETE FROM `zhangshuo`.`zs` WHERE `id`='5' LIMIT 1 /*percona-toolkit src_db:zhangshuo src_tbl:zs src_dsn:P=3316,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl dst_db:zhangshuo dst_tbl:zs dst_dsn:P=3317,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl lock:1 transaction:1 changing_src:zhangshuo.checksums replicate:zhangshuo.checksums bidirectional:0 pid:8659 user:root host:bogon*/;

(2)直接修复:

[root@bogon ~]# pt-table-sync  --replicate=zhangshuo.checksums --socket=/tmp/mysql3316.sock --port=3316 h=192.168.1.113,u=repl,p=repl --socket=/tmp/mysql3317.sock,--port=3317,h=192.168.1.113,u=repl,p=repl --execute

7.进入从库查看发现多出的两条数据被删除了:

root@localhost:mysql3317.sock  11:00:01 [zhangshuo]>select * from zs;
+----+--------------+
| id | name         |
+----+--------------+
|  1 | zhangshuo    |
|  2 | jie          |
|  3 | zhangsanfeng |
+----+--------------+
3 rows in set (0.00 sec)

8.重新校验确定中从数据同步:(DIFFS列为0)

[root@bogon ~]# pt-table-checksum -S /tmp/mysql3316.sock -P 3316 --user=repl --password=repl --host=192.168.1.113 --databases=zhangshuo --recursion-method='hosts' --no-check-binlog-format --create-replicate-table --replicate=zhangshuo.checksums
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
07-30T11:14:55      0      0        3       1       0   0.010 zhangshuo.zs

9.分析pt-table-sync是怎么工作的:

(1)在从库更改一条数据:

root@localhost:mysql3317.sock  11:30:09 [zhangshuo]>select * from zs;
+----+--------------+
| id | name         |
+----+--------------+
|  1 | zhangshuo    |
|  2 | jie          |
|  3 | zhangsanfeng |
+----+--------------+
3 rows in set (0.00 sec

root@localhost:mysql3317.sock  11:32:40 [zhangshuo]>update zs set name='zhangjie' where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

root@localhost:mysql3317.sock  11:33:01 [zhangshuo]>select * from zs;
+----+--------------+
| id | name         |
+----+--------------+
|  1 | zhangjie     |
|  2 | jie          |
|  3 | zhangsanfeng |
+----+--------------+
3 rows in set (0.00 sec)

(2)主库使用pt-table-checksum校验,并发现主从数据不一致(DIFFS为1):

[root@bogon ~]# pt-table-checksum -S /tmp/mysql3316.sock -P 3316 --user=repl --password=repl --host=192.168.1.113 --databases=zhangshuo --recursion-method='hosts' --no-check-binlog-format --create-replicate-table --replicate=zhangshuo.checksums
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
07-30T11:33:55      0      1        3       1       0   0.008 zhangshuo.zs

(3)使用pt-table-sync修复,因为需要解析binlog,所以在数据修复之前flush logs方便实验:

root@localhost:mysql3316.sock  11:29:10 [zhangshuo]>flush logs;

[root@bogon ~]# pt-table-sync  --replicate=zhangshuo.checksums --socket=/tmp/mysql3316.sock --port=3316 h=192.168.1.113,u=repl,p=repl --socket=/tmp/mysql3317.sock,--port=3317,h=192.168.1.113,u=repl,p=repl --execute

(4)解析binlog查看pt-table-sync怎样修复数据:

[root@bogon ~]# nohup /usr/local/mysql/bin/mysqlbinlog -v --base64-output=decode-rows /data/mysql/mysql3316/logs/mysql-bin.000008 
nohup: 忽略输入并把输出追加到"nohup.out"
BEGIN
/*!*/;
# at 2345
#160730 11:34:42 server id 1283316  end_log_pos 2836 CRC32 0x66f458b4   Query   thread_id=63    exec_time=0     error_code=0
SET TIMESTAMP=1469849682/*!*/;
REPLACE INTO `zhangshuo`.`zs`(`id`, `name`) VALUES ('1', 'zhangshuo') /*percona-toolkit src_db:zhangshuo src_tbl:zs src_dsn:P=3316,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl dst_db:zhangshuo dst_tbl:zs dst_dsn:P=3317,S=/tmp/mysql3317.sock,h=192.168.1.113,p=...,u=repl lock:1 transaction:1 changing_src:zhangshuo.checksums replicate:zhangshuo.checksums bidirectional:0 pid:8948 user:root host:bogon*/
/*!*/;
# at 2836
#160730 11:34:42 server id 1283316  end_log_pos 2867 CRC32 0x7f9ef3d5   Xid = 1668
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
查看以上事务发现 pt-table-sync 在master上生成replace into语句,记录bing log然后复制到从库来完成数据一致性修复,所以修复都在主库完成。 


 

posted on 2016-07-30 09:49  zhangshuo  阅读(482)  评论(0编辑  收藏  举报

导航