返回顶部

centos7下mysql半同步复制原理安装测试详解

原理简介

在MySQL5.5之前,MySQL的复制其实都是异步复制(见下图),主库和从库的数据之间存在一定的延迟,这样存在一个隐患:当在主库上写入一个事务并提交成功,而从库尚未得到主库推送的BinLog日志时,恰好主库宕机了,例如主库可能因磁盘损坏、内存故障等造成主库上该事务Binlog丢失,此时从库就可能损失这个事务,从而造成主从不一致。

        

  

为了解决这个问题,从MySQL5.5开始引入了半同步复制机制(Semi_synchronous Replication)。为了保证主库上的每一个Binlog事务都能够被可靠的复制到从库上,主库在每次事务成功提交时,并不及时反馈给前端用户,而是等待其中一个从库也接受到Binlog事务并成功写入中继日志后,主库才返回Commit操作成功给客户端。半同步复制保证了事务成功提交后,至少有两份日志记录,一份在主库的Binlog日志上,另一份在至少一个从库的中继日志Relay Log上,从而更进一步保证了数据的完整性。半同步复制的大致流程如下图:

    

半同步复制模式下,假如在上图步骤①②③中任何一个步骤中主库宕机,则事务并未提交成功,从库上也没收到事务对应的Binlog日志,所以主从数据是一致的;假如在步骤④传送Binlog日志到从库时,从库宕机或者网络故障,导致Binlog并没有及时地传送到从库上,此时主库上的事务会等待一段时间(时间长短由参数rpl_semi_sync_master_timeout设置的毫秒数决定),如果Binlog在这段时间内都无法成功推送到从库上,则MySQL自动调整复制为异步复制,事务正常返回提交结果给客户端。

半同步复制很大程度上取决于主从库之间的网络情况,往返时延RTT(Round-Trip Time)越小决定了从库的实时性越好。通俗地说,主从库之间的网络越快,从库越实时。

 

测试环境

 

安装步骤

半同步复制是以插件形式来实现的,安装比较简单,在异步复制的环境上(这里已经安装好传统的异步复制),安装半同步插件即可,也可在新建时写入my.cnf,参考文章尾部;

1. 查看MySQL服务器是否支持动态增加插件

mysql> select @@have_dynamic_loading;
+------------------------+
| @@have_dynamic_loading |
+------------------------+
| YES                    |  //YES表示支持
+------------------------+
1 row in set (0.00 sec)

2. 确认支持动态插件后,检查安装目录是否存在所需插件,一般在mysql安装目录中的一个.../plugin/目录下,可以搜索一下:

[root@server-10 ~]# find / -name semisync_*.so
/usr/lib64/mysql/plugin/debug/semisync_master.so
/usr/lib64/mysql/plugin/debug/semisync_slave.so
/usr/lib64/mysql/plugin/semisync_master.so
/usr/lib64/mysql/plugin/semisync_slave.so

3. 在主库上安装插件semisync_master.so

mysql> install plugin rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.03 sec)

4. 在从库上安装插件semisync_master.so

mysql> install plugin rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)

插件安装完成后,从plugin表中能够看到刚才安装的插件(这里在主库上查看一下):

mysql> select * from mysql.plugin;
+----------------------+--------------------+
| name                 | dl                 |
+----------------------+--------------------+
| rpl_semi_sync_master | semisync_master.so |
+----------------------+--------------------+
1 row in set (0.00 sec)

也就是说,安装完成后,MySQL会在系统表plugin中记录刚才安装的插件,下次系统重启后会自动加载插件。

5. 分别在主库和从库上配置参数打开半同步semi-sync,默认半同步设置是不打开的。

在主库上配置全局参数:

mysql> show variables like 'rpl_semi_sync_master%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| rpl_semi_sync_master_enabled       | OFF   |
| rpl_semi_sync_master_timeout       | 10000 |
| rpl_semi_sync_master_trace_level   | 32    |
| rpl_semi_sync_master_wait_no_slave | ON    |
+------------------------------------+-------+
4 rows in set (0.00 sec)

mysql> set global rpl_semi_sync_master_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> set global rpl_semi_sync_master_timeout = 20000;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like 'rpl_semi_sync_master%';
+------------------------------------+-------+
| Variable_name                      | Value |
+------------------------------------+-------+
| rpl_semi_sync_master_enabled       | ON    |
| rpl_semi_sync_master_timeout       | 20000 |
| rpl_semi_sync_master_trace_level   | 32    |
| rpl_semi_sync_master_wait_no_slave | ON    |
+------------------------------------+-------+
4 rows in set (0.00 sec)

在从库上配置参数:

mysql> set global rpl_semi_sync_slave_enabled = 1;
Query OK, 0 rows affected (0.00 sec)

6. 由于之前配置的是传统的异步复制,所以需要重启一下从库上的I/O线程(如果是全新配置的半同步复制则不需要,后面会提到全新配置):

mysql> stop slave io_thread;
Query OK, 0 rows affected (0.05 sec)

mysql> start slave io_thread;
Query OK, 0 rows affected (0.00 sec)

到此,半同步复制配置完成,下面可以来验证一下。

 

实际测试

1. 先查看当前主库上半同步复制的一些状态值:

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |  //留意该值,后面测试会有变化
| Rpl_semi_sync_master_status                | ON    |  //留意该值,后面测试会有变化
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |  //留意该值,后面测试会有变化
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

注意环境不一样,可能显示也不一样,着重关注以下3个状态值的变化,而不是上面这些初始值

Rpl_semi_sync_master_status :值为ON,表示半同步复制目前处于打开状态。

Rpl_semi_sync_master_yes_tx:值为0,表示主库当前尚未有任何一个事务是通过半同步复制到从库。

Rpl_semi_sync_master_no_tx:值为0,表示当前有0个事务不是半同步模式下从库及时响应的。

在主库上执行一个事务,然后再检查一下状态:

mysql> use mydb;
Database changed
mysql> show tables;
Empty set (0.00 sec)

mysql> CREATE TABLE customers
    -> (
    ->   cust_id      int       NOT NULL AUTO_INCREMENT,
    ->   cust_name    char(50)  NOT NULL ,
    ->   cust_address char(50)  NULL ,
    ->   cust_city    char(50)  NULL ,
    ->   cust_state   char(5)   NULL ,
    ->   cust_zip     char(10)  NULL ,
    ->   cust_country char(50)  NULL ,
    ->   cust_contact char(50)  NULL ,
    ->   cust_email   char(255) NULL ,
    ->   PRIMARY KEY (cust_id)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.10 sec)

mysql> show tables;
+----------------+
| Tables_in_mydb |
+----------------+
| customers      |
+----------------+
1 row in set (0.00 sec)

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 316   |
| Rpl_semi_sync_master_net_wait_time         | 316   |
| Rpl_semi_sync_master_net_waits             | 1     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 488   |
| Rpl_semi_sync_master_tx_wait_time          | 488   |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

此时会发现Rpl_semi_sync_master_yes_tx的值变为1,即刚才的CREATE事务通过半同步复制到从库上了,Rpl_semi_sync_master_yes_tx计数增加1。

到从库确认一下,新建的customers表确实被复制过去了:

 

2. 接下来模仿网络异常的场景下,主库在等待 rpl_semi_sync_master_timeout毫秒超时后,自动转成异步复制的场景。

在主库上确认半同步复制会等待20s超时:

mysql> show variables like 'rpl_semi_sync_master_timeout';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| rpl_semi_sync_master_timeout | 20000 |    //单位:ms
+------------------------------+-------+
1 row in set (0.00 sec)

从库上通过iptables命令模拟从库宕机或者网络故障:

[root@server-11 ~]# iptables -A INPUT -s 138.138.82.10 -j DROP

在主库上执行一个事务并提交(默认提交即可),主库上的提交操作会被阻塞20秒

mysql> INSERT INTO customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
    -> VALUES(10001, 'Coyote Inc.', '200 Maple Lane', 'Detroit', 'MI', '44444', 'USA', 'Y Lee', 'ylee@coyote.com');
Query OK, 1 row affected (20.05 sec)   //回车后,会卡主(阻塞)20秒,然后才会跳出Query OK...这行,并显示用时20秒

在这个20秒阻塞过程中,新开一个窗口检查当前主库的线程,会发现提交操作在等待从库上半同步复制操作的响应:

mysql> show processlist\G
......
*************************** 3. row *************************** Id: 12 User: root Host: localhost db: mydb Command: Query Time: 3 State: Waiting for semi-sync ACK from slave    //阻塞,等待从库确认 Info: INSERT INTO customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_countr 3 rows in set (0.00 sec)

阻塞结束后,再次查看半同步复制的一些状态值:

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_net_avg_wait_time     | 15011119 |
| Rpl_semi_sync_master_net_wait_time         | 30022238 |
| Rpl_semi_sync_master_net_waits             | 2        |
| Rpl_semi_sync_master_no_times              | 1        |
| Rpl_semi_sync_master_no_tx                 | 1        |//该值更新为1,表示在半同步复制模式下,从库没有及时响应的事务增加1个
| Rpl_semi_sync_master_status                | OFF      |//表示主库上半同步复制已经关闭了
| Rpl_semi_sync_master_timefunc_failures     | 0        |
| Rpl_semi_sync_master_tx_avg_wait_time      | 488      |
| Rpl_semi_sync_master_tx_wait_time          | 488      |
| Rpl_semi_sync_master_tx_waits              | 1        |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0        |
| Rpl_semi_sync_master_wait_sessions         | 0        |
| Rpl_semi_sync_master_yes_tx                | 1        |//该值仍然为1,表示刚才的事务并不是通过半同步复制完成的,所以半同步成功事务仍然为1个
+--------------------------------------------+----------+
14 rows in set (0.00 sec)

继续测试:如果从库正常连接上主库之后,主库是否会自动切换回半同步复制模式呢?

那么把之前从库上面的iptables限制条目去除:

[root@server-11 ~]# iptables -F
[root@server-11 ~]# iptables -nL   //查看一下确实没了
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination   

然后在从库上查看slave状态:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 138.138.82.10
                  Master_User: repl_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000010
          Read_Master_Log_Pos: 988
               Relay_Log_File: relay-bin.000016
                Relay_Log_Pos: 704
        Relay_Master_Log_File: master-bin.000010
             Slave_IO_Running: Yes   
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 988
              Relay_Log_Space: 5483
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
                  Master_UUID: 8086bac0-a428-11e8-8bf9-00505691656b
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

mysql> select * from customers;   //之前在主库插入时阻塞20.05秒的条目也复制过来了
+---------+-------------+----------------+-----------+------------+----------+--------------+--------------+-----------------+
| cust_id | cust_name   | cust_address   | cust_city | cust_state | cust_zip | cust_country | cust_contact | cust_email      |
+---------+-------------+----------------+-----------+------------+----------+--------------+--------------+-----------------+
|   10001 | Coyote Inc. | 200 Maple Lane | Detroit   | MI         | 44444    | USA          | Y Lee        | ylee@coyote.com |
+---------+-------------+----------------+-----------+------------+----------+--------------+--------------+-----------------+
1 row in set (0.00 sec)

以上显示说明在网络状态恢复后(去掉iptables),从库会自动尝试连接主库,几秒钟后I/O线程状态从Connecting变成了YES,并且主库和从库的数据一致了。

再次查看主库上半同步复制的状态值

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_net_avg_wait_time     | 10033583 |
| Rpl_semi_sync_master_net_wait_time         | 30100750 |
| Rpl_semi_sync_master_net_waits             | 3        |
| Rpl_semi_sync_master_no_times              | 1        |
| Rpl_semi_sync_master_no_tx                 | 1        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_master_timefunc_failures     | 0        |
| Rpl_semi_sync_master_tx_avg_wait_time      | 488      |
| Rpl_semi_sync_master_tx_wait_time          | 488      |
| Rpl_semi_sync_master_tx_waits              | 1        |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0        |
| Rpl_semi_sync_master_wait_sessions         | 0        |
| Rpl_semi_sync_master_yes_tx                | 1        |
+--------------------------------------------+----------+
14 rows in set (0.00 sec)

以上发现Rpl_semi_sync_master_status的值自动从OFF变成ON,说明在检测到从库正常之后,主库到从库的复制方式会自动切换为半同步复制模式。

我们继续主库上做一个INSERT事务测试,确认当前的复制模式确实是半同步复制:

mysql> INSERT INTO customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact)
    -> VALUES(10002, 'Mouse House', '333 Fromage Lane', 'Columbus', 'OH', '43333', 'USA', 'Jerry Mouse');
Query OK, 1 row affected (0.07 sec)

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_net_avg_wait_time     | 7525296  |
| Rpl_semi_sync_master_net_wait_time         | 30101185 |
| Rpl_semi_sync_master_net_waits             | 4        |
| Rpl_semi_sync_master_no_times              | 1        |
| Rpl_semi_sync_master_no_tx                 | 1        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_master_timefunc_failures     | 0        |
| Rpl_semi_sync_master_tx_avg_wait_time      | 517      |
| Rpl_semi_sync_master_tx_wait_time          | 1034     |
| Rpl_semi_sync_master_tx_waits              | 2        |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0        |
| Rpl_semi_sync_master_wait_sessions         | 0        |
| Rpl_semi_sync_master_yes_tx                | 2        |  //计数增加了1个,变为2
+--------------------------------------------+----------+
14 rows in set (0.00 sec)

可以看出,以上的一个INSERT事务提交后,Rpl_semi_sync_master_yes_tx 值从1变成2,确认了刚才事务的复制事半同步复制。

测试结束;

小结

从半同步复制的流程会发现,半同步复制的“半”就体现在:虽然主库和从库的Binlog日志时同步的,但是主库并不等待从库应用这部分日志就返回提交结果,这部分操作是异步的,从库的数据并不是和主库实时同步的,所以只能成为半同步,而不是完全的实时同步。

 

补充

通过配置文件添加半同步插件和参数,操作如下:

在/etc/my.cnf中添加以下参数

plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_slave_enabled = 1
rpl_semi_sync_master_timeout = 20000

 

结束.

posted @ 2018-08-24 18:54  Adrian·Ding  阅读(530)  评论(0编辑  收藏  举报