KingbaseES V8R3集群运维案例之---sys_ctl promote手工切换备库为主库
案例说明:
KingbaseES V8R3集群不支持在线切换switchover,如果想将集群中一个备库promote为主库,需要通过执行‘sys_ctl promote’完成主备的切换。一般可以用于以下场景,主库数据库服务down后,但是failover主备切换失败,强制将其中一个备库提升为新的主库。
适用版本:
KingbaseES V8R3 
一、集群架构
如下所示,集群有一主一备两个节点构成,目前集群状态正常,现测试将备库手工promote为主库:
test=# show pool_nodes;
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_de
lay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+---------------
----
 0       | 192.168.1.201 | 54321 | up     | 0.500000  | standby | 0          | true              | 0
 1       | 192.168.1.202 | 54321 | up     | 0.500000  | primary | 0          | false             | 0
(2 rows)
test=# select * from sys_stat_replication;
 pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_s
tart         | backend_xmin |   state   | sent_location | write_location | flush_location | replay_location | syn
c_priority | sync_state
------+----------+---------+------------------+---------------+-----------------+-------------+------------------
-------------+--------------+-----------+---------------+----------------+----------------+-----------------+----
-----------+------------
 8163 |       10 | SYSTEM  | node1            | 192.168.1.201 |                 |       60868 | 2023-09-20 10:25:
11.579320+08 |              | streaming | 0/18026C48    | 0/18026C48     | 0/18026C48     | 0/18026C48      |
         1 | sync
(1 row)
二、备库执行promote
1、当前节点状态
如下所示,当前节点为standby备库节点:
test=# select sys_is_in_recovery();
 sys_is_in_recovery
--------------------
 t
(1 row)
2、执行promote
[kingbase@node201 bin]$ ./sys_ctl promote -D ../data
server promoting
查看sys_log日志:
# 如下所示,备库数据库服务被关闭
2023-09-20 10:32:03.893 CST,,,8250,,650a5807.203a,4,,2023-09-20 10:25:11 CST,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2023-09-20 10:32:03.893 CST,,,8250,,650a5807.203a,5,,2023-09-20 10:25:11 CST,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
.........
CST,"SUPERMANAGER_V8ADMIN","TEMPLATE2",14141,"192.168.1.202:56088",650a59a4.373d,1,"",2023-09-20 10:32:04 CST,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,""
2023-09-20 10:32:04.099 CST,,,8256,,650a5807.2040,8,,2023-09-20 10:25:11 CST,,0,LOG,00000,"checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.000 s, total=0.162 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16228 kB, estimate=16228 kB",,,,,,,,,""
2023-09-20 10:32:04.107 CST,,,8250,,650a5807.203a,7,,2023-09-20 10:25:11 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""
3、关闭原主库数据库服务
# 需要手工将原主库recovery为备库
[kingbase@node202 bin]$ ./sys_ctl stop -D ../data
waiting for server to shut down.... done
server stopped
4、启动新主库数据库服务
[kingbase@node201 bin]$ ./sys_ctl start -D ../data
server starting
test=# select sys_is_in_recovery();
 sys_is_in_recovery
--------------------
 f
(1 row)
# 如上所示,当前节点数据库服务已经启动为主库节点
sys_log日志:
[kingbase@node201 sys_log]$ tail -1000 kingbase-2023-09-20_103407.csv
2023-09-20 10:34:07.337 CST,,,15490,,650a5a1f.3c82,1,,2023-09-20 10:34:07 CST,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,""
2023-09-20 10:34:07.339 CST,,,15492,,650a5a1f.3c84,1,,2023-09-20 10:34:07 CST,,0,LOG,00000,"database system was shut down at 2023-09-20 10:32:04 CST",,,,,,,,,""
2023-09-20 10:34:07.342 CST,,,15492,,650a5a1f.3c84,2,,2023-09-20 10:34:07 CST,,0,LOG,00000,"MultiXact member wraparound protections are now enabled",,,,,,,,,""
2023-09-20 10:34:07.343 CST,,,15490,,650a5a1f.3c82,2,,2023-09-20 10:34:07 CST,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,""
三、恢复原主库为备库
1、配置备库recovery.conf文件
[kingbase@node202 bin]$ cd ../data
[kingbase@node202 data]$ rm recovery.done
[kingbase@node202 data]$ cp ../etc/recovery.done ./recovery.conf
[kingbase@node202 data]$ cat recovery.conf
standby_mode='on'
primary_conninfo='port=54321 host=192.168.1.201 user=SYSTEM password=MTIzNDU2Cg====== application_name=node2'
recovery_target_timeline='latest'
primary_slot_name ='slot_node2'
2、启动数据库服务
[kingbase@node202 bin]$ ./sys_ctl start -D ../data
server starting
[kingbase@node202 bin]$ ./ksql -U SYSTEM -W 123456 test
ksql (V008R003C002B0370)
Type "help" for help.
test=# select sys_is_in_recovery();
 sys_is_in_recovery
--------------------
 t
(1 row)
# 如上所示,当前节点启动为备库节点
sys_log日志:
# 如下所示,当前节点作为备库启动后,出现wal streaming故障,导致数据库服务shutdown。
2023-09-20 10:38:18.263 CST,,,17077,,650a5b1a.42b5,2,,2023-09-20 10:38:18 CST,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,,""
2023-09-20 10:38:18.267 CST,,,17086,,650a5b1a.42be,1,,2023-09-20 10:38:18 CST,,0,LOG,00000,"fetching timeline history file for timeline 3 from primary server",,,,,,,,,""
2023-09-20 10:38:18.269 CST,,,17086,,650a5b1a.42be,2,,2023-09-20 10:38:18 CST,,0,FATAL,XX000,"could not start WAL streaming: ERROR:  requested starting point 0/19000000 on timeline 2 is not in this server's history
DETAIL:  This server's history forked from timeline 2 at 0/18026C48.
",,,,,,,,,""
2023-09-20 10:39:44.303 CST,,,17077,,650a5b1a.42b5,3,,2023-09-20 10:38:18 CST,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2023-09-20 10:39:44.303 CST,,,17077,,650a5b1a.42b5,4,,2023-09-20 10:38:18 CST,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2023-09-20 10:39:49.309 CST,,,17083,,650a5b1a.42bb,1,,2023-09-20 10:38:18 CST,,0,LOG,00000,"shutting down",,,,,,,,,""
2023-09-20 10:39:49.317 CST,,,17077,,650a5b1a.42b5,5,,2023-09-20 10:38:18 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,"
3、新备库执行recovery
如下所示,当备库第一次启动数据库服务down后,节点自动执行recovery,从recovery.log获悉恢复过程:
[kingbase@node202 log]$ tail -1000 recovery.log
---------------------------------------------------------------------
2023-09-20 10:38:01 recover beging...
my pid is 16494,officially began to perform recovery
2023-09-20 10:38:01 check read/write on mount point
2023-09-20 10:38:01 check read/write on mount point (1 / 6).
2023-09-20 10:38:01 stat the directory of the mount point "/home/kingbase/cluster/R3HA/db/data" ...
.........
2023-09-20 10:38:01 delete the file "/home/kingbase/cluster/R3HA/db/data/rw_status_file_827192217" ... OK
2023-09-20 10:38:01 success to check read/write on mount point (1 / 6).
2023-09-20 10:38:02 check read/write on mount point ... ok
2023-09-20 10:38:02 check if the network is ok
ping trust ip 192.168.1.1 success ping times :[3], success times:[2]
determine if i am master or standby
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.1.201 | 54321 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.1.202 | 54321 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)
i am standby in cluster,determine if recovery is needed
2023-09-20 10:38:04 now will del vip [192.168.1.10/24]
execute [/sbin/ip addr del 192.168.1.10/24 dev enp0s3 ]
now, there is no 192.168.1.10/24 on my DEV
ksql execute success,but node:node2 does not have correct streaming(or catchup) replication ,will retry ,retry times:[1/10]
.......
ksql execute success,but node:node2 does not have correct streaming(or catchup) replication ,will retry ,retry times:[10/10]
primary node/Im node status is changed, primary ip[192.168.1.201], recovery.conf NEED_CHANGE [0] (0 is need ), I,m status is [1] (1 is down), I will be in recovery.
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.1.201 | 54321 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.1.202 | 54321 | down   | 0.500000  | standby | 0          | false             | 0
(2 rows)
if recover node up, let it down , for rewind
waiting for server to shut down......... done
server stopped
2023-09-20 10:39:50 set /home/kingbase/cluster/R3HA/db/data down now... already down , check again
wait kb stop 5 sec .......
2023-09-20 10:39:51 sys_rewind...
sys_rewind  --target-data=/home/kingbase/cluster/R3HA/db/data --source-server="host=192.168.1.201 port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST"
datadir_source = /home/kingbase/cluster/R3HA/db/bin/../data
rewinding from last common checkpoint at 0/17025808 on timeline 2
find last common checkpoint start time from 2023-09-20 10:39:51.348556 CST to 2023-09-20 10:39:51.370137 CST, in "0.021581" seconds.
reading source file list
reading target file list
reading WAL in target
Rewind datadir file from source
Get archive xlog list from source
Rewind archive log from source
update the control file: minRecoveryPoint is '0/1901DD58', minRecoveryPointTLI is '3', and database state is 'in archive recovery'
rewind start wal location 0/170257D0 (file 000000020000000000000017), end wal location 0/1901DD58 (file 000000030000000000000019). time from 2023-09-20 10:39:51.348556 CST to 2023-09-20 10:39:52.692579 CST, in "1.344023" seconds.
Done!
 sed conf change #synchronous_standby_names
2023-09-20 10:39:54 file operate
cp recovery.conf...
 change recovery.conf ip -> primary.ip
2023-09-20 10:39:54 no need change recovery.conf, primary node is 192.168.1.201
delete pid file if exist
del the replication_slots if exist
drop the slot [slot_node1].
drop the slot [slot_node2].
2023-09-20 10:39:54 start up the kingbase...
waiting for server to start....LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "/home/kingbase/cluster/R3HA/db/data/sys_log".
 done
server started
ksql "port=54321 user=SUPERMANAGER_V8ADMIN dbname=TEST connect_timeout=10"   -c "select 33333;"
 sys_create_physical_replication_slot
--------------------------------------
 (slot_node1,)
(1 row)
2023-09-20 10:39:56 create the slot [slot_node1] success.
 sys_create_physical_replication_slot
--------------------------------------
 (slot_node2,)
(1 row)
2023-09-20 10:39:56 create the slot [slot_node2] success.
2023-09-20 10:39:56 start up standby successful!
cluster is sync cluster.
SYNC RECOVER MODE ...
2023-09-20 10:39:56 remote primary node change sync
ALTER SYSTEM
 sys_reload_conf
-----------------
 t
(1 row)
SYNC RECOVER MODE DONE
2023-09-20 10:40:00 attach pool...
IM Node is 1, will try [pcp_attach_node -U kingbase -W MTIzNDU2 -h 192.168.1.11 -n 1]
pcp_attach_node -- Command Successful
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+-------------------
 0       | 192.168.1.201 | 54321 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.1.202 | 54321 | up     | 0.500000  | standby | 0          | false             | 0
(2 rows)
2023-09-20 10:40:01 attach end..
recovery success,exit script with success
---------------------------------------------------------------------
# 如上所示,备库恢复成功。
4、查看集群流复制状态
如下所示,备库恢复完成后,集群流复制状态正常:
# 复制槽状态
test=# select * from sys_replication_slots;
 slot_name  | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn
------------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+-
--------------------
 slot_node1 |        | physical  |        |          | f      |            |      |              |             |
 slot_node2 |        | physical  |        |          | t      |      19079 | 2106 |              | 0/1901E9D0  |
(2 rows)
# 流复制状态
test=# select * from sys_stat_replication;
  pid  | usesysid | usename | application_name |  client_addr  | client_hostname | client_port |         backend_
start         | backend_xmin |   state   | sent_location | write_location | flush_location | replay_location | sy
nc_priority | sync_state
-------+----------+---------+------------------+---------------+-----------------+-------------+-----------------
--------------+--------------+-----------+---------------+----------------+----------------+-----------------+---
------------+------------
 19079 |       10 | SYSTEM  | node2            | 192.168.1.202 |                 |       56354 | 2023-09-20 10:39
:55.269554+08 |              | streaming | 0/1901E9D0    | 0/1901E9D0     | 0/1901E9D0     | 0/1901E9D0      |
          2 | sync
(1 row)
5、集群节点状态
如下所示,集群节点状态正常:
test=# show pool_nodes;
 node_id |   hostname    | port  | status | lb_weight |  role   | select_cnt | load_balance_node | replication_de
lay
---------+---------------+-------+--------+-----------+---------+------------+-------------------+---------------
----
 0       | 192.168.1.201 | 54321 | up     | 0.500000  | primary | 0          | true              | 0
 1       | 192.168.1.202 | 54321 | up     | 0.500000  | standby | 0          | false             | 0
(2 rows)
四、总结
对于备库的手工promote为主库,一般用于集群failover切换失败,或需要强制提升备库为主库的场景。在执行完备库的promote后,注意对原主库的恢复,避免出现“双主”场景。
 
                    
                     
                    
                 
                    
                
 
 
                
            
         
         浙公网安备 33010602011771号
浙公网安备 33010602011771号