kingbase集群数据库疑问解决
第1章 存在问题
kingbase@GJHYZX-pc:~$ repmgr cluster show
[WARNING] node "node2" not found in "sys_stat_replication"
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | | host=192.168.10.18 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
2 | node2 | standby | running | ! node1 | default | 100 | 1 | 0 bytes | host=192.168.10.19 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000[WARNING] following issues were detected
- node "node2" (ID: 2) is not attached to its upstream node "node1" (ID: 1)

问题分析
kingbase@GJHYZX-pc:~/cluster/install/kingbase/etc$ repmgr -f ./repmgr.conf primary check
[WARNING] node "node2" not found in "sys_stat_replication"
Node "node2":
Server role: OK (node is standby)
Replication lag: OK (0 seconds)
WAL archiving: OK (0 pending archive ready files)
Upstream connection: CRITICAL (node "node2" (ID: 2) is not attached to expected upstream node "node1" (ID: 1))
Downstream servers: OK (this node has no downstream nodes)
Replication slots: OK (node has no physical replication slots)
Missing physical replication slots: OK (node has no missing physical replication slots)
Configured data directory: OK (configured "data_directory" is "/home/kingbase/cluster/install/kingbase/data")
第2章 解决思路(在从节点进行如下操作)
2.1 停止从节点
sys_ctl stop -D /home/kingbase/cluster/install/kingbase/data
2.2 将原来从节点数据库备份
mv /home/kingbase/cluster/install/kingbase/data /home/kingbase/cluster/install/kingbase/data_bak_20250929
2.3 克隆主节点数据库到从节点
repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf \
standby clone \
--host=192.168.10.18 \
--port=54322 \
--user=esrep \
--dbname=esrep \
--force
| 参数 | 示例值 | 说明 |
|---|---|---|
--host | 192.168.10.18 | ✅ 主库(node1)的 IP 地址 |
--port | 54322 | ✅ 主库的 Kingbase 端口(不是 repmgr 端口) |
--user | esrep | ✅ 用于复制的用户(在 sys_hba.conf 中授权的) |
--dbname | esrep | ✅ 复制用户连接的数据库名(通常是 esrep) |
--force | (无值) | 自动删除目标目录并克隆 |
-------------------------------------------------成功的样子---------------------------------------------------------------
kingbase@GJHYZX-pc:~/cluster/install/kingbase$ repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf \
> standby clone \
> --host=192.168.10.18 \
> --port=54322 \
> --user=esrep \
> --dbname=esrep \
> --force
[NOTICE] destination directory "/home/kingbase/cluster/install/kingbase/data" provided
[INFO] connecting to source node
[DETAIL] connection string is: host=192.168.10.18 port=54322 user=esrep dbname=esrep
[DETAIL] current installation size is 92 MB
WARNING: configuration parameter "ora_input_emptystr_isnull" should not set in current database mode
[NOTICE] checking for available walsenders on the source node (2 required)
[NOTICE] checking replication connections can be made to the source server (2 required)
[INFO] creating directory "/home/kingbase/cluster/install/kingbase/data"...
[INFO] creating replication slot as user "esrep"
[NOTICE] starting backup (using sys_basebackup)...
[HINT] this may take some time; consider using the -c/--fast-checkpoint option
[INFO] executing:
/home/kingbase/cluster/install/kingbase/bin/sys_basebackup -l "repmgr base backup" -D /home/kingbase/cluster/install/kingbase/data -h 192.168.10.18 -p 54322 -U esrep -X stream -S repmgr_slot_2
[NOTICE] standby clone (using sys_basebackup) complete
[NOTICE] you can now start your Kingbase server
[HINT] for example: sys_ctl -D /home/kingbase/cluster/install/kingbase/data start
[HINT] after starting the server, you need to re-register this standby with "repmgr standby register --force" to update the existing node record
2.4 启动新克隆的数据库
sys_ctl start -D /home/kingbase/cluster/install/kingbase/data
成功的样子:
kingbase@GJHYZX-pc:~/cluster/install/kingbase$ sys_ctl start -D data
等待服务器进程启动 ....2025-09-29 19:12:15.540 CST [23244] LOG: config the real archive_command string as soon as possible to archive WAL files
2025-09-29 19:12:15.545 CST [23244] LOG: sepapower extension initialized
2025-09-29 19:12:15.549 CST [23244] LOG: starting KingbaseES V008R006C009B0014 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
2025-09-29 19:12:15.549 CST [23244] LOG: listening on IPv4 address "0.0.0.0", port 54322
2025-09-29 19:12:15.549 CST [23244] LOG: listening on IPv6 address "::", port 54322
2025-09-29 19:12:15.551 CST [23244] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54322"
2025-09-29 19:12:15.707 CST [23244] LOG: redirecting log output to logging collector process
2025-09-29 19:12:15.707 CST [23244] HINT: Future log output will appear in directory "sys_log".
完成
服务器进程已经启动
2.5 重新注册从库节点
repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf standby register --force
成功的样子
kingbase@GJHYZX-pc:~/cluster/install/kingbase$ repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf standby register --force
[INFO] connecting to local node "node2" (ID: 2)
[INFO] connecting to primary database
[INFO] standby registration complete
[NOTICE] standby node "node2" (ID: 2) successfully registered
2.6 集群状态查看
repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf cluster show
主库正常运行,从库正在跟随主库。
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | | host=192.168.10.18 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
2 | node2 | standby | running | node1 | default | 100 | 1 | 0 bytes | host=192.168.10.19 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
----------------------------------------------------------------------------------------------------------------
repmgr 的全称是 REPLication Manager,即“复制管理器”。
详细解释:
- rep → 来自 Replication(复制)
- mgr → 来自 Manager(管理器)
所以 repmgr = Replication Manager
浙公网安备 33010602011771号