kingbase集群数据库疑问解决

第1章 存在问题

kingbase@GJHYZX-pc:~$ repmgr cluster show
[WARNING] node "node2" not found in "sys_stat_replication"
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string                                                                                                                                                     
----+-------+---------+-----------+----------+----------+----------+----------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.168.10.18 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | ! node1  | default  | 100      | 1        | 0 bytes | host=192.168.10.19 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000

[WARNING] following issues were detected
  - node "node2" (ID: 2) is not attached to its upstream node "node1" (ID: 1)

问题分析

kingbase@GJHYZX-pc:~/cluster/install/kingbase/etc$ repmgr -f ./repmgr.conf primary check
[WARNING] node "node2" not found in "sys_stat_replication"
Node "node2":
	Server role: OK (node is standby)
	Replication lag: OK (0 seconds)
	WAL archiving: OK (0 pending archive ready files)
	Upstream connection: CRITICAL (node "node2" (ID: 2) is not attached to expected upstream node "node1" (ID: 1))
	Downstream servers: OK (this node has no downstream nodes)
	Replication slots: OK (node has no physical replication slots)
	Missing physical replication slots: OK (node has no missing physical replication slots)
	Configured data directory: OK (configured "data_directory" is "/home/kingbase/cluster/install/kingbase/data")

第2章 解决思路(在从节点进行如下操作)

2.1 停止从节点

sys_ctl stop -D /home/kingbase/cluster/install/kingbase/data

2.2 将原来从节点数据库备份

mv /home/kingbase/cluster/install/kingbase/data /home/kingbase/cluster/install/kingbase/data_bak_20250929

2.3 克隆主节点数据库到从节点

repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf \
       standby clone \
       --host=192.168.10.18 \
       --port=54322 \
       --user=esrep \
       --dbname=esrep \
       --force
参数示例值说明
--host192.168.10.18✅ 主库(node1)的 IP 地址
--port54322✅ 主库的 Kingbase 端口(不是 repmgr 端口)
--useresrep✅ 用于复制的用户(在 sys_hba.conf 中授权的)
--dbnameesrep✅ 复制用户连接的数据库名(通常是 esrep
--force(无值)自动删除目标目录并克隆

-------------------------------------------------成功的样子---------------------------------------------------------------

kingbase@GJHYZX-pc:~/cluster/install/kingbase$ repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf \
>        standby clone \
>        --host=192.168.10.18 \
>        --port=54322 \
>        --user=esrep \
>        --dbname=esrep \
>        --force
[NOTICE] destination directory "/home/kingbase/cluster/install/kingbase/data" provided
[INFO] connecting to source node
[DETAIL] connection string is: host=192.168.10.18 port=54322 user=esrep dbname=esrep
[DETAIL] current installation size is 92 MB
WARNING:  configuration parameter "ora_input_emptystr_isnull" should not set in current database mode
[NOTICE] checking for available walsenders on the source node (2 required)
[NOTICE] checking replication connections can be made to the source server (2 required)
[INFO] creating directory "/home/kingbase/cluster/install/kingbase/data"...
[INFO] creating replication slot as user "esrep"
[NOTICE] starting backup (using sys_basebackup)...
[HINT] this may take some time; consider using the -c/--fast-checkpoint option
[INFO] executing:
  /home/kingbase/cluster/install/kingbase/bin/sys_basebackup -l "repmgr base backup"  -D /home/kingbase/cluster/install/kingbase/data -h 192.168.10.18 -p 54322 -U esrep -X stream -S repmgr_slot_2
[NOTICE] standby clone (using sys_basebackup) complete
[NOTICE] you can now start your Kingbase server
[HINT] for example: sys_ctl -D /home/kingbase/cluster/install/kingbase/data start
[HINT] after starting the server, you need to re-register this standby with "repmgr standby register --force" to update the existing node record

2.4 启动新克隆的数据库

sys_ctl start -D /home/kingbase/cluster/install/kingbase/data

成功的样子:

kingbase@GJHYZX-pc:~/cluster/install/kingbase$ sys_ctl start -D data
等待服务器进程启动 ....2025-09-29 19:12:15.540 CST [23244] LOG:  config the real archive_command string as soon as possible to archive WAL files
2025-09-29 19:12:15.545 CST [23244] LOG:  sepapower extension initialized
2025-09-29 19:12:15.549 CST [23244] LOG:  starting KingbaseES V008R006C009B0014 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
2025-09-29 19:12:15.549 CST [23244] LOG:  listening on IPv4 address "0.0.0.0", port 54322
2025-09-29 19:12:15.549 CST [23244] LOG:  listening on IPv6 address "::", port 54322
2025-09-29 19:12:15.551 CST [23244] LOG:  listening on Unix socket "/tmp/.s.KINGBASE.54322"
2025-09-29 19:12:15.707 CST [23244] LOG:  redirecting log output to logging collector process
2025-09-29 19:12:15.707 CST [23244] HINT:  Future log output will appear in directory "sys_log".
 完成
服务器进程已经启动

2.5 重新注册从库节点

repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf standby register --force

成功的样子

kingbase@GJHYZX-pc:~/cluster/install/kingbase$ repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf standby register --force
[INFO] connecting to local node "node2" (ID: 2)
[INFO] connecting to primary database
[INFO] standby registration complete
[NOTICE] standby node "node2" (ID: 2) successfully registered

2.6 集群状态查看

repmgr -f /home/kingbase/cluster/install/kingbase/etc/repmgr.conf cluster show

主库正常运行,从库正在跟随主库。

 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        |         | host=192.168.10.18 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | 0 bytes | host=192.168.10.19 user=esrep dbname=esrep port=54322 connect_timeout=10 keepalives=1 keepalives_idle=2 keepalives_interval=2 keepalives_count=3 tcp_user_timeout=9000

----------------------------------------------------------------------------------------------------------------

repmgr 的全称是 REPLication Manager,即“复制管理器”。


详细解释:

  • rep → 来自 Replication(复制)
  • mgr → 来自 Manager(管理器)

所以 repmgr = Replication Manager

posted @ 2025-10-25 15:27  ycfenxi  阅读(1)  评论(0)    收藏  举报