A ZooKeeper/Keeper metadata-residue problem hit when dropping and recreating a ClickHouse replicated table: after the table is dropped, its metadata in ZooKeeper/Keeper is not fully cleaned up, so recreating it fails with REPLICA_ALREADY_EXISTS.

1 Solutions

1.1 Option 1: wait a while before recreating (simplest; not recommended)

-- After the DROP, wait a few seconds to a few minutes so ZooKeeper/Keeper can finish its cleanup
DROP TABLE app_eap_user_tag_map_aggregation_da_153809 ON CLUSTER cluster_1s2r;

-- Wait 30-60 seconds. Note: SYSTEM SYNC REPLICA takes a table name, and the table
-- no longer exists at this point, so there is nothing to sync here - just wait.
-- Then recreate:
CREATE TABLE ...
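A simpler variant of this option, assuming the server version supports the `SYNC` modifier on `DROP TABLE`, is to make the drop itself synchronous so there is nothing to wait for:

```sql
-- With Atomic databases the actual deletion is normally delayed; SYNC forces
-- the drop (including its cleanup work) to complete before the statement returns.
DROP TABLE app_eap_user_tag_map_aggregation_da_153809 ON CLUSTER cluster_1s2r SYNC;
```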

1.2 Option 2: manually clean up the leftover ZooKeeper/Keeper data (recommended)

-- First drop the table itself
DROP TABLE app_eap_user_tag_map_aggregation_da_153809 ON CLUSTER cluster_1s2r;

-- Then manually delete the leftover paths in ZooKeeper/Keeper
SYSTEM DROP REPLICA 'replica_200' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809';
SYSTEM DROP REPLICA 'replica_201' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809';

-- Alternatively, if the table still exists and you know the replica name, you can use:
SYSTEM DROP REPLICA 'replica_200' FROM TABLE app_eap_user_tag_map_aggregation_da_153809;
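Before issuing `SYSTEM DROP REPLICA`, it can help to confirm which replica names are actually left behind. A sketch using the `system.zookeeper` table (the path below assumes the `{shard}` macro expanded to `1`, as in the error messages later in this post):

```sql
-- List the leftover replica znodes for this table
SELECT name
FROM system.zookeeper
WHERE path = '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809/replicas';
```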
zb-yunweitest-mysql-204-200 :) CREATE TABLE app_eap_user_tag_map_aggregation_da_153809 on cluster cluster_1s2r
(
    pguid Int64 COMMENT '会员id',
    dt Date COMMENT '日期',
    tag_map_state String COMMENT '标签key/标签values 聚合字段',
    created_time DateTime DEFAULT now() COMMENT '数据写入时间',
    tenant_channel String COMMENT '渠道编码',
    brand_id String COMMENT '品牌编码'
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/app/app_eap_user_tag_map_aggregation_da_153809', '{replica}')
ORDER BY (dt, pguid)
PARTITION BY dt
TTL dt + INTERVAL 7 DAY
SETTINGS index_granularity = 1024;

CREATE TABLE app_eap_user_tag_map_aggregation_da_153809 ON CLUSTER cluster_1s2r
(
    `pguid` Int64 COMMENT '会员id',
    `dt` Date COMMENT '日期',
    `tag_map_state` String COMMENT '标签key/标签values 聚合字段',
    `created_time` DateTime DEFAULT now() COMMENT '数据写入时间',
    `tenant_channel` String COMMENT '渠道编码',
    `brand_id` String COMMENT '品牌编码'
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/app/app_eap_user_tag_map_aggregation_da_153809', '{replica}')
PARTITION BY dt
ORDER BY (dt, pguid)
TTL dt + toIntervalDay(7)
SETTINGS index_granularity = 1024

Query id: 1e2a6304-d5be-408e-933d-f1ce40df7df3

┌─host───────────┬─port─┬─status─┬─error─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 172.21.204.200 │ 9000 │    253 │ Code: 253. DB::Exception: Replica /clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809/replicas/replica_200 already exists. (REPLICA_ALREADY_EXISTS) (version 23.8.16.40 (official build)) │                   1 │                0 │
│ 172.21.204.201 │ 9000 │    253 │ Code: 253. DB::Exception: Replica /clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809/replicas/replica_201 already exists. (REPLICA_ALREADY_EXISTS) (version 23.8.16.40 (official build)) │                   0 │                0 │
└────────────────┴──────┴────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────┴──────────────────┘
2 rows in set. Elapsed: 0.105 sec. 

Received exception from server (version 23.8.16):
Code: 253. DB::Exception: Received from localhost:9000. DB::Exception: There was an error on [172.21.204.200:9000]: Code: 253. DB::Exception: Replica /clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809/replicas/replica_200 already exists. (REPLICA_ALREADY_EXISTS) (version 23.8.16.40 (official build)). (REPLICA_ALREADY_EXISTS)

zb-yunweitest-mysql-204-200 :) 
zb-yunweitest-mysql-204-200 :) SYSTEM DROP REPLICA 'replica_200' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809';

SYSTEM DROP REPLICA 'replica_200' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809'

Query id: 78d0b0f8-8b8c-4c03-9ddb-6772a09dccf9

Ok.

0 rows in set. Elapsed: 0.007 sec. 

zb-yunweitest-mysql-204-200 :) SYSTEM DROP REPLICA 'replica_201' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809';

SYSTEM DROP REPLICA 'replica_201' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809'

Query id: 4ca7e258-53a2-41d1-ade4-59a385630550

Ok.

0 rows in set. Elapsed: 0.018 sec. 

zb-yunweitest-mysql-204-200 :) CREATE TABLE app_eap_user_tag_map_aggregation_da_153809 on cluster cluster_1s2r
(
    pguid Int64 COMMENT '会员id',
    dt Date COMMENT '日期',
    tag_map_state String COMMENT '标签key/标签values 聚合字段',
    created_time DateTime DEFAULT now() COMMENT '数据写入时间',
    tenant_channel String COMMENT '渠道编码',
    brand_id String COMMENT '品牌编码'
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/app/app_eap_user_tag_map_aggregation_da_153809', '{replica}')
ORDER BY (dt, pguid)
PARTITION BY dt
TTL dt + INTERVAL 7 DAY
SETTINGS index_granularity = 1024;

CREATE TABLE app_eap_user_tag_map_aggregation_da_153809 ON CLUSTER cluster_1s2r
(
    `pguid` Int64 COMMENT '会员id',
    `dt` Date COMMENT '日期',
    `tag_map_state` String COMMENT '标签key/标签values 聚合字段',
    `created_time` DateTime DEFAULT now() COMMENT '数据写入时间',
    `tenant_channel` String COMMENT '渠道编码',
    `brand_id` String COMMENT '品牌编码'
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/app/app_eap_user_tag_map_aggregation_da_153809', '{replica}')
PARTITION BY dt
ORDER BY (dt, pguid)
TTL dt + toIntervalDay(7)
SETTINGS index_granularity = 1024

Query id: 2a65d1eb-05d6-4515-a275-308e53eff9d9

┌─host───────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 172.21.204.200 │ 9000 │      0 │       │                   1 │                0 │
│ 172.21.204.201 │ 9000 │      0 │       │                   0 │                0 │
└────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.106 sec. 

zb-yunweitest-mysql-204-200 :) 

1.3 Option 3: recreate with a different ZooKeeper path

-- Add a timestamp or version suffix to the ZooKeeper path
ENGINE = ReplicatedAggregatingMergeTree(
    '/clickhouse/tables/{shard}/app/app_eap_user_tag_map_aggregation_da_153809_v2',  -- _v2 appended
    '{replica}')
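Put together with the table definition used above, the full statement would look like this (only the ZooKeeper path changes; the stale znodes under the old path remain and should still be cleaned up eventually):

```sql
CREATE TABLE app_eap_user_tag_map_aggregation_da_153809 ON CLUSTER cluster_1s2r
(
    pguid Int64 COMMENT '会员id',
    dt Date COMMENT '日期',
    tag_map_state String COMMENT '标签key/标签values 聚合字段',
    created_time DateTime DEFAULT now() COMMENT '数据写入时间',
    tenant_channel String COMMENT '渠道编码',
    brand_id String COMMENT '品牌编码'
)
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/app/app_eap_user_tag_map_aggregation_da_153809_v2', '{replica}')
PARTITION BY dt
ORDER BY (dt, pguid)
TTL dt + INTERVAL 7 DAY
SETTINGS index_granularity = 1024;
```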

1.4 Option 4: tune the coordination timeouts in the configuration

These settings do not change the cleanup logic itself, but shorter timeouts let dead sessions (and their ephemeral nodes) be detected and removed sooner. Add to config.xml, or to a file under config.d/:

<zookeeper>
    <session_timeout_ms>30000</session_timeout_ms>
    <operation_timeout_ms>10000</operation_timeout_ms>
</zookeeper>

<!-- The client side uses <zookeeper> even when the backend is ClickHouse Keeper;
     the server-side knobs live under keeper_server/coordination_settings -->
<keeper_server>
    <coordination_settings>
        <session_timeout_ms>30000</session_timeout_ms>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <dead_session_check_period_ms>500</dead_session_check_period_ms>
    </coordination_settings>
</keeper_server>

1.5 Option 5: check and repair the ZooKeeper/Keeper connection

-- Check the ZooKeeper/Keeper state
SELECT * FROM system.zookeeper WHERE path = '/clickhouse/tables/1/app' FORMAT Vertical;

-- Restart the replication threads for all tables across the cluster
SYSTEM RESTART REPLICAS ON CLUSTER cluster_1s2r;

1.6 Recommended end-to-end procedure

  1. First, confirm the table is fully dropped:

SHOW TABLES LIKE 'app_eap_user_tag_map_aggregation_da_153809';
SELECT * FROM system.tables WHERE name = 'app_eap_user_tag_map_aggregation_da_153809';
  2. Clean up on all nodes:

# Run on every node
clickhouse-client --query="SYSTEM DROP REPLICA 'replica_200' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809'"
clickhouse-client --query="SYSTEM DROP REPLICA 'replica_201' FROM ZKPATH '/clickhouse/tables/1/app/app_eap_user_tag_map_aggregation_da_153809'"
  3. Recreate the table.

  4. After recreation, verify replication is healthy:

SYSTEM SYNC REPLICA app_eap_user_tag_map_aggregation_da_153809;  -- sync this table's replica
SYSTEM RESTART REPLICAS ON CLUSTER cluster_1s2r;                 -- restart replication for all tables in the cluster

1.7 Preventive measures

  1. Add a wait after dropping a table

  2. Check that the ZooKeeper path is already clean before recreating

  3. Consider CREATE OR REPLACE TABLE (requires a relatively recent version)

  4. Periodically clean up unused ZooKeeper nodes
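The pre-recreation check in point 2 can be done with a single query. A sketch that tests whether the table's znode still exists (the path again assumes `{shard}` = `1`):

```sql
-- Returns 1 if the old table path is still present in ZooKeeper/Keeper
SELECT count() > 0 AS path_still_exists
FROM system.zookeeper
WHERE path = '/clickhouse/tables/1/app'
  AND name = 'app_eap_user_tag_map_aggregation_da_153809';
```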

Try option 1 (wait + retry) or option 2 (manual cleanup) first; in most cases waiting a while or cleaning up by hand resolves the problem. If it happens frequently, review the ZooKeeper/Keeper configuration and the network connectivity between nodes.

posted on 2025-12-16 16:10 by xibuhaohao