clickhouse-copier (unfinished)

Environment:
Source -- cluster
192.168.1.118
192.168.1.134
192.168.1.135

Destination -- single node
192.168.1.85

 

1. Create the database and table on the source
Here the local table is created on 192.168.1.118.
Create the database:
create database db_test;

Create the table:
create table db_test.worker
(
id Int32,
name String,
gender Int16,
birthday Date,
cert_num String,
department_id Int32
)
ENGINE = MergeTree()
PARTITION BY department_id
PRIMARY KEY id
ORDER BY id;

Insert data:
insert into db_test.worker values(1,'name1',1,now(),'12',10);
insert into db_test.worker values(2,'name2',1,now(),'12',11);
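
After the two inserts, a quick sanity check confirms the rows landed and that PARTITION BY department_id produced one part per department. A sketch, assuming clickhouse-client runs locally on the source node:

```shell
# Count the rows just inserted (assumes clickhouse-client on 192.168.1.118).
clickhouse-client --query "SELECT count() FROM db_test.worker"

# Each distinct department_id becomes its own partition, so the two rows
# above should appear as partitions 10 and 11 in system.parts.
clickhouse-client --query "SELECT partition, rows FROM system.parts
  WHERE database = 'db_test' AND table = 'worker' AND active"
```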

 

2. Under /etc/clickhouse-server/, create the directory hierarchy copytasks/task1/
Run this on node 192.168.1.118:
[root@localhost clickhouse-server]# cd /etc/clickhouse-server/
[root@localhost clickhouse-server]# mkdir -p copytasks/task1

 

3. Create the copy-task file schema.xml
This file describes the shard layout of the source and target clusters and the tables to be copied. (Here only 192.168.1.118 is listed for the source; for a multi-shard source cluster, add one <shard> entry per shard.)
vi /etc/clickhouse-server/copytasks/task1/schema.xml

<yandex>
    <remote_servers>
        <source_cluster>
            <shard>
                <weight>1</weight>
                <replica>
                    <host>192.168.1.118</host>
                    <port>9000</port>
                </replica>
            </shard>
        </source_cluster>
        <target_cluster>
            <shard>
                <weight>1</weight>
                <replica>
                    <host>192.168.1.85</host>
                    <port>9000</port>
                </replica>
            </shard>
        </target_cluster>
    </remote_servers>
    <!-- How many simultaneously active workers are possible. If you run more workers superfluous workers will sleep. -->
    <max_workers>2</max_workers>
    <!-- Setting used to fetch (pull) data from source cluster tables -->
    <settings_pull>
        <readonly>1</readonly>
    </settings_pull>
    <!-- Setting used to insert (push) data to destination cluster tables -->
    <settings_push>
        <readonly>0</readonly>
    </settings_push>
    <!-- Common setting for fetch (pull) and insert (push) operations. Also, copier process context uses it.

         They are overlaid by <settings_pull/> and <settings_push/> respectively. -->
    <settings>
        <connect_timeout>3</connect_timeout>
        <!-- Sync insert is set forcibly, leave it here just in case. -->
        <insert_distributed_sync>1</insert_distributed_sync>
    </settings>
    <tables>
        <table_person>
            <cluster_pull>source_cluster</cluster_pull>
            <database_pull>db_test</database_pull>
            <table_pull>worker</table_pull>
            <cluster_push>target_cluster</cluster_push>
            <database_push>db_test</database_push>
            <table_push>worker</table_push>
            <engine> ENGINE = MergeTree()
                PARTITION BY department_id
                PRIMARY KEY id
                ORDER BY id
                SETTINGS index_granularity = 8192
           </engine>
        </table_person>
    </tables>
</yandex>
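
Before uploading schema.xml to ZooKeeper in step 4, it is worth checking that the file is well-formed — a typo here only surfaces later as a copier failure. A minimal sketch using Python's standard-library XML parser (assumes python3 is available on the node):

```shell
# A well-formed file prints a confirmation; a broken one raises ParseError
# with the offending line number.
python3 -c "import xml.etree.ElementTree as ET; ET.parse('/etc/clickhouse-server/copytasks/task1/schema.xml'); print('schema.xml OK')"
```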

 


4. Go to the ZooKeeper bin directory and create the copy task in ZooKeeper with the following commands:
cd /opt/apache-zookeeper-3.7.0-bin/bin/
./zkCli.sh -server 192.168.1.85:2181 create /clickhouse/copytasks ""
./zkCli.sh -server 192.168.1.85:2181 create /clickhouse/copytasks/task1 ""
./zkCli.sh -server 192.168.1.85:2181 create /clickhouse/copytasks/task1/description "`cat /etc/clickhouse-server/copytasks/task1/schema.xml`"

Verify the upload:
./zkCli.sh -server 192.168.1.85:2181 get /clickhouse/copytasks/task1/description

 

5. In /etc/clickhouse-server/copytasks/task1, create the file zookeeper.xml. It tells clickhouse-copier which ZooKeeper to connect to; this must be the same ZooKeeper ensemble that holds the task created in step 4, or the copier will not find it. Contents:

vi /etc/clickhouse-server/copytasks/task1/zookeeper.xml
<yandex>
    <logger>
        <level>trace</level>
        <size>100M</size>
        <count>3</count>
    </logger>
    <zookeeper>
        <node index="1">
            <host>192.168.1.118</host>
            <port>2182</port>
        </node>
    </zookeeper>
</yandex>

 

 

6. Now run the clickhouse-copier job with the following command:
clickhouse-copier --config-file=/etc/clickhouse-server/copytasks/task1/zookeeper.xml \
    --task-path=/clickhouse/copytasks/task1 \
    --base-dir /etc/clickhouse-server/copytasks/task1/
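
When the copier finishes, the simplest verification is to compare row counts on both ends. A sketch, assuming clickhouse-client can reach both nodes:

```shell
# Row count on the source shard...
clickhouse-client --host 192.168.1.118 --query "SELECT count() FROM db_test.worker"
# ...should match the destination once the task completes.
clickhouse-client --host 192.168.1.85 --query "SELECT count() FROM db_test.worker"
```

clickhouse-copier also writes its logs and working files under the directory given by --base-dir, which is the first place to look if the counts do not match.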

 

posted @ 2022-01-18 15:12  slnngk