Installing and Using elasticdump, a Backup Tool for Elasticsearch

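Backing up data is essential in a production environment, so how do you back up Elasticsearch data? Of the approaches I looked at, elasticdump stood out: it is an open-source tool for importing and exporting Elasticsearch data, and it is easy to use.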

Official repository: https://github.com/taskrabbit/elasticsearch-dump

The installation procedure is described below.

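Step 1: Install Node.js
Download the package (recent elasticdump releases need a newer Node.js than v0.10, so substitute a current version if required): wget http://nodejs.org/dist/v0.10.32/node-v0.10.32-linux-x64.tar.gz
Extract it: tar xvf node-v0.10.32-linux-x64.tar.gz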

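Configure the environment variables:
Add the following to /etc/profile (NODE_HOME assumes the archive was extracted under /home):
export NODE_HOME=/home/node-v0.10.32-linux-x64
export PATH=$PATH:$NODE_HOME/bin
export NODE_PATH=$NODE_HOME/lib/node_modules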

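Run source /etc/profile to apply the environment variables.
Verify the installation:
Enter node -v in a terminal
Enter npm -v in a terminal
If both print a version number, the installation succeeded.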
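
The same check can be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper name check_tool is made up for this illustration.

```shell
# Report whether a command is available on PATH; returns non-zero if it is not.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found: $(command -v "$1")"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# Check the tools this guide relies on without aborting on the first miss.
for tool in node npm; do
  check_tool "$tool" || true
done
```

If either tool is reported as missing, re-check the NODE_HOME and PATH lines added to /etc/profile.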

Step 2: Install elasticdump
npm install elasticdump -g
elasticdump

Usage

elasticdump --input=http://192.168.0.92:9200/hs2840 --output=./hs2840_data_201905.json --type=data

--type: selects what to export, for example data or mapping.

Backup examples:

# Copy from the production.es.com host to the staging.es.com host

elasticdump  --input=http://production.es.com:9200/my_index  --output=http://staging.es.com:9200/my_index   --type=mapping
elasticdump  --input=http://production.es.com:9200/my_index  --output=http://staging.es.com:9200/my_index   --type=data

# Back up to files

elasticdump  --input=http://production.es.com:9200/my_index  --output=/data/my_index_mapping.json  --type=mapping 
elasticdump  --input=http://production.es.com:9200/my_index  --output=/data/my_index.json         --type=data
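
For recurring backups it can help to put a date stamp in the file names, as in the hs2840_data_201905.json example above. This is a rough sketch only: the helper names backup_file and backup_index, the ELASTICDUMP override variable, and the year-month stamp format are assumptions made for this illustration.

```shell
# ELASTICDUMP can be overridden (for example with "echo") for a dry run.
ELASTICDUMP="${ELASTICDUMP:-elasticdump}"

# Print "<dir>/<index>_<kind>_<YYYYMM>.json"; the stamp defaults to the current month.
# Arguments: dir, index, kind, optional stamp.
backup_file() {
  printf '%s/%s_%s_%s.json\n' "$1" "$2" "$3" "${4:-$(date +%Y%m)}"
}

# Dump the mapping first, then the data, into date-stamped files.
# Arguments: Elasticsearch base URL, index, target directory.
backup_index() {
  "$ELASTICDUMP" --input="$1/$2" --output="$(backup_file "$3" "$2" mapping)" --type=mapping &&
  "$ELASTICDUMP" --input="$1/$2" --output="$(backup_file "$3" "$2" data)" --type=data
}
```

Calling backup_index http://production.es.com:9200 my_index /data would then write /data/my_index_mapping_<YYYYMM>.json and /data/my_index_data_<YYYYMM>.json.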

# Back up and compress (--output=$ writes to standard output)
elasticdump  --input=http://production.es.com:9200/my_index  --output=$  | gzip > /data/my_index.json.gz
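
A compressed dump can be sanity-checked before it is relied on. The sketch below assumes the data dump holds one JSON document per line, which is how elasticdump writes data files as far as I know; the helper name verify_dump is made up for this illustration.

```shell
# Test the gzip stream, then print the number of lines (documents) it contains.
verify_dump() {
  gzip -t < "$1" && gzip -dc < "$1" | wc -l | tr -d ' '
}
```

For example, verify_dump /data/my_index.json.gz prints the document count, which can be compared with the count reported by the source index.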

# Back up the results of a query
elasticdump  --input=http://production.es.com:9200/my_index  --output=query.json \
--searchBody '{"query":{"term":{"username": "admin"}}}'

Restoring data
elasticdump --input=/tmp/indexname.json --output=http://192.168.1.144:9200/mytest --type=data --limit=10000 --concurrency=20

ElasticSearch in Practice: Exporting and Importing Data with elasticdump

elasticdump is a command-line tool for backing up and migrating Elasticsearch data. The steps below show how to export and import data with it in practice:

1. Install elasticdump
Make sure Node.js is installed, then install elasticdump with npm:

npm install -g elasticdump
2. Export data
Export an entire index
To export the index my_index to a local JSON file my_index_dump.json:

elasticdump \
--input=http://localhost:9200/my_index \
--output=my_index_dump.json \
--type=data

Export the index mapping
To export only the index's mapping (its structure) without the data, specify --type=mapping:

elasticdump \
--input=http://localhost:9200/my_index \
--output=my_index_mapping.json \
--type=mapping

Export the index settings
To export the index's settings, use --type=settings:

elasticdump \
--input=http://localhost:9200/my_index \
--output=my_index_settings.json \
--type=settings
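
The three exports above can be run in one pass. This is only a sketch: the function name export_index, the ELASTICDUMP override variable, and the <index>_<type>.json naming are assumptions made for this illustration.

```shell
# ELASTICDUMP can be overridden (for example with "echo") for a dry run.
ELASTICDUMP="${ELASTICDUMP:-elasticdump}"

# Export settings, mapping and data of one index to <dir>/<index>_<type>.json.
# Arguments: Elasticsearch base URL, index, target directory.
export_index() {
  for type in settings mapping data; do
    "$ELASTICDUMP" --input="$1/$2" --output="$3/${2}_${type}.json" --type="$type" || return 1
  done
}
```

For example, export_index http://localhost:9200 my_index . writes my_index_settings.json, my_index_mapping.json and my_index_data.json to the current directory, stopping at the first failed export.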

3. Import data
Import into an empty index
To import the previously exported my_index_dump.json into an empty index my_index in the target Elasticsearch environment:

elasticdump \
--input=my_index_dump.json \
--output=http://target_es_host:9200/my_index \
--type=data

Make sure the target index does not exist or is empty, to avoid data conflicts.
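
Whether the target index already exists can be checked before importing. The sketch below relies on Elasticsearch answering a HEAD request on /<index> with 200 when the index exists and 404 when it does not; the function name index_exists and the CURL override variable are assumptions made for this illustration. On a cluster with security enabled, an unauthenticated request returns 401, which this sketch treats the same as "not found".

```shell
# CURL can be overridden to point at a different binary or a stub.
CURL="${CURL:-curl}"

# Succeeds (exit status 0) only if HEAD <es_url>/<index> answers 200.
# Arguments: Elasticsearch base URL, index.
index_exists() {
  [ "$("$CURL" -s -o /dev/null -w '%{http_code}' -I "$1/$2")" = "200" ]
}
```

A guard such as "index_exists http://target_es_host:9200 my_index && echo 'my_index already exists'" can then be placed in front of the import.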

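Overwrite an existing index
To overwrite data in an existing index, the --overwrite parameter can be added. Note that elasticdump documents this flag as overwriting an existing output file; when the output is an index, documents whose _id already exists are replaced by the imported versions in any case: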

elasticdump \
--input=my_index_dump.json \
--output=http://target_es_host:9200/my_index \
--type=data \
--overwrite

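Incremental import
elasticdump can be used for incremental transfers when the documents carry a timestamp or a similar steadily increasing field: restrict the export with --searchBody and a range query on that field. This is typically used for periodic incremental backup and restore. The --last-modified parameter sometimes cited for this does not, as far as I can tell, appear in elasticdump's documented options, so check the elasticdump documentation for your version.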
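
One way to approximate an incremental export is to pass a range query through --searchBody, the option used in the query backup example earlier. The sketch below only builds the query body; the function name range_search_body and the field name updated_at are hypothetical, and the arguments are inserted as-is without JSON escaping.

```shell
# Print a search body selecting documents whose <field> is >= <since>.
# Arguments: field name, lower bound (for example an ISO date).
range_search_body() {
  printf '{"query":{"range":{"%s":{"gte":"%s"}}}}\n' "$1" "$2"
}

# Possible use (not executed here):
#   elasticdump --input=http://localhost:9200/my_index --output=my_index_delta.json \
#     --type=data --searchBody="$(range_search_body updated_at 2019-05-01)"
```

The lower bound would normally be the time of the previous successful export, which the calling script has to record itself.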

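Other options and notes
Concurrency and batching: --limit sets how many documents each read/write operation moves, and --concurrency sets the number of concurrent requests; tuning both can speed up imports and exports.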

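Authentication: if the Elasticsearch cluster has security enabled, embed the credentials in the URL (http://user:password@host:9200/index) or keep them off the command line with --httpAuthFile; custom authentication headers such as a JWT can be passed with --headers. The --username and --password flags sometimes cited for this do not, as far as I can tell, appear in elasticdump's documented options.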

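SSL: for Elasticsearch clusters served over HTTPS, pass the CA certificate with --input-ca or --output-ca. To skip certificate verification in a non-production environment, set the Node.js environment variable NODE_TLS_REJECT_UNAUTHORIZED=0 for the elasticdump process; the --noVerifyCert flag sometimes cited for this does not, as far as I can tell, appear in elasticdump's documented options.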

Index state: before importing data, make sure the target index's mapping and settings match the exported data. If necessary, import the mapping and settings first, then the data.

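Disk space: exports of large data sets can take up a lot of disk space. Make sure there is enough room for temporary files and the final export files.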

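Performance impact: large migrations can affect cluster performance. Run them during off-peak hours, or consider other methods such as snapshot-based migration.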
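
The disk space note can be turned into a pre-flight check. This is a minimal sketch using the POSIX output format of df; the function names free_kb and has_space are made up for this illustration, and the required size has to be estimated by the caller, for example from the index size reported by the cluster.

```shell
# Print the free space, in KiB, on the filesystem holding <dir>.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds if <dir> has at least <required_kb> KiB free.
# Arguments: directory, required space in KiB.
has_space() {
  [ "$(free_kb "$1")" -ge "$2" ]
}
```

For example, "has_space /data 10485760 || echo 'less than 10 GiB free in /data'" warns before a large export is started.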

These steps cover exporting and importing Elasticsearch data with elasticdump. Adjust the parameters to your needs so that the migration goes smoothly.

With username and password authentication

elasticdump --input=http://user:passwd@127.0.0.1:9200/exploit --output=my_index_mapping.json --type=mapping

elasticdump --input=http://user:passwd@127.0.0.1:9200/exploit --output=my_index_data.json --type=data
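
Credentials embedded in the URL end up in the shell history and the process list. As an alternative, elasticdump's README describes an --httpAuthFile option that reads them from a file with user= and password= lines. The sketch below writes such a file with restrictive permissions; the function name write_auth_file and the file path are assumptions, and the file format is given as I understand it from the README, so verify it against your version.

```shell
# Write an auth file readable only by the current user.
# Arguments: file path, username, password.
write_auth_file() {
  ( umask 077 && printf 'user=%s\npassword=%s\n' "$2" "$3" > "$1" )
}

# Possible use (not executed here):
#   write_auth_file "$HOME/.es_auth" user passwd
#   elasticdump --httpAuthFile="$HOME/.es_auth" \
#     --input=http://127.0.0.1:9200/exploit --output=my_index_data.json --type=data
```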

1. Back up data

--limit=10000 moves ten thousand documents per batch, which speeds up the transfer. Reference: https://www.cnblogs.com/kerwinC/p/6296675.html

elasticdump --input=http://192.168.1.49:9200/mytest --output=/tmp/indexname.json --type=data --limit=10000 --concurrency=1
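
To get a feel for what a given --limit means, the number of batches is the document count divided by the limit, rounded up. A small sketch of that arithmetic; the function name batch_count is made up for this illustration.

```shell
# Print the number of batches needed to move <total> documents at <limit> per batch.
batch_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, batch_count 25000 10000 prints 3: two full batches of 10000 documents and one final batch of 5000.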

Posted 2021-08-20 17:15 by 羊脂玉净瓶
