Elasticsearch Expert Series, Advanced Track, Lecture 110: The translog in detail

Reference blog: https://blog.csdn.net/star1210644725/article/details/123564559

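Why does ES need a translog?
ES is a near-real-time storage engine (and search engine). Near real time means that when a document is added or modified, it is not guaranteed to be visible immediately. By the time the data becomes visible, it has already been written as a new segment into the filesystem cache (this process is called refresh). Because each such write is relatively expensive, accumulating changes and writing them in batches gives better write performance. Whether the data is still in heap memory or already in the filesystem cache, it has not yet been committed to disk, so a power failure or a JVM crash would lose it. To prevent that, a copy of each operation is also written to the translog. Writing to the translog is far cheaper than writing a commit point to the shard (the Lucene instance).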

So the translog is essentially a compensation mechanism whose job is to prevent data loss.

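When is the translog written, when is it used, and when is it cleaned up?
As mentioned above, whenever a write or update request is received, a copy of the operation is written to the translog.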

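If a power failure or some other fault occurs, this copy comes into play. The most recent changes have not yet been written to disk (the process of persisting data to disk is called flush); they may still be in memory or in the filesystem cache. When a failure wipes them out, the operations can be read back from the translog and replayed.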

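The translog is cleaned up when the data is flushed from the filesystem cache to disk. At that point this portion of the translog has lost its value, so it should be removed. It is like hiring a mercenary to escort you from Syria back to China: once you are safely home, the mercenary has served his purpose and is free to go wherever he likes.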

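What is the difference between refresh and flush?
A refresh moves data from heap memory into the filesystem cache. At that point the data goes from invisible to searchable.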

A flush moves data from the filesystem cache to disk. From then on the data will not be lost unless the disk itself fails.
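The lifecycle described so far can be sketched as a toy model. This is only an illustration of the idea, not how Elasticsearch or Lucene is implemented: the class `ToyShard` and all of its fields are hypothetical, and the translog is assumed to have been fsynced so that it survives the simulated crash.

```python
class ToyShard:
    """Toy model of one shard: buffer -> refresh -> flush, plus a translog."""

    def __init__(self):
        self.buffer = []      # written but not yet searchable (heap memory)
        self.searchable = []  # refreshed: searchable, but not yet committed
        self.committed = []   # flushed: safely on disk
        self.translog = []    # operations recorded since the last flush

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)  # every write is also logged

    def refresh(self):
        # Make buffered docs visible to search; nothing is committed yet.
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def flush(self):
        # Commit everything to "disk" and start a new, empty translog.
        self.refresh()
        self.committed.extend(self.searchable)
        self.searchable.clear()
        self.translog.clear()

    def search(self):
        return self.committed + self.searchable

    def crash_and_recover(self):
        # Uncommitted state is lost; the translog (assumed fsynced) survives
        # and its operations are replayed.
        self.buffer.clear()
        self.searchable.clear()
        replay = list(self.translog)
        self.translog.clear()
        for doc in replay:
            self.index(doc)
        self.refresh()


shard = ToyShard()
shard.index("a")
visible_before_refresh = shard.search()  # "a" is not searchable yet
shard.refresh()
visible_after_refresh = shard.search()   # now "a" is searchable
shard.flush()                            # "a" is committed, translog emptied
shard.index("b")
shard.crash_and_recover()                # "b" comes back from the translog
visible_after_recovery = shard.search()
```

In this sketch the document written after the last flush survives the crash only because it was recorded in the translog, and the translog is emptied only by `flush()`.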

The translog write path and translog settings
Although the translog exists to prevent data loss, it carries its own risk of losing data.

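Writing to the translog does not necessarily mean that every write or update operation reaches the disk immediately. The translog has a synchronous mode and an asynchronous mode, and in asynchronous mode the translog is persisted to disk only when it is fsynced. The setting "index.translog.sync_interval" determines how often that fsync is triggered. The default is 5s, which means that if power is lost within those 5 seconds, the data written in that window is lost as well.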

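The setting "index.translog.durability" controls whether the translog is synchronous or asynchronous, that is, whether an fsync is triggered for every write or update request. The ES default is index.translog.durability=request, so every request triggers an fsync.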

If you set index.translog.durability=async, you have to accept the risk of losing data.
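To get a feel for the size of that risk, here is a rough back-of-the-envelope estimate. The function name and the write rate of 2000 operations per second are hypothetical, and the estimate simply assumes a steady write rate over one full sync interval.

```python
def max_ops_at_risk(writes_per_second: float, sync_interval_seconds: float = 5.0) -> float:
    """Rough upper bound on acknowledged operations not yet fsynced in async mode."""
    return writes_per_second * sync_interval_seconds


# Hypothetical workload: 2000 writes per second with the default 5s interval.
at_risk = max_ops_at_risk(2000)
print(at_risk)  # 10000.0
```

Shortening the interval shrinks this window proportionally, at the cost of more frequent fsyncs.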

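The translog cannot be allowed to grow without bound either, because its size determines how long recovery takes after a crash. If it is too large, recovery can take a very long time. The setting "index.translog.flush_threshold_size" specifies the maximum size of the translog. The default is 512mb, meaning that once the translog exceeds 512 MB a flush is triggered, which persists the data from the filesystem cache to disk and clears the translog.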
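The three settings discussed above could be assembled into an index settings body roughly as follows. This is a minimal sketch: the helper `translog_settings` is hypothetical, the values shown are the defaults quoted in this post, and the body is only built here, not sent to a cluster (it would normally go to the index `_settings` endpoint).

```python
import json

ALLOWED_DURABILITY = {"request", "async"}


def translog_settings(durability="request", sync_interval="5s",
                      flush_threshold_size="512mb"):
    """Build a settings body for the three translog settings (hypothetical helper)."""
    if durability not in ALLOWED_DURABILITY:
        raise ValueError(f"durability must be one of {sorted(ALLOWED_DURABILITY)}")
    return {
        "index.translog.durability": durability,
        "index.translog.sync_interval": sync_interval,
        "index.translog.flush_threshold_size": flush_threshold_size,
    }


# Example: switch to async mode while keeping the other defaults.
body = json.dumps(translog_settings(durability="async"), indent=2)
print(body)
```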

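Tuning: the translog can affect write performance
The settings above are essentially all of the translog settings that can be changed. They are the ones I could find in the official documentation.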

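Based on the analysis above, there is a trade-off we can make. If we can tolerate some data loss, we can write the translog asynchronously. In request mode, a write is reported as successful only after the translog has been fsynced. In async mode, the request can return without waiting for that fsync.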

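Many articles do not mention this optimization, and the official documentation does not call it out either. We have to work it out for ourselves.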

How much performance this actually gains has to be determined by load testing.
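As a local analogy only (not a measurement of Elasticsearch itself), the sketch below times appending records to a plain file with an fsync after every record versus a single fsync at the end. The file names, record size, and record count are arbitrary assumptions, and the timings depend entirely on the disk and the operating system.

```python
import os
import tempfile
import time


def write_records(path, records, fsync_every_write):
    """Write records to path and return the elapsed time in seconds."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for record in records:
            f.write(record)
            if fsync_every_write:
                f.flush()
                os.fsync(f.fileno())  # analogous to durability=request
        f.flush()
        os.fsync(f.fileno())          # one final sync, like a periodic fsync
    return time.perf_counter() - start


records = [b"x" * 200 + b"\n" for _ in range(200)]
with tempfile.TemporaryDirectory() as tmp:
    sync_path = os.path.join(tmp, "sync.log")
    async_path = os.path.join(tmp, "async.log")
    sync_seconds = write_records(sync_path, records, fsync_every_write=True)
    async_seconds = write_records(async_path, records, fsync_every_write=False)
    sizes_match = os.path.getsize(sync_path) == os.path.getsize(async_path)

print(f"fsync per write: {sync_seconds:.4f}s, single fsync: {async_seconds:.4f}s")
```

Both runs write the same bytes; only the number of fsync calls differs. For a real cluster, the comparison should be made with a proper load test against the actual indexing workload.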

Official ES documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-translog.html

Translog
Changes to Lucene are only persisted to disk during a Lucene commit, which is a relatively expensive operation and so cannot be performed after every index or delete operation. Changes that happen after one commit and before another will be removed from the index by Lucene in the event of process exit or hardware failure.

Lucene commits are too expensive to perform on every individual change, so each shard copy also writes operations into its transaction log known as the translog. All index and delete operations are written to the translog after being processed by the internal Lucene index but before they are acknowledged. In the event of a crash, recent operations that have been acknowledged but not yet included in the last Lucene commit are instead recovered from the translog when the shard recovers.

An Elasticsearch flush is the process of performing a Lucene commit and starting a new translog generation. Flushes are performed automatically in the background in order to make sure the translog does not grow too large, which would make replaying its operations take a considerable amount of time during recovery. The ability to perform a flush manually is also exposed through an API, although this is rarely needed.
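For completeness, a manual flush request could be built roughly like this. The sketch only constructs the HTTP request and does not send it; the helper name, the index name `my-index`, and the address `http://localhost:9200` are assumptions for illustration.

```python
import urllib.request


def flush_request(index, base_url="http://localhost:9200"):
    """Build (but do not send) a POST /<index>/_flush request."""
    return urllib.request.Request(f"{base_url}/{index}/_flush", method="POST")


req = flush_request("my-index")
print(req.get_method(), req.full_url)

# To send it against a running cluster (not executed here):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```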

Translog settings
The data in the translog is only persisted to disk when the translog is fsynced and committed. In the event of a hardware failure or an operating system crash or a JVM crash or a shard failure, any data written since the previous translog commit will be lost.

By default, index.translog.durability is set to request meaning that Elasticsearch will only report success of an index, delete, update, or bulk request to the client after the translog has been successfully fsynced and committed on the primary and on every allocated replica. If index.translog.durability is set to async then Elasticsearch fsyncs and commits the translog only every index.translog.sync_interval which means that any operations that were performed just before a crash may be lost when the node recovers.

The following dynamically updatable per-index settings control the behaviour of the translog:

index.translog.sync_interval
How often the translog is fsynced to disk and committed, regardless of write operations. Defaults to 5s. Values less than 100ms are not allowed.
index.translog.durability
Whether or not to fsync and commit the translog after every index, delete, update, or bulk request. This setting accepts the following parameters:

request
(default) fsync and commit after every request. In the event of hardware failure, all acknowledged writes will already have been committed to disk.
async
fsync and commit in the background every sync_interval. In the event of a failure, all acknowledged writes since the last automatic commit will be discarded.
index.translog.flush_threshold_size
The translog stores all operations that are not yet safely persisted in Lucene (i.e., are not part of a Lucene commit point). Although these operations are available for reads, they will need to be replayed if the shard was stopped and had to be recovered. This setting controls the maximum total size of these operations, to prevent recoveries from taking too long. Once the maximum size has been reached a flush will happen, generating a new Lucene commit point. Defaults to 512mb.


posted on 2024-04-23 22:47 by luzhouxiaoshuai