ES Memory Usage Analysis and Circuit Breaker Settings

Memory Usage

https://www.elastic.co/guide/cn/elasticsearch/guide/current/_limiting_memory_usage.html

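The ES JVM heap can be divided by usage into a GC-able part and a resident part. Memory in the GC-able part is reclaimed by garbage collection; the resident part is not garbage collected and is usually evicted with an LRU policy. The figure below shows the memory layout: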

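The common space includes the indexing buffer and the other classes ES needs at runtime. The indexing buffer is controlled by indices.memory.index_buffer_size and by default takes at most 10% of the heap. When it fills up, its data is flushed to the corresponding segments on disk, so this space can be reclaimed and reused.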

queryCache is the node-level cache of filter results. Its size is controlled by indices.queries.cache.size (default 10%), and entries are evicted with an LRU policy.

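requestCache is the shard-level cache of query results; usually only requests of size 0, such as aggregations, counts and suggestions, are cached. Entries are evicted with an LRU policy. Its size is controlled by indices.requests.cache.size (default 1%), and the setting applies to the whole node.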

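fieldDataCache serves text fields, which have no doc values (doc values are effectively a columnar store). When you sort or aggregate on a text field, all of that field's content has to be loaded into memory, and this data lives in the fieldDataCache. Its size is limited by indices.fielddata.cache.size, which is unbounded by default. In that case memory usage keeps growing until the circuit breaker trips, after which new data cannot be loaded.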
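The LRU eviction mentioned for the caches above can be pictured with a toy model. This is only a sketch under assumptions: the class name SizeBoundedLRU and its methods are hypothetical and are not Elasticsearch's implementation. It shows why a bounded cache stays within its budget while an unbounded one can only grow.

```python
from collections import OrderedDict


class SizeBoundedLRU:
    """Toy LRU cache bounded by total bytes (hypothetical, not ES code)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> size in bytes, oldest first

    def get(self, key):
        """Return the entry size, marking the entry as most recently used."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, size):
        """Insert an entry and return the keys evicted to stay in budget."""
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)
        self._entries[key] = size
        self.used_bytes += size
        evicted = []
        # Drop least recently used entries until the budget is respected.
        while self.used_bytes > self.max_bytes and self._entries:
            old_key, old_size = self._entries.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_key)
        return evicted


cache = SizeBoundedLRU(max_bytes=100)
cache.put("field_a", 40)
cache.put("field_b", 40)
cache.get("field_a")                 # field_a is now the most recent entry
evicted = cache.put("field_c", 40)   # over budget, so field_b is evicted
```

With no max_bytes bound, nothing would ever be evicted, which corresponds to the default fielddata behaviour described above.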

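segmentsMemory caches segment information, including the FST, dimensional points for numeric range filters, the deleted documents bitset, and doc values and stored fields codec formats. This memory is mandatory and its size cannot be configured. It is closely tied to the indices: closing an index or running a force merge releases part of it. You can use the commands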

GET _cat/nodes?v&h=id,ip,port,r,ramPercent,ramCurrent,heapMax,heapCurrent,fielddataMemory,queryCacheMemory,requestCacheMemory,segmentsMemory

GET /_cat/nodes?v&h=name,port,sm

GET /_nodes/stats/breaker?pretty

to check the current usage of each area.
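To turn the breaker statistics into something easier to scan, the response of GET /_nodes/stats/breaker can be reduced to a percentage per breaker. The following is a minimal sketch: the function name breaker_usage and the sample values are hypothetical, and the sample only mimics the shape of the response (the fields limit_size_in_bytes and estimated_size_in_bytes under nodes.<id>.breakers).

```python
def breaker_usage(stats):
    """Return {node_name: {breaker_name: percent_of_limit_used}}."""
    usage = {}
    for node in stats["nodes"].values():
        per_breaker = {}
        for name, breaker in node["breakers"].items():
            limit = breaker["limit_size_in_bytes"]
            used = breaker["estimated_size_in_bytes"]
            # A non-positive limit means no meaningful percentage exists.
            per_breaker[name] = round(100.0 * used / limit, 1) if limit > 0 else None
        usage[node["name"]] = per_breaker
    return usage


# Hypothetical sample, shaped like a _nodes/stats/breaker response.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "es-node-1",
            "breakers": {
                "fielddata": {"limit_size_in_bytes": 600_000_000,
                              "estimated_size_in_bytes": 150_000_000},
                "request": {"limit_size_in_bytes": 600_000_000,
                            "estimated_size_in_bytes": 0},
                "parent": {"limit_size_in_bytes": 700_000_000,
                           "estimated_size_in_bytes": 150_000_000},
            },
        }
    }
}

report = breaker_usage(sample)
```

In practice the dictionary would come from the JSON body of the stats request rather than from a literal.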

Circuit Breakers

Elasticsearch has a series of circuit breakers, all of which keep memory usage from exceeding its limits:

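indices.breaker.fielddata.limit The fielddata breaker; by default it caps fielddata at 60% of the heap.
indices.breaker.request.limit The request breaker estimates the size of the structures needed to complete the other parts of a request, such as creating aggregation buckets; the default limit is 60% of the heap. It is actually a node-level statistic: the total BigArrays space requested by all query and aggregation operations on the node. So if one aggregation needs a lot of space, other aggregations running at the same time may be tripped as well.
indices.breaker.total.limit The parent breaker; in-flight, request (agg) and fielddata together may not use more than 70% of the heap.
network.breaker.inflight_requests.limit Limits the memory used by requests currently coming in over HTTP and other transports to the given share of the node's heap. It mainly bounds the length of the request content. The default is 100%.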
script.max_compilations_per_minute
Limits how many unique dynamic scripts may be compiled per minute; the default is 15.
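The interaction between the child breakers and the parent breaker can be sketched with a toy model. Everything here is an assumption made for illustration: the names ToyBreakers and CircuitBreakingError are hypothetical, usage is tracked as a percentage of heap rather than in bytes, and the real breakers also apply overhead factors that are ignored here. The limits are the defaults listed above.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would exceed a breaker limit."""


class ToyBreakers:
    """Toy model of child breakers under one parent limit (percent of heap)."""

    def __init__(self, child_limits, total_limit):
        self.child_limits = dict(child_limits)
        self.total_limit = total_limit
        self.used = {name: 0 for name in child_limits}

    def reserve(self, breaker, amount):
        """Account for `amount` on a child breaker, or raise if a limit trips."""
        new_child = self.used[breaker] + amount
        if new_child > self.child_limits[breaker]:
            raise CircuitBreakingError(
                f"[{breaker}] would use {new_child}%, limit {self.child_limits[breaker]}%")
        new_total = sum(self.used.values()) + amount
        if new_total > self.total_limit:
            raise CircuitBreakingError(
                f"[parent] would use {new_total}%, limit {self.total_limit}%")
        self.used[breaker] = new_child

    def release(self, breaker, amount):
        """Give back memory once the request has finished."""
        self.used[breaker] -= amount


limits = {"fielddata": 60, "request": 60, "in_flight_requests": 100}
breakers = ToyBreakers(limits, total_limit=70)
breakers.reserve("request", 55)        # one very large aggregation
try:
    breakers.reserve("request", 10)    # a second, concurrent aggregation
except CircuitBreakingError as err:
    print(err)
```

Because the request breaker counts usage for the whole node, the small second aggregation is the one that gets rejected, even though the first one consumed nearly all of the budget.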

References:
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/circuit-breaker.html#fielddata-circuit-breaker
https://www.elastic.co/guide/cn/elasticsearch/guide/cn/_limiting_memory_usage.html
http://zhengjianglong.leanote.com/post/ES%E8%AE%BE%E7%BD%AE

With its default configuration, ES oversubscribes the heap.

Category	Default share	Resident	Eviction policy (when size is bounded)	Control parameter
query cache	10%	Yes	LRU	indices.queries.cache.size
request cache	1%	Yes	LRU	indices.requests.cache.size
fielddata cache	Unlimited	Yes	LRU	indices.fielddata.cache.size
segment memory	Unlimited	Yes	None	Cannot be controlled by a parameter
common space	70%	No	GC	Limited by the indices.breaker.total.limit circuit breaker

common space (GC-able)

Subcategory	Default share	Control parameter
indexing buffer	10%	indices.memory.index_buffer_size
request agg data	60%	indices.breaker.request.limit
in-flight data	100%	network.breaker.inflight_requests.limit

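As the tables above show, segment memory is critical and cannot be tuned through parameters, whereas the caches improve performance and can be cleared. The common space is dynamic runtime memory and can be garbage collected.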

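In summary, segment memory + cache + common space must not exceed 100%. Because the circuit breakers are computed against the whole heap size, segment memory that grows too large can still cause an OOM. To make this less likely, reserve enough space for segments.

Optimization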

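Limit the fielddata size. Fielddata is only used when sorting or aggregating on text fields, which should normally be avoided.
Limit the size of request agg data. This setting affects the memory used by aggregations; if the breaker trips, the business queries need to be optimized.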

Memory Allocation

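Category	Allocation	Notes
segment memory	Reserve 10%
fielddata cache	Limit to 20%
query cache	Limit to 10%
request cache	Limit to 1%
indexing buffer	Limit to 10%
request agg data	Limit to 1%	With the parent breaker set to 30%, whatever remains after fielddata and agg is in-flight
in-flight data	Limit to 9%	With the parent breaker set to 30%, whatever remains after fielddata and agg is in-flight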
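The arithmetic behind this allocation can be checked with a few lines. This is just a sketch that adds up the percentages from the table; the names PLAN, PARENT_GROUP and summarize are hypothetical helpers, not Elasticsearch settings.

```python
# Planned share of the heap, in percent, taken from the allocation table.
PLAN = {
    "segment memory": 10,
    "fielddata cache": 20,
    "query cache": 10,
    "request cache": 1,
    "indexing buffer": 10,
    "request agg data": 1,
    "in-flight data": 9,
}

# The three items that the table counts against the parent breaker.
PARENT_GROUP = ("fielddata cache", "request agg data", "in-flight data")


def summarize(plan, parent_group):
    """Sum the plan and the subset accounted for by the parent breaker."""
    total = sum(plan.values())
    parent = sum(plan[name] for name in parent_group)
    return {"total": total, "parent_breaker": parent, "unallocated": 100 - total}


summary = summarize(PLAN, PARENT_GROUP)
# summary == {"total": 61, "parent_breaker": 30, "unallocated": 39}
```

The parent group adds up to the 30% named in the notes column, and the plan as a whole stays below 100%, which leaves headroom in case segment memory grows beyond its 10% reservation.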

Parameter Settings

indices.fielddata.cache.size: 1% (a static node setting, so the node must be restarted)

PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit":"20%",
    "indices.breaker.request.limit":"1%",
    "indices.breaker.total.limit":"70%"

  }
}
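The same settings update can be scripted. Below is a minimal sketch using only the Python standard library; the helper name build_settings_request and the address http://localhost:9200 are assumptions for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_settings_request(base_url, limits):
    """Build (but do not send) a PUT _cluster/settings request."""
    body = json.dumps({"persistent": limits}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Breaker limits copied from the request above.
limits = {
    "indices.breaker.fielddata.limit": "20%",
    "indices.breaker.request.limit": "1%",
    "indices.breaker.total.limit": "70%",
}

# Assumed local cluster address; adjust for a real deployment.
request = build_settings_request("http://localhost:9200", limits)

# To apply the settings against a live cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before changing a running cluster.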

posted @ 2021-10-11 16:09 Cooper190113
