ES stops working: CircuitBreakingException

Symptoms:

  1. ES is not working; writes fail
  2. Some shards are unassigned (the commands below can confirm this)
  3. The ES pods themselves are running normally
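
A quick way to verify symptoms 1 and 2, assuming the HTTP endpoint is reachable at localhost:9200 (this cluster has xpack enabled, so add TLS and auth flags as your setup requires):

# overall cluster status, including the count of unassigned shards
curl -s 'localhost:9200/_cluster/health?pretty'
# list the shards stuck in UNASSIGNED
curl -s 'localhost:9200/_cat/shards?v' | grep UNASSIGNED
# ask the cluster why a shard is not being assigned
curl -s 'localhost:9200/_cluster/allocation/explain?pretty'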

Troubleshooting:

The es-data logs contain a large number of errors about the circuit breaker, CircuitBreakingException.
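
Since the pods run in Kubernetes, the logs can be pulled with something like the following (the pod name is taken from the log entry itself):

# filter the data node's log for breaker errors
kubectl logs myproject-elasticsearch-data-0 | grep -i CircuitBreaking

A representative entry: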

[2020-10-18T00:55:22,010][WARN ][o.e.i.c.IndicesClusterStateService] [myproject-elasticsearch-data-0] [index-06-19][10] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [index-06-19][10]: Recovery failed from {myproject-elasticsearch-data-5}{WXPbN7GbQaOxRnBnfO_AXg}{2thYD0E1TuO8JwwarnaK-A}{172.20.4.52}{172.20.4.52:9300}{d}{xpack.installed=true} into {myproject-elasticsearch-data-0}{x3XJEg0ZQpC_GXfLbYXbgg}{nxNG_5sqRGqwcg-tPr-3Pw}{172.20.10.6}{172.20.10.6:9300}{d}{xpack.installed=true}
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.lambda$doRecovery$2(PeerRecoveryTargetService.java:247) [elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$1.handleException(PeerRecoveryTargetService.java:292) [elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.PlainTransportFuture.handleException(PlainTransportFuture.java:97) [elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1130) [elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:244) [elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633) [elasticsearch-7.6.2.jar:7.6.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [myproject-elasticsearch-data-5][172.20.4.52:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [32242090164/30gb], which is larger than the limit of [31530549248/29.3gb], real usage: [32242088912/30gb], new bytes reserved: [1252/1.2kb], usages [request=0/0b, fielddata=23279084771/21.6gb, in_flight_requests=34132/33.3kb, accounting=3423418906/3.1gb]
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:343) ~[elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:171) ~[elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:119) ~[elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:103) ~[elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:667) ~[elasticsearch-7.6.2.jar:7.6.2]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:326) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:300) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1478) ~[?:?]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1227) ~[?:?]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1274) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:830) ~[?:?]

The log makes the cause clear: memory usage grew too large and tripped the parent circuit breaker.

Both indexing and querying can populate in-memory caches, but the log shows fielddata taking up most of the budget: fielddata=23279084771/21.6gb against a parent limit of 29.3gb.
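
To see which fields and nodes hold that fielddata, and how close each breaker sits to its limit, something like this works (same endpoint and auth assumptions as above):

# per-node, per-field fielddata usage, largest first
curl -s 'localhost:9200/_cat/fielddata?v&s=size:desc'
# current estimate and limit for every circuit breaker
curl -s 'localhost:9200/_nodes/stats/breaker?pretty'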

The official documentation describes the relevant setting:

indices.fielddata.cache.size. The fielddata cache it limits has two notable traits:

  • Once loaded, fielddata stays in memory until the node goes down; by default the cache is unbounded, so it only grows and never shrinks.
  • When a query touches a field, fielddata for that field is loaded for all documents in the index, not just the matching ones, so it can get very large (see the example after this list).
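
For illustration, a terms aggregation on a text field with fielddata enabled is the classic way this memory builds up. The field name message below is hypothetical; index-06-19 is the index from the log above:

# aggregating on a fielddata-backed text field loads fielddata for that
# field across every document in the index, regardless of the query
curl -s -H 'Content-Type: application/json' \
  'localhost:9200/index-06-19/_search?size=0' -d '
{
  "aggs": {
    "top_values": { "terms": { "field": "message" } }
  }
}'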

The limit can be set either as an absolute value or as a percentage of the heap. I set it to 5%, which on this cluster's roughly 31 GB heap (inferred from the 29.3gb parent limit in the log) caps fielddata at about 1.5 GB; the exception has not appeared since. A sketch of the change follows.
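
indices.fielddata.cache.size is a static setting, so it goes into elasticsearch.yml on every data node and takes effect after a rolling restart:

# elasticsearch.yml
# cap the fielddata cache at 5% of the heap (the default is unbounded)
indices.fielddata.cache.size: 5%

Fielddata that is already resident can be released immediately, without waiting for the restart:

curl -s -XPOST 'localhost:9200/_cache/clear?fielddata=true'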

The trade-off is query performance: with a cache this small, evicted fielddata has to be rebuilt from disk the next time a query needs it.

The official explanation:

https://www.elastic.co/guide/cn/elasticsearch/guide/current/_limiting_memory_usage.html
