solr - Solr Cloud down: cannot talk to ZooKeeper, client session timed out - Stack Overflow

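I have Solr Cloud running on a machine with 16 GB of RAM: 2 Solr nodes (same IP) used for sharding, with embedded ZooKeeper. I run Solr with the default configuration. Although that configuration comes with -Xms5g -Xmx5g, the memory shown on the Solr dashboard sometimes reaches 15 GB of the 16 GB maximum. Everything went smoothly for several months. The cluster holds 300-900 collections, each containing between 1 and 8,000,000+ documents (only in rare cases does a single collection exceed 1 million documents).

Currently, however, the Solr instance goes down on most days, every morning around 7-8 AM. You can see the log below.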

ClientCnxn: Client session timed out, have not heard from server in 11856ms for sessionid 0x16784ac54710000

12/7/2018, 7:19:52 AM WARN false NIOServerCnxn: caught end of stream exception

12/7/2018, 7:19:53 AM WARN false NIOServerCnxn: caught end of stream exception

12/7/2018, 7:19:53 AM WARN false ConnectionManager: Watcher org.apache.solr.common.cloud.ConnectionManager@422f5928 name: ZooKeeperConnection Watcher:localhost:9983 got event WatchedEvent state:Disconnected type:None path:null path: null type: None

12/7/2018, 7:19:53 AM WARN false ConnectionManager: zkClient has disconnected

12/7/2018, 7:19:55 AM WARN false ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x16784ac54710000 has expired

12/7/2018, 7:19:55 AM WARN false ConnectionManager: Watcher org.apache.solr.common.cloud.ConnectionManager@422f5928 name: ZooKeeperConnection Watcher:localhost:9983 got event WatchedEvent state:Expired type:None path:null path: null type: None

12/7/2018, 7:19:55 AM WARN false ConnectionManager: Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper...

12/7/2018, 7:19:55 AM ERROR false RequestHandlerBase: org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled.

12/7/2018, 7:19:55 AM WARN false OverseerTriggerThread: OverseerTriggerThread woken up but we are closed, exiting.

12/7/2018, 7:19:55 AM ERROR false SolrCmdDistributor: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://10.250.200.217:8983/solr/BL_indomaret_oct_update.csv_shard1_replica_n1: Cannot talk to ZooKeeper - Updates are disabled.

12/7/2018, 7:19:55 AM WARN false DistributedUpdateProcessor: Error sending update to http://10.250.200.217:8983/solr

12/7/2018, 7:19:55 AM ERROR false Overseer: could not read the data

12/7/2018, 7:19:55 AM WARN false DefaultConnectionStrategy: Connection expired - starting a new one...

12/7/2018, 7:20:04 AM ERROR false RequestHandlerBase: org.apache.solr.common.SolrException: no servers hosting shard: shard1

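I would like to tune the GC, for example with the G1 configuration described [here][1], but first I want to confirm from the logs whether GC pauses are the root cause or whether something else might be responsible. I am using the default configuration, which uses CMS.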

This is the output from the first Solr node (using jstat -gcutil):

[bin]$ ./jstat -gcutil 31543 1000

   S0     S1      E      O      M    CCS    YGC     YGCT   FGC    FGCT      GCT
14.39   0.00  84.22  42.04  93.99  89.40   2548  300.876    18   3.981  304.858
14.39   0.00  84.22  42.04  93.99  89.40   2548  300.876    18   3.981  304.858
14.39   0.00  84.22  42.04  93.99  89.40   2548  300.876    18   3.981  304.858

And this is from the second Solr node:

[bin]$ ./jstat -gcutil 32223 1000

   S0     S1      E      O      M    CCS    YGC     YGCT   FGC    FGCT      GCT
 0.00  11.95   8.29  38.66  94.10  88.57   2121  206.174     8   2.076  208.251
 0.00  11.95   8.29  38.66  94.10  88.57   2121  206.174     8   2.076  208.251
 0.00  11.95   8.47  38.66  94.10  88.57   2121  206.174     8   2.076  208.251
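As a rough sanity check on the jstat samples above, the cumulative columns can be turned into average pause lengths by dividing YGCT by YGC and FGCT by FGC. The following is a minimal Python sketch of that arithmetic; the function name `average_pauses` and the constant names are made up for illustration, and the rows are copied from the output of the two nodes.

```python
# Sketch: derive average GC pause times from one `jstat -gcutil` row.
# The helper and constant names are hypothetical; the rows are taken
# from the jstat output shown for the two Solr nodes.

JSTAT_HEADER = "S0 S1 E O M CCS YGC YGCT FGC FGCT GCT"
NODE1_ROW = "14.39 0.00 84.22 42.04 93.99 89.40 2548 300.876 18 3.981 304.858"
NODE2_ROW = "0.00 11.95 8.29 38.66 94.10 88.57 2121 206.174 8 2.076 208.251"


def average_pauses(header: str, row: str) -> dict:
    """Return the average young and full GC time in seconds for one sample."""
    sample = dict(zip(header.split(), (float(v) for v in row.split())))
    return {
        # YGCT and FGCT are cumulative seconds; YGC and FGC are event counts.
        "avg_young_s": sample["YGCT"] / sample["YGC"] if sample["YGC"] else 0.0,
        "avg_full_s": sample["FGCT"] / sample["FGC"] if sample["FGC"] else 0.0,
    }


if __name__ == "__main__":
    for name, row in (("node 1", NODE1_ROW), ("node 2", NODE2_ROW)):
        avg = average_pauses(JSTAT_HEADER, row)
        print(f"{name}: young {avg['avg_young_s']:.3f}s, full {avg['avg_full_s']:.3f}s")
```

With these rows the averages come out at roughly 0.12 s (young) and 0.22 s (full) on the first node, and roughly 0.10 s and 0.26 s on the second. That is far below the 11856 ms of silence reported by ClientCnxn, but an average can hide a single very long pause, so this alone does not rule GC in or out.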

This is solr_gc_current.log:

2018-12-09T02:15:02.443+0700: 177309.558: Total time for which application threads were stopped: 0.1199759 seconds, Stopping threads took: 0.0046451 seconds

2018-12-09T02:15:25.680+0700: 177332.795: Total time for which application threads were stopped: 0.0309449 seconds, Stopping threads took: 0.0035637 seconds

2018-12-09T02:16:07.542+0700: 177374.657: Total time for which application threads were stopped: 0.0332466 seconds, Stopping threads took: 0.0036185 seconds

2018-12-09T02:16:07.576+0700: 177374.691: Total time for which application threads were stopped: 0.0306116 seconds, Stopping threads took: 0.0034811 seconds

2018-12-09T02:16:16.697+0700: 177383.812: Total time for which application threads were stopped: 0.0295741 seconds, Stopping threads took: 0.0035389 seconds

2018-12-09T02:16:31.868+0700: 177398.983: Total time for which application threads were stopped: 0.0390703 seconds, Stopping threads took: 0.0049162 seconds

2018-12-09T02:18:27.006+0700: 177514.121: Total time for which application threads were stopped: 0.0310958 seconds, Stopping threads took: 0.0037218 seconds

2018-12-09T02:18:27.964+0700: 177515.080: Total time for which application threads were stopped: 0.0360488 seconds, Stopping threads took: 0.0047906 seconds

{Heap before GC invocations=2120 (full 4):

par new generation total 1092288K, used 898004K [0x0000000680000000, 0x00000006d0000000, 0x00000006d0000000)

eden space 873856K, 99% used [0x0000000680000000, 0x00000006b555fee0, 0x00000006b5560000)

from space 218432K, 11% used [0x00000006b5560000, 0x00000006b6cf5470, 0x00000006c2ab0000)

to space 218432K, 0% used [0x00000006c2ab0000, 0x00000006c2ab0000, 0x00000006d0000000)

concurrent mark-sweep generation total 3932160K, used 1519752K [0x00000006d0000000, 0x00000007c0000000, 0x00000007c0000000)

Metaspace used 46887K, capacity 48033K, committed 49828K, reserved 1093632K

class space used 4960K, capacity 5217K, committed 5600K, reserved 1048576K

2018-12-09T02:18:28.456+0700: 177515.571: [GC (Allocation Failure) 2018-12-09T02:18:28.460+0700: 177515.575: [ParNew

Desired survivor size 201306928 bytes, new threshold 8 (max 8)

- age 1: 8579280 bytes, 8579280 total

- age 2: 6635784 bytes, 15215064 total

- age 3: 746072 bytes, 15961136 total

- age 4: 1137888 bytes, 17099024 total

- age 5: 273208 bytes, 17372232 total

- age 6: 1769872 bytes, 19142104 total

- age 7: 1744032 bytes, 20886136 total

- age 8: 277464 bytes, 21163600 total

: 898004K->26092K(1092288K), 0.0716839 secs] 2417757K->1546202K(5024448K), 0.0797908 secs] [Times: user=0.24 sys=0.00, real=0.08 secs]
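One way to check whether GC pauses could explain the ZooKeeper session expiry is to scan the GC log for "Total time for which application threads were stopped" entries and flag any that come close to the timeout. Below is a minimal Python sketch under stated assumptions: the function names are hypothetical, the threshold is something you choose yourself (the 11.856 s used here is simply the silence reported by ClientCnxn in the error log, not the configured session timeout, which is not shown in this post), and the sample lines are copied from the excerpt above.

```python
# Sketch: find long stop-the-world pauses in a HotSpot GC log.
# Function names and the chosen threshold are illustrative assumptions.
import re

PAUSE_RE = re.compile(
    r"^(?P<ts>\S+): [\d.]+: Total time for which application threads "
    r"were stopped: (?P<secs>[\d.]+) seconds"
)


def find_pauses(lines):
    """Return (timestamp, seconds) for every 'threads were stopped' line."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.match(line.strip())
        if match:
            pauses.append((match.group("ts"), float(match.group("secs"))))
    return pauses


def long_pauses(lines, threshold_s):
    """Return only the pauses that lasted at least threshold_s seconds."""
    return [(ts, secs) for ts, secs in find_pauses(lines) if secs >= threshold_s]


# Lines copied from the solr_gc_current.log excerpt; the last one is a
# non-matching line to show that other log content is ignored.
SAMPLE_LINES = [
    "2018-12-09T02:15:02.443+0700: 177309.558: Total time for which application threads were stopped: 0.1199759 seconds, Stopping threads took: 0.0046451 seconds",
    "2018-12-09T02:15:25.680+0700: 177332.795: Total time for which application threads were stopped: 0.0309449 seconds, Stopping threads took: 0.0035637 seconds",
    "Desired survivor size 201306928 bytes, new threshold 8 (max 8)",
]

if __name__ == "__main__":
    # 11.856 s mirrors the "have not heard from server in 11856ms" message.
    print(long_pauses(SAMPLE_LINES, threshold_s=11.856))
```

To use it on a real file, pass the open file object instead of `SAMPLE_LINES`, for example `long_pauses(open("solr_gc_current.log"), 5.0)`. The log that matters is the one covering the time of the outage: if no pause near the 7:19 AM window approaches the timeout, GC is probably not the root cause and other factors such as swapping or disk I/O are worth checking.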

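FYI, my system has been running fine for the last two days. The Solr dashboard shows 67% (10 GB) of the 16 GB maximum in use. The first log snippet is from the time the error/downtime occurred. The GC log, however, is an excerpt from the last few days, when the system was running smoothly; I want to be prepared in case the problem happens again. Thank you, I appreciate your help and your time.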
