Cannot open channel to 3 at election address :3888 java.net.ConnectException: Connection refused (Co...

关于Linux中搭建分布式时可能遇到的问题

这个问题来自于今天安装zookeeper时踩的一个大坑,害的我花了一天时间。在搭建zookeeper的分布式时,往往要进行这样的配置:

server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888

一开始我是按照这样的配置来做的,后来死活不成功,zookeeper.out中的信息如下:

2017-04-21 06:05:34,385 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@400] - Cannot open channel to 3 at election address hadoop05/192.168.31.155:3888
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)

先不要关心日志的时间(其实你们也不会关注的,这个是没有配时间,所以显示早上六点,哪个傻叉会6点起来)。
对于这个错误,网上大多数说的是zookeeper启动顺序导致开始选举不稳定引起的,不用担心,过一会儿就会好的,可是对于我来说并不管用。然后就是什么主机映射之类的,就算配了主机映射,三台虚拟机都可以相互ping通,似乎也没什么卵用。继续查资料,发现又有这样的一种配置:

server.1=192.168.31.151:2888:3888
server.2=192.168.31.152:2888:3888
server.3=192.168.31.153:2888:3888

然后我又照着这种配置又配了一遍,发现这样居然可以,leader和follower都选出来了。激动之余,尼玛问题究竟出在哪儿,这样两种配置有啥不一样,这又让我寝食难安,百度了一圈毛都没发现。然后就google去了,反正就是在一个犄角旮旯找到了一个问答,发现别人也是有这种问题,链接在此:
https://unix.stackexchange.com/questions/240506/zookeeper-dns-name-problems-with-leader-elections-when-migrating-from-windows-to
问的题目是:

Zookeeper DNS name problems with leader elections when migrating from Windows to Debian

回答的人就说了:

The "smoking gun" was this line in my zookeeper log:

2015-11-26 20:48:31,439 [myid:1] - INFO
[Thread-2:QuorumCnxManager$Listener@504] - My election bind port:
spring-xd-1/127.0.0.1:3888 

So, why was Zookeeper binding the election port on the loopback interface? Well...

My /etc/hosts on one of the VMs looked like this:

127.0.0.1   spring-xd-1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1         localhost localhost.localdomain
localhost6 localhost6.localdomain6

## vagrant-hostmanager-start
172.28.128.3    spring-xd-1
172.28.128.4    spring-xd-2
172.28.128.7    spring-xd-3
## vagrant-hostmanager-end

I removed the hostname from the 127.0.0.1 line in /etc/hosts and bounced the zookeeper service on all 3 nodes, and BAM! everything came up roses. So, now the host file on each machine looks like this:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1         localhost localhost.localdomain
localhost6 localhost6.localdomain6

## vagrant-hostmanager-start
172.28.128.3    spring-xd-1
172.28.128.4    spring-xd-2
172.28.128.7    spring-xd-3
## vagrant-hostmanager-end

最后一段有点启发意义:

EDIT: According to
http://ccl.cse.nd.edu/operations/condor/hostname.shtml, this seems to
be a fairly common problem with clustered apps on Linux, and
recommends editing the hosts file as I've described above. However,
the Zookeeper documentation on cluster setup doesn't mention it.

想去访问这个说明这个问题的网址,可惜访问不了,mmp!

P.S. 搭zookeeper这个硬是要搞出人命

  • 3
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 7
    评论
评论 7
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值