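Normally, once a zk client has established a connection with the server, it sends a heartbeat message to the server at 2/3 * sessionTimeout * 1/2 (that is, one third of the session timeout) to keep the session alive. At some point, however, a network interruption may leave the client unable to reach the server. The client then keeps retrying each server in turn until it connects to one of them. If the session expires on the server side while the client is disconnected (see SessionTrackerImpl, the implementation of SessionTracker), a dedicated thread cleans up the expired session and closes its connection.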
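Before looking at the server-side cleanup, the heartbeat arithmetic above can be made concrete. The following is only a minimal sketch of that calculation: the class and method names are hypothetical and are not taken from the ZooKeeper source; only the 2/3 and 1/2 factors come from the text.

```java
// Sketch of the heartbeat arithmetic described above.
// HeartbeatTiming and its method names are hypothetical, for illustration only.
class HeartbeatTiming {

    // Two thirds of the session timeout (the "2/3 * sessionTimeout" factor).
    static int readTimeoutMs(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    // Half of the value above: the point at which a heartbeat is sent.
    static int pingThresholdMs(int sessionTimeoutMs) {
        return readTimeoutMs(sessionTimeoutMs) / 2;
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 30_000;
        System.out.println("2/3 of session timeout: " + readTimeoutMs(sessionTimeoutMs) + " ms");
        System.out.println("heartbeat threshold:    " + pingThresholdMs(sessionTimeoutMs) + " ms");
    }
}
```

With a 30-second session timeout this gives 20000 ms and 10000 ms, so several heartbeats fit inside one session timeout and a single lost heartbeat does not by itself expire the session.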
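The cleanup loop in SessionTrackerImpl:

    synchronized public void run() {
        try {
            while (running) {
                currentTime = System.currentTimeMillis();
                // Not yet time to expire the next bucket: sleep until it is.
                if (nextExpirationTime > currentTime) {
                    this.wait(nextExpirationTime - currentTime);
                    continue;
                }
                SessionSet set;
                // Take every session whose deadline falls in this bucket.
                set = sessionSets.remove(nextExpirationTime);
                if (set != null) {
                    for (SessionImpl s : set.sessions) {
                        sessionsById.remove(s.sessionId);
                        expirer.expire(s);
                    }
                }
                nextExpirationTime += expirationInterval;
            }
        } catch (InterruptedException e) {
            LOG.error("Unexpected interruption", e);
        }
        LOG.info("SessionTrackerImpl exited loop!");
    }

Some time after the session has expired, the client connects to one of the servers and sends a ConnectRequest that carries the sessionid it held before the interruption, the lastZxid, and other fields, asking to re-establish its connection. The server sees that this is a ConnectRequest and calls readConnectRequest. A non-zero sessionid means the client wants to resume its existing session.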
    if (connReq.getSessionId() != 0) {
        // Non-zero session id: the client is trying to resume an existing session.
        long clientSessionId = connReq.getSessionId();
        LOG.info("Client attempting to renew session 0x"
                + Long.toHexString(clientSessionId) + " at "
                + sock.socket().getRemoteSocketAddress());
        factory.closeSessionWithoutWakeup(clientSessionId);
        setSessionId(clientSessionId);
        zk.reopenSession(this, sessionId, passwd, sessionTimeout);
    } else {
        // Session id 0: a brand-new session.
        LOG.info("Client attempting to establish new session at "
                + sock.socket().getRemoteSocketAddress());
        zk.createSession(this, passwd, sessionTimeout);
    }

When reopenSession restores the session, it first validates it:
    public void reopenSession(ServerCnxn cnxn, long sessionId, byte[] passwd,
            int sessionTimeout) throws IOException, InterruptedException {
        if (!checkPasswd(sessionId, passwd)) {
            // Wrong password: reject the session.
            cnxn.finishSessionInit(false);
        } else {
            revalidateSession(cnxn, sessionId, sessionTimeout);
        }
    }

    protected void revalidateSession(ServerCnxn cnxn, long sessionId,
            int sessionTimeout) throws IOException, InterruptedException {
        // false if the session tracker no longer knows this session id
        boolean rc = sessionTracker.touchSession(sessionId, sessionTimeout);
        if (LOG.isTraceEnabled()) {
            ZooTrace.logTraceMessage(LOG, ZooTrace.SESSION_TRACE_MASK,
                    "Session 0x" + Long.toHexString(sessionId)
                    + " is valid: " + rc);
        }
        cnxn.finishSessionInit(rc);
    }

Because the sessionid has already expired and been removed, touchSession cannot find it and rc comes back false. Based on the value of rc, finishSessionInit decides which ConnectResponse to send to the client.
    public void finishSessionInit(boolean valid) {
        ConnectResponse rsp = new ConnectResponse(0,
                valid ? sessionTimeout : 0,
                valid ? sessionId : 0, // send 0 if session is no longer valid
                valid ? zk.generatePasswd(sessionId) : new byte[16]);
        // ... remainder of the method omitted
    }

If rc = false, the server also issues an instruction to close the connection.
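The server-side path described so far (bucketed expiry, a failed touchSession, and a ConnectResponse whose timeout is 0) can be sketched as a small stand-alone model. This is a simplified illustration, not the ZooKeeper implementation: the class ExpirySketch, its fields, and the rule for rounding deadlines up to the next bucket are assumptions, and time is passed in explicitly instead of being read from the clock so that the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of bucketed session expiry. All names are hypothetical.
class ExpirySketch {
    private final long interval;                                     // bucket width in ms
    private final Map<Long, Long> bucketBySession = new HashMap<>(); // sessionId -> deadline bucket
    private final Map<Long, Set<Long>> buckets = new HashMap<>();    // deadline bucket -> sessionIds
    private long nextExpirationTime;

    ExpirySketch(long interval, long now) {
        this.interval = interval;
        this.nextExpirationTime = roundUp(now);
    }

    // Assumed rounding rule: deadlines are rounded up to the next bucket boundary.
    private long roundUp(long t) {
        return (t / interval + 1) * interval;
    }

    void addSession(long id, int timeoutMs, long now) {
        long bucket = roundUp(now + timeoutMs);
        bucketBySession.put(id, bucket);
        buckets.computeIfAbsent(bucket, k -> new HashSet<>()).add(id);
    }

    // Returns false when the session is unknown, for example because it already expired.
    boolean touchSession(long id, int timeoutMs, long now) {
        Long oldBucket = bucketBySession.get(id);
        if (oldBucket == null) {
            return false;
        }
        long newBucket = roundUp(now + timeoutMs);
        if (newBucket > oldBucket) {
            // Move the session to its later deadline bucket.
            Set<Long> old = buckets.get(oldBucket);
            if (old != null) {
                old.remove(id);
            }
            buckets.computeIfAbsent(newBucket, k -> new HashSet<>()).add(id);
            bucketBySession.put(id, newBucket);
        }
        return true;
    }

    // Non-blocking equivalent of the cleanup loop: expire every bucket due by 'now'.
    void expireUpTo(long now) {
        while (nextExpirationTime <= now) {
            Set<Long> set = buckets.remove(nextExpirationTime);
            if (set != null) {
                for (long id : set) {
                    bucketBySession.remove(id);
                }
            }
            nextExpirationTime += interval;
        }
    }

    // Timeout field of the connect response: 0 signals an invalid session.
    static int responseTimeout(boolean valid, int timeoutMs) {
        return valid ? timeoutMs : 0;
    }

    public static void main(String[] args) {
        ExpirySketch tracker = new ExpirySketch(1000, 0);
        tracker.addSession(1L, 3000, 0);
        System.out.println(tracker.touchSession(1L, 3000, 1000)); // heartbeat arrives in time
        tracker.expireUpTo(6000);                                 // client was silent too long
        boolean rc = tracker.touchSession(1L, 3000, 6000);        // reconnect attempt
        System.out.println(rc + " -> timeout " + responseTimeout(rc, 3000));
    }
}
```

In this model the second touchSession fails because the cleanup pass has already dropped the session id, which is the same situation that makes rc false in revalidateSession.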
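When the client receives the response and finds that the session timeout is less than or equal to 0, it treats the session as expired. It fires the Expired event and throws an exception.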
    negotiatedSessionTimeout = conRsp.getTimeOut();
    if (negotiatedSessionTimeout <= 0) {
        // A non-positive timeout in the response means the session has expired.
        zooKeeper.state = States.CLOSED;
        eventThread.queueEvent(new WatchedEvent(
                Watcher.Event.EventType.None,
                Watcher.Event.KeeperState.Expired, null));
        eventThread.queueEventOfDeath();
        throw new SessionExpiredException(
                "Unable to reconnect to ZooKeeper service, session 0x"
                + Long.toHexString(sessionId) + " has expired");
    }

The whole client instance then shuts down and cannot be reused. If you still need to talk to the server, you have to create a new ZooKeeper instance.
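On the application side, this means the Expired state is the one case where the client object has to be thrown away and rebuilt. The sketch below models that rule with plain Java so it runs without the ZooKeeper library: the State enum, the Client stand-in, and the ExpiredSessionRecovery class are all hypothetical. In a real application the same decision would typically sit in a Watcher that reacts to KeeperState.Expired.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Models "recreate the client on Expired". All types here are hypothetical stand-ins.
class ExpiredSessionRecovery {

    enum State { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

    // Stand-in for a client handle; 'generation' just counts how often it was rebuilt.
    static final class Client {
        final int generation;
        boolean closed;

        Client(int generation) {
            this.generation = generation;
        }
    }

    private final Supplier<Client> factory;
    private Client current;

    ExpiredSessionRecovery(Supplier<Client> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    // Called whenever the connection state changes.
    synchronized void onStateChange(State state) {
        if (state == State.EXPIRED) {
            // The old instance is unusable after expiry: discard it and build a new one.
            current.closed = true;
            current = factory.get();
        }
        // DISCONNECTED: keep the instance, since the client retries the servers on its own.
    }

    synchronized Client client() {
        return current;
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        ExpiredSessionRecovery recovery =
                new ExpiredSessionRecovery(() -> new Client(counter.incrementAndGet()));

        recovery.onStateChange(State.DISCONNECTED);
        System.out.println("after disconnect: generation " + recovery.client().generation);

        recovery.onStateChange(State.EXPIRED);
        System.out.println("after expiry:     generation " + recovery.client().generation);
    }
}
```

Anything tied to the old session, such as watches, would also need to be set up again on the new instance.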