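This post continues from the previous one, "Hands-On ZooKeeper Source Walkthrough: How the ZooKeeper Client Connects to the zk Cluster".
In that post we got as far as org.apache.zookeeper.ClientCnxn.SendThread#primeConnection. Here are the most important lines from that method:
// Build the connect request object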
ConnectRequest conReq = new ConnectRequest(0, lastZxid,
sessionTimeout, sessId, sessionPasswd);
// If watchers are registered, pack them into a single packet and put it on the outgoing queue
SetWatches sw = new SetWatches(setWatchesLastZxid,
dataWatchesBatch,
existWatchesBatch,
childWatchesBatch);
RequestHeader h = new RequestHeader();
h.setType(ZooDefs.OpCode.setWatches);
h.setXid(-8);
Packet packet = new Packet(h, new ReplyHeader(), sw, null, null);
// Queue the packet that wraps all the watchers
outgoingQueue.addFirst(packet);
// Authentication data
for (AuthData id : authInfo) {
outgoingQueue.addFirst(new Packet(new RequestHeader(-4,
OpCode.auth), null, new AuthPacket(0, id.scheme,
id.data), null, null));
}
// This is the key step: wrap the ConnectRequest created above in a packet and add it to the outgoing queue
outgoingQueue.addFirst(new Packet(null, null, conReq,
null, null, readOnly));
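Read the code above together with the source and the inline comments. Note that every packet is added with addFirst, so the ConnectRequest, added last, ends up at the head of the queue.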
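To make the resulting queue order concrete, here is a minimal sketch that replays the same sequence of addFirst calls with plain string labels. PrimeOrderSketch and the labels are hypothetical stand-ins, not ZooKeeper classes, and the sketch assumes a SetWatches packet is queued at all (the real client only does so when watchers exist).

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class PrimeOrderSketch {
    // Replays the addFirst() order used in primeConnection, with labels
    // standing in for the real Packet objects (an illustration only).
    static List<String> primeQueue(List<String> authSchemes) {
        LinkedList<String> outgoingQueue = new LinkedList<>();
        outgoingQueue.addFirst("setWatches");        // queued first, ends up last
        for (String scheme : authSchemes) {
            outgoingQueue.addFirst("auth:" + scheme); // each auth packet jumps ahead
        }
        outgoingQueue.addFirst("connect");            // queued last, ends up first
        return outgoingQueue;
    }

    public static void main(String[] args) {
        System.out.println(primeQueue(Arrays.asList("digest", "ip")));
    }
}
```

With the two auth schemes above, the queue reads connect, auth:ip, auth:digest, setWatches from head to tail: the connect request goes out first, and the auth entries come out in reverse insertion order.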
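// Register interest in read and write events
clientCnxnSocket.enableReadWriteOnly();
Finally, the socket is set to watch for both read and write events.
Then we return to SendThread.run and continue from there.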
if (state.isConnected()) {
// determine whether we need to send an AuthFailed event.
// SASL authentication
if (zooKeeperSaslClient != null) {
boolean sendAuthEvent = false;
if (zooKeeperSaslClient.getSaslState() == ZooKeeperSaslClient.SaslState.INITIAL) {
try {
zooKeeperSaslClient.initialize(ClientCnxn.this);
} catch (SaslException e) {
LOG.error("SASL authentication with Zookeeper Quorum member failed: " + e);
state = States.AUTH_FAILED;
sendAuthEvent = true;
}
}
KeeperState authState = zooKeeperSaslClient.getKeeperState();
if (authState != null) {
if (authState == KeeperState.AuthFailed) {
// An authentication error occurred during authentication with the Zookeeper Server.
state = States.AUTH_FAILED;
sendAuthEvent = true;
} else {
if (authState == KeeperState.SaslAuthenticated) {
sendAuthEvent = true;
}
}
}
if (sendAuthEvent == true) {
eventThread.queueEvent(new WatchedEvent(
Watcher.Event.EventType.None,
authState,null));
}
}
to = readTimeout - clientCnxnSocket.getIdleRecv();
} else {
to = connectTimeout - clientCnxnSocket.getIdleRecv();
}
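Once the connection to the zk server is established, this block handles the SASL authentication state and queues an auth event if one is needed. Either way it ends by computing the remaining wait time, to: readTimeout minus the idle receive time when connected, connectTimeout minus the idle receive time otherwise. We will not go into the authentication part here.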
if (to <= 0) {
String warnInfo;
warnInfo = "Client session timed out, have not heard from server in "
+ clientCnxnSocket.getIdleRecv()
+ "ms"
+ " for sessionid 0x"
+ Long.toHexString(sessionId);
LOG.warn(warnInfo);
throw new SessionTimeoutException(warnInfo);
}
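This block checks whether the session has timed out: if the remaining time to is zero or negative, the client has not heard from the server within the allowed window, so a SessionTimeoutException is thrown.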
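The timeout check boils down to one subtraction. The following is a minimal sketch of that arithmetic only; TimeoutBudgetSketch and its method names are made up for illustration, and all values are in milliseconds.

```java
class TimeoutBudgetSketch {
    // Remaining wait budget: the applicable timeout minus the time since
    // the client last heard from the server (getIdleRecv() in the article).
    static int remaining(boolean connected, int readTimeout, int connectTimeout, int idleRecv) {
        return (connected ? readTimeout : connectTimeout) - idleRecv;
    }

    // Mirrors the "if (to <= 0)" check that leads to SessionTimeoutException.
    static boolean timedOut(int remaining) {
        return remaining <= 0;
    }

    public static void main(String[] args) {
        int to = remaining(true, 20_000, 10_000, 5_000);
        System.out.println("remaining=" + to + " timedOut=" + timedOut(to));
    }
}
```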
// If the client is already connected
if (state.isConnected()) {
//1000(1 second) is to prevent race condition missing to send the second ping
//also make sure not to send too many pings when readTimeout is small
// getIdleSend() is (now - lastSend): the idle time since the last send
int timeToNextPing = readTimeout / 2 - clientCnxnSocket.getIdleSend() -
((clientCnxnSocket.getIdleSend() > 1000) ? 1000 : 0);
//send a ping request either time is due or no packet sent out within MAX_SEND_PING_INTERVAL
if (timeToNextPing <= 0 || clientCnxnSocket.getIdleSend() > MAX_SEND_PING_INTERVAL) {
sendPing(); // Sent when (1) the send side has been idle for more than 10s, or (2) timeToNextPing <= 0
// The client pings the server periodically
clientCnxnSocket.updateLastSend();
} else {
if (timeToNextPing < to) {
to = timeToNextPing;
}
}
}
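This block matters more. It addresses two things. First, it avoids sending pings too often when the read timeout is small. Second, once the client has sent nothing for more than 10s (MAX_SEND_PING_INTERVAL), that is, no read, write, or ping request has gone out, it sends a ping regardless. In fact, every request the client sends (read, write, or ping) triggers the same keep-alive handling on the server that a ping does.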
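A few concrete numbers make the ping schedule easier to follow. The sketch below copies the arithmetic from the excerpt into a standalone class; PingScheduleSketch is a hypothetical name, and the 10,000 ms constant is taken from the 10s figure mentioned above.

```java
class PingScheduleSketch {
    // 10 seconds, matching the MAX_SEND_PING_INTERVAL described in the article.
    static final int MAX_SEND_PING_INTERVAL = 10_000;

    // Same formula as the excerpt: half the read timeout, minus the idle send
    // time, minus a 1s safety margin once more than 1s has passed since the last send.
    static int timeToNextPing(int readTimeout, int idleSend) {
        return readTimeout / 2 - idleSend - (idleSend > 1000 ? 1000 : 0);
    }

    // A ping is due when the timer has run out or the send side has been idle too long.
    static boolean shouldPing(int readTimeout, int idleSend) {
        return timeToNextPing(readTimeout, idleSend) <= 0
                || idleSend > MAX_SEND_PING_INTERVAL;
    }

    public static void main(String[] args) {
        System.out.println(timeToNextPing(20_000, 500));   // plenty of time left
        System.out.println(shouldPing(20_000, 9_500));     // timer expired
        System.out.println(shouldPing(60_000, 10_001));    // idle for over 10s
    }
}
```

With readTimeout = 60000 the timer alone would not fire until roughly 29s of idleness, so the MAX_SEND_PING_INTERVAL branch is what keeps pings flowing for long timeouts.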
sendPing() sends a packet that carries only a header and no body, simply to tell the server that the client is still alive.
private void sendPing() {
lastPingSentNs = System.nanoTime();
RequestHeader h = new RequestHeader(-2, OpCode.ping);
queuePacket(h, null, null, null, null, null, null, null, null);
}
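The ping packet is also placed on outgoingQueue.
Next comes the single most important line:
// pendingQueue: packets already sent and awaiting a response
// outgoingQueue: packets waiting to be sent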
clientCnxnSocket.doTransport(to, pendingQueue, outgoingQueue, ClientCnxn.this);
This starts the actual data transfer. Let's see how it works:
void doTransport(int waitTimeOut, List<Packet> pendingQueue, LinkedList<Packet> outgoingQueue,
ClientCnxn cnxn)
throws IOException, InterruptedException {
selector.select(waitTimeOut);
Set<SelectionKey> selected;
synchronized (this) {
selected = selector.selectedKeys();
}
// Everything below and until we get back to the select is
// non blocking, so time is effectively a constant. That is
// Why we just have to do this once, here
updateNow();
for (SelectionKey k : selected) {
SocketChannel sc = ((SocketChannel) k.channel());
if ((k.readyOps() & SelectionKey.OP_CONNECT) != 0) {
if (sc.finishConnect()) {
updateLastSendAndHeard();
sendThread.primeConnection();
}
} else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
doIO(pendingQueue, outgoingQueue, cnxn);
}
}
if (sendThread.getZkState().isConnected()) {
synchronized(outgoingQueue) {
if (findSendablePacket(outgoingQueue,
cnxn.sendThread.clientTunneledAuthenticationInProgress()) != null) {
enableWrite();
}
}
}
selected.clear();
}
updateNow(); // Refresh the current time
for (SelectionKey k : selected) {
SocketChannel sc = ((SocketChannel) k.channel());
// Check whether the ready key is OP_CONNECT
if ((k.readyOps() & SelectionKey.OP_CONNECT) != 0) {
if (sc.finishConnect()) {
updateLastSendAndHeard();
sendThread.primeConnection(); // Analyzed in the previous post
}
// Read or write ready
} else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
doIO(pendingQueue, outgoingQueue, cnxn);
}
}
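The branch on readyOps() is plain bit masking over the SelectionKey constants. As a small illustration, the sketch below applies the same two tests to a raw int; ReadyOpsSketch and the returned labels are hypothetical, only the java.nio constants are real.

```java
import java.nio.channels.SelectionKey;

class ReadyOpsSketch {
    // Applies the same masks as doTransport: connect readiness is checked
    // first, then read/write readiness; anything else is ignored.
    static String classify(int readyOps) {
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            return "connect"; // would call finishConnect() and primeConnection()
        } else if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
            return "io";      // would call doIO()
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(classify(SelectionKey.OP_CONNECT));
        System.out.println(classify(SelectionKey.OP_READ | SelectionKey.OP_WRITE));
    }
}
```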
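Earlier, clientCnxnSocket.enableReadWriteOnly() registered interest in read and write events, so we go straight into doIO().
This method is where reads and writes are actually handled. It has two parts; the first handles reads:
if (sockKey.isReadable()) {
// Data is available to read: a response or an event notification pushed by the zk server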
int rc = sock.read(incomingBuffer);
if (rc < 0) {
throw new EndOfStreamException(
"Unable to read additional data from server sessionid 0x"
+ Long.toHexString(sessionId)
+ ", likely server has closed socket");
}
if (!incomingBuffer.hasRemaining()) {
incomingBuffer.flip();
if (incomingBuffer == lenBuffer) {
recvCount++;
readLength();
} else if (!initialized) {
readConnectResult();
enableRead();
if (findSendablePacket(outgoingQueue,
cnxn.sendThread.clientTunneledAuthenticationInProgress()) != null) {
// Since SASL authentication has completed (if client is configured to do so),
// outgoing packets waiting in the outgoingQueue can now be sent.
enableWrite();
}
lenBuffer.clear();
incomingBuffer = lenBuffer;
updateLastHeard();
initialized = true;
} else {
sendThread.readResponse(incomingBuffer);
lenBuffer.clear();
incomingBuffer = lenBuffer;
updateLastHeard();
}
}
}
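The read path alternates between two buffers: a fixed 4-byte lenBuffer and a payload buffer sized from the length just read. Below is a minimal, self-contained sketch of that pattern, assuming a 4-byte big-endian length prefix. FrameReaderSketch, feed, and frames are hypothetical names; the real client reads from a SocketChannel and dispatches to readConnectResult() or readResponse() instead of collecting byte arrays.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class FrameReaderSketch {
    private final ByteBuffer lenBuffer = ByteBuffer.allocate(4);
    private ByteBuffer incomingBuffer = lenBuffer; // current target, as in doIO
    final List<byte[]> frames = new ArrayList<>();

    // Accepts whatever bytes happened to arrive; complete payloads land in 'frames'.
    void feed(byte[] chunk) {
        ByteBuffer in = ByteBuffer.wrap(chunk);
        while (true) {
            // Copy as much as fits into the buffer we are currently filling.
            while (in.hasRemaining() && incomingBuffer.hasRemaining()) {
                incomingBuffer.put(in.get());
            }
            if (incomingBuffer.hasRemaining()) {
                return; // buffer not full yet: wait for more data
            }
            incomingBuffer.flip();
            if (incomingBuffer == lenBuffer) {
                // Length prefix complete: allocate a buffer for the payload.
                int len = incomingBuffer.getInt();
                if (len < 0) {
                    throw new IllegalArgumentException("negative length: " + len);
                }
                incomingBuffer = ByteBuffer.allocate(len);
            } else {
                // Payload complete: hand it off and go back to reading a length.
                byte[] payload = new byte[incomingBuffer.remaining()];
                incomingBuffer.get(payload);
                frames.add(payload);
                lenBuffer.clear();
                incomingBuffer = lenBuffer;
            }
        }
    }
}
```

Feeding the bytes of two frames in arbitrary slices yields the same two payloads as feeding them all at once, which is the property the lenBuffer/incomingBuffer swap exists to provide.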
The second part handles writes:
if (sockKey.isWritable()) {
synchronized(outgoingQueue) {
Packet p = findSendablePacket(outgoingQueue,
cnxn.sendThread.clientTunneledAuthenticationInProgress());
if (p != null) {
updateLastSend();
// If we already started writing p, p.bb will already exist
if (p.bb == null) {
if ((p.requestHeader != null) &&
(p.requestHeader.getType() != OpCode.ping) &&
(p.requestHeader.getType() != OpCode.auth)) {
p.requestHeader.setXid(cnxn.getXid());
}
p.createBB();
}
// Write the packet's ByteBuffer out through the socket
sock.write(p.bb);
// Handle partial writes: one write() call may not send the whole buffer
if (!p.bb.hasRemaining()) {
sentCount++;
// Fully sent: remove it from the outgoing queue
outgoingQueue.removeFirstOccurrence(p);
if (p.requestHeader != null
&& p.requestHeader.getType() != OpCode.ping
&& p.requestHeader.getType() != OpCode.auth) {
synchronized (pendingQueue) {
// Once sent, move it to pendingQueue to await the response
pendingQueue.add(p);
}
}
}
}
if (outgoingQueue.isEmpty()) {
disableWrite();
} else if (!initialized && p != null && !p.bb.hasRemaining()) {
disableWrite();
} else {
// Just in case
enableWrite();
}
}
}
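As we saw earlier, outgoingQueue already holds packets, so at this point the client takes a packet from the queue and writes it to the server. The write path is what we focus on now.
First, outgoingQueue is locked with synchronized. Then:
Packet p = findSendablePacket(outgoingQueue,
cnxn.sendThread.clientTunneledAuthenticationInProgress());
This call picks the next packet to send. Normally that is simply the head of outgoingQueue. While SASL authentication is still in progress, it instead searches for the first packet with a null request header (the priming ConnectRequest packet) and defers everything else.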
private Packet findSendablePacket(LinkedList<Packet> outgoingQueue,
boolean clientTunneledAuthenticationInProgress) {
synchronized (outgoingQueue) {
if (outgoingQueue.isEmpty()) {
return null;
}
if (outgoingQueue.getFirst().bb != null // If we've already starting sending the first packet, we better finish
|| !clientTunneledAuthenticationInProgress) {
return outgoingQueue.getFirst();
}
ListIterator<Packet> iter = outgoingQueue.listIterator();
while (iter.hasNext()) {
Packet p = iter.next();
if (p.requestHeader == null) {
// We've found the priming-packet. Move it to the beginning of the queue.
iter.remove();
outgoingQueue.add(0, p);
return p;
} else {
// Non-priming packet: defer it until later, leaving it in the queue
// until authentication completes.
if (LOG.isDebugEnabled()) {
LOG.debug("deferring non-priming packet: " + p +
"until SASL authentication completes.");
}
}
}
// no sendable packet found.
return null;
}
}
Now let's look again at how the ConnectRequest packet from earlier was built:
ConnectRequest conReq = new ConnectRequest(0, lastZxid,sessionTimeout, sessId, sessionPasswd);
outgoingQueue.addFirst(new Packet(null, null, conReq,null, null, readOnly));
Packet(RequestHeader requestHeader, ReplyHeader replyHeader,
Record request, Record response,
WatchRegistration watchRegistration, boolean readOnly) {
this.requestHeader = requestHeader;
this.replyHeader = replyHeader;
this.request = request;
this.response = response;
this.readOnly = readOnly;
this.watchRegistration = watchRegistration;
}
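Here only request (plus the readOnly flag) is set; requestHeader, replyHeader, response, and watchRegistration are all null. The logic above therefore picks out the packet whose requestHeader == null and makes sure it sits at the head of the queue, so the connect request is handled first.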
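The selection rule can be tried out in isolation. The following sketch reimplements it over a simplified packet type; SendableSketch and Pkt are hypothetical stand-ins, where hasHeader == false models requestHeader == null and started == true models bb != null.

```java
import java.util.LinkedList;
import java.util.ListIterator;

class SendableSketch {
    // Simplified stand-in for ClientCnxn.Packet (illustration only).
    static class Pkt {
        final String name;
        final boolean hasHeader; // false models requestHeader == null
        boolean started;         // true models bb != null (write already begun)
        Pkt(String name, boolean hasHeader) {
            this.name = name;
            this.hasHeader = hasHeader;
        }
    }

    static Pkt findSendable(LinkedList<Pkt> queue, boolean saslInProgress) {
        if (queue.isEmpty()) {
            return null;
        }
        // A half-written head must be finished; without SASL the head always goes next.
        if (queue.getFirst().started || !saslInProgress) {
            return queue.getFirst();
        }
        // SASL in progress: only the header-less priming packet may be sent.
        ListIterator<Pkt> it = queue.listIterator();
        while (it.hasNext()) {
            Pkt p = it.next();
            if (!p.hasHeader) {
                it.remove();
                queue.addFirst(p); // move the priming packet to the head
                return p;
            }
        }
        return null; // everything else waits until authentication completes
    }
}
```

In the flow traced in this article the ConnectRequest was added with addFirst, so it is already at the head and the first branch returns it directly; the search only matters when other packets were queued ahead of it during SASL negotiation.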
if (p != null) {
updateLastSend();
// If we already started writing p, p.bb will already exist
if (p.bb == null) {
if ((p.requestHeader != null) &&
(p.requestHeader.getType() != OpCode.ping) &&
(p.requestHeader.getType() != OpCode.auth)) {
p.requestHeader.setXid(cnxn.getXid());
}
p.createBB();
}
// Write the packet's ByteBuffer out through the socket
sock.write(p.bb);
// Handle partial writes: one write() call may not send the whole buffer
if (!p.bb.hasRemaining()) {
sentCount++;
// Fully sent: remove it from the outgoing queue
outgoingQueue.removeFirstOccurrence(p);
if (p.requestHeader != null
&& p.requestHeader.getType() != OpCode.ping
&& p.requestHeader.getType() != OpCode.auth) {
synchronized (pendingQueue) {
// Once sent, move it to pendingQueue to await the response
pendingQueue.add(p);
}
}
}
}
p is certainly non-null at this point, so this block runs.
First, a ByteBuffer is built:
if (p.bb == null) {
if ((p.requestHeader != null) &&
(p.requestHeader.getType() != OpCode.ping) &&
(p.requestHeader.getType() != OpCode.auth)) {
p.requestHeader.setXid(cnxn.getXid());
}
p.createBB();
}
public void createBB() {
try {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
BinaryOutputArchive boa = BinaryOutputArchive.getArchive(baos);
boa.writeInt(-1, "len"); // We'll fill this in later
if (requestHeader != null) {
requestHeader.serialize(boa, "header");
}
// The request is a ConnectRequest
if (request instanceof ConnectRequest) {
request.serialize(boa, "connect");
// append "am-I-allowed-to-be-readonly" flag
boa.writeBool(readOnly, "readOnly");
} else if (request != null) {
request.serialize(boa, "request");
}
baos.close();
this.bb = ByteBuffer.wrap(baos.toByteArray());
this.bb.putInt(this.bb.capacity() - 4);
this.bb.rewind();
} catch (IOException e) {
LOG.warn("Ignoring unexpected exception", e);
}
}
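This serializes the request object into a ByteBuffer. A placeholder length of -1 is written first and then overwritten with the real payload length (capacity minus the 4 bytes of the length field itself).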
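The placeholder-then-patch trick in createBB() works with any byte stream. Here is a minimal sketch of the same steps, using java.io.DataOutputStream in place of jute's BinaryOutputArchive (a substitution made for self-containment); LengthPrefixSketch and frame are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

class LengthPrefixSketch {
    // Builds [4-byte length][body], following the same steps as createBB().
    static ByteBuffer frame(byte[] body) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(baos);
            out.writeInt(-1);               // placeholder, filled in below
            out.write(body);                // stands in for header + request serialization
            out.close();
            ByteBuffer bb = ByteBuffer.wrap(baos.toByteArray());
            bb.putInt(bb.capacity() - 4);   // overwrite the placeholder with the body length
            bb.rewind();                    // position back to 0, ready to be written out
            return bb;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer bb = frame(new byte[] {1, 2, 3});
        System.out.println("total=" + bb.remaining() + " length field=" + bb.getInt());
    }
}
```

Writing the length last avoids serializing the request twice just to learn its size.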
// Write the packet's ByteBuffer out through the socket
sock.write(p.bb);
Finally, the ConnectRequest is sent out.
// Handle partial writes: one write() call may not send the whole buffer
if (!p.bb.hasRemaining()) {
sentCount++;
// Fully sent: remove it from the outgoing queue
outgoingQueue.removeFirstOccurrence(p);
if (p.requestHeader != null
&& p.requestHeader.getType() != OpCode.ping
&& p.requestHeader.getType() != OpCode.auth) {
synchronized (pendingQueue) {
// Once sent, move it to pendingQueue to await the response
pendingQueue.add(p);
}
}
}
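This block checks whether the packet has been sent completely. If the packet's ByteBuffer is large, a single write may send only part of it, and the rest goes out on later passes. Once the buffer is fully written, the packet is removed from outgoingQueue. Packets that have a request header and are neither ping nor auth are then added to pendingQueue to await a response. The ConnectRequest packet has a null requestHeader, so it is removed from outgoingQueue without being added to pendingQueue.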
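Partial writes are easy to reproduce with a channel that accepts only a few bytes per call. The sketch below uses a made-up ThrottledChannel for that purpose; PartialWriteSketch and passesUntilSent are hypothetical names as well. One caveat: the loop here is a simplification, since in the real client each pass is a separate doIO() call driven by the selector.

```java
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

class PartialWriteSketch {
    // Hypothetical channel that accepts at most 'limit' bytes per write() call,
    // imitating a socket whose send buffer is nearly full.
    static class ThrottledChannel implements WritableByteChannel {
        private final int limit;
        int total; // bytes accepted so far

        ThrottledChannel(int limit) {
            if (limit <= 0) {
                throw new IllegalArgumentException("limit must be positive");
            }
            this.limit = limit;
        }

        @Override
        public int write(ByteBuffer src) {
            int n = Math.min(limit, src.remaining());
            src.position(src.position() + n); // consume n bytes from the buffer
            total += n;
            return n;
        }

        @Override
        public boolean isOpen() {
            return true;
        }

        @Override
        public void close() {
        }
    }

    // Counts how many write passes are needed before hasRemaining() turns false,
    // which is the condition the client uses to dequeue the packet.
    static int passesUntilSent(ByteBuffer bb, ThrottledChannel ch) {
        int passes = 0;
        while (bb.hasRemaining()) {
            ch.write(bb);
            passes++;
        }
        return passes;
    }
}
```

Because the buffer's position survives between calls, the packet can stay at the head of outgoingQueue and resume where it left off, which is why findSendablePacket insists on finishing a packet whose bb already exists.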
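OK, that covers how the client consumes outgoingQueue and sends its packets to the server. Next, let's see how the server handles this kind of request.
In "Hands-On ZooKeeper Source Walkthrough: How the ZooKeeper Cluster Accepts Client Connections" we analyzed how the server listens on port 2181 for client connections, so we go straight to org.apache.zookeeper.server.NIOServerCnxnFactory#run():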
if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
NIOServerCnxn c = (NIOServerCnxn) k.attachment();
c.doIO(k);
}
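Read and write events from clients are handled here. Step into c.doIO(): it branches on read versus write readiness. Since the client has just sent a request, the server needs to read it, so we focus on the read branch: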
if (k.isReadable()) {
int rc = sock.read(incomingBuffer);
if (rc < 0) {
throw new EndOfStreamException(
"Unable to read additional data from client sessionid 0x"
+ Long.toHexString(sessionId)
+ ", likely client has closed socket");
}
if (incomingBuffer.remaining() == 0) {
boolean isPayload;
if (incomingBuffer == lenBuffer) { // start of next request
incomingBuffer.flip();
isPayload = readLength(k);
incomingBuffer.clear();
} else {
// continuation
isPayload = true;
}
if (isPayload) { // not the case for 4letterword
readPayload();
}
else {
// four letter words take care
// need not do anything else
return;
}
}
}
Most of the code above reads data from sock into incomingBuffer. Once the 4-byte length prefix has been read, readPayload() takes over:
private void readPayload() throws IOException, InterruptedException {
// If the buffer is not full yet, keep reading
if (incomingBuffer.remaining() != 0) { // have we read length bytes?
int rc = sock.read(incomingBuffer); // sock is non-blocking, so ok
if (rc < 0) {
throw new EndOfStreamException(
"Unable to read additional data from client sessionid 0x"
+ Long.toHexString(sessionId)
+ ", likely client has closed socket");
}
}
// The whole payload has been read
if (incomingBuffer.remaining() == 0) { // have we read length bytes?
packetReceived();
incomingBuffer.flip();
if (!initialized) {
readConnectRequest();
} else {
readRequest();
}
lenBuffer.clear();
incomingBuffer = lenBuffer;
}
}
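This also deals with TCP fragmentation: the payload may arrive across several reads, and only once it has been read in full does execution reach the following logic: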
if (incomingBuffer.remaining() == 0) { // have we read length bytes?
packetReceived();
incomingBuffer.flip();
if (!initialized) {
readConnectRequest();
} else {
readRequest();
}
lenBuffer.clear();
incomingBuffer = lenBuffer;
}
The first packet on a connection always goes to readConnectRequest(); every later one is handled by readRequest().
private void readConnectRequest() throws IOException, InterruptedException {
if (!isZKServerRunning()) {
throw new IOException("ZooKeeperServer not running");
}
zkServer.processConnectRequest(this, incomingBuffer);
initialized = true;
}
The real work happens in zkServer.processConnectRequest, in the ZooKeeperServer class:
public void processConnectRequest(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOException {
BinaryInputArchive bia = BinaryInputArchive.getArchive(new ByteBufferInputStream(incomingBuffer));
ConnectRequest connReq = new ConnectRequest();
connReq.deserialize(bia, "connect");
if (LOG.isDebugEnabled()) {
LOG.debug("Session establishment request from client "
+ cnxn.getRemoteSocketAddress()
+ " client's lastZxid is 0x"
+ Long.toHexString(connReq.getLastZxidSeen()));
}
boolean readOnly = false;
try {
readOnly = bia.readBool("readOnly");
cnxn.isOldClient = false;
} catch (IOException e) {
// this is ok -- just a packet from an old client which
// doesn't contain readOnly field
LOG.warn("Connection request from old client "
+ cnxn.getRemoteSocketAddress()
+ "; will be dropped if server is in r-o mode");
}
if (readOnly == false && this instanceof ReadOnlyZooKeeperServer) {
String msg = "Refusing session request for not-read-only client "
+ cnxn.getRemoteSocketAddress();
LOG.info(msg);
throw new CloseRequestException(msg);
}
if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid) {
String msg = "Refusing session request for client "
+ cnxn.getRemoteSocketAddress()
+ " as it has seen zxid 0x"
+ Long.toHexString(connReq.getLastZxidSeen())
+ " our last zxid is 0x"
+ Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
+ " client must try another server";
LOG.info(msg);
throw new CloseRequestException(msg);
}
int sessionTimeout = connReq.getTimeOut();
byte passwd[] = connReq.getPasswd();
int minSessionTimeout = getMinSessionTimeout();
if (sessionTimeout < minSessionTimeout) {
sessionTimeout = minSessionTimeout;
}
int maxSessionTimeout = getMaxSessionTimeout();
if (sessionTimeout > maxSessionTimeout) {
sessionTimeout = maxSessionTimeout;
}
cnxn.setSessionTimeout(sessionTimeout);
// We don't want to receive any packets until we are sure that the
// session is setup
cnxn.disableRecv();
long sessionId = connReq.getSessionId();
if (sessionId != 0) {
long clientSessionId = connReq.getSessionId();
LOG.info("Client attempting to renew session 0x"
+ Long.toHexString(clientSessionId)
+ " at " + cnxn.getRemoteSocketAddress());
serverCnxnFactory.closeSession(sessionId);
cnxn.setSessionId(sessionId);
reopenSession(cnxn, sessionId, passwd, sessionTimeout);
} else {
LOG.info("Client attempting to establish new session at "
+ cnxn.getRemoteSocketAddress());
// The sessionId is always 0 on the first connection
// The session is created by the server; the client merely sends the ConnectRequest
createSession(cnxn, passwd, sessionTimeout);
}
}
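This code handles the first ConnectRequest a client sends. It reads the data from the ByteBuffer and deserializes it, checks the read-only flag, rejects clients whose last seen zxid is ahead of the server's, and clamps the requested session timeout to the server's minimum and maximum. It then uses the sessionId in the request to decide what to do: a sessionId of 0 means a new session is created, any other value means the existing session is reopened.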
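The three decisions in processConnectRequest can be expressed as small pure functions. The sketch below does so with hypothetical names (ConnectChecksSketch and its methods); the bounds 4000 and 40000 in main are example values only, chosen to match what the default minimum and maximum would be for a tickTime of 2000 ms.

```java
class ConnectChecksSketch {
    // Clamps the client's requested session timeout into [min, max],
    // in the same order as the two if-statements in the excerpt.
    static int negotiateTimeout(int requested, int min, int max) {
        int timeout = requested;
        if (timeout < min) {
            timeout = min;
        }
        if (timeout > max) {
            timeout = max;
        }
        return timeout;
    }

    // A client that has seen a newer zxid than this server has processed is refused.
    static boolean clientAheadOfServer(long clientLastZxid, long serverLastProcessedZxid) {
        return clientLastZxid > serverLastProcessedZxid;
    }

    // sessionId == 0 means "create a new session"; otherwise the session is reopened.
    static boolean isNewSession(long sessionId) {
        return sessionId == 0;
    }

    public static void main(String[] args) {
        System.out.println(negotiateTimeout(1_000, 4_000, 40_000));
        System.out.println(negotiateTimeout(100_000, 4_000, 40_000));
    }
}
```

The practical consequence of the clamp is that the timeout a client asks for is only a request; the value the server settles on is the one that governs the session.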
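Summary:
The client opens a connection to the server and, as needed, queues auth data, registered watchers, and the connect request onto outgoingQueue. Once the connection is established, the client takes packets from outgoingQueue and sends them to the zk server. The ConnectRequest sits at the head of the queue, so it goes out first. When a packet has been fully written it is removed from outgoingQueue; ordinary requests are then added to pendingQueue to await the server's response, while the ConnectRequest, ping, and auth packets are not.
The server reads the data the client sent and hands the first packet to ZooKeeperServer.processConnectRequest().
In the next post we will see how the server processes this request and how it responds to the client.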