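The bin directory of the ZooKeeper installation contains the scripts that start the client and the server.
Opening the client script (zkCli.sh) shows the following: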
"$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
-cp "$CLASSPATH" $CLIENT_JVMFLAGS $JVMFLAGS \
org.apache.zookeeper.ZooKeeperMain "$@"
This is simply the entry point that launches a Java program (ZooKeeperMain).
Everything therefore starts at org.apache.zookeeper.ZooKeeperMain.main:
public static void main(String args[]) throws CliException, IOException, InterruptedException
{
ZooKeeperMain main = new ZooKeeperMain(args);
main.run();
}
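This looks a bit like a Spring Boot launcher class: first construct an object, then call its run method.
Constructing the launcher class
- First, the constructor: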
public ZooKeeperMain(String args[]) throws IOException, InterruptedException {
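// Parse the arguments passed in from outside: server, timeout, readonly
// The parsed result is stored in the static inner class MyCommandOptions
// $ bin/zkCli.sh -server 127.0.0.1:2181
// Running the command above specifies the server address explicitly, so the option server=127.0.0.1:2181 is saved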
cl.parseOptions(args);
// Print the configured server to the console --> could a logging framework be used here instead?
System.out.println("Connecting to " + cl.getOption("server"));
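// Connect to the server. Performing I/O inside a constructor: is that something to be wary of?
// While connecting, the arguments (above all the server address) are parsed, NIO or Netty is chosen
// for network communication according to the configuration, and the network connection object is created.
// Two threads are then started: SendThread, which sends connection, heartbeat, and client command
// requests, and EventThread, which handles asynchronous events. During this step a server is
// selected and the actual network operations are performed.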
connectToZK(cl.getOption("server"));
}
Default configuration (MyCommandOptions parses the system configuration and the commands typed on the command line):
// Options are kept in a HashMap
private Map<String,String> options = new HashMap<String,String>();
// Arguments are matched with regular-expression Patterns
public static final Pattern ARGS_PATTERN = Pattern.compile("\\s*([^\"\']\\S*|\"[^\"]*\"|'[^']*')\\s*");
public static final Pattern QUOTED_PATTERN = Pattern.compile("^([\'\"])(.*)(\\1)$");
public MyCommandOptions() {
options.put("server", "localhost:2181");
options.put("timeout", "30000");
}
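To make the "defaults first, command line overrides" behaviour concrete, here is a minimal sketch. `CliOptionsSketch` and its `parse` method are hypothetical stand-ins written for illustration, not the real `MyCommandOptions`. The sketch assumes every option is written as `-name value`. Only the two defaults and the `-server` flag come from the code above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for MyCommandOptions (not the real class). */
final class CliOptionsSketch {
    private final Map<String, String> options = new HashMap<>();

    CliOptionsSketch() {
        // Same defaults as in the constructor shown above.
        options.put("server", "localhost:2181");
        options.put("timeout", "30000");
    }

    /** Assumption for this sketch: every option is written as "-name value". */
    void parse(String[] args) {
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (args[i].startsWith("-")) {
                // A value given on the command line replaces the default.
                options.put(args[i].substring(1), args[i + 1]);
            }
        }
    }

    /** Returns the option value, or null when the option was never set. */
    String get(String name) {
        return options.get(name);
    }
}
```

With `-server 127.0.0.1:2181` on the command line, `get("server")` returns the explicit address, `get("timeout")` still returns the default `30000`, and `get("readonly")` returns null. A null value is what `connectToZK` tests for.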
- Connecting to the server
protected void connectToZK(String newHost) throws InterruptedException, IOException {
// If a client is already running, close it before reconnecting
if (zk != null && zk.getState().isAlive()) {
zk.close();
}
host = newHost;
boolean readOnly = cl.getOption("readonly") != null;
if (cl.getOption("secure") != null) {
System.setProperty(ZKClientConfig.SECURE_CLIENT, "true");
System.out.println("Secure connection is enabled");
}
// Construct yet another object; judging by its name, it is a class for administering ZooKeeper
// A MyWatcher is also constructed here
zk = new ZooKeeperAdmin(host, Integer.parseInt(cl.getOption("timeout")), new MyWatcher(), readOnly);
}
// This class looks unimportant (it only prints notifications for asynchronous events), but ZooKeeper's Watcher mechanism itself plays a major role
private class MyWatcher implements Watcher {
public void process(WatchedEvent event) {
// The printWatches property defaults to true
if (getPrintWatches()) {
ZooKeeperMain.printMessage("WATCHER::");
ZooKeeperMain.printMessage(event.toString());
}
}
}
public static void printMessage(String msg) {
System.out.println("\n"+msg);
}
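(Figure: class hierarchy of ZooKeeperAdmin)
The ZooKeeperAdmin constructor ultimately calls the constructor of its parent class, org.apache.zookeeper.ZooKeeper.
ZooKeeper is the main class of the client library. Its Javadoc, with annotations added, reads:
/**
* Note: methods are thread-safe unless stated otherwise.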
* This is the main class of ZooKeeper client library. To use a ZooKeeper
* service, an application must first instantiate an object of ZooKeeper class.
* All the iterations will be done by calling the methods of ZooKeeper class.
* The methods of this class are thread-safe unless otherwise noted.
*
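* <p> Note: once the connection to a server succeeds, a session ID is assigned; the client then keeps the session valid by sending heartbeats periodically.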
* Once a connection to a server is established, a session ID is assigned to the
* client. The client will send heart beats to the server periodically to keep
* the session valid.
*
* <p>
* The application can call ZooKeeper APIs through a client as long as the
* session ID of the client remains valid.
*
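* <p> Note: if the client sends no heartbeat for a long time (longer than sessionTimeout), the server expires the session
* and the API can no longer be used unless a new client object is created.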
* If for some reason, the client fails to send heart beats to the server for a
* prolonged period of time (exceeding the sessionTimeout value, for instance),
* the server will expire the session, and the session ID will become invalid.
* The client object will no longer be usable. To make ZooKeeper API calls, the
* application must create a new client object.
*
* <p> Note: if the currently connected server fails or stops responding, the client automatically tries another server before the session ID expires.
* If the ZooKeeper server the client currently connects to fails or otherwise
* does not respond, the client will automatically try to connect to another
* server before its session ID expires. If successful, the application can
* continue to use the client.
*
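* <p> Note: the methods of this class are either synchronous or asynchronous. A synchronous method blocks until the server responds.
* An asynchronous method only queues the request and returns quickly; once the server responds, the result is handled through a callback.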
* The ZooKeeper API methods are either synchronous or asynchronous. Synchronous
* methods blocks until the server has responded. Asynchronous methods just
* queue the request for sending and return immediately. They take a callback
* object that will be executed either on successful execution of the request or
* on error with an appropriate return code (rc) indicating the error.
*
* <p>
* Some successful ZooKeeper API calls can leave watches on the "data nodes" in
* the ZooKeeper server. Other successful ZooKeeper API calls can trigger those
* watches. Once a watch is triggered, an event will be delivered to the client
* which left the watch at the first place. Each watch can be triggered only
* once. Thus, up to one event will be delivered to a client for every watch it
* leaves.
* <p>
* A client needs an object of a class implementing Watcher interface for
* processing the events delivered to the client.
*
* When a client drops the current connection and re-connects to a server, all
* the existing watches are considered as being triggered but the undelivered
* events are lost. To emulate this, the client will generate a special event to
* tell the event handler a connection has been dropped. This special event has
* EventType None and KeeperState Disconnected.
*
*/
/*
* We suppress the "try" warning here because the close() method's signature
* allows it to throw InterruptedException which is strongly advised against by
* AutoCloseable (see:
* http://docs.oracle.com/javase/7/docs/api/java/lang/AutoCloseable.html#close()
* ). close() will never throw an InterruptedException but the exception remains
* in the signature for backwards compatibility purposes.
*/
Constructing the ZooKeeper class
/**
* Note: creating a ZooKeeper client object requires a host:port argument; several pairs may be given.
* To create a ZooKeeper client object, the application needs to pass a
* connection string containing a comma separated list of host:port pairs, each
* corresponding to a ZooKeeper server.
* <p> Note: the session is established asynchronously; the watcher is notified of the result.
* Session establishment is asynchronous. This constructor will initiate
* connection to the server and return immediately - potentially (usually)
* before the session is fully established. The watcher argument specifies the
* watcher that will be notified of any changes in state. This notification can
* come at any point before or after the constructor call has returned.
* <p> Note: when several servers are given, one is picked at random, and others are tried until a connection succeeds.
* The instantiated ZooKeeper client object will pick an arbitrary server from
* the connectString and attempt to connect to it. If establishment of the
* connection fails, another server in the connect string will be tried (the
* order is non-deterministic, as we random shuffle the list), until a
* connection is established. The client will continue attempts until the
* session is explicitly closed.
* <p> Note: the optional chroot suffix sets a default root for every path the client uses; see the connectString parameter below.
* Added in 3.2.0: An optional "chroot" suffix may also be appended to the
* connection string. This will run the client commands while interpreting all
* paths relative to this root (similar to the unix chroot command).
* <p>
* For backward compatibility, there is another version
* {@link #ZooKeeper(String, int, Watcher, boolean)} which uses default
* {@link StaticHostProvider}
*
* @param connectString comma separated host:port pairs, each corresponding to
* a zk server. e.g.
* "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002" If the
* optional chroot suffix is used the example would look
* like:
* "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002/app/a"
* where the client would be rooted at "/app/a" and all
* paths would be relative to this root - ie
* getting/setting/etc... "/foo/bar" would result in
* operations being run on "/app/a/foo/bar" (from the
* server perspective).
* @param sessionTimeout session timeout in milliseconds
* @param watcher a watcher object which will be notified of state
* changes, may also be notified for node events
* @param canBeReadOnly (added in 3.4) whether the created client is allowed to
* go to read-only mode in case of partitioning. Read-only
* mode basically means that if the client can't find any
* majority servers but there's partitioned server it
* could reach, it connects to one in read-only mode, i.e.
* read requests are allowed while write requests are not.
* It continues seeking for majority in the background.
* @param aHostProvider use this as HostProvider to enable custom behaviour.
* @param clientConfig (added in 3.5.2) passing this conf object gives each
* client the flexibility of configuring properties
* differently compared to other instances
* @throws IOException in cases of network failure
* @throws IllegalArgumentException if an invalid chroot path is specified
*/
/**
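* Core connection object of the client. It contains two threads: SendThread and EventThread.
* SendThread is an I/O thread responsible for the network I/O between the ZooKeeper client and the server; EventThread is an event thread that processes events coming from the server.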
*/
protected final ClientCnxn cnxn;
private static final Logger LOG;
static {
// Keep these two lines together to keep the initialization order explicit
LOG = LoggerFactory.getLogger(ZooKeeper.class);
Environment.logEnv("Client environment:", LOG);
}
/**
* Manages the list of server addresses used by the client
*/
protected final HostProvider hostProvider;
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, boolean canBeReadOnly,
HostProvider aHostProvider, ZKClientConfig clientConfig) throws IOException {
LOG.info("Initiating client connection, connectString=" + connectString + " sessionTimeout="
+ sessionTimeout + " watcher=" + watcher);
if (clientConfig == null) {
clientConfig = new ZKClientConfig();
}
this.clientConfig = clientConfig;
// 1. Set the default Watcher
watchManager = defaultWatchManager();
watchManager.defaultWatcher = watcher; // use the watcher passed in as the default watcher
// 2. Set the ZooKeeper server address list
ConnectStringParser connectStringParser = new ConnectStringParser(connectString);
hostProvider = aHostProvider;
// 3. Obtain the network connection object (getClientCnxnSocket only instantiates it)
// 4. Create the ClientCnxn
cnxn = createConnection(connectStringParser.getChrootPath(), hostProvider, sessionTimeout, this,
watchManager, getClientCnxnSocket(), canBeReadOnly);
// 5. Start sendThread and eventThread so that their run methods execute; the connection request inside sendThread is issued at this point
cnxn.start();
}
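The connectString format described in the constructor's Javadoc (comma-separated host:port pairs plus an optional chroot suffix) can be illustrated with a small sketch. `ConnectStringSketch` is a hypothetical class written for this post. It is not the real `ConnectStringParser`, and it skips the validation and default-port handling a real parser would need.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of connect-string splitting; not the real ConnectStringParser. */
final class ConnectStringSketch {
    final List<String> servers = new ArrayList<>();
    final String chrootPath; // null when no chroot suffix is present

    ConnectStringSketch(String connectString) {
        int slash = connectString.indexOf('/');
        String hostPart = connectString;
        String chroot = null;
        if (slash >= 0) {
            hostPart = connectString.substring(0, slash);
            String suffix = connectString.substring(slash);
            // A bare "/" is treated as "no chroot" in this sketch.
            chroot = suffix.length() == 1 ? null : suffix;
        }
        this.chrootPath = chroot;
        for (String hostPort : hostPart.split(",")) {
            String trimmed = hostPort.trim();
            if (!trimmed.isEmpty()) {
                servers.add(trimmed);
            }
        }
    }

    /** Maps a client-side path to the path the server would see. */
    String toServerPath(String clientPath) {
        if (chrootPath == null) {
            return clientPath;
        }
        return "/".equals(clientPath) ? chrootPath : chrootPath + clientPath;
    }
}
```

For the Javadoc example `"127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002/app/a"`, the sketch yields three servers and the chroot `/app/a`. The client path `/foo/bar` then maps to `/app/a/foo/bar` on the server side.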
Reading the system configuration
public ZKClientConfig() {
super();
initFromJavaSystemProperties();
}
/**
* Initialize all the ZooKeeper client properties which are configurable as
* java system property
*/
private void initFromJavaSystemProperties() {
setProperty(ZOOKEEPER_REQUEST_TIMEOUT,
System.getProperty(ZOOKEEPER_REQUEST_TIMEOUT));
}
// org.apache.zookeeper.common.ZKConfig, the parent class of ZKClientConfig
/**
* properties, which are common to both client and server, are initialized
* from system properties
*/
public ZKConfig() {
init();
}
private void init() {
/**
* backward compatibility for all currently available client properties
*/
handleBackwardCompatibility();
}
Setting the default Watcher
/* Useful for testing watch handling behavior */
protected ZKWatchManager defaultWatchManager() {
// Manage watchers & handle events generated by the ClientCnxn object.
return new ZKWatchManager(getClientConfig().getBoolean(ZKClientConfig.DISABLE_AUTO_WATCH_RESET));
}
Obtaining the network connection
private ClientCnxnSocket getClientCnxnSocket() throws IOException {
String clientCnxnSocketName = getClientConfig()
.getProperty(ZKClientConfig.ZOOKEEPER_CLIENT_CNXN_SOCKET);
if (clientCnxnSocketName == null) {
// No client connection implementation configured: fall back to the default NIO one; Netty is the other option
clientCnxnSocketName = ClientCnxnSocketNIO.class.getName();
}
try {
Constructor<?> clientCxnConstructor = Class.forName(clientCnxnSocketName)
.getDeclaredConstructor(ZKClientConfig.class);
ClientCnxnSocket clientCxnSocket = (ClientCnxnSocket) clientCxnConstructor
.newInstance(getClientConfig());
return clientCxnSocket;
} catch (Exception e) {
IOException ioe = new IOException("Couldn't instantiate " + clientCnxnSocketName);
ioe.initCause(e);
throw ioe;
}
}
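Besides plain NIO, the client can also communicate through Netty (Netty is built on NIO, but wraps it more conveniently and works around some known NIO bugs). The implementation is selected with the zookeeper.clientCnxnSocket property.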
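The selection mechanism in getClientCnxnSocket (read a class name from configuration, fall back to a default, instantiate it reflectively) can be reduced to the sketch below. All types here are hypothetical, and the property name `sketch.clientCnxnSocket` is made up so that the example does not touch real ZooKeeper settings. Unlike the real code, the sketch uses no-argument constructors.

```java
/** Hypothetical transport interface standing in for ClientCnxnSocket. */
interface TransportSketch {
    String name();
}

final class NioTransportSketch implements TransportSketch {
    public String name() { return "nio"; }
}

final class NettyTransportSketch implements TransportSketch {
    public String name() { return "netty"; }
}

final class TransportFactorySketch {
    // Made-up property name for this sketch; the real client reads zookeeper.clientCnxnSocket.
    static final String PROPERTY = "sketch.clientCnxnSocket";

    static TransportSketch create() {
        String className = System.getProperty(PROPERTY);
        if (className == null) {
            // Nothing configured: default to the NIO-style implementation.
            className = NioTransportSketch.class.getName();
        }
        try {
            return (TransportSketch) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Couldn't instantiate " + className, e);
        }
    }
}
```

The design keeps the caller independent of the concrete transport: swapping implementations only requires setting the property to another class name, with no code change.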
The most important fields of the socket class (ClientCnxnSocket) are:
// Responsible for sending requests
protected ClientCnxn.SendThread sendThread;
// Buffers the requests to be sent and preserves their order
protected LinkedBlockingDeque<Packet> outgoingQueue;
In the NIO implementation, the key methods are doTransport and doIO (which send requests and receive responses):
@Override
void doTransport(int waitTimeOut, List<Packet> pendingQueue, ClientCnxn cnxn)
throws IOException, InterruptedException {
selector.select(waitTimeOut);
Set<SelectionKey> selected;
synchronized (this) {
selected = selector.selectedKeys();
}
// Everything below and until we get back to the select is
// non blocking, so time is effectively a constant. That is
// Why we just have to do this once, here
updateNow();
for (SelectionKey k : selected) {
SocketChannel sc = ((SocketChannel) k.channel());
if ((k.readyOps() & SelectionKey.OP_CONNECT) != 0) {
if (sc.finishConnect()) {
updateLastSendAndHeard();
updateSocketAddresses();
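// Build the ConnectRequest
/*
* The previous steps only established the socket connection between client and server at the TCP level; the ZooKeeper client session is far from created at this point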
*/
sendThread.primeConnection();
}
} else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
// Response handling phase
/*
* Receive read/write events and process them
*/
doIO(pendingQueue, cnxn);
}
}
if (sendThread.getZkState().isConnected()) {
if (findSendablePacket(outgoingQueue, sendThread.tunnelAuthInProgress()) != null) {
enableWrite();
}
}
selected.clear();
}
/**
* Core logic: sends requests and receives responses
* @return true if a packet was received
* @throws InterruptedException
* @throws IOException
*/
void doIO(List<Packet> pendingQueue, ClientCnxn cnxn) throws InterruptedException, IOException {
SocketChannel sock = (SocketChannel) sockKey.channel();
if (sock == null) {
throw new IOException("Socket is null!");
}
if (sockKey.isReadable()) {
int rc = sock.read(incomingBuffer);
if (rc < 0) {
throw new EndOfStreamException("Unable to read additional data from server sessionid 0x"
+ Long.toHexString(sessionId) + ", likely server has closed socket");
}
if (!incomingBuffer.hasRemaining()) {
incomingBuffer.flip();
if (incomingBuffer == lenBuffer) {
recvCount.getAndIncrement();
readLength();
} else if (!initialized) {
// Not initialized yet, so this response must be the reply to the session-creation request; hand it straight to readConnectResult
readConnectResult();
enableRead();
if (findSendablePacket(outgoingQueue, sendThread.tunnelAuthInProgress()) != null) {
// Since SASL authentication has completed (if client is configured to do so),
// outgoing packets waiting in the outgoingQueue can now be sent.
enableWrite();
}
lenBuffer.clear();
incomingBuffer = lenBuffer;
updateLastHeard();
initialized = true;
} else {
// Handle every other kind of incoming message, such as heartbeat, authentication, and regular request responses
sendThread.readResponse(incomingBuffer);
lenBuffer.clear();
incomingBuffer = lenBuffer;
updateLastHeard();
}
}
}
if (sockKey.isWritable()) {
// Take one sendable Packet from the outgoing queue; a client request sequence number (XID) is generated and set in the Packet's request header
Packet p = findSendablePacket(outgoingQueue, sendThread.tunnelAuthInProgress());
if (p != null) {
updateLastSend();
// If we already started writing p, p.bb will already exist
if (p.bb == null) {
if ((p.requestHeader != null) && (p.requestHeader.getType() != OpCode.ping)
&& (p.requestHeader.getType() != OpCode.auth)) {
// Set the request header sequence number (XID)
p.requestHeader.setXid(cnxn.getXid());
}
// Serialize the packet
p.createBB();
}
sock.write(p.bb);
if (!p.bb.hasRemaining()) {
sentCount.getAndIncrement();
// Remove the Packet that has been sent
outgoingQueue.removeFirstOccurrence(p);
if (p.requestHeader != null && p.requestHeader.getType() != OpCode.ping
&& p.requestHeader.getType() != OpCode.auth) {
synchronized (pendingQueue) {
// Park the packet in pendingQueue so it can be processed once the server's response arrives
pendingQueue.add(p);
}
}
}
}
if (outgoingQueue.isEmpty()) {
// No more packets to send: turn off write interest flag.
// Will be turned on later by a later call to enableWrite(),
// from within ZooKeeperSaslClient (if client is configured
// to attempt SASL authentication), or in either doIO() or
// in doTransport() if not.
disableWrite();
} else if (!initialized && p != null && !p.bb.hasRemaining()) {
// On initial connection, write the complete connect request
// packet, but then disable further writes until after
// receiving a successful connection response. If the
// session is expired, then the server sends the expiration
// response and immediately closes its end of the socket. If
// the client is simultaneously writing on its end, then the
// TCP stack may choose to abort with RST, in which case the
// client would never receive the session expired event. See
// http://docs.oracle.com/javase/6/docs/technotes/guides/net/articles/connection_release.html
disableWrite();
} else {
// Just in case
enableWrite();
}
}
}
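The hand-off between outgoingQueue and pendingQueue in doIO is the heart of the request pipeline, so a stripped-down model may help. The sketch below is hypothetical and single-threaded, with no sockets and no serialization. It only shows that a packet receives its xid when it is sent, is parked in the pending queue, and is completed when a reply with the same xid arrives in order. The real code additionally keeps ping and auth packets out of pendingQueue, which this sketch omits.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of the outgoingQueue/pendingQueue hand-off; not ZooKeeper code. */
final class TwoQueueSketch {
    static final class Packet {
        final String payload;
        int xid;
        boolean finished;

        Packet(String payload) {
            this.payload = payload;
        }
    }

    private final Deque<Packet> outgoingQueue = new ArrayDeque<>();
    private final Deque<Packet> pendingQueue = new ArrayDeque<>();
    private int nextXid = 1;

    /** Queues a request for sending and returns immediately. */
    Packet submit(String payload) {
        Packet p = new Packet(payload);
        outgoingQueue.addLast(p);
        return p;
    }

    /** "Sends" the head packet: assign an xid, then park it until the reply arrives. */
    boolean sendOne() {
        Packet p = outgoingQueue.pollFirst();
        if (p == null) {
            return false;
        }
        p.xid = nextXid++;
        pendingQueue.addLast(p);
        return true;
    }

    /** Replies must arrive in request order, so the head of pendingQueue must match. */
    Packet onReply(int replyXid) {
        Packet p = pendingQueue.pollFirst();
        if (p == null) {
            throw new IllegalStateException("Nothing in the queue, but got " + replyXid);
        }
        if (p.xid != replyXid) {
            throw new IllegalStateException("Xid out of order. Got " + replyXid + " expected " + p.xid);
        }
        p.finished = true;
        return p;
    }
}
```

Because replies are matched strictly against the head of the pending queue, an unexpected xid is treated as a protocol error rather than being searched for further down the queue.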
Creating the ClientCnxn
protected ClientCnxn createConnection(String chrootPath, HostProvider hostProvider, int sessionTimeout,
ZooKeeper zooKeeper, ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
boolean canBeReadOnly) throws IOException {
// Create and initialize the client network connector
return new ClientCnxn(chrootPath, hostProvider, sessionTimeout, this, watchManager, clientCnxnSocket,
canBeReadOnly);
}
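The class name org.apache.zookeeper.ClientCnxn is poorly chosen; it is hard to tell from the name what the class does ("Cnxn" abbreviates "connection").
Its Javadoc reads: This class manages the socket i/o for the client. ClientCnxn maintains a list of available servers to connect to and “transparently” switches servers it is connected to as needed.
ClientCnxn is the network connector: it manages all network interaction between the client and the server. It holds two core queues, outgoingQueue and pendingQueue, and two core threads, SendThread and EventThread. SendThread manages all network I/O between client and server, while EventThread handles client-side events. The client also hands the ClientCnxnSocket to SendThread as the underlying network I/O handler and initializes EventThread's waitingEvents queue, which holds every event waiting to be processed by the client.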
/**
* Note: creates the network connection object.
* Creates a connection object. The actual network connect doesn't get
* established until needed. The start() instance method must be called
* subsequent to construction.
*
* @param chrootPath - the chroot of this client. Should be removed from
* this Class in ZOOKEEPER-838
* @param hostProvider the list of ZooKeeper servers to connect to
* @param sessionTimeout the timeout for connections.
* @param zooKeeper the zookeeper object that this connection is related
* to.
* @param watcher watcher for this connection
* @param clientCnxnSocket the socket implementation used (e.g. NIO/Netty)
* @param sessionId session id if re-establishing session
* @param sessionPasswd session passwd if re-establishing session
* @param canBeReadOnly whether the connection is allowed to go to read-only
* mode in case of partitioning
* @throws IOException
*/
public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket, long sessionId,
byte[] sessionPasswd, boolean canBeReadOnly) {
this.zooKeeper = zooKeeper;
this.watcher = watcher;
this.sessionId = sessionId;
this.sessionPasswd = sessionPasswd;
this.sessionTimeout = sessionTimeout;
this.hostProvider = hostProvider;
this.chrootPath = chrootPath;
connectTimeout = sessionTimeout / hostProvider.size();
readTimeout = sessionTimeout * 2 / 3;
readOnly = canBeReadOnly;
// Core network thread: manages all network I/O between client and server
sendThread = new SendThread(clientCnxnSocket);
// Core thread: handles client-side events
eventThread = new EventThread();
this.clientConfig = zooKeeper.getClientConfig();
initRequestTimeout();
}
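The two timeouts derived in this constructor are plain integer arithmetic, and a worked example makes the numbers tangible. The helper class below is made up for illustration. The first two formulas are copied from the constructor above, and the third mirrors the timeToNextPing expression that appears later in SendThread.run().

```java
/** Worked example of the timeout arithmetic; the class and method names are made up. */
final class TimeoutSketch {
    /** Time budget for one connection attempt: the session timeout split across all servers. */
    static int connectTimeout(int sessionTimeoutMs, int serverCount) {
        return sessionTimeoutMs / serverCount;
    }

    /** The client gives up on a silent server after two thirds of the session timeout. */
    static int readTimeout(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    /** Mirrors the timeToNextPing expression in SendThread.run(); a value <= 0 means "ping now". */
    static int timeToNextPing(int readTimeoutMs, int idleSendMs) {
        return readTimeoutMs / 2 - idleSendMs - (idleSendMs > 1000 ? 1000 : 0);
    }
}
```

With the CLI default of 30000 ms and three servers, connectTimeout is 10000 ms and readTimeout is 20000 ms. A client that has been idle for 2000 ms schedules the next ping 7000 ms out (10000 - 2000 - 1000). After 9500 ms of idleness the value drops to -500, so a ping is sent immediately.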
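/**
* This class services the outgoing request queue and generates the heart beats.
* It also spawns the ReadThread.
* Note: SendThread is the core I/O scheduling thread inside ClientCnxn and
* manages all network I/O between the client and the server.
* While the ZooKeeper client is running, SendThread on the one hand maintains
* the session lifecycle between client and server by sending a PING packet to the
* server at regular intervals as a heartbeat. If the TCP connection drops
* within the session lifetime, it reconnects automatically and transparently.
* On the other hand, SendThread manages all request sending and response
* receiving for the client: it converts API calls from the upper layer into the
* corresponding protocol requests, sends them to the server, and completes the
* return of synchronous calls and the callbacks of asynchronous calls.
* SendThread also passes events from the server on to EventThread for processing.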
*/
class SendThread extends ZooKeeperThread {
private long lastPingSentNs;
private final ClientCnxnSocket clientCnxnSocket;
private Random r = new Random();
private boolean isFirstConnect = true;
@Override
public void run() {
// First check the current client state and do some cleanup work to prepare for sending the "session creation" request
clientCnxnSocket.introduce(this, sessionId, outgoingQueue);
clientCnxnSocket.updateNow();
clientCnxnSocket.updateLastSendAndHeard();
int to;
long lastPingRwServer = Time.currentElapsedTime();
final int MAX_SEND_PING_INTERVAL = 10000; // 10 seconds
InetSocketAddress serverAddress = null;
while (state.isAlive()) {
try {
// Not connected yet: establish a connection
if (!clientCnxnSocket.isConnected()) {
// don't re-establish connection if we are closing
if (closing) {
break;
}
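// Obtain a server address
/*
* Before creating the TCP connection, SendThread needs the target address of a ZooKeeper server, usually picked at random from the HostProvider.
* It then delegates to the network connector to create the TCP connection to that ZooKeeper server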
*/
if (rwServerAddress != null) {
serverAddress = rwServerAddress;
rwServerAddress = null;
} else {
serverAddress = hostProvider.next(1000);
}
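// Create the TCP connection
/*
* Once a server address is available, the network connector creates a long-lived TCP connection to the server and sets the state to CONNECTING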
*/
startConnect(serverAddress);
clientCnxnSocket.updateLastSendAndHeard();
}
// Connected: check whether authentication has completed
if (state.isConnected()) {
// determine whether we need to send an AuthFailed event.
if (zooKeeperSaslClient != null) {
boolean sendAuthEvent = false;
if (zooKeeperSaslClient.getSaslState() == ZooKeeperSaslClient.SaslState.INITIAL) {
try {
zooKeeperSaslClient.initialize(ClientCnxn.this);
} catch (SaslException e) {
LOG.error(
"SASL authentication with Zookeeper Quorum member failed: " + e);
state = States.AUTH_FAILED;
sendAuthEvent = true;
}
}
KeeperState authState = zooKeeperSaslClient.getKeeperState();
if (authState != null) {
if (authState == KeeperState.AuthFailed) {
// An authentication error occurred during authentication with the
// Zookeeper Server.
state = States.AUTH_FAILED;
sendAuthEvent = true;
} else {
if (authState == KeeperState.SaslAuthenticated) {
sendAuthEvent = true;
}
}
}
if (sendAuthEvent) {
eventThread.queueEvent(
new WatchedEvent(Watcher.Event.EventType.None, authState, null));
if (state == States.AUTH_FAILED) {
eventThread.queueEventOfDeath();
}
}
}
to = readTimeout - clientCnxnSocket.getIdleRecv();
} else {
to = connectTimeout - clientCnxnSocket.getIdleRecv();
}
if (to <= 0) {
String warnInfo;
warnInfo = "Client session timed out, have not heard from server in "
+ clientCnxnSocket.getIdleRecv() + "ms" + " for sessionid 0x"
+ Long.toHexString(sessionId);
LOG.warn(warnInfo);
throw new SessionTimeoutException(warnInfo);
}
if (state.isConnected()) {
// 1000(1 second) is to prevent race condition missing to send the second ping
// also make sure not to send too many pings when readTimeout is small
int timeToNextPing = readTimeout / 2 - clientCnxnSocket.getIdleSend()
- ((clientCnxnSocket.getIdleSend() > 1000) ? 1000 : 0);
// send a ping request either time is due or no packet sent out within
// MAX_SEND_PING_INTERVAL
if (timeToNextPing <= 0 || clientCnxnSocket.getIdleSend() > MAX_SEND_PING_INTERVAL) {
sendPing();
clientCnxnSocket.updateLastSend();
} else {
if (timeToNextPing < to) {
to = timeToNextPing;
}
}
}
// If we are in read-only mode, seek for read/write server
if (state == States.CONNECTEDREADONLY) {
long now = Time.currentElapsedTime();
int idlePingRwServer = (int) (now - lastPingRwServer);
if (idlePingRwServer >= pingRwTimeout) {
lastPingRwServer = now;
idlePingRwServer = 0;
pingRwTimeout = Math.min(2 * pingRwTimeout, maxPingRwTimeout);
pingRwServer();
}
to = Math.min(to, pingRwTimeout - idlePingRwServer);
}
// Send requests and process responses
clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this);
} catch (Throwable e) {
if (closing) {
if (LOG.isDebugEnabled()) {
// closing so this is expected
LOG.debug("An exception was thrown while closing send thread for session 0x"
+ Long.toHexString(getSessionId()) + " : " + e.getMessage());
}
break;
} else {
// this is ugly, you have a better way speak up
if (e instanceof SessionExpiredException) {
LOG.info(e.getMessage() + ", closing socket connection");
} else if (e instanceof SessionTimeoutException) {
LOG.info(e.getMessage() + RETRY_CONN_MSG);
} else if (e instanceof EndOfStreamException) {
LOG.info(e.getMessage() + RETRY_CONN_MSG);
} else if (e instanceof RWServerFoundException) {
LOG.info(e.getMessage());
} else if (e instanceof SocketException) {
LOG.info("Socket error occurred: {}: {}", serverAddress, e.getMessage());
} else {
LOG.warn("Session 0x{} for server {}, unexpected error{}",
Long.toHexString(getSessionId()), serverAddress, RETRY_CONN_MSG, e);
}
// At this point, there might still be new packets appended to outgoingQueue.
// they will be handled in next connection or cleared up if closed.
cleanAndNotifyState();
}
}
}
// End of the while loop
synchronized (state) {
// When it comes to this point, it guarantees that later queued
// packet to outgoingQueue will be notified of death.
cleanup();
}
clientCnxnSocket.close();
if (state.isAlive()) {
eventThread.queueEvent(
new WatchedEvent(Event.EventType.None, Event.KeeperState.Disconnected, null));
}
eventThread.queueEvent(new WatchedEvent(Event.EventType.None, Event.KeeperState.Closed, null));
ZooTrace.logTraceMessage(LOG, ZooTrace.getTextTraceLevel(),
"SendThread exited loop for session: 0x" + Long.toHexString(getSessionId()));
}
/**
* Once the client has received a complete response from the server, it is processed differently depending on the type of client request
*
* @param incomingBuffer
* @throws IOException
*/
void readResponse(ByteBuffer incomingBuffer) throws IOException {
ByteBufferInputStream bbis = new ByteBufferInputStream(incomingBuffer);
BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
ReplyHeader replyHdr = new ReplyHeader();
replyHdr.deserialize(bbia, "header");
if (replyHdr.getXid() == -2) {
// Heartbeat reply
// -2 is the xid for pings
if (LOG.isDebugEnabled()) {
LOG.debug("Got ping response for sessionid: 0x" + Long.toHexString(sessionId) + " after "
+ ((System.nanoTime() - lastPingSentNs) / 1000000) + "ms");
}
return;
}
if (replyHdr.getXid() == -4) {
// Authentication reply
// -4 is the xid for AuthPacket
if (replyHdr.getErr() == KeeperException.Code.AUTHFAILED.intValue()) {
state = States.AUTH_FAILED;
eventThread.queueEvent(new WatchedEvent(Watcher.Event.EventType.None,
Watcher.Event.KeeperState.AuthFailed, null));
eventThread.queueEventOfDeath();
}
if (LOG.isDebugEnabled()) {
LOG.debug("Got auth sessionid:0x" + Long.toHexString(sessionId));
}
return;
}
if (replyHdr.getXid() == -1) {
// Event notification
// -1 means notification
if (LOG.isDebugEnabled()) {
LOG.debug("Got notification sessionid:0x" + Long.toHexString(sessionId));
}
// Deserialize into a WatcherEvent and put it on the queue of events waiting to be processed
WatcherEvent event = new WatcherEvent();
event.deserialize(bbia, "response");
// convert from a server path to a client path
if (chrootPath != null) {
String serverPath = event.getPath();
if (serverPath.compareTo(chrootPath) == 0)
event.setPath("/");
else if (serverPath.length() > chrootPath.length())
event.setPath(serverPath.substring(chrootPath.length()));
else {
LOG.warn("Got server path " + event.getPath() + " which is too short for chroot path "
+ chrootPath);
}
}
WatchedEvent we = new WatchedEvent(event);
if (LOG.isDebugEnabled()) {
LOG.debug("Got " + we + " for sessionid 0x" + Long.toHexString(sessionId));
}
eventThread.queueEvent(we);
return;
}
// If SASL authentication is currently in progress, construct and
// send a response packet immediately, rather than queuing a
// response as with other packets.
if (tunnelAuthInProgress()) {
GetSASLRequest request = new GetSASLRequest();
request.deserialize(bbia, "token");
zooKeeperSaslClient.respondToServer(request.getToken(), ClientCnxn.this);
return;
}
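// For a regular request response (to operations such as create, getData, and exists), a Packet is taken from pendingQueue
// and processed accordingly.
// The ZooKeeper client first checks the XID carried in the server response to make sure requests are handled in order, then
// deserializes the received ByteBuffer (incomingBuffer) into the corresponding Response object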
Packet packet;
synchronized (pendingQueue) {
if (pendingQueue.size() == 0) {
throw new IOException("Nothing in the queue, but got " + replyHdr.getXid());
}
packet = pendingQueue.remove();
}
/*
* Since requests are processed in order, we better get a response to the first
* request!
*/
try {
if (packet.requestHeader.getXid() != replyHdr.getXid()) {
packet.replyHeader.setErr(KeeperException.Code.CONNECTIONLOSS.intValue());
throw new IOException("Xid out of order. Got Xid " + replyHdr.getXid() + " with err "
+ +replyHdr.getErr() + " expected Xid " + packet.requestHeader.getXid()
+ " for a packet with details: " + packet);
}
packet.replyHeader.setXid(replyHdr.getXid());
packet.replyHeader.setErr(replyHdr.getErr());
packet.replyHeader.setZxid(replyHdr.getZxid());
if (replyHdr.getZxid() > 0) {
lastZxid = replyHdr.getZxid();
}
if (packet.response != null && replyHdr.getErr() == 0) {
packet.response.deserialize(bbia, "response");
}
if (LOG.isDebugEnabled()) {
LOG.debug("Reading reply sessionid:0x" + Long.toHexString(sessionId) + ", packet:: "
+ packet);
}
} finally {
// Handle watcher registration and related logic
finishPacket(packet);
}
}
SendThread(ClientCnxnSocket clientCnxnSocket) {
super(makeThreadName("-SendThread()"));
state = States.CONNECTING;
this.clientCnxnSocket = clientCnxnSocket;
setDaemon(true); // daemon thread
}
}
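readResponse dispatches on a few reserved xid values before it falls through to the regular request/response path. The sketch below isolates that dispatch together with the chroot path conversion applied to notifications. `ReplySketch` and its method names are hypothetical; the reserved values -2, -4, and -1 and the path rules come from the code above. The SASL-in-progress branch is left out.

```java
/** Hypothetical sketch of the xid dispatch in readResponse; not ZooKeeper code. */
final class ReplySketch {
    enum Kind { PING, AUTH, NOTIFICATION, RESPONSE }

    /** Reserved xids mark special replies; anything else answers a queued request. */
    static Kind classify(int xid) {
        switch (xid) {
            case -2: return Kind.PING;
            case -4: return Kind.AUTH;
            case -1: return Kind.NOTIFICATION;
            default: return Kind.RESPONSE;
        }
    }

    /** Converts a server-side path back to the path the chrooted client expects. */
    static String toClientPath(String serverPath, String chrootPath) {
        if (chrootPath == null) {
            return serverPath;
        }
        if (serverPath.equals(chrootPath)) {
            return "/";
        }
        if (serverPath.length() > chrootPath.length()) {
            return serverPath.substring(chrootPath.length());
        }
        // Too short for the chroot: left unchanged (the code above also logs a warning here).
        return serverPath;
    }
}
```

For a client rooted at `/app/a`, a notification for `/app/a/foo` is delivered as `/foo`, and a notification for `/app/a` itself is delivered as `/`.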
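/**
* Core thread that handles client-side events and triggers the Watchers registered by the client.
* EventThread has a waitingEvents queue that temporarily holds the objects to be triggered,
* including Watchers registered by the client and AsyncCallback objects registered through the asynchronous API.
* EventThread keeps taking objects from waitingEvents, identifies their concrete type, and
* calls process or processResult respectively to fire the event or the callback.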
*
*/
class EventThread extends ZooKeeperThread {
private final LinkedBlockingQueue<Object> waitingEvents = new LinkedBlockingQueue<Object>();
/**
* This is really the queued session state until the event thread actually
* processes the event and hands it to the watcher. But for all intents and
* purposes this is the state.
*/
private volatile KeeperState sessionState = KeeperState.Disconnected;
private volatile boolean wasKilled = false;
private volatile boolean isRunning = false;
EventThread() {
super(makeThreadName("-EventThread"));
setDaemon(true);
}
@Override
@SuppressFBWarnings("JLM_JSR166_UTILCONCURRENT_MONITORENTER")
public void run() {
try {
isRunning = true;
while (true) {
// eventThread keeps taking pending Watcher objects from the waitingEvents queue
Object event = waitingEvents.take();
if (event == eventOfDeath) {
wasKilled = true;
} else {
// Process the Watcher object, which is what actually triggers the watcher
processEvent(event);
}
if (wasKilled)
synchronized (waitingEvents) {
if (waitingEvents.isEmpty()) {
isRunning = false;
break;
}
}
}
} catch (InterruptedException e) {
LOG.error("Event thread exiting due to interruption", e);
}
LOG.info("EventThread shut down for session: 0x{}", Long.toHexString(getSessionId()));
}
}
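The shutdown logic of EventThread is a variant of the "poison pill" idiom: a sentinel object on the queue tells the consumer to stop, but only after everything still queued has been drained. The sketch below is a hypothetical, self-contained model of that loop. The class, its `runWith` helper, and the use of plain strings as events are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of an event loop with a "death" sentinel; not the real EventThread. */
final class EventLoopSketch extends Thread {
    private static final Object EVENT_OF_DEATH = new Object();
    private final LinkedBlockingQueue<Object> waitingEvents = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();

    EventLoopSketch() {
        setDaemon(true);
    }

    void queueEvent(String event) {
        waitingEvents.add(event);
    }

    void queueEventOfDeath() {
        waitingEvents.add(EVENT_OF_DEATH);
    }

    @Override
    public void run() {
        boolean wasKilled = false;
        try {
            while (true) {
                Object event = waitingEvents.take(); // blocks until an event is available
                if (event == EVENT_OF_DEATH) {
                    wasKilled = true;
                } else {
                    processed.add((String) event);
                }
                // Exit only after the sentinel was seen and nothing is left to drain.
                if (wasKilled && waitingEvents.isEmpty()) {
                    break;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs the loop to completion and returns what was processed, in order. */
    static List<String> runWith(String... events) {
        EventLoopSketch loop = new EventLoopSketch();
        for (String e : events) {
            loop.queueEvent(e);
        }
        loop.queueEventOfDeath();
        loop.start();
        try {
            loop.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return loop.processed;
    }
}
```

Calling `EventLoopSketch.runWith("a", "b")` processes both events in order before the thread exits. Using a sentinel instead of interrupting the thread guarantees that events queued before shutdown are still delivered.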
Running the launcher's run method
void run() throws CliException, IOException, InterruptedException {
if (cl.getCommand() == null) {
System.out.println("Welcome to ZooKeeper!");
boolean jlinemissing = false;
// only use jline if it's in the classpath
try {
Class<?> consoleC = Class.forName("jline.console.ConsoleReader");
Class<?> completorC =
Class.forName("org.apache.zookeeper.JLineZNodeCompleter");
System.out.println("JLine support is enabled");
Object console =
consoleC.getConstructor().newInstance();
Object completor =
completorC.getConstructor(ZooKeeper.class).newInstance(zk);
Method addCompletor = consoleC.getMethod("addCompleter",
Class.forName("jline.console.completer.Completer"));
addCompletor.invoke(console, completor);
String line;
Method readLine = consoleC.getMethod("readLine", String.class);
while ((line = (String)readLine.invoke(console, getPrompt())) != null) {
executeLine(line);
}
} catch (ClassNotFoundException e) {
LOG.debug("Unable to start jline", e);
jlinemissing = true;
} catch (NoSuchMethodException e) {
LOG.debug("Unable to start jline", e);
jlinemissing = true;
} catch (InvocationTargetException e) {
LOG.debug("Unable to start jline", e);
jlinemissing = true;
} catch (IllegalAccessException e) {
LOG.debug("Unable to start jline", e);
jlinemissing = true;
} catch (InstantiationException e) {
LOG.debug("Unable to start jline", e);
jlinemissing = true;
}
if (jlinemissing) {
System.out.println("JLine support is disabled");
BufferedReader br =
new BufferedReader(new InputStreamReader(System.in));
String line;
while ((line = br.readLine()) != null) {
executeLine(line);
}
}
} else {
// Command line args non-null. Run what was passed.
// Main logic: read the client's command and execute it
processCmd(cl);
}
System.exit(exitCode);
}
protected boolean processZKCmd(MyCommandOptions co) throws CliException, IOException, InterruptedException {
String[] args = co.getArgArray();
String cmd = co.getCommand();
if (args.length < 1) {
usage();
throw new MalformedCommandException("No command entered");
}
if (!commandMap.containsKey(cmd)) {
usage();
throw new CommandNotFoundException("Command not found " + cmd);
}
boolean watch = false;
LOG.debug("Processing " + cmd);
if (cmd.equals("quit")) {
zk.close();
System.exit(exitCode);
} else if (cmd.equals("redo") && args.length >= 2) {
Integer i = Integer.decode(args[1]);
if (commandCount <= i || i < 0) { // don't allow redoing this redo
throw new MalformedCommandException("Command index out of range");
}
cl.parseCommand(history.get(i));
if (cl.getCommand().equals("redo")) {
throw new MalformedCommandException("No redoing redos");
}
history.put(commandCount, history.get(i));
processCmd(cl);
} else if (cmd.equals("history")) {
for (int i = commandCount - 10; i <= commandCount; ++i) {
if (i < 0) continue;
System.out.println(i + " - " + history.get(i));
}
} else if (cmd.equals("printwatches")) {
if (args.length == 1) {
System.out.println("printwatches is " + (printWatches ? "on" : "off"));
} else {
printWatches = args[1].equals("on");
}
} else if (cmd.equals("connect")) {
if (args.length >= 2) {
connectToZK(args[1]);
} else {
connectToZK(host);
}
}
// Below commands all need a live connection
if (zk == null || !zk.getState().isAlive()) {
System.out.println("Not connected");
return false;
}
// execute from commandMap
CliCommand cliCmd = commandMapCli.get(cmd);
if(cliCmd != null) {
cliCmd.setZk(zk);
// Execute all other commands through their CliCommand object
watch = cliCmd.parse(args).exec();
} else if (!commandMap.containsKey(cmd)) {
usage();
}
return watch;
}
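A CliCommand object is looked up by command name; for the set command, for example, the lookup returns a SetCommand object.
The exec method of that CliCommand is then invoked (a textbook use of the Command design pattern).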
public class SetCommand extends CliCommand {
private static Options options = new Options();
private String[] args;
private CommandLine cl;
static {
options.addOption("s", false, "stats");
options.addOption("v", true, "version");
}
public SetCommand() {
super("set", "[-s] [-v version] path data");
}
@Override
public CliCommand parse(String[] cmdArgs) throws CliParseException {
Parser parser = new PosixParser();
try {
cl = parser.parse(options, cmdArgs);
} catch (ParseException ex) {
throw new CliParseException(ex);
}
args = cl.getArgs();
if (args.length < 3) {
throw new CliParseException(getUsageStr());
}
return this;
}
@Override
public boolean exec() throws CliException {
String path = args[1];
byte[] data = args[2].getBytes();
int version;
if (cl.hasOption("v")) {
version = Integer.parseInt(cl.getOptionValue("v"));
} else {
version = -1;
}
try {
Stat stat = zk.setData(path, data, version);
if (cl.hasOption("s")) {
new StatPrinter(out).print(stat);
}
} catch (IllegalArgumentException ex) {
throw new MalformedPathException(ex.getMessage());
} catch (KeeperException|InterruptedException ex) {
throw new CliWrapperException(ex);
}
return false;
}
}
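The lookup-then-exec flow of processZKCmd and SetCommand above can be reduced to a few lines. The following is a simplified sketch, not the ZooKeeper implementation: the Command interface, SetSketch, GetSketch and dispatch are hypothetical stand-ins, and the commands only build a string instead of talking to a server.

```java
import java.util.HashMap;
import java.util.Map;

public class CommandDispatchSketch {

    /** Hypothetical stand-in for CliCommand: parse the arguments, then execute. */
    interface Command {
        Command parse(String[] args);
        String exec();
    }

    /** Hypothetical "set"-style command; it only formats its arguments. */
    static class SetSketch implements Command {
        private String[] args;

        public Command parse(String[] cmdArgs) {
            if (cmdArgs.length < 3) {
                throw new IllegalArgumentException("usage: set path data");
            }
            args = cmdArgs;
            return this; // returning this allows parse(args).exec() chaining
        }

        public String exec() {
            return "set " + args[1] + " = " + args[2];
        }
    }

    /** Hypothetical "get"-style command. */
    static class GetSketch implements Command {
        private String[] args;

        public Command parse(String[] cmdArgs) {
            if (cmdArgs.length < 2) {
                throw new IllegalArgumentException("usage: get path");
            }
            args = cmdArgs;
            return this;
        }

        public String exec() {
            return "get " + args[1];
        }
    }

    /** Command name to command object, playing the role of commandMapCli. */
    static final Map<String, Command> COMMANDS = new HashMap<>();

    static {
        COMMANDS.put("set", new SetSketch());
        COMMANDS.put("get", new GetSketch());
    }

    /** Splits the line, looks up the command by name, then parses and executes it. */
    static String dispatch(String line) {
        String[] args = line.trim().split("\\s+");
        Command cmd = COMMANDS.get(args[0]);
        if (cmd == null) {
            return "Command not found " + args[0];
        }
        return cmd.parse(args).exec();
    }

    public static void main(String[] unused) {
        System.out.println(dispatch("set /app/config v1"));
        System.out.println(dispatch("get /app/config"));
    }
}
```

Adding a new command means registering one more object in the map; the dispatch code itself does not change, which is the main benefit of the pattern.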
/**
* Set the data for the node of the given path if such a node exists and the
* given version matches the version of the node (if the given version is -1, it
* matches any node's versions). Return the stat of the node.
* <p>
* This operation, if successful, will trigger all the watches on the node of
* the given path left by getData calls.
* <p>
* A KeeperException with error code KeeperException.NoNode will be thrown if no
* node with the given path exists.
* <p>
* A KeeperException with error code KeeperException.BadVersion will be thrown
* if the given version does not match the node's version.
* <p>
* The maximum allowable size of the data array is 1 MB (1,048,576 bytes).
* Arrays larger than this will cause a KeeperException to be thrown.
*
* @param path the path of the node
* @param data the data to set
* @param version the expected matching version
* @return the state of the node
* @throws InterruptedException If the server transaction is interrupted.
* @throws KeeperException If the server signals an error with a
* non-zero error code.
* @throws IllegalArgumentException if an invalid path is specified
*/
public Stat setData(final String path, byte data[], int version)
throws KeeperException, InterruptedException {
final String clientPath = path;
PathUtils.validatePath(clientPath);
final String serverPath = prependChroot(clientPath);
// Build the request header
RequestHeader h = new RequestHeader();
h.setType(ZooDefs.OpCode.setData);
SetDataRequest request = new SetDataRequest();
request.setPath(serverPath);
request.setData(data);
request.setVersion(version);
SetDataResponse response = new SetDataResponse();
// Submit the request through the network connection object (cnxn) and block until the reply arrives
ReplyHeader r = cnxn.submitRequest(h, request, response, null);
if (r.getErr() != 0) {
throw KeeperException.create(KeeperException.Code.get(r.getErr()), clientPath);
}
return response.getStat();
}
public ReplyHeader submitRequest(RequestHeader h, Record request, Record response,
WatchRegistration watchRegistration) throws InterruptedException {
return submitRequest(h, request, response, watchRegistration, null);
}
public ReplyHeader submitRequest(RequestHeader h, Record request, Record response,
WatchRegistration watchRegistration, WatchDeregistration watchDeregistration)
throws InterruptedException {
ReplyHeader r = new ReplyHeader();
Packet packet = queuePacket(h, r, request, response, null, null, null, null, watchRegistration,
watchDeregistration);
synchronized (packet) {
if (requestTimeout > 0) {
// Wait for request completion with timeout
waitForPacketFinish(r, packet);
} else {
// Wait for request completion infinitely
while (!packet.finished) {
packet.wait();
}
}
}
if (r.getErr() == Code.REQUESTTIMEOUT.intValue()) {
sendThread.cleanAndNotifyState();
}
return r;
}
public Packet queuePacket(RequestHeader h, ReplyHeader r, Record request, Record response,
AsyncCallback cb, String clientPath, String serverPath, Object ctx,
WatchRegistration watchRegistration, WatchDeregistration watchDeregistration) {
Packet packet = null;
// Note that we do not generate the Xid for the packet yet. It is
// generated later at send-time, by an implementation of
// ClientCnxnSocket::doIO(),
// where the packet is actually sent.
packet = new Packet(h, r, request, response, watchRegistration);
packet.cb = cb;
packet.ctx = ctx;
packet.clientPath = clientPath;
packet.serverPath = serverPath;
packet.watchDeregistration = watchDeregistration;
// The synchronized block here is for two purpose:
// 1. synchronize with the final cleanup() in SendThread.run() to avoid race
// 2. synchronized against each packet. So if a closeSession packet is added,
// later packet will be notified.
synchronized (state) {
if (!state.isAlive() || closing) {
conLossPacket(packet);
} else {
// If the client is asking to close the session then
// mark as closing
if (h.getType() == OpCode.closeSession) {
closing = true;
}
outgoingQueue.add(packet);
}
}
sendThread.getClientCnxnSocket().packetAdded();
return packet;
}
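The hand-off between submitRequest and the send thread shown above can be sketched as a producer/consumer pair: the caller enqueues a packet and waits on it, and a background thread takes packets in FIFO order and marks them finished. This is a simplified sketch under stated assumptions. PacketQueueSketch and its Packet class are hypothetical, and the "server reply" is faked inside the send thread, whereas the real client writes to a socket and completes the packet when the response is read.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class PacketQueueSketch {

    /** Hypothetical, stripped-down packet: a request, its reply and a finished flag. */
    static class Packet {
        final String request;
        String response;
        boolean finished;

        Packet(String request) {
            this.request = request;
        }
    }

    private final LinkedBlockingQueue<Packet> outgoingQueue = new LinkedBlockingQueue<>();
    private final Thread sendThread;

    PacketQueueSketch() {
        sendThread = new Thread(() -> {
            try {
                while (true) {
                    Packet p = outgoingQueue.take(); // FIFO: oldest packet first
                    synchronized (p) {
                        p.response = "ok:" + p.request; // assumption: faked server reply
                        p.finished = true;
                        p.notifyAll(); // wake the caller blocked in submitRequest
                    }
                }
            } catch (InterruptedException e) {
                // Interrupted by close(): leave the loop and let the thread end.
            }
        }, "send-thread-sketch");
        sendThread.setDaemon(true);
        sendThread.start();
    }

    /** Enqueues the request and blocks until the send thread marks it finished. */
    String submitRequest(String request) {
        Packet packet = new Packet(request);
        outgoingQueue.add(packet);
        synchronized (packet) {
            try {
                while (!packet.finished) { // loop guards against spurious wakeups
                    packet.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return packet.response;
    }

    void close() {
        sendThread.interrupt();
    }

    public static void main(String[] args) {
        PacketQueueSketch client = new PacketQueueSketch();
        System.out.println(client.submitRequest("setData /a"));
        client.close();
    }
}
```

Checking the finished flag inside the synchronized block means the caller does not miss a notification even if the send thread completes the packet before the caller starts waiting.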
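Summary
- The constructor (through connectToZK) starts two threads, sendThread and eventThread: the former sends requests and receives responses, the latter handles asynchronous events.
- Each command read from the console is wrapped in a Packet and placed on outgoingQueue; sendThread takes packets off the queue and sends them.
- Queues are used heavily throughout the whole process, precisely to guarantee ordering (FIFO).
See the diagram below: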