Preface
This walkthrough is based on the official ZooKeeper documentation.
A Simple Watch Client
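Requirements
- Takes the following input parameters:
  - the address of the ZooKeeper service
  - the name of a znode, the node to watch
  - the name of an output file
  - an executable program (with its arguments)
- Starts the program once the znode's data has been fetched
- If the znode changes, the client refetches the content and restarts the program
- If the znode disappears, the client shuts the program down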
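Program Design
Conventionally, a ZooKeeper application is split into two parts: one maintains the connection and the other monitors data. In this application, the Executor class maintains the ZooKeeper connection and the DataMonitor class monitors the data in the znode. Executor also contains the main thread and the execution logic. It handles what little user interaction there is, as well as the interaction with the executable program passed in as an argument, which the sample (per the requirements) shuts down and restarts according to the state of the znode.
The Executor Class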
// from the Executor class...
public static void main(String[] args) {
    if (args.length < 4) {
        System.err.println("USAGE: Executor hostPort znode filename program [args ...]");
        System.exit(2);
    }
    String hostPort = args[0];
    String znode = args[1];
    String filename = args[2];
    String exec[] = new String[args.length - 3];
    System.arraycopy(args, 3, exec, 0, exec.length);
    try {
        new Executor(hostPort, znode, filename, exec).run();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

public Executor(String hostPort, String znode, String filename,
        String exec[]) throws KeeperException, IOException {
    this.filename = filename;
    this.exec = exec;
    // Executor itself implements Watcher, so it receives ZooKeeper connection state events
    zk = new ZooKeeper(hostPort, 3000, this);
    // Pass a reference to this object to DataMonitor so it can call back
    dm = new DataMonitor(zk, znode, null, this);
}

public void run() {
    try {
        synchronized (this) {
            // As long as dm is not dead, wait to be woken up
            while (!dm.dead) {
                wait();
            }
        }
    } catch (InterruptedException e) {
    }
}
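The run() method above parks the main thread on the Executor's monitor until DataMonitor reports that it is dead, and the closing() callback later wakes it with notifyAll(). The following is a minimal, stdlib-only sketch of that wait/notify handshake. The class ShutdownLatchSketch and its methods are hypothetical names used for illustration, and unlike the original (where the flag lives in DataMonitor) the flag here is guarded by the same lock as the wait.

```java
public class ShutdownLatchSketch {
    // Hypothetical stand-in for DataMonitor.dead; guarded by this object's monitor.
    private boolean dead = false;

    // Mirrors Executor.run(): block until the flag is set.
    public synchronized void awaitDead() throws InterruptedException {
        while (!dead) {   // the loop protects against spurious wakeups
            wait();
        }
    }

    // Mirrors the closing path: set the flag, then wake every waiting thread.
    public synchronized void markDead() {
        dead = true;
        notifyAll();
    }

    public synchronized boolean isDead() {
        return dead;
    }

    public static void main(String[] args) throws InterruptedException {
        ShutdownLatchSketch latch = new ShutdownLatchSketch();
        // Simulates the ZooKeeper event thread delivering a session-expired event.
        Thread closer = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            latch.markDead();
        });
        closer.start();
        latch.awaitDead();   // blocks here until markDead() runs
        closer.join();
        System.out.println("main thread released, dead=" + latch.isDead());
    }
}
```

The condition is re-checked in a while loop rather than an if, which is the same choice the original run() makes: a thread returning from wait() cannot assume the flag has actually changed.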
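The Executor class implements three interfaces:
public class Executor implements Watcher, Runnable, DataMonitor.DataMonitorListener {
...
The Watcher interface is defined by the ZooKeeper Java API. It has a single method, process(), which ZooKeeper uses to deliver events the main thread is interested in, such as changes in connection or session state. The implementation in this example simply forwards each event to DataMonitor's process() method and leaves the lower layer to decide what to do with it.
public void process(WatchedEvent event) {
    dm.process(event);
}
The DataMonitorListener interface, on the other hand, is not part of the ZooKeeper Java API. It is a custom interface defined for this example:
public interface DataMonitorListener {
    /**
     * The existence status of the node has changed.
     */
    void exists(byte data[]);

    /**
     * The ZooKeeper session is no longer valid.
     * @param rc
     *            the ZooKeeper reason code
     */
    void closing(int rc);
}
This interface is defined in the DataMonitor class and implemented in the Executor class. When Executor.exists() is invoked, the Executor decides whether to start or shut down the executable, depending on whether the znode currently exists.
When Executor.closing() is invoked, the Executor ends its own run, because the ZooKeeper connection has been permanently lost.
Below are the Executor implementations of DataMonitorListener.exists() and DataMonitorListener.closing():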
public void exists(byte[] data) {
    if (data == null) { // no data: the znode does not exist
        if (child != null) { // a child process is running, so kill it
            System.out.println("Killing process");
            child.destroy();
            try {
                child.waitFor();
            } catch (InterruptedException e) {
            }
        }
        child = null;
    } else { // data exists: stop the child if one is running, then start a new one
        if (child != null) {
            System.out.println("Stopping child");
            child.destroy();
            try {
                child.waitFor();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        try {
            FileOutputStream fos = new FileOutputStream(filename);
            fos.write(data);
            fos.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            System.out.println("Starting child");
            child = Runtime.getRuntime().exec(exec);
            new StreamWriter(child.getInputStream(), System.out);
            new StreamWriter(child.getErrorStream(), System.err);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

public void closing(int rc) {
    // Shutdown was triggered: wake the main thread so it re-checks dm.dead
    // (Executor.run() is blocked in wait())
    synchronized (this) {
        notifyAll();
    }
}
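One detail in exists() above is worth a second look: the FileOutputStream is closed by an explicit fos.close() call, so if write() throws, the stream is never closed. A possible variation, sketched here under the assumption that Java 7 or later is available, is to use try-with-resources. The class ZnodeFileWriterSketch and the method writeData are hypothetical names and are not part of the original example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ZnodeFileWriterSketch {
    // Writes the znode data to the given file, replacing any previous content.
    // The stream is closed automatically, even if write() throws.
    public static void writeData(String filename, byte[] data) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(filename)) {
            fos.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the "filename" command-line argument.
        Path tmp = Files.createTempFile("znode-data", ".bin");
        try {
            writeData(tmp.toString(), "hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(Files.readAllBytes(tmp).length + " bytes written");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In the Executor, a call such as writeData(filename, data) could replace the three stream lines inside the existing try block, leaving the surrounding IOException handling unchanged.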
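The DataMonitor Class
The DataMonitor class holds the core of the ZooKeeper logic. It is mostly asynchronous and event driven. DataMonitor starts the whole process from its constructor:
public DataMonitor(ZooKeeper zk, String znode, Watcher chainedWatcher,
        DataMonitorListener listener) {
    this.zk = zk;
    this.znode = znode;
    this.chainedWatcher = chainedWatcher;
    this.listener = listener;
    // Trigger the first check at construction time; everything after that is event driven
    zk.exists(znode, true, this, null);
}
The call to ZooKeeper.exists() checks whether the znode exists, sets a watch, and passes a reference to the DataMonitor itself (this; DataMonitor implements the StatCallback interface) as the completion callback. This keeps the execution logic separate from the watch logic.
Note:
Do not confuse the watch callback with the completion callback (the callback of ZooKeeper.exists()). The completion callback is the StatCallback.processResult() method implemented in DataMonitor, and it is invoked asynchronously once the server has answered the exists() request.
The watch callback in DataMonitor, by contrast, is only ever invoked through Executor. Executor registered itself as the Watcher when it instantiated ZooKeeper, so watch events are delivered to Executor.process(), which forwards them to DataMonitor.process().
The callback parameter of ZooKeeper.exists() has the type org.apache.zookeeper.AsyncCallback.StatCallback, and processResult() is the method that interface declares. DataMonitor implements both the Watcher and the StatCallback interface. Its implementation of StatCallback.processResult() is as follows: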
public void processResult(int rc, String path, Object ctx, Stat stat) {
    boolean exists;
    // Ok means the node exists, NoNode means it does not.
    // Session expiry and missing authorization both trigger the closing callback.
    // Any other code is retried.
    switch (rc) {
    case Code.Ok:
        exists = true;
        break;
    case Code.NoNode:
        exists = false;
        break;
    case Code.SessionExpired:
    case Code.NoAuth:
        dead = true;
        listener.closing(rc);
        return;
    default:
        // Retry errors
        zk.exists(znode, true, this, null);
        return;
    }

    // If the node exists, fetch its data
    byte[] b = null;
    if (exists) {
        try {
            b = zk.getData(znode, false, null);
        } catch (KeeperException e) {
            // We don't need to worry about recovering now. The watch
            // callbacks will kick off any exception handling
            e.printStackTrace();
        } catch (InterruptedException e) {
            return;
        }
    }
    // Invoke the callback if the data is now null but was not before,
    // or if the data is non-null and differs from the previous data
    if ((b == null && b != prevData)
            || (b != null && !Arrays.equals(prevData, b))) {
        listener.exists(b);
        prevData = b;
    }
}
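The code first checks the result code to distinguish znode existence, fatal errors, and recoverable errors. If the file (or znode) exists, it gets the data from the znode and, if the state has changed, invokes the exists() callback of the Executor. Note that the getData call does nothing with the exception it catches, because watches are already pending for anything that could cause an error: if the znode is deleted before ZooKeeper.getData() is called, the watch set by ZooKeeper.exists() fires another callback; if there is a connection error, an event fires on the connection watcher when the connection comes back up.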
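The change-detection condition at the end of processResult() is compact enough that its behavior is easy to misread. The sketch below isolates it in a stdlib-only helper so each case can be checked on its own. The class ChangeDetectionSketch and the method changed are hypothetical names; the predicate itself is copied from the code above.

```java
import java.util.Arrays;

public class ChangeDetectionSketch {
    // Same predicate as in processResult(): true when the listener should be notified.
    public static boolean changed(byte[] prevData, byte[] b) {
        return (b == null && b != prevData)
                || (b != null && !Arrays.equals(prevData, b));
    }

    public static void main(String[] args) {
        byte[] v1 = {1, 2};
        byte[] v1Copy = {1, 2};
        byte[] v2 = {3};
        System.out.println(changed(null, null));   // false: still no node
        System.out.println(changed(null, v1));     // true:  node appeared
        System.out.println(changed(v1, v1Copy));   // false: same content
        System.out.println(changed(v1, v2));       // true:  content changed
        System.out.println(changed(v1, null));     // true:  node disappeared
    }
}
```

Because Arrays.equals() treats two null arrays as equal and a null array as different from any non-null one, the whole condition gives the same result as !Arrays.equals(prevData, b) for every combination of inputs.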
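Finally, here is how DataMonitor processes the WatchedEvent forwarded by Executor:
public void process(WatchedEvent event) {
    String path = event.getPath();
    if (event.getType() == Event.EventType.None) {
        // We are being told that the state of the connection has changed.
        // Watch for session events: if the session is gone, the application must shut down
        switch (event.getState()) {
        case SyncConnected:
            // In this example there is nothing to do here: watches are
            // re-registered automatically, and any watches triggered while
            // the client was disconnected are delivered to the client
            break;
        case Expired:
            // The session has expired, so it's all over
            dead = true;
            listener.closing(KeeperException.Code.SessionExpired);
            break;
        }
    } else {
        if (path != null && path.equals(znode)) {
            // Something has changed on the node,
            // so check again whether it exists
            zk.exists(znode, true, this, null);
        }
    }
    // Pass the event on to the next watcher in the chain
    if (chainedWatcher != null) {
        chainedWatcher.process(event);
    }
}
If the client library manages to re-establish a broken connection before the session expires (the Expired event), all of the session's watches are re-established automatically (automatic reset of watches was introduced in ZooKeeper 3.0.0). See the ZooKeeper watches documentation for details. In the else branch of the code above, an event for the znode makes DataMonitor call ZooKeeper.exists() again to re-check whether the node exists.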
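Personal Summary
- Any ZooKeeper call, such as zk.getData, may throw an exception, and this has to be taken into account. That does not mean every exception needs handling: in the code above, the KeeperException caught around zk.getData is only printed. (As a rule, a method that expects the caller to deal with an exception declares it in its signature, and zk.getData does.)
- Different kinds of callbacks must be kept apart: instantiating ZooKeeper takes a Watcher, whereas the asynchronous zk.exists takes a StatCallback.
- A Watcher implementation must check the event type carefully, because it receives session and connection events as well as znode events.
- After a disconnect that is followed by a reconnect before the session expires, watches are re-registered automatically. In the general case, however, a watch has to be registered again, as DataMonitor.process() does by calling zk.exists() after every znode event.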
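The last point can be made concrete with a toy model. The sketch below is not the ZooKeeper API: it is a simplified, in-memory, single-threaded registry, with hypothetical names throughout (OneShotWatchSketch, watch, change, RearmingWatcher), that imitates only one property of standard ZooKeeper watches, namely that a watch fires once and must then be set again. RearmingWatcher plays the role that the repeated zk.exists(znode, true, this, null) call plays in DataMonitor.process().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class OneShotWatchSketch {
    // path -> watchers waiting for the next change on that path (toy model only)
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Register a watcher for the next change on the given path.
    public void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
    }

    // Simulate a change: the registered watchers are removed first, then notified once.
    public void change(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> w : fired) {
                w.accept(path);
            }
        }
    }

    // A watcher that re-registers itself every time it fires.
    public static class RearmingWatcher implements Consumer<String> {
        private final OneShotWatchSketch registry;
        public int count = 0;

        public RearmingWatcher(OneShotWatchSketch registry) {
            this.registry = registry;
        }

        @Override
        public void accept(String path) {
            count++;
            registry.watch(path, this);   // set the watch again for the next change
        }
    }

    public static void main(String[] args) {
        OneShotWatchSketch registry = new OneShotWatchSketch();

        List<String> seen = new ArrayList<>();
        registry.watch("/demo", seen::add);
        registry.change("/demo");
        registry.change("/demo");   // no watch is left, so nothing is delivered
        System.out.println("one-shot deliveries: " + seen.size());

        RearmingWatcher rearming = new RearmingWatcher(registry);
        registry.watch("/demo", rearming);
        registry.change("/demo");
        registry.change("/demo");
        System.out.println("re-armed deliveries: " + rearming.count);
    }
}
```

Running main prints one delivery for the plain watcher and two for the re-arming one, which is the difference between forgetting and remembering to set the watch again.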
Complete Source Listings
Executor.java
/**
 * A simple example program to use DataMonitor to start and
 * stop executables based on a znode. The program watches the
 * specified znode and saves the data that corresponds to the
 * znode in the filesystem. It also starts the specified program
 * with the specified arguments when the znode exists and kills
 * the program if the znode goes away.
 */
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class Executor
    implements Watcher, Runnable, DataMonitor.DataMonitorListener
{
    String znode;
    DataMonitor dm;
    ZooKeeper zk;
    String filename;
    String exec[];
    Process child;

    public Executor(String hostPort, String znode, String filename,
            String exec[]) throws KeeperException, IOException {
        this.filename = filename;
        this.exec = exec;
        zk = new ZooKeeper(hostPort, 3000, this);
        dm = new DataMonitor(zk, znode, null, this);
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        if (args.length < 4) {
            System.err
                    .println("USAGE: Executor hostPort znode filename program [args ...]");
            System.exit(2);
        }
        String hostPort = args[0];
        String znode = args[1];
        String filename = args[2];
        String exec[] = new String[args.length - 3];
        System.arraycopy(args, 3, exec, 0, exec.length);
        try {
            new Executor(hostPort, znode, filename, exec).run();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /***************************************************************************
     * We do not process any events ourselves, we just need to forward them on.
     *
     * @see org.apache.zookeeper.Watcher#process(org.apache.zookeeper.WatchedEvent)
     */
    public void process(WatchedEvent event) {
        dm.process(event);
    }

    public void run() {
        try {
            synchronized (this) {
                while (!dm.dead) {
                    wait();
                }
            }
        } catch (InterruptedException e) {
        }
    }

    public void closing(int rc) {
        synchronized (this) {
            notifyAll();
        }
    }

    static class StreamWriter extends Thread {
        OutputStream os;
        InputStream is;

        StreamWriter(InputStream is, OutputStream os) {
            this.is = is;
            this.os = os;
            start();
        }

        public void run() {
            byte b[] = new byte[80];
            int rc;
            try {
                while ((rc = is.read(b)) > 0) {
                    os.write(b, 0, rc);
                }
            } catch (IOException e) {
            }
        }
    }

    public void exists(byte[] data) {
        if (data == null) {
            if (child != null) {
                System.out.println("Killing process");
                child.destroy();
                try {
                    child.waitFor();
                } catch (InterruptedException e) {
                }
            }
            child = null;
        } else {
            if (child != null) {
                System.out.println("Stopping child");
                child.destroy();
                try {
                    child.waitFor();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
            try {
                FileOutputStream fos = new FileOutputStream(filename);
                fos.write(data);
                fos.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
            try {
                System.out.println("Starting child");
                child = Runtime.getRuntime().exec(exec);
                new StreamWriter(child.getInputStream(), System.out);
                new StreamWriter(child.getErrorStream(), System.err);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
DataMonitor.java
/**
 * A simple class that monitors the data and existence of a ZooKeeper
 * node. It uses asynchronous ZooKeeper APIs.
 */
import java.util.Arrays;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.AsyncCallback.StatCallback;
import org.apache.zookeeper.KeeperException.Code;
import org.apache.zookeeper.data.Stat;

public class DataMonitor implements Watcher, StatCallback {
    ZooKeeper zk;
    String znode;
    Watcher chainedWatcher;
    boolean dead;
    DataMonitorListener listener;
    byte prevData[];

    public DataMonitor(ZooKeeper zk, String znode, Watcher chainedWatcher,
            DataMonitorListener listener) {
        this.zk = zk;
        this.znode = znode;
        this.chainedWatcher = chainedWatcher;
        this.listener = listener;
        // Get things started by checking if the node exists. We are going
        // to be completely event driven
        zk.exists(znode, true, this, null);
    }

    /**
     * Other classes use the DataMonitor by implementing this method
     */
    public interface DataMonitorListener {
        /**
         * The existence status of the node has changed.
         */
        void exists(byte data[]);

        /**
         * The ZooKeeper session is no longer valid.
         *
         * @param rc
         *            the ZooKeeper reason code
         */
        void closing(int rc);
    }

    public void process(WatchedEvent event) {
        String path = event.getPath();
        if (event.getType() == Event.EventType.None) {
            // We are being told that the state of the
            // connection has changed
            switch (event.getState()) {
            case SyncConnected:
                // In this particular example we don't need to do anything
                // here - watches are automatically re-registered with
                // server and any watches triggered while the client was
                // disconnected will be delivered (in order of course)
                break;
            case Expired:
                // It's all over
                dead = true;
                listener.closing(KeeperException.Code.SessionExpired);
                break;
            }
        } else {
            if (path != null && path.equals(znode)) {
                // Something has changed on the node, let's find out
                zk.exists(znode, true, this, null);
            }
        }
        if (chainedWatcher != null) {
            chainedWatcher.process(event);
        }
    }

    public void processResult(int rc, String path, Object ctx, Stat stat) {
        boolean exists;
        switch (rc) {
        case Code.Ok:
            exists = true;
            break;
        case Code.NoNode:
            exists = false;
            break;
        case Code.SessionExpired:
        case Code.NoAuth:
            dead = true;
            listener.closing(rc);
            return;
        default:
            // Retry errors
            zk.exists(znode, true, this, null);
            return;
        }

        byte b[] = null;
        if (exists) {
            try {
                b = zk.getData(znode, false, null);
            } catch (KeeperException e) {
                // We don't need to worry about recovering now. The watch
                // callbacks will kick off any exception handling
                e.printStackTrace();
            } catch (InterruptedException e) {
                return;
            }
        }
        if ((b == null && b != prevData)
                || (b != null && !Arrays.equals(prevData, b))) {
            listener.exists(b);
            prevData = b;
        }
    }
}