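Problems that watchers solve
Before getting into watchers, consider two problems that can arise in a cluster of application servers:
- The cluster has many machines. When a shared configuration item changes, how can the change take effect on every server automatically?
- When a node in the cluster goes down, how do the other nodes find out?
To solve these two problems, ZooKeeper introduced the watcher mechanism, which implements publish/subscribe: multiple subscribers can watch the same subject object at once, and when that object's state changes, all subscribers are notified.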
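How watchers work
ZooKeeper's watcher implementation involves three parts, as shown in the figure below:
the ZooKeeper server, the client, and the client's WatchManager.
As the figure shows, when a client registers a watcher with ZooKeeper, it also stores the watcher object in the client-side WatchManager. When the server triggers a watch event, it sends a notification to the client, and a client thread takes the matching watcher out of the WatchManager and runs it.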
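The flow described above can be sketched with a toy model that uses only the JDK. ToyWatchManager, register and onNotification are hypothetical names invented for this illustration; this is not ZooKeeper's actual ZKWatchManager, only a rough picture of "store the watcher on the client, look it up when a notification arrives".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Toy model of a client-side watch table; not ZooKeeper's real ZKWatchManager. */
public class ToyWatchManager {
    // path -> watchers registered for that path
    private final Map<String, Set<Consumer<String>>> watches = new HashMap<>();

    /** Called when the client registers a watcher for a path. */
    public synchronized void register(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
    }

    /** Called when a notification for a path arrives; returns how many watchers ran. */
    public int onNotification(String path) {
        Set<Consumer<String>> triggered;
        synchronized (this) {
            // Removing the entry models the one-time nature of a watch.
            triggered = watches.remove(path);
        }
        if (triggered == null) {
            return 0;
        }
        for (Consumer<String> watcher : triggered) {
            watcher.accept(path);
        }
        return triggered.size();
    }

    public static void main(String[] args) {
        ToyWatchManager manager = new ToyWatchManager();
        List<String> seen = new ArrayList<>();
        manager.register("/config", seen::add);
        System.out.println(manager.onNotification("/config")); // the watcher runs
        System.out.println(manager.onNotification("/config")); // nothing is registered any more
        System.out.println(seen);
    }
}
```

The watchers are run outside the synchronized block so that a slow callback does not block new registrations; that is a design choice of this sketch, not a statement about the real client.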
How the client acts on event notifications
The client only needs to define a class that implements the org.apache.zookeeper.Watcher interface and its single method:
abstract public void process(WatchedEvent event);
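It can then run whatever action is needed once a notification arrives. The org.apache.zookeeper.WatchedEvent parameter is the event sent by the ZooKeeper server, and has three members: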
final private KeeperState keeperState; // notification state
final private EventType eventType; // event type
private String path; // path of the node where the event occurred
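These represent the notification state, the event type, and the node on which the event occurred.
keeperState is an enum that represents the state of the connection between the client and the ZooKeeper server. It is defined as follows: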
/**
* Enumeration of states the ZooKeeper may be at the event
*/
public enum KeeperState {
/** Unused, this state is never generated by the server */
@Deprecated
Unknown (-1),
/** The client is in the disconnected state - it is not connected
* to any server in the ensemble. */
Disconnected (0),
/** Unused, this state is never generated by the server */
@Deprecated
NoSyncConnected (1),
/** The client is in the connected state - it is connected
* to a server in the ensemble (one of the servers specified
* in the host connection parameter during ZooKeeper client
* creation).
*/
SyncConnected (3),
/**
* Auth failed state
*/
AuthFailed (4),
/**
* The client is connected to a read-only server, that is the
* server which is not currently connected to the majority.
* The only operations allowed after receiving this state is
* read operations.
* This state is generated for read-only clients only since
* read/write clients aren't allowed to connect to r/o servers.
*/
ConnectedReadOnly (5),
/**
* SaslAuthenticated: used to notify clients that they are SASL-authenticated,
* so that they can perform Zookeeper actions with their SASL-authorized permissions.
*/
SaslAuthenticated(6),
/** The serving cluster has expired this session. The ZooKeeper
* client connection (the session) is no longer valid. You must
* create a new client connection (instantiate a new ZooKeeper
* instance) if you with to access the ensemble.
*/
Expired (-112);
private final int intValue; // Integer representation of value
// for sending over wire
KeeperState(int intValue) {
this.intValue = intValue;
}
public int getIntValue() {
return intValue;
}
public static KeeperState fromInt(int intValue) {
switch(intValue) {
case -1: return KeeperState.Unknown;
case 0: return KeeperState.Disconnected;
case 1: return KeeperState.NoSyncConnected;
case 3: return KeeperState.SyncConnected;
case 4: return KeeperState.AuthFailed;
case 5: return KeeperState.ConnectedReadOnly;
case 6: return KeeperState.SaslAuthenticated;
case -112: return KeeperState.Expired;
default:
throw new RuntimeException("Invalid integer value for conversion to KeeperState");
}
}
}
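eventType is also an enum. It represents the type of event that occurred on the node, such as a child node being created or the node's data changing. It is defined as follows: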
/**
* Enumeration of types of events that may occur on the ZooKeeper
*/
public enum EventType {
None (-1),
NodeCreated (1),
NodeDeleted (2),
NodeDataChanged (3),
NodeChildrenChanged (4),
DataWatchRemoved (5),
ChildWatchRemoved (6);
private final int intValue; // Integer representation of value
// for sending over wire
EventType(int intValue) {
this.intValue = intValue;
}
public int getIntValue() {
return intValue;
}
public static EventType fromInt(int intValue) {
switch(intValue) {
case -1: return EventType.None;
case 1: return EventType.NodeCreated;
case 2: return EventType.NodeDeleted;
case 3: return EventType.NodeDataChanged;
case 4: return EventType.NodeChildrenChanged;
case 5: return EventType.DataWatchRemoved;
case 6: return EventType.ChildWatchRemoved;
default:
throw new RuntimeException("Invalid integer value for conversion to EventType");
}
}
}
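The correspondence between keeperState and eventType is as follows:
For the NodeDataChanged event: it is triggered whenever the node's data or its data version changes (the data version changes even when the updated data is identical to the existing data).
For the NodeChildrenChanged event: it is triggered when a child node is added or deleted.
Note that WatchedEvent is only a notification about the event. It carries neither the node's original data nor the new data after the change, so if you need the data before or after the change, the application has to keep a copy of the previous data itself and call the API to fetch the new data.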
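One way to picture the point above is a small cache that remembers the last value it read and re-reads on every change notification. This is only a sketch under stated assumptions: LastKnownValueCache, load and onDataChanged are hypothetical names, and the reader function is a stand-in for a real read such as getData; the fake server is just an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch: keep the last known value locally and re-read on each change notification. */
public class LastKnownValueCache {
    private final Map<String, String> lastKnown = new HashMap<>();
    // Hypothetical stand-in for a read against the server, e.g. getData(path, ...).
    private final Function<String, String> reader;

    public LastKnownValueCache(Function<String, String> reader) {
        this.reader = reader;
    }

    /** Initial read; remembers the value so a later notification can be compared with it. */
    public String load(String path) {
        String value = reader.apply(path);
        lastKnown.put(path, value);
        return value;
    }

    /** Handles a "data changed" notification, which carries only the path. */
    public String onDataChanged(String path) {
        String before = lastKnown.get(path);   // the old value comes from our own copy
        String after = reader.apply(path);     // the new value must be fetched explicitly
        lastKnown.put(path, after);
        return before + " -> " + after;
    }

    public static void main(String[] args) {
        Map<String, String> fakeServer = new HashMap<>();
        fakeServer.put("/GetChildren", "GetChildrenData");
        LastKnownValueCache cache = new LastKnownValueCache(fakeServer::get);
        cache.load("/GetChildren");
        fakeServer.put("/GetChildren", "GetChildrenDataV2");
        System.out.println(cache.onDataChanged("/GetChildren"));
    }
}
```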
How to register a watcher
Watcher registration APIs
A watcher can be registered when the ZooKeeper client instance is created, by passing it to the constructor:
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher,ZKClientConfig conf)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, boolean canBeReadOnly, HostProvider aHostProvider)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, boolean canBeReadOnly, HostProvider aHostProvider,ZKClientConfig clientConfig)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, boolean canBeReadOnly)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, boolean canBeReadOnly, ZKClientConfig conf)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd, boolean canBeReadOnly, HostProvider aHostProvider)
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd, boolean canBeReadOnly)
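The watcher passed to the ZooKeeper constructor becomes the default watcher for the whole session. It is kept as the defaultWatcher member of the client-side ZKWatchManager, and is overridden if another watcher is set elsewhere.
Besides the ZooKeeper constructor, a watcher can also be registered through other APIs of the ZooKeeper class, although watchers registered this way are not the default watcher (each of the methods below has many overloads, which are not all listed here).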
public List<String> getChildren(final String path, Watcher watcher)
// boolean watch: whether to use the default watcher from the context, i.e. the one set when the ZooKeeper instance was created
public List<String> getChildren(String path, boolean watch)
// boolean watch: whether to use the default watcher from the context, i.e. the one set when the ZooKeeper instance was created
public byte[] getData(String path, boolean watch, Stat stat)
public void getData(final String path, Watcher watcher, DataCallback cb, Object ctx)
// boolean watch: whether to use the default watcher from the context, i.e. the one set when the ZooKeeper instance was created
public Stat exists(String path, boolean watch)
public Stat exists(final String path, Watcher watcher)
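Watcher registration example code
This example uses ZooKeeper's own Java client to demonstrate watchers. One thing to note about this client:
Once a watcher has been set, it becomes invalid after it is triggered once. To keep watching, it must be registered again.
Define the default watcher: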
/**
* Tests the default watcher
*/
public class DefaultWatcher implements Watcher {
@Override
public void process(WatchedEvent event) {
System.out.println("==========DefaultWatcher start==============");
System.out.println("DefaultWatcher state: " + event.getState().name());
System.out.println("DefaultWatcher type: " + event.getType().name());
System.out.println("DefaultWatcher path: " + event.getPath());
System.out.println("==========DefaultWatcher end==============");
}
}
Define a watcher that listens for child node changes:
/**
* Watcher that listens for child node changes
*/
public class ChildrenWatcher implements Watcher {
@Override
public void process(WatchedEvent event) {
System.out.println("==========ChildrenWatcher start==============");
System.out.println("ChildrenWatcher state: " + event.getState().name());
System.out.println("ChildrenWatcher type: " + event.getType().name());
System.out.println("ChildrenWatcher path: " + event.getPath());
System.out.println("==========ChildrenWatcher end==============");
}
}
Define a watcher that listens for node data changes:
public class DataWatcher implements Watcher {
@Override
public void process(WatchedEvent event) {
System.out.println("==========DataWatcher start==============");
System.out.println("DataWatcher state: " + event.getState().name());
System.out.println("DataWatcher type: " + event.getType().name());
System.out.println("DataWatcher path: " + event.getPath());
System.out.println("==========DataWatcher end==============");
}
}
Watcher test code:
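import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.ZooKeeper;

public class WatcherTest {
    /**
     * Address of the ZooKeeper server to connect to
     */
    private static final String CONNECT_STRING = "192.168.0.113:2181";

    public static void main(String[] args) {
        // Apart from the session events delivered to the default watcher, a watcher
        // becomes invalid once it has been triggered and has to be registered again.
        // Re-registration is not implemented in this example, because no good way of
        // doing it had been settled on yet (holding a ZooKeeper client instance inside
        // the Watcher might create a circular reference). Curator's approach to
        // re-registering watchers is worth consulting later.
        // Default watcher
        DefaultWatcher defaultWatcher = new DefaultWatcher();
        // Watcher for child node changes
        ChildrenWatcher childrenWatcher = new ChildrenWatcher();
        // Watcher for node data changes
        DataWatcher dataWatcher = new DataWatcher();
        try {
            // Create the ZooKeeper client and register the default watcher
            ZooKeeper zooKeeper = new ZooKeeper(CONNECT_STRING, 100000, defaultWatcher);
            // Let the default watcher watch the children of /GetChildren
            // zooKeeper.getChildren("/GetChildren", true);
            // Let childrenWatcher watch the children of /GetChildren
            // (the default watcher no longer watches this node's children)
            zooKeeper.getChildren("/GetChildren", childrenWatcher);
            // Let dataWatcher watch the /GetChildren node itself
            // (the default watcher no longer watches this node)
            zooKeeper.getData("/GetChildren", dataWatcher, null);
            // Keep the process alive so that notifications can be received
            TimeUnit.SECONDS.sleep(1000000);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}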
Test procedure:
First, create the node /GetChildren from the command-line client:
[zk: localhost:2181(CONNECTED) 133] create /GetChildren GetChildrenData
Created /GetChildren
Run the test code WatcherTest, which prints the following:
==========DefaultWatcher start==============
DefaultWatcher state: SyncConnected
DefaultWatcher type: None
DefaultWatcher path: null
==========DefaultWatcher end==============
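This shows that a connection-established event is fired when the client first connects to the ZooKeeper server. The event is delivered to the default watcher, so the default watcher's code runs.
Next, create a child node from the command-line client: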
[zk: localhost:2181(CONNECTED) 134] create /GetChildren/ChildNode ChildNodeData
Created /GetChildren/ChildNode
ChildrenWatcher is notified that the children of /GetChildren have changed, and prints the following:
==========ChildrenWatcher start==============
ChildrenWatcher state: SyncConnected
ChildrenWatcher type: NodeChildrenChanged
ChildrenWatcher path: /GetChildren
==========ChildrenWatcher end==============
Finally, modify the data of the /GetChildren node from the command-line client:
[zk: localhost:2181(CONNECTED) 135] set /GetChildren GetChildrenDataV2
cZxid = 0xab
ctime = Sat Sep 15 03:52:48 PDT 2018
mZxid = 0xb0
mtime = Sat Sep 15 04:06:05 PDT 2018
pZxid = 0xaf
cversion = 1
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 1
DataWatcher is notified and prints the following:
==========DataWatcher start==============
DataWatcher state: SyncConnected
DataWatcher type: NodeDataChanged
DataWatcher path: /GetChildren
==========DataWatcher end==============
We can then modify the data of the /GetChildren node again from the command-line client:
[zk: localhost:2181(CONNECTED) 136] set /GetChildren GetChildrenDataV3
cZxid = 0xab
ctime = Sat Sep 15 03:52:48 PDT 2018
mZxid = 0xb1
mtime = Sat Sep 15 04:14:54 PDT 2018
pZxid = 0xaf
cversion = 1
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 17
numChildren = 1
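This time, however, WatcherTest prints nothing, which shows that DataWatcher is no longer valid. It has to be registered again before it can be triggered again.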
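The re-registration idea can be illustrated without a running server. The following is a toy model, not the ZooKeeper API: RearmingWatchDemo, PathWatcher, watch and fire are hypothetical names, and fire stands in for the server delivering a notification. It only shows the pattern of a watcher that registers itself again each time it is triggered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy one-shot watch table plus a watcher that re-arms itself; not the ZooKeeper API. */
public class RearmingWatchDemo {
    /** Hypothetical callback type standing in for a watcher. */
    public interface PathWatcher {
        void process(String path);
    }

    private final Map<String, List<PathWatcher>> watches = new HashMap<>();
    public final List<String> log = new ArrayList<>();

    /** Registers a one-shot watcher on a path. */
    public void watch(String path, PathWatcher watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    /** Fires and discards every watcher on the path, as a one-shot watch would. */
    public void fire(String path) {
        List<PathWatcher> fired = watches.remove(path);
        if (fired != null) {
            fired.forEach(w -> w.process(path));
        }
    }

    /** Watcher that registers itself again before handling the event. */
    public class Rearming implements PathWatcher {
        @Override
        public void process(String path) {
            watch(path, this);             // re-register first, then do the work
            log.add("changed: " + path);
        }
    }

    public static void main(String[] args) {
        RearmingWatchDemo demo = new RearmingWatchDemo();
        demo.watch("/GetChildren", demo.new Rearming());
        demo.fire("/GetChildren");
        demo.fire("/GetChildren");
        System.out.println(demo.log.size()); // both notifications were handled
    }
}
```

With a real client there is a delay between receiving a notification and setting the next watch, so changes made in that window are not reported individually; re-reading the node when re-registering is the usual way to catch up.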
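Source code analysis of the watcher implementation
We use the getData API, which registers a watcher, to analyze the registration flow, and the setData API to analyze the trigger flow.
getData is implemented in the org.apache.zookeeper.ZooKeeper class. The code is as follows: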
public byte[] getData(final String path, Watcher watcher, Stat stat)
throws KeeperException, InterruptedException
{
final String clientPath = path;
PathUtils.validatePath(clientPath);
// the watch contains the un-chroot path
WatchRegistration wcb = null;
if (watcher != null) {
wcb = new DataWatchRegistration(watcher, clientPath);
}
final String serverPath = prependChroot(clientPath);
RequestHeader h = new RequestHeader();
h.setType(ZooDefs.OpCode.getData);
GetDataRequest request = new GetDataRequest();
request.setPath(serverPath);
request.setWatch(watcher != null);
GetDataResponse response = new GetDataResponse();
ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
if (r.getErr() != 0) {
throw KeeperException.create(KeeperException.Code.get(r.getErr()),
clientPath);
}
if (stat != null) {
DataTree.copyStat(response.getStat(), stat);
}
return response.getData();
}
The key parts are:
......
WatchRegistration wcb = null;
if (watcher != null) {
wcb = new DataWatchRegistration(watcher, clientPath);
}
......
ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
......
First, look at the implementation of the org.apache.zookeeper.ZooKeeper.DataWatchRegistration and org.apache.zookeeper.ZooKeeper.WatchRegistration classes:
/**
* Register a watcher for a particular path.
*/
public abstract class WatchRegistration {
private Watcher watcher;
private String clientPath;
public WatchRegistration(Watcher watcher, String clientPath)
{
this.watcher = watcher;
this.clientPath = clientPath;
}
abstract protected Map<String, Set<Watcher>> getWatches(int rc);
/**
* Register the watcher with the set of watches on path.
* @param rc the result code of the operation that attempted to
* add the watch on the path.
*/
public void register(int rc) {
if (shouldAddWatch(rc)) {
Map<String, Set<Watcher>> watches = getWatches(rc);
synchronized(watches) {
Set<Watcher> watchers = watches.get(clientPath);
if (watchers == null) {
watchers = new HashSet<Watcher>();
watches.put(clientPath, watchers);
}
watchers.add(watcher);
}
}
}
/**
* Determine whether the watch should be added based on return code.
* @param rc the result code of the operation that attempted to add the
* watch on the node
* @return true if the watch should be added, otw false
*/
protected boolean shouldAddWatch(int rc) {
return rc == 0;
}
}
class DataWatchRegistration extends WatchRegistration {
public DataWatchRegistration(Watcher watcher, String clientPath) {
super(watcher, clientPath);
}
@Override
protected Map<String, Set<Watcher>> getWatches(int rc) {
return watchManager.dataWatches;
}
}
The org.apache.zookeeper.ZooKeeper.DataWatchRegistration#getWatches method returns the HashMap in org.apache.zookeeper.ZooKeeper.ZKWatchManager that stores the watchers:
private final Map<String, Set<Watcher>> dataWatches =
new HashMap<String, Set<Watcher>>();
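The org.apache.zookeeper.ZooKeeper.WatchRegistration#register method clearly registers a watcher, so it must be called somewhere later in the flow. In fact, once getData has received its response and the result code indicates success, this method is called to add the watcher to ZKWatchManager. We will analyze that step when we reach it; a rough idea is enough for now.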
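The "register only on success" rule can be tried out in isolation. The sketch below mirrors the shape of register(int rc) using only the JDK; RegisterOnSuccessSketch is a hypothetical name, and watchers are represented by plain strings to keep the example self-contained. It is a simplified model, not the client's actual code path.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified model of "add the watcher only if the request succeeded". */
public class RegisterOnSuccessSketch {
    // path -> names of the watchers registered on that path
    public static final Map<String, Set<String>> dataWatches = new HashMap<>();

    /** Returns true if the watcher was added, i.e. the result code was 0. */
    public static boolean register(int rc, String clientPath, String watcherName) {
        if (rc != 0) {
            return false;   // failed request: the watch table stays untouched
        }
        synchronized (dataWatches) {
            dataWatches.computeIfAbsent(clientPath, p -> new HashSet<>()).add(watcherName);
        }
        return true;
    }

    public static void main(String[] args) {
        // -101 is ZooKeeper's NoNode code; any non-zero value behaves the same way here.
        System.out.println(register(-101, "/missing", "dataWatcher"));
        System.out.println(register(0, "/GetChildren", "dataWatcher"));
        System.out.println(dataWatches);
    }
}
```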
Back to the code in getData that sends the request:
ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
cnxn is of type org.apache.zookeeper.ClientCnxn. Stepping into the submitRequest method:
public ReplyHeader submitRequest(RequestHeader h, Record request, Record response, WatchRegistration watchRegistration) throws InterruptedException {
return submitRequest(h, request, response, watchRegistration, null);
}
public ReplyHeader submitRequest(RequestHeader h, Record request, Record response, WatchRegistration watchRegistration, WatchDeregistration watchDeregistration) throws InterruptedException {
ReplyHeader r = new ReplyHeader();
Packet packet = queuePacket(h, r, request, response, null, null, null, null, watchRegistration, watchDeregistration);
synchronized (packet) {
while (!packet.finished) {
packet.wait();
}
}
return r;
}
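The blocking part of submitRequest is a classic wait/notify handshake: the caller waits on the packet until some other thread marks it finished. The sketch below reproduces only that handshake with the JDK. PacketWaitSketch, submitAndWait and finish are hypothetical names, and the Packet class here is a minimal stand-in with just a flag and an error code, so this is an illustration of the pattern rather than ClientCnxn's real code.

```java
/** Sketch of the wait/notify handshake behind a synchronous request. */
public class PacketWaitSketch {
    /** Minimal stand-in for a request packet: a completion flag and a result code. */
    public static class Packet {
        boolean finished;
        int err = -1;
    }

    /** Blocks until another thread marks the packet finished, then returns its code. */
    public static int submitAndWait(Packet packet) {
        synchronized (packet) {
            while (!packet.finished) {        // the loop guards against spurious wakeups
                try {
                    packet.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
        return packet.err;
    }

    /** What the reply-handling side would do once the response has arrived. */
    public static void finish(Packet packet, int err) {
        synchronized (packet) {
            packet.err = err;
            packet.finished = true;
            packet.notifyAll();               // wake up the thread blocked in submitAndWait
        }
    }

    public static void main(String[] args) {
        Packet packet = new Packet();
        new Thread(() -> finish(packet, 0)).start();
        System.out.println(submitAndWait(packet));
    }
}
```

Because the flag is checked while holding the packet's monitor, the call returns correctly whether the reply thread finishes before or after the caller starts waiting.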
That is all for now. The source code analysis will be completed when time permits (describing source code analysis takes a lot of effort).