1. ZooKeeper Distributed Lock Implementation
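In a monolithic application, concurrency is usually handled with synchronized or other JUC (java.util.concurrent) utilities, which synchronize threads inside a single JVM. In a distributed application, a higher-level locking mechanism is needed to coordinate access to shared data between processes running on different machines. Such a cross-machine lock is called a distributed lock.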
2. How the ZooKeeper Distributed Lock Works
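- Every ZooKeeper node is a natural sequence generator
  When a child of type EPHEMERAL_SEQUENTIAL is created under a node, ZooKeeper appends a sequence number to the new child's name. Each number is one greater than the last number issued under that parent.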
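- The increasing order of the nodes keeps the lock fair
  Among the ephemeral sequential nodes, the one with the smallest number holds the lock. Before taking the lock, each thread checks whether its own number is the smallest. If it is, the thread acquires the lock; if not, it watches the node immediately before its own.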
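- The watch mechanism hands the lock over in order and efficiently
  Before competing for the lock, each thread first takes a number by creating its own znode; to release the lock, it deletes that znode. A thread whose znode is not the smallest waits for a notification about the preceding znode. When that znode is deleted, the waiter is notified and checks again whether its own znode now has the smallest number; if so, it acquires the lock.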
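- Watching only the predecessor avoids the herd effect
  The herd effect occurs when every waiting client watches the same node: a single deletion (a release or a crash) notifies all of them, and they all react at once, which puts heavy load on the server. With ephemeral sequential nodes, each client watches only the node directly before its own, so when a node goes away only the one client right behind it reacts.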
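The queueing rules above can be sketched without a ZooKeeper server. The class below is only an in-memory stand-in, not a ZooKeeper client: the class name and its methods (enqueue, release, holdsLock, nodeToWatch) are hypothetical names chosen for illustration. It assumes the naming that ZooKeeper uses for sequential nodes, a prefix followed by a 10-digit zero-padded counter, so a plain string sort also sorts by sequence number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** In-memory stand-in for the children of a lock parent node (illustration only). */
class SequentialLockQueueSketch {
    private final List<String> children = new ArrayList<>();
    private int nextSequence = 0;

    /** Mimics creating a sequential child: prefix + 10-digit zero-padded counter. */
    public String enqueue(String prefix) {
        String name = String.format("%s%010d", prefix, nextSequence++);
        children.add(name);
        return name;
    }

    /** Mimics deleting the child when the lock is released or the owner goes away. */
    public void release(String name) {
        children.remove(name);
    }

    /** The child with the smallest sequence number holds the lock. */
    public boolean holdsLock(String name) {
        List<String> sorted = sortedChildren();
        return !sorted.isEmpty() && sorted.get(0).equals(name);
    }

    /** Returns the immediate predecessor to watch, or null if the node is first. */
    public String nodeToWatch(String name) {
        List<String> sorted = sortedChildren();
        int index = sorted.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("unknown node: " + name);
        }
        return index == 0 ? null : sorted.get(index - 1);
    }

    private List<String> sortedChildren() {
        List<String> copy = new ArrayList<>(children);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        SequentialLockQueueSketch queue = new SequentialLockQueueSketch();
        String a = queue.enqueue("lock");
        String b = queue.enqueue("lock");
        String c = queue.enqueue("lock");
        System.out.println(a + " holds the lock: " + queue.holdsLock(a));
        System.out.println(c + " watches " + queue.nodeToWatch(c));
        queue.release(a);
        System.out.println(b + " holds the lock: " + queue.holdsLock(b));
    }
}
```

Each waiter has exactly one predecessor to watch, so a release wakes one client instead of all of them.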
3. Implementing the ZooKeeper Distributed Lock
- The distributed lock implementation
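import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import lombok.extern.slf4j.Slf4j;
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

@Slf4j
public class LockUtils implements Watcher, AsyncCallback.StringCallback, AsyncCallback.ChildrenCallback {
    // Connect string with the chroot "/java": all paths below are relative to /java, which must already exist.
    private static final String ZK_CLUSTER = "localhost:2181/java";
    // Written by the calling thread and read by the ZooKeeper event thread (and vice versa), hence volatile.
    private volatile CountDownLatch lockLatch;
    private CountDownLatch connectLatch;
    // volatile is required for the double-checked locking in getZooKeeper().
    private volatile ZooKeeper zooKeeper;
    // Per-thread re-entrancy count; null means the current thread does not hold the lock.
    private ThreadLocal<Integer> threadLock = new ThreadLocal<Integer>();
    // Path of the sequential node created by this instance.
    private volatile String lockFilename;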
    public ZooKeeper getZooKeeper() {
        if (zooKeeper == null) {
            synchronized (LockUtils.class) {
                if (zooKeeper == null) {
                    connectLatch = new CountDownLatch(1);
                    try {
                        // Session timeout is 3000 ms; this watcher only waits for the connection.
                        zooKeeper = new ZooKeeper(ZK_CLUSTER, 3000, new Watcher() {
                            @Override
                            public void process(WatchedEvent event) {
                                switch (event.getState()) {
                                    case SyncConnected:
                                        connectLatch.countDown();
                                        break;
                                    default:
                                        break;
                                }
                            }
                        });
                        // Block until the session is connected.
                        connectLatch.await();
                    } catch (IOException e) {
                        log.error("error occurred,", e);
                    } catch (InterruptedException e) {
                        // Restore the interrupt flag so callers can still see the interruption.
                        Thread.currentThread().interrupt();
                        log.error("error occurred,", e);
                    }
                }
            }
        }
        return zooKeeper;
    }
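    /**
     * Watches for deletion of the ephemeral node directly ahead of ours in the queue.
     *
     * @param event the watched event
     */
    @Override
    public void process(WatchedEvent event) {
        switch (event.getType()) {
            case NodeDeleted:
                // The predecessor is gone: list the children again and re-check our position.
                this.getZooKeeper().getChildren("/", null, this, "lock");
                break;
            default:
                break;
        }
    }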
    /**
     * StringCallback: invoked when the asynchronous create of the sequential node completes.
     * <p>On success, rc is {@link KeeperException.Code#OK}.
     *
     * <p>On failure, rc is set to the corresponding failure code in {@link KeeperException}.
     * <ul>
     * <li>{@link KeeperException.Code#NODEEXISTS}
     * - The node on given path already exists for some API calls.</li>
     * <li>{@link KeeperException.Code#NONODE}
     * - The node on given path doesn't exist for some API calls.</li>
     * <li>{@link KeeperException.Code#NOCHILDRENFOREPHEMERALS}
     * - An ephemeral node cannot have children. There is discussion in
     * community. It might be changed in the future.</li>
     * </ul>
     *
     * @param rc   The return code or the result of the call.
     * @param path The path that we passed to asynchronous calls.
     * @param ctx  Whatever context object that we passed to asynchronous calls.
     * @param name The name of the znode that was created.
     */
    @Override
    public void processResult(int rc, String path, Object ctx, String name) {
        if (rc == KeeperException.Code.OK.intValue()) {
            // Record our node name first, then list the children to see whether we hold the lock.
            lockFilename = name;
            this.getZooKeeper().getChildren("/", null, this, "lock");
        } else {
            log.error("failed to create the lock node, rc={}", rc);
        }
    }
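    /**
     * ChildrenCallback: invoked with the names of all ephemeral lock nodes.
     *
     * @param rc       the return code of the call
     * @param path     the path passed to getChildren
     * @param ctx      the context object passed to getChildren
     * @param children the names of the child nodes
     */
    @Override
    public void processResult(int rc, String path, Object ctx, List<String> children) {
        if (rc != KeeperException.Code.OK.intValue() || children == null) {
            log.error("failed to list the lock nodes, rc={}", rc);
            return;
        }
        Collections.sort(children);
        // lockFilename starts with "/", while child names do not, so drop the leading slash.
        int index = children.indexOf(lockFilename.substring(1));
        if (index == 0) {
            // Our node is first in the queue: we hold the lock.
            lockLatch.countDown();
        }
        if (index > 0) {
            try {
                // Watch only the node directly ahead of ours.
                Stat stat = this.getZooKeeper().exists("/" + children.get(index - 1), this);
                if (stat == null) {
                    // This step matters. Under high concurrency the predecessor may still exist when the
                    // children are listed but be gone by the time exists() registers the watcher, so the
                    // NodeDeleted event we are waiting for would never arrive.
                    // List the children again and repeat the check.
                    this.getZooKeeper().getChildren("/", null, this, "lock");
                    return;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                log.error("error occurred.", e);
            } catch (KeeperException e) {
                log.error("error occurred.", e);
            }
        }
    }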
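    /**
     * Acquires the lock.
     * 1. Create an ephemeral sequential node asynchronously; the result arrives in the StringCallback.
     * 2. When the StringCallback fires, read all lock nodes.
     * 3. If this node is the first one, the lock is acquired.
     * 4. Otherwise, watch for deletion of the node directly ahead of it in the queue.
     * 5. When that node is deleted (released, or its owner has exited), read the lock nodes and check again.
     *
     * @param timeout maximum time to wait for the lock, in seconds
     * @return true if the lock was acquired, false if the wait timed out
     * @throws InterruptedException if the waiting thread is interrupted
     * @throws KeeperException      if a ZooKeeper operation fails
     */
    public boolean lock(int timeout) throws InterruptedException, KeeperException {
        Integer times = threadLock.get();
        if (times != null) {
            // Re-entrant: the current thread already holds the lock, so just increment the count.
            threadLock.set(times + 1);
            return true;
        }
        lockLatch = new CountDownLatch(1);
        ZooKeeper zooKeeper = this.getZooKeeper();
        zooKeeper.create("/lock", "lock".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL, this, "lock");
        // Give up once the timeout expires.
        boolean locked = lockLatch.await(timeout, TimeUnit.SECONDS);
        if (locked) {
            threadLock.set(1);
        } else if (lockFilename != null) {
            // Timed out: delete our queue node, otherwise it would later reach the front of the
            // queue and block every waiter behind it until the session ends.
            zooKeeper.delete(lockFilename, -1);
        }
        return locked;
    }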
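    public void unlock() throws InterruptedException, KeeperException {
        Integer times = threadLock.get();
        if (times == null) {
            throw new IllegalMonitorStateException("current thread does not hold the lock");
        }
        if (times <= 1) {
            // Outermost unlock: clear the count and delete our node, which notifies the next waiter.
            threadLock.remove();
            // Version -1 matches any version of the node.
            this.getZooKeeper().delete(lockFilename, -1);
        } else {
            // Re-entrant unlock: the thread locked several times, so each unlock decrements the count by 1.
            threadLock.set(times - 1);
        }
    }
}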
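The re-entrancy bookkeeping in lock() and unlock() can be looked at in isolation from ZooKeeper. The following is a minimal sketch under that assumption; the class ReentrancyCounterSketch and its methods (heldByCurrentThread, acquired, released) are hypothetical names, not part of LockUtils. A thread keeps a count in a ThreadLocal, only the outermost acquire needs to create the znode, and only the release that brings the count back to zero needs to delete it.

```java
/** Per-thread hold counter, a stand-in for the ThreadLocal<Integer> of a re-entrant lock. */
class ReentrancyCounterSketch {
    private final ThreadLocal<Integer> holdCount = new ThreadLocal<>();

    /** True if the current thread already holds the lock (a nested acquire can return at once). */
    public boolean heldByCurrentThread() {
        return holdCount.get() != null;
    }

    /** Call after the real lock was taken, or for a nested acquire. Returns the new depth. */
    public int acquired() {
        Integer count = holdCount.get();
        int depth = (count == null) ? 1 : count + 1;
        holdCount.set(depth);
        return depth;
    }

    /** Returns the remaining depth; 0 means the caller should now release the real lock. */
    public int released() {
        Integer count = holdCount.get();
        if (count == null) {
            throw new IllegalMonitorStateException("current thread does not hold the lock");
        }
        if (count <= 1) {
            // Clear the slot entirely so a later acquire is not mistaken for a nested one.
            holdCount.remove();
            return 0;
        }
        holdCount.set(count - 1);
        return count - 1;
    }

    public static void main(String[] args) {
        ReentrancyCounterSketch counter = new ReentrancyCounterSketch();
        System.out.println("depth after first acquire: " + counter.acquired());
        System.out.println("depth after nested acquire: " + counter.acquired());
        System.out.println("remaining after first release: " + counter.released());
        System.out.println("remaining after second release: " + counter.released());
        System.out.println("still held: " + counter.heldByCurrentThread());
    }
}
```

Removing the thread-local value on the last release, rather than storing a zero or the old count, is what lets the next lock() call on the same thread go back through the queue.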
- Testing the distributed lock
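@Test
public void testLock() {
    int nums = 100;
    CountDownLatch latch = new CountDownLatch(nums);
    ExecutorService es = Executors.newFixedThreadPool(100);
    try {
        for (int i = 0; i < nums; i++) {
            es.submit(() -> {
                try {
                    // Each task uses its own LockUtils instance, and therefore its own ZooKeeper session.
                    LockUtils util = new LockUtils();
                    boolean locked = util.lock(10);
                    if (locked) {
                        // The lock is re-entrant: a second lock() on the same thread returns immediately.
                        util.lock(10);
                        String name = Thread.currentThread().getName();
                        log.info(name + " acquired the lock");
                        log.info(name + " working...");
                        log.info(name + " finished, releasing the lock");
                        // One unlock() per lock().
                        util.unlock();
                        util.unlock();
                    } else {
                        log.error("lock timeout....");
                    }
                } catch (Exception e) {
                    log.error("error occurred,", e);
                } finally {
                    // Count down on every path so the test cannot hang after a timeout or an exception.
                    latch.countDown();
                }
            });
        }
        latch.await();
        log.info("-----------DONE-------------");
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        log.error("error occurred.", e);
    } finally {
        es.shutdown();
    }
}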
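The test above only logs progress; it does not assert that at most one thread is inside the critical section at a time. One possible way to check that is sketched below. To keep the sketch runnable without a ZooKeeper server, it uses java.util.concurrent.locks.ReentrantLock as a local stand-in for the distributed lock; the class MutualExclusionCheckSketch and the method maxConcurrentHolders are hypothetical names. The same counting idea could be wrapped around LockUtils.lock() and unlock().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class MutualExclusionCheckSketch {
    /**
     * Runs `workers` tasks that each enter the critical section once and returns
     * the largest number of tasks observed inside it at the same time.
     */
    public static int maxConcurrentHolders(Lock lock, int workers) {
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(workers);
        ExecutorService pool = Executors.newFixedThreadPool(Math.min(workers, 16));
        try {
            for (int i = 0; i < workers; i++) {
                pool.submit(() -> {
                    lock.lock();
                    try {
                        int now = inside.incrementAndGet();
                        maxInside.accumulateAndGet(now, Math::max);
                        Thread.yield(); // give other threads a chance to (wrongly) enter
                        inside.decrementAndGet();
                    } finally {
                        lock.unlock();
                        done.countDown();
                    }
                });
            }
            if (!done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
        return maxInside.get();
    }

    public static void main(String[] args) {
        int max = maxConcurrentHolders(new ReentrantLock(), 100);
        System.out.println("max concurrent holders: " + max);
    }
}
```

A correct lock keeps the returned value at 1; any larger value means two workers were inside the critical section together.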