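Leader Elections
- Leader election designates one process as the organizer that distributes tasks to the other nodes. Before the task starts, no node knows which one is the leader (or coordinator). Once the election algorithm has run, every node agrees on a single node as the leader for the task. Elections also commonly happen when the current leader crashes unexpectedly and a new leader has to be chosen. Curator offers two leader election recipes:
  - LeaderSelector: all live clients take turns being leader, one after another, without interruption.
  - LeaderLatch: once a leader has been elected, it keeps leadership until it is closed or its client dies, which triggers a new election.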
LeaderLatch
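- LeaderLatch methods:
  - start(): starts the latch. Once started, this LeaderLatch negotiates with every other LeaderLatch that uses the same path, and one of them is eventually chosen as leader.
  - hasLeadership(): returns true if this LeaderLatch is currently the leader.
  - close(): must be called once the LeaderLatch is no longer needed. If this instance is the leader, it releases leadership and the remaining participants elect a new leader.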
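- Code. At its core, a LeaderLatch election connects to ZooKeeper and creates an ephemeral sequential node for each LeaderLatch under the path /jannal-leader/leader:

    import com.google.common.collect.Lists;
    import org.apache.curator.RetryPolicy;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderLatch;
    import org.apache.curator.framework.recipes.leader.LeaderLatchListener;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.utils.CloseableUtils;
    import org.junit.Test;

    import java.util.List;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;

    // Test method of a JUnit class; `logger` is assumed to be an SLF4J Logger field of that class.
    @Test
    public void testLeader() {
        String connectString = "zk-master:2180,zk-slave1:2182,zk-slave2:2183";
        int sessionTimeoutMs = 25000;
        int connectionTimeoutMs = 5000;
        String rootPath = "jannal-leader";
        List<CuratorFramework> clients = Lists.newArrayList();
        List<LeaderLatch> leaderLatches = Lists.newArrayList();
        int clientCount = 5;
        try {
            for (int i = 0; i < clientCount; i++) {
                RetryPolicy retryPolicy = new ExponentialBackoffRetry(4000, 3);
                CuratorFramework curatorFramework = CuratorFrameworkFactory.builder()
                        .connectString(connectString)
                        .sessionTimeoutMs(sessionTimeoutMs)
                        .connectionTimeoutMs(connectionTimeoutMs)
                        .retryPolicy(retryPolicy)
                        .namespace(rootPath)
                        .build();
                clients.add(curatorFramework);
                curatorFramework.start();

                // One latch per client, all on the same path, with ids C0..C4
                final LeaderLatch leaderLatch = new LeaderLatch(curatorFramework, "/leader", "C" + i);
                leaderLatch.addListener(new LeaderLatchListener() {
                    @Override
                    public void isLeader() {
                        logger.info("{} is the leader", leaderLatch.getId());
                    }

                    @Override
                    public void notLeader() {
                        logger.info("{} is not the leader", leaderLatch.getId());
                    }
                });
                leaderLatches.add(leaderLatch);
                leaderLatch.start();
            }

            // Wait for the election to finish
            TimeUnit.SECONDS.sleep(10);

            LeaderLatch currentLeader = null;
            int currentIndex = 0;
            for (int i = 0; i < clientCount; ++i) {
                LeaderLatch leaderLatch = leaderLatches.get(i);
                // Check whether this latch is the leader
                if (leaderLatch.hasLeadership()) {
                    currentLeader = leaderLatch;
                    currentIndex = i;
                }
            }
            // Guard against the election not having completed within the wait above
            if (currentLeader == null) {
                logger.warn("No leader was elected within the wait time");
                return;
            }
            logger.info("Current leader: {}", currentLeader.getId());
            TimeUnit.SECONDS.sleep(30);

            // Release leadership
            currentLeader.close();
            CloseableUtils.closeQuietly(clients.get(currentIndex));
            // Leadership has been released (the client is closed), so remove both from the lists
            leaderLatches.remove(currentIndex);
            clients.remove(currentIndex);

            // Block so the remaining latches can be observed re-electing
            LockSupport.park();
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        } finally {
            for (LeaderLatch leaderLatch : leaderLatches) {
                CloseableUtils.closeQuietly(leaderLatch);
            }
            for (CuratorFramework client : clients) {
                CloseableUtils.closeQuietly(client);
            }
        }
    }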
- In the code above, five nodes appear in ZooKeeper once the LeaderLatch instances have started (listed in the table below), and the first election picks C1 as leader. The reason is that, when the ephemeral nodes are created, the checkLeadership(List<String> children) method in LeaderLatch sorts all nodes under the election path (/jannal-leader/leader/) by sequence number; if the current node has the smallest sequence number, it becomes the leader. After C1 releases leadership, C4 has the smallest sequence number and is elected leader. If we then use the IDEA ZooKeeper plugin to manually delete C4's node, jannal-leader/leader/_c_21fb5cfb-e92c-4a48-ba40-87b877e9100f-latch-0000000031, C0 (_c_e28110cd-902e-40dc-9de0-4559e70ba1ca-latch-0000000032) is elected leader.

| leaderLatch.getId() | Node path |
|---|---|
| C0 | /jannal-leader/leader/_c_e28110cd-902e-40dc-9de0-4559e70ba1ca-latch-0000000032 |
| C1 | /jannal-leader/leader/_c_ac6553fd-dbc7-4fab-834c-cbd5ea54ddba-latch-0000000030 |
| C2 | /jannal-leader/leader/_c_82d88ba2-5130-4840-8571-8eb80c16a664-latch-0000000033 |
| C3 | /jannal-leader/leader/_c_c7300cbd-fd34-40ff-9bc0-0795fa208ffa-latch-0000000034 |
| C4 | /jannal-leader/leader/_c_21fb5cfb-e92c-4a48-ba40-87b877e9100f-latch-0000000031 |

-
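Based on the analysis above, if the LeaderLatch instances are started one after another in time order, the first one started should become the leader, because its sequence number will be the smallest. Only a small change to the code above is needed. Because LeaderLatch startup runs asynchronously on a thread pool, the sleep(5) here only roughly simulates the startup order; the actual order in which the threads run is hard to determine (we simply assume that 5 seconds is enough for each startup to complete).

    // Before: start every latch, then wait once
    for (int i = 0; i < clientCount; i++) {
        // ... omitted ...
        leaderLatch.start();
    }
    TimeUnit.SECONDS.sleep(10);

Change it to the following:

    // After: wait after each start so the latches start in order
    for (int i = 0; i < clientCount; i++) {
        // ... omitted ...
        leaderLatch.start();
        // Start the latches one by one, in order
        TimeUnit.SECONDS.sleep(5);
    }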
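The sequence-number rule described above can be illustrated without a ZooKeeper connection. The following is a minimal sketch, not Curator's actual checkLeadership implementation: the class name `LatchOrderSketch` and the helpers `sequenceOf` and `pickLeader` are hypothetical, and the node names are the ones from the table in this section. It only assumes that the sequence number is the numeric suffix after the last `-` in the node name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of "smallest sequence number wins"; not Curator code.
final class LatchOrderSketch {

    // Parses the numeric suffix after the last '-' (e.g. "...-latch-0000000031" -> 31).
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    // Returns the child with the smallest sequence number, or null when there are no children.
    static String pickLeader(List<String> children) {
        return children.stream()
                .min(Comparator.comparingLong(LatchOrderSketch::sequenceOf))
                .orElse(null);
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>(Arrays.asList(
                "_c_e28110cd-902e-40dc-9de0-4559e70ba1ca-latch-0000000032",  // C0
                "_c_ac6553fd-dbc7-4fab-834c-cbd5ea54ddba-latch-0000000030",  // C1
                "_c_82d88ba2-5130-4840-8571-8eb80c16a664-latch-0000000033",  // C2
                "_c_c7300cbd-fd34-40ff-9bc0-0795fa208ffa-latch-0000000034",  // C3
                "_c_21fb5cfb-e92c-4a48-ba40-87b877e9100f-latch-0000000031")); // C4

        // Each time the current leader's node disappears, the next-smallest node takes over.
        while (!children.isEmpty()) {
            String leader = pickLeader(children);
            System.out.println("leader node: " + leader);
            children.remove(leader); // simulates close() or a deleted ephemeral node
        }
    }
}
```

Run against the five nodes from the table, the loop visits them in ascending sequence order, so the first three leaders are the nodes ending in 0000000030 (C1), 0000000031 (C4), and 0000000032 (C0), matching the behavior observed above.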
LeaderSelector
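- This recipe differs from LeaderLatch in that every instance gets a fair chance at leadership, and an instance that has released leadership can acquire it again later (provided autoRequeue() has been called). In addition, the elected leader does not hold leadership indefinitely: leadership is released automatically when takeLeadership(CuratorFramework client) returns. When an instance gains leadership, the takeLeadership() method of its listener is invoked.
-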
Code:

    import com.google.common.collect.Lists;
    import org.apache.curator.RetryPolicy;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderSelector;
    import org.apache.curator.framework.recipes.leader.LeaderSelectorListenerAdapter;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.utils.CloseableUtils;
    import org.junit.Test;

    import java.io.Closeable;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.LockSupport;

    // Nested in the test class; `logger` is assumed to be a static SLF4J Logger of that class.
    class CustomLeaderSelectorListenerAdapter extends LeaderSelectorListenerAdapter implements Closeable {
        private String name;
        private LeaderSelector leaderSelector;
        public AtomicInteger leaderCount = new AtomicInteger();

        public CustomLeaderSelectorListenerAdapter(CuratorFramework client, String path, String name) {
            this.name = name;
            this.leaderSelector = new LeaderSelector(client, path, this);
            /*
             * Requeue automatically: this call ensures that the instance can
             * gain leadership again after it has released it.
             */
            leaderSelector.autoRequeue();
        }

        public void start() throws IOException {
            leaderSelector.start();
        }

        @Override
        public void close() throws IOException {
            leaderSelector.close();
        }

        /**
         * Called when this instance gains leadership; leadership is released when the method returns.
         */
        @Override
        public void takeLeadership(CuratorFramework client) throws Exception {
            final int waitSeconds = (int) (5 * Math.random()) + 1;
            logger.info("{} is the leader; number of previous terms as leader: {}", name, leaderCount.getAndIncrement());
            try {
                // Give up leadership after waitSeconds seconds (simulates business logic)
                Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
                // To stay leader indefinitely, block here instead
                // LockSupport.park();
            } catch (InterruptedException e) {
                logger.error(e.getMessage(), e);
                Thread.currentThread().interrupt();
            } finally {
                logger.info("{} gives up leadership", name);
            }
        }
    }

    @Test
    public void testLeaderSelector() {
        String connectString = "zk-master:2180,zk-slave1:2182,zk-slave2:2183";
        int sessionTimeoutMs = 25000;
        int connectionTimeoutMs = 5000;
        String rootPath = "jannal-leader";
        List<CuratorFramework> clients = Lists.newArrayList();
        List<CustomLeaderSelectorListenerAdapter> leaderSelectorListenerList =
                new ArrayList<CustomLeaderSelectorListenerAdapter>();
        int clientCount = 5;
        try {
            for (int i = 0; i < clientCount; i++) {
                RetryPolicy retryPolicy = new ExponentialBackoffRetry(4000, 3);
                CuratorFramework curatorFramework = CuratorFrameworkFactory.builder()
                        .connectString(connectString)
                        .sessionTimeoutMs(sessionTimeoutMs)
                        .connectionTimeoutMs(connectionTimeoutMs)
                        .retryPolicy(retryPolicy)
                        .namespace(rootPath)
                        .build();
                clients.add(curatorFramework);
                curatorFramework.start();

                // Create one listener (and its LeaderSelector) per client
                CustomLeaderSelectorListenerAdapter leaderSelectorListener =
                        new CustomLeaderSelectorListenerAdapter(curatorFramework, "/leaderSelector", "C" + i);
                leaderSelectorListener.start();
                leaderSelectorListenerList.add(leaderSelectorListener);
            }
            // Block so the rotation of leadership can be observed in the log
            LockSupport.park();
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        } finally {
            for (CustomLeaderSelectorListenerAdapter customLeaderSelectorListenerAdapter : leaderSelectorListenerList) {
                CloseableUtils.closeQuietly(customLeaderSelectorListenerAdapter);
            }
            for (CuratorFramework client : clients) {
                CloseableUtils.closeQuietly(client);
            }
        }
    }
- Output of the code above
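Independently of a live ZooKeeper ensemble, the rotation behavior described in this section can be illustrated in memory. The following is a simplified sketch, not Curator's implementation: the class `SelectorRotationSketch` and its `simulate` method are hypothetical, and a plain FIFO queue stands in for the election queue. Real ordering depends on when each client's node is created in ZooKeeper, so treat the strict round-robin order here as an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory model of LeaderSelector-style rotation; not Curator code.
final class SelectorRotationSketch {

    // Simulates up to `rounds` leadership terms and returns the names in the order they led.
    static List<String> simulate(List<String> participants, boolean autoRequeue, int rounds) {
        Deque<String> queue = new ArrayDeque<>(participants);
        List<String> history = new ArrayList<>();
        for (int i = 0; i < rounds && !queue.isEmpty(); i++) {
            String leader = queue.pollFirst(); // head of the queue gains leadership
            history.add(leader);               // takeLeadership() would run and return here
            if (autoRequeue) {
                queue.addLast(leader);         // rejoin the election after releasing leadership
            }
        }
        return history;
    }

    public static void main(String[] args) {
        List<String> clients = Arrays.asList("C0", "C1", "C2");
        // With requeueing, leadership keeps rotating: [C0, C1, C2, C0, C1]
        System.out.println(simulate(clients, true, 5));
        // Without requeueing, each client leads at most once: [C0, C1, C2]
        System.out.println(simulate(clients, false, 5));
    }
}
```

The second call shows why the constructor in the listing calls autoRequeue(): without it, an instance drops out of the election after its first term.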
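Distributed Locks
Reentrant Lock
- Shared Reentrant Lock (InterProcessMutex) is a global reentrant lock. "Reentrant" has the same meaning as in the JDK's ReentrantLock: a client thread that already holds the lock can acquire it again any number of times without blocking, and each acquire must be balanced by a release.
-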
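Code example:

    import org.apache.curator.RetryPolicy;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.locks.InterProcessMutex;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.utils.CloseableUtils;
    import org.junit.Test;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;

    // Test method of a JUnit class; `logger` is assumed to be an SLF4J Logger field of that class.
    @Test
    public void testInterProcessMutex() {
        String connectString = "zk-master:2180,zk-slave1:2182,zk-slave2:2183";
        int sessionTimeoutMs = 25000;
        int connectionTimeoutMs = 5000;
        String rootPath = "jannal-leader";
        final String lockPath = "/shared/reentrantlock";
        final List<CuratorFramework> clientList = new ArrayList<CuratorFramework>();
        int size = 10;
        ExecutorService service = Executors.newFixedThreadPool(20);
        try {
            for (int i = 0; i < size; i++) {
                RetryPolicy retryPolicy = new ExponentialBackoffRetry(4000, 3);
                CuratorFramework curatorFramework = CuratorFrameworkFactory.builder()
                        .connectString(connectString)
                        .sessionTimeoutMs(sessionTimeoutMs)
                        .connectionTimeoutMs(connectionTimeoutMs)
                        .retryPolicy(retryPolicy)
                        .namespace(rootPath)
                        .build();
                clientList.add(curatorFramework);
                curatorFramework.start();
            }
            for (int j = 0; j < size; j++) {
                final CuratorFramework client = clientList.get(j);
                final int index = j;
                service.submit(new Callable<Void>() {
                    @Override
                    public Void call() throws Exception {
                        String clientName = "C" + index;
                        // Each thread uses its own InterProcessMutex on the shared lock path
                        final InterProcessMutex lock = new InterProcessMutex(client, lockPath);
                        // Try to acquire the lock 10 times, waiting at most 1 second each time
                        for (int k = 0; k < 10; k++) {
                            if (lock.acquire(1, TimeUnit.SECONDS)) {
                                try {
                                    logger.info("{} acquired the lock, running business code", clientName);
                                    // Simulate the time taken by business logic
                                    TimeUnit.SECONDS.sleep(2);
                                } finally {
                                    lock.release();
                                }
                            } else {
                                logger.warn("{} failed to acquire the lock", clientName);
                            }
                        }
                        return null;
                    }
                });
            }
            // Block so the worker threads can be observed in the log
            LockSupport.park();
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        } finally {
            service.shutdown();
            for (CuratorFramework client : clientList) {
                CloseableUtils.closeQuietly(client);
            }
        }
    }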
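- Inspecting the nodes: a client that acquires the lock creates a node in ZooKeeper whose value is the machine's IP address. The node is deleted as soon as the lock is released.
- Output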
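The test above never re-acquires the lock while holding it, so it does not actually exercise reentrancy. Since the text compares the shared reentrant lock to the JDK's ReentrantLock, the semantics can be sketched with that JDK class alone. This is an analogy, not Curator code: the class `ReentrantHoldSketch` and its `holdCounts` method are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical JDK-only illustration of reentrant acquire/release; not Curator code.
final class ReentrantHoldSketch {

    // Returns {hold count inside the nested section, hold count after all releases}.
    static int[] holdCounts() {
        ReentrantLock lock = new ReentrantLock();
        int inside;
        lock.lock();                 // first acquisition
        try {
            lock.lock();             // same thread acquires again without blocking
            try {
                inside = lock.getHoldCount();
            } finally {
                lock.unlock();       // balances the second acquisition
            }
        } finally {
            lock.unlock();           // balances the first acquisition
        }
        return new int[] {inside, lock.getHoldCount()};
    }

    public static void main(String[] args) {
        // Prints [2, 0]: two holds while nested, none once every lock() is matched by unlock()
        System.out.println(Arrays.toString(holdCounts()));
    }
}
```

The same discipline applies to InterProcessMutex: every acquire() needs a matching release(), and the lock only becomes available to other clients after the outermost release.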
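Non-Reentrant Lock
- InterProcessSemaphoreMutex

    import org.apache.curator.RetryPolicy;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;
    import org.apache.curator.retry.ExponentialBackoffRetry;
    import org.apache.curator.utils.CloseableUtils;
    import org.junit.Test;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.LockSupport;

    /**
     * Non-reentrant lock: InterProcessSemaphoreMutex.
     * Test method of a JUnit class; `logger` is assumed to be an SLF4J Logger field of that class.
     */
    @Test
    public void testInterProcessSemaphoreMutex() {
        String connectString = "zk-master:2180,zk-slave1:2182,zk-slave2:2183";
        int sessionTimeoutMs = 25000;
        int connectionTimeoutMs = 5000;
        String rootPath = "jannal-leader";
        final String lockPath = "/shared/reentrantlock";
        final List<CuratorFramework> clientList = new ArrayList<CuratorFramework>();
        int size = 10;
        ExecutorService service = Executors.newFixedThreadPool(20);
        try {
            for (int i = 0; i < size; i++) {
                RetryPolicy retryPolicy = new ExponentialBackoffRetry(4000, 3);
                CuratorFramework curatorFramework = CuratorFrameworkFactory.builder()
                        .connectString(connectString)
                        .sessionTimeoutMs(sessionTimeoutMs)
                        .connectionTimeoutMs(connectionTimeoutMs)
                        .retryPolicy(retryPolicy)
                        .namespace(rootPath)
                        .build();
                clientList.add(curatorFramework);
                curatorFramework.start();
            }
            for (int j = 0; j < size; j++) {
                final CuratorFramework client = clientList.get(j);
                final int index = j;
                service.submit(new Callable<Void>() {
                    @Override
                    public Void call() throws Exception {
                        String clientName = "C" + index;
                        // Each thread uses its own InterProcessSemaphoreMutex on the shared lock path
                        final InterProcessSemaphoreMutex lock = new InterProcessSemaphoreMutex(client, lockPath);
                        // Try to acquire the lock 10 times, waiting at most 1 second each time
                        for (int k = 0; k < 10; k++) {
                            if (lock.acquire(1, TimeUnit.SECONDS)) {
                                try {
                                    logger.info("{} acquired the lock, running business code", clientName);
                                    // Simulate the time taken by business logic
                                    TimeUnit.SECONDS.sleep(2);
                                } finally {
                                    lock.release();
                                }
                            } else {
                                logger.warn("{} failed to acquire the lock", clientName);
                            }
                        }
                        return null;
                    }
                });
            }
            // Block so the worker threads can be observed in the log
            LockSupport.park();
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        } finally {
            service.shutdown();
            for (CuratorFramework client : clientList) {
                CloseableUtils.closeQuietly(client);
            }
        }
    }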
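To make the contrast with the reentrant lock concrete, the following sketch uses a JDK Semaphore with a single permit as a stand-in for a non-reentrant mutex. This is an analogy under stated assumptions, not Curator's implementation: the class `NonReentrantSketch` and its `secondAcquireSucceeds` method are hypothetical, and no ZooKeeper connection is involved.

```java
import java.util.concurrent.Semaphore;

// Hypothetical JDK-only illustration of non-reentrant locking; not Curator code.
final class NonReentrantSketch {

    // Returns whether the holder could take the "lock" a second time while still holding it.
    static boolean secondAcquireSucceeds() {
        Semaphore permit = new Semaphore(1);        // one permit behaves like a mutex
        permit.acquireUninterruptibly();            // first acquisition succeeds
        try {
            boolean again = permit.tryAcquire();    // the only permit is already taken
            if (again) {
                permit.release();                   // defensive: undo an unexpected success
            }
            return again;
        } finally {
            permit.release();                       // release the first acquisition
        }
    }

    public static void main(String[] args) {
        // Prints false: a holder cannot re-enter a non-reentrant lock
        System.out.println(secondAcquireSucceeds());
    }
}
```

By the same reasoning, a thread that already holds an InterProcessSemaphoreMutex and calls acquire(1, TimeUnit.SECONDS) again should expect the call to time out rather than succeed, so nested locking code written for InterProcessMutex cannot be switched to the non-reentrant variant without restructuring.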