Storm Source Code Analysis (Part 3)

null_wry

Published 2021-10-14 23:03:16

Tags: storm java

Article link: https://blog.csdn.net/null_wry/article/details/120773477

Contents

I. ACL Permissions
II. Code Analysis
- The verifyAcls() method
- The getTopoAcl() method

2021SC@SDUSC

AclEnforcement is the class Storm uses to enforce ZooKeeper (ZK) ACLs.
We start with some background on ACL permissions.

I. ACL Permissions

Permissions

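1) CREATE(c): create permission; allows creating child nodes under the current node
2) DELETE(d): delete permission; allows deleting child nodes of the current node
3) READ(r): read permission; allows reading the current node's data and listing all of its child nodes
4) WRITE(w): write permission; allows writing data to the current node
5) ADMIN(a): admin permission; allows setting the permissions (ACL) of the current node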

Dimensions

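An ACL can be understood along three dimensions: the scheme, the id (the user, for example a username or an IP address), and the permission (the permissions listed above). It is usually written as scheme:id:permissions.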

1. scheme

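scheme: the scheme determines which mechanism is used for access control. ZooKeeper implements a pluggable ACL framework, so the ACL mechanism can be extended by adding new schemes. zookeeper-3.4.4 supports the following schemes by default: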

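world:
It has a single id, anyone. world:anyone stands for everybody; a node that everyone may access belongs to world:anyone.

auth:
It needs no id. Any user that has authenticated has the permission (ZooKeeper supports authentication through Kerberos as well as username/password authentication).

digest:
Its id has the form username:BASE64(SHA1(password)), and the client must first authenticate with username:password. In ZooKeeper's implementation the SHA-1 is computed over the whole username:password string.

ip:
Its id is the client's IP address. A range can be given when setting it, for example ip:192.168.1.0/16 matches on the first 16 bits of the address, that is, any client in 192.168.x.x.

super:
Under this scheme the id has superuser rights and can do anything (cdrwa).

sasl:
Its id is the id of a user that authenticated through SASL. In zookeeper-3.4.4, SASL authentication is implemented with Kerberos, so a user can only access the nodes it has permission for after passing Kerberos authentication.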
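
To make the digest id concrete, here is a minimal sketch that uses only the JDK. It assumes the hash is built the way ZooKeeper's DigestAuthenticationProvider.generateDigest builds it, namely SHA-1 over the whole username:password string followed by Base64. The class name DigestIdSketch and the credentials alice:secret are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative only: mirrors the assumed shape of a ZooKeeper digest id.
class DigestIdSketch {
    // Turns "username:password" into "username:BASE64(SHA1(username:password))".
    static String digestId(String userColonPassword) {
        String user = userColonPassword.split(":", 2)[0];
        try {
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest(userColonPassword.getBytes(StandardCharsets.UTF_8));
            return user + ":" + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical credentials; prints the id that would follow "digest:" in an ACL.
        System.out.println(digestId("alice:secret"));
    }
}
```

The password itself never appears in the ACL; only the user name and the hash do.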

2. id

The id is closely tied to the scheme. The details were covered in the description of each scheme above, so they are not repeated here.
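
As an example of how an id is interpreted by its scheme, the following sketch shows the prefix matching implied by an ip id such as 192.168.1.0/16. It is not ZooKeeper's own code, only a plain IPv4 illustration; the class IpPrefixSketch and its methods are hypothetical names.

```java
// Illustrative IPv4 prefix matching for ids like "192.168.1.0/16".
class IpPrefixSketch {
    // Packs a dotted-quad IPv4 string into a 32-bit int.
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet in " + ipv4);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True when the client address agrees with the id on the first N bits of "addr/N".
    static boolean matches(String cidr, String clientIp) {
        String[] parts = cidr.split("/");
        int bits = parts.length == 2 ? Integer.parseInt(parts[1]) : 32;
        if (bits < 0 || bits > 32) {
            throw new IllegalArgumentException("bad prefix length in " + cidr);
        }
        // A shift by 32 is a no-op in Java, so the zero-bit case is handled separately.
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (toInt(parts[0]) & mask) == (toInt(clientIp) & mask);
    }

    public static void main(String[] args) {
        System.out.println(matches("192.168.1.0/16", "192.168.200.7")); // true
        System.out.println(matches("192.168.1.0/16", "192.169.0.1"));   // false
    }
}
```

With a 16-bit prefix only the first two octets take part in the comparison, which is why the third octet of the id has no effect.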

3. permission

Any combination of the five permissions above, abbreviated cdrwa.
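
In code, a permission set is a bitmask. The sketch below mirrors the bit values of ZooKeeper's ZooDefs.Perms constants (READ = 1, WRITE = 2, CREATE = 4, DELETE = 8, ADMIN = 16) in a self-contained class, so it runs without ZooKeeper on the classpath. PermsSketch and toLetters are hypothetical names used only here.

```java
// Stand-in constants with the same bit values as org.apache.zookeeper.ZooDefs.Perms.
class PermsSketch {
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;

    // Renders a permission bitmask in the conventional "cdrwa" letter order.
    static String toLetters(int perms) {
        StringBuilder sb = new StringBuilder();
        if ((perms & CREATE) != 0) sb.append('c');
        if ((perms & DELETE) != 0) sb.append('d');
        if ((perms & READ) != 0) sb.append('r');
        if ((perms & WRITE) != 0) sb.append('w');
        if ((perms & ADMIN) != 0) sb.append('a');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLetters(ALL));          // cdrwa
        System.out.println(toLetters(READ));         // r
        System.out.println(toLetters(READ | WRITE)); // rw
    }
}
```

The code analysed in the next part passes such bitmasks around, for example ZooDefs.Perms.READ for the DRPC user.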

II. Code Analysis

The verifyAcls() method

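This method verifies that the ZK ACLs are correct and optionally fixes them when needed.
The conf argument is the cluster configuration. Pass fixUp as true if the ACLs should be repaired, and false otherwise.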

    public static void verifyAcls(Map<String, Object> conf, final boolean fixUp) throws Exception {
        if (!Utils.isZkAuthenticationConfiguredStormServer(conf)) {
            LOG.info("SECURITY IS DISABLED NO FURTHER CHECKS...");
            //There is no security so we are done.
            return;
        }
        ACL superUserAcl = Utils.getSuperUserAcl(conf);
        List<ACL> superAcl = new ArrayList<>(1);
        superAcl.add(superUserAcl);

        List<ACL> drpcFullAcl = new ArrayList<>(2);
        drpcFullAcl.add(superUserAcl);

        String drpcAclString = (String) conf.get(Config.STORM_ZOOKEEPER_DRPC_ACL);
        if (drpcAclString != null) {
            Id drpcAclId = Utils.parseZkId(drpcAclString, Config.STORM_ZOOKEEPER_DRPC_ACL);
            ACL drpcUserAcl = new ACL(ZooDefs.Perms.READ, drpcAclId);
            drpcFullAcl.add(drpcUserAcl);
        }

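First, Utils.isZkAuthenticationConfiguredStormServer() checks whether ZK authentication is configured for the Storm server. If it is not, security is disabled, there is nothing to verify, and the method returns immediately.
If it is configured, the superuser ACL is fetched and added to both the superAcl and drpcFullAcl lists. If Config.STORM_ZOOKEEPER_DRPC_ACL is set, a READ-only ACL for the DRPC user is added to drpcFullAcl as well.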

        List<String> zkServers = (List<String>) conf.get(Config.STORM_ZOOKEEPER_SERVERS);
        int port = ObjectReader.getInt(conf.get(Config.STORM_ZOOKEEPER_PORT));
        String stormRoot = (String) conf.get(Config.STORM_ZOOKEEPER_ROOT);

        try (CuratorFramework zk = ClientZookeeper.mkClient(conf, zkServers, port, "",
                                                            new DefaultWatcherCallBack(), conf, DaemonType.NIMBUS)) {
            if (zk.checkExists().forPath(stormRoot) != null) {
                //First off we want to verify that ROOT is good
                verifyAclStrict(zk, superAcl, stormRoot, fixUp);
            } else {
                LOG.warn("{} does not exist no need to check any more...", stormRoot);
                return;
            }
        }
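
The article does not show the body of verifyAclStrict, so the following is only a sketch of the verify-or-fix pattern that the fixUp flag suggests: compare the ACL found on a path with the expected one, then either repair it or fail. It replaces ZooKeeper with a plain in-memory map, and the class name, the string form of the ACL entries and the sasl:storm user are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Illustrative verify-or-fix logic; NOT the actual Storm implementation.
class VerifyOrFixSketch {
    // "zk" is a hypothetical stand-in for ZooKeeper: path -> ACL entries,
    // each written as a "scheme:id:perms" string.
    static void verifyAclStrict(Map<String, List<String>> zk, List<String> expected,
                                String path, boolean fixUp) {
        List<String> found = zk.get(path);
        if (found == null) {
            return; // the node does not exist, so there is nothing to verify
        }
        // Compare as sets so that the order of entries does not matter.
        if (new HashSet<>(found).equals(new HashSet<>(expected))) {
            return;
        }
        if (!fixUp) {
            throw new IllegalStateException(path + " has ACL " + found + ", expected " + expected);
        }
        // fixUp requested: overwrite the ACL with the expected one.
        zk.put(path, new ArrayList<>(expected));
    }

    public static void main(String[] args) {
        Map<String, List<String>> zk = new HashMap<>();
        zk.put("/storm", Arrays.asList("world:anyone:cdrwa"));
        List<String> superOnly = Arrays.asList("sasl:storm:cdrwa");

        verifyAclStrict(zk, superOnly, "/storm", true);
        System.out.println(zk.get("/storm")); // [sasl:storm:cdrwa]
    }
}
```

Calling the same method with fixUp set to false on a mismatching node throws instead of changing anything, which corresponds to a check-only run.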

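With the root path verified (and repaired if fixUp is set), the method moves on to the paths beneath it. The client is created again, this time with stormRoot as its root, so the subtree paths used below are relative to the Storm root.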

        try (CuratorFramework zk = ClientZookeeper.mkClient(conf, zkServers, port, stormRoot,
                                                            new DefaultWatcherCallBack(), conf, DaemonType.NIMBUS)) {
            //Next verify that the blob store is correct before we start it up.
            if (zk.checkExists().forPath(ClusterUtils.BLOBSTORE_SUBTREE) != null) {
                verifyAclStrictRecursive(zk, superAcl, ClusterUtils.BLOBSTORE_SUBTREE, fixUp);
            }

            if (zk.checkExists().forPath(ClusterUtils.BLOBSTORE_MAX_KEY_SEQUENCE_NUMBER_SUBTREE) != null) {
                verifyAclStrict(zk, superAcl, ClusterUtils.BLOBSTORE_MAX_KEY_SEQUENCE_NUMBER_SUBTREE, fixUp);
            }

            //The blobstore is good, now lets get the list of all topo Ids
            Set<String> topoIds = new HashSet<>();
            if (zk.checkExists().forPath(ClusterUtils.STORMS_SUBTREE) != null) {
                topoIds.addAll(zk.getChildren().forPath(ClusterUtils.STORMS_SUBTREE));
            }

            Map<String, Id> topoToZkCreds = new HashMap<>();
            //Now lets get the creds for the topos so we can verify those as well.
            BlobStore bs = ServerUtils.getNimbusBlobStore(conf, NimbusInfo.fromConf(conf), null);
            try {
                Subject nimbusSubject = new Subject();
                nimbusSubject.getPrincipals().add(new NimbusPrincipal());
                for (String topoId : topoIds) {
                    try {
                        String blobKey = topoId + "-stormconf.ser";
                        Map<String, Object> topoConf = Utils.fromCompressedJsonConf(bs.readBlob(blobKey, nimbusSubject));
                        String payload = (String) topoConf.get(Config.STORM_ZOOKEEPER_TOPOLOGY_AUTH_PAYLOAD);
                        try {
                            topoToZkCreds.put(topoId, new Id("digest", DigestAuthenticationProvider.generateDigest(payload)));
                        } catch (NoSuchAlgorithmException e) {
                            throw new RuntimeException(e);
                        }
                    } catch (KeyNotFoundException knf) {
                        LOG.debug("topo removed {}", topoId, knf);
                    }
                }
            } finally {
                if (bs != null) {
                    bs.shutdown();
                }
            }

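Next, the blob store's ZK paths are verified before the blob store is started.
Once the blob store is known to be good, the method lists all topology ids under STORMS_SUBTREE and collects the credentials of each topology: it reads the topology's <topoId>-stormconf.ser blob, takes the Config.STORM_ZOOKEEPER_TOPOLOGY_AUTH_PAYLOAD value from it, and turns that into a digest Id. These credentials are needed to verify the per-topology nodes later. A topology whose blob can no longer be found is skipped.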

            verifyParentWithReadOnlyTopoChildren(zk, superUserAcl, ClusterUtils.STORMS_SUBTREE, topoToZkCreds, fixUp);
            verifyParentWithReadOnlyTopoChildren(zk, superUserAcl, ClusterUtils.ASSIGNMENTS_SUBTREE, topoToZkCreds, fixUp);
            //There is a race on credentials where they can be leaked in some versions of storm.
            verifyParentWithReadOnlyTopoChildrenDeleteDead(zk, superUserAcl, ClusterUtils.CREDENTIALS_SUBTREE, topoToZkCreds, fixUp);
            //There is a race on logconfig where they can be leaked in some versions of storm.
            verifyParentWithReadOnlyTopoChildrenDeleteDead(zk, superUserAcl, ClusterUtils.LOGCONFIG_SUBTREE, topoToZkCreds, fixUp);
            //There is a race on backpressure too...
            verifyParentWithReadWriteTopoChildrenDeleteDead(zk, superUserAcl, ClusterUtils.BACKPRESSURE_SUBTREE, topoToZkCreds, fixUp);

            if (zk.checkExists().forPath(ClusterUtils.ERRORS_SUBTREE) != null) {
                //errors is a bit special because in older versions of storm the worker created the parent directories lazily
                // because of this it means we need to auto create at least the topo-id directory for all running topos.
                for (String topoId : topoToZkCreds.keySet()) {
                    String path = ClusterUtils.errorStormRoot(topoId);
                    if (zk.checkExists().forPath(path) == null) {
                        LOG.warn("Creating missing errors location {}", path);
                        zk.create().withACL(getTopoReadWrite(path, topoId, topoToZkCreds, superUserAcl, fixUp)).forPath(path);
                    }
                }
            }
            //Error should not be leaked according to the code, but they are not important enough to fail the build if
            // for some odd reason they are leaked.
            verifyParentWithReadWriteTopoChildrenDeleteDead(zk, superUserAcl, ClusterUtils.ERRORS_SUBTREE, topoToZkCreds, fixUp);

            if (zk.checkExists().forPath(ClusterUtils.SECRET_KEYS_SUBTREE) != null) {
                verifyAclStrict(zk, superAcl, ClusterUtils.SECRET_KEYS_SUBTREE, fixUp);
                verifyAclStrictRecursive(zk, superAcl, ClusterUtils.secretKeysPath(WorkerTokenServiceType.NIMBUS), fixUp);
                verifyAclStrictRecursive(zk, drpcFullAcl, ClusterUtils.secretKeysPath(WorkerTokenServiceType.DRPC), fixUp);
            }

            if (zk.checkExists().forPath(ClusterUtils.NIMBUSES_SUBTREE) != null) {
                verifyAclStrictRecursive(zk, superAcl, ClusterUtils.NIMBUSES_SUBTREE, fixUp);
            }

            if (zk.checkExists().forPath("/leader-lock") != null) {
                verifyAclStrictRecursive(zk, superAcl, "/leader-lock", fixUp);
            }

            if (zk.checkExists().forPath(ClusterUtils.PROFILERCONFIG_SUBTREE) != null) {
                verifyAclStrictRecursive(zk, superAcl, ClusterUtils.PROFILERCONFIG_SUBTREE, fixUp);
            }

            if (zk.checkExists().forPath(ClusterUtils.SUPERVISORS_SUBTREE) != null) {
                verifyAclStrictRecursive(zk, superAcl, ClusterUtils.SUPERVISORS_SUBTREE, fixUp);
            }

            // When moving to pacemaker workerbeats can be leaked too...
            verifyParentWithReadWriteTopoChildrenDeleteDead(zk, superUserAcl, ClusterUtils.WORKERBEATS_SUBTREE, topoToZkCreds, fixUp);
        }
    }

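These calls verify parent nodes whose children are per-topology nodes. The topology may only read its node in the ReadOnly variants and may read and write it in the ReadWrite variants.
In some versions of Storm there are races on credentials, on logconfig and on backpressure that can leave nodes behind (leak them). For those subtrees the DeleteDead variants are used, which, judging by their name and the comments, also remove child nodes that belong to topologies that no longer exist.
If zk.checkExists().forPath(ClusterUtils.ERRORS_SUBTREE) is non-null, that is, the ClusterUtils.ERRORS_SUBTREE path exists, at least the topo-id directory has to be created for every running topology. Errors is a special case because in older versions of Storm the worker created the parent directories lazily.
After that, ClusterUtils.SECRET_KEYS_SUBTREE, ClusterUtils.NIMBUSES_SUBTREE, "/leader-lock", ClusterUtils.PROFILERCONFIG_SUBTREE and ClusterUtils.SUPERVISORS_SUBTREE are each verified, recursively, if the path exists. They are checked against the superuser ACL, except for the DRPC secret keys path, which is checked against drpcFullAcl.
Finally, worker heartbeats can be leaked when moving to pacemaker, so ClusterUtils.WORKERBEATS_SUBTREE is handled with the read-write DeleteDead variant as well.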
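
The DeleteDead helpers are not shown in this article. A minimal sketch of the idea, assuming that a "dead" child is simply a child whose topology id has no entry in topoToZkCreds, could look like this. The class DeleteDeadSketch and the topology ids are hypothetical, and the real helpers may perform additional checks before deleting anything.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: finds children of a subtree that belong to no live topology.
class DeleteDeadSketch {
    // Returns the children that have no credentials, i.e. candidates for deletion.
    static Set<String> deadChildren(Collection<String> children, Set<String> liveTopoIds) {
        Set<String> dead = new TreeSet<>(children);
        dead.removeAll(liveTopoIds);
        return dead;
    }

    public static void main(String[] args) {
        // Children found under a subtree such as the credentials path (made-up ids).
        Collection<String> children = Arrays.asList("topo-1", "topo-2", "topo-3");
        // Ids for which credentials were collected (the key set of topoToZkCreds).
        Set<String> live = new HashSet<>(Arrays.asList("topo-1", "topo-3"));

        System.out.println(deadChildren(children, live)); // [topo-2]
    }
}
```

The live children would then be verified with the per-topology ACL, while the remaining ones are treated as leaked nodes.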

The getTopoAcl() method

    private static List<ACL> getTopoAcl(String path, String topoId, Map<String, Id> topoToZkCreds, ACL superAcl, boolean fixUp, int perms) {
        Id id = topoToZkCreds.get(topoId);
        if (id == null) {
            String error = "Could not find credentials for topology " + topoId + " at path " + path + ".";
            if (fixUp) {
                error += " Don't know how to fix this automatically. Please add needed ACLs, or delete the path.";
            }
            throw new IllegalStateException(error);
        }
        List<ACL> ret = new ArrayList<>(2);
        ret.add(superAcl);
        ret.add(new ACL(perms, id));
        return ret;
    }
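
getTopoAcl() looks up the topology's credentials and returns a two-entry ACL list: the superuser ACL plus an ACL that grants perms to the topology's own id. If no credentials are found it throws an IllegalStateException, because there is no way to repair that automatically. The sketch below replays this behaviour with stand-in types so that it runs without ZooKeeper. SimpleAcl, the id strings and the topology names are hypothetical, and the fixUp-specific message suffix is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative replay of getTopoAcl() with stand-in types.
class TopoAclSketch {
    // Hypothetical stand-in for org.apache.zookeeper.data.ACL.
    static final class SimpleAcl {
        final int perms;
        final String id; // written as "scheme:id"

        SimpleAcl(int perms, String id) {
            this.perms = perms;
            this.id = id;
        }
    }

    static final int READ = 1; // same bit value as ZooDefs.Perms.READ
    static final int ALL = 31; // same bit value as ZooDefs.Perms.ALL

    static List<SimpleAcl> topoAcl(String path, String topoId, Map<String, String> topoToZkCreds,
                                   SimpleAcl superAcl, int perms) {
        String id = topoToZkCreds.get(topoId);
        if (id == null) {
            throw new IllegalStateException(
                    "Could not find credentials for topology " + topoId + " at path " + path + ".");
        }
        List<SimpleAcl> ret = new ArrayList<>(2);
        ret.add(superAcl);                 // the superuser always keeps its own ACL
        ret.add(new SimpleAcl(perms, id)); // the topology gets only the requested permissions
        return ret;
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        creds.put("topo-1", "digest:topo-1-user");
        SimpleAcl superAcl = new SimpleAcl(ALL, "sasl:storm");

        List<SimpleAcl> acl = topoAcl("/storms/topo-1", "topo-1", creds, superAcl, READ);
        System.out.println(acl.size());       // 2
        System.out.println(acl.get(1).perms); // 1
    }
}
```

The perms argument is what distinguishes the read-only callers from the read-write ones.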