2021SC@SDUSC
Storm Code Reading (5)
Reading the Topology Code (4)
Topology Submission [1]
1. User code calls submitTopology
Users typically submit a topology through StormSubmitter.submitTopology.
if (args != null && args.length > 0) {
    conf.setNumWorkers(3);
    StormSubmitter.submitTopologyWithProgressBar(args[0], conf, builder.createTopology());
}
The StormSubmitter.submitTopologyWithProgressBar method called above simply adds progress output on top of submitTopology.
2. submitTopologyWithProgressBar
The code is as follows:
public static void submitTopologyWithProgressBar(String name, Map stormConf, StormTopology topology, SubmitOptions opts) throws AlreadyAliveException, InvalidTopologyException, AuthorizationException {
    // show a progress bar so we know we're not stuck (especially on slow connections)
    submitTopology(name, stormConf, topology, opts, new StormSubmitter.ProgressListener() {
        @Override
        public void onStart(String srcFile, String targetFile, long totalBytes) {
            System.out.printf("Start uploading file '%s' to '%s' (%d bytes)\n", srcFile, targetFile, totalBytes);
        }

        @Override
        public void onProgress(String srcFile, String targetFile, long bytesUploaded, long totalBytes) {
            int length = 50;
            int p = (int)((length * bytesUploaded) / totalBytes);
            String progress = StringUtils.repeat("=", p);
            String todo = StringUtils.repeat(" ", length - p);
            System.out.printf("\r[%s%s] %d / %d", progress, todo, bytesUploaded, totalBytes);
        }

        @Override
        public void onCompleted(String srcFile, String targetFile, long totalBytes) {
            System.out.printf("\nFile '%s' uploaded to '%s' (%d bytes)\n", srcFile, targetFile, totalBytes);
        }
    });
}
In essence it still calls submitTopology, while printing some information at the start, progress, and completed stages of the jar upload.
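To make the onProgress arithmetic concrete, here is a minimal sketch that renders the same kind of bar with plain Java. The class ProgressBarSketch and its render helper are hypothetical names introduced only for illustration and are not part of Storm; the sketch also assumes an extra guard for totalBytes == 0, which the excerpt above does not have.

```java
// Sketch only: mirrors the onProgress arithmetic without Storm or commons-lang.
class ProgressBarSketch {
    // Hypothetical helper (not a Storm API): builds one progress line of the given width.
    static String render(long bytesUploaded, long totalBytes, int width) {
        // Assumption: treat an empty file as fully uploaded instead of dividing by zero.
        int filled = totalBytes <= 0 ? width : (int) ((width * bytesUploaded) / totalBytes);
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < width; i++) {
            sb.append(i < filled ? '=' : ' ');
        }
        return sb.append("] ").append(bytesUploaded).append(" / ").append(totalBytes).toString();
    }

    public static void main(String[] args) {
        long total = 2048;
        for (long sent = 0; sent <= total; sent += 512) {
            // '\r' moves the cursor back to the start of the line so the bar redraws in place.
            System.out.print("\r" + render(sent, total, 50));
        }
        System.out.println();
    }
}
```

The carriage return is what lets each update overwrite the previous one, so the bar appears to grow on a single console line.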
3. submitTopology
The code is as follows:
@SuppressWarnings("unchecked")
public static void submitTopology(String name, Map stormConf, StormTopology topology, SubmitOptions opts,
        ProgressListener progressListener) throws AlreadyAliveException, InvalidTopologyException, AuthorizationException {
    submitTopologyAs(name, stormConf, topology, opts, progressListener, null);
}
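StormSubmitter.submitTopology simply delegates to StormSubmitter.submitTopologyAs, passing null as the last argument (the user to submit as).
The following sections analyze StormSubmitter.submitTopologyAs in detail.
Topology Submission [2]
StormSubmitter.submitTopologyAs
1. Loading the configuration
The first thing submitTopologyAs does is load the topology configuration into a HashMap.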
if (!Utils.isValidConf(stormConf)) {
    throw new IllegalArgumentException("Storm conf is not valid. Must be json-serializable");
}
stormConf = new HashMap(stormConf);
stormConf.putAll(Utils.readCommandLineOpts());
Map conf = Utils.readStormConfig();
conf.putAll(stormConf);
stormConf.putAll(prepareZookeeperAuthentication(conf));
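The code above does the following:
(1) Check that the conf passed in by the topology is valid, i.e. JSON-serializable, then copy it into a HashMap.
Here, conf is what the user supplies when building the topology, with code such as: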
Config config = new Config();
config.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 200);
config.setNumWorkers(topoNumWorker);
config.setMaxTaskParallelism(20);
config.put(Config.NIMBUS_HOST, nimbusHost);
config.put(Config.NIMBUS_THRIFT_PORT, 6627);
config.put(Config.STORM_ZOOKEEPER_PORT, 2181);
config.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList(zk));
config.put(Config.TOPOLOGY_NAME, topologyName);
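(2) Load the command-line options into stormConf.
(3) Call readStormConfig to load the configuration files: defaults.yaml first, then storm.yaml (or the file named by the storm.conf.file system property), with the command-line options applied on top.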
public static Map readStormConfig() {
    Map ret = readDefaultConfig();
    String confFile = System.getProperty("storm.conf.file");
    Map storm;
    if (confFile == null || confFile.equals("")) {
        storm = findAndReadConfigFile("storm.yaml", false);
    } else {
        storm = findAndReadConfigFile(confFile, true);
    }
    ret.putAll(storm);
    ret.putAll(readCommandLineOpts());
    return ret;
}
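(4) Finally, load the ZooKeeper authentication settings.
(5) In addition, a component can override the getComponentConfiguration method to change its own configuration.
(6) Externally created components can also be configured through the SpoutDeclarer and BoltDeclarer objects.
Note that there are two variables here, conf and stormConf. conf holds the complete configuration, while stormConf does not include defaults.yaml or storm.yaml. The user configuration is loaded into stormConf first, then defaults.yaml and storm.yaml are loaded into conf, and finally stormConf is merged into conf.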
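The layering described above can be sketched with plain maps. This is only an illustration under stated assumptions: ConfigLayeringSketch, buildStormConf, and buildFullConf are hypothetical names, and the four input maps stand in for the user Config, the command-line options, defaults.yaml, and storm.yaml; no real files are read. Because later putAll calls overwrite earlier keys, the user settings end up taking precedence over storm.yaml, which in turn takes precedence over defaults.yaml.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: reproduces the merge order of submitTopologyAs with in-memory maps.
class ConfigLayeringSketch {
    // Hypothetical helper: user conf plus command-line options (no YAML content).
    static Map<String, Object> buildStormConf(Map<String, Object> userConf, Map<String, Object> cmdLine) {
        Map<String, Object> stormConf = new HashMap<>(userConf);
        stormConf.putAll(cmdLine);
        return stormConf;
    }

    // Hypothetical helper: defaults.yaml, then storm.yaml, then command line, then stormConf.
    static Map<String, Object> buildFullConf(Map<String, Object> defaults, Map<String, Object> stormYaml,
                                             Map<String, Object> cmdLine, Map<String, Object> stormConf) {
        Map<String, Object> conf = new HashMap<>(defaults);
        conf.putAll(stormYaml);   // storm.yaml overrides defaults.yaml
        conf.putAll(cmdLine);     // readStormConfig applies command-line options last
        conf.putAll(stormConf);   // user-supplied settings override everything read so far
        return conf;
    }

    public static void main(String[] args) {
        Map<String, Object> defaults = new HashMap<>();
        defaults.put("topology.workers", 1);
        defaults.put("nimbus.thrift.port", 6627);
        Map<String, Object> stormYaml = new HashMap<>();
        stormYaml.put("nimbus.host", "nimbus.example.com"); // made-up host name
        Map<String, Object> cmdLine = new HashMap<>();
        Map<String, Object> userConf = new HashMap<>();
        userConf.put("topology.workers", 3);

        Map<String, Object> stormConf = buildStormConf(userConf, cmdLine);
        Map<String, Object> conf = buildFullConf(defaults, stormYaml, cmdLine, stormConf);
        System.out.println("workers in conf: " + conf.get("topology.workers"));
        System.out.println("stormConf has thrift port: " + stormConf.containsKey("nimbus.thrift.port"));
    }
}
```

Keeping stormConf separate from conf means only the user-level settings are serialized and sent with the topology, while the full conf is used locally, for example to locate Nimbus.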
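2. Submitting the topology with NimbusClient
Once the configuration is ready, the topology is submitted to Nimbus. In Storm, Nimbus is a Thrift server that accepts RPC calls from clients; NimbusClient passes it the topology configuration as a JSON-formatted string.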
String serConf = JSONValue.toJSONString(stormConf);
NimbusClient client = NimbusClient.getConfiguredClientAs(conf, asUser);
if (topologyNameExists(conf, name, asUser)) {
    throw new RuntimeException("Topology with name `" + name + "` already exists on cluster");
}
String jar = submitJarAs(conf, System.getProperty("storm.jar"), progressListener, asUser);
try {
    LOG.info("Submitting topology " + name + " in distributed mode with conf " + serConf);
    if (opts != null) {
        client.getClient().submitTopologyWithOpts(name, jar, serConf, topology, opts);
    } else {
        // this is for backwards compatibility
        client.getClient().submitTopology(name, jar, serConf, topology);
    }
} catch (InvalidTopologyException e) {
    LOG.warn("Topology submission exception: " + e.get_msg());
    throw e;
} catch (AlreadyAliveException e) {
    LOG.warn("Topology already alive exception", e);
    throw e;
} finally {
    client.close();
}
The core steps are:
(1) Serialize the configuration to a JSON string.
String serConf = JSONValue.toJSONString(stormConf);
(2) Obtain the Nimbus client object.
NimbusClient client = NimbusClient.getConfiguredClientAs(conf, asUser);
One line inside getConfiguredClientAs determines the Nimbus address:
String nimbusHost = (String) conf.get(Config.NIMBUS_HOST);
(3) Check whether the topology name already exists.
if (topologyNameExists(conf, name, asUser)) {
    throw new RuntimeException("Topology with name `" + name + "` already exists on cluster");
}
(4) Upload the jar to Nimbus.
String jar = submitJarAs(conf, System.getProperty("storm.jar"), progressListener, asUser);
(5) Finally, call submitTopologyWithOpts to formally submit the topology to Nimbus, with the following arguments:
client.getClient().submitTopologyWithOpts(name, jar, serConf, topology, opts);
The submitTopologyWithOpts method being called consists of:
send_submitTopologyWithOpts(name, uploadedJarLocation, jsonConf, topology, options);
recv_submitTopologyWithOpts();
That is, it sends the request to the Thrift server and then receives the response. The request carries:
args.set_name(name);
args.set_uploadedJarLocation(uploadedJarLocation);
args.set_jsonConf(jsonConf);
args.set_topology(topology);
args.set_options(options);
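Here set_uploadedJarLocation specifies the location the jar was uploaded to.
In summary, "submitting a topology" means uploading the jar to Nimbus and sending the topology's configuration to the Thrift server over Thrift, then waiting for Nimbus to process it. The topology is not actually up at this point; the submission is only complete once recv_submitTopologyWithOpts receives a successful response.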
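The order of the steps summarized above can be sketched against an in-memory stand-in. Everything here is an assumption made for illustration: SubmitFlowSketch, FakeNimbus, and their methods are hypothetical and are not Storm or Thrift APIs, and the upload path is made up. The sketch only shows the control flow: name check, jar upload, submit call, and closing the client in a finally block.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: imitates the control flow of submitTopologyAs with a fake Nimbus.
class SubmitFlowSketch {
    // Hypothetical in-memory stand-in for the Nimbus Thrift client.
    static class FakeNimbus {
        final Set<String> running = new HashSet<>();
        final List<String> log = new ArrayList<>();

        String uploadJar(String localJar) {
            log.add("upload");
            return "/made-up/inbox/" + localJar; // invented remote location
        }

        void submit(String name, String jarLocation, String jsonConf) {
            log.add("submit");
            running.add(name);
        }

        void close() {
            log.add("close");
        }
    }

    static void submitTopology(FakeNimbus nimbus, String name, String localJar, String jsonConf) {
        // Step 3: refuse to submit a name that already exists on the cluster.
        if (nimbus.running.contains(name)) {
            throw new RuntimeException("Topology with name `" + name + "` already exists on cluster");
        }
        // Step 4: upload the jar and remember where it landed.
        String jar = nimbus.uploadJar(localJar);
        try {
            // Step 5: send name, jar location, and serialized conf in one call.
            nimbus.submit(name, jar, jsonConf);
        } finally {
            // The client is closed whether or not the submit call succeeded.
            nimbus.close();
        }
    }

    public static void main(String[] args) {
        FakeNimbus nimbus = new FakeNimbus();
        submitTopology(nimbus, "demo", "demo.jar", "{\"topology.workers\":3}");
        System.out.println(nimbus.log);
    }
}
```

As in the excerpt, the duplicate-name check throws before the try block is entered, so in that case neither the upload nor the close call happens in this sketch.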
Reference: https://blog.csdn.net/jediael_lu/article/details/76794825