在一中讲到了topology提交给nimbus
nimbus
Nimbus可以 说是storm中最核心的部分,它的主要功能有两个:
- 对Topology的任务进行分配资源
- 接收用户的命令并做相应的处理,如Topology的提交,杀死,激活等等
服务接口的定义都在storm.thrift文件中定义,贴下部分代码:
service Nimbus {
void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
void submitTopologyWithOpts(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology, 5: SubmitOptions options) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
void killTopology(1: string name) throws (1: NotAliveException e);
void killTopologyWithOpts(1: string name, 2: KillOptions options) throws (1: NotAliveException e);
void activate(1: string name) throws (1: NotAliveException e);
void deactivate(1: string name) throws (1: NotAliveException e);
void rebalance(1: string name, 2: RebalanceOptions options) throws (1: NotAliveException e, 2: InvalidTopologyException ite);
// need to add functions for asking about status of storms, what nodes they're running on, looking at task logs
string beginFileUpload();
void uploadChunk(1: string location, 2: binary chunk);
void finishFileUpload(1: string location);
string beginFileDownload(1: string file);
//can stop downloading chunks when receive 0-length byte array back
binary downloadChunk(1: string id);
// returns json
string getNimbusConf();
// stats functions
ClusterSummary getClusterInfo();
TopologyInfo getTopologyInfo(1: string id) throws (1: NotAliveException e);
//returns json
string getTopologyConf(1: string id) throws (1: NotAliveException e);
StormTopology getTopology(1: string id) throws (1: NotAliveException e);
StormTopology getUserTopology(1: string id) throws (1: NotAliveException e);
}
当执行命令 nohup ${STORM_HOME}/bin/storm nimbus & 时,会启动nimbus服务,具体的代码执行:
storm python脚本代码,默认启动backtype.storm.daemon.nimbus程序:
def nimbus(klass="backtype.storm.daemon.nimbus"):
"""Syntax: [storm nimbus]
Launches the nimbus daemon. This command should be run under
supervision with a tool like daemontools or monit.
See Setting up a Storm cluster for more information.
(http://storm.incubator.apache.org/documentation/Setting-up-a-Storm-cluster)
"""
cppaths = [CLUSTER_CONF_DIR]
jvmopts = parse_args(confvalue("nimbus.childopts", cppaths)) + [
"-Dlogfile.name=nimbus.log",
"-Dlogback.configurationFile=" + STORM_DIR + "/logback/cluster.xml",
]
exec_storm_class(
klass,
jvmtype="-server",
extrajars=cppaths,
jvmopts=jvmopts)
然后执行nimbus.clj 脚本,主要涉及两个方法——launch-server!(nimbus的启动入口)和service-handler(真正定义处理逻辑的地方)。
nimbus启动后,对外提供了一些服务,topology的提交,UI信息,topology的kill,rebalance等等。在文章一中讲到提交topology给nimbus,这些服务的处理逻辑全部在service-handler方法中。以下截取service-handler里面处理提交Topology的逻辑
(reify Nimbus$Iface
(^void submitTopologyWithOpts
[this ^String storm-name ^String uploadedJarLocation ^String serializedConf ^StormTopology topology
^SubmitOptions submitOptions]
(try
(assert (not-nil? submitOptions))
(validate-topology-name! storm-name)
(check-storm-active! nimbus storm-name false)
(let [topo-conf (from-json serializedConf)]
(try
(validate-configs-with-schemas topo-conf)
(catch IllegalArgumentException ex
(throw (InvalidTopologyException. (.getMessage ex)))))
(.validate ^backtype.storm.nimbus.ITopologyValidator (:validator nimbus)
storm-name
topo-conf
topology))
(swap! (:submitted-count nimbus) inc)
(let [storm-id (str storm-name "-" @(:submitted-count nimbus) "-" (current-time-secs))
storm-conf (normalize-conf
conf
(-> serializedConf
from-json
(assoc STORM-ID storm-id)
(assoc TOPOLOGY-NAME storm-name))
topology)
total-storm-conf (merge conf storm-conf)
topology (normalize-topology total-storm-conf topology)
storm-cluster-state (:storm-cluster-state nimbus)]
(system-topology! total-storm-conf topology) ;; this validates the structure of the topology
(log-message "Received topology submission for " storm-name " with conf " storm-conf)
;; lock protects against multiple topologies being submitted at once and
;; cleanup thread killing topology in b/w assignment and starting the topology
(locking (:submit-lock nimbus)
(setup-storm-code conf storm-id uploadedJarLocation storm-conf topology)
(.setup-heartbeats! storm-cluster-state storm-id)
(let [thrift-status->kw-status {TopologyInitialStatus/INACTIVE :inactive
TopologyInitialStatus/ACTIVE :active}]
(start-storm nimbus storm-name storm-id (thrift-status->kw-status (.get_initial_status submitOptions))))
(mk-assignments nimbus)))
(catch Throwable e
(log-warn-error e "Topology submission exception. (topology name='" storm-name "')")
(throw e))))
(^void submitTopology
[this ^String storm-name ^String uploadedJarLocation ^String serializedConf ^StormTopology topology]
(.submitTopologyWithOpts this storm-name uploadedJarLocation serializedConf topology
(SubmitOptions. TopologyInitialStatus/ACTIVE)))
检查Topology的DAG图是否是有效连接图、以及该topology Name是否已经存在,然后分配资源和任务调度(mk-assignments )方法,等分配好资源之后,把数据写入到zookeeper,watcher发现有数据,就通知supervisor读取数据启动新的worker,一个worker就是一个JVM进程,worker启动后就会按照用户事先定好的task数来启动task,一个task就是一个thread
在executor.clj中mk-threads: spout ,mk-threads: bolt方法就是启动task,而task就是对应的spout或bolt 组件,而且这时Spout的open,nextTuple方法,以及bolt的preapre,execute方法都是在这里被调用的,结合文章一中提到的,对于
Spout 方法调用顺序:
declareOutputFields-> open -> nextTuple -> fail/ack or other
Bolt 方法调用顺序:
declareOutputFields-> prepare -> execute
需要的注意的是在Spout中fail、ack方法和nextTuple是在同一线程中被顺序调用的,所以在nextTuple中不要做延迟很大的操作。
至此,一个topology算是可以正式启动工作了。