Storm Source Code Analysis: The Topology Submission Process
A Storm cluster runs topologies, and each topology is a graph of spouts and bolts. After developing a topology we package it as a jar and run storm jar xxxxxx.jar xxxxxxxClass in a shell, which uploads the jar to the cluster's nimbus and starts the topology. This article analyzes how the topology jar gets uploaded to nimbus. We start with Storm's jar command, which is implemented in the bin/storm file under the Storm root directory. It is defined as follows:
def jar(jarfile, klass, *args):
    """Syntax: [storm jar topology-jar-path class ...]
    Runs the main method of class with the specified arguments.
    The storm jars and configs in ~/.storm are put on the classpath.
    The process is configured so that StormSubmitter
    (http://nathanmarz.github.com/storm/doc/backtype/storm/StormSubmitter.html)
    will upload the jar at topology-jar-path when the topology is submitted.
    """
    exec_storm_class(
        klass,
        jvmtype="-client",
        extrajars=[jarfile, USER_CONF_DIR, STORM_DIR + "/bin"],
        args=args,
        jvmopts=[' '.join(filter(None, [JAR_JVM_OPTS, "-Dstorm.jar=" + jarfile]))])
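The jar command is implemented in Python; why it is not written in Clojure is unclear. jarfile is the path of the jar; klass is the entry point of the topology, that is, the class that contains the main method; *args are the arguments passed to main. jvmtype="-client" selects the client JVM (the JVM comes in two types, client and server; server-side machines default to the server type). extrajars lists the extra classpath entries: the topology jar itself, USER_CONF_DIR and STORM_DIR/bin. jvmopts holds the JVM options. The important one is -Dstorm.jar, whose value is jarfile, so that submitTopology can later read the jar path from the storm.jar property (a method argument passed through a JVM option). The logic of exec_storm_class is simple; its implementation is as follows: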
def exec_storm_class(klass, jvmtype="-server", jvmopts=[], extrajars=[], args=[], fork=False):
    global CONFFILE
    all_args = [
        "java", jvmtype, get_config_opts(),
        "-Dstorm.home=" + STORM_DIR,
        "-Djava.library.path=" + confvalue("java.library.path", extrajars),
        "-Dstorm.conf.file=" + CONFFILE,
        "-cp", get_classpath(extrajars),
    ] + jvmopts + [klass] + list(args)
    print "Running: " + " ".join(all_args)
    if fork:
        os.spawnvp(os.P_WAIT, "java", all_args)
    else:
        os.execvp("java", all_args) # replaces the current process and never returns
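To make the hand-off through -Dstorm.jar concrete, here is a minimal Python 3 sketch of how such a launcher could assemble its argument list. It is not the bin/storm code: build_java_command, its parameters and the /opt/storm default are hypothetical names chosen for illustration, and the configuration lookups of the real script are left out.

```python
import os


def build_java_command(klass, jarfile, args=(), jvmtype="-client",
                       storm_dir="/opt/storm", extra_classpath=()):
    """Return the argv list for running klass with the jar path exposed as a JVM property.

    All parameter names and defaults are illustrative assumptions.
    """
    classpath = os.pathsep.join([jarfile, *extra_classpath])
    return [
        "java", jvmtype,
        "-Dstorm.home=" + storm_dir,
        "-cp", classpath,
        # The submitter later reads this back with System.getProperty("storm.jar").
        "-Dstorm.jar=" + jarfile,
        klass,          # class whose main method is executed
        *args,          # arguments forwarded to main
    ]


cmd = build_java_command("storm.starter.WordCountTopology", "wc.jar", ["wordcount"])
print(" ".join(cmd))
```

JVM options must come before the class name; anything after the class name is handed to main as a program argument, which is why the -D option is placed ahead of klass.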
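get_config_opts() packs the configuration options given on the command line into a JVM option, confvalue("java.library.path", extrajars) looks up the load path of the native library Storm uses (JZMQ), and get_classpath(extrajars) builds the full classpath from all dependency jars. These pieces are joined into a java -cp command that runs the topology's main method. Execution then moves into the topology's main method; we use WordCountTopology from the storm-starter project as the example: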
TopologyBuilder builder = new TopologyBuilder();
builder . setSpout( "spout" , new RandomSentenceSpout (), 6);
builder . setBolt( "split" , new SplitSentence (), 12 ). shuffleGrouping( "spout");
builder . setBolt( "count" , new WordCount (), 10 ). fieldsGrouping( "split" , new Fields( "word"));
Config conf = new Config();
conf . setDebug( true);
if ( args != null && args . length > 0) {
conf . setNumWorkers( 4);
StormSubmitter . submitTopology( args [ 0 ], conf , builder . createTopology());
}
else {
conf . setMaxTaskParallelism( 3);
LocalCluster cluster = new LocalCluster();
cluster . submitTopology( "word-count" , conf , builder . createTopology());
Thread . sleep( 10000);
cluster . shutdown();
}
}
After main builds the topology, it calls the submitTopology method of StormSubmitter to submit it. The submitTopology methods are as follows:
/**
* Submits a topology to run on the cluster. A topology runs forever or until
* explicitly killed.
*
*
* @param name the name of the storm.
* @param stormConf the topology-specific configuration. See {@link Config}.
* @param topology the processing to execute.
* @throws AlreadyAliveException if a topology with this name is already running
* @throws InvalidTopologyException if an invalid topology was submitted
*/
public static void submitTopology( String name , Map stormConf , StormTopology topology)
throws AlreadyAliveException , InvalidTopologyException {
submitTopology( name , stormConf , topology , null);
}
/**
* Submits a topology to run on the cluster. A topology runs forever or until
* explicitly killed.
*
*
* @param name the name of the storm.
* @param stormConf the topology-specific configuration. See {@link Config}.
* @param topology the processing to execute.
* @param options to manipulate the starting of the topology
* @throws AlreadyAliveException if a topology with this name is already running
* @throws InvalidTopologyException if an invalid topology was submitted
*/
public static void submitTopology( String name , Map stormConf , StormTopology topology , SubmitOptions opts)
throws AlreadyAliveException , InvalidTopologyException {
if (! Utils . isValidConf( stormConf)) {
throw new IllegalArgumentException( "Storm conf is not valid. Must be json-serializable");
}
stormConf = new HashMap( stormConf);
stormConf . putAll( Utils . readCommandLineOpts());
Map conf = Utils . readStormConfig();
conf . putAll( stormConf);
try {
String serConf = JSONValue . toJSONString( stormConf);
if( localNimbus != null) {
LOG . info( "Submitting topology " + name + " in local mode");
localNimbus . submitTopology( name , null , serConf , topology);
} else {
NimbusClient client = NimbusClient . getConfiguredClient( conf);
if( topologyNameExists( conf , name)) {
throw new RuntimeException( "Topology with name `" + name + "` already exists on cluster");
}
submitJar( conf);
try {
LOG . info( "Submitting topology " + name + " in distributed mode with conf " + serConf);
if( opts != null) {
client . getClient (). submitTopologyWithOpts( name , submittedJar , serConf , topology , opts);
} else {
// this is for backwards compatibility
client . getClient (). submitTopology( name , submittedJar , serConf , topology);
}
} catch( InvalidTopologyException e) {
LOG . warn( "Topology submission exception" , e);
throw e;
} catch( AlreadyAliveException e) {
LOG . warn( "Topology already alive exception" , e);
throw e;
} finally {
client . close();
}
}
LOG . info( "Finished submitting topology: " + name);
} catch( TException e) {
throw new RuntimeException( e);
}
}
submitTopology does three main things:
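1. Prepare the configuration
Command-line options are put into stormConf, the settings from conf/storm.yaml are read into conf, and stormConf is then put into conf as well, so command-line options take precedence. stormConf is converted to JSON because it has to be sent to the server.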
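The layering described above can be sketched with plain dictionaries. This is a simplified Python model, not the Java code: build_submit_confs and the sample keys in the usage lines are illustrative assumptions.

```python
import json


def build_submit_confs(topology_conf, command_line_opts, cluster_conf):
    """Layer the three configuration sources; later updates override earlier ones."""
    storm_conf = dict(topology_conf)       # copy, so the caller's map is untouched
    storm_conf.update(command_line_opts)   # command-line options win over topology settings
    conf = dict(cluster_conf)
    conf.update(storm_conf)                # topology-level settings win over cluster settings
    # Only the topology-level map is serialized and sent to the server.
    ser_conf = json.dumps(storm_conf, sort_keys=True)
    return conf, ser_conf


conf, ser_conf = build_submit_confs(
    {"topology.workers": 4, "topology.debug": True},
    {"topology.workers": 8},
    {"topology.workers": 1, "nimbus.host": "localhost"},
)
print(conf["topology.workers"], ser_conf)
```

The merged conf is only used locally (for example to find nimbus), while the smaller serialized map is what travels with the topology.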
2. Call the submitJar method
private static void submitJar( Map conf) {
if( submittedJar == null) {
LOG . info( "Jar not uploaded to master yet. Submitting jar...");
String localJar = System . getProperty( "storm.jar");
submittedJar = submitJar( conf , localJar);
} else {
LOG . info( "Jar already uploaded to master. Not submitting jar.");
}
}
System.getProperty("storm.jar") reads the value of the JVM property storm.jar, that is, the path of the topology jar, and then the overloaded submitJar method is called:
if( localJar == null) {
throw new RuntimeException( "Must submit topologies using the 'storm' client script so that StormSubmitter knows which jar to upload.");
}
NimbusClient client = NimbusClient . getConfiguredClient( conf);
try {
String uploadLocation = client . getClient (). beginFileUpload();
LOG . info( "Uploading topology jar " + localJar + " to assigned location: " + uploadLocation);
BufferFileInputStream is = new BufferFileInputStream( localJar);
while( true) {
byte [] toSubmit = is . read();
if( toSubmit . length == 0) break;
client . getClient (). uploadChunk( uploadLocation , ByteBuffer . wrap( toSubmit));
}
client . getClient (). finishFileUpload( uploadLocation);
LOG . info( "Successfully uploaded topology jar to assigned location: " + uploadLocation);
return uploadLocation;
} catch( Exception e) {
throw new RuntimeException( e);
} finally {
client . close();
}
}
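StormSubmitter is essentially a Thrift client and Nimbus is the Thrift server, so every operation goes through Thrift RPC. submitJar first creates a client and then calls beginFileUpload() on the nimbus Thrift server to obtain the location where nimbus will store the jar. beginFileUpload is as follows: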
( let [ fileloc ( str ( inbox nimbus) "/stormjar-" ( uuid) ".jar" )]
( .put ( :uploaders nimbus)
fileloc
( Channels/newChannel ( FileOutputStream. fileloc)))
( log-message "Uploading file from client to " fileloc)
fileloc
))
(inbox nimbus) in turn calls master-inbox, which creates the {storm.local.dir}/nimbus/inbox directory and returns its full path. The topology jar is therefore uploaded through uploadChunk to {storm.local.dir}/nimbus/inbox/stormjar-{uuid}.jar on nimbus.
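The three-call upload protocol (beginFileUpload, uploadChunk, finishFileUpload) can be sketched without a cluster. FakeNimbus below is a hypothetical in-memory stand-in, not the Thrift interface, and the inbox path and chunk size are made-up values.

```python
import io
import uuid


class FakeNimbus:
    """In-memory stand-in for the upload RPCs (illustrative only)."""

    def __init__(self, inbox="/tmp/storm-local/nimbus/inbox"):
        self.inbox = inbox
        self.uploaders = {}   # upload location -> buffer still being written
        self.files = {}       # upload location -> bytes of a finished upload

    def begin_file_upload(self):
        location = "%s/stormjar-%s.jar" % (self.inbox, uuid.uuid4())
        self.uploaders[location] = io.BytesIO()
        return location

    def upload_chunk(self, location, chunk):
        self.uploaders[location].write(chunk)

    def finish_file_upload(self, location):
        self.files[location] = self.uploaders.pop(location).getvalue()


def submit_jar(nimbus, jar_stream, chunk_size=4):
    """Stream the jar to nimbus chunk by chunk and return the assigned location."""
    location = nimbus.begin_file_upload()
    while True:
        chunk = jar_stream.read(chunk_size)
        if not chunk:          # an empty read marks the end of the file
            break
        nimbus.upload_chunk(location, chunk)
    nimbus.finish_file_upload(location)
    return location


nimbus = FakeNimbus()
print(submit_jar(nimbus, io.BytesIO(b"0123456789")))
```

The server picks the file name, so the client only has to remember the returned location and pass it along when it submits the topology.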
3. Create a Thrift client and call submitTopologyWithOpts or submitTopology on the nimbus Thrift server (both methods are defined in Nimbus.clj). submitTopologyWithOpts is as follows:
[ this ^ String storm-name ^ String uploadedJarLocation ^ String serializedConf ^ StormTopology topology
^ SubmitOptions submitOptions ]
( try
( assert ( not-nil? submitOptions))
( validate-topology-name! storm-name)
( check-storm-active! nimbus storm-name false)
( let [ topo-conf ( from-json serializedConf )]
( try
( validate-configs-with-schemas topo-conf)
( catch IllegalArgumentException ex
( throw ( InvalidTopologyException. ( .getMessage ex)))))
( .validate ^ backtype.storm.nimbus.ITopologyValidator ( :validator nimbus)
storm-name
topo-conf
topology))
( swap! ( :submitted-count nimbus) inc)
( let [ storm-id ( str storm-name "-" @( :submitted-count nimbus) "-" ( current-time-secs))
storm-conf ( normalize-conf
conf
( -> serializedConf
from-json
( assoc STORM-ID storm-id)
( assoc TOPOLOGY-NAME storm-name))
topology)
total-storm-conf ( merge conf storm-conf)
topology ( normalize-topology total-storm-conf topology)
storm-cluster-state ( :storm-cluster-state nimbus )]
( system-topology! total-storm-conf topology) ;; this validates the structure of the topology
( log-message "Received topology submission for " storm-name " with conf " storm-conf)
;; lock protects against multiple topologies being submitted at once and
;; cleanup thread killing topology in b/w assignment and starting the topology
( locking ( :submit-lock nimbus)
( setup-storm-code conf storm-id uploadedJarLocation storm-conf topology)
( .setup-heartbeats! storm-cluster-state storm-id)
( let [ thrift-status->kw-status { TopologyInitialStatus/INACTIVE :inactive
TopologyInitialStatus/ACTIVE :active }]
( start-storm nimbus storm-name storm-id ( thrift-status->kw-status ( .get_initial_status submitOptions))))
( mk-assignments nimbus)))
( catch Throwable e
( log-warn-error e "Topology submission exception. (topology name='" storm-name "')")
( throw e))))
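storm-name is the name of the topology, uploadedJarLocation is the location of the jar on nimbus, serializedConf is the serialized topology configuration, and topology is the Thrift struct that describes the topology. The topology struct is defined in storm.thrift as follows: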
//ids must be unique across maps
// # workers to use is in conf
1 : required map<string, SpoutSpec> spouts;
2 : required map<string, Bolt> bolts;
3 : required map<string, StateSpoutSpec> state_spouts;
}
spouts maps spout ids to spouts and bolts maps bolt ids to bolts; StateSpoutSpec is not implemented yet. SpoutSpec is defined as follows:
1 : required ComponentObject spout_object;
2 : required ComponentCommon common;
// can force a spout to be non-distributed by overriding the component configuration
// and setting TOPOLOGY_MAX_TASK_PARALLELISM to 1
}
Bolt is defined as follows:
1 : required ComponentObject bolt_object;
2 : required ComponentCommon common;
}
Bolt and SpoutSpec have the same shape: one ComponentObject and one ComponentCommon. ComponentObject is defined as follows:
1 : binary serialized_java;
2 : ShellComponent shell;
3 : JavaObject java_object;
}
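ComponentObject is the implementation of the bolt (or spout). It is one of the following three types:
1. A serialized Java object (one that implements the IBolt interface).
2. A ShellComponent object, which means the bolt is implemented in another language. When a bolt is defined this way, Storm instantiates a ShellBolt object that handles the communication between the JVM-based worker process and the non-JVM implementation of the component (the bolt).
3. A JavaObject struct, which tells Storm the class name and constructor arguments needed to instantiate the bolt. This is useful when the topology is defined in a non-JVM language: JVM-based spouts and bolts can be used without creating and serializing their Java objects.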
ComponentCommon is defined as follows:
1 : required map<GlobalStreamId, Grouping> inputs;
2 : required map<string, StreamInfo> streams ; //key is stream id
3 : optional i32 parallelism_hint ; //how many threads across the cluster should be dedicated to this component
// component specific configuration respects :
// topology.debug : false
// topology.max.task.parallelism : null // can replace isDistributed with this
// topology.max.spout.pending : null
// topology.kryo.register // this is the only additive one
// component specific configuration
4 : optional string json_conf;
}
GlobalStreamId is defined as follows:
1 : required string componentId;
2 : required string streamId;
# Going to need to add an enum for the stream type ( NORMAL or FAILURE)
}
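ComponentCommon defines all the other properties of the component:
1. Which streams the component consumes (the inputs map, keyed by a GlobalStreamId of component_id and stream_id, with the stream grouping to use as the value)
2. Which streams the component emits, and the metadata of each stream (whether it is a direct stream, and the fields it declares)
3. The parallelism of the component
4. The component-specific configuration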
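As a reading aid, the Thrift structs above can be mirrored with Python dataclasses. This is only a simplified sketch: groupings are reduced to strings, component objects to class names, and the stream and field names in word_count_topology are illustrative assumptions rather than what storm-starter declares.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

GlobalStreamId = Tuple[str, str]   # (component id, stream id)


@dataclass
class ComponentCommon:
    inputs: Dict[GlobalStreamId, str] = field(default_factory=dict)   # consumed stream -> grouping
    streams: Dict[str, List[str]] = field(default_factory=dict)       # emitted stream id -> field names
    parallelism_hint: Optional[int] = None
    json_conf: Optional[str] = None


@dataclass
class Component:
    obj: str                 # stands in for ComponentObject
    common: ComponentCommon


@dataclass
class Topology:
    spouts: Dict[str, Component] = field(default_factory=dict)
    bolts: Dict[str, Component] = field(default_factory=dict)


def upstream_of(topology, component_id):
    """Return the ids of the components that component_id subscribes to."""
    component = {**topology.spouts, **topology.bolts}[component_id]
    return {source for (source, _stream) in component.common.inputs}


def word_count_topology():
    topo = Topology()
    topo.spouts["spout"] = Component(
        "RandomSentenceSpout",
        ComponentCommon(streams={"default": ["sentence"]}, parallelism_hint=6))
    topo.bolts["split"] = Component(
        "SplitSentence",
        ComponentCommon(inputs={("spout", "default"): "shuffle"},
                        streams={"default": ["word"]}, parallelism_hint=12))
    topo.bolts["count"] = Component(
        "WordCount",
        ComponentCommon(inputs={("split", "default"): "fields(word)"},
                        streams={"default": ["word", "count"]}, parallelism_hint=10))
    return topo


print(upstream_of(word_count_topology(), "count"))
```

Note that the edges of the graph live on the consuming side: a component lists its inputs, and nothing on the spout records who subscribes to it.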
(assert (not-nil? submitOptions)) throws java.lang.AssertionError if submitOptions is nil. (validate-topology-name! storm-name) validates the topology name; validate-topology-name! is defined as follows:
( if ( some #( .contains name %) DISALLOWED-TOPOLOGY-NAME-STRS)
( throw ( InvalidTopologyException.
( str "Topology name cannot contain any of the following: " ( pr-str DISALLOWED-TOPOLOGY-NAME-STRS))))
( if ( clojure.string/blank? name)
( throw ( InvalidTopologyException.
( "Topology name cannot be blank"))))))
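DISALLOWED-TOPOLOGY-NAME-STRS holds the special strings that must not appear in a topology name. The first argument of some is an anonymous function, which is applied to each element of DISALLOWED-TOPOLOGY-NAME-STRS; some returns as soon as one application yields true. validate-topology-name! thus checks whether the topology name contains "illegal" characters, and also rejects a blank name. check-storm-active! checks whether the topology's active state matches the expected state passed in active?. It is defined as follows: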
( if ( = ( not active?)
( storm-active? ( :storm-cluster-state nimbus)
storm-name))
( if active?
( throw ( NotAliveException. ( str storm-name " is not alive")))
( throw ( AlreadyAliveException. ( str storm-name " is already active"))))
))
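The comparison in check-storm-active! is easy to misread, so here is a small Python sketch of the same decision. The exception classes and the set of active names are illustrative stand-ins for the Thrift exceptions and the ZooKeeper lookup.

```python
class NotAliveError(Exception):
    """Stand-in for NotAliveException."""


class AlreadyAliveError(Exception):
    """Stand-in for AlreadyAliveException."""


def check_storm_active(active_names, storm_name, expect_active):
    """Raise when the actual state of storm_name differs from the expected state."""
    is_active = storm_name in active_names
    # (= (not active?) (storm-active? ...)) is true exactly when the two booleans differ.
    if expect_active != is_active:
        if expect_active:
            raise NotAliveError(storm_name + " is not alive")
        raise AlreadyAliveError(storm_name + " is already active")


# Submitting expects the name to be unused, so expect_active is False.
check_storm_active(set(), "wordcount", False)        # passes silently
try:
    check_storm_active({"wordcount"}, "wordcount", False)
except AlreadyAliveError as err:
    print(err)
```

The same function serves both directions: submission passes False and fails on a duplicate name, while operations on a running topology can pass True and fail when the name is unknown.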
nimbus is a map that holds the current state of the nimbus Thrift server. The map is built by the nimbus-data function, shown below:
( let [ forced-scheduler ( .getForcedScheduler inimbus )]
{ :conf conf
:inimbus inimbus
:submitted-count ( atom 0)
:storm-cluster-state ( cluster/mk-storm-cluster-state conf)
:submit-lock ( Object.)
:heartbeats-cache ( atom {})
:downloaders ( file-cache-map conf)
:uploaders ( file-cache-map conf)
:uptime ( uptime-computer)
:validator ( new-instance ( conf NIMBUS-TOPOLOGY-VALIDATOR))
:timer ( mk-timer :kill-fn ( fn [ t ]
( log-error t "Error when processing event")
( exit-process! 20 "Error when processing an event")
))
:scheduler ( mk-scheduler conf inimbus)
}))
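conf holds the configuration of the Storm cluster, inimbus is the current nimbus instance, and cluster/mk-storm-cluster-state returns an instance that implements the StormClusterState protocol. storm-active? is defined as follows: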
( not-nil? ( get-storm-id storm-cluster-state storm-name)))
It calls get-storm-id to look up the topology id for the given topology name, and returns true if an id exists and false otherwise. get-storm-id is as follows:
( let [ active-storms ( .active-storms storm-cluster-state )]
( find-first
#( = storm-name ( :storm-name ( .storm-base storm-cluster-state % nil)))
active-storms)
))
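active-storms lists all children of /storms/ in ZooKeeper (a path relative to Storm's root node); /storms/{topology-id} stores the information about a running topology. For what is stored there, see StormBase in common.clj.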
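find-first returns the id of the first topology whose name equals storm-name. On a correct submission, /storms in ZooKeeper has no {topology-id} node for this name, so storm-active? returns false. Since check-storm-active! was called with active? set to false, the condition of its first if evaluates to (= true false), which is false, so no exception is thrown and the check passes. The topology configuration is then bound to topo-conf, and validate-configs-with-schemas verifies that it is valid. validate-configs-with-schemas is defined as follows: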
[ conf ]
( doseq [[ k v ] conf
:let [ schema ( CONFIG-SCHEMA-MAP k )]]
( if ( not ( nil? schema))
( .validateField schema k v))))
CONFIG-SCHEMA-MAP is defined as follows:
;; Config fields must have a _SCHEMA field defined
( def CONFIG-SCHEMA-MAP
( ->> ( .getFields Config)
( filter #( not ( re-matches # ".*_SCHEMA$" ( .getName %))))
( map ( fn [ f ] [( .get f nil)
( get-FieldValidator
( -> Config
( .getField ( str ( .getName f) "_SCHEMA"))
( .get nil )))]))
( into {})))
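Config.java contains two kinds of static fields: configuration keys, and the validators for those keys, whose field names end in _SCHEMA. CONFIG-SCHEMA-MAP maps the string value of each configuration key to its validator (config-string -> validator).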
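The pairing of each key field with its _SCHEMA sibling can be imitated with Python class attributes. This is a rough analogue, not Storm's implementation: the Config class below has only three made-up entries, and the validators are reduced to plain Python types.

```python
class Config:
    """Tiny stand-in for Config.java: each key has a sibling attribute ending in _SCHEMA."""
    TOPOLOGY_DEBUG = "topology.debug"
    TOPOLOGY_DEBUG_SCHEMA = bool
    TOPOLOGY_WORKERS = "topology.workers"
    TOPOLOGY_WORKERS_SCHEMA = int
    TOPOLOGY_NAME = "topology.name"
    TOPOLOGY_NAME_SCHEMA = str


def build_schema_map(config_cls):
    """Map the string value of every key attribute to its validator."""
    schema_map = {}
    for attr, value in vars(config_cls).items():
        if attr.startswith("_") or attr.endswith("_SCHEMA"):
            continue   # skip Python internals and the validators themselves
        schema_map[value] = getattr(config_cls, attr + "_SCHEMA")
    return schema_map


def validate_configs_with_schemas(conf, schema_map):
    """Validate only the keys that have a schema; unknown keys pass through."""
    for key, value in conf.items():
        expected = schema_map.get(key)
        if expected is not None and not isinstance(value, expected):
            raise ValueError("%s must be of type %s" % (key, expected.__name__))


schema_map = build_schema_map(Config)
validate_configs_with_schemas({"topology.workers": 4, "my.custom.key": "x"}, schema_map)
```

Keying the map by the configuration string rather than by the field name is what lets the validation loop work directly on the submitted configuration map.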
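validate-configs-with-schemas looks up the validator for each configuration key and uses it to validate the configured value. For the validators themselves, see FieldValidator, which is nested in the ConfigValidation class. (:validator nimbus) returns an instance that implements the backtype.storm.nimbus.ITopologyValidator interface (by default a backtype.storm.nimbus.DefaultTopologyValidator), and its validate method is called. The DefaultTopologyValidator class is as follows: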
@Override
public void prepare( Map StormConf ){
}
@Override
public void validate( String topologyName , Map topologyConf , StormTopology topology) throws InvalidTopologyException {
}
}
By default the validate method is an empty implementation.
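swap! increments the atom (:submitted-count nimbus) by one (an atom is Clojure's atomic reference type, comparable to the atomic classes in Java); the atom keeps the number of topologies submitted so far. storm-id is bound to the id of the topology, built from the topology name, the submission count and the current time in seconds. storm-conf is bound to the normalized topology configuration: after merging the topology and cluster configuration, normalize-conf settles the serializers, the classes to register for serialization, the number of ackers and the maximum task parallelism. total-storm-conf is bound to the complete configuration. The main job of normalize-topology is to add the "topology.tasks" setting (the number of tasks) to every component of the topology.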
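The id scheme described above (name, submission count, timestamp) can be sketched in a few lines of Python. make_storm_id and the use of itertools.count as the counter are illustrative choices; the real counter is a Clojure atom held in the nimbus map.

```python
import itertools
import time

submitted_count = itertools.count(1)   # stands in for the :submitted-count atom


def make_storm_id(storm_name, count, now_secs):
    """Build the topology id as name-count-timestamp."""
    return "%s-%d-%d" % (storm_name, count, now_secs)


storm_id = make_storm_id("wordcount", next(submitted_count), int(time.time()))
print(storm_id)
```

Because the counter and the timestamp are part of the id, resubmitting a topology under the same name after killing it yields a new id, which keeps the on-disk and ZooKeeper state of the two runs apart.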
normalize-topology is defined as follows:
( let [ ret ( .deepCopy topology )]
( doseq [[ _ component ] ( all-components ret )]
( .set_json_conf
( .get_common component)
( ->> { TOPOLOGY-TASKS ( component-parallelism storm-conf component )}
( merge ( component-conf component))
to-json )))
ret ))
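ret is bound to a deep copy of the topology. all-components returns a map from component id to spout/bolt object for every component of the topology, and get_common retrieves the ComponentCommon of each spout/bolt. ->> is a Clojure macro (thread-last): it passes {......} as the last argument of merge, and then passes the result of merge as the last argument of to-json. component-parallelism is defined as follows: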
( let [ storm-conf ( merge storm-conf ( component-conf component))
num-tasks ( or ( storm-conf TOPOLOGY-TASKS) ( num-start-executors component))
max-parallelism ( storm-conf TOPOLOGY-MAX-TASK-PARALLELISM)
]
( if max-parallelism
( min max-parallelism num-tasks)
num-tasks)))
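component-parallelism is a private function whose job is to determine the value of "topology.tasks". num-start-executors returns the parallelism of the spout/bolt, which defaults to 1 when none is set. num-tasks is bound to the number of tasks of the component and max-parallelism to the maximum allowed number of tasks; the function returns the smaller of num-tasks and max-parallelism, or num-tasks when no maximum is configured. normalize-topology stores the configuration with "topology.tasks" added in the json_conf of the spout/bolt's ComponentCommon and returns the modified topology.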
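The rule for choosing "topology.tasks" can be restated as a short Python function. This is a sketch that follows the description above; the function signature and the plain dictionaries are assumptions made for illustration.

```python
def component_parallelism(storm_conf, component_conf, parallelism_hint=None):
    """Return the task count for one component.

    Component-level settings override topology-level ones, an explicit
    topology.tasks beats the parallelism hint, and the result is capped by
    topology.max.task.parallelism when that is set.
    """
    merged = {**storm_conf, **component_conf}
    num_tasks = merged.get("topology.tasks")
    if num_tasks is None:
        # Fall back to the parallelism hint, which defaults to 1.
        num_tasks = 1 if parallelism_hint is None else parallelism_hint
    max_parallelism = merged.get("topology.max.task.parallelism")
    if max_parallelism is not None:
        return min(max_parallelism, num_tasks)
    return num_tasks


print(component_parallelism({"topology.max.task.parallelism": 3}, {}, 12))
```

With the local-mode setting from the WordCountTopology example (setMaxTaskParallelism(3)), a bolt declared with a hint of 12 ends up with 3 tasks under this rule.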
system-topology! is defined as follows:
( validate-basic! topology)
( let [ ret ( .deepCopy topology )]
( add-acker! storm-conf ret)
( add-metric-components! storm-conf ret)
( add-system-components! storm-conf ret)
( add-metric-streams! ret)
( add-system-streams! ret)
( validate-structure! ret)
ret
))
validate-basic! validates the basic information of the topology, and add-acker! adds the acker bolt. add-acker! is defined as follows:
( let [ num-executors ( if ( nil? ( storm-conf TOPOLOGY-ACKER-EXECUTORS)) ( storm-conf TOPOLOGY-WORKERS) ( storm-conf TOPOLOGY-ACKER-EXECUTORS))
acker-bolt ( thrift/mk-bolt-spec* ( acker-inputs ret)
( new backtype.storm.daemon.acker)
{ ACKER-ACK-STREAM-ID ( thrift/direct-output-fields [ "id" ])
ACKER-FAIL-STREAM-ID ( thrift/direct-output-fields [ "id" ])
}
:p num-executors
:conf { TOPOLOGY-TASKS num-executors
TOPOLOGY-TICK-TUPLE-FREQ-SECS ( storm-conf TOPOLOGY-MESSAGE-TIMEOUT-SECS )})]
( dofor [[ _ bolt ] ( .get_bolts ret)
:let [ common ( .get_common bolt )]]
( do
( .put_to_streams common ACKER-ACK-STREAM-ID ( thrift/output-fields [ "id" "ack-val" ]))
( .put_to_streams common ACKER-FAIL-STREAM-ID ( thrift/output-fields [ "id" ]))
))
( dofor [[ _ spout ] ( .get_spouts ret)
:let [ common ( .get_common spout)
spout-conf ( merge
( component-conf spout)
{ TOPOLOGY-TICK-TUPLE-FREQ-SECS ( storm-conf TOPOLOGY-MESSAGE-TIMEOUT-SECS )})]]
( do
;; this set up tick tuples to cause timeouts to be triggered
( .set_json_conf common ( to-json spout-conf))
( .put_to_streams common ACKER-INIT-STREAM-ID ( thrift/output-fields [ "id" "init-val" "spout-task" ]))
( .put_to_inputs common
( GlobalStreamId. ACKER-COMPONENT-ID ACKER-ACK-STREAM-ID)
( thrift/mk-direct-grouping))
( .put_to_inputs common
( GlobalStreamId. ACKER-COMPONENT-ID ACKER-FAIL-STREAM-ID)
( thrift/mk-direct-grouping))
))
( .put_to_bolts ret "__acker" acker-bolt)
))
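The number of acker executors depends on whether "topology.acker.executors" is configured: if it is not, num-executors is bound to the value of "topology.workers"; otherwise it is bound to the value of "topology.acker.executors". acker-bolt is bound to the generated acker bolt object. acker-inputs is defined as follows: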
( let [ bolt-ids ( .. topology get_bolts keySet)
spout-ids ( .. topology get_spouts keySet)
spout-inputs ( apply merge
( for [ id spout-ids ]
{[ id ACKER-INIT-STREAM-ID ] [ "id" ]}
))
bolt-inputs ( apply merge
( for [ id bolt-ids ]
{[ id ACKER-ACK-STREAM-ID ] [ "id" ]
[ id ACKER-FAIL-STREAM-ID ] [ "id" ]}
))]
( merge spout-inputs bolt-inputs)))
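The shape of the map returned by acker-inputs, together with the executor-count rule used by add-acker!, can be mirrored in Python as follows. The three stream id strings are assumed values for the constants, and tuples stand in for the [component-id stream-id] vectors.

```python
# Assumed values for the stream id constants; treat them as placeholders.
ACKER_INIT_STREAM_ID = "__ack_init"
ACKER_ACK_STREAM_ID = "__ack_ack"
ACKER_FAIL_STREAM_ID = "__ack_fail"


def acker_inputs(spout_ids, bolt_ids):
    """Return {(component id, stream id): grouping fields} for the acker bolt."""
    inputs = {}
    for spout_id in spout_ids:
        # Each spout announces new tuple trees on its init stream.
        inputs[(spout_id, ACKER_INIT_STREAM_ID)] = ["id"]
    for bolt_id in bolt_ids:
        # Each bolt reports acks and failures on two dedicated streams.
        inputs[(bolt_id, ACKER_ACK_STREAM_ID)] = ["id"]
        inputs[(bolt_id, ACKER_FAIL_STREAM_ID)] = ["id"]
    return inputs


def acker_executors(storm_conf):
    """Use topology.acker.executors when set, otherwise topology.workers."""
    configured = storm_conf.get("topology.acker.executors")
    return storm_conf.get("topology.workers") if configured is None else configured


for key, fields in sorted(acker_inputs(["spout"], ["split", "count"]).items()):
    print(key, fields)
```

Every input is keyed on the "id" field, so all messages that belong to one tuple tree reach the same acker task, which is what allows a single task to track that tree.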
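bolt-ids is bound to the ids of all bolts of the topology and spout-ids to the ids of all spouts. spout-inputs is bound to the input streams coming from spouts and bolt-inputs to the input streams coming from bolts; the function returns the merged inputs (a map). In add-acker!, ACKER-ACK-STREAM-ID and ACKER-FAIL-STREAM-ID are the output streams of the acker, and TOPOLOGY-TICK-TUPLE-FREQ-SECS is the tick tuple frequency, initialized to the message timeout. The first dofor adds the ACKER-ACK-STREAM-ID and ACKER-FAIL-STREAM-ID output streams to every bolt; they carry the ack values to the acker bolt. The second dofor sets the tick tuple frequency of every spout, and adds the ACKER-INIT-STREAM-ID output stream towards the acker bolt as well as the two input streams coming from the acker bolt. With this wiring the acker bolt can exchange ack information with the spouts and bolts. add-metric-components! adds the metric bolts to the topology definition; a metric bolt collects statistics about executors (threads). add-metric-components! is defined as follows: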
( doseq [[ comp-id bolt-spec ] ( metrics-consumer-bolt-specs storm-conf topology )]
( .put_to_bolts topology comp-id bolt-spec)))
metrics-consumer-bolt-specs is defined as follows:
( defn metrics-consumer-bolt-specs [ storm-conf topology ]
( let [ component-ids-that-emit-metrics ( cons SYSTEM-COMPONENT-ID ( keys ( all-components topology)))
inputs ( ->> ( for [ comp-id component-ids-that-emit-metrics ]
{[ comp-id METRICS-STREAM-ID ] :shuffle })
( into {}))
mk-bolt-spec ( fn [ class arg p ]
( thrift/mk-bolt-spec*
inputs
( backtype.storm.metric.MetricsConsumerBolt. class arg)
{} :p p :conf { TOPOLOGY-TASKS p }))]
( map
( fn [ component-id register ]
[ component-id ( mk-bolt-spec ( get register "class")
( get register "argument")
( or ( get register "parallelism.hint") 1 ))])
( metrics-consumer-register-ids storm-conf)
( get storm-conf TOPOLOGY-METRICS-CONSUMER-REGISTER))))
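component-ids-that-emit-metrics is bound to the ids of all spouts and bolts plus the system bolt. inputs is bound to the input streams of the metric bolt, all of which use shuffle grouping. mk-bolt-spec is bound to an anonymous function. metrics-consumer-register-ids produces a list of component ids, one for each registered metrics consumer, and the get call returns all registered metrics consumers. map then returns a list of pairs of component id and metrics consumer bolt spec ([component-id metric-consumer] [component-id metric-consumer]......). add-system-components! adds the system bolt to the topology definition. The system bolt collects statistics about the worker process, such as memory usage, GC activity and network throughput. Each worker process has exactly one system bolt. add-system-components! is defined as follows: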
( let [ system-bolt-spec ( thrift/mk-bolt-spec*
{}
( SystemBolt.)
{ SYSTEM-TICK-STREAM-ID ( thrift/output-fields [ "rate_secs" ])
METRICS-TICK-STREAM-ID ( thrift/output-fields [ "interval" ])}
:p 0
:conf { TOPOLOGY-TASKS 0 })]
( .put_to_bolts topology SYSTEM-COMPONENT-ID system-bolt-spec)))
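The first argument {} of thrift/mk-bolt-spec* shows that the system bolt has no input streams, and the third argument shows that it has two output streams, used to send tick tuples. Its parallelism is 0: the system bolt belongs to the worker process, so there is no need to specify a parallelism, and it does not need to run any task either. add-metric-streams! adds the metric stream definitions to the topology. It is defined as follows: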
( doseq [[ _ component ] ( all-components topology)
:let [ common ( .get_common component )]]
( .put_to_streams common METRICS-STREAM-ID
( thrift/output-fields [ "task-info" "data-points" ]))))
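It adds the metric stream identified by METRICS-STREAM-ID to every spout and bolt. add-system-streams! is similar to add-metric-streams! and adds the system stream identified by SYSTEM-STREAM-ID to every spout and bolt. After calling system-topology!, submitTopologyWithOpts first takes the lock and then calls setup-storm-code, whose main job is to copy the jar uploaded to nimbus, the topology and the configuration into the {storm.local.dir}/nimbus/stormdist/{topology id} directory. It is defined as follows: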
( let [ stormroot ( master-stormdist-root conf storm-id )]
( FileUtils/forceMkdir ( File. stormroot))
( FileUtils/cleanDirectory ( File. stormroot))
( setup-jar conf tmp-jar-location stormroot)
( FileUtils/writeByteArrayToFile ( File. ( master-stormcode-path stormroot)) ( Utils/serialize topology))
( FileUtils/writeByteArrayToFile ( File. ( master-stormconf-path stormroot)) ( Utils/serialize storm-conf))
))
setup-jar copies the jar from {storm.local.dir}/nimbus/inbox/ to the {storm.local.dir}/nimbus/stormdist/{topology id} directory and renames it to stormjar.jar. FileUtils/writeByteArrayToFile writes the serialized topology object and storm-conf to stormcode.ser and stormconf.ser respectively. setup-heartbeats! is defined in cluster.clj as a function of the StormClusterState protocol; it creates the directory in ZooKeeper that stores the heartbeats of this topology. Heartbeat directory:
/storm/workerbeats/{topology id}/.
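The resulting on-disk layout can be reproduced with a short Python sketch. It only mimics the file handling described above: pickle stands in for Java serialization, and the function signature and sample values are assumptions.

```python
import os
import pickle
import shutil
import tempfile


def setup_storm_code(local_dir, storm_id, tmp_jar_location, storm_conf, topology):
    """Create nimbus/stormdist/<storm_id> under local_dir and fill it with three files."""
    stormroot = os.path.join(local_dir, "nimbus", "stormdist", storm_id)
    os.makedirs(stormroot, exist_ok=True)
    for entry in os.listdir(stormroot):            # start from an empty directory
        path = os.path.join(stormroot, entry)
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    # The uploaded jar is copied under a fixed name.
    shutil.copyfile(tmp_jar_location, os.path.join(stormroot, "stormjar.jar"))
    with open(os.path.join(stormroot, "stormcode.ser"), "wb") as out:
        pickle.dump(topology, out)                 # stand-in for the serialized topology
    with open(os.path.join(stormroot, "stormconf.ser"), "wb") as out:
        pickle.dump(storm_conf, out)               # stand-in for the serialized configuration
    return stormroot


with tempfile.TemporaryDirectory() as local_dir:
    inbox = os.path.join(local_dir, "nimbus", "inbox")
    os.makedirs(inbox)
    jar = os.path.join(inbox, "stormjar-example.jar")
    with open(jar, "wb") as out:
        out.write(b"jar bytes")
    root = setup_storm_code(local_dir, "wordcount-1-1700000000", jar,
                            {"topology.workers": 4}, {"spouts": ["spout"]})
    print(sorted(os.listdir(root)))
```

Using fixed file names inside a directory named after the topology id means any consumer of these files only needs the topology id to locate the jar, the topology and its configuration.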
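start-storm reads the configuration of the whole cluster and of nimbus, deserializes the topology configuration from stormconf.ser and the topology from stormcode.ser, and then calls activate-storm! to write the topology's metadata, a StormBase object, to the /storm/storms/{topology id} node in ZooKeeper. It is defined as follows: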
{ :pre [( # { :active :inactive } topology-initial-status )]}
( let [ storm-cluster-state ( :storm-cluster-state nimbus)
conf ( :conf nimbus)
storm-conf ( read-storm-conf conf storm-id)
topology ( system-topology! storm-conf ( read-storm-topology conf storm-id))
num-executors ( ->> ( all-components topology) ( map-val num-start-executors ))]
( log-message "Activating " storm-name ": " storm-id)
( .activate-storm! storm-cluster-state
storm-id
( StormBase. storm-name
( current-time-secs)
{ :type topology-initial-status }
( storm-conf TOPOLOGY-WORKERS)
num-executors))))
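Finally, submitTopologyWithOpts calls mk-assignments to assign tasks. Task assignment is an important part of the Storm architecture. For reasons of length, the source analysis of task assignment is left to a later article.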