Similar to other open-source projects, Spark ships with several shell scripts, listed below.
| Script | Purpose | Delegates to |
|---|---|---|
| **sbin** (server-side scripts) | | |
| start-all.sh | starts all the Spark daemons | start-master.sh, start-slaves.sh |
| start-master.sh | starts the Spark master process | `spark-daemon.sh start master.Master` |
| start-slaves.sh | starts all workers | sbin/slaves.sh |
| spark-daemon.sh | spawns any daemon. Usage: `spark-daemon.sh [--config <conf-dir>] (start\|stop\|status) <spark-command> <spark-instance-number> <args...>` | for start-master.sh, delegates to bin/spark-class |
| slaves.sh | logs into every slave defined in conf/slaves and runs sbin/start-slave.sh there | |
| spark-config.sh | exports some global variables, e.g. SPARK_HOME | |
| start-slave.sh | starts a worker on the local host; honors settings such as SPARK_WORKER_INSTANCES | `spark-daemon.sh start deploy.worker.Worker` |
| **bin** | | |
| load-spark-env.sh | loads conf/spark-env.sh if it exists | |
| spark-class (end of the chain) | finally spawns the daemon (class) whose launch command is produced by running `org.apache.spark.launcher.Main`; the class itself is specified by the caller (e.g. spark-daemon.sh). Usage: `spark-class <class> [<args>]` | |
| spark-shell | interactive interface to Spark for testing, demos, and submitting Spark apps; its help message is grabbed from spark-submit, and it uses the third-party JLine library to provide a shell-like experience | spark-submit, i.e. `bin/spark-submit --class org.apache.spark.repl.Main` |
| spark-submit | submits a Spark app | unlike start-master.sh and start-slave.sh, it launches a new class: `spark-class org.apache.spark.deploy.SparkSubmit ...` |
| run-example | runs one of the bundled examples, given a class-name suffix | `spark-submit ...` |
| **conf** | | |
| spark-env.sh | miscellaneous cluster settings for the chosen cluster manager, e.g. SPARK_LOCAL_IP, SPARK_CLASSPATH, SPARK_MASTER_IP | |
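The delegation chain among the sbin scripts can be sketched as plain shell functions. This is a hypothetical, simplified model (the function names stand in for the real scripts, which do much more): each layer handles one concern and forwards the rest of its arguments downward.

```shell
#!/bin/sh
# Simplified sketch of the sbin delegation chain (hypothetical helper
# names, not the real Spark scripts).

spark_class() {            # stands in for bin/spark-class
  echo "launching class: $*"
}

spark_daemon() {           # stands in for sbin/spark-daemon.sh
  action="$1"; shift       # start|stop|status
  [ "$action" = "start" ] && spark_class "$@"
}

start_master() {           # stands in for sbin/start-master.sh
  spark_daemon start org.apache.spark.deploy.master.Master
}

start_master               # prints: launching class: org.apache.spark.deploy.master.Master
```

Each real script adds its own environment handling (spark-config.sh, load-spark-env.sh) before forwarding, but the shape of the chain is the same.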
e.g.:

```
hadoop@GZsw04:~$ spark-daemon.sh status org.apache.spark.deploy.master.Master 1
org.apache.spark.deploy.master.Master is running.
```
launcher.Main (Main.java)

This is the unified entry point for launching any class/daemon. For example, if you want to submit a word-count app, you can do:

run-example JavaWordCount /file/to/count

The concrete class, `org.apache.spark.examples.JavaWordCount`, is then resolved and executed via Main.java. You can dig into the source of JavaWordCount for more details.
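The suffix-to-class resolution that run-example performs can be sketched like this. This is a simplified assumption of the behavior (a small shell function, not the actual run-example script): a bare suffix gets the examples package prepended, while an already-qualified name passes through.

```shell
#!/bin/sh
# Hypothetical sketch of how run-example turns a class-name suffix
# into the fully qualified class handed to spark-submit.

EXAMPLE_CLASS_PREFIX="org.apache.spark.examples"

resolve_example() {
  case "$1" in
    org.apache.spark.examples.*) echo "$1" ;;   # already fully qualified
    *) echo "$EXAMPLE_CLASS_PREFIX.$1" ;;       # prepend the examples package
  esac
}

resolve_example JavaWordCount   # prints: org.apache.spark.examples.JavaWordCount
```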
Conclusions:

a. Complete flow of starting the master:
[start-all.sh] > start-master.sh > spark-daemon.sh > spark-class > launcher.Main > run master.Master
and the slaves (workers):
[start-all.sh] > start-slaves.sh > slaves.sh > start-slave.sh > spark-daemon.sh [worker.Worker] > same steps as above
b. This differs from Hadoop and HBase: Spark supplies a unified entry point, launcher.Main, to fix up and combine the launch parameters.
c. Delegating step by step like this reduces duplicated code and improves reusability.
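On the worker side, the SPARK_WORKER_INSTANCES setting mentioned in the table lets one host run several worker daemons. A minimal sketch of that loop (an assumption modeled on the setting's name, not the actual start-slave.sh code):

```shell
#!/bin/sh
# Hypothetical sketch: spawn SPARK_WORKER_INSTANCES workers on one host,
# passing each a distinct instance number so spark-daemon.sh can tell
# the daemons apart.

SPARK_WORKER_INSTANCES="${SPARK_WORKER_INSTANCES:-1}"

start_worker() {
  echo "spark-daemon.sh start org.apache.spark.deploy.worker.Worker $1"
}

i=1
while [ "$i" -le "$SPARK_WORKER_INSTANCES" ]; do
  start_worker "$i"
  i=$((i + 1))
done
```

With the default of one instance this launches a single worker; setting `SPARK_WORKER_INSTANCES=3` before running it would launch three.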