Starting the Spark Cluster
To start the Spark cluster, log in to the Master node and run the start-all.sh script. start-all.sh starts the Master and then the Workers; if the --with-tachyon flag is given, it also starts Tachyon.
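For example, assuming $SPARK_HOME points at the Spark installation directory, a typical invocation from the Master node looks like this:

# start the Master and all Workers
$SPARK_HOME/sbin/start-all.sh

# the same, but also starting Tachyon
$SPARK_HOME/sbin/start-all.sh --with-tachyon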
1. Log in to the Spark Master node and run $SPARK_HOME/sbin/start-all.sh. The contents of start-all.sh are as follows:
sbin="`dirname "$0"`" sbin="`cd "$sbin"; pwd`" TACHYON_STR="" while (( "$#" )); do case $1 in --with-tachyon) TACHYON_STR="--with-tachyon" ;; esac shift done # Load the Spark configuration . "$sbin/spark-config.sh" # Start Master "$sbin"/start-master.sh $TACHYON_STR # Start Workers "$sbin"/start-slaves.sh $TACHYON_STRstart-all.sh调用start-master.sh启动Master,调用start-slaves.sh启动Worker
2. The contents of start-master.sh are as follows:
sbin="`dirname "$0"`" sbin="`cd "$sbin"; pwd`" ORIGINAL_ARGS="$@" START_TACHYON=false while (( "$#" )); do case $1 in --with-tachyon) if [ ! -e "$sbin"/../tachyon/bin/tachyon ]; then echo "Error: --with-tachyon specified, but tachyon not found." exit -1 fi START_TACHYON=true ;; esac shift done . "$sbin/spark-config.sh" . "$SPARK_PREFIX/bin/load-spark-env.sh" if [ "$SPARK_MASTER_PORT" = "" ]; then SPARK_MASTER_PORT=7077 fi if [ "$SPARK_MASTER_IP" = "" ]; then SPARK_MASTER_IP=`hostname` fi if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then SPARK_MASTER_WEBUI_PORT=8080 fi "$sbin"/spark-daemon.sh start org.apache.spark.deploy.master.Master 1 \ --ip $SPARK_MASTER_IP --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT \ $ORIGINAL_ARGS if [ "$START_TACHYON" == "true" ]; then "$sbin"/../tachyon/bin/tachyon bootstrap-conf $SPARK_MASTER_IP "$sbin"/../tachyon/bin/tachyon format -s "$sbin"/../tachyon/bin/tachyon-start.sh master fistart-master.sh通过spark-daemon.sh调用org.apache.spark.deploy.master.Master启动Master,参数有ip,port,webui-port
3. The contents of start-slaves.sh are as follows:
sbin="`dirname "$0"`" sbin="`cd "$sbin"; pwd`" START_TACHYON=false while (( "$#" )); do case $1 in --with-tachyon) if [ ! -e "$sbin"/../tachyon/bin/tachyon ]; then echo "Error: --with-tachyon specified, but tachyon not found." exit -1 fi START_TACHYON=true ;; esac shift done . "$sbin/spark-config.sh" . "$SPARK_PREFIX/bin/load-spark-env.sh" # Find the port number for the master if [ "$SPARK_MASTER_PORT" = "" ]; then SPARK_MASTER_PORT=7077 fi if [ "$SPARK_MASTER_IP" = "" ]; then SPARK_MASTER_IP="`hostname`" fi if [ "$START_TACHYON" == "true" ]; then "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin"/../tachyon/bin/tachyon bootstrap-conf "$SPARK_MASTER_IP" # set -t so we can call sudo SPARK_SSH_OPTS="-o StrictHostKeyChecking=no -t" "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/../tachyon/bin/tachyon-start.sh" worker SudoMount \; sleep 1 fi # Launch the slaves "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" "spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT"
start-slaves.sh uses slaves.sh to ssh into every Worker node and execute start-slave.sh on it. The only argument, passed from the Master node to each Worker node, is the Spark Master URL: spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT.
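Concretely, for a Worker host listed as worker01 (a hypothetical name) and SPARK_HOME=/opt/spark, the final line of start-slaves.sh boils down to roughly:

ssh -o StrictHostKeyChecking=no worker01 \
    cd /opt/spark \; /opt/spark/sbin/start-slave.sh spark://master01:7077

The escaped \; is passed through to the remote shell, where it separates the cd from the start-slave.sh invocation.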
4. The contents of slaves.sh are as follows:
usage="Usage: slaves.sh [--config <conf-dir>] command..." # if no args specified, show usage if [ $# -le 0 ]; then echo $usage exit 1 fi sbin="`dirname "$0"`" sbin="`cd "$sbin"; pwd`" . "$sbin/spark-config.sh" # If the slaves file is specified in the command line, # then it takes precedence over the definition in # spark-env.sh. Save it here. if [ -f "$SPARK_SLAVES" ]; then HOSTLIST=`cat "$SPARK_SLAVES"` fi # Check if --config is passed as an argument. It is an optional parameter. # Exit if the argument is not a directory. if [ "$1" == "--config" ] then shift conf_dir="$1" if [ ! -d "$conf_dir" ] then echo "ERROR : $conf_dir is not a directory" echo $usage exit 1 else export SPARK_CONF_DIR="$conf_dir" fi shift fi . "$SPARK_PREFIX/bin/load-spark-env.sh" if [ "$HOSTLIST" = "" ]; then if [ "$SPARK_SLAVES" = "" ]; then if [ -f "${SPARK_CONF_DIR}/slaves" ]; then HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"` else HOSTLIST=localhost fi else HOSTLIST=`cat "${SPARK_SLAVES}"` fi fi # By default disable strict host key checking if [ "$SPARK_SSH_OPTS" = "" ]; then SPARK_SSH_OPTS="-o StrictHostKeyChecking=no" fi for slave in `echo "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do if [ -n "${SPARK_SSH_FOREGROUND}" ]; then ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \ 2>&1 | sed "s/^/$slave: /" else ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \ 2>&1 | sed "s/^/$slave: /" & fi if [ "$SPARK_SLAVE_SLEEP" != "" ]; then sleep $SPARK_SLAVE_SLEEP fi done waitslaves.sh会读取$SPARK_HOME/conf/slaves中的内容,该文件中记录了集群中所有Worker的地址,然后组装成ssh登陆语句,
$"${@// /\\ }"的用法可参考 这里
5. The contents of start-slave.sh are as follows:
usage="Usage: start-slave.sh <spark-master-URL> where <spark-master-URL> is like spark://localhost:7077" if [ $# -lt 1 ]; then echo $usage echo Called as start-slave.sh $* exit 1 fi sbin="`dirname "$0"`" sbin="`cd "$sbin"; pwd`" . "$sbin/spark-config.sh" . "$SPARK_PREFIX/bin/load-spark-env.sh" # First argument should be the master; we need to store it aside because we may # need to insert arguments between it and the other arguments MASTER=$1 shift # Determine desired worker port if [ "$SPARK_WORKER_WEBUI_PORT" = "" ]; then SPARK_WORKER_WEBUI_PORT=8081 fi # Start up the appropriate number of workers on this machine. # quick local function to start a worker function start_instance { WORKER_NUM=$1 shift if [ "$SPARK_WORKER_PORT" = "" ]; then PORT_FLAG= PORT_NUM= else PORT_FLAG="--port" PORT_NUM=$(( $SPARK_WORKER_PORT + $WORKER_NUM - 1 )) fi WEBUI_PORT=$(( $SPARK_WORKER_WEBUI_PORT + $WORKER_NUM - 1 )) "$sbin"/spark-daemon.sh start org.apache.spark.deploy.worker.Worker $WORKER_NUM \ --webui-port "$WEBUI_PORT" $PORT_FLAG $PORT_NUM $MASTER "$@" } if [ "$SPARK_WORKER_INSTANCES" = "" ]; then start_instance 1 "$@" else for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do start_instance $(( 1 + $i )) "$@" done fi通过设置$SPARK_WORKER_INSTANCES可以设定在一个slave节点上启动几个worker,比如$SPARK_WORKER_INSTANCES=2(在$SPARK_HOME/conf/spark-env.sh中设置),就可以在slave中启动2个worker。内容如下:
if [ "$SPARK_WORKER_INSTANCES" = "" ]; then start_instance 1 "$@" else for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do start_instance $(( 1 + $i )) "$@" done fi函数start_instance如下:
# Start up the appropriate number of workers on this machine.
# quick local function to start a worker
function start_instance {
  WORKER_NUM=$1
  shift

  if [ "$SPARK_WORKER_PORT" = "" ]; then
    PORT_FLAG=
    PORT_NUM=
  else
    PORT_FLAG="--port"
    PORT_NUM=$(( $SPARK_WORKER_PORT + $WORKER_NUM - 1 ))
  fi
  WEBUI_PORT=$(( $SPARK_WORKER_WEBUI_PORT + $WORKER_NUM - 1 ))

  "$sbin"/spark-daemon.sh start org.apache.spark.deploy.worker.Worker $WORKER_NUM \
     --webui-port "$WEBUI_PORT" $PORT_FLAG $PORT_NUM $MASTER "$@"
}

start_instance starts each Worker by having spark-daemon.sh launch org.apache.spark.deploy.worker.Worker.
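Putting the two fragments together, a hypothetical $SPARK_HOME/conf/spark-env.sh such as:

# $SPARK_HOME/conf/spark-env.sh
export SPARK_WORKER_INSTANCES=2      # start two Workers per slave node
export SPARK_WORKER_PORT=7078        # Worker i listens on 7078 + i - 1
export SPARK_WORKER_WEBUI_PORT=8081  # Worker i's web UI on 8081 + i - 1

makes start-slave.sh launch Worker 1 on port 7078 (web UI 8081) and Worker 2 on port 7079 (web UI 8082), following the WORKER_NUM arithmetic in start_instance.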