For full details, see the original author's article:
https://www.cnblogs.com/purstar/p/6293605.html
One addition concerns the spark-env.sh configuration: Spark kept failing at startup with errors that the slf4j jar could not be found. Copying the jar into Spark's jars directory did not help; what finally fixed it was setting SPARK_DIST_CLASSPATH:
export SCALA_HOME=/usr/local/bigdata/scala-2.10.7
export JAVA_HOME=/usr/local/bigdata/jdk1.8
export HADOOP_HOME=/usr/local/bigdata/hadoop-2.7.3
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/usr/local/bigdata/hadoop-2.7.3/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/usr/local/bigdata/hadoop-2.7.3/bin/hadoop classpath)
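To see why that last line resolves the missing-slf4j error: `hadoop classpath` prints a colon-separated list of Hadoop's jar and config directories, the `$( )` command substitution captures that output into SPARK_DIST_CLASSPATH, and Spark prepends this variable to its own classpath, so the slf4j jars bundled with Hadoop become visible to Spark. The sketch below uses a stand-in classpath string (the paths are made up for illustration) so it runs without a Hadoop install; with Hadoop present you would use the real command substitution as in spark-env.sh above.

```shell
# In spark-env.sh the real line is:
#   export SPARK_DIST_CLASSPATH=$(/usr/local/bigdata/hadoop-2.7.3/bin/hadoop classpath)
# `hadoop classpath` emits a colon-separated jar/directory list on stdout,
# and command substitution stores it in the variable.

# Stand-in for the output of `hadoop classpath` (hypothetical paths):
HADOOP_CP="/opt/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/opt/hadoop/share/hadoop/common/*"
export SPARK_DIST_CLASSPATH="$HADOOP_CP"

# Sanity check: split the classpath on ':' and confirm slf4j is on it.
echo "$SPARK_DIST_CLASSPATH" | tr ':' '\n' | grep -i slf4j
```

The same `tr ':' '\n' | grep -i slf4j` pipeline is a quick way to verify, on a real cluster, that the jar Spark was complaining about is actually present on Hadoop's classpath.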
Thanks again to the original author.