Installing Spark on Alibaba Cloud ECS

Continued from the previous post on HBase.


Download Spark 2.1.1 and Scala 2.11
Here is what the Spark docs say about version requirements:
Spark runs on Java 7+, Python 2.6+/3.4+ and R 3.1+. For the Scala API, Spark 2.1.1 uses Scala 2.11. You will need to use a compatible Scala version (2.11.x).
Note that support for Java 7 and Python 2.6 are deprecated as of Spark 2.0.0, and support for Scala 2.10 and versions of Hadoop before 2.6 are deprecated as of Spark 2.1.0, and may be removed in Spark 2.2.0.


Download the packages

wget https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz
wget https://downloads.lightbend.com/scala/2.11.11/scala-2.11.11.tgz

Create the directories

mkdir -p /opt/scala
mkdir -p /opt/spark

Extract the archives

tar -zxvf scala-2.11.11.tgz -C /opt/scala
tar -zxvf spark-2.1.1-bin-hadoop2.7.tgz -C /opt/spark

Create an environment variable file for each under /etc/profile.d (these are picked up by every login shell):

/etc/profile.d/scala.sh
export SCALA_HOME=/opt/scala/current
/etc/profile.d/spark.sh
export SPARK_HOME=/opt/spark/current
export PATH=$PATH:${SPARK_HOME}/bin
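As a sanity check, the two files can be written and sourced like this. This is a sketch: it writes to a temp directory instead of /etc/profile.d so it can run anywhere; on the server, write the same contents to /etc/profile.d/scala.sh and /etc/profile.d/spark.sh.

```shell
# Sketch: create the two profile fragments in a temp dir (use /etc/profile.d on the server)
PROFILED=$(mktemp -d)

cat > "$PROFILED/scala.sh" <<'EOF'
export SCALA_HOME=/opt/scala/current
EOF

cat > "$PROFILED/spark.sh" <<'EOF'
export SPARK_HOME=/opt/spark/current
export PATH=$PATH:${SPARK_HOME}/bin
EOF

# Source them the way a login shell would, then confirm the variables took effect
. "$PROFILED/scala.sh"
. "$PROFILED/spark.sh"
echo "SCALA_HOME=$SCALA_HOME"
echo "SPARK_HOME=$SPARK_HOME"
```

After a real login (or `source /etc/profile`), `spark-shell` and the other binaries will be on the PATH.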


Modify the Spark configuration files

cp ./conf/spark-env.sh.template ./conf/spark-env.sh
Edit spark-env.sh and add:

export SCALA_HOME=${SCALA_HOME}
export JAVA_HOME=${JAVA_HOME}
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=500m
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

cp slaves.template slaves
Edit slaves and list all worker hosts:

master
slave01
slave02
Change ownership to the hadoop user

chown -R hadoop:hadoop /opt/scala
chown -R hadoop:hadoop /opt/spark
Then scp both directories to the other machines in the cluster.
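One way to push both directories to the workers is a small loop over the hosts from the slaves file. Shown here as a dry run that only prints the commands; remove the `echo` to actually copy.

```shell
# Dry run: print the scp command that would copy the installs to each worker.
# slave01/slave02 are the hosts listed in conf/slaves above.
for host in slave01 slave02; do
  echo scp -r /opt/scala /opt/spark "hadoop@${host}:/opt/"
done
```

Remember to repeat the chown step on each worker if you copy as a different user.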

As the hadoop user, create the symlinks

ln -s /opt/scala/scala-2.11.11 /opt/scala/current
ln -s /opt/spark/spark-2.1.1-bin-hadoop2.7 /opt/spark/current
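Before testing, the standalone cluster has to be started from the master (a step worth calling out explicitly; paths assume the SPARK_HOME symlink created above):

```shell
# Start the master here and, via SSH, a worker on every host listed in conf/slaves
${SPARK_HOME}/sbin/start-all.sh

# jps should now show a Master process on this machine and a Worker on each slave
jps
```

The web UI on port 8080 of the master is another quick way to confirm all workers registered.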

Test

spark-shell
val file=sc.textFile("hdfs://iZuf68ho3sfplkorf9r8akZ:9000/stella/input/wordcount.txt")
val rdd = file.flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_+_)
rdd.collect()
rdd.foreach(println)
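The RDD result can be cross-checked without the cluster: a plain shell pipeline over the same text produces the same (word, count) pairs. A sketch with a made-up two-line sample input:

```shell
# Build a small sample input and count words with a classic shell pipeline;
# the flatMap/map/reduceByKey job above computes the same counts for its input.
printf 'hello world\nhello spark\n' > /tmp/wordcount.txt
tr ' ' '\n' < /tmp/wordcount.txt | sort | uniq -c | sort -rn
```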


For a more detailed walkthrough, see (reposted from): http://www.cnblogs.com/purstar/p/6293605.html











