Spark (1): Installing and Deploying Spark

Part 1: Installing Scala

    The Scala website offers every Scala release; download and install the version that the Spark documentation specifies for your Spark release.

    I downloaded scala-2.11.8.tgz from http://www.scala-lang.org/download/

1. Extract it under /mysoftware:

hadoop@master:/mysoftware$ tar -xzvf ~/Desktop/scala-2.11.8.tgz 

2. Configure the environment variables by adding the following to /etc/profile:

    export SCALA_HOME=/mysoftware/scala-2.11.8

    export PATH=$SCALA_HOME/bin:$PATH

3. Apply the changes with source /etc/profile, then verify the installation:

hadoop@master:~$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80).
Type in expressions for evaluation. Or try :help.

scala> 9*9
res0: Int = 81

scala>
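    Beyond simple arithmetic, a couple more expressions can confirm the REPL is working. A minimal sketch (the names are my own, and the exact res numbering may differ):

scala> def square(x: Int): Int = x * x
square: (x: Int)Int

scala> List(1, 2, 3).map(square).sum
res1: Int = 14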

Part 2: Installing Spark

    Go to the official site and download the Spark package built for your Hadoop version.

    For Spark, I downloaded spark-1.6.1.tgz.

    Official download page: https://spark.apache.org/downloads.html

1. Extract it into the /mysoftware directory:

hadoop@master:/mysoftware$ tar -xzvf ~/Desktop/spark-1.6.1.tgz 

2. Add the following to /etc/profile:

export SPARK_HOME=/mysoftware/spark-1.6.1

export PATH=$SPARK_HOME/bin:$PATH

 

3. Configure conf/spark-env.sh:

hadoop@master:/mysoftware/spark-1.6.1/conf$ cp spark-env.sh.template  spark-env.sh
hadoop@master:/mysoftware/spark-1.6.1/conf$ sudo gedit spark-env.sh

Append the following to the end of spark-env.sh:


export SCALA_HOME=/mysoftware/scala-2.11.8

export JAVA_HOME=/mysoftware/jdk1.7.0_80

export SPARK_MASTER_IP=192.168.226.129

export SPARK_WORKER_MEMORY=512m

export MASTER=spark://192.168.226.129:7077

    The SPARK_WORKER_MEMORY parameter sets the maximum amount of memory available to Spark on each Worker node. Increasing it lets more data be cached in memory, but be sure to leave enough memory for the slave's operating system and other services.
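    To illustrate what that worker memory is used for, here is a minimal Scala sketch (run from spark-shell, where a SparkContext named sc is already provided; the numbers are only illustrative). cache() keeps computed partitions in executor memory, which on each Worker is bounded by SPARK_WORKER_MEMORY:

val nums   = sc.parallelize(1 to 1000000)
val cached = nums.map(_ * 2).cache()   // mark the RDD for in-memory caching
cached.count()                         // first action computes and caches the data
cached.count()                         // later actions read the cached partitions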

    SPARK_MASTER_IP and MASTER must both be configured; otherwise the slaves will fail to register with the master.

Part 3: Starting Spark

    Start Spark from the Spark root directory:

hadoop@master:/mysoftware/spark-1.6.1$ sbin/start-all.sh 

After starting Spark, however, the following problem appeared:

hadoop@master:/mysoftware/spark-1.6.1$ sbin/start-all.sh 
starting org.apache.spark.deploy.master.Master, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
failed to launch org.apache.spark.deploy.master.Master:
  Failed to find Spark assembly in /mysoftware/spark-1.6.1/assembly/target/scala-2.10.
  You need to build Spark before running this program.

full log in /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
master: failed to launch org.apache.spark.deploy.worker.Worker:
master:   Failed to find Spark assembly in /mysoftware/spark-1.6.1/assembly/target/scala-2.10.
master:   You need to build Spark before running this program.
master: full log in /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
 

Strange: the Scala version I installed is 2.11.8, yet the error says it failed to find a Spark assembly under scala-2.10.

Troubleshooting lead:

http://stackoverflow.com/questions/27618843/why-does-spark-submit-and-spark-shell-fail-with-failed-to-find-spark-assembly-j

The key point from that answer:

You have to download one of pre-built version in "Choose a package type" section from the Spark download page.

Ah, at first I simply downloaded the first package I saw, as with earlier downloads, and overlooked this sentence:

    Go to the official site and download the Spark package built for your Hadoop version. (So the problem was that I had not downloaded the matching pre-built package.)

My Hadoop version is hadoop-2.6.4, so I downloaded the Spark package again, this time spark-1.6.1-bin-hadoop2.6.tgz.

Reinstall Spark:

hadoop@master:/mysoftware$ tar -xzvf ~/Desktop/spark-1.6.1-bin-hadoop2.6.tgz 

After repeating the earlier setup steps, restart Spark:

hadoop@master:/mysoftware/hadoop-2.6.4$ cd ../spark-1.6.1/
hadoop@master:/mysoftware/spark-1.6.1$ sbin/start-all.sh 
starting org.apache.spark.deploy.master.Master, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /mysoftware/spark-1.6.1/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
hadoop@master:/mysoftware/spark-1.6.1$ jps
2975 NameNode
4055 Worker
3964 Master
3611 NodeManager
3282 SecondaryNameNode
3482 ResourceManager
2769 MainGenericRunner
4104 Jps
3108 DataNode
hadoop@master:/mysoftware/spark-1.6.1$ 

Note that jps now shows two additional processes: Master and Worker.
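With the standalone master running, an application can connect to it through the master URL configured earlier. A minimal sketch of a standalone Scala application (the object name is my own, and the URL assumes the SPARK_MASTER_IP set above):

import org.apache.spark.{SparkConf, SparkContext}

object ClusterCheck {
  def main(args: Array[String]): Unit = {
    // Point the application at the standalone master started by start-all.sh.
    val conf = new SparkConf()
      .setAppName("ClusterCheck")
      .setMaster("spark://192.168.226.129:7077")
    val sc = new SparkContext(conf)
    // Run a trivial job to confirm the Worker executes tasks.
    println(sc.parallelize(1 to 100).sum())
    sc.stop()
  }
}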

Then start spark-shell:

hadoop@master:/mysoftware/spark-1.6.1$ spark-shell 
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/06/02 04:09:09 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:10 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:21 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/06/02 04:09:21 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/06/02 04:09:28 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:28 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/06/02 04:09:34 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/06/02 04:09:35 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala> 
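    With sc available in the shell, a simple job can be run right away. A minimal sketch (the REPL output shown is abbreviated and may differ slightly):

scala> val rdd = sc.parallelize(1 to 100)
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:27

scala> rdd.filter(_ % 2 == 0).count()
res0: Long = 50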

 

Finally, the web UIs can be checked in a browser:

http://192.168.226.129:4040/ (the application UI for the running spark-shell)

http://192.168.226.129:8080/ (the standalone master UI)

 

 

Reposted from: https://my.oschina.net/gently/blog/686192
