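Before installing Spark, you need the following installed and configured: JDK, Scala, ssh, and Hadoop. My earlier blog posts cover installing and configuring each of them in detail, so have a look there if you need to.
Hadoop installation and configuration
One reminder: Hadoop, HBase, Hive, and Spark all need mutually compatible versions, otherwise you end up with a lot of unnecessary extra steps. The compatible versions are listed on the official sites. I am using jdk1.7 + hadoop2.6.4 + scala2.11.8 + spark1.6.1.
Spark itself is written in Scala, so Scala gets installed first. Without further ado, let's start.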
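Before going further, it can help to confirm that the prerequisites are actually on the PATH. The snippet below is only a sketch: `check_prereqs` is a helper name made up for illustration, and it only tests that each command can be found, not that its version matches the ones listed above.

```shell
# Report every command from the argument list that cannot be found on PATH.
# Returns 0 when all commands exist, 1 when at least one is missing.
check_prereqs() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Example for this setup:
# check_prereqs java ssh hadoop
```

An empty output means everything was found; version checks still have to be done by hand.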
Installing and configuring Scala
1. Extract scala-2.11.8.tgz
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6$ cd ~
hadoop@master:~$ cd Downloads/
hadoop@master:~/Downloads$ ls
apache-hive-2.0.0-bin.tar.gz  scala-2.11.8.tgz
hadoop-2.6.4.tar.gz           spark-1.6.1-bin-hadoop2.6.tgz
hbase-1.2.1-bin.tar.gz        zookeeper-3.5.0-alpha.tar.gz
jdk-7u80-linux-x64.tar.gz
hadoop@master:~/Downloads$ cd /software/
hadoop@master:/software$ tar -zxvf ~/Downloads/scala-2.11.8.tgz
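The extract step can also be wrapped so that it does not depend on the current directory. This is a minimal sketch, assuming a gzip-compressed tarball; `extract_to` is a hypothetical helper name, not something that ships with Scala or Spark.

```shell
# Extract a .tgz archive into a target directory and list what was unpacked.
# $1: path to the archive, $2: destination directory (created if missing).
extract_to() {
  archive=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # -C makes tar unpack into $dest regardless of the current directory
  tar -zxf "$archive" -C "$dest" || return 1
  ls "$dest"
}

# Example for this setup:
# extract_to ~/Downloads/scala-2.11.8.tgz /software
```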
2. Configure environment variables
hadoop@master:/software$ sudo gedit /etc/profile
Add the following lines:
export SCALA_HOME=/software/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
hadoop@master:/software$ source /etc/profile
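If you would rather not open an editor, the same two lines can be appended from the shell. The sketch below adds a line only when it is not already present, so running it twice does not duplicate entries. `add_line_once` and the scratch file name are assumptions for illustration; review the result before copying it into /etc/profile.

```shell
# Append a line to a file only if that exact line is not already present.
# $1: file to edit, $2: line to add.
add_line_once() {
  file=$1
  line=$2
  touch "$file" || return 1
  # -x matches whole lines only, -F treats the text literally (no regex)
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example against a scratch file (hypothetical name):
# add_line_once ./profile.snippet 'export SCALA_HOME=/software/scala-2.11.8'
# add_line_once ./profile.snippet 'export PATH=$SCALA_HOME/bin:$PATH'
```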
3. Start and verify
hadoop@master:/software$ scala
Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80).
Type in expressions for evaluation. Or try :help.

scala> 8*8
res0: Int = 64

scala>
Installing Spark
1. Extract spark-1.6.1-bin-hadoop2.6.tgz
hadoop@master:/software$ tar -zxvf ~/Downloads/spark-1.6.1-bin-hadoop2.6.tgz
2. Configure environment variables
hadoop@master:/software$ sudo gedit /etc/profile
Add the following lines:
export SPARK_HOME=/software/spark-1.6.1-bin-hadoop2.6
export PATH=$SPARK_HOME/bin:$PATH
hadoop@master:/software$ source /etc/profile
3. Edit spark-env.sh
hadoop@master:~$ cd /software/spark-1.6.1-bin-hadoop2.6/conf/
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ ls
docker.properties.template  metrics.properties.template   spark-env.sh.template
fairscheduler.xml.template  slaves.template
log4j.properties.template   spark-defaults.conf.template
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ cp spark-env.sh.template spark-env.sh
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ sudo gedit spark-env.sh
Add:
export SCALA_HOME=/software/scala-2.11.8
export JAVA_HOME=/software/jdk1.7.0_80
export SPARK_MASTER_IP=master
export SPARK_WORKER_MEMORY=512m
export master=spark://master:7077
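A quick way to double-check what the file really exports is to source it in a subshell and print the values. This is a sketch under the assumption that the file only contains simple `export` lines like the ones above; `show_spark_env` is a hypothetical helper name.

```shell
# Print the settings a spark-env.sh style file exports.
# $1: path to the file. Runs in a subshell so the current shell is untouched.
show_spark_env() {
  (
    # Clear inherited values so only what the file sets is shown
    unset SCALA_HOME JAVA_HOME SPARK_MASTER_IP SPARK_WORKER_MEMORY
    . "$1" || exit 1
    echo "SCALA_HOME=$SCALA_HOME"
    echo "JAVA_HOME=$JAVA_HOME"
    echo "SPARK_MASTER_IP=$SPARK_MASTER_IP"
    echo "SPARK_WORKER_MEMORY=$SPARK_WORKER_MEMORY"
  )
}

# Example for this setup:
# show_spark_env /software/spark-1.6.1-bin-hadoop2.6/conf/spark-env.sh
```

An empty value on the right-hand side usually points to a typo in the variable name or a missing `export` line.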
Edit slaves
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ cp slaves.template slaves
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6/conf$ sudo gedit slaves
Change localhost to master.
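The same edit can be scripted. The sketch below rewrites only a line that is exactly `localhost` and leaves comments and other hosts alone. `set_worker_host` is a hypothetical helper; it writes to a temporary file and moves it into place because the in-place flag of `sed` behaves differently across platforms.

```shell
# Replace a line that is exactly "localhost" with the given host name.
# $1: path to the slaves file, $2: host name to use instead.
set_worker_host() {
  file=$1
  host=$2
  tmp="$file.tmp.$$"
  # \$ keeps a literal end-of-line anchor inside the double quotes
  sed "s/^localhost\$/$host/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example for this setup (run from the conf directory):
# set_worker_host slaves master
```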
4. Start Spark
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6$ sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /software/spark-1.6.1-bin-hadoop2.6/logs/spark-tg-org.apache.spark.deploy.master.Master-1-master.out
master: starting org.apache.spark.deploy.worker.Worker, logging to /software/spark-1.6.1-bin-hadoop2.6/logs/spark-tg-org.apache.spark.deploy.worker.Worker-1-master.out
Run jps to check the processes: Worker and Master should now appear in the list.
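The jps check can be automated with a small filter. This is a sketch that assumes the usual jps output of one `<pid> <name>` pair per line; `check_spark_daemons` is a name invented for illustration.

```shell
# Read a jps-style listing on standard input and report whether the
# standalone Master and Worker processes are present.
check_spark_daemons() {
  listing=$(cat)
  for name in Master Worker; do
    # The second column of each jps line is the process name
    if printf '%s\n' "$listing" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name: running"
    else
      echo "$name: not found"
    fi
  done
}

# Example for this setup:
# jps | check_spark_daemons
```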
5. Enter spark-shell
hadoop@master:/software/spark-1.6.1-bin-hadoop2.6$ bin/spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/05/31 05:58:36 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:37 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:45 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/05/31 05:58:46 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/05/31 05:58:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:51 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/31 05:58:57 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/05/31 05:58:58 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala>
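Once the prompt appears, a tiny job is a good smoke test. Instead of typing it interactively, an expression can also be piped into the shell on standard input. This is a sketch: `run_in_repl` is a hypothetical helper, and the Scala expression in the example is my own test input, not part of the original session.

```shell
# Feed one expression to a REPL on standard input and let it exit at end of input.
# $1: REPL command (for example bin/spark-shell), $2: expression to evaluate.
run_in_repl() {
  printf '%s\n' "$2" | "$1"
}

# Example for this setup (run from the Spark directory); it sums the
# numbers 1 to 100 with the SparkContext that spark-shell provides as sc:
# run_in_repl bin/spark-shell 'println(sc.parallelize(1 to 100).sum())'
```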
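You can also check things in a browser. While spark-shell is running, the application UI is at http://<master IP>:4040. The standalone Master's web UI is at http://<master IP>:8080 by default. Port 7077 is the Master's service port that workers and applications connect to, not a web page.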
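To check the web UIs without opening a browser, the addresses can be built from the host name and probed from the command line. This is a sketch under stated assumptions: `spark_ui_urls` is a made-up helper, 8080 is the standalone Master's default web UI port, and 4040 is the default UI port of a running application, so neither applies if the ports were overridden in the configuration.

```shell
# Print the default web UI addresses for a standalone setup.
# $1: host name or IP address of the master machine.
spark_ui_urls() {
  host=$1
  echo "http://$host:8080"   # standalone Master web UI (default port)
  echo "http://$host:4040"   # UI of a running application, e.g. spark-shell
}

# Example for this setup; prints the HTTP status code for each address:
# for url in $(spark_ui_urls master); do
#   curl -s -o /dev/null -w "%{http_code} $url\n" "$url"
# done
```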