Download and extract the installation package from the official site:
spark-3.1.1-bin-without-hadoop
1. Configure spark-env.sh
In the conf directory, edit the spark-env.sh file and add the following variables:
SPARK_WORKER_INSTANCES=1
SPARK_MASTER_HOST=localhost
SPARK_MASTER_PORT=7077
SPARK_WORKER_CORES=4
SPARK_WORKER_MEMORY=1g
HADOOP_HOME=/mnt/d/hadoop/hadoop-3.3.0
JAVA_HOME=/usr/local/java/jdk1.8.0_241
LD_LIBRARY_PATH=/mnt/d/hadoop/hadoop-3.3.0/lib/native
SPARK_DIST_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)
Pay attention to how the SPARK_DIST_CLASSPATH variable is configured for this without-hadoop build; setting it any other way easily leads to errors such as hadoop: command not found.
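As a sanity check (my own sketch, not from the Spark docs), you can verify that the hadoop launcher that SPARK_DIST_CLASSPATH depends on is actually executable. The HADOOP_HOME path below is the one from this article; adjust it to your install:

```shell
# Check that the hadoop launcher used by SPARK_DIST_CLASSPATH exists and runs.
# The path below comes from this article's setup; change it to match yours.
check_hadoop() {
  if [ -x "$1/bin/hadoop" ]; then
    echo "ok: hadoop launcher found at $1/bin/hadoop"
  else
    echo "missing: $1/bin/hadoop"
  fi
}
check_hadoop /mnt/d/hadoop/hadoop-3.3.0
```

If this prints "missing", fix HADOOP_HOME before starting Spark; otherwise `$($HADOOP_HOME/bin/hadoop classpath)` expands to nothing and the daemons cannot find the Hadoop classes.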
2. Start Spark
sh sbin/start-all.sh
Check the Spark processes:
18325 Worker
17148 Master
Only when both of these processes appear has startup succeeded; if you hit errors instead, see Section 3.
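A small helper can turn this eyeball check into a script. This is my own sketch; it just greps jps-style output for both daemon names (running jps itself requires a JDK on the PATH):

```shell
# Check jps-style output for both Spark daemons.
# Returns success only if both a Master and a Worker line are present.
spark_daemons_ok() {
  echo "$1" | grep -q 'Master$' && echo "$1" | grep -q 'Worker$'
}

# Typical use:  if spark_daemons_ok "$(jps)"; then echo up; fi
spark_daemons_ok "18325 Worker
17148 Master" && echo "spark is up"
```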
3. Fix the errors
This build of Spark ships without the logging-related JARs, so we need to download them ourselves and put them into the jars directory.
1. java.lang.ClassNotFoundException: org.apache.log4j.spi.Filter
Download Apache log4j 1.2.17 and put the JAR it contains into the jars directory.
2. java.lang.ClassNotFoundException: org.slf4j.Logger
Download org.slf4j:slf4j-api:jar:1.7.25 and org.slf4j:slf4j-log4j12:jar:1.7.25, and put the JARs into the jars directory.
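Rather than hunting for these JARs by hand, you can construct their download URLs. This helper is my own sketch and assumes Maven Central's standard repository layout (groupId with dots replaced by slashes, then artifactId and version):

```shell
# Build a Maven Central download URL from groupId, artifactId, version.
# Assumes the standard repo1.maven.org/maven2 layout.
mvn_url() {
  echo "https://repo1.maven.org/maven2/$(echo "$1" | tr '.' '/')/$2/$3/$2-$3.jar"
}

mvn_url log4j log4j 1.2.17
mvn_url org.slf4j slf4j-api 1.7.25
mvn_url org.slf4j slf4j-log4j12 1.7.25
# Then download each into Spark's jars directory, e.g.:
#   wget -P jars/ "$(mvn_url org.slf4j slf4j-api 1.7.25)"
```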
4. Run a Spark example
./run-example SparkPi 2
The result looks like this (it differs on every run):
Pi is roughly 3.1372156860784304
package org.apache.spark.examples

import scala.math.random

import org.apache.spark._

/** Computes an approximation to pi */
object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = 100000 * slices
    val count = spark.parallelize(1 to n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
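Stripped of Spark, the same Monte Carlo idea fits in a few lines of awk. This is only an illustrative sketch of the math, not code from the Spark distribution: sample points uniformly in the square [-1,1] x [-1,1] and count how many land inside the unit circle; that fraction approaches pi/4.

```shell
# Monte Carlo estimate of pi, mirroring the SparkPi logic without a cluster:
# draw n points uniformly in [-1,1]^2; 4 * (points inside circle) / n -> pi.
estimate_pi() {
  awk -v n="$1" 'BEGIN {
    srand(42)                       # fixed seed for repeatability
    c = 0
    for (i = 0; i < n; i++) {
      x = 2 * rand() - 1
      y = 2 * rand() - 1
      if (x * x + y * y < 1) c++
    }
    printf "%.4f", 4 * c / n
  }'
}
estimate_pi 100000
```

As with run-example, the estimate fluctuates from run to run (and across awk implementations), but it converges toward pi as n grows.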
5. Start spark-shell
bin/spark-shell
Once you see the familiar logo, you are ready to move on to the next stage of learning:
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://192.168.205.71:4040
Spark context available as 'sc' (master = local[*], app id = local-1620076835321).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 3.1.1
/_/
Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_241)
Type in expressions to have them evaluated.
Type :help for more information.
scala>