Prerequisites:
A working Docker environment is assumed.
This walkthrough was done on a macOS machine with Docker 20.10.6.
1. Pull the image
docker pull apache/spark
2. Run the image
docker run -it apache/spark /opt/spark/bin/spark-shell
3. Run a Spark command
spark.range(1000 * 1000 * 1000).count()
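Note: spark.range(n) builds a single-column Dataset of Long values from 0 to n-1, so count() should return exactly one billion here. As a minimal sketch (standard Spark SQL API calls, not part of the original steps), a couple of related expressions you could type in the same spark-shell session:

// spark.range produces a Dataset with one Long column named "id".
// A Long literal (1000L) avoids Int overflow if you enlarge the range later.
val ds = spark.range(1000L * 1000 * 1000)
ds.count()                        // 1000000000
ds.selectExpr("sum(id)").show()   // sum of 0..999999999 = 499999999500000000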
Run log:
localhost:carbondata1 xubo$ docker run -it apache/spark /opt/spark/bin/spark-shell
++ id -u
+ myuid=185
++ id -g
+ mygid=0
+ set +e
++ getent passwd 185
+ uidentry=
+ set -e
+ '[' -z '' ']'
+ '[' -w /etc/passwd ']'
+ echo '185:x:185:0:anonymous uid:/opt/spark:/bin/false'
+ '[' -z /usr/local/openjdk-11 ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ sed 's/[^=]*=\(.*\)/\1/g'
+ sort -t_ -k4 -n
+ grep SPARK_JAVA_OPT_
+ env
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
Non-spark-on-k8s command provided, proceeding in pass-through mode...
+ CMD=("$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/04/09 17:40:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://ab245d51dce7:4040
Spark context available as 'sc' (master = local[*], app id = local-1681062040709).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.2
      /_/
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 11.0.16)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
scala>
scala> spark.range(1000 * 1000 * 1000).count()
res0: Long = 1000000000
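
The result confirms that the range holds exactly one billion rows (0 through 999999999). A few optional follow-ups in the same session, sketched with standard spark-shell commands (these were not part of the original run):

scala> spark.version            // the Spark version, 3.3.2 in this image
scala> spark.range(10).show()   // print the ten values of a small range as a table
scala> :quit                    // leave the shell; the container stops with it

Note that the Web UI advertised on port 4040 is only reachable inside the container; to open it from the host, publish the port when starting the container, e.g. docker run -p 4040:4040 -it apache/spark /opt/spark/bin/spark-shell.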
References:
[1] https://hub.docker.com/r/apache/spark
