Adding spark.yarn.jars to resolve WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set

I. Symptom:

When submitting a job with Spark on YARN, the following warning appears:

WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.


II. Solution:

1). Create the archive: jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
2). Create a directory on HDFS: hdfs dfs -mkdir -p /system/SparkJars/jar
    Upload the archive to HDFS: hdfs dfs -put spark-libs.jar /system/SparkJars/jar
3). In spark-defaults.conf, set spark.yarn.archive=hdfs:///system/SparkJars/jar/spark-libs.jar
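
Step 3 can be done by hand in an editor. For repeated setups, a small helper that replaces an existing entry instead of appending a duplicate may be convenient. The following is only a sketch: set_spark_conf is a hypothetical function name, not part of Spark, and it assumes the file uses the usual "key value" or "key=value" line format.

```shell
# set_spark_conf FILE KEY VALUE
# Hypothetical helper (not part of Spark): sets KEY to VALUE in a
# spark-defaults.conf style file, replacing any existing line for that key.
set_spark_conf() {
  conf_file="$1"
  key="$2"
  value="$3"
  tmp_file="${conf_file}.tmp"

  # Create the file if it does not exist yet.
  [ -f "$conf_file" ] || : > "$conf_file"

  # Keep every line whose first field (split on spaces, tabs, or '=') is not KEY.
  awk -F'[ \t=]+' -v k="$key" '$1 != k' "$conf_file" > "$tmp_file"

  # Append the new setting in whitespace-separated form.
  printf '%s %s\n' "$key" "$value" >> "$tmp_file"
  mv "$tmp_file" "$conf_file"
}

# Example (value taken from step 3; the conf path is an assumption):
# set_spark_conf "$SPARK_HOME/conf/spark-defaults.conf" \
#   spark.yarn.archive hdfs:///system/SparkJars/jar/spark-libs.jar
```

Comments and unrelated settings in the file are left untouched; only lines whose key matches exactly are replaced.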

III. Result:

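This is one way to tune Spark on YARN: it saves the time spent uploading the Spark jars to HDFS on every job submission. The effect can be seen by checking the output of an actual run.

IV. Official documentation:

The explanation is at [https://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties]: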

To make Spark runtime jars accessible from YARN side, you can specify spark.yarn.archive or spark.yarn.jars. For details please refer to Spark Properties. If neither spark.yarn.archive nor spark.yarn.jars is specified, Spark will create a zip file with all jars under $SPARK_HOME/jars and upload it to the distributed cache.

Looking further at the specific Spark Properties:
spark.yarn.jars (default: none): List of libraries containing Spark code to distribute to YARN containers. By default, Spark on YARN will use Spark jars installed locally, but the Spark jars can also be in a world-readable location on HDFS. This allows YARN to cache it on nodes so that it doesn’t need to be distributed each time an application runs. To point to jars on HDFS, for example, set this configuration to hdfs:///some/path. Globs are allowed.

spark.yarn.archive (default: none): An archive containing needed Spark jars for distribution to the YARN cache. If set, this configuration replaces spark.yarn.jars and the archive is used in all the application’s containers. The archive should contain jar files in its root directory. Like with the previous option, the archive can also be hosted on HDFS to speed up file distribution.
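
Since the archive is expected to contain the jar files in its root directory, it can be worth checking the layout before uploading. A minimal sketch, assuming the listing comes from jar tf (or any tool that prints one entry per line); has_root_jars is a hypothetical helper name:

```shell
# has_root_jars: reads an archive listing (one entry per line) on stdin and
# succeeds only if at least one .jar entry sits at the archive root.
has_root_jars() {
  awk '
    { sub(/^\.\//, "") }             # tolerate a leading "./" in entry names
    /^[^\/]+\.jar$/ { found = 1 }    # a .jar entry with no directory part
    END { exit (found ? 0 : 1) }
  '
}

# Example:
# jar tf spark-libs.jar | has_root_jars && echo "jars are at the archive root"
```

If the check fails, the archive was probably built from the wrong directory, for example without the -C $SPARK_HOME/jars/ . arguments, so the jars ended up under a subdirectory.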

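In other words, by default Spark on YARN uses the Spark jars from the local installation (the Spark install directory), but these jars can also be placed in any world-readable location on HDFS. YARN can then cache them on the nodes, so they do not have to be uploaded each time an application runs.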
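
For the spark.yarn.jars variant, the individual jars are uploaded instead of a single archive, and the property points at them with a glob. The sketch below assumes the hdfs command is on the PATH and SPARK_HOME is set; upload_spark_jars is a hypothetical helper name and /system/SparkJars/jars is only an example directory.

```shell
# upload_spark_jars HDFS_DIR
# Hypothetical helper: copies the local Spark jars to HDFS_DIR and prints the
# matching spark.yarn.jars line for spark-defaults.conf.
upload_spark_jars() {
  hdfs_dir="$1"
  hdfs dfs -mkdir -p "$hdfs_dir" || return 1
  hdfs dfs -put "$SPARK_HOME"/jars/*.jar "$hdfs_dir" || return 1
  # Globs are allowed in spark.yarn.jars, so one pattern covers all jars.
  printf 'spark.yarn.jars hdfs://%s/*.jar\n' "$hdfs_dir"
}

# Example (directory name is an assumption):
# upload_spark_jars /system/SparkJars/jars
```

Only one of the two properties is needed: per the documentation quoted above, spark.yarn.archive replaces spark.yarn.jars when it is set.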
