How to run a Scala script with spark-submit (similarly to a Python script)?

This post describes a problem encountered when trying to execute a Scala script with spark-submit, and a way to work around it. Running Scala code the way one runs a Python script fails with the error 'Cannot load main class from JAR file'. The workaround is to load the Scala script in spark-shell with the `:load` command; this is fine for a PoC or for testing, but is not recommended for production use.

I am trying to execute a simple Scala script using Spark, as described in the Spark Quick Start Tutorial. I have no trouble executing the following Python code:

"""SimpleApp.py"""

from pyspark import SparkContext

logFile = "tmp.txt" # Should be some file on your system

sc = SparkContext("local", "Simple App")

logData = sc.textFile(logFile).cache()

numAs = logData.filter(lambda s: 'a' in s).count()

numBs = logData.filter(lambda s: 'b' in s).count()

print "Lines with a: %i, lines with b: %i" % (numAs, numBs)

I execute this code using the following command:

/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/bin/spark-submit hello_world.py

However, when I try to do the same using Scala, I run into problems. In more detail, the code I try to execute is:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "tmp.txt" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}

I try to execute it in the following way:

/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/bin/spark-submit hello_world.scala

As a result, I get the following error message:

Error: Cannot load main class from JAR file

Does anybody know what I am doing wrong?

Solution
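
spark-submit cannot run a bare .scala source file the way it runs a .py script; a Scala application has to be compiled and packaged into a JAR first. As a minimal sketch of that route with sbt, assuming the conventional layout (SimpleApp.scala under src/main/scala/) and a build.sbt along these lines (the project name and version below are arbitrary):

name := "simple-app"
version := "0.1"
scalaVersion := "2.11.8" // the Spark 2.1.0 binaries used above are built against Scala 2.11
// "provided": the Spark distribution supplies this jar at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0" % "provided"

Then package the application and point spark-submit at the resulting JAR, naming the main class explicitly:

sbt package
/home/aaa/spark/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --class SimpleApp target/scala-2.11/simple-app_2.11-0.1.jar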

I want to add to @JacekLaskowski's answer an alternative solution I sometimes use for PoC or testing purposes.

It is to run script.scala from inside the spark-shell with :load:

:load /path/to/script.scala

You won't need to define a SparkContext/SparkSession in the script, as it can use the sc and spark variables already defined in the scope of the REPL.

You also don't need to wrap the code in a Scala object.
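
For example, the Scala application above reduces to a plain script like the following (a sketch assuming spark-shell's predefined sc and the same local tmp.txt):

// script.scala -- no object wrapper and no SparkContext setup;
// sc is the SparkContext that spark-shell already provides
val logData = sc.textFile("tmp.txt").cache()
val numAs = logData.filter(line => line.contains("a")).count()
val numBs = logData.filter(line => line.contains("b")).count()
println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))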

PS: I consider this more of a hack, not something to use for production purposes.
