Error
1. org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x00000207C46963A0>'
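Code that raises the error:
from pyspark import SparkConf, SparkContext
conf = SparkConf().setAppName("test").setMaster("local[*]")
# Build the SparkContext object
# Bug: conf is passed positionally, so it is bound to the first parameter, master
sc = SparkContext(conf)
rdd = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Default number of partitions:", rdd.getNumPartitions())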
Corrected code:
from pyspark import SparkConf, SparkContext
conf = SparkConf().setAppName("test").setMaster("local[*]")
# Build the SparkContext object, passing the configuration by keyword
sc = SparkContext(conf=conf)
rdd = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Default number of partitions:", rdd.getNumPartitions())
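The problem is the argument passed to SparkContext. Its first positional parameter is master, so a SparkConf object passed positionally is treated as the master URL, which Spark cannot parse. Passing it by keyword as conf=conf fixes this. The IDE may highlight the keyword argument in red, but the code runs without problems.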
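The positional-versus-keyword binding can be illustrated without pyspark. The sketch below is only an illustration: `fake_spark_context` and `FakeConf` are hypothetical stand-ins, and the abbreviated parameter list merely mimics the fact that `master` comes first and `conf` comes later in the `SparkContext` signature.

```python
# Hypothetical stand-in, not the real pyspark class: it only mimics the
# fact that `master` is the first parameter and `conf` comes later.
def fake_spark_context(master=None, appName=None, conf=None):
    return {"master": master, "appName": appName, "conf": conf}


class FakeConf:
    """Placeholder for a SparkConf object (assumption for illustration)."""


conf = FakeConf()

# Positional call: the object lands in `master`, which Spark would then
# try to parse as a master URL.
wrong = fake_spark_context(conf)

# Keyword call: the object lands in `conf`, as intended.
right = fake_spark_context(conf=conf)

print("positional -> bound to master:", wrong["master"] is conf)
print("keyword    -> bound to conf:  ", right["conf"] is conf)
```

In the positional call the `conf` slot stays empty, which is why Spark reports the object's repr as an unparseable master URL.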
2. Constructor org.apache.spark.sql.SparkSession([class org.apache.spark.SparkContext, class java.util.HashMap]) does not exist
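This error is usually caused by a mismatch between the locally installed Spark version and the pyspark version. Run
pip show pyspark
to check the pyspark version, then change one of the two so that the versions match.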
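As a rough sketch of the comparison, the helper below checks whether two version strings agree on their major and minor numbers. `major_minor` and `versions_match` are hypothetical helpers, and the version strings in the usage lines are made-up examples. In practice the first value would come from `pip show pyspark` and the second from the local Spark installation, for instance `spark-submit --version`. Treating a major.minor match as sufficient is an assumption; matching the full version is the safer choice.

```python
def major_minor(version: str) -> tuple:
    """Return the (major, minor) numbers of a version string like '3.3.1'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def versions_match(pyspark_version: str, spark_version: str) -> bool:
    """Hypothetical helper: True when both versions share major and minor."""
    return major_minor(pyspark_version) == major_minor(spark_version)


# Made-up example values, not taken from a real installation.
print(versions_match("3.3.1", "3.3.0"))
print(versions_match("3.5.0", "3.1.2"))
```

If the versions differ, a specific pyspark release can be installed with `pip install pyspark==<version>`, where `<version>` is a placeholder for the local Spark version.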