When running a Spark program, printing an RDD yields the following output instead of the data:
PythonRDD[1] at RDD at PythonRDD.scala:53
Solution
This is not actually an error: print(rdd_filer) only shows the RDD object's string representation. Transformations such as filter() are lazy, so nothing has been computed yet; an action is needed to materialize the results.
Original code:
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName('filer').setMaster('local[*]')
sc = SparkContext(conf=conf)

rdd = sc.parallelize([1, 2, 3, 4, 5, 6])
rdd_filer = rdd.filter(lambda x: x > 1)  # transformation: lazy, nothing runs yet
print(rdd_filer)  # prints the RDD object itself, not its contents
After the change:
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName('filer').setMaster('local[*]')
sc = SparkContext(conf=conf)

rdd = sc.parallelize([1, 2, 3, 4, 5, 6])
rdd_filer = rdd.filter(lambda x: x > 1)
print(rdd_filer.collect())  # collect() is an action: it runs the job and returns [2, 3, 4, 5, 6]
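Note that collect() pulls every element of the RDD back to the driver, which can exhaust driver memory on a large dataset. As a minimal sketch (reusing the rdd_filer from above), other built-in actions also trigger the computation while returning less data:

print(rdd_filer.take(3))   # first 3 elements only: [2, 3, 4]
print(rdd_filer.count())   # number of elements: 5
print(rdd_filer.first())   # first element: 2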