Converting an RDD to a DataFrame in PySpark

I want to convert my RDD to a DataFrame in PySpark.

My RDD:

[(['abc', '1,2'], 0), (['def', '4,6,7'], 1)]

I want the RDD as a DataFrame of this form:

Index  Name  Number
0      abc   [1,2]
1      def   [4,6,7]

I tried:

rd2=rd.map(lambda x,y: (y, x[0] , x[1]) ).toDF(["Index", "Name" , "Number"])

But I got this error:

An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 62.0 failed 1 times, most recent failure: Lost task 0.0 in stage 62.0 (TID 88, localhost, executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):

Can you tell me where I went wrong?

Update:

rd2=rd.map(lambda x: (x[1], x[0][0] , x[0][1]))

After this change, my RDD has the following form:

[(0, 'abc', '1,2'), (1, 'def', '4,6,7')]
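
One way to check the reshaping step in isolation is to run it on a plain Python list, without Spark. The sketch below is an illustration, not part of the original post: `records`, `reshape`, and `two_arg` are hypothetical names, and splitting the number string into a list of integers is an assumption about what the `Number` column should hold. It also shows that a two-argument lambda raises a `TypeError` when it is called with a single record, which is how `RDD.map` calls its function.

```python
# Plain-Python stand-in for the RDD contents shown above (no Spark needed).
records = [(['abc', '1,2'], 0), (['def', '4,6,7'], 1)]


def reshape(record):
    """Turn (['name', 'n1,n2'], index) into (index, name, [n1, n2])."""
    (name, numbers), index = record
    # Assumption: Number should be a list of ints, not the raw string.
    return (index, name, [int(n) for n in numbers.split(',')])


rows = [reshape(r) for r in records]
print(rows)  # [(0, 'abc', [1, 2]), (1, 'def', [4, 6, 7])]

# The first attempt used a lambda with two parameters. Each record is
# passed as ONE argument, so the call fails before any reshaping happens.
two_arg = lambda x, y: (y, x[0], x[1])
try:
    two_arg(records[0])
    two_arg_error = None
except TypeError as exc:
    two_arg_error = exc
print(type(two_arg_error).__name__)  # TypeError
```

A function like `reshape` could then be passed to `rd.map(...)` in place of the lambda before calling `toDF`.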

To convert it to a DataFrame:

rd2.toDF(["Index", "Name" , "Number"])

It still gives me an error:

An error occurred while calling o2271.showString.
: java.lang.IllegalStateException: SparkContext has been shutdown
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2021)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2050)
