pyspark
gavenyeah
Graduate student in computer science at Beijing Normal University
pyspark rdd def partitionBy: custom partitionFunc
def partitionBy(self, numPartitions, partitionFunc=portable_hash): """ Return a copy of the RDD partitioned us...
Original · 2017-12-11 15:10:52 · 5792 views · 1 comment
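To make the role of `partitionFunc` concrete, here is a minimal pure-Python sketch of the bucketing rule that `partitionBy` applies to key-value pairs: each pair goes to partition `partitionFunc(key) % numPartitions`. The names `by_first_letter` and `simulate_partition_by` are hypothetical helpers invented for this illustration; they are not part of PySpark, and the simulation ignores shuffling and serialization entirely.

```python
from collections import defaultdict


def by_first_letter(key):
    """Hypothetical custom partition function: map a string key to an integer."""
    return ord(key[0].lower())


def simulate_partition_by(pairs, num_partitions, partition_func=hash):
    """Pure-Python stand-in for the bucketing step of rdd.partitionBy.

    Each (key, value) pair lands in bucket partition_func(key) % num_partitions.
    """
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition_func(key) % num_partitions].append((key, value))
    return dict(buckets)


pairs = [("apple", 1), ("avocado", 2), ("banana", 3), ("cherry", 4)]
partitions = simulate_partition_by(pairs, 2, by_first_letter)

for index in sorted(partitions):
    print(index, partitions[index])
```

With a real pair RDD, the corresponding call would be `rdd.partitionBy(2, by_first_letter)`; the custom function only has to return an integer for every key.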
StructType can not accept object %r in type %s" % (obj, type(obj)))
When converting string data into a Spark RDD, I kept getting this error. . . .
s = str(tree)
y = str(YESTERDAY)
list0 = [s, y]
outRes = self.sc.parallelize(list0)
df_tree = outRes.toDF("model: string, dt: string").registerTempTable("temp") .
Original · 2017-11-27 22:09:47 · 3953 views · 0 comments
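A likely cause, offered as a sketch and not as the post's own resolution: `parallelize([s, y])` yields two records that are each a bare `str`, while a two-field schema expects every record to be row-like (a tuple, list, dict, or `Row`) with one value per field. The pure-Python snippet below only illustrates that shape difference. The values of `s` and `y` are placeholders, and `matches_schema` is a hypothetical, simplified check, not Spark's actual verifier.

```python
# Placeholder values standing in for str(tree) and str(YESTERDAY) from the post.
s = "serialized-model"
y = "2017-11-26"

schema = "model: string, dt: string"
field_count = len(schema.split(","))  # two fields: model, dt

bad_records = [s, y]     # two records, each a bare str
good_records = [(s, y)]  # one record, a tuple with one value per schema field


def matches_schema(record, n_fields):
    """Hypothetical, simplified check: the record is a tuple or list with n_fields values."""
    return isinstance(record, (tuple, list)) and len(record) == n_fields


print([matches_schema(r, field_count) for r in bad_records])
print([matches_schema(r, field_count) for r in good_records])

# Assuming a live SparkContext `sc` and an existing SparkSession, the
# corrected calls would look roughly like this:
#   df_tree = sc.parallelize(good_records).toDF(schema)
#   df_tree.registerTempTable("temp")
```

Note also that `registerTempTable` returns `None`, so chaining it onto `toDF(...)` as in the excerpt leaves `df_tree` empty; keeping the two calls separate avoids that.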