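from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuse the shell's context if one already exists

a = [[1, 2, 3, 2, 3, 4], [3, 4, 5, 6, 7, 5, 3, 2]]
b = sc.parallelize(a)
d = b.flatMap(lambda x: x)  # flatten the nested lists into a single RDD
e = d.distinct()            # drop duplicate elements
e.collect()  # => [1, 2, 3, 4, 5, 6, 7] (element order is not guaranteed)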
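For readers without a Spark installation, the same flatten-then-deduplicate idea can be sketched in plain Python. This is only an illustrative analogue, not how Spark implements it: the names `nested`, `flattened`, and `unique_sorted` are made up for this sketch, and the final `sorted` call is an assumption added here so the result is deterministic (Spark's `distinct()` makes no ordering promise).

```python
from itertools import chain

nested = [[1, 2, 3, 2, 3, 4], [3, 4, 5, 6, 7, 5, 3, 2]]

# Analogue of flatMap(lambda x: x): concatenate the inner lists into one sequence
flattened = list(chain.from_iterable(nested))

# Analogue of distinct(): a set drops duplicates; sorting only fixes the order
unique_sorted = sorted(set(flattened))

print(unique_sorted)
```

The key point carried over from the Spark version is that flattening happens first, so duplicates are removed across both inner lists rather than within each one separately.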
Reposted from: https://www.cnblogs.com/zhangbojiangfeng/p/6490984.html