Creating RDDs

烙痕

Published 2018-08-08 17:21:30

Category: Spark

Article link: https://blog.csdn.net/qq_37408712/article/details/81511036

RDD in depth:
https://blog.csdn.net/u013850277/article/details/73648742

RDD creation, method 1:
Parallelized collections are created by calling SparkContext’s parallelize method on an existing collection in your driver program (a Scala Seq). The elements of the collection are copied to form a distributed dataset that can be operated on in parallel. For example, here is how to create a parallelized collection holding the numbers 1 to 5:

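// Assumes `sc` is an existing SparkContext (for example, the one spark-shell provides)
val data = Array(1, 2, 3, 4, 5)
val distData = sc.parallelize(data)
// Optionally pass the number of partitions as a second argument:
// val distData = sc.parallelize(data, 5)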

Once created, the distributed dataset (distData) can be operated on in parallel. For example, we might call distData.reduce((a, b) => a + b) to add up the elements of the array. We describe opera...
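The snippet above relies on a ready-made `sc`, as in spark-shell. As a minimal sketch of the same steps in a standalone program, the following builds its own local SparkContext, parallelizes the array, and sums it with reduce. The object name `ParallelizeSketch`, the helper names `sumOf` and `partitionCount`, and the `local[2]` master are assumptions chosen for illustration; they do not come from the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, used only for this illustration.
object ParallelizeSketch {

  // Distribute the collection and add up its elements with reduce.
  def sumOf(sc: SparkContext, data: Seq[Int]): Int =
    sc.parallelize(data).reduce((a, b) => a + b)

  // Distribute the collection into an explicit number of partitions
  // and report how many partitions the resulting RDD has.
  def partitionCount(sc: SparkContext, data: Seq[Int], numSlices: Int): Int =
    sc.parallelize(data, numSlices).getNumPartitions

  def main(args: Array[String]): Unit = {
    // "local[2]" runs Spark inside this JVM with two worker threads (an assumption for local testing).
    val conf = new SparkConf().setAppName("ParallelizeSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val data = Seq(1, 2, 3, 4, 5)
      println(s"sum = ${sumOf(sc, data)}")                      // sum = 15
      println(s"partitions = ${partitionCount(sc, data, 5)}")   // partitions = 5
    } finally {
      sc.stop() // always release the context, even if a job fails
    }
  }
}
```

When the second argument to parallelize is omitted, Spark picks a default partition count based on the cluster configuration, so passing it explicitly is mainly useful when the default does not suit the data size.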
