[Hands-on] Spark RDD API in Practice

  • map

Applies a transformation function on each item of the RDD and returns the result as a new RDD.

// The second argument, 3, asks for 3 partitions
val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)
// Build a new RDD from the length of each element of a
val b = a.map(_.length)
// Combine the two RDDs into a new RDD of pairs
val c = a.zip(b)
c.collect
res0: Array[(String, Int)] = Array((dog,3), (salmon,6), (salmon,6), (rat,3),
(elephant,8))
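
Since map may return any type, the (word, length) pairs above can also be produced in a single pass, without the intermediate RDD and the zip. The following is a minimal sketch for spark-shell; the `getOrCreate` line and the names `words` and `pairs` are assumptions added for illustration and are not part of the original example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the shell's context if one exists; otherwise starts a local one (assumed setup).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("map-sketch").setMaster("local[2]"))

val words = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)

// Emit the pair directly from map: one output element per input element.
val pairs = words.map(w => (w, w.length))

pairs.collect
// Array((dog,3), (salmon,6), (salmon,6), (rat,3), (elephant,8))
```
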
  • zip

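Joins two RDDs by pairing the i-th element of each partition of one RDD with the i-th element of the corresponding partition of the other. Both RDDs must therefore have the same number of partitions and the same number of elements in each partition. The resulting RDD consists of two-component tuples, which the methods provided by the PairRDDFunctions extension interpret as key-value pairs.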

val a1 = sc.parallelize(1 to 10, 3)
val b1 = sc.parallelize(11 to 20, 3)
a1.zip(b1).collect
res1: Array[(Int, Int)] = Array((1,11), (2,12), (3,13), (4,14),
(5,15), (6,16), (7,17), (8,18), (9,19), (10,20))

val a2 = sc.parallelize(1 to 10, 3)
val b2 = sc.parallelize(11 to 20, 3)
val c2 = sc.parallelize(21 to 30, 3)
a2.zip(b2).zip(c2).collect
res3: Array[((Int, Int), Int)] = Array(((1,11),21), ((2,12),22),
((3,13),23), ((4,14),24), ((5,15),25), ((6,16),26), ((7,17),27),
((8,18),28), ((9,19),29), ((10,20),30))
a2.zip(b2).zip(c2).map((x) => (x._1._1, x._1._2, x._2 )).collect
res2: Array[(Int, Int, Int)] = Array((1,11,21), (2,12,22), (3,13,23),
(4,14,24), (5,15,25), (6,16,26), (7,17,27), (8,18,28), (9,19,29), (10,20,30))
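
Because zip pairs elements position by position within each partition, it only works when both RDDs share the same partition layout. The sketch below flattens the nested tuple with pattern matching and then shows what happens when the partition counts differ. The `getOrCreate` line and the names `xs`, `ys`, `zs`, `triples`, `mismatched` and `attempt` are assumptions introduced for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try

// Reuses the shell's context if one exists; otherwise starts a local one (assumed setup).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("zip-sketch").setMaster("local[2]"))

val xs = sc.parallelize(1 to 10, 3)
val ys = sc.parallelize(11 to 20, 3)
val zs = sc.parallelize(21 to 30, 3)

// Pattern matching names the parts of the nested tuple instead of using _1 and _2.
val triples = xs.zip(ys).zip(zs).map { case ((x, y), z) => (x, y, z) }
triples.collect

// Same number of elements, but 2 partitions instead of 3: the layouts do not line up.
val mismatched = sc.parallelize(11 to 20, 2)

// The failure surfaces when the job is evaluated, so the action is wrapped in Try.
val attempt = Try(xs.zip(mismatched).collect())
attempt.isFailure
```

When two RDDs were not derived from each other and their layouts may differ, keying both by an index and joining is one possible alternative, at the cost of a shuffle.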
  • filter

Evaluates a boolean function for each data item of the RDD and puts the items for which the function returned true into the resulting RDD.

val a = sc.parallelize(1 to 10, 3)
val b = a.filter(_ % 2 == 0)
b.collect
res4: Array[Int] = Array(2, 4, 6, 8, 10)
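
The predicate passed to filter is an ordinary function value, so it can be named and reused. The sketch below keeps the even numbers as above, derives the odd ones from the negated predicate, and checks that filtering leaves the number of partitions unchanged. The `getOrCreate` line and the names `nums`, `isEven`, `evens` and `odds` are assumptions introduced for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the shell's context if one exists; otherwise starts a local one (assumed setup).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("filter-sketch").setMaster("local[2]"))

val nums = sc.parallelize(1 to 10, 3)

// A named predicate; any Int => Boolean function can be passed to filter.
val isEven: Int => Boolean = _ % 2 == 0

val evens = nums.filter(isEven)
val odds  = nums.filter(n => !isEven(n))

evens.collect            // Array(2, 4, 6, 8, 10)
odds.collect             // Array(1, 3, 5, 7, 9)

// filter drops elements but keeps the partition count of its parent.
evens.getNumPartitions   // 3
```
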
  • flatMap

Similar to map, but allows the function to emit more than one item per input element. map turns each element into exactly one other element, whereas flatMap turns each element into zero or more elements.
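
The difference between map and flatMap is easiest to see side by side. In this minimal sketch, which is not taken from the original post, map over `split` yields one array per line, while flatMap flattens those arrays into a single RDD of words. The `getOrCreate` line, the sample strings, and the names `lines`, `nested`, `flat` and `expanded` are assumptions introduced for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Reuses the shell's context if one exists; otherwise starts a local one (assumed setup).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("flatmap-sketch").setMaster("local[2]"))

val lines = sc.parallelize(List("hello spark", "hello rdd"), 2)

// map: one output element per line, each of them an Array[String].
val nested = lines.map(_.split(" "))
nested.count             // 2

// flatMap: the per-line arrays are flattened into individual words.
val flat = lines.flatMap(_.split(" "))
flat.collect             // Array(hello, spark, hello, rdd)

// Each input x expands to the range 1 to x, so outputs can outnumber inputs.
val expanded = sc.parallelize(1 to 3, 1).flatMap(x => 1 to x)
expanded.collect         // Array(1, 1, 2, 1, 2, 3)
```
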