Spark study notes can be followed here: https://github.com/MachineLP/Spark-
Types of Spark operations
There are three types of operations on RDDs: transformations, actions, and shuffles.
- The most expensive operations are those that require communication between nodes.
Transformations: RDD → RDD.
- Examples: map, filter, sample, and more.
- No communication needed.
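
Why map and filter need no communication can be seen in a small toy model. This is a plain-Python sketch, not Spark's API: the RDD is modeled as a list of partitions (one per worker node), and the helper names map_partitions and filter_partitions are hypothetical, introduced only for illustration. In PySpark the corresponding calls would be rdd.map and rdd.filter.

```python
# Toy model (an assumption for illustration): an "RDD" is a list of
# partitions, and each inner list lives on a different worker node.
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def map_partitions(parts, fn):
    """Apply fn to every element, handling each partition on its own."""
    return [[fn(x) for x in part] for part in parts]


def filter_partitions(parts, pred):
    """Keep elements that satisfy pred, handling each partition on its own."""
    return [[x for x in part if pred(x)] for part in parts]


# Each step reads only the elements already inside a partition,
# so no data has to move between nodes.
squared = map_partitions(partitions, lambda x: x * x)
even_squares = filter_partitions(squared, lambda x: x % 2 == 0)

print(even_squares)  # [[4], [16, 36], [64]]
```

The result has the same number of partitions as the input, and every output element stays in the partition its input came from. That is what makes these operations cheap compared with ones that must move data across nodes.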
Actions: RDD → Python object on the head node.
- Ex