A Deep Dive into Spark "Locality level" (Spark 2.3.0)
1. Where to see it: the Spark job UI
2. What Locality level means
Note: to avoid any distortion in interpretation, the original English explanation is quoted below.
Data locality can have a major impact on the performance of Spark jobs. If data and the code that operates on it are together then computation tends to be fast. But if code and data are separated, one must move to the other. Typically it is faster to ship serialized code from place to place than a chunk of data because code size is much smaller than data.
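To put rough numbers on the claim that shipping code is cheaper than shipping data, here is a minimal back-of-the-envelope sketch. All sizes and the bandwidth figure are assumptions chosen for illustration (a 50 KiB serialized task, a 128 MiB data block, a 1 Gbit/s link); they are not measurements from Spark, and the helper `transfer_seconds` is hypothetical.

```python
def transfer_seconds(size_bytes: int, bandwidth_bytes_per_sec: float) -> float:
    """Idealized transfer time: ignores latency, protocol overhead and contention."""
    return size_bytes / bandwidth_bytes_per_sec


# Assumed, illustrative values (not taken from Spark itself).
CODE_SIZE = 50 * 1024            # serialized task closure: 50 KiB
BLOCK_SIZE = 128 * 1024 * 1024   # one data block: 128 MiB
BANDWIDTH = 125_000_000          # 1 Gbit/s link, expressed in bytes per second

ship_code = transfer_seconds(CODE_SIZE, BANDWIDTH)
ship_data = transfer_seconds(BLOCK_SIZE, BANDWIDTH)

print(f"ship code to the data: {ship_code:.6f} s")
print(f"ship data to the code: {ship_data:.6f} s")
print(f"moving the data is about {ship_data / ship_code:.0f}x slower")
```

Under these assumptions the gap is three orders of magnitude, which is why a scheduler would rather run the task where the data already lives than move the data to the task.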