Spark Study Notes, Chapter 8: Tuning and Debugging Spark

Configuring Spark with SparkConf

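As a rough illustration of configuring an application, the sketch below keeps the settings in a plain dictionary and shows two ways to hand them to Spark: programmatically through SparkConf, or as --conf flags for spark-submit. The property keys are real Spark properties, but the values, the SETTINGS name, and both helper functions are assumptions made for this example.

```python
# Illustrative values only; the keys are standard Spark properties.
SETTINGS = {
    "spark.app.name": "My Spark App",
    "spark.master": "local[4]",
    "spark.ui.port": "36000",
}


def build_conf(settings):
    """Apply each key/value pair to a fresh SparkConf (requires pyspark)."""
    # Imported lazily so the rest of this sketch works without Spark installed.
    from pyspark import SparkConf

    conf = SparkConf()
    for key, value in settings.items():
        conf.set(key, value)
    return conf


def to_submit_args(settings):
    """Render the same settings as spark-submit --conf flags (hypothetical helper)."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(SETTINGS)))
```

With pyspark installed, `SparkContext(conf=build_conf(SETTINGS))` would start a context using these settings.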

Components of Execution: Jobs, Tasks, and Stages


Spark Execution

  1. User code defines a DAG (directed acyclic graph) of RDDs
    Operations on RDDs create new RDDs that refer back to their parents, thereby creating a graph.
  2. Actions force translation of the DAG to an execution plan
    When you call an action on an RDD, it must be computed. This requires computing its parent RDDs as well. Spark's scheduler submits a job to compute all needed RDDs. That job will have one or more stages, which are parallel waves of computation composed of tasks. Each stage will correspond to one or more RDDs in the DAG. A single stage can correspond to multiple RDDs due to pipelining.
  3. Tasks are scheduled and executed on a cluster
    Stages are processed in order, with individual tasks launching to compute segments of the RDDs. Once the final stage in a job is finished, the action is complete.
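
To make pipelining more concrete, here is a toy model in plain Python of how a linear chain of RDD operations might be grouped into stages. It is not Spark's scheduler: the WIDE set, the split_into_stages function, and the sample lineage are all assumptions for illustration, and the model simply treats every listed wide operation as requiring a shuffle.

```python
# Toy model only: operations assumed to require a shuffle in this sketch.
WIDE = {"reduceByKey", "groupByKey", "sortByKey"}


def split_into_stages(lineage):
    """Group a linear chain of RDD operations into stages.

    Consecutive narrow operations are pipelined into one stage.
    A wide operation reads shuffled data, so it opens a new stage.
    """
    stages = [[]]
    for op in lineage:
        if op in WIDE and stages[-1]:
            stages.append([])
        stages[-1].append(op)
    return stages


# Hypothetical lineage: each entry stands for the RDD created by that operation.
lineage = ["textFile", "map", "filter", "reduceByKey", "mapValues"]
stages = split_into_stages(lineage)
for i, stage in enumerate(stages):
    print(f"stage {i}: {' -> '.join(stage)}")
```

In this model the first three RDDs share one stage and the last two share another, which matches the idea that a single stage can cover multiple RDDs. In a real application, calling toDebugString() on an RDD shows the actual lineage.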

filter() and coalesce()


Memory Management

  1. RDD storage
    When you call persist() or cache() on an RDD, its partitions will be stored in memory buffers. Spark will limit the amount of memory used when caching to a certain fraction of the JVM's overall heap, set by spark.storage.memoryFraction. If this limit is exceeded, older partitions will be dropped from memory.
  2. Shuffle and aggregation buffers
    When performing shuffle operations, Spark will create intermediate buffers for storing shuffle output data. The size of these buffers is limited by spark.shuffle.memoryFraction.
  3. User code
    User code has access to everything left in the JVM heap after the space for RDD storage and shuffle storage is allocated.

By default, Spark will leave 60% of space for RDD storage, 20% for shuffle memory, and the remaining 20% for user programs.
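
As a quick worked example of that default split, the sketch below divides a hypothetical 10 GB executor heap using the 60/20/20 fractions. The memory_budget function and the heap size are assumptions for illustration, and the arithmetic ignores any internal safety margins. Note that these two property names belong to the older memory model described in these notes; Spark 1.6 and later use a unified memory manager with different settings.

```python
def memory_budget(heap_mb, storage_fraction=0.6, shuffle_fraction=0.2):
    """Split a JVM heap (in MB) into storage, shuffle, and user-code regions.

    The defaults mirror spark.storage.memoryFraction and
    spark.shuffle.memoryFraction; whatever remains is left for user code.
    """
    storage = round(heap_mb * storage_fraction)
    shuffle = round(heap_mb * shuffle_fraction)
    user = heap_mb - storage - shuffle
    return {"storage": storage, "shuffle": shuffle, "user": user}


# Hypothetical executor with a 10 GB heap.
budget = memory_budget(10240)
for region, size_mb in budget.items():
    print(f"{region}: {size_mb} MB")
```

Lowering the storage fraction in this model moves the freed space to user code, which is the trade-off to consider when user code allocates large objects.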


Hardware Provisioning

spark.cores.max
