Spark Memory Management (Based on Spark 3.x)


I. Memory Parameters

Property Name                   Default
spark.memory.fraction           0.6
spark.memory.storageFraction    0.5
RESERVED_SYSTEM_MEMORY_BYTES    300 MB
Runtime.getRuntime.maxMemory    approximately Java heap memory * 0.89

1. Diagram:

[Figure: diagram of the executor memory regions]

2. Example:

Calculating the memory regions for a 5 GB executor:
To calculate Reserved Memory, User Memory, Spark Memory, Storage Memory, and Execution Memory, we use the following parameters:

spark.executor.memory=5g
spark.memory.fraction=0.6
spark.memory.storageFraction=0.5

Java Heap Memory       = 5 GB
                       = 5 * 1024 MB
                       = 5120 MB

Reserved Memory        = 300 MB

Usable Memory          = (Java Heap Memory - Reserved Memory)
                       = 5120 MB - 300 MB
                       = 4820 MB

User Memory            = Usable Memory * (1.0 - spark.memory.fraction)
                       = 4820 MB * (1.0 - 0.6) 
                       = 4820 MB * 0.4 
                       = 1928 MB

Spark Memory           = Usable Memory * spark.memory.fraction
                       = 4820 MB * 0.6 
                       = 2892 MB

Spark Storage Memory   = Spark Memory * spark.memory.storageFraction
                       = 2892 MB * 0.5 
                       = 1446 MB

Spark Execution Memory = Spark Memory * (1.0 - spark.memory.storageFraction)
                       = 2892 MB * ( 1 - 0.5) 
                       = 2892 MB * 0.5 
                       = 1446 MB

Share of the 5120 MB heap taken by each region:

Region             Size       Share
Reserved Memory    300 MB     5.86%
User Memory        1928 MB    37.66%
Spark Memory       2892 MB    56.48%
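
The arithmetic above can be sketched as a small Java helper. This is only an illustration of the worked example: the class `MemoryRegions` and its field names are hypothetical and are not part of Spark, and sizes are kept as plain megabyte values rather than bytes.

```java
import java.util.Locale;

/** Hypothetical helper (not a Spark class) that repeats the worked example. */
final class MemoryRegions {
    // Fixed reservation used in the example above.
    static final double RESERVED_MB = 300.0;

    final double usableMb;
    final double userMb;
    final double sparkMb;
    final double storageMb;
    final double executionMb;

    MemoryRegions(double heapMb, double memoryFraction, double storageFraction) {
        usableMb = heapMb - RESERVED_MB;                  // heap minus the reservation
        sparkMb = usableMb * memoryFraction;              // spark.memory.fraction
        userMb = usableMb * (1.0 - memoryFraction);       // the remainder of usable memory
        storageMb = sparkMb * storageFraction;            // spark.memory.storageFraction
        executionMb = sparkMb * (1.0 - storageFraction);
    }

    /** Share of the whole heap, in percent. */
    static double percentOf(double partMb, double heapMb) {
        return 100.0 * partMb / heapMb;
    }

    public static void main(String[] args) {
        double heapMb = 5 * 1024;
        MemoryRegions r = new MemoryRegions(heapMb, 0.6, 0.5);
        String row = "%-16s %8.1f MB %6.2f%%%n";
        System.out.printf(Locale.ROOT, row, "Reserved Memory", RESERVED_MB, percentOf(RESERVED_MB, heapMb));
        System.out.printf(Locale.ROOT, row, "User Memory", r.userMb, percentOf(r.userMb, heapMb));
        System.out.printf(Locale.ROOT, row, "Spark Memory", r.sparkMb, percentOf(r.sparkMb, heapMb));
        System.out.printf(Locale.ROOT, row, "Storage Memory", r.storageMb, percentOf(r.storageMb, heapMb));
        System.out.printf(Locale.ROOT, row, "Execution Memory", r.executionMb, percentOf(r.executionMb, heapMb));
    }
}
```

Changing the constructor arguments shows how a different spark.memory.fraction or spark.memory.storageFraction shifts the split without touching the 300 MB reservation.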

II. How Spark Memory Allocation Appears in the Spark UI

0. Background

1. Runtime.getRuntime.maxMemory (Max Memory)

--executor-memory 1024M    Max Memory    : 910.5 MB
--executor-memory 2048M    Max Memory    : 1820.5 MB

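Max Memory is roughly 89% of the configured value; the formula behind it is given below.

We set --executor-memory, yet the memory that the Spark executor obtains through Runtime.getRuntime.maxMemory is smaller than that. How is this value computed?
Runtime.getRuntime.maxMemory is the maximum amount of memory the program can use, and it is smaller than the configured executor memory. The heap is divided into Eden, Survivor, and Tenured spaces, and there are two Survivor regions of which only one can be used at any given time. This can be described with the following formulas: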

ExecutorMemory = Eden + 2 * Survivor + Tenured

Runtime.getRuntime.maxMemory =  Eden + Survivor + Tenured

The actual numbers may differ depending on your GC configuration, but the formulas stay the same.
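
To check these formulas on a given JVM, a minimal sketch can print Runtime.getRuntime().maxMemory() next to the maximum size of each heap memory pool, using the standard java.lang.management API. The class name `HeapPools` and the helper `visibleFraction` are hypothetical; the pool names and sizes reported depend on the garbage collector in use, so no specific output is assumed here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Hypothetical helper that lists heap pools alongside Runtime.maxMemory(). */
final class HeapPools {

    /** Fraction of the configured heap that maxMemory reports. */
    static double visibleFraction(double maxMemoryMb, double configuredMb) {
        return maxMemoryMb / configuredMb;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long maxMemory = Runtime.getRuntime().maxMemory();
        System.out.println("Runtime.maxMemory = " + (maxMemory / mb) + " MB");

        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the pool has no defined maximum
            String size = max < 0 ? "undefined" : (max / mb) + " MB";
            System.out.println(pool.getName() + " max = " + size);
        }

        // Ratio for the 1024M example quoted above (910.5 MB of 1024 MB).
        System.out.println("visible fraction = " + visibleFraction(910.5, 1024.0));
    }
}
```

Running it with the same -Xmx value under different collectors is one way to see how much the reported maximum depends on the GC configuration.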

2. Bytes-to-MB conversion rules in different versions
2.1 Spark 2.x
// 1 GB = 1000 MB

function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes == 0) return '0.0 B';
    var k = 1000;
    var dm = 1;
    var sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}

2.2 Spark 3.x
// 1 GiB = 1024 MiB

function formatBytes(bytes, type) {
  if (type !== 'display') return bytes;
  if (bytes <= 0) return '0.0 B';
  var k = 1024;
  var dm = 1;
  var sizes = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB'];
  var i = Math.floor(Math.log(bytes) / Math.log(k));
  return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
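
The effect of the two conversion rules is easiest to see by formatting the same byte count both ways. The Java sketch below ports the two JavaScript functions above; the class `ByteFormat` and its method names are hypothetical. The sample value 384093388 bytes is an illustrative figure, taken as (910.5 MiB - 300 MiB) * 0.6 from the Max Memory number quoted earlier.

```java
import java.util.Locale;

/** Hypothetical Java port of the two formatBytes variants shown above. */
final class ByteFormat {
    // A Java long tops out in the exabyte range, so ZB and YB are omitted.
    static final String[] DECIMAL = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};
    static final String[] BINARY = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

    static String format(long bytes, int k, String[] sizes) {
        if (bytes <= 0) {
            return "0.0 B";
        }
        // Pick the largest unit that keeps the value at 1 or above.
        int i = (int) Math.floor(Math.log(bytes) / Math.log(k));
        i = Math.min(i, sizes.length - 1);
        // Unlike parseFloat(...toFixed(1)) in JavaScript, this keeps a trailing ".0".
        return String.format(Locale.ROOT, "%.1f %s", bytes / Math.pow(k, i), sizes[i]);
    }

    /** Spark 2.x rule: k = 1000. */
    static String spark2(long bytes) {
        return format(bytes, 1000, DECIMAL);
    }

    /** Spark 3.x rule: k = 1024. */
    static String spark3(long bytes) {
        return format(bytes, 1024, BINARY);
    }

    public static void main(String[] args) {
        long bytes = 384093388L;
        System.out.println(spark2(bytes)); // 384.1 MB
        System.out.println(spark3(bytes)); // 366.3 MiB
    }
}
```

The same amount of memory therefore reads as a larger number under the 1000-based rule than under the 1024-based one, which matters when comparing UI values across Spark versions.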

1. Spark UI with On-Heap Memory

  • 1. Submit parameters
spark-shell \
    --driver-memory 1g \
    --executor-memory 1g
  • 2. Spark UI display
    [Screenshot: Spark UI showing the Storage Memory value]
  • 3. Storage Memory calculation
Java Heap Memory       = 1 GB

Runtime.getRuntime.maxMemory = 1024MB * 0.89 = 911.36MB

Reserved Memory        = 300 MB

Usable Memory          = 911.36MB - 300MB = 611.36MB

User Memory            = 611.36MB * 0.4 = 244.544MB

Spark Memory           = 611.36MB * 0.6 = 366.816MB

Spark Storage Memory   = 366.816MB * 0.5 = 183.408MB

Spark Execution Memory = 366.816MB * 0.5 = 183.408MB

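From the Spark UI, the Storage Memory value is 366.3 MB, which shows that:

Storage Memory = Spark Storage Memory + Spark Execution Memory = Spark Memory
               = 366.816 MB

The small gap between 366.816 MB and 366.3 MB comes from the 0.89 approximation: with the measured Max Memory of 910.5 MB, (910.5 MB - 300 MB) * 0.6 = 366.3 MB.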

2. Spark UI with Off-Heap Enabled

  • 1. Submit parameters
spark-shell \
    --driver-memory 1g \
    --executor-memory 1g \
    --conf spark.memory.offHeap.enabled=true \
    --conf spark.memory.offHeap.size=5g
  • 2. On Heap Memory
From the calculation above, Storage Memory = 366.3 MB
  • 3. Off Heap Memory
spark.memory.offHeap.size = 5 GB = 5 * 1024 MB = 5120 MB
  • 4. Storage Memory
Storage Memory = On Heap Memory + Off Heap Memory
               = 366.3 MB + 5120 MB
               = 5486.3 MB
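
The sum above can be reproduced in bytes with a short sketch. The class `StorageMemoryTotal` and its methods are hypothetical names used only for illustration. It assumes the Max Memory of 910.5 MB from the 1g example and simply adds the configured off-heap size to the on-heap Spark Memory, as the calculation above does.

```java
import java.util.Locale;

/** Hypothetical helper that adds on-heap Spark Memory and the off-heap size. */
final class StorageMemoryTotal {
    static final long MIB = 1024L * 1024L;
    static final long RESERVED_BYTES = 300L * MIB;

    /** On-heap Spark Memory: (max memory - reserved) * spark.memory.fraction, truncated. */
    static long onHeapUnifiedBytes(long maxMemoryBytes, double memoryFraction) {
        return (long) ((maxMemoryBytes - RESERVED_BYTES) * memoryFraction);
    }

    /** On-heap Spark Memory plus the value of spark.memory.offHeap.size. */
    static long totalBytes(long maxMemoryBytes, double memoryFraction, long offHeapBytes) {
        return onHeapUnifiedBytes(maxMemoryBytes, memoryFraction) + offHeapBytes;
    }

    public static void main(String[] args) {
        long maxMemory = (long) (910.5 * MIB); // Max Memory reported for --executor-memory 1g
        long offHeap = 5L * 1024L * MIB;       // spark.memory.offHeap.size=5g
        long onHeap = onHeapUnifiedBytes(maxMemory, 0.6);
        long total = totalBytes(maxMemory, 0.6, offHeap);

        System.out.printf(Locale.ROOT, "On heap  = %.1f MB%n", onHeap / (double) MIB);  // 366.3 MB
        System.out.printf(Locale.ROOT, "Off heap = %.1f MB%n", offHeap / (double) MIB); // 5120.0 MB
        System.out.printf(Locale.ROOT, "Total    = %.1f MB%n", total / (double) MIB);   // 5486.3 MB
    }
}
```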

3. Demo program for calculating Spark Storage Memory

// JVM arguments: -Xmx5g
public class SparkMemoryCalculation {

    private static final long MB = 1024 * 1024;
    // Fixed reservation subtracted from the heap before any split.
    private static final long RESERVED_SYSTEM_MEMORY_BYTES = 300 * MB;
    // Mirrors spark.memory.storageFraction.
    private static final double SPARK_MEMORY_STORAGE_FRACTION = 0.5;
    // Mirrors spark.memory.fraction.
    private static final double SPARK_MEMORY_FRACTION = 0.6;

    public static void main(String[] args) {

        // Heap size as seen by the JVM; usually somewhat below -Xmx.
        long systemMemory = Runtime.getRuntime().maxMemory();
        long usableMemory = systemMemory - RESERVED_SYSTEM_MEMORY_BYTES;
        long sparkMemory = convertDoubleToLong(usableMemory * SPARK_MEMORY_FRACTION);
        long userMemory = convertDoubleToLong(usableMemory * (1 - SPARK_MEMORY_FRACTION));

        long storageMemory = convertDoubleToLong(sparkMemory * SPARK_MEMORY_STORAGE_FRACTION);
        long executionMemory = convertDoubleToLong(sparkMemory * (1 - SPARK_MEMORY_STORAGE_FRACTION));

        // Sizes in binary megabytes (1 MB = 1024 * 1024 bytes).
        printMemoryInMB("Heap Memory\t\t", systemMemory);
        printMemoryInMB("Reserved Memory", RESERVED_SYSTEM_MEMORY_BYTES);
        printMemoryInMB("Usable Memory\t", usableMemory);
        printMemoryInMB("User Memory\t\t", userMemory);
        printMemoryInMB("Spark Memory\t", sparkMemory);

        printMemoryInMB("Storage Memory\t", storageMemory);
        printMemoryInMB("Execution Memory", executionMemory);

        System.out.println();
        // The same sizes in decimal megabytes (1 MB = 1000 * 1000 bytes), the Spark 2.x UI rule.
        // The first line is the whole Spark Memory region, which the UI reports as Storage Memory.
        printStorageMemoryInMB("Spark Storage Memory", sparkMemory);
        printStorageMemoryInMB("Storage Memory UI \t", storageMemory);
        printStorageMemoryInMB("Execution Memory UI", executionMemory);
    }

    private static void printMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / MB) + " MB");
    }

    private static void printStorageMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / (1000 * 1000)) + " MB");
    }

    // Truncates toward zero, as the deprecated new Double(val).longValue() did.
    private static long convertDoubleToLong(double val) {
        return (long) val;
    }
}

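Summary

Reference:
https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794
The referenced article is written like a work of art.
Finally, a quote for everyone:
"Knowledge, even the mere shadow of knowledge, will become your armor and protect you from being devoured by ignorance." (from Zhihu, "Why Read Books?")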
