Preface
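Flink's memory manager governs the memory needed for sorting, hashing, and caching. Memory is represented as equally sized memory segments (MemorySegment), also called memory pages. Operators allocate memory by requesting a number of memory pages. In Flink, memory is divided into heap memory and off-heap memory; which type is requested is controlled by configuration parameters.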
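To make the heap versus off-heap distinction concrete, here is a minimal sketch that uses only the JDK's ByteBuffer API. The class name HeapVsOffHeapDemo and the 32 KB page size are assumptions chosen for illustration and are not taken from Flink.

```java
import java.nio.ByteBuffer;

/** Illustrative only: contrasts a heap-backed page with an off-heap (direct) page. */
final class HeapVsOffHeapDemo {
    // Assumed page size for this sketch; not read from any Flink configuration.
    static final int SEGMENT_SIZE = 32 * 1024;

    /** A heap page: a plain byte[] that lives on the JVM heap and is managed by the GC. */
    static ByteBuffer heapPage() {
        return ByteBuffer.wrap(new byte[SEGMENT_SIZE]);
    }

    /** An off-heap page: a direct buffer whose memory lives outside the JVM heap. */
    static ByteBuffer offHeapPage() {
        return ByteBuffer.allocateDirect(SEGMENT_SIZE);
    }

    public static void main(String[] args) {
        ByteBuffer heap = heapPage();
        ByteBuffer offHeap = offHeapPage();
        System.out.println("heap page:     direct=" + heap.isDirect() + ", capacity=" + heap.capacity());
        System.out.println("off-heap page: direct=" + offHeap.isDirect() + ", capacity=" + offHeap.capacity());
    }
}
```

Both pages have the same capacity; the difference is where the bytes live, which is why the two pool implementations discussed in this article store byte[] and ByteBuffer respectively.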
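The memory manager can either pre-allocate all memory or allocate it on demand. With pre-allocation, the memory is occupied and reserved from startup, which means that no OutOfMemoryError can occur when memory is requested, and released memory is returned to the memory manager's pool. With on-demand allocation, the memory manager only tracks how many memory segments are currently allocated (bookkeeping only). Releasing a memory page does not add it back to a pool; instead, the garbage collector reclaims it.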
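The two allocation modes described above can be sketched roughly as follows. SimplePageManager is a hypothetical, heavily simplified class written for this illustration; it is not Flink's MemoryManager, and its method names are assumptions.

```java
import java.util.ArrayDeque;

/** Hypothetical, simplified page manager; illustrates the two modes only. */
final class SimplePageManager {
    private final ArrayDeque<byte[]> pool = new ArrayDeque<>();
    private final boolean preAllocate;
    private final int pageSize;
    private final int totalPages;
    private int pagesInUse; // bookkeeping of pages currently handed out

    SimplePageManager(int totalPages, int pageSize, boolean preAllocate) {
        this.totalPages = totalPages;
        this.pageSize = pageSize;
        this.preAllocate = preAllocate;
        if (preAllocate) {
            // Pre-allocation: reserve every page up front.
            for (int i = 0; i < totalPages; i++) {
                pool.add(new byte[pageSize]);
            }
        }
    }

    byte[] allocate() {
        if (pagesInUse >= totalPages) {
            throw new IllegalStateException("no more pages available");
        }
        pagesInUse++;
        // Pre-allocated mode takes a page from the pool; on-demand mode creates one now.
        return preAllocate ? pool.remove() : new byte[pageSize];
    }

    void release(byte[] page) {
        pagesInUse--;
        if (preAllocate) {
            pool.add(page); // the page goes back into the pool for reuse
        }
        // On-demand mode: only the counter changes; the garbage collector
        // reclaims the page once the caller drops its reference.
    }

    int pooledPages() { return pool.size(); }

    int pagesInUse() { return pagesInUse; }
}
```

In pre-allocated mode the pool size shrinks and grows as pages are handed out and returned, whereas in on-demand mode the pool stays empty and only the counter moves.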
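Let's now dissect memory management in the TaskManager.
1. Definitions of heap and off-heap memory
Flink's memory is divided into heap memory and off-heap memory. The pools for both extend a common abstract class, MemoryPool, so let's look at its definition first.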
// ------------------------------------------------------------------------
// Abstract memory pool class
// ------------------------------------------------------------------------
abstract static class MemoryPool {
    // Returns the number of memory pages currently available in the pool
    abstract int getNumberOfAvailableMemorySegments();
    // Allocates a new memory page that is not taken from the pool
    abstract MemorySegment allocateNewSegment(Object owner);
    // Requests a memory page from the pool
    abstract MemorySegment requestSegmentFromPool(Object owner);
    // Returns a memory page to the pool
    abstract void returnSegmentToPool(MemorySegment segment);
    // Empties the pool
    abstract void clear();
}
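As the code shows, MemoryPool is only an abstract class; for the concrete implementations we need to read on.
The class that defines the heap memory pool: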
static final class HybridHeapMemoryPool extends MemoryPool {
    /** Container holding the pooled memory segments (as raw byte arrays). */
    private final ArrayDeque<byte[]> availableMemory;
    private final int segmentSize;

    HybridHeapMemoryPool(int numInitialSegments, int segmentSize) {
        this.availableMemory = new ArrayDeque<>(numInitialSegments);
        this.segmentSize = segmentSize;
        // Eagerly create all initial segments on the heap
        for (int i = 0; i < numInitialSegments; i++) {
            this.availableMemory.add(new byte[segmentSize]);
        }
    }

    @Override
    MemorySegment allocateNewSegment(Object owner) {
        return MemorySegmentFactory.allocateUnpooledSegment(segmentSize, owner);
    }

    @Override
    MemorySegment requestSegmentFromPool(Object owner) {
        byte[] buf = availableMemory.remove();
        return MemorySegmentFactory.wrapPooledHeapMemory(buf, owner);
    }

    @Override
    void returnSegmentToPool(MemorySegment segment) {
        if (segment.getClass() == HybridMemorySegment.class) {
            HybridMemorySegment heapSegment = (HybridMemorySegment) segment;
            // Put the backing array back into the pool, then invalidate the segment
            availableMemory.add(heapSegment.getArray());
            heapSegment.free();
        } else {
            throw new IllegalArgumentException("Memory segment is not a " + HybridMemorySegment.class.getSimpleName());
        }
    }

    @Override
    protected int getNumberOfAvailableMemorySegments() {
        return availableMemory.size();
    }

    @Override
    void clear() {
        availableMemory.clear();
    }
}
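The request/return cycle of the heap pool above can be tried out in isolation with the following sketch. DemoSegment and DemoHeapPool are hypothetical stand-ins written for this illustration; they only mimic the shape of HybridMemorySegment and HybridHeapMemoryPool and do not use Flink's MemorySegmentFactory.

```java
import java.util.ArrayDeque;

/** Hypothetical minimal segment wrapper; Flink's real segment class is far richer. */
final class DemoSegment {
    private byte[] memory;
    private final Object owner;

    DemoSegment(byte[] memory, Object owner) {
        this.memory = memory;
        this.owner = owner;
    }

    byte[] getArray() {
        if (memory == null) {
            throw new IllegalStateException("segment has been freed");
        }
        return memory;
    }

    Object getOwner() { return owner; }

    /** Drops the reference so the segment can no longer touch the pooled array. */
    void free() { memory = null; }

    boolean isFreed() { return memory == null; }
}

/** Hypothetical pool mirroring the structure of the heap pool shown above. */
final class DemoHeapPool {
    private final ArrayDeque<byte[]> availableMemory;

    DemoHeapPool(int numInitialSegments, int segmentSize) {
        this.availableMemory = new ArrayDeque<>(numInitialSegments);
        for (int i = 0; i < numInitialSegments; i++) {
            availableMemory.add(new byte[segmentSize]);
        }
    }

    DemoSegment requestSegmentFromPool(Object owner) {
        // ArrayDeque.remove() takes from the head and throws
        // NoSuchElementException when the pool is empty.
        return new DemoSegment(availableMemory.remove(), owner);
    }

    void returnSegmentToPool(DemoSegment segment) {
        availableMemory.add(segment.getArray()); // recycle the backing array at the tail
        segment.free();                          // invalidate the old wrapper
    }

    int getNumberOfAvailableMemorySegments() {
        return availableMemory.size();
    }
}
```

The point of the sketch is that the pool recycles the raw byte arrays rather than the wrapper objects: a returned wrapper is freed, and the next request wraps the same array in a fresh segment.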
Now let's look at the definition of the off-heap memory pool:
static final class HybridOffHeapMemoryPool extends MemoryPool {
    /** Container holding the pooled memory segments (as direct ByteBuffers). */
    private final ArrayDeque<ByteBuffer> availableMemory;
    private final int segmentSize;

    HybridOffHeapMemoryPool(int numInitialSegments, int segmentSize) {
        this.availableMemory = new ArrayDeque<>(numInitialSegments);
        this.segmentSize = segmentSize;
        // Eagerly create all initial segments outside the JVM heap
        for (int i = 0; i < numInitialSegments; i++) {
            this.availableMemory.add(ByteBuffer.allocateDirect(segmentSize));
        }
    }

    @Override
    MemorySegment allocateNewSegment(Object owner) {
        return MemorySegmentFactory.allocateUnpooledOffHeapMemory(segmentSize, owner);
    }

    @Override
    MemorySegment requestSegmentFromPool(Object owner) {
        ByteBuffer buf = availableMemory.remove();
        return MemorySegmentFactory.wrapPooledOffHeapMemory(buf, owner);
    }

    @Override
    void returnSegmentToPool(MemorySegment segment) {
        if (segment.getClass() == HybridMemorySegment.class) {
            HybridMemorySegment hybridSegment = (HybridMemorySegment) segment;
ByteBuffer b