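I. Flink's Memory Mechanism
The earlier discussion of the slot mechanism mentioned memory sharing; this article looks at how Flink manages memory. Flink abstracts memory and builds its own management mechanism on top of that abstraction. Flink is written mostly in Java, so in principle relying on the JVM's own memory management would be simpler and more convenient. Why, then, does Flink build its own memory abstraction?
The answer starts with Flink's use case. Flink was built for big data, and big-data workloads consume large amounts of memory. JVM memory management is a general-purpose design that balances many needs and cannot be reshaped for big data alone, which leaves Flink very little flexibility. Frequent GC can also pause the application for long periods, which is unacceptable in big-data scenarios.
Flink's memory design has two parts: the basic data structures and the memory management mechanism. The memory management mechanism in turn abstracts memory into two types, heap and off-heap:
HEAP: JVM heap memory
OFF_HEAP: off-heap memory
The main goals of Flink's memory management are to reduce GC pressure, reduce OOM errors, use space efficiently for big-data workloads, and support operating directly on binary data. It also provides good support for the memory buffers that big-data processing needs. Network communication and task execution alike ultimately go through this allocation mechanism. To optimize memory use, Flink uses off-heap memory, which is covered in detail in the JVM documentation for interested readers. Off-heap memory relaxes the limits on allocation (in both time and space) and reduces GC work. It also lets the heap shrink dramatically (only the Remaining Heap needs to be allocated), so scaling a TaskManager to hundreds of GB of memory is not a problem. For I/O, off-heap memory allows zero-copy, whereas heap memory requires at least one copy. Note also that off-heap memory can be shared between processes.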
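To make the HEAP / OFF_HEAP distinction concrete, here is a minimal sketch using only the JDK's java.nio.ByteBuffer. It is not Flink code; the class name HeapVsOffHeapDemo and its helper methods are made up for illustration.

```java
import java.nio.ByteBuffer;

class HeapVsOffHeapDemo {
    // Heap buffer: the bytes live in a byte[] that the garbage collector manages.
    static ByteBuffer allocateHeap(int size) {
        return ByteBuffer.allocate(size);
    }

    // Direct buffer: the bytes live outside the Java heap.
    static ByteBuffer allocateOffHeap(int size) {
        return ByteBuffer.allocateDirect(size);
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocateHeap(32 * 1024);
        ByteBuffer offHeap = allocateOffHeap(32 * 1024);

        // Both kinds of buffer expose the same read/write API.
        heap.putLong(0, 42L);
        offHeap.putLong(0, 42L);

        System.out.println("heap:     direct=" + heap.isDirect() + ", hasArray=" + heap.hasArray());
        System.out.println("off-heap: direct=" + offHeap.isDirect() + ", hasArray=" + offHeap.hasArray());
    }
}
```

The heap buffer is backed by an accessible byte[]; the direct buffer is not, which is why code that wants to treat both uniformly needs an abstraction such as the MemorySegment discussed later.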
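II. Categories of Memory Use
The basic idea behind Flink's memory model is an abstract memory pool: pre-allocate a large amount of memory and reuse it as much as possible, with a uniform segment size to improve efficiency. Memory falls into the following three categories:
1. Memory pool
The managed memory pool is a very large collection of MemorySegments. Flink's algorithms, such as sorting and data cleansing, request segments from it and store their serialized data inside them. These segments can be allocated and reused repeatedly. This pool normally accounts for the majority of total memory.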
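The pre-allocate-and-reuse idea can be sketched in a few lines. This is a simplified stand-in that uses plain byte[] segments, not Flink's MemoryManager; SimpleSegmentPool and its method names are hypothetical.

```java
import java.util.ArrayDeque;

class SimpleSegmentPool {
    private final ArrayDeque<byte[]> available;
    private final int segmentSize;

    // Pre-allocates every segment up front so no allocation happens later.
    SimpleSegmentPool(int numSegments, int segmentSize) {
        this.segmentSize = segmentSize;
        this.available = new ArrayDeque<>(numSegments);
        for (int i = 0; i < numSegments; i++) {
            available.add(new byte[segmentSize]);
        }
    }

    // Hands out a pooled segment, failing fast when the pool is empty.
    byte[] request() {
        byte[] segment = available.poll();
        if (segment == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return segment;
    }

    // Returns a segment to the pool so that it can be reused.
    void release(byte[] segment) {
        if (segment.length != segmentSize) {
            throw new IllegalArgumentException("segment does not belong to this pool");
        }
        available.add(segment);
    }

    int availableCount() {
        return available.size();
    }
}
```

Because segments are recycled rather than dropped, the garbage collector sees a small, stable set of long-lived arrays instead of a constant stream of short-lived objects.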
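2. Network buffers
Data exchange in big-data workloads inevitably takes up a considerable amount of memory. Flink's network buffers are a pre-allocated set of 32 KB buffers, somewhat like the I/O buffers in a network card driver. They are allocated when the TaskManager starts. The default count is 2048 (2K) and can be configured through taskmanager.network.numberOfBuffers.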
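As a quick sanity check on what these defaults cost, the arithmetic can be written out. The helper below is purely illustrative and uses only the two figures quoted above (2048 buffers of 32 KB each).

```java
class NetworkBufferMath {
    // Total bytes reserved for network buffers; widened to long to avoid int overflow.
    static long totalBytes(int numberOfBuffers, int bufferSizeBytes) {
        return (long) numberOfBuffers * bufferSizeBytes;
    }

    public static void main(String[] args) {
        long bytes = totalBytes(2048, 32 * 1024);
        // 2048 * 32 KB = 64 MB
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```

With the defaults, each TaskManager sets aside 64 MB for network buffers; raising the buffer count scales that figure linearly.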
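3. User memory
This memory is mainly used for basic data structures (such as the TaskManager's own) and for the memory that user code needs.
III. Abstract Data Structures for Memory
1. MemorySegment
A MemorySegment can be thought of as a chunk of memory carved out of the JVM heap, although the second constructor shown here also accepts an off-heap address. It is the abstract base class: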
@Internal
public abstract class MemorySegment {
......
MemorySegment(byte[] buffer, Object owner) {
if (buffer == null) {
throw new NullPointerException("buffer");
}
this.heapMemory = buffer;
this.address = BYTE_ARRAY_BASE_OFFSET;
this.size = buffer.length;
this.addressLimit = this.address + this.size;
this.owner = owner;
}
MemorySegment(long offHeapAddress, int size, Object owner) {
if (offHeapAddress <= 0) {
throw new IllegalArgumentException("negative pointer or size");
}
if (offHeapAddress >= Long.MAX_VALUE - Integer.MAX_VALUE) {
// this is necessary to make sure the collapsed checks are safe against numeric overflows
throw new IllegalArgumentException("Segment initialized with too large address: " + offHeapAddress
+ " ; Max allowed address is " + (Long.MAX_VALUE - Integer.MAX_VALUE - 1));
}
this.heapMemory = null;
this.address = offHeapAddress;
this.addressLimit = this.address + size;
this.size = size;
this.owner = owner;
}
......
public byte[] getArray() {
if (heapMemory != null) {
return heapMemory;
} else {
throw new IllegalStateException("Memory segment does not represent heap memory");
}
}
......
public abstract byte get(int index);
public abstract void put(int index, byte b);
public abstract void get(int index, byte[] dst);
public abstract void put(int index, byte[] src);
public abstract void get(int index, byte[] dst, int offset, int length);
......
@SuppressWarnings("restriction")
public final char getChar(int index) {
final long pos = address + index;
if (index >= 0 && pos <= addressLimit - 2) {
return UNSAFE.getChar(heapMemory, pos);
}
else if (address > addressLimit) {
throw new IllegalStateException("This segment has been freed.");
}
else {
// index is in fact invalid
throw new IndexOutOfBoundsException();
}
}
......
@SuppressWarnings("restriction")
public final void putChar(int index, char value) {
final long pos = address + index;
if (index >= 0 && pos <= addressLimit - 2) {
UNSAFE.putChar(heapMemory, pos, value);
}
else if (address > addressLimit) {
throw new IllegalStateException("segment has been freed");
}
else {
// index is in fact invalid
throw new IndexOutOfBoundsException();
}
}
......
public final void copyToUnsafe(int offset, Object target, int targetPointer, int numBytes) {
final long thisPointer = this.address + offset;
if (thisPointer + numBytes > addressLimit) {
throw new IndexOutOfBoundsException(
String.format("offset=%d, numBytes=%d, address=%d",
offset, numBytes, this.address));
}
UNSAFE.copyMemory(this.heapMemory, thisPointer, target, targetPointer, numBytes);
}
public final void copyFromUnsafe(int offset, Object source, int sourcePointer, int numBytes) {
final long thisPointer = this.address + offset;
if (thisPointer + numBytes > addressLimit) {
throw new IndexOutOfBoundsException(
String.format("offset=%d, numBytes=%d, address=%d",
offset, numBytes, this.address));
}
UNSAFE.copyMemory(source, sourcePointer, this.heapMemory, thisPointer, numBytes);
}
public final int compare(MemorySegment seg2, int offset1, int offset2, int len) {
while (len >= 8) {
long l1 = this.getLongBigEndian(offset1);
long l2 = seg2.getLongBigEndian(offset2);
if (l1 != l2) {
return (l1 < l2) ^ (l1 < 0) ^ (l2 < 0) ? -1 : 1;
}
offset1 += 8;
offset2 += 8;
len -= 8;
}
while (len > 0) {
int b1 = this.get(offset1) & 0xff;
int b2 = seg2.get(offset2) & 0xff;
int cmp = b1 - b2;
if (cmp != 0) {
return cmp;
}
offset1++;
offset2++;
len--;
}
return 0;
}
public final void swapBytes(byte[] tempBuffer, MemorySegment seg2, int offset1, int offset2, int len) {
if ((offset1 | offset2 | len | (tempBuffer.length - len)) >= 0) {
final long thisPos = this.address + offset1;
final long otherPos = seg2.address + offset2;
if (thisPos <= this.addressLimit - len && otherPos <= seg2.addressLimit - len) {
// this -> temp buffer
UNSAFE.copyMemory(this.heapMemory, thisPos, tempBuffer, BYTE_ARRAY_BASE_OFFSET, len);
// other -> this
UNSAFE.copyMemory(seg2.heapMemory, otherPos, this.heapMemory, thisPos, len);
// temp buffer -> other
UNSAFE.copyMemory(tempBuffer, BYTE_ARRAY_BASE_OFFSET, seg2.heapMemory, otherPos, len);
return;
}
else if (this.address > this.addressLimit) {
throw new IllegalStateException("this memory segment has been freed.");
}
else if (seg2.address > seg2.addressLimit) {
throw new IllegalStateException("other memory segment has been freed.");
}
}
// index is in fact invalid
throw new IndexOutOfBoundsException(
String.format("offset1=%d, offset2=%d, len=%d, bufferSize=%d, address1=%d, address2=%d",
offset1, offset2, len, tempBuffer.length, this.address, seg2.address));
}
......
}
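The sign trick in compare deserves a closer look. Reading 8 bytes as a big-endian long and comparing the longs as unsigned values gives the same ordering as comparing the bytes one by one, and the expression (l1 < l2) ^ (l1 < 0) ^ (l2 < 0) is an unsigned comparison written with signed operators. The sketch below isolates that expression and checks it against the JDK's Long.compareUnsigned; the class and method names are made up for illustration.

```java
class UnsignedCompareDemo {
    // Same expression as in MemorySegment.compare, extended to return 0 for equal inputs.
    static int compareAsInSegment(long l1, long l2) {
        if (l1 == l2) {
            return 0;
        }
        // A signed "<" gives the wrong answer exactly when the two signs differ,
        // so XOR-ing in both sign bits flips the result in precisely those cases.
        return (l1 < l2) ^ (l1 < 0) ^ (l2 < 0) ? -1 : 1;
    }

    // Reference result from the JDK, normalized to -1, 0 or 1.
    static int reference(long l1, long l2) {
        return Integer.signum(Long.compareUnsigned(l1, l2));
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, -1L, Long.MIN_VALUE, Long.MAX_VALUE, 0x0102030405060708L};
        for (long a : samples) {
            for (long b : samples) {
                if (compareAsInSegment(a, b) != reference(a, b)) {
                    throw new AssertionError(a + " vs " + b);
                }
            }
        }
        System.out.println("all sample pairs agree with Long.compareUnsigned");
    }
}
```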
2. HybridMemorySegment
This segment may or may not be backed by heap memory (on-heap or off-heap).
@Internal
public final class HybridMemorySegment extends MemorySegment {
private final ByteBuffer offHeapBuffer;
HybridMemorySegment(ByteBuffer buffer) {
this(buffer, null);
}
HybridMemorySegment(ByteBuffer buffer, Object owner) {
super(checkBufferAndGetAddress(buffer), buffer.capacity(), owner);
this.offHeapBuffer = buffer;
}
HybridMemorySegment(byte[] buffer) {
this(buffer, null);
}
HybridMemorySegment(byte[] buffer, Object owner) {
super(buffer, owner);
this.offHeapBuffer = null;
}
public ByteBuffer getOffHeapBuffer() {
if (offHeapBuffer != null) {
return offHeapBuffer;
} else {
throw new IllegalStateException("Memory segment does not represent off heap memory");
}
}
@Override
public ByteBuffer wrap(int offset, int length) {
if (address <= addressLimit) {
if (heapMemory != null) {
return ByteBuffer.wrap(heapMemory, offset, length);
}
else {
try {
ByteBuffer wrapper = offHeapBuffer.duplicate();
wrapper.limit(offset + length);
wrapper.position(offset);
return wrapper;
}
catch (IllegalArgumentException e) {
throw new IndexOutOfBoundsException();
}
}
}
else {
throw new IllegalStateException("segment has been freed");
}
}
@Override
public final byte get(int index) {
final long pos = address + index;
if (index >= 0 && pos < addressLimit) {
return UNSAFE.getByte(heapMemory, pos);
}
else if (address > addressLimit) {
throw new IllegalStateException("segment has been freed");
}
else {
// index is in fact invalid
throw new IndexOutOfBoundsException();
}
}
@Override
public final void put(int index, byte b) {
final long pos = address + index;
if (index >= 0 && pos < addressLimit) {
UNSAFE.putByte(heapMemory, pos, b);
}
else if (address > addressLimit) {
throw new IllegalStateException("segment has been freed");
}
else {
// index is in fact invalid
throw new IndexOutOfBoundsException();
}
}
@Override
public final void get(int index, byte[] dst) {
get(index, dst, 0, dst.length);
}
@Override
public final void put(int index, byte[] src) {
put(index, src, 0, src.length);
}
......
@Override
public final boolean getBoolean(int index) {
return get(index) != 0;
}
@Override
public final void putBoolean(int index, boolean value) {
put(index, (byte) (value ? 1 : 0));
}
@Override
public final void get(DataOutput out, int offset, int length) throws IOException {
if (address <= addressLimit) {
if (heapMemory != null) {
out.write(heapMemory, offset, length);
}
else {
while (length >= 8) {
out.writeLong(getLongBigEndian(offset));
offset += 8;
length -= 8;
}
while (length > 0) {
out.writeByte(get(offset));
offset++;
length--;
}
}
}
else {
throw new IllegalStateException("segment has been freed");
}
}
@Override
public final void put(DataInput in, int offset, int length) throws IOException {
if (address <= addressLimit) {
if (heapMemory != null) {
in.readFully(heapMemory, offset, length);
}
else {
while (length >= 8) {
putLongBigEndian(offset, in.readLong());
offset += 8;
length -= 8;
}
while (length > 0) {
put(offset, in.readByte());
offset++;
length--;
}
}
}
else {
throw new IllegalStateException("segment has been freed");
}
}
......
}
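In essence, the class implements a large number of overloads for the relevant data types and interfaces. The data structure relies on unsafe pointers, and a single object may be backed by either heap or off-heap memory. How does it tell the two apart and handle each one? It uses the family of unsafe methods that the JVM exposes through UNSAFE, which behave as follows:
If the object argument is not null and the address that follows is a relative offset, the method operates on that offset within the given object (for example, an array). This is the on-heap case. If the object argument is null and the address is the absolute address of a memory block, the method operates on that memory block directly. Because the object is null, the block is not JVM heap memory. This is the off-heap case.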
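The two addressing modes can be tried out directly. The sketch below is an illustration, not Flink code: it assumes a JDK where sun.misc.Unsafe is still reachable through reflection (recent JDKs deprecate these methods and may print warnings), and the class UnsafeDualModeDemo and its helpers are hypothetical.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class UnsafeDualModeDemo {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Offset of element 0 inside a byte[] object, relative to the object start.
    private static final long BYTE_ARRAY_BASE_OFFSET = UNSAFE.arrayBaseOffset(byte[].class);

    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // On-heap: the base object is the array and the address is a relative offset.
    // No bounds check is performed here, so the caller must pass a valid index.
    static byte readOnHeap(byte[] array, int index) {
        return UNSAFE.getByte(array, BYTE_ARRAY_BASE_OFFSET + index);
    }

    // Off-heap: the base object is null and the address is absolute.
    static byte writeThenReadOffHeap(byte value) {
        long address = UNSAFE.allocateMemory(8);
        try {
            UNSAFE.putByte(null, address, value);
            return UNSAFE.getByte(null, address);
        } finally {
            UNSAFE.freeMemory(address);
        }
    }

    public static void main(String[] args) {
        System.out.println(readOnHeap(new byte[] {10, 20, 30}, 1));
        System.out.println(writeThenReadOffHeap((byte) 7));
    }
}
```

The same getByte(Object, long) call serves both cases, which is what lets HybridMemorySegment keep a single code path: for a heap segment heapMemory is the array and address starts at the array base offset, while for an off-heap segment heapMemory is null and address is the raw pointer.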
3. HeapMemorySegment
This class extends MemorySegment:
@SuppressWarnings("unused")
@Internal
public final class HeapMemorySegment extends MemorySegment {
private byte[] memory;
HeapMemorySegment(byte[] memory) {
this(memory, null);
}
HeapMemorySegment(byte[] memory, Object owner) {
super(Objects.requireNonNull(memory), owner);
this.memory = memory;
}
@Override
public void free() {
super.free();
this.memory = null;
}
@Override
public ByteBuffer wrap(int offset, int length) {
try {
return ByteBuffer.wrap(this.memory, offset, length);
}
catch (NullPointerException e) {
throw new IllegalStateException("segment has been freed");
}
}
public byte[] getArray() {
return this.heapMemory;
}
@Override
public final byte get(int index) {
return this.memory[index];
}
@Override
public final void put(int index, byte b) {
this.memory[index] = b;
}
@Override
public final void get(int index, byte[] dst) {
get(index, dst, 0, dst.length);
}
@Override
public final void put(int index, byte[] src) {
put(index, src, 0, src.length);
}
@Override
public final void get(int index, byte[] dst, int offset, int length) {
// system arraycopy does the boundary checks anyways, no need to check extra
System.arraycopy(this.memory, index, dst, offset, length);
}
@Override
public final void put(int index, byte[] src, int offset, int length) {
// system arraycopy does the boundary checks anyways, no need to check extra
// uses the JVM's array copy mechanism
System.arraycopy(src, offset, this.memory, index, length);
}
@Override
public final boolean getBoolean(int index) {
return this.memory[index] != 0;
}
@Override
public final void putBoolean(int index, boolean value) {
this.memory[index] = (byte) (value ? 1 : 0);
}
@Override
public final void get(DataOutput out, int offset, int length) throws IOException {
out.write(this.memory, offset, length);
}
@Override
public final void put(DataInput in, int offset, int length) throws IOException {
in.readFully(this.memory, offset, length);
}
@Override
public final void get(int offset, ByteBuffer target, int numBytes) {
// ByteBuffer performs the boundary checks
target.put(this.memory, offset, numBytes);
}
@Override
public final void put(int offset, ByteBuffer source, int numBytes) {
// ByteBuffer performs the boundary checks
source.get(this.memory, offset, numBytes);
}
......
public static final HeapMemorySegmentFactory FACTORY = new HeapMemorySegmentFactory();
}
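As the last line of the code shows, the factory is provided through a HeapMemorySegmentFactory instance.
4. MemorySegmentFactory
This is the factory that manages segment allocation. Flink discourages creating a MemorySegment by hand and recommends going through this class instead.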
@Internal
public final class MemorySegmentFactory {
public static MemorySegment wrap(byte[] buffer) {
return new HybridMemorySegment(buffer);
}
public static MemorySegment allocateUnpooledSegment(int size) {
return allocateUnpooledSegment(size, null);
}
public static MemorySegment allocateUnpooledSegment(int size, Object owner) {
return new HybridMemorySegment(new byte[size], owner);
}
public static MemorySegment allocateUnpooledOffHeapMemory(int size, Object owner) {
ByteBuffer memory = ByteBuffer.allocateDirect(size);
return wrapPooledOffHeapMemory(memory, owner);
}
public static MemorySegment wrapPooledHeapMemory(byte[] memory, Object owner) {
return new HybridMemorySegment(memory, owner);
}
public static MemorySegment wrapPooledOffHeapMemory(ByteBuffer memory, Object owner) {
return new HybridMemorySegment(memory, owner);
}
}
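The factory creates MemorySegment instances for the different cases, which is its whole purpose.
5. View (DataView)
A view is a higher-level abstraction built on top of MemorySegment. Like a view over data elsewhere, its purpose is to isolate callers from the underlying storage: DataInputView is the read side and DataOutputView is the write side.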
public interface DataInputView extends DataInput {
void skipBytesToRead(int numBytes) throws IOException;
int read(byte[] b, int off, int len) throws IOException;
int read(byte[] b) throws IOException;
}
public interface DataOutputView extends DataOutput {
void skipBytesToWrite(int numBytes) throws IOException;
void write(DataInputView source, int numBytes) throws IOException;
}
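To show what such a view buys you, here is a minimal sketch of a writer and a reader that present several fixed-size pages as one continuous stream. It is only an illustration of the idea: it uses plain byte[] pages instead of MemorySegment, and the class names PagedOutput and PagedInput are hypothetical, not Flink's implementations of these interfaces.

```java
import java.util.ArrayList;
import java.util.List;

class PagedViewDemo {
    // Write side: callers write values and never see page boundaries.
    static final class PagedOutput {
        private final int pageSize;
        private final List<byte[]> pages = new ArrayList<>();
        private int posInPage;

        PagedOutput(int pageSize) {
            this.pageSize = pageSize;
            this.posInPage = pageSize; // forces a page to be added on the first write
        }

        void writeByte(int b) {
            if (posInPage == pageSize) {
                pages.add(new byte[pageSize]);
                posInPage = 0;
            }
            pages.get(pages.size() - 1)[posInPage++] = (byte) b;
        }

        // Big-endian, so a value may straddle two pages.
        void writeInt(int v) {
            writeByte(v >>> 24);
            writeByte(v >>> 16);
            writeByte(v >>> 8);
            writeByte(v);
        }

        List<byte[]> pages() {
            return pages;
        }
    }

    // Read side: walks the same pages as one logical byte sequence.
    static final class PagedInput {
        private final List<byte[]> pages;
        private final int pageSize;
        private long position;

        PagedInput(List<byte[]> pages, int pageSize) {
            this.pages = pages;
            this.pageSize = pageSize;
        }

        int readByte() {
            byte[] page = pages.get((int) (position / pageSize));
            int value = page[(int) (position % pageSize)] & 0xff;
            position++;
            return value;
        }

        int readInt() {
            int b0 = readByte();
            int b1 = readByte();
            int b2 = readByte();
            int b3 = readByte();
            return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
        }
    }
}
```

With a page size of 3, two ints (8 bytes) occupy three pages and both values cross a page boundary, yet the caller only ever calls writeInt and readInt.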
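IV. Memory Mapping and Applications
1. Binary processing
Binary processing is one application of the serialization described below. Serialization produces a binary data stream, which is pushed to the corresponding operator, and the operator then works on that data, for example with Sort or Join.
2. Memory management for Scala
Memory management for Scala also goes through the serialization mechanism. Flink currently supports all Scala data types.
3. Serialization
Serialization is easy to understand: it is the conversion back and forth between a binary stream and data structures, in other words an encoding and decoding process.
Flink implements its own serialization framework rather than using an existing one. This has a lot to do with the fact that the data types flowing through Flink are fairly uniform. Flink uses Java reflection to obtain type information (TypeInformation), which mainly comes in the following kinds:
BasicTypeInfo: any Java primitive type (boxed) or String.
BasicArrayTypeInfo: any array of Java primitive types (boxed) or of String.
WritableTypeInfo: any implementation of the Hadoop Writable interface.
TupleTypeInfo: any Flink Tuple type (Tuple1 to Tuple25). Flink tuples are fixed-length, fixed-type Java tuple implementations.
CaseClassTypeInfo: any Scala case class (including Scala tuples).
PojoTypeInfo: any POJO (Java or Scala), for example a Java object whose fields are all either declared public or accessible through getter/setter methods.
GenericTypeInfo: any class that matches none of the types above.
Flink serializes the basic data types through a TypeSerializer. For GenericTypeInfo it falls back to Kryo.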
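The encode/decode round trip is easy to demonstrate with the JDK alone. The sketch below is not Flink's TypeSerializer; it writes a hypothetical (int id, String name) record with java.io.DataOutputStream and reads it back, just to show a fixed field order turning an object into bytes and back again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RecordSerializationDemo {
    // Encodes the record as: 4-byte int, then a length-prefixed string.
    static byte[] serialize(int id, String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes the fields in the same order and returns them as "id:name".
    static String deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            String name = in.readUTF();
            return id + ":" + name;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(7, "flink");
        System.out.println(data.length + " bytes -> " + deserialize(data));
    }
}
```

Because the field order and widths are known in advance, no class names or field names need to be stored in the bytes. Knowing the type up front is the same property that lets a type-specific serializer produce compact binary data that operators can work on directly.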
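4. Memory management (memory pool)
MemoryManager provides two inner classes, HybridHeapMemoryPool and HybridOffHeapMemoryPool, each of which contains the allocation methods. HybridHeapMemoryPool allocates byte arrays (byte[]), while HybridOffHeapMemoryPool allocates ByteBuffers. Here is the code: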
abstract static class MemoryPool {
abstract int getNumberOfAvailableMemorySegments();
abstract MemorySegment allocateNewSegment(Object owner);
abstract MemorySegment requestSegmentFromPool(Object owner);
abstract void returnSegmentToPool(MemorySegment segment);
abstract void clear();
}
static final class HybridHeapMemoryPool extends MemoryPool {
/** The collection of available memory segments. */
private final ArrayDeque<byte[]> availableMemory;
private final int segmentSize;
HybridHeapMemoryPool(int numInitialSegments, int segmentSize) {
this.availableMemory = new ArrayDeque<>(numInitialSegments);
this.segmentSize = segmentSize;
for (int i = 0; i < numInitialSegments; i++) {
this.availableMemory.add(new byte[segmentSize]);
}
}
@Override
MemorySegment allocateNewSegment(Object owner) {
return MemorySegmentFactory.allocateUnpooledSegment(segmentSize, owner);
}
@Override
MemorySegment requestSegmentFromPool(Object owner) {
byte[] buf = availableMemory.remove();
return MemorySegmentFactory.wrapPooledHeapMemory(buf, owner);
}
@Override
void returnSegmentToPool(MemorySegment segment) {
if (segment.getClass() == HybridMemorySegment.class) {
HybridMemorySegment heapSegment = (HybridMemorySegment) segment;
availableMemory.add(heapSegment.getArray());
heapSegment.free();
}
else {
throw new IllegalArgumentException("Memory segment is not a " + HybridMemorySegment.class.getSimpleName());
}
}
@Override
protected int getNumberOfAvailableMemorySegments() {
return availableMemory.size();
}
@Override
void clear() {
availableMemory.clear();
}
}
static final class HybridOffHeapMemoryPool extends MemoryPool {
/** The collection of available memory segments. */
private final ArrayDeque<ByteBuffer> availableMemory;
private final int segmentSize;
HybridOffHeapMemoryPool(int numInitialSegments, int segmentSize) {
this.availableMemory = new ArrayDeque<>(numInitialSegments);
this.segmentSize = segmentSize;
for (int i = 0; i < numInitialSegments; i++) {
this.availableMemory.add(ByteBuffer.allocateDirect(segmentSize));
}
}
@Override
MemorySegment allocateNewSegment(Object owner) {
return MemorySegmentFactory.allocateUnpooledOffHeapMemory(segmentSize, owner);
}
@Override
MemorySegment requestSegmentFromPool(Object owner) {
ByteBuffer buf = availableMemory.remove();
return MemorySegmentFactory.wrapPooledOffHeapMemory(buf, owner);
}
@Override
void returnSegmentToPool(MemorySegment segment) {
if (segment.getClass() == HybridMemorySegment.class) {
HybridMemorySegment hybridSegment = (HybridMemorySegment) segment;
ByteBuffer buf = hybridSegment.getOffHeapBuffer();
availableMemory.add(buf);
hybridSegment.free();
}
else {
throw new IllegalArgumentException("Memory segment is not a " + HybridMemorySegment.class.getSimpleName());
}
}
@Override
protected int getNumberOfAvailableMemorySegments() {
return availableMemory.size();
}
@Override
void clear() {
availableMemory.clear();
}
}
Now look at the managing class:
public class MemoryManager {
private static final Logger LOG = LoggerFactory.getLogger(MemoryManager.class);
......
public MemoryManager(long memorySize, int numberOfSlots) {
this(memorySize, numberOfSlots, DEFAULT_PAGE_SIZE, MemoryType.HEAP, true);
}
public MemoryManager(long memorySize, int numberOfSlots, int pageSize,
MemoryType memoryType, boolean preAllocateMemory) {
// sanity checks
if (memoryType == null) {
throw new NullPointerException();
}
if (memorySize <= 0) {
throw new IllegalArgumentException("Size of total memory must be positive.");
}
if (pageSize < MIN_PAGE_SIZE) {
throw new IllegalArgumentException("The page size must be at least " + MIN_PAGE_SIZE + " bytes.");
}
if (!MathUtils.isPowerOf2(pageSize)) {
throw new IllegalArgumentException("The given page size is not a power of two.");
}
this.memoryType = memoryType;
this.memorySize = memorySize;
this.numberOfSlots = numberOfSlots;
// assign page size and bit utilities
this.pageSize = pageSize;
this.roundingMask = ~((long) (pageSize - 1));
final long numPagesLong = memorySize / pageSize;
if (numPagesLong > Integer.MAX_VALUE) {
throw new IllegalArgumentException("The given number of memory bytes (" + memorySize
+ ") corresponds to more than MAX_INT pages.");
}
this.totalNumPages = (int) numPagesLong;
if (this.totalNumPages < 1) {
throw new IllegalArgumentException("The given amount of memory amounted to less than one page.");
}
this.allocatedSegments = new HashMap<Object, Set<MemorySegment>>();
this.isPreAllocated = preAllocateMemory;
this.numNonAllocatedPages = preAllocateMemory ? 0 : this.totalNumPages;
final int memToAllocate = preAllocateMemory ? this.totalNumPages : 0;
switch (memoryType) {
case HEAP:
this.memoryPool = new HybridHeapMemoryPool(memToAllocate, pageSize);
break;
case OFF_HEAP:
if (!preAllocateMemory) {
LOG.warn("It is advisable to set 'taskmanager.memory.preallocate' to true when" +
" the memory type 'taskmanager.memory.off-heap' is set to true.");
}
this.memoryPool = new HybridOffHeapMemoryPool(memToAllocate, pageSize);
break;
default:
throw new IllegalArgumentException("unrecognized memory type: " + memoryType);
}
}
......
}
With the memory management shown above, Flink can manage its data in memory effectively.
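The page arithmetic in the MemoryManager constructor is worth working through once. The sketch below reproduces only the calculations (power-of-two check, page count, rounding mask) in a standalone class; PageMathDemo and its method names are illustrative stand-ins, not Flink's MathUtils.

```java
class PageMathDemo {
    // A positive power of two has exactly one bit set.
    static boolean isPowerOf2(long value) {
        return value > 0 && (value & (value - 1)) == 0;
    }

    // Whole pages that fit into the budget; any remainder is left unused.
    static int totalPages(long memorySize, int pageSize) {
        if (!isPowerOf2(pageSize)) {
            throw new IllegalArgumentException("page size must be a power of two");
        }
        long numPages = memorySize / pageSize;
        if (numPages > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many pages");
        }
        return (int) numPages;
    }

    // Same mask as roundingMask in the constructor: clears the low bits
    // so that a byte count is rounded down to a multiple of the page size.
    static long roundDownToPage(long bytes, int pageSize) {
        long roundingMask = ~((long) (pageSize - 1));
        return bytes & roundingMask;
    }

    public static void main(String[] args) {
        int pageSize = 32 * 1024;
        System.out.println(totalPages(100_000L, pageSize));
        System.out.println(roundDownToPage(100_000L, pageSize));
    }
}
```

For a 100,000-byte budget and 32 KB pages this gives 3 pages and a rounded size of 98,304 bytes. The mask only works because the page size is a power of two, which is why the constructor rejects any other value.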
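V. Summary
To serve big-data workloads, Flink abstracted and customized its own memory mechanism, and so far it appears to work considerably better than relying directly on JVM memory management. Experts often say not to implement your own memory pool at the drop of a hat, and that is sound advice, but nothing is absolute. The right approach is to look at problems as they evolve and to solve them with suitable methods and tools.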