大数据常见面试题:1

1、Flink CDC 1.x 2.x 区别 实现原理
参考链接:1

2、Hudi mor cow 读取流程
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

3、Spark shuffle 类型及区别

在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

4、Flink 精准一次性消费

5、ClickHouse的表引擎介绍

6、Flink 作业提交流程

7、Flink sql 提交流程 广播流程

8、Flink 反压处理优化 Flink taskmanager 内存

9、Spark 宽依赖 窄依赖 shuffle

10、Flink 内存管理
在这里插入图片描述

Flink是jvm之上的大数据处理引擎,jvm存在java对象存储密度低、full gc时消耗性能,gc存在stw的问题,同时omm时会影响稳定性。同时针对频繁序列化和反序列化问题flink使用堆内堆外内存可以直接在一些场景下操作二进制数据,减少序列化反序列化的消耗。同时基于大数据流式处理的特点,flink定制了自己的一套序列化框架。flink也会基于cpu L1 L2 L3高速缓存的机制以及局部性原理,设计使用缓存友好的数据结构。

flink内存管理和spark的tungsten的内存管理的出发点很相似。

flink使用内存划分为堆内内存和堆外内存。按照用途可以划分为task所用内存,network memory、managed memory、以及framework所用内存,其中task network managed所用内存计入slot内存。framework为taskmanager公用。
堆内内存包含用户代码所用内存、heapstatebackend、框架执行所用内存。

堆外内存是未经jvm虚拟化的内存,直接映射到操作系统的内存地址,堆外内存包含框架执行所用内存,jvm堆外内存、Direct、native等。

Direct memory内存可用于网络传输缓冲。network memory属于direct memory的范畴,flink可以借助于此进行zero copy,从而减少内核态到用户态copy次数,从而进行更高效的io操作。

jvm metaspace存放jvm加载的类的元数据,加载的类越多,需要的空间越大,overhead用于jvm的其他开销,如native memory、code cache、thread stack等。

Managed Memory主要用于RocksDBStateBackend和批处理算子,也属于native memory的范畴,其中rocksdbstatebackend对应rocksdb,rocksdb基于lsm数据结构实现,每个state对应一个列族,占有独立的writebuffer,rocksdb占用native内存大小为 blockCahe + writebufferNum * writeBuffer + index ,同时堆外内存是进程之间共享的,jvm虚拟化大量heap内存耗时较久,使用堆外内存的话可以有效的避免该环节。但堆外内存也有一定的弊端,即监控调试使用相对复杂,对于生命周期较短的segment使用堆内内存开销更低,flink在一些情况下,直接操作二进制数据,避免一些反序列化带来的开销。如果需要处理的数据超出了内存限制,则会将部分数据存储到硬盘上。

内存管理

类似于OS中的page机制,flink模拟了操作系统的机制,通过page来管理内存,flink对应page的数据结构为dataview和MemorySegment,memorysegment是flink内存分配的最小单位,默认32kb,其可以在堆上也可以在堆外,flink通过MemorySegment的数据结构来访问堆内堆外内存,借助于flink序列化机制(序列化机制会在下一小节讲解),memorysegment提供了对二进制数据的读取和写入的方法,flink使用datainputview和dataoutputview进行memorysegment的二进制的读取和写入,flink可以通过HeapMemorySegment 管理堆内内存,通过HybridMemorySegment来管理堆内和堆外内存,MemorySegment管理jvm堆内存时,其定义一个字节数组的引用指向内存端,基于该内部字节数组的引用进行操作的HeapMemorySegment。

public abstract class MemorySegment { /**
 * The heap byte array object relative to which we access the memory.
 *  如果为堆内存,则指向访问的内存的引用,否则若内存为非堆内存,则为null
 * <p>Is non-<tt>null</tt> if the memory is on the heap, and is <tt>null</tt>, if the memory is
 * off the heap. If we have this buffer, we must never void this reference, or the memory
 * segment will point to undefined addresses outside the heap and may in out-of-order execution
 * cases cause segmentation faults. */
protected final byte[] heapMemory; /**
 * The address to the data, relative to the heap memory byte array. If the heap memory byte
 * array is <tt>null</tt>, this becomes an absolute memory address outside the heap.
 * 字节数组对应的相对地址 */
protected long address; 
}

HeapMemorySegment用来分配堆上内存。
在这里插入图片描述

HybridMemorySegment即支持onheap和offheap内存,flink通过jvm的unsafe操作,如果对象o不为null,为onheap的场景,并且后面的地址或者位置是相对位置,那么会直接对当前对象(比如数组)的相对位置进行操作。如果对象o为null,操作的内存块不是JVM堆内存,为off-heap的场景,并且后面的地址是某个内存块的绝对地址,那么这些方法的调用也相当于对该内存块进行操作。

public final class HybridMemorySegment extends MemorySegment {
.........
	@Override
	public ByteBuffer wrap(int offset, int length) {
		if (address <= addressLimit) {
			if (heapMemory != null) {
				return ByteBuffer.wrap(heapMemory, offset, length);
			}
			else {
				try {
					ByteBuffer wrapper = offHeapBuffer.duplicate();
					wrapper.limit(offset + length);
					wrapper.position(offset);
					return wrapper;
				}
				catch (IllegalArgumentException e) {
					throw new IndexOutOfBoundsException();
				}
			}
		}
		else {
			throw new IllegalStateException("segment has been freed");
		}
	}
	.........
}

flink通过MemorySegmentFactory来创建memorySegment,memorySegment是flink内存分配的最小单位。对于跨memorysegment的数据方位,flink抽象出一个访问视图,数据读取datainputView,数据写入dataoutputview。

@Public
public interface DataInputView extends DataInput {

	/**
	 * Skips {@code numBytes} bytes of memory. In contrast to the {@link #skipBytes(int)} method,
	 * this method always skips the desired number of bytes or throws an {@link java.io.EOFException}.
	 *
	 * @param numBytes The number of bytes to skip.
	 *
	 * @throws IOException Thrown, if any I/O related problem occurred such that the input could not
	 *                     be advanced to the desired position.
	 */
	void skipBytesToRead(int numBytes) throws IOException;

	/**
	 * Reads up to {@code len} bytes of memory and stores it into {@code b} starting at offset {@code off}.
	 * It returns the number of read bytes or -1 if there is no more data left.
	 *
	 * @param b byte array to store the data to
	 * @param off offset into byte array
	 * @param len byte length to read
	 * @return the number of actually read bytes of -1 if there is no more data left
	 * @throws IOException
	 */
	int read(byte[] b, int off, int len) throws IOException;

	/**
	 * Tries to fill the given byte array {@code b}. Returns the actually number of read bytes or -1 if there is no
	 * more data.
	 *
	 * @param b byte array to store the data to
	 * @return the number of read bytes or -1 if there is no more data left
	 * @throws IOException
	 */
	int read(byte[] b) throws IOException;
}

dataoutputview是数据写入的视图,outputview持有多个memorysegment的引用,flink可以顺序的写入segment。

/**
 * This interface defines a view over some memory that can be used to sequentially write contents to the memory.
 * The view is typically backed by one or more {@link org.apache.flink.core.memory.MemorySegment}.
 */
@Public
public interface DataOutputView extends DataOutput {

	/**
	 * Skips {@code numBytes} bytes memory. If some program reads the memory that was skipped over, the
	 * results are undefined.
	 *
	 * @param numBytes The number of bytes to skip.
	 *
	 * @throws IOException Thrown, if any I/O related problem occurred such that the view could not
	 *                     be advanced to the desired position.
	 */
	void skipBytesToWrite(int numBytes) throws IOException;

	/**
	 * Copies {@code numBytes} bytes from the source to this view.
	 *
	 * @param source The source to copy the bytes from.
	 * @param numBytes The number of bytes to copy.
	 *
	 * @throws IOException Thrown, if any I/O related problem occurred, such that either the input view
	 *                     could not be read, or the output could not be written.
	 */
	void write(DataInputView source, int numBytes) throws IOException;
}

上一小节中讲到的managedmemory内存部分,flink使用memorymanager来管理该内存,managedmemory只使用堆外内存,主要用于批处理中的sorting、hashing、以及caching(社区消息,未来流处理也会使用到该部分),在流计算中作为rocksdbstatebackend的部分内存。memeorymanager通过memorypool来管理memorysegment。


	// ------------------------------------------------------------------------
	//  Memory allocation and release
	// ------------------------------------------------------------------------

	/**
	 * Allocates a set of memory segments from this memory manager.
	 *
	 * <p>The total allocated memory will not exceed its size limit, announced in the constructor.
	 *
	 * @param owner The owner to associate with the memory segment, for the fallback release.
	 * @param numPages The number of pages to allocate.
	 * @return A list with the memory segments.
	 * @throws MemoryAllocationException Thrown, if this memory manager does not have the requested amount
	 *                                   of memory pages any more.
	 */
	public List<MemorySegment> allocatePages(Object owner, int numPages) throws MemoryAllocationException {
		List<MemorySegment> segments = new ArrayList<>(numPages);
		allocatePages(owner, segments, numPages);
		return segments;
	}

	/**
	 * Allocates a set of memory segments from this memory manager.
	 *
	 * <p>The total allocated memory will not exceed its size limit, announced in the constructor.
	 *
	 * @param owner The owner to associate with the memory segment, for the fallback release.
	 * @param target The list into which to put the allocated memory pages.
	 * @param numberOfPages The number of pages to allocate.
	 * @throws MemoryAllocationException Thrown, if this memory manager does not have the requested amount
	 *                                   of memory pages any more.
	 */
	public void allocatePages(
			Object owner,
			Collection<MemorySegment> target,
			int numberOfPages) throws MemoryAllocationException {
		// sanity check
		Preconditions.checkNotNull(owner, "The memory owner must not be null.");
		Preconditions.checkState(!isShutDown, "Memory manager has been shut down.");
		Preconditions.checkArgument(
			numberOfPages <= totalNumberOfPages,
			"Cannot allocate more segments %d than the max number %d",
			numberOfPages,
			totalNumberOfPages);

		// reserve array space, if applicable
		if (target instanceof ArrayList) {
			((ArrayList<MemorySegment>) target).ensureCapacity(numberOfPages);
		}

		long memoryToReserve = numberOfPages * pageSize;
		try {
			memoryBudget.reserveMemory(memoryToReserve);
		} catch (MemoryReservationException e) {
			throw new MemoryAllocationException(String.format("Could not allocate %d pages", numberOfPages), e);
		}

		Runnable pageCleanup = this::releasePage;
		allocatedSegments.compute(owner, (o, currentSegmentsForOwner) -> {
			Set<MemorySegment> segmentsForOwner = currentSegmentsForOwner == null ?
				new HashSet<>(numberOfPages) : currentSegmentsForOwner;
			for (long i = numberOfPages; i > 0; i--) {
				MemorySegment segment = allocateOffHeapUnsafeMemory(getPageSize(), owner, pageCleanup);
				target.add(segment);
				segmentsForOwner.add(segment);
			}
			return segmentsForOwner;
		});

		Preconditions.checkState(!isShutDown, "Memory manager has been concurrently shut down.");
	}

对于上一小节中提到的NetWorkMemory的内存,flink使用networkbuffer做了一层buffer封装。buffer的底层也是memorysegment,flink通过bufferpool来管理buffer,每个taskmanager都有一个netwokbufferpool,该tm上的各个task共享该networkbufferpool,同时task对应的localbufferpool所需的内存需要从networkbufferpool申请而来,它们都是flink申请的堆外内存。

上游算子向resultpartition写入数据时,申请buffer资源,使用bufferbuilder将数据写入memorysegment,下游算子从resultsubpartition消费数据时,利用bufferconsumer从memorysegment中读取数据,bufferbuilder与bufferconsumer一一对应。同时这一流程也和flink的反压机制相关。如图

在这里插入图片描述

class LocalBufferPool implements BufferPool {
	private static final Logger LOG = LoggerFactory.getLogger(LocalBufferPool.class);

	private static final int UNKNOWN_CHANNEL = -1;

	/** Global network buffer pool to get buffers from. */
	private final NetworkBufferPool networkBufferPool;

	/** The minimum number of required segments for this pool. */
	private final int numberOfRequiredMemorySegments;

	/**
	 * The currently available memory segments. These are segments, which have been requested from
	 * the network buffer pool and are currently not handed out as Buffer instances.
	 *
	 * <p><strong>BEWARE:</strong> Take special care with the interactions between this lock and
	 * locks acquired before entering this class vs. locks being acquired during calls to external
	 * code inside this class, e.g. with
	 * {@link org.apache.flink.runtime.io.network.partition.consumer.BufferManager#bufferQueue}
	 * via the {@link #registeredListeners} callback.
	 */
	private final ArrayDeque<MemorySegment> availableMemorySegments = new ArrayDeque<MemorySegment>();

	/**
	 * Buffer availability listeners, which need to be notified when a Buffer becomes available.
	 * Listeners can only be registered at a time/state where no Buffer instance was available.
	 */
	private final ArrayDeque<BufferListener> registeredListeners = new ArrayDeque<>();

	/** Maximum number of network buffers to allocate. */
	private final int maxNumberOfMemorySegments;

	/** The current size of this pool. */
	@GuardedBy("availableMemorySegments")
	private int currentPoolSize;
public abstract class ResultPartition implements ResultPartitionWriter {

	protected static final Logger LOG = LoggerFactory.getLogger(ResultPartition.class);

	private final String owningTaskName;

	private final int partitionIndex;

	protected final ResultPartitionID partitionId;

	/** Type of this partition. Defines the concrete subpartition implementation to use. */
	protected final ResultPartitionType partitionType;

	protected final ResultPartitionManager partitionManager;

	protected final int numSubpartitions;

	private final int numTargetKeyGroups;

	// - Runtime state --------------------------------------------------------

	private final AtomicBoolean isReleased = new AtomicBoolean();

	protected BufferPool bufferPool;

	private boolean isFinished;

	private volatile Throwable cause;

	private final SupplierWithException<BufferPool, IOException> bufferPoolFactory;

	/** Used to compress buffer to reduce IO. */
	@Nullable
	protected final BufferCompressor bufferCompressor;

	protected Counter numBytesOut = new SimpleCounter();

	protected Counter numBuffersOut = new SimpleCounter();

	public ResultPartition(
		String owningTaskName,
		int partitionIndex,
		ResultPartitionID partitionId,
		ResultPartitionType partitionType,
		int numSubpartitions,
		int numTargetKeyGroups,
		ResultPartitionManager partitionManager,
		@Nullable BufferCompressor bufferCompressor,
		SupplierWithException<BufferPool, IOException> bufferPoolFactory) {

		this.owningTaskName = checkNotNull(owningTaskName);
		Preconditions.checkArgument(0 <= partitionIndex, "The partition index must be positive.");
		this.partitionIndex = partitionIndex;
		this.partitionId = checkNotNull(partitionId);
		this.partitionType = checkNotNull(partitionType);
		this.numSubpartitions = numSubpartitions;
		this.numTargetKeyGroups = numTargetKeyGroups;
		this.partitionManager = checkNotNull(partitionManager);
		this.bufferCompressor = bufferCompressor;
		this.bufferPoolFactory = bufferPoolFactory;
	}

	/**
	 * Registers a buffer pool with this result partition.
	 *
	 * <p>There is one pool for each result partition, which is shared by all its sub partitions.
	 *
	 * <p>The pool is registered with the partition *after* it as been constructed in order to conform
	 * to the life-cycle of task registrations in the {@link TaskExecutor}.
	 */
	@Override
	public void setup() throws IOException {
		checkState(this.bufferPool == null, "Bug in result partition setup logic: Already registered buffer pool.");

		this.bufferPool = checkNotNull(bufferPoolFactory.get());
		partitionManager.registerResultPartition(this);
	}

	public String getOwningTaskName() {
		return owningTaskName;
	}

	@Override
	public ResultPartitionID getPartitionId() {
		return partitionId;
	}

	public int getPartitionIndex() {
		return partitionIndex;
	}

	@Override
	public int getNumberOfSubpartitions() {
		return numSubpartitions;
	}

	public BufferPool getBufferPool() {
		return bufferPool;
	}

	/**

自定义序列化框架
flink对自身支持的基本数据类型,实现了定制的序列化机制,flink数据集对象相对固定,可以只保存一份schema信息,从而节省存储空间,数据序列化就是java对象和二进制数据之间的数据转换,flink使用TypeInformation的createSerializer接口负责创建每种类型的序列化器,进行数据的序列化反序列化,类型信息在构建streamtransformation时通过typeextractor根据方法签名类信息等提取类型信息并存储在streamconfig中。

 * Creates a serializer for the type. The serializer may use the ExecutionConfig
 * for parameterization.
 * 创建出对应类型的序列化器
 * @param config The config used to parameterize the serializer.
 * @return A serializer for this type. */ @PublicEvolving public abstract TypeSerializer<T> createSerializer(ExecutionConfig config); /**

A utility for reflection analysis on classes, to determine the return type of implementations of transformation
functions. / @Public public class TypeExtractor { /*
Creates a {@link TypeInformation} from the given parameters.

* If the given {@code instance} implements {@link ResultTypeQueryable}, its information
* is used to determine the type information. Otherwise, the type information is derived
* based on the given class information.
* @param instance            instance to determine type information for
* @param baseClass            base class of {@code instance}
* @param clazz                class of {@code instance}
* @param returnParamPos    index of the return type in the type arguments of {@code clazz}
* @param <OUT>                output type
* @return type information */ @SuppressWarnings("unchecked")
@PublicEvolving public static <OUT> TypeInformation<OUT> createTypeInfo(Object instance, Class<?> baseClass, Class<?> clazz, int returnParamPos) { if (instance instanceof ResultTypeQueryable) { return ((ResultTypeQueryable<OUT>) instance).getProducedType();

   } else { return createTypeInfo(baseClass, clazz, returnParamPos, null, null);
   }
}

在这里插入图片描述
在这里插入图片描述

原文链接

辅助理解 1:

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值