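Flink Partitioners

Flink version: 1.8.0

Like Spark, Flink ships with built-in partitioners and also lets you define your own. A partitioner determines how records are routed between upstream and downstream operators.

The partitioners in Flink are described below.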

ChannelSelector

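Purpose: the core job of this interface is to select a logical channel for each input record.

Main methods:

  • setup: initializes the number of channels
  • selectChannel: decides, from the record and the number of channels, which downstream channel the record is sent to
  • isBroadcast: whether broadcast mode is on, which determines whether the record is written to all downstream channels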
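The three methods above can be sketched without any Flink dependency. The block below is a minimal illustration, not Flink's actual code: SimpleChannelSelector, HashSelector, and the targets helper are hypothetical names, and the hash-based selection rule is an assumption chosen only to make the example concrete.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the ChannelSelector contract (not the real interface).
interface SimpleChannelSelector<T> {
    void setup(int numberOfChannels);   // called once with the number of output channels
    int selectChannel(T record);        // returns the index of the target channel
    boolean isBroadcast();              // true means: write to every channel
}

// Illustrative selector that picks a channel from the record's hash code.
class HashSelector<T> implements SimpleChannelSelector<T> {
    private int numberOfChannels;

    @Override
    public void setup(int numberOfChannels) {
        this.numberOfChannels = numberOfChannels;
    }

    @Override
    public int selectChannel(T record) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(record.hashCode(), numberOfChannels);
    }

    @Override
    public boolean isBroadcast() {
        return false;
    }
}

class ChannelSelectorDemo {
    // Mimics how a writer might consult the selector: all channels when
    // broadcasting, otherwise the single channel chosen by selectChannel.
    static <T> List<Integer> targets(SimpleChannelSelector<T> selector, T record, int numberOfChannels) {
        List<Integer> result = new ArrayList<>();
        if (selector.isBroadcast()) {
            for (int i = 0; i < numberOfChannels; i++) {
                result.add(i);
            }
        } else {
            result.add(selector.selectChannel(record));
        }
        return result;
    }

    public static void main(String[] args) {
        HashSelector<Integer> selector = new HashSelector<>();
        selector.setup(4);
        System.out.println(targets(selector, 6, 4)); // 6 mod 4 selects channel 2
    }
}
```

The point of the split is that a broadcasting selector never needs to compute a channel index at all, which matches the BroadcastPartitioner shown later.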

StreamPartitioner

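This is an abstract class that implements the ChannelSelector interface. It is the base class for choosing partitions for stream records.

Core methods:

  • setup: sets the number of channels
  • isBroadcast: returns false by default (no broadcast)
  • copy: returns a copy of the partitioner (abstract; implemented by each subclass)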

Implementations of StreamPartitioner

ShufflePartitioner

public class ShufflePartitioner<T> extends StreamPartitioner<T> {
	private static final long serialVersionUID = 1L;

	private Random random = new Random();

	@Override
	public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
		// For each record, randomly pick one channel of the downstream operator
		return random.nextInt(numberOfChannels);
	}

	@Override
	public StreamPartitioner<T> copy() {
		return new ShufflePartitioner<T>();
	}

	@Override
	public String toString() {
		return "SHUFFLE";
	}
}

RescalePartitioner

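Based on the parallelism of the upstream and downstream operators, records are sent round-robin to a subset of the downstream operator instances. For example, if the upstream parallelism is 2 and the downstream parallelism is 4, one upstream instance sends records round-robin to two downstream instances, and the other upstream instance sends records round-robin to the remaining two.

If the upstream parallelism is 4 and the downstream parallelism is 2, two upstream instances send records to one downstream instance, and the other two upstream instances send records to the other downstream instance.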

public class RescalePartitioner<T> extends StreamPartitioner<T> {
	private static final long serialVersionUID = 1L;

	private int nextChannelToSendTo = -1;

	@Override
	public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
		if (++nextChannelToSendTo >= numberOfChannels) {
			nextChannelToSendTo = 0;
		}
		return nextChannelToSendTo;
	}

	public StreamPartitioner<T> copy() {
		return this;
	}

	@Override
	public String toString() {
		return "RESCALE";
	}
}
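Note that the class above only cycles over the channels it was given; which downstream instances those channels lead to is decided when the job graph is wired, not inside the partitioner. The following is a rough model of the 2-to-4 and 4-to-2 examples, assuming one parallelism is an exact multiple of the other. RescaleSketch and downstreamTargets are hypothetical names, and the index arithmetic is a simplification rather than Flink's actual wiring code.

```java
import java.util.ArrayList;
import java.util.List;

class RescaleSketch {
    // Downstream instances reachable from one upstream instance, under the
    // simplifying assumption that one parallelism evenly divides the other.
    static List<Integer> downstreamTargets(int upstreamIndex, int upstreamParallelism, int downstreamParallelism) {
        List<Integer> targets = new ArrayList<>();
        if (downstreamParallelism >= upstreamParallelism) {
            // Scale up: each upstream instance owns a contiguous block of downstream instances.
            int factor = downstreamParallelism / upstreamParallelism;
            for (int i = 0; i < factor; i++) {
                targets.add(upstreamIndex * factor + i);
            }
        } else {
            // Scale down: several upstream instances share one downstream instance.
            int factor = upstreamParallelism / downstreamParallelism;
            targets.add(upstreamIndex / factor);
        }
        return targets;
    }

    public static void main(String[] args) {
        for (int up = 0; up < 2; up++) {
            System.out.println("2 -> 4, upstream " + up + ": " + downstreamTargets(up, 2, 4));
        }
        for (int up = 0; up < 4; up++) {
            System.out.println("4 -> 2, upstream " + up + ": " + downstreamTargets(up, 4, 2));
        }
    }
}
```

In this model upstream 0 reaches downstream instances 0 and 1 in the 2-to-4 case, so the round-robin counter in selectChannel alternates between just those two.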

KeyGroupStreamPartitioner

@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
	K key;
	try {
		// Extract the key from the record with the keySelector
		key = keySelector.getKey(record.getInstance().getValue());
	} catch (Exception e) {
		throw new RuntimeException("Could not extract key from " + record.getInstance().getValue(), e);
	}
	// Map the key to a key group, then map the key group to a downstream channel
	return KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism, numberOfChannels);
}
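The assignment in the last line works in two steps: the key is hashed into one of maxParallelism key groups, and the key group is then mapped to an operator index. The sketch below illustrates that idea under stated assumptions: KeyGroupSketch and its methods are hypothetical names, and mix is a stand-in bit mixer, not the murmur hash Flink applies, so the concrete key groups will differ from a real job.

```java
class KeyGroupSketch {
    // Stand-in hash mixer (assumption); Flink uses its own murmur hash on key.hashCode().
    static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h & Integer.MAX_VALUE; // force a non-negative value
    }

    // Step 1: key -> key group in [0, maxParallelism)
    static int keyGroupFor(Object key, int maxParallelism) {
        return mix(key.hashCode()) % maxParallelism;
    }

    // Step 2: key group -> operator index in [0, parallelism)
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        int parallelism = 4;
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            int keyGroup = keyGroupFor(key, maxParallelism);
            int channel = operatorIndexFor(keyGroup, maxParallelism, parallelism);
            System.out.println(key + " -> key group " + keyGroup + " -> channel " + channel);
        }
    }
}
```

Because the key group depends only on the key and maxParallelism, changing the parallelism moves whole key groups between instances instead of rehashing individual keys.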

GlobalPartitioner

Every record is sent to the first channel of the downstream operator only.

public class GlobalPartitioner<T> extends StreamPartitioner<T> {
	private static final long serialVersionUID = 1L;

	@Override
	public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
		return 0;
	}

	@Override
	public StreamPartitioner<T> copy() {
		return this;
	}

	@Override
	public String toString() {
		return "GLOBAL";
	}
}

BroadcastPartitioner

/**
 * Note: Broadcast mode could be handled directly for all the output channels
 * in record writer, so it is no need to select channels via this method.
 */
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
	throw new UnsupportedOperationException("Broadcast partitioner does not support select channels.");
}

@Override
public boolean isBroadcast() {
	// Broadcast mode is enabled: the record is written to every downstream channel
	return true;
}

RebalancePartitioner

Records are sent round-robin to every instance of the downstream operator. The starting channel is chosen at random in setup.

private int nextChannelToSendTo;

@Override
public void setup(int numberOfChannels) {
	super.setup(numberOfChannels);

	nextChannelToSendTo = ThreadLocalRandom.current().nextInt(numberOfChannels);
}

@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
	nextChannelToSendTo = (nextChannelToSendTo + 1) % numberOfChannels;
	return nextChannelToSendTo;
}
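The modulo step in selectChannel can be traced with a small stand-alone class. This is a sketch, not Flink code: RoundRobinSketch is a hypothetical name, and it takes a fixed start index instead of a random one so that the sequence is reproducible.

```java
class RoundRobinSketch {
    private final int numberOfChannels;
    private int nextChannelToSendTo;

    // start replaces the random initial value used in setup (assumption for reproducibility)
    RoundRobinSketch(int numberOfChannels, int start) {
        this.numberOfChannels = numberOfChannels;
        this.nextChannelToSendTo = start;
    }

    // Advance by one and wrap around at numberOfChannels
    int selectChannel() {
        nextChannelToSendTo = (nextChannelToSendTo + 1) % numberOfChannels;
        return nextChannelToSendTo;
    }

    public static void main(String[] args) {
        RoundRobinSketch rr = new RoundRobinSketch(3, 0);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            sb.append(rr.selectChannel()).append(' ');
        }
        System.out.println(sb.toString().trim()); // prints "1 2 0 1 2 0"
    }
}
```

Starting each sender at a random offset means that parallel upstream instances do not all hit downstream channel 0 first, which would otherwise skew the load for short streams.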

ForwardPartitioner

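Sends each record to the local instance of the downstream operator. ForwardPartitioner requires the upstream and downstream operators to have the same parallelism, so that the upstream and downstream operator instances belong to the same subtask. Each sender therefore has a single channel, which is why selectChannel always returns 0.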

public class ForwardPartitioner<T> extends StreamPartitioner<T> {
	private static final long serialVersionUID = 1L;

	@Override
	public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
		return 0;
	}

	public StreamPartitioner<T> copy() {
		return this;
	}

	@Override
	public String toString() {
		return "FORWARD";
	}
}

CustomPartitionerWrapper

Routes each record to a downstream channel by calling the partition method of a user-defined Partitioner instance.

Partitioner<K> partitioner;
KeySelector<T, K> keySelector;

public CustomPartitionerWrapper(Partitioner<K> partitioner, KeySelector<T, K> keySelector) {
	this.partitioner = partitioner;
	this.keySelector = keySelector;
}

@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
	K key;
	try {
		key = keySelector.getKey(record.getInstance().getValue());
	} catch (Exception e) {
		throw new RuntimeException("Could not extract key from " + record.getInstance(), e);
	}

	return partitioner.partition(key, numberOfChannels);
}
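To make the delegation concrete, here is a dependency-free sketch of the same pattern. Everything in it is hypothetical: SimplePartitioner mirrors only the shape of the partition(key, numPartitions) call, java.util.function.Function stands in for the key selector, and the "vip" routing rule is invented for illustration and assumes at least two channels.

```java
import java.util.function.Function;

// Hypothetical stand-in with the same shape as the partition(key, numPartitions) call.
interface SimplePartitioner<K> {
    int partition(K key, int numPartitions);
}

// Mirrors the wrapper: extract the key, then delegate the channel choice.
class CustomWrapperSketch<T, K> {
    private final SimplePartitioner<K> partitioner;
    private final Function<T, K> keySelector;
    private final int numberOfChannels;

    CustomWrapperSketch(SimplePartitioner<K> partitioner, Function<T, K> keySelector, int numberOfChannels) {
        this.partitioner = partitioner;
        this.keySelector = keySelector;
        this.numberOfChannels = numberOfChannels;
    }

    int selectChannel(T record) {
        K key = keySelector.apply(record);
        return partitioner.partition(key, numberOfChannels);
    }
}

class CustomPartitionerDemo {
    // Invented rule: keys starting with "vip" get channel 0 to themselves,
    // all other keys are hashed over channels 1..n-1 (requires n >= 2).
    static final SimplePartitioner<String> VIP_FIRST =
        (key, n) -> key.startsWith("vip") ? 0 : 1 + Math.floorMod(key.hashCode(), n - 1);

    public static void main(String[] args) {
        // Records look like "key:payload"; the key selector takes the part before ':'
        CustomWrapperSketch<String, String> wrapper = new CustomWrapperSketch<String, String>(
            VIP_FIRST, r -> r.substring(0, r.indexOf(':')), 4);
        System.out.println(wrapper.selectChannel("vip-42:payload")); // prints 0
        System.out.println(wrapper.selectChannel("user-1:payload"));
    }
}
```

In a real Flink job the user-defined partitioner is typically supplied through DataStream.partitionCustom together with a key selector, and the wrapper shown above performs the per-record call.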

Reference: https://zhuanlan.zhihu.com/p/165907184
