Reading the Flink Source: How to Use Async I/O

First, an example of how to implement an async I/O function.

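import com.alibaba.druid.pool.DruidDataSource;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.Collections;

public class AsyncFunctionExample extends RichAsyncFunction<String, String> {
    // Declared as DruidDataSource so the setters used in open() are visible
    private transient DruidDataSource dataSource = null;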

    @Override
    public void open(Configuration parameters) throws Exception {
        dataSource = new DruidDataSource();
        dataSource.setDriverClassName("com.mysql.jdbc.Driver");
        dataSource.setUsername("root");
        dataSource.setPassword("123456");
        dataSource.setUrl("jdbc:mysql://localhost:3306/day01?characterEncoding=utf8");
    }


    @Override
    public void asyncInvoke(String input, ResultFuture<String> resultFuture) throws Exception {
        String sql = "SELECT id, name FROM orde WHERE id = ?";
        String result = null;
        Connection connection = null;
        PreparedStatement stmt = null;
        ResultSet rs = null;
        try {
            connection = dataSource.getConnection();
            stmt = connection.prepareStatement(sql);
            stmt.setString(1, input);
            rs = stmt.executeQuery();
            while (rs.next()) {
                result = rs.getString("name");
            }
        } finally {
            if (rs != null) {
                rs.close();
            }
            if (stmt != null) {
                stmt.close();
            }
            if (connection != null) {
                connection.close();
            }
        }
        resultFuture.complete(Collections.singleton(result));
    }

    @Override
    public void timeout(String input, ResultFuture<String> resultFuture) throws Exception {

    }
}

Reading the source

In the source, AsyncWaitOperator is the operator that runs the user-defined AsyncFunction.

@Override
	public void processElement(StreamRecord<IN> element) throws Exception {
		// add element first to the queue
		//1
		final ResultFuture<OUT> entry = addToWorkQueue(element);

		final ResultHandler resultHandler = new ResultHandler(element, entry);
        
		// register a timeout for the entry if timeout is configured
		// 2
		if (timeout > 0L) {
			final long timeoutTimestamp = timeout + getProcessingTimeService().getCurrentProcessingTime();

			final ScheduledFuture<?> timeoutTimer = getProcessingTimeService().registerTimer(
				timeoutTimestamp,
				timestamp -> userFunction.timeout(element.getValue(), resultHandler));

			resultHandler.setTimeoutTimer(timeoutTimer);
		}
        //3 userFunction is the user-defined AsyncFunction
		userFunction.asyncInvoke(element.getValue(), resultHandler);
	}

From the excerpt above:
1 The incoming element is added to a queue
2 A timeout timer is registered (no need to dwell on this)
3 The user-written logic, AsyncFunction.asyncInvoke, is called

Let's go through these step by step.

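  1. addToWorkQueue
    AsyncWaitOperator#addToWorkQueue
    This part is synchronous. The operator has a queue member variable with two implementations, OrderedStreamElementQueue and UnorderedStreamElementQueue. They differ internally in whether elements leave the queue in the same order they entered. Here we look mainly at UnorderedStreamElementQueue.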
private ResultFuture<OUT> addToWorkQueue(StreamElement streamElement) throws InterruptedException {

	Optional<ResultFuture<OUT>> queueEntry;
	while (!(queueEntry = queue.tryPut(streamElement)).isPresent()) {
		mailboxExecutor.yield();
	}
	return queueEntry.get();
}

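UnorderedStreamElementQueue#tryPut: if the queue is not full (size() < capacity), the element is added to the queue; if it is full, an empty Optional is returned and the caller waits for the next attempt.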

public Optional<ResultFuture<OUT>> tryPut(StreamElement streamElement) {
	if (size() < capacity) {
		StreamElementQueueEntry<OUT> queueEntry;
		if (streamElement.isRecord()) {
			queueEntry = addRecord((StreamRecord<?>) streamElement);
		}
		...
		numberOfEntries++;
		...
	} else {
	    ...
	}
}
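
The retry loop in addToWorkQueue together with the capacity check in tryPut amounts to backpressure: a full queue makes the caller wait until a slot is freed. A minimal sketch of that idea, assuming a simplified queue of strings; BoundedQueueSketch, drainOne and putWithBackpressure are hypothetical names for illustration, not Flink classes or methods:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

/** Simplified stand-in for a bounded element queue; not Flink's actual class. */
class BoundedQueueSketch {
    private final int capacity;
    private final Deque<String> entries = new ArrayDeque<>();

    BoundedQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the stored entry, or empty when the queue is already full. */
    Optional<String> tryPut(String element) {
        if (entries.size() < capacity) {
            entries.add(element);
            return Optional.of(element);
        }
        return Optional.empty();
    }

    /** Hypothetical stand-in for emitting one element downstream, which frees a slot. */
    Optional<String> drainOne() {
        return Optional.ofNullable(entries.poll());
    }

    /** Retries tryPut until it succeeds and reports how many times it had to give way. */
    static int putWithBackpressure(BoundedQueueSketch queue, String element) {
        int yields = 0;
        while (!queue.tryPut(element).isPresent()) {
            // Assumption: "yielding" is modelled as letting one queued entry drain.
            queue.drainOne();
            yields++;
        }
        return yields;
    }

    public static void main(String[] args) {
        BoundedQueueSketch queue = new BoundedQueueSketch(2);
        System.out.println(putWithBackpressure(queue, "a"));
        System.out.println(putWithBackpressure(queue, "b"));
        System.out.println(putWithBackpressure(queue, "c")); // queue was full, so one yield is needed
    }
}
```

In the real operator the yield hands control back to the mailbox so that pending mails (including result emission) can run; the sketch collapses that into a direct drain.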

private StreamElementQueueEntry<OUT> addRecord(StreamRecord<?> record) {
	// ensure that there is at least one segment
	..
	StreamElementQueueEntry<OUT> queueEntry = new SegmentedStreamRecordQueueEntry<>(record, lastSegment);
	lastSegment.add(queueEntry);
	return queueEntry;
}

In the end the element is added to Segment.incompleteElements:

static class Segment<OUT> {
	/** Unfinished input elements. */
	private final Set<StreamElementQueueEntry<OUT>> incompleteElements;
	
   /** Undrained finished elements. */
	private final Queue<StreamElementQueueEntry<OUT>> completedElements;
    void add(StreamElementQueueEntry<OUT> queueEntry) {
    	if (queueEntry.isDone()) {
    		completedElements.add(queueEntry);
    	} else {
    		incompleteElements.add(queueEntry);
    	}
    }
    
}

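Once the steps above finish, a ResultFuture is returned, to be used for the callback after the user logic completes.
A ResultHandler (implements ResultFuture) is created and passed to AsyncFunction#asyncInvoke.
2. We skip the second step; look into it yourself if you are interested.
3. Calling AsyncFunction#asyncInvoke
This is the user logic. When it finishes, it calls ResultHandler#complete to signal that the element queued in step 1 can now be sent downstream.
ResultHandler holds a resultFuture member variable, which is the ResultFuture produced in step 1.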

private class ResultHandler implements ResultFuture<OUT> {

    /**
	 * The handle received from the queue to update the entry. Should only be used to inject the result;
	 * exceptions are handled here.
	 */
	private final ResultFuture<OUT> resultFuture;
	
    public void complete(Collection<OUT> results) {
    	Preconditions.checkNotNull(results, "Results must not be null, use empty collection to emit nothing");
    	// already completed (exceptionally or with previous complete call from ill-written AsyncFunction), so
    	// ignore additional result
    	if (!completed.compareAndSet(false, true)) {
    		return;
    	}
    	processInMailbox(results);
    }
}
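
The compareAndSet guard in complete above makes completion a once-only operation, so a late result arriving after an earlier completion is dropped. A minimal sketch of that guard in isolation, assuming string results; OnceOnlyHandlerSketch and its delivered list are hypothetical and only stand in for the real handler:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative handler that accepts only the first completion; not Flink's ResultHandler. */
class OnceOnlyHandlerSketch {
    private final AtomicBoolean completed = new AtomicBoolean(false);
    private final List<String> delivered = new ArrayList<>();

    /** Returns true only for the first call; every later call is ignored. */
    boolean complete(Collection<String> results) {
        // Atomically flip false -> true; exactly one caller can win this.
        if (!completed.compareAndSet(false, true)) {
            return false;
        }
        delivered.addAll(results);
        return true;
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        OnceOnlyHandlerSketch handler = new OnceOnlyHandlerSketch();
        System.out.println(handler.complete(Collections.singleton("first")));
        System.out.println(handler.complete(Collections.singleton("second")));
        System.out.println(handler.delivered());
    }
}
```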

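We won't go into the logic before processInMailbox(results) in detail: it checks whether this ResultFuture has already been completed, and if so the extra result is ignored rather than processed again. The main thing to look at is processInMailbox(results).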

private void processInMailbox(Collection<OUT> results) {
	// move further processing into the mailbox thread
	mailboxExecutor.execute(		
	 () -> processResults(results),
	"Result in AsyncWaitOperator of input %s", results);
}

private void processResults(Collection<OUT> results) {
	// Cancel the timer once we've completed the stream record buffer entry. This will remove the registered
	// timer task
	if (timeoutTimer != null) {
		// canceling in mailbox thread avoids https://issues.apache.org/jira/browse/FLINK-13635
		timeoutTimer.cancel(true);
	}
	// update the queue entry with the result
	resultFuture.complete(results);
	// now output all elements from the queue that have been completed (in the correct order)
	outputCompletedElement();
}

The two key lines:
resultFuture.complete(results) stores the user's result in the queue entry for that StreamRecord.
outputCompletedElement() sends the returned data downstream.

private void outputCompletedElement() {
	if (queue.hasCompletedElements()) {
    	// emit only one element to not block the mailbox thread unnecessarily
    	queue.emitCompletedElement(timestampedCollector);
    	// if there are more completed elements, emit them with subsequent mails
    	if (queue.hasCompletedElements()) {
    		mailboxExecutor.execute(this::outputCompletedElement, "AsyncWaitOperator#outputCompletedElement");
    	}
	}
}
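
The pattern in processInMailbox and outputCompletedElement is that results may be produced on any thread, but all further processing is handed to a single thread as small tasks ("mails"). A rough sketch of that hand-off, assuming a plain single-thread executor as a stand-in for the mailbox; MailboxSketch is hypothetical and does not use Flink's MailboxExecutor:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Worker threads produce results; one "mailbox" thread is the only one touching operator state. */
class MailboxSketch {
    static List<Integer> run(int tasks) {
        // Assumption: a single-thread executor plays the role of the task's mailbox thread.
        ExecutorService mailbox = Executors.newSingleThreadExecutor();
        ExecutorService workers = Executors.newFixedThreadPool(4);
        // Not thread-safe on purpose: only mails running on the mailbox thread modify it.
        List<Integer> emitted = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            final int value = i;
            workers.execute(() -> {
                int result = value * 2; // stand-in for a slow external lookup
                // Hand the result over instead of touching state from the worker thread.
                mailbox.execute(() -> {
                    emitted.add(result);
                    done.countDown();
                });
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        workers.shutdown();
        mailbox.shutdown();
        return emitted;
    }

    public static void main(String[] args) {
        // Arrival order depends on thread scheduling, so it can differ between runs.
        System.out.println(run(10));
    }
}
```

Keeping each mail small serves the same purpose as emitting only one element per mail in the code above: other work queued for the same thread is not starved.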

UnorderedStreamElementQueue#emitCompletedElement
UnorderedStreamElementQueue.Segment#emitCompleted

public void emitCompletedElement(TimestampedCollector<OUT> output) {
	if (segments.isEmpty()) {
		return;
	}
	final Segment currentSegment = segments.getFirst();
	numberOfEntries -= currentSegment.emitCompleted(output);
	// remove any segment if there are further segments, if not leave it as an optimization even if empty
	if (segments.size() > 1 && currentSegment.isEmpty()) {
		segments.pop();
	}
}

# class Segment
void completed(StreamElementQueueEntry<OUT> elementQueueEntry) {
	// adding only to completed queue if not completed before
	// there may be a real result coming after a timeout result, which is updated in the queue entry but
	// the entry is not re-added to the complete queue
	if (incompleteElements.remove(elementQueueEntry)) {
		completedElements.add(elementQueueEntry);
	}
}

int emitCompleted(TimestampedCollector<OUT> output) {
	final StreamElementQueueEntry<OUT> completedEntry = completedElements.poll();
	if (completedEntry == null) {
		return 0;
	}
	completedEntry.emitResult(output);
	return 1;
}

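Segment#completed removes the element from the unfinished set Segment.incompleteElements and adds it to the completed queue.
Segment#emitCompleted takes a finished element from the Segment.completedElements queue and sends it out through output.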
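
The two collections inside a segment explain why output is unordered: entries leave in completion order, not insertion order. A minimal sketch of that bookkeeping, assuming entries are plain strings and the output is a list; SegmentSketch is a hypothetical simplification of the Segment class shown above:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Simplified segment: a set of pending entries and a FIFO of finished ones. */
class SegmentSketch {
    private final Set<String> incomplete = new HashSet<>();
    private final Queue<String> completed = new ArrayDeque<>();

    void add(String entry) {
        incomplete.add(entry);
    }

    /** Moves an entry to the completed queue, but only the first time it completes. */
    void completed(String entry) {
        if (incomplete.remove(entry)) {
            completed.add(entry);
        }
    }

    /** Emits at most one finished entry and returns how many were emitted. */
    int emitCompleted(List<String> output) {
        String entry = completed.poll();
        if (entry == null) {
            return 0;
        }
        output.add(entry);
        return 1;
    }

    public static void main(String[] args) {
        SegmentSketch segment = new SegmentSketch();
        segment.add("a");
        segment.add("b");
        segment.add("c");
        segment.completed("c"); // finishes first although it was added last
        segment.completed("a");
        segment.completed("c"); // second completion of the same entry is ignored
        List<String> output = new ArrayList<>();
        while (segment.emitCompleted(output) == 1) {
            // keep draining finished entries
        }
        System.out.println(output); // [c, a]
    }
}
```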

Summary

Here is what the async I/O machinery does:

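  1. Put the element into the queue
  2. Run the user-written logic
  3. When the user logic finishes, it calls resultFuture.complete
  4. Emit the result downstream
    The asynchronous part is that the user logic can register a callback that calls complete once the result comes back; only then is the data sent downstream.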
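
The callback style described in the last point can be sketched without any Flink dependency. The sketch below is only an illustration under stated assumptions: ResultSink is a hypothetical stand-in for Flink's ResultFuture reduced to its complete method, and lookup is a made-up placeholder for a blocking query. The point is that asyncInvoke returns immediately and complete is called later from a callback:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Plain-Java illustration of a non-blocking asyncInvoke; not a Flink class. */
class NonBlockingInvokeSketch {
    /** Hypothetical stand-in for ResultFuture, reduced to the one method used here. */
    interface ResultSink<T> {
        void complete(Collection<T> results);
    }

    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    /** Hypothetical blocking lookup, standing in for a database query. */
    private String lookup(String id) {
        return "name-of-" + id;
    }

    /** Returns right away; the sink is completed later on a pool thread. */
    CompletableFuture<Void> asyncInvoke(String input, ResultSink<String> sink) {
        return CompletableFuture
            .supplyAsync(() -> lookup(input), pool)
            .thenAccept(name -> sink.complete(Collections.singleton(name)));
    }

    void close() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        NonBlockingInvokeSketch function = new NonBlockingInvokeSketch();
        List<String> received = new CopyOnWriteArrayList<>();
        // join() is used only so the demo waits for the callback before printing.
        function.asyncInvoke("42", received::addAll).join();
        function.close();
        System.out.println(received);
    }
}
```

In a real AsyncFunction the failure path would also need handling, for example by calling ResultFuture#completeExceptionally when the lookup throws; that is omitted here to keep the sketch short.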

The hard questions

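So here is the question: what is wrong with the sample code at the beginning? It uses the async I/O API, but is it actually asynchronous? And in which scenarios, or with which engines, can Flink's async I/O be supported?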
