Flink Kafka Connectors Source Code Walkthrough (3)

a解code

Last modified: 2023-01-11 20:17:08

Category: Flink. Tags: kafka, flink, java

First published: 2023-01-11 20:09:23

Article link: https://blog.csdn.net/adddimin/article/details/128629641

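Continuing from the previous post

This post fills the gap left by the previous one and looks at the threading model used to fetch data.
There are several points to consider:

1. Where is the fetched data stored, and if there are multiple threads, how is it stored? As the source of the data, should it follow a producer-consumer model, so that data is not produced without limit and memory does not balloon?
2. Besides fetching data, the thread also has to add new partitions when the partition information changes. How is that scheduled?
3. Exception handling in the thread.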

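Question 1

If you have read the earlier posts in this series, you know that even with multiple threads, each thread handles data from different partitions, so the order in which different threads store data does not matter. Within a single thread, however, records must be stored and retrieved strictly in order, and as the source of the data it has to implement a proper producer-consumer model.
The code for this lives mainly in FutureCompletingBlockingQueue.
It is a blocking queue, which means a thread can be held up while blocked on it, so it also provides a wakeup function for getting a thread out of the blocked state.
I recommend reading the source directly; it is thoroughly commented.
One thing worth pointing out: unlike a typical producer-consumer implementation, this one is built on CompletableFuture. CompletableFuture's get() blocks the calling thread, and the queue takes advantage of that. When there is no data, it creates a new CompletableFuture for the consumer to wait on; when data arrives, it completes that CompletableFuture with a null result.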

    public T take() throws InterruptedException {
        T next;
        while ((next = poll()) == null) {
            // use the future to wait for availability to avoid busy waiting
            try {
                getAvailabilityFuture().get();
            } catch (ExecutionException | CompletionException e) {
                // this should never happen, but we propagate just in case
                throw new FlinkRuntimeException("exception in queue future completion", e);
            }
        }
        return next;
    }

    private void moveToAvailable() {
        final CompletableFuture<Void> current = currentFuture;
        // AVAILABLE is a static constant: an already-completed future whose result is null.
        //   public static final CompletableFuture<Void> AVAILABLE = getAvailableFuture();
        if (current != AVAILABLE) {
            currentFuture = AVAILABLE;
            current.complete(null);
        }
    }

    private void moveToUnAvailable() {
        if (currentFuture == AVAILABLE) {
            currentFuture = new CompletableFuture<>();
        }
    }
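
To make the future-based waiting easier to follow, here is a minimal, self-contained sketch of the same idea. It is not Flink's class: the name FutureBackedQueue, the synchronized block, and the unbounded ArrayDeque are assumptions made only for illustration, and capacity limits are left out on purpose.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

/** Hypothetical, simplified stand-in for the availability logic; not Flink's implementation. */
final class FutureBackedQueue<T> {
    // One shared, already-completed future that means "elements are available".
    private static final CompletableFuture<Void> AVAILABLE =
            CompletableFuture.completedFuture(null);

    private final Object lock = new Object();
    private final Queue<T> queue = new ArrayDeque<>();
    // Starts as an incomplete future because the queue starts empty.
    private CompletableFuture<Void> currentFuture = new CompletableFuture<>();

    /** Adds an element and completes the pending future so that waiting consumers wake up. */
    void put(T element) {
        synchronized (lock) {
            queue.add(element);
            if (currentFuture != AVAILABLE) {
                CompletableFuture<Void> pending = currentFuture;
                currentFuture = AVAILABLE;
                pending.complete(null);
            }
        }
    }

    /** Returns the next element, or null (after switching to a fresh future) if empty. */
    T poll() {
        synchronized (lock) {
            if (queue.isEmpty()) {
                if (currentFuture == AVAILABLE) {
                    currentFuture = new CompletableFuture<>();
                }
                return null;
            }
            return queue.poll();
        }
    }

    /** The future a consumer can wait on; it is complete whenever data may be available. */
    CompletableFuture<Void> availabilityFuture() {
        synchronized (lock) {
            return currentFuture;
        }
    }

    /** Blocks on the future instead of busy-waiting until an element can be returned. */
    T take() throws InterruptedException {
        T next;
        while ((next = poll()) == null) {
            try {
                availabilityFuture().get();
            } catch (ExecutionException e) {
                throw new IllegalStateException("availability future failed unexpectedly", e);
            }
        }
        return next;
    }
}
```

A consumer thread calling take() on an empty queue parks inside get(); the next put() from a producer thread completes the pending future and the consumer retries poll().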

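Question 2

fetchTask and addSplitTask run on the same thread, so the thread has to decide what to run: if the task queue holds another task (such as addSplitTask), run that; otherwise run fetchTask. But fetchTask can block, because once FutureCompletingBlockingQueue reaches its capacity, adding to it blocks the thread. A wakeUp is needed to break that block before addSplitTask can run.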

    // The blocking code; the wait uses a Condition of a ReentrantLock
    private void waitOnPut(int fetcherIndex) throws InterruptedException {
        maybeCreateCondition(fetcherIndex);
        Condition cond = putConditionAndFlags[fetcherIndex].condition();
        notFull.add(cond);
        cond.await();
    }
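
The following sketch shows how a blocked put can be abandoned with a wake-up flag plus a Condition signal. It is a simplified, single-producer illustration: the class WakeablePutQueue and its fields are hypothetical names, and the real queue keeps one condition and flag per fetcher index rather than a single pair.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical bounded queue whose blocked producer can be woken up without a free slot. */
final class WakeablePutQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Queue<T> queue = new ArrayDeque<>();
    private final int capacity;
    private boolean wakeUpFlag; // guarded by lock

    WakeablePutQueue(int capacity) {
        this.capacity = capacity;
    }

    /** Returns true if the element was enqueued, false if the producer was woken up instead. */
    boolean put(T element) throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (queue.size() >= capacity) {
                if (wakeUpFlag) {
                    // Someone asked us to stop waiting; give up on this element for now.
                    wakeUpFlag = false;
                    return false;
                }
                notFull.await();
            }
            queue.add(element);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Removes one element, if any, and signals the producer that a slot is free. */
    T poll() {
        lock.lock();
        try {
            T next = queue.poll();
            if (next != null) {
                notFull.signal();
            }
            return next;
        } finally {
            lock.unlock();
        }
    }

    /** Sets the flag and signals the condition so that a blocked put() returns false. */
    void wakeUpPuttingThread() {
        lock.lock();
        try {
            wakeUpFlag = true;
            notFull.signal();
        } finally {
            lock.unlock();
        }
    }
}
```

Because the flag is checked inside the same loop that guards await(), the producer sees the wake-up whether it arrives before or after the thread starts waiting.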

The block is broken right after the AddSplitsTask is enqueued, by calling wakeUp() on the splitFetcher:

    public void addSplits(List<SplitT> splitsToAdd) {
        // First enqueue the task
        enqueueTask(new AddSplitsTask<>(splitReader, splitsToAdd, assignedSplits));
        // Then wake up the fetcher
        wakeUp(true);
    }

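There is more to this wakeUp than it seems. Besides waking up a fetchTask that may be blocked while adding data to the queue, it covers one more case: if the Kafka consumer is about to read or is in the middle of reading data, the wakeUp interrupts that read, and when addSplitTask runs next, the partition assignment gets updated.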

    void wakeUp(boolean taskOnly) {
        // Synchronize to make sure the wake up only works for the current invocation of runOnce().
        synchronized (wakeUp) {
            // Do not wake up repeatedly.
            wakeUp.set(true);
            // Now the wakeUp flag is set.
            SplitFetcherTask currentTask = runningTask;
            if (isRunningTask(currentTask)) {
                // The running task may have missed our wakeUp flag and running, wake it up.
                LOG.debug("Waking up running task {}", currentTask);
                currentTask.wakeUp();
            } else if (!taskOnly) {
                // The task has not started running yet, and it will not run for this
                // runOnce() invocation due to the wakeUp flag. But we might have to
                // wake up the fetcher thread in case it is blocking on the task queue.
                // Only wake up when the thread has started and there is no running task.
                LOG.debug("Waking up fetcher thread.");
                taskQueue.add(WAKEUP_TASK);
            }
        }
    }
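
The interplay between enqueueing a task and waking up the fetch can be sketched as a tiny fetcher loop. Everything here is hypothetical (MiniFetcher, the Semaphore that stands in for a blocking read or a blocking put, the string split names); it only shows the ordering "enqueue first, then wake up" and the rule that queued tasks take priority over fetching.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

/** Hypothetical single-threaded fetcher loop; not Flink's SplitFetcher. */
final class MiniFetcher implements Runnable {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    final List<String> assignedSplits = new CopyOnWriteArrayList<>();
    // Stand-in for a blocking fetch: each permit means "stop blocking and return".
    private final Semaphore wakeUpSignal = new Semaphore(0);
    private volatile boolean closed;

    /** Enqueue the task first, then wake up a fetch that may be blocking. */
    void addSplits(List<String> splits) {
        taskQueue.add(() -> assignedSplits.addAll(splits));
        wakeUp();
    }

    void wakeUp() {
        wakeUpSignal.release();
    }

    void close() {
        closed = true;
        wakeUp();
    }

    @Override
    public void run() {
        while (!closed) {
            Runnable task = taskQueue.poll();
            if (task != null) {
                task.run(); // queued tasks such as "add splits" take priority
            } else {
                fetch(); // otherwise keep fetching
            }
        }
    }

    /** Simulates a fetch that blocks until it is woken up. */
    private void fetch() {
        try {
            wakeUpSignal.acquire();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            closed = true;
        }
    }
}
```

If the task is picked up before the fetch starts blocking, the leftover permit only causes one fetch call to return immediately, which is harmless in this sketch.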


The wakeUp in fetchTask:

    public void wakeUp() {
        // Set the wakeup flag first.
        wakeup = true;
        if (lastRecords == null) {
            // Two possible cases:
            // 1. The splitReader is reading or is about to read the records.
            // 2. The records has been enqueued and set to null.
            // In case 1, we just wakeup the split reader. In case 2, the next run might be skipped.
            // In any case, the records won't be enqueued in the ongoing run().
            splitReader.wakeUp();
        } else {
            // The task might be blocking on enqueuing the records, just interrupt.
            elementsQueue.wakeUpPuttingThread(fetcherIndex);
        }
    }