package java.util.concurrent;
import java.lang.Thread.UncaughtExceptionHandler;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.security.AccessControlContext;
import java.security.ProtectionDomain;
import java.security.Permissions;
/**
* An {@link ExecutorService} for running {@link ForkJoinTask}s.
* A {@code ForkJoinPool} provides the entry point for submissions
* from non-{@code ForkJoinTask} clients, as well as management and
* monitoring operations.
*
* <p>A {@code ForkJoinPool} differs from other kinds of {@link
* ExecutorService} mainly by virtue of employing
* <em>work-stealing</em>: all threads in the pool attempt to find and
* execute tasks submitted to the pool and/or created by other active
* tasks (eventually blocking waiting for work if none exist). This
* enables efficient processing when most tasks spawn other subtasks
* (as do most {@code ForkJoinTask}s), as well as when many small
* tasks are submitted to the pool from external clients. Especially
* when setting <em>asyncMode</em> to true in constructors, {@code
* ForkJoinPool}s may also be appropriate for use with event-style
* tasks that are never joined.
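*
* <p>A minimal sketch of the event-style usage described above. The class
* name {@code AsyncModeSketch}, the parallelism of 2 and the 10-second
* timeout are assumptions made for illustration. Tasks are handed over
* with {@code execute} and never joined; the caller waits for the pool
* to terminate instead.
*
```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncModeSketch {
    // Runs `events` fire-and-forget tasks and returns how many completed.
    static int runEvents(int events) {
        // The last argument (asyncMode = true) requests FIFO processing of
        // each worker's local queue, which suits tasks that are never joined.
        ForkJoinPool pool = new ForkJoinPool(
                2, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true);
        AtomicInteger handled = new AtomicInteger();
        Runnable event = handled::incrementAndGet;
        for (int i = 0; i < events; i++) {
            pool.execute(event);
        }
        pool.shutdown(); // already-submitted tasks still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }
}
```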
*
* <p>A static {@link #commonPool()} is available and appropriate for
* most applications. The common pool is used by any ForkJoinTask that
* is not explicitly submitted to a specified pool. Using the common
* pool normally reduces resource usage (its threads are slowly
* reclaimed during periods of non-use, and reinstated upon subsequent
* use).
*
* <p>For applications that require separate or custom pools, a {@code
* ForkJoinPool} may be constructed with a given target parallelism
* level; by default, equal to the number of available processors. The
* pool attempts to maintain enough active (or available) threads by
* dynamically adding, suspending, or resuming internal worker
* threads, even if some tasks are stalled waiting to join others.
* However, no such adjustments are guaranteed in the face of blocked
* I/O or other unmanaged synchronization. The nested {@link
* ManagedBlocker} interface enables extension of the kinds of
* synchronization accommodated.
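*
* <p>As a hedged illustration of the {@code ManagedBlocker} extension
* point, the sketch below wraps a blocking queue take so that the pool
* may compensate while the caller is blocked. The class and method names
* ({@code QueueTakerSketch}, {@code take}) are hypothetical, and
* returning {@code null} on interruption is a simplification.
*
```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicReference;

class QueueTakerSketch {
    // Takes one element from `queue`; returns null if interrupted while waiting.
    static <E> E take(final BlockingQueue<E> queue) {
        final AtomicReference<E> item = new AtomicReference<>();
        try {
            ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
                public boolean block() throws InterruptedException {
                    if (item.get() == null) {
                        item.set(queue.take()); // may block the calling thread
                    }
                    return true; // no further blocking is necessary
                }
                public boolean isReleasable() {
                    if (item.get() != null) {
                        return true;
                    }
                    E polled = queue.poll(); // non-blocking attempt first
                    if (polled != null) {
                        item.set(polled);
                    }
                    return polled != null;
                }
            });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return item.get();
    }
}
```
*
* The same call also works from a thread outside the pool, in which case
* it simply blocks as usual.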
*
* <p>In addition to execution and lifecycle control methods, this
* class provides status check methods (for example
* {@link #getStealCount}) that are intended to aid in developing,
* tuning, and monitoring fork/join applications. Also, method
* {@link #toString} returns indications of pool state in a
* convenient form for informal monitoring.
*
* <p>As is the case with other ExecutorServices, there are three
* main task execution methods summarized in the following table.
* These are designed to be used primarily by clients not already
* engaged in fork/join computations in the current pool. The main
* forms of these methods accept instances of {@code ForkJoinTask},
* but overloaded forms also allow mixed execution of plain {@code
* Runnable}- or {@code Callable}- based activities as well. However,
* tasks that are already executing in a pool should normally instead
* use the within-computation forms listed in the table unless using
* async event-style tasks that are not usually joined, in which case
* there is little difference among choice of methods.
*
* <table BORDER CELLPADDING=3 CELLSPACING=1>
* <caption>Summary of task execution methods</caption>
* <tr>
* <td></td>
* <td ALIGN=CENTER> <b>Call from non-fork/join clients</b></td>
* <td ALIGN=CENTER> <b>Call from within fork/join computations</b></td>
* </tr>
* <tr>
* <td> <b>Arrange async execution</b></td>
* <td> {@link #execute(ForkJoinTask)}</td>
* <td> {@link ForkJoinTask#fork}</td>
* </tr>
* <tr>
* <td> <b>Await and obtain result</b></td>
* <td> {@link #invoke(ForkJoinTask)}</td>
* <td> {@link ForkJoinTask#invoke}</td>
* </tr>
* <tr>
* <td> <b>Arrange exec and obtain Future</b></td>
* <td> {@link #submit(ForkJoinTask)}</td>
* <td> {@link ForkJoinTask#fork} (ForkJoinTasks <em>are</em> Futures)</td>
* </tr>
* </table>
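*
* <p>The table can be illustrated with a small sketch. {@code SumTask},
* its helper methods and the split threshold of 1000 elements are
* assumptions introduced for this example only. External callers use
* {@code invoke} or {@code submit} on the pool, while code already
* running inside the computation uses {@code fork} and {@code join}.
*
```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private final int[] data;
    private final int lo, hi;

    SumTask(int[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    protected Long compute() {
        if (hi - lo <= 1_000) {               // small enough: sum sequentially
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        left.fork();                          // within-computation async form
        long right = new SumTask(data, mid, hi).compute();
        return right + left.join();
    }

    // External client: await and obtain the result.
    static long sumWithInvoke(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    // External client: arrange execution and obtain a Future-like handle.
    static long sumWithSubmit(int[] data) {
        ForkJoinTask<Long> future =
                ForkJoinPool.commonPool().submit(new SumTask(data, 0, data.length));
        return future.join();
    }
}
```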
*
* <p>The common pool is by default constructed with default
* parameters, but these may be controlled by setting three
* {@linkplain System#getProperty system properties}:
* <ul>
* <li>{@code java.util.concurrent.ForkJoinPool.common.parallelism}
* - the parallelism level, a non-negative integer
*
* <li>{@code java.util.concurrent.ForkJoinPool.common.threadFactory}
* - the class name of a {@link ForkJoinWorkerThreadFactory}
*
* <li>{@code java.util.concurrent.ForkJoinPool.common.exceptionHandler}
* - the class name of a {@link UncaughtExceptionHandler}
* </ul>
*
* If a {@link SecurityManager} is present and no factory is
* specified, then the default pool uses a factory supplying
* threads that have no {@link Permissions} enabled.
* The system class loader is used to load these classes.
* Upon any error in establishing these settings, default parameters
* are used. It is possible to disable or limit the use of threads in
* the common pool by setting the parallelism property to zero, and/or
* using a factory that may return {@code null}. However doing so may
* cause unjoined tasks to never be executed.
*
* <p><b>Implementation notes</b>: This implementation restricts the
* maximum number of running threads to 32767, that is, (1 << 15) - 1.
* Attempts to create pools with greater than the maximum number
* result in {@code IllegalArgumentException}.
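*
* <p>A small sketch of the limit in practice, using a hypothetical
* helper class {@code ParallelismLimitSketch}. It only probes whether
* the constructor accepts a given parallelism; no worker threads are
* started by constructing a pool.
*
```java
import java.util.concurrent.ForkJoinPool;

class ParallelismLimitSketch {
    static final int MAX_THREADS = (1 << 15) - 1; // 32767, the documented limit

    // Returns true if constructing a pool with this parallelism is rejected.
    static boolean isRejected(int parallelism) {
        try {
            new ForkJoinPool(parallelism).shutdown();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```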
*
* <p>This implementation rejects submitted tasks (that is, by throwing
* {@link RejectedExecutionException}) only when the pool is shut down
* or internal resources have been exhausted.
*
* @since 1.7
* @author Doug Lea
*/
/**
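* Annotator's notes:
*
* 1. When a task is submitted externally and there is no WorkQueue at
* the index of workQueues that the submitting thread maps to, a
* WorkQueue is created and stored at that index.
*
* 2. When a ForkJoinWorkerThread is registered, its constructor calls
* ForkJoinPool.registerWorker(), which creates a WorkQueue, registers
* it in workQueues, and returns it to the ForkJoinWorkerThread, which
* also keeps a reference to it. The task array is not allocated when
* the thread is created; growArray() allocates the ForkJoinTask<?>[]
* only once the thread starts running.
*
* 3. What is the purpose of b = q.base and (a.length - 1) & b?
* base is incremented each time an element is taken (it is never reset
* or decreased, not even on resize), and top is incremented each time
* an element is inserted, so top - base is the number of queued tasks.
* As long as top - base does not exceed the length of the array, the
* array can hold more tasks. base keeps growing and will exceed
* a.length when there are many tasks, but tasks are also continually
* removed, so the array can be used circularly; (a.length - 1) & b
* computes the slot in which a task is stored. Usage starts at the
* middle index of the array and wraps around.
*
* 4. CountedCompleter and ordinary ForkJoinTasks differ in how task
* completion is awaited. A CountedCompleter can only wait for its root
* task to complete, whereas an ordinary ForkJoinTask can wait for a
* subtask. So when join() helps out, for an ordinary ForkJoinTask it is
* enough to check whether the task another worker most recently stole
* is the task being awaited. For a CountedCompleter, the helper must
* instead walk up the completer chain level by level to check whether
* a candidate is a subtask of the awaited task; comparing against the
* worker's currently stolen task is not sufficient.
*
* 5. Before join() blocks, it compensates: if only one active worker
* remains in the pool, a new worker is registered to run tasks, and
* the joining thread may block only after that registration succeeds.
* As a result the number of live threads can exceed parallelism.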
*
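* A minimal sketch of the index arithmetic described in note 3. The class
* and method names are hypothetical, and the array length of 8 used in
* the usage note is chosen only to keep the numbers small. Masking with
* (length - 1) is valid only because the length is a power of two.
*
```java
class CircularIndexSketch {
    // Maps an ever-increasing index to an array slot; `length` must be a power of two.
    static int slot(int index, int length) {
        return index & (length - 1);
    }

    // Number of queued tasks; remains correct even after the counters overflow.
    static int size(int base, int top) {
        return top - base;
    }
}
```
*
* With length 8 and base = top = 4, six pushes move top to 10 and fill
* slots 4, 5, 6, 7, 0, 1 in that order, so size(4, 10) is 6.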
*/
@sun.misc.Contended
public class ForkJoinPool extends AbstractExecutorService {
/*
* Implementation Overview
*
* This class and its nested classes provide the main
* functionality and control for a set of worker threads:
* Submissions from non-FJ threads enter into submission queues.
* Workers take these tasks and typically split them into subtasks
* that may be stolen by other workers. Preference rules give
* first priority to processing tasks from their own queues (LIFO
* or FIFO, depending on mode), then to randomized FIFO steals of
* tasks in other queues.
*
* WorkQueues
* ==========
*
* Most operations occur within work-stealing queues (in nested
* class WorkQueue). These are special forms of Deques that
* support only three of the four possible end-operations -- push,
* pop, and poll (aka steal), under the further constraints that
* push and pop are called only from the owning thread (or, as
* extended here, under a lock), while poll may be called from
* other threads. (If you are unfamiliar with them, you probably
* want to read Herlihy and Shavit's book "The Art of
* Multiprocessor Programming", chapter 16 describing these in
* more detail before proceeding.) The main work-stealing queue
* design is roughly similar to those in the papers "Dynamic
* Circular Work-Stealing Deque" by Chase and Lev, SPAA 2005
* (http://research.sun.com/scalable/pubs/index.html) and
* "Idempotent work stealing" by Michael, Saraswat, and Vechev,
* PPoPP 2009 (http://portal.acm.org/citation.cfm?id=1504186).
* See also "Correct and Efficient Work-Stealing for Weak Memory
* Models" by Le, Pop, Cohen, and Nardelli, PPoPP 2013
* (http://www.di.ens.fr/~zappa/readings/ppopp13.pdf) for an
* analysis of memory ordering (atomic, volatile etc) issues. The
* main differences ultimately stem from GC requirements that we
* null out taken slots as soon as we can, to maintain as small a
* footprint as possible even in programs generating huge numbers
* of tasks. To accomplish this, we shift the CAS arbitrating pop
* vs poll (steal) from being on the indices ("base" and "top") to
* the slots themselves. So, both a successful pop and poll
* mainly entail a CAS of a slot from non-null to null. Because
* we rely on CASes of references, we do not need tag bits on base
* or top. They are simple ints as used in any circular
* array-based queue (see for example ArrayDeque). Updates to the
* indices must still be ordered in a way that guarantees that top
* == base means the queue is empty, but otherwise may err on the
* side of possibly making the queue appear nonempty when a push,
* pop, or poll has not fully committed. Note that this means
* that the poll operation, considered individually, is not
* wait-free. One thief cannot successfully continue until another
* in-progress one (or, if previously empty, a push) completes.
* However, in the aggregate, we ensure at least probabilistic
* non-blockingness. If an attempted steal fails, a thief always
* chooses a different random victim target to try next. So, in
* order for one thief to progress, it suffices for any
* in-progress poll or new push on any empty queue to
* complete. (This is why we normally use method pollAt and its
* variants that try once at the apparent base index, else
* consider alternative actions, rather than method poll.)
*
* This approach also enables support of a user mode in which local
* task processing is in FIFO, not LIFO order, simply by using
* poll rather than pop. This can be useful in message-passing
* frameworks in which tasks are never joined. However neither
* mode considers affinities, loads, cache localities, etc, so
* rarely provide the best possible performance on a given
* machine, but portably provide good throughput by averaging over
* these factors. (Further, even if we did try to use such
* information, we do not usually have a basis for exploiting it.
* For example, some sets of tasks profit from cache affinities,
* but others are harmed by cache pollution effects.)
*
* WorkQueues are also used in a similar way for tasks submitted
* to the pool. We cannot mix these tasks in the same queues used
* for work-stealing (this would contaminate lifo/fifo
* processing). Instead, we randomly associate submission queues
* with submitting threads, using a form of hashing. The
* ThreadLocalRandom probe value serves as a hash code for
* choosing existing queues, and may be randomly repositioned upon
* contention with other submitters. In essence, submitters act
* like workers except that they are restricted to executing local
* tasks that they submitted (or in the case of CountedCompleters,
* others with the same root task). However, because most
* shared/external queue operations are more expensive than
* internal, and because, at steady state, external submitters
* will compete for CPU with workers, ForkJoinTask.join and
* related methods disable them from repeatedly helping to process
* tasks if all workers are active. Insertion of tasks in shared
* mode requires a lock (mainly to protect in the case of
* resizing) but we use only a simple spinlock (using bits in
* field qlock), because submitters encountering a busy queue move
* on to try or create other queues -- they block only when
* creating and registering new queues.
*
* Management
* ==========
*
* The main throughput advantages of work-stealing stem from
* decentralized control -- workers mostly take tasks from
* themselves or each other. We cannot negate this in the
* implementation of other management responsibilities. The main
* tactic for avoiding bottlenecks is packing nearly all
* essentially atomic control state into two volatile variables
* that are by far most often read (not written) as status and
* consistency checks.
*
* Field "ctl" contains 64 bits holding all the information needed
* to atomically decide to add, inactivate, enqueue (on an event
* queue), dequeue, and/or re-activate workers. To enable this
* packing, we restrict maximum parallelism to (1<<15)-1 (which is
* far in excess of normal operating range) to allow ids, counts,
* and their negations (used for thresholding) to fit into 16-bit
* fields.
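*
* The packing idea can be sketched as follows. The field positions
* (AC_SHIFT, TC_SHIFT), the class name and the helper methods are
* assumptions chosen for illustration, not a statement of the exact
* layout used by this class. Counts are stored minus the parallelism
* level, so "fewer active workers than the target" becomes a sign test.
*
```java
class CtlPackingSketch {
    static final int AC_SHIFT = 48; // assumed position of the active-count field
    static final int TC_SHIFT = 32; // assumed position of the total-count field

    // Packs both counts, stored as (count - parallelism), plus a 32-bit low word.
    static long pack(int active, int total, int parallelism, int low) {
        long ac = (long) (active - parallelism) << AC_SHIFT;
        long tc = ((long) (total - parallelism) << TC_SHIFT) & (0xffffL << TC_SHIFT);
        return ac | tc | (low & 0xffffffffL);
    }

    static int activeCount(long ctl, int parallelism) {
        return parallelism + (int) (ctl >> AC_SHIFT);    // arithmetic shift keeps the sign
    }

    static int totalCount(long ctl, int parallelism) {
        return parallelism + (short) (ctl >>> TC_SHIFT); // low 16 bits, sign-extended
    }

    // True while fewer workers are active than the parallelism target.
    static boolean belowTarget(long ctl) {
        return ctl < 0L;
    }
}
```
*
* Because every field lives in one long, a single CAS can change a
* count and the low word together.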
*
* Field "plock" is a form of sequence lock with a saturating
* shutdown bit (similarly for per-queue "qlocks"), mainly
* protecting updates to the workQueues array, as well as to
* enable shutdown. When used as a lock, it is normally only very
* briefly held, so is nearly always available after at most a
* brief spin, but we use a monitor-based backup strategy to
* block when needed.
*
* Recording WorkQueues. WorkQueues are recorded in the
* "workQueues" array that is created upon first use and expanded
* if necessary. Updates to the array while recording new workers
* and unrecording terminated ones are protected from each other
* by a lock but the array is otherwise concurrently readable, and
* accessed directly. To simplify index-based operations, the
* array size is always a power of two, and all readers must
* tolerate null slots. Worker queues are at odd indices. Shared
* (submission) queues are at even indices, up to a maximum of 64
* slots, to limit growth even if array needs to expand to add
* more workers. Grouping them together in this way simplifies and
* speeds up task scanning.
*
* All worker thread creation is on-demand, triggered by task
* submissions, replacement of terminated workers, and/or
* compensation for blocked workers. However, all other support
* code is set up to work with other policies. To ensure that we
* do not hold on to worker references that would prevent GC, ALL
* accesses to workQueues are via indices into the workQueues
* array (which is one source of some of the messy code
* constructions here). In essence, the workQueues array serves as
* a weak reference mechanism. Thus for example the wait queue
* field of ctl stores indices, not references. Access to the
* workQueues in associated methods (for example signalWork) must
* both index-check and null-check the IDs. All such accesses
* ignore bad IDs by returning out early from what they are doing,
* since this can only be associated with termination, in which
* case it is OK to give up. All uses of the workQueues array
* also check that it is non-null (even if previously
* non-null). This allows nulling during termination, which is
* currently not necessary, but remains an option for
* resource-revocation-based shutdown schemes. It also helps
* reduce JIT issuance of uncommon-trap code, which tends to
* unnecessarily complicate control flow in some methods.
*
* Event Queuing. Unlike HPC work-stealing frameworks, we cannot
* let workers spin indefinitely scanning for tasks when none can
* be found immediately, and we cannot start/resume workers unless
* there appear to be tasks available. On the other hand, we must
* quickly prod them into action when new tasks are submitted or
* generated. In many usages, ramp-up time to activate workers is
* the main limiting factor in overall performance (this is
* compounded at program start-up by JIT compilation and
* allocation). So we try to streamline this as much as possible.
* We park/unpark workers after placing in an event wait queue
* when they cannot find work. This "queue" is actually a simple
* Treiber stack, headed by the "id" field of ctl, plus a 15-bit
* counter value (that reflects the number of times a worker has
* been inactivated) to avoid ABA effects (we need only as many
* version numbers as worker threads). Successors are held in
* field WorkQueue.nextWait. Queuing deals with several intrinsic
* races, mainly that a task-producing thread can miss seeing (and
* signalling) another thread that gave up looking for work but
* has not yet entered the wait queue. We solve this by requiring
* a full sweep of all workers (via repeated calls to method
* scan()) both before and after a newly waiting worker is added
* to the wait queue. Because enqueued workers may actually be
* rescanning rather than waiting, we set and clear the "parker"
* field of WorkQueues to reduce unnecessary calls to unpark.
* (This requires a secondary recheck to avoid missed signals.)
* Note the unusual conventions about Thread.interrupts
* surrounding parking and other blocking: Because interrupts are
* used solely to alert threads to check termination, which is
* checked anyway upon blocking, we clear status (using
* Thread.interrupted) before any call to park, so that park does
* not immediately return due to status being set via some other
* unrelated call to interrupt in user code.
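*
* A rough sketch of an index-based Treiber stack with a version stamp,
* in the spirit of the wait queue described above. It is a
* simplification under stated assumptions: the class WaitStackSketch is
* hypothetical, the version is a single 32-bit counter kept next to the
* head index rather than a per-worker 15-bit count, and no parking is
* shown. A worker is assumed to push only its own index, and only while
* it is not already on the stack.
*
```java
import java.util.concurrent.atomic.AtomicLong;

class WaitStackSketch {
    private static final int NONE = -1;
    private final int[] nextWait;   // successor index of each waiting worker
    private final AtomicLong head;  // high 32 bits: version, low 32 bits: top index

    WaitStackSketch(int workers) {
        nextWait = new int[workers];
        head = new AtomicLong(pack(0, NONE));
    }

    private static long pack(int version, int index) {
        return ((long) version << 32) | (index & 0xffffffffL);
    }

    // Pushes worker `index` as the newest waiter.
    void push(int index) {
        for (;;) {
            long h = head.get();
            nextWait[index] = (int) h; // remember the previous top
            if (head.compareAndSet(h, pack((int) (h >>> 32) + 1, index))) {
                return;
            }
        }
    }

    // Pops the most recent waiter, or returns -1 if none is waiting.
    int pop() {
        for (;;) {
            long h = head.get();
            int top = (int) h;
            if (top == NONE) {
                return NONE;
            }
            // The version bump makes a stale head value fail the CAS (ABA guard).
            if (head.compareAndSet(h, pack((int) (h >>> 32) + 1, nextWait[top]))) {
                return top;
            }
        }
    }
}
```
*
* Storing indices rather than references matches the rule above that
* queues are reached only through the workQueues array.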
*
* Signalling. We create or wake up workers only when there
* appears to be at least one task they might be able to find and
* execute. When a submission is added or another worker adds a
* task to a queue that has fewer than two tasks, they signal
* waiting workers (or trigger creation of new ones if fewer than
* the given parallelism level -- signalWork). These primary
* signals are buttressed by others whenever other threads remove
* a task from a queue and notice that there are other tasks there
* as well. So in general, pools will be over-signalled. On most
* platforms, signalling (unpark) overhead time is noticeably
* long, and the time between signalling a thread and it actually
* making progress can be very noticeably long, so it is worth
* offloading these delays from critical paths as much as
* possible. Additionally, workers spin-down gradually, by staying
* alive so long as they see the ctl state changing. Similar
* stability-sensing techniques are also used before blocking in
* awaitJoin and helpComplete.
*
* Trimming workers. To release resources after periods of lack of
* use, a worker starting to wait when the pool is quiescent will
* time out and terminate if the pool has remained quiescent for a
* given period -- a short period if there are more threads than
* parallelism, longer as the number of threads decreases. This
* will slowly propagate, eventually terminating all workers after
* periods of non-use.
*
* Shutdown and Termination. A call to shutdownNow atomically sets
* a plock bit and then (non-atomically) sets each worker's
* qlock status, cancels all unprocessed tasks, and wakes up
* all waiting workers. Detecting whether termination should
* commence after a non-abrupt shutdown() call requires more work
* and bookkeeping. We need consensus about quiescence (i.e., that
* there is no more work). The active count provides a primary
* indication but non-abrupt shutdown still requires a rechecking
* scan for any workers that are inactive but not queued.
*
* Joining Tasks
* =============
*
* Any of several actions may be taken when one worker is waiting
* to join a task stolen (or always held) by another. Because we
* are multiplexing many tasks on to a pool of workers, we can't
* just let them block (as in Thread.join). We also cannot just
* reassign the joiner's run-time stack with another and replace
* it later, which would be a form of "continuation", that even if
* possible is not necessarily a good idea since we sometimes need
* both an unblocked task and its continuation to progress.
* Instead we combine two tactics:
*
* Helping: Arranging for the joiner to execute some task that it
* would be running if the steal had not occurred.
*
* Compensating: Unless there are already enough live threads,
* method tryCompensate() may create or re-activate a spare
* thread to compensate for blocked joiners until they unblock.
*
* A third form (implemented in tryRemoveAndExec) amounts to
* helping a hypothetical compensator: If we can readily tell that
* a possible action of a compensator is to steal and execute the
* task being joined, the joining thread can do so directly,
* without the need for a compensation thread (although at the
* expense of larger run-time stacks, but the tradeoff is
* typically worthwhile).
*
* The ManagedBlocker extension API can't use helping so relies
* only on compensation in method awaitBlocker.
*
* The algorithm in tryHelpStealer entails a form of "linear"
* helping: Each worker records (in field currentSteal) the most
* recent task it stole from some other worker. Plus, it records
* (in field currentJoin) the task it is currently actively
* joining. Method tryHelpStealer uses these markers to try to
* find a worker to help (i.e., steal back a task from and execute
* it) that could hasten completion of the actively joined task.
* In essence, the joiner executes a task that would be on its own