Original article: https://github.com/nathanmarz/storm/wiki/Understanding-the-parallelism-of-a-Storm-topology
Storm distinguishes between the following three main entities that are used to actually run a topology in a Storm cluster:
Worker processes
Executors (threads)
Tasks
Their relationships can be summarized as follows:
A worker process executes a subset of a topology. A worker process belongs to a specific topology and may run one or more executors for one or more components (spouts or bolts) of this topology. A running topology consists of many such processes running on many machines within a Storm cluster.
An executor is a thread that is spawned by a worker process. It may run one or more tasks for the same component (spout or bolt).
A task performs the actual data processing — each spout or bolt that you implement in your code executes as many tasks across the cluster. The number of tasks for a component is always the same throughout the lifetime of a topology, but the number of executors (threads) for a component can change over time. This means that the following condition holds true: #threads ≤ #tasks. By default, the number of tasks is set to be the same as the number of executors, i.e. Storm will run one task per thread.
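The fixed-tasks, variable-executors rule can be illustrated with a minimal sketch (plain Python, not the Storm API): a component's tasks are assigned round-robin onto however many executor threads are currently allocated, so the executor count can change at rebalance time while the task count stays constant.

```python
# Sketch of Storm's task-to-executor assignment (illustrative only).
# A component's task count is fixed; its executor count may vary,
# subject to #executors (threads) <= #tasks.

def assign_tasks(num_tasks, num_executors):
    """Round-robin the fixed task ids onto the executors.

    Returns one list of task ids per executor.
    """
    if num_executors > num_tasks:
        raise ValueError("#executors must be <= #tasks")
    executors = [[] for _ in range(num_executors)]
    for task_id in range(num_tasks):
        executors[task_id % num_executors].append(task_id)
    return executors

# Default: #tasks == #executors, i.e. one task per thread.
print(assign_tasks(4, 4))  # [[0], [1], [2], [3]]

# After scaling down to 2 executors, the same 4 tasks still exist;
# each thread now runs 2 of them.
print(assign_tasks(4, 2))  # [[0, 2], [1, 3]]
```

The function name and round-robin policy are assumptions for illustration; the point is only the invariant that threads never outnumber tasks.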
worker    -> process
executor  -> thread
component -> the static definition of a spout or bolt in the topology
task      -> a running instance of a component (spout or bolt)

worker : executor = 1 : N  (N >= 1)
executor : task   = 1 : N  (N >= 1)
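The mapping above can be expressed as a small data model (a hypothetical sketch in Python, not Storm code): one worker process owns one or more executors, each executor runs tasks of exactly one component, and the component name "green-bolt" is made up for the example.

```python
# Illustrative data model of Storm's worker -> executor -> task hierarchy.
from dataclasses import dataclass, field


@dataclass
class Task:
    component: str  # which spout/bolt this task is an instance of
    task_id: int


@dataclass
class Executor:
    component: str  # an executor runs tasks of ONE component
    tasks: list = field(default_factory=list)  # 1 : N, N >= 1


@dataclass
class Worker:
    executors: list = field(default_factory=list)  # 1 : N, N >= 1


# One worker running two executors for a hypothetical "green-bolt"
# component, each executor carrying two tasks: 2 threads, 4 tasks.
worker = Worker(executors=[
    Executor("green-bolt", [Task("green-bolt", 0), Task("green-bolt", 1)]),
    Executor("green-bolt", [Task("green-bolt", 2), Task("green-bolt", 3)]),
])

num_threads = len(worker.executors)
num_tasks = sum(len(e.tasks) for e in worker.executors)
print(num_threads, num_tasks)  # 2 4
```

This mirrors the ratios in the list: the worker holds N executors and each executor holds N tasks, with N at least 1 at each level.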