Architecture and Process Model

Translation, 2015-11-20 16:55:57


The Processes

When the Flink system is started, it brings up the JobManager and one or more TaskManagers. The JobManager is the coordinator of the Flink system, while the TaskManagers are the workers that execute parts of the parallel programs. When the system is started in local mode, a single JobManager and a single TaskManager are brought up within the same JVM.

When a program is submitted, a client is created that performs the pre-processing and turns the program into the parallel data flow form that is executed by the JobManager and TaskManagers. The figure below illustrates the different actors in the system and their interactions.

Flink Process Model
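
To make the coordinator/worker split concrete, here is a minimal sketch in plain Java. It is a toy model, not Flink's runtime: the classes ToyJobManager and ToyTaskManager and the method runLocally are hypothetical names invented for illustration, and thread pools stand in for TaskManagers. Everything runs inside one JVM, loosely mirroring what the text says about local mode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: hypothetical classes, not part of Flink.
class LocalModeSketch {

    /** Stands in for a TaskManager: a worker that executes the subtasks it is handed. */
    static final class ToyTaskManager {
        private final ExecutorService slots;

        ToyTaskManager(int numSlots) {
            this.slots = Executors.newFixedThreadPool(numSlots);
        }

        Future<Long> execute(Callable<Long> subtask) {
            return slots.submit(subtask);
        }

        void shutdown() {
            slots.shutdown();
        }
    }

    /** Stands in for the JobManager: splits a job into parallel subtasks and coordinates the workers. */
    static final class ToyJobManager {
        private final List<ToyTaskManager> workers;

        ToyJobManager(List<ToyTaskManager> workers) {
            this.workers = workers;
        }

        /** Sums 1..n with `parallelism` subtasks, assigned round-robin to the workers. */
        long runSumJob(int n, int parallelism) {
            List<Future<Long>> partials = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                final int subtaskIndex = i;
                Callable<Long> subtask = () -> {
                    long partial = 0;
                    // Subtask i handles i+1, i+1+parallelism, i+1+2*parallelism, ...
                    for (int v = subtaskIndex + 1; v <= n; v += parallelism) {
                        partial += v;
                    }
                    return partial;
                };
                partials.add(workers.get(i % workers.size()).execute(subtask));
            }
            long total = 0;
            try {
                for (Future<Long> partial : partials) {
                    total += partial.get(); // wait for each subtask to finish
                }
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException("subtask failed", e);
            }
            return total;
        }
    }

    /** "Local mode" in this toy: coordinator and workers all live in the current JVM. */
    static long runLocally(int n, int parallelism, int numWorkers) {
        List<ToyTaskManager> workers = new ArrayList<>();
        for (int i = 0; i < numWorkers; i++) {
            workers.add(new ToyTaskManager(2));
        }
        try {
            return new ToyJobManager(workers).runSumJob(n, parallelism);
        } finally {
            for (ToyTaskManager worker : workers) {
                worker.shutdown();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(runLocally(100, 4, 2)); // prints 5050
    }
}
```

In the real system the client, JobManager, and TaskManagers are separate actors that exchange messages. The sketch only shows the division of labor: one coordinator decides how the work is split, and the workers execute the pieces in parallel.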

Component Stack

As a software stack, Flink is a layered system. The different layers of the stack build on top of each other and raise the abstraction level of the program representations they accept:

  • The runtime layer receives a program in the form of a JobGraph. A JobGraph is a generic parallel data flow with arbitrary tasks that consume and produce data streams.

  • Both the DataStream API and the DataSet API generate JobGraphs through separate compilation processes. The DataSet API uses an optimizer to determine the optimal plan for the program, while the DataStream API uses a stream builder.

  • The JobGraph is executed according to one of the deployment options available in Flink (e.g., local, remote, YARN).

  • Libraries and APIs that are bundled with Flink generate DataSet or DataStream API programs. These are Table for queries on logical tables, FlinkML for Machine Learning, and Gelly for graph processing.
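
As a rough illustration of what "a generic parallel data flow" can look like as a data structure, the following plain-Java sketch models a job graph as named vertices, each with a degree of parallelism, connected by directed edges. The class JobGraphSketch and its methods are hypothetical and are not Flink's JobGraph API. The sketch also assumes the data flow is acyclic, so that a producer-before-consumer ordering exists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a hypothetical stand-in, not Flink's JobGraph class.
class JobGraphSketch {
    // Vertex name -> number of parallel subtasks for that vertex.
    private final Map<String, Integer> parallelism = new LinkedHashMap<>();
    // Vertex name -> names of the vertices that consume its output stream.
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();

    JobGraphSketch addVertex(String name, int degreeOfParallelism) {
        parallelism.put(name, degreeOfParallelism);
        downstream.put(name, new ArrayList<>());
        return this;
    }

    JobGraphSketch connect(String producer, String consumer) {
        if (!parallelism.containsKey(producer) || !parallelism.containsKey(consumer)) {
            throw new IllegalArgumentException("unknown vertex in edge " + producer + " -> " + consumer);
        }
        downstream.get(producer).add(consumer);
        return this;
    }

    /** Total number of parallel subtasks the workers would have to run. */
    int totalSubtasks() {
        int sum = 0;
        for (int p : parallelism.values()) {
            sum += p;
        }
        return sum;
    }

    /** Orders vertices so that every producer precedes its consumers (Kahn's algorithm). */
    List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String vertex : parallelism.keySet()) {
            inDegree.put(vertex, 0);
        }
        for (List<String> targets : downstream.values()) {
            for (String target : targets) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.poll();
            order.add(vertex);
            for (String target : downstream.get(vertex)) {
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        if (order.size() != parallelism.size()) {
            throw new IllegalStateException("the data flow contains a cycle");
        }
        return order;
    }

    /** Example graph with made-up vertex names and parallelism values. */
    static JobGraphSketch wordCount() {
        return new JobGraphSketch()
                .addVertex("source", 2)
                .addVertex("tokenize", 4)
                .addVertex("count", 4)
                .addVertex("sink", 1)
                .connect("source", "tokenize")
                .connect("tokenize", "count")
                .connect("count", "sink");
    }

    public static void main(String[] args) {
        JobGraphSketch graph = wordCount();
        System.out.println(graph.topologicalOrder()); // prints [source, tokenize, count, sink]
        System.out.println(graph.totalSubtasks());    // prints 11
    }
}
```

The point of the layering is that the runtime only needs to understand a structure of this kind. Whatever API or library a program was written in, it arrives at the runtime as vertices, edges, and parallelism settings.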


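Stack (figure; layers listed from top to bottom)

  • Libraries: Gelly (graph API), Flink ML, Table

  • APIs: DataSet API (Java/Scala), DataStream API (Java/Scala)

  • Core: Flink Runtime

  • Deployment: Local, Remote, Embedded, YARN, Tez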

Projects and Dependencies

The Flink system code is divided into multiple sub-projects. The goal is to reduce the number of dependencies that a project implementing a Flink program needs, and to make smaller sub-modules easier to test.

The individual projects and their dependencies are shown in the figure below.

The Flink sub-projects and their dependencies

In addition to the projects listed in the figure above, Flink currently contains the following sub-projects:

  • flink-dist: The distribution project. It defines how to assemble the compiled code, scripts, and other resources into the final folder structure that is ready to use.

  • flink-staging: A series of projects that are at an early stage. It currently contains, among other things, projects for YARN support, JDBC data sources and sinks, Hadoop compatibility, graph-specific operators, and HBase connectors.

  • flink-quickstart: Scripts, Maven archetypes, and example programs for the quickstarts and tutorials.

  • flink-contrib: Useful tools contributed by users. The code is maintained mainly by external contributors, and the requirements for accepting code into flink-contrib are lower than for the rest of the code base.

