Apache Flink 1.1.3 Installation and Configuration

Just follow the official documentation; this local setup is very easy.

https://ci.apache.org/projects/flink/flink-docs-release-1.1/quickstart/setup_quickstart.html


Important: Maven artifacts which depend on Scala are now suffixed with the Scala major version, e.g. "2.10" or "2.11" (for example, flink-streaming-scala_2.11 instead of flink-streaming-scala). Please consult the migration guide on the project Wiki.

Quickstart: Setup

Get a Flink example program up and running in a few simple steps.

Setup: Download and Start

Flink runs on Linux, Mac OS X, and Windows. To be able to run Flink, the only requirement is to have a working Java 7.x (or higher) installation. Windows users, please take a look at the Flink on Windows guide which describes how to run Flink on Windows for local setups.
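Before downloading anything, you can verify the Java requirement from a shell (a quick sanity check; Flink 1.1.x needs Java 7 or newer):

$ java -version   # should report version 1.7 or higher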

Download

Download a binary from the downloads page. You can pick any Hadoop/Scala combination you like.
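For example, downloading from the command line might look like this (the exact file name below is an illustrative assumption; pick the Hadoop/Scala combination you actually want from the downloads page):

$ wget http://archive.apache.org/dist/flink/flink-1.1.3/flink-1.1.3-bin-hadoop27-scala_2.11.tgz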

  1. Go to the download directory.
  2. Unpack the downloaded archive.
  3. Start Flink.
$ cd ~/Downloads        # Go to download directory
$ tar xzf flink-*.tgz   # Unpack the downloaded archive
$ cd flink-1.1.3
$ bin/start-local.sh    # Start Flink

Check the JobManager’s web frontend at http://localhost:8081 and make sure everything is up and running. The web frontend should report a single available TaskManager instance.
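If you prefer the command line, a quick way to confirm the frontend is reachable (assuming curl is installed):

$ curl http://localhost:8081   # should return the dashboard HTML once the JobManager is up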

[Screenshot: JobManager web frontend overview]

Run Example

Now, we are going to run the SocketWindowWordCount example, which reads text from a socket and counts the number of distinct words.

  • First of all, we use netcat to start a local server via the command below (some traditional netcat builds expect nc -l -p 9000 instead):

    $ nc -l 9000
  • Submit the Flink program:

    $ bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
    
    03/08/2016 17:21:56 Job execution switched to status RUNNING.
    03/08/2016 17:21:56 Source: Socket Stream -> Flat Map(1/1) switched to SCHEDULED
    03/08/2016 17:21:56 Source: Socket Stream -> Flat Map(1/1) switched to DEPLOYING
    03/08/2016 17:21:56 Keyed Aggregation -> Sink: Unnamed(1/1) switched to SCHEDULED
    03/08/2016 17:21:56 Keyed Aggregation -> Sink: Unnamed(1/1) switched to DEPLOYING
    03/08/2016 17:21:56 Source: Socket Stream -> Flat Map(1/1) switched to RUNNING
    03/08/2016 17:21:56 Keyed Aggregation -> Sink: Unnamed(1/1) switched to RUNNING

    The program connects to the socket and waits for input. You can check the web interface to verify that the job is running as expected (a CLI alternative is sketched after this list):

    [Screenshot: JobManager web frontend overview with the running job]
    [Screenshot: JobManager running jobs view]
  • Counts are printed to stdout. Monitor the JobManager’s output file and write some text in nc:

    $ nc -l 9000
    lorem ipsum
    ipsum ipsum ipsum
    bye

    The .out file will print the counts immediately:

    $ tail -f log/flink-*-jobmanager-*.out
    (lorem,1)
    (ipsum,1)
    (ipsum,2)
    (ipsum,3)
    (ipsum,4)
    (bye,1)

    To stop Flink when you’re done, type:

    $ bin/stop-local.sh
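As mentioned above, the bundled CLI offers an alternative to the web interface for inspecting jobs; for example, before shutting down you can list and cancel them:

$ bin/flink list             # show scheduled and running jobs with their IDs
$ bin/flink cancel <jobID>   # cancel a running job by its ID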


Next Steps

Check out the step-by-step example in order to get a first feel of Flink’s programming APIs. When you are done with that, go ahead and read the streaming guide.

Cluster Setup

Running Flink on a cluster is as easy as running it locally. Having passwordless SSH and the same directory structure on all your cluster nodes lets you use our scripts to control everything.

  1. Copy the unpacked flink directory from the downloaded archive to the same file system path on each node of your setup.
  2. Choose a master node (JobManager) and set the jobmanager.rpc.address key in conf/flink-conf.yaml to its IP or hostname. Make sure that all nodes in your cluster have the same jobmanager.rpc.address configured.
  3. Add the IPs or hostnames (one per line) of all worker nodes (TaskManager) to the slaves file in conf/slaves.

You can now start the cluster at your master node with bin/start-cluster.sh.
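A minimal sketch of the distribution and startup steps, assuming passwordless SSH, the hypothetical worker hosts worker1 and worker2, and /path/to/flink as the install path on every node:

$ for host in worker1 worker2; do
>   scp -r /path/to/flink ${host}:/path/to/   # same path on every node
> done
$ bin/start-cluster.sh                        # run this on the master node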

The following example illustrates the setup with three nodes (with IP addresses from 10.0.0.1 to 10.0.0.3 and hostnames master, worker1, worker2) and shows the contents of the configuration files, which need to be accessible at the same path on all machines:

/path/to/flink/conf/flink-conf.yaml:

jobmanager.rpc.address: 10.0.0.1

/path/to/flink/conf/slaves:

10.0.0.2
10.0.0.3

Have a look at the Configuration section of the documentation to see other available configuration options. For Flink to run efficiently, a few configuration values need to be set.

In particular,

  • the amount of available memory per TaskManager (taskmanager.heap.mb),
  • the number of available CPUs per machine (taskmanager.numberOfTaskSlots),
  • the total number of CPUs in the cluster (parallelism.default) and
  • the temporary directories (taskmanager.tmp.dirs)

are very important configuration values.
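
A sketch of how these keys might look in conf/flink-conf.yaml; the values below are illustrative assumptions that you should size to your hardware:

jobmanager.rpc.address: 10.0.0.1
taskmanager.heap.mb: 4096          # e.g. 4 GB of JVM heap per TaskManager
taskmanager.numberOfTaskSlots: 4   # e.g. one slot per CPU core on each machine
parallelism.default: 8             # e.g. the total number of task slots in the cluster
taskmanager.tmp.dirs: /tmp/flink   # e.g. a comma-separated list of local temp directories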

Flink on YARN

You can easily deploy Flink on your existing YARN cluster.

  1. Download the Flink Hadoop2 package (the “Flink with Hadoop 2” binary on the downloads page).
  2. Make sure your HADOOP_HOME (or YARN_CONF_DIR or HADOOP_CONF_DIR) environment variable is set to read your YARN and HDFS configuration.
  3. Run the YARN client with: ./bin/yarn-session.sh. You can run the client with options -n 10 -tm 8192 to allocate 10 TaskManagers with 8GB of memory each.
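
Putting steps 2 and 3 together, a minimal sketch of starting a YARN session (the HADOOP_CONF_DIR path is an example; point it at your cluster's configuration directory):

$ export HADOOP_CONF_DIR=/etc/hadoop/conf   # example path; adjust for your cluster
$ ./bin/yarn-session.sh -n 10 -tm 8192      # 10 TaskManagers with 8 GB of memory each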
