1. Download the binary package from the official Flink website and extract it:
[hadoop@master install]$ tar -zxvf flink-1.7.2-bin-hadoop27-scala_2.11.tgz -C /app/
2. Start a local Flink cluster:
[hadoop@master bin]$ ./start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host master.
Starting taskexecutor daemon on host master.
Check the JobManager web UI at http://localhost:8081 to make sure all components are up and running. The page should show one connected TaskManager instance.
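Besides the web UI, the JobManager also serves a REST API on the same port. The sketch below parses a sample of what a GET to a taskmanagers endpoint might return and counts registered TaskManagers; the endpoint path and response fields are assumptions based on the Flink 1.7 REST API, and the sample payload is illustrative, not captured output.

```python
import json

# Hypothetical sample of a response from GET http://localhost:8081/taskmanagers
# (field names are an assumption based on the Flink 1.7 REST API).
sample_response = json.loads("""
{
  "taskmanagers": [
    {"id": "fa8eabc7916040a2a9705f3130a0cf1b", "slotsNumber": 1, "freeSlots": 1}
  ]
}
""")

def taskmanager_count(response: dict) -> int:
    # Each entry in "taskmanagers" is one registered TaskManager.
    return len(response.get("taskmanagers", []))

print(taskmanager_count(sample_response))
```

For a fresh local cluster started with start-cluster.sh you would expect a count of 1, matching the single TaskManager shown in the web UI.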
3. You can also verify that the system is running by checking the log files in the log directory:
[hadoop@master flink-1.7.2]$ tail log/flink-*-standalonesession-*.log
2019-06-25 09:47:07,179 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
2019-06-25 09:47:07,935 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher .
2019-06-25 09:47:08,590 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - ResourceManager akka.tcp://flink@localhost:6123/user/resourcemanager was granted leadership with fencing token 00000000000000000000000000000000
2019-06-25 09:47:09,331 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Starting the SlotManager.
2019-06-25 09:47:09,682 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher akka.tcp://flink@localhost:6123/user/dispatcher was granted leadership with fencing token 00000000-0000-0000-0000-000000000000
2019-06-25 09:47:09,776 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Recovering all persisted jobs.
2019-06-25 09:47:29,093 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registering TaskManager with ResourceID fa8eabc7916040a2a9705f3130a0cf1b (akka.tcp://flink@master:49783/user/taskmanager_0) at ResourceManager
4. This example reads text from a socket and prints the count of each word every 5 seconds.
First, start a local server with the netcat command:
[hadoop@master flink-1.7.2]$ nc -l 9000
Submit the Flink program:
[hadoop@master flink-1.7.2]$ ./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
Starting execution of program
The program connects to the socket and waits for input. You can check the web interface to verify that the job is running as expected:
Word counts are accumulated over 5-second time windows (using processing time and tumbling windows) and printed to stdout. Monitor the TaskManager's output file and type some text into nc (each line you finish with Enter is sent to Flink as one line of input):
[hadoop@master ~]$ nc -l 9000
spark flink
hadoop flink
java flink
python flink
The .out file will print the word counts for each time window:
[hadoop@master flink-1.7.2]$ tail -f log/flink-*-taskexecutor-*.out
spark : 1
flink : 1
hadoop : 1
flink : 1
java : 1
flink : 1
python : 1
flink : 1
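The per-window counting shown above can be sketched in plain Python. This is a hypothetical illustration of the tumbling-window logic only, not the Flink API: each (arrival-time, word) event is assigned to a fixed 5-second bucket by its processing time, and counts are kept per bucket.

```python
from collections import Counter, defaultdict

def tumbling_window_counts(events, window_size=5):
    """Assign (timestamp, word) events to fixed, non-overlapping windows of
    `window_size` seconds and count words per window. Illustrative only."""
    windows = defaultdict(Counter)
    for ts, word in events:
        # A tumbling window covers [start, start + window_size); the start
        # is found by rounding the timestamp down to a window boundary.
        window_start = ts - (ts % window_size)
        windows[window_start][word] += 1
    return dict(windows)

# Words typed into nc, tagged with hypothetical arrival times in seconds:
events = [(0, "spark"), (1, "flink"), (6, "hadoop"), (7, "flink")]
for start, counts in sorted(tumbling_window_counts(events).items()):
    for word, n in sorted(counts.items()):
        print(f"{word} : {n}")
```

Because the windows tumble rather than slide, "flink" appears once in the [0, 5) window and again in the [5, 10) window, which is why the .out file above repeats words across windows instead of accumulating a running total.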
Stop Flink with the following command:
./bin/stop-cluster.sh