Environment Setup
Flink installation and deployment
- Prerequisites
  - HDFS is running normally (passwordless SSH configured)
  - JDK 1.8+
- Upload and extract Flink
[root@CentOS ~]# tar -zxf flink-1.8.1-bin-scala_2.11.tgz -C /usr/
- Edit the configuration files (flink-conf.yaml and slaves)
[root@CentOS ~]# cd /usr/flink-1.8.1/
[root@CentOS flink-1.8.1]# vi conf/flink-conf.yaml
jobmanager.rpc.address: CentOS
taskmanager.numberOfTaskSlots: 4
[root@CentOS flink-1.8.1]# vi conf/slaves
CentOS
- Start the Flink services
[root@CentOS flink-1.8.1]# ./bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host CentOS.
Starting taskexecutor daemon on host CentOS.
[root@CentOS flink-1.8.1]# jps
4721 SecondaryNameNode
4420 DataNode
36311 TaskManagerRunner
35850 StandaloneSessionClusterEntrypoint
2730 QuorumPeerMain
3963 Kafka
36350 Jps
4287 NameNode
Open http://centos:8081/#/overview to view the Flink web UI.
Dependencies
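The web UI check can also be scripted. The following is a minimal sketch, assuming the JobManager serves its REST endpoint `/overview` on the same host and port as the web UI (8081 here). The object name `ClusterOverview` and the helper `overviewUrl` are hypothetical names chosen for illustration.

```scala
import scala.io.Source

// Hypothetical helper, not part of Flink: queries the JobManager REST API.
object ClusterOverview {
  // Builds the URL of the cluster overview endpoint; 8081 is the web UI port.
  def overviewUrl(host: String, port: Int = 8081): String =
    s"http://$host:$port/overview"

  def main(args: Array[String]): Unit = {
    // Assumes the host name "centos" resolves and the cluster is running.
    val source = Source.fromURL(overviewUrl("centos"))
    try println(source.mkString) // raw JSON describing the cluster
    finally source.close()
  }
}
```

With a single TaskManager and `taskmanager.numberOfTaskSlots: 4`, the returned JSON should describe one TaskManager and four task slots.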
<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-core</artifactId>
        <version>1.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>1.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-scala_2.11</artifactId>
        <version>1.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-scala_2.11</artifactId>
        <version>1.8.1</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>4.0.1</version>
            <executions>
                <execution>
                    <id>scala-compile-first</id>
                    <phase>process-resources</phase>
                    <goals>
                        <goal>add-source</goal>
                        <goal>compile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
</project>
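Scala code
package com

import org.apache.flink.streaming.api.scala._

// Record type whose field names are referenced by keyBy("word") and sum("count")
case class WordCount(word: String, count: Int)

object FlinkStreamWordCount {
  def main(args: Array[String]): Unit = {
    // 1. Create the StreamExecutionEnvironment
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // 2. Set the source: read text lines from a socket (host "Flink", port 9999)
    val lines: DataStream[String] = env.socketTextStream("Flink", 9999)
    // 3. Apply the usual transformations to lines:
    //    split into words, pair each word with 1, group by word, keep a running sum
    lines.flatMap(_.split("\\s+"))
      .map(WordCount(_, 1))
      .keyBy("word")
      .sum("count")
      .print()
    // 4. Execute the job
    env.execute("wordcount")
  }
}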
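To make the effect of the keyed sum easier to see, here is a minimal sketch in plain Scala with no Flink dependency. It imitates what the pipeline above computes: every incoming word produces one updated (word, count) record. The names `WordCountSketch`, `WordCount`, and `runningCounts` are hypothetical, and the sketch ignores parallelism and the subtask prefix that `print()` may add to each line.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the streaming job, operating on an in-memory Seq.
object WordCountSketch {
  final case class WordCount(word: String, count: Int)

  // Emits one updated record per incoming word, like a keyed rolling sum.
  def runningCounts(lines: Seq[String]): Seq[WordCount] = {
    // Running total per word; unseen words start at 0.
    val totals = mutable.Map.empty[String, Int].withDefaultValue(0)
    for {
      line <- lines
      word <- line.split("\\s+").toSeq
    } yield {
      totals(word) += 1
      WordCount(word, totals(word))
    }
  }

  def main(args: Array[String]): Unit =
    runningCounts(Seq("hello flink", "hello world")).foreach(println)
}
```

For the two sample lines, the sketch prints `WordCount(hello,1)`, `WordCount(flink,1)`, `WordCount(hello,2)`, and `WordCount(world,1)`, one per line: a repeated word yields a new record with the incremented count rather than replacing the earlier one.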