Getting Started with Flink: WordCount in Detail (Notes)
pom file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>cn.wgy</groupId>
    <artifactId>FlinkDome</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_2.12</artifactId>
            <version>1.10.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.12</artifactId>
            <version>1.10.1</version>
        </dependency>
    </dependencies>
</project>
1. Batch processing code:
package com.wgy.wordcount

import org.apache.flink.api.scala.{AggregateDataSet, DataSet, ExecutionEnvironment}
import org.apache.flink.api.scala._

object WordCount {
  def main(args: Array[String]): Unit = {
    // Create the batch execution environment
    val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
    // Read the input text file line by line
    val inputPath = "D:\\scala_spark\\FlinkDome\\src\\main\\resources\\words"
    val inputDataSet: DataSet[String] = env.readTextFile(inputPath)
    // Split lines into words, map each word to (word, 1), group by word, sum the counts
    val resultDataSet: AggregateDataSet[(String, Int)] = inputDataSet
      .flatMap(_.split(" "))
      .map((_, 1))
      .groupBy(0)
      .sum(1)
    // In the batch API, print() triggers execution of the job
    resultDataSet.print()
  }
}
Output:
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
(scala,2)
(flink,2)
(hello,8)
(java,2)
(word,2)
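To make the flatMap / map / groupBy(0) / sum(1) chain easier to follow, here is a minimal sketch of the same steps on an ordinary in-memory Scala collection, with no Flink dependency. The object name LocalWordCount and the sample lines are assumptions made for illustration; they are not the contents of the original words file, and this is not how Flink executes the job internally.

```scala
// Illustration only: plain Scala collections, no Flink involved.
object LocalWordCount {
  // Mirrors flatMap -> map -> groupBy(0) -> sum(1) on an in-memory Seq.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // split each line into words
      .map((_, 1))             // pair each word with an initial count of 1
      .groupBy(_._1)           // group the pairs by the word (field 0)
      .map { case (word, pairs) => word -> pairs.map(_._2).sum } // sum field 1

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, chosen only for this sketch.
    val sample = Seq("hello flink", "hello scala")
    count(sample).foreach(println)
  }
}
```

The result is one (word, count) pair per distinct word, which is the shape of the batch output above. As in the Flink result, the order of the pairs is not guaranteed.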
2. Stream processing code:
# On Linux
yum install -y nc   # install the netcat (nc) tool
nc -lk 7777         # listen on port 7777 and keep the connection open
package com.wgy.wordcount

import org.apache.flink.streaming.api.scala.{DataStream, StreamExecutionEnvironment}
import org.apache.flink.streaming.api.scala._

object StreamWordCount {
  def main(args: Array[String]): Unit = {
    // Create the streaming execution environment
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    // Read lines from the socket opened with nc on hadoop101:7777
    val inputStream: DataStream[String] = env.socketTextStream("hadoop101", 7777)
    // Split lines into words, map each word to (word, 1), key by word, keep a running sum
    val resultStream: DataStream[(String, Int)] = inputStream
      .flatMap(_.split(" "))
      .map((_, 1))
      .keyBy(0)
      .sum(1)
    resultStream.print()
    // A streaming job only starts running when execute() is called
    env.execute("stream word count")
  }
}
Output:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
3> (hello,1)
7> (flink,1)
3> (hello,2)
7> (flink,2)
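Unlike the batch job, the streaming job emits an updated count every time a word arrives, which is why hello shows up first as 1 and then as 2. The number before ">" is the index of the parallel subtask that printed the record. Because keyBy routes equal keys to the same subtask, a given word always appears under the same index. The sketch below imitates the running count with plain Scala and no Flink dependency. The name RunningWordCount and the input "hello flink hello flink" are assumptions chosen to match the output above, and the subtask prefix is not modeled.

```scala
import scala.collection.mutable

// Illustration only: imitates a keyed running sum without Flink.
object RunningWordCount {
  // Returns every intermediate (word, count) in arrival order,
  // one update per incoming word.
  def updates(words: Seq[String]): Seq[(String, Int)] = {
    val state = mutable.Map.empty[String, Int] // per-key count kept so far
    words.map { word =>
      val newCount = state.getOrElse(word, 0) + 1
      state(word) = newCount
      (word, newCount)
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical input, as if "hello flink" were typed twice into nc.
    val words = "hello flink hello flink".split(" ").toSeq
    updates(words).foreach(println)
  }
}
```

Each element of the returned sequence corresponds to one printed line of the streaming job: the count for a word is emitted again, incremented, each time that word is seen.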