Everything starts with Hello World, and in big data everything starts with WordCount.
Straight to the code:
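package com.yeahmobi.test.stream

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.api.scala._

object WordCount {
  def main(args: Array[String]): Unit = {
    // Start netcat before running (Win+R on Windows, or a Linux shell): nc -l -p 9999
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Read lines of text from the socket that netcat is listening on
    val textStream = env.socketTextStream("localhost", 9999)
    // Lowercase each line and split it into words on non-word characters
    val lowerData = textStream.flatMap(line => line.toLowerCase.split("\\W+"))
    // Drop the empty tokens that split can produce
    val nonEmpty_data = lowerData.filter(line => line.nonEmpty)
    // Pair every word with an initial count of 1
    val mapData = nonEmpty_data.map(line => (line, 1))
    // Group by the specified key: tuple field 0, i.e. the word.
    // Newer Flink versions deprecate index-based keys in favor of keyBy(_._1).
    val keybyData = mapData.keyBy(0)
    // Sum tuple field 1 (the count) per word
    val sum = keybyData.sum(1)
    sum.print()
    env.execute("start streaming window wordCount")
  }
}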
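To make the transformation chain concrete, here is a minimal sketch in plain Scala collections (no Flink dependency) that mimics what the job computes. The names `WordCountSketch`, `tokenize`, and `runningCounts` are hypothetical helpers for illustration and are not part of Flink. The sketch relies on the fact that `sum(1)` on a keyed stream is a rolling aggregation, so an updated total is emitted for every incoming word rather than one final count.

```scala
object WordCountSketch {
  // Same tokenization as the job: lowercase, split on non-word characters, drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  // Mimics keyBy(word) + sum(1): emit (word, running total) for each incoming word.
  def runningCounts(lines: Seq[String]): Seq[(String, Int)] = {
    val totals = scala.collection.mutable.Map.empty[String, Int]
    lines.flatMap(tokenize).map { word =>
      val updated = totals.getOrElse(word, 0) + 1
      totals(word) = updated
      (word, updated)
    }
  }

  def main(args: Array[String]): Unit =
    // Prints (hello,1), (flink,1), (hello,2), (world,1), one per line.
    runningCounts(Seq("Hello Flink", "hello world")).foreach(println)
}
```

In the real job, each printed record may additionally carry a subtask prefix such as `3>` when the job runs with parallelism greater than 1, and records for different words may interleave in a different order.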
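How to run:
1. Press Win+R and run nc -l -p 9999 to start netcat listening.
2. Start the program.
3. Type a line of text in the nc window.
4. Check the result in the IDEA console.
Windows does not ship with nc by default. If it is not installed, please refer to my blog.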
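If netcat is not available, a small stand-in can play the same role for local testing. The following is only a sketch under stated assumptions: `LineServer` and `serve` are hypothetical names, the server handles a single client, and it simply forwards each line typed on standard input to whoever connects on port 9999. It is not a replacement for nc in general.

```scala
import java.io.PrintWriter
import java.net.ServerSocket
import scala.io.StdIn

object LineServer {
  // Accept one client on the given server socket and send it each line, terminated by "\n".
  def serve(server: ServerSocket, lines: Iterator[String]): Unit = {
    val client = server.accept() // blocks until a client (e.g. the Flink job) connects
    val out = new PrintWriter(client.getOutputStream)
    try {
      lines.foreach { line =>
        out.print(line + "\n")
        out.flush() // send each line as soon as it is typed
      }
    } finally {
      out.close()
      client.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(9999) // same port the job connects to
    try {
      // Forward standard input line by line until end of input.
      serve(server, Iterator.continually(StdIn.readLine()).takeWhile(_ != null))
    } finally {
      server.close()
    }
  }
}
```

As with nc, start this listener before the Flink program, then type lines into its console.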