Flink transform methods

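1. Simple transformation operators (map, flatMap, filter, and the like) are available on both DataStream and KeyedStream. A plain DataStream has no aggregation operators; only a KeyedStream obtained through keyBy offers the rolling aggregation operators sum(), min(), max(), minBy(), maxBy(), and reduce().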

2. The rolling aggregation operators sum(), min(), max(), minBy(), and maxBy() are available only after keyBy.

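min compares each incoming record with the result so far and emits the current minimum. The output is still a SensorReading. With min("temperature"), the temperature field holds the smallest temperature seen so far, while the other fields come from the first record received for that key.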

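minBy also compares each incoming record with the result so far and emits the current minimum, again as a SensorReading. With minBy("temperature"), the temperature field holds the smallest temperature seen so far, and the other fields come from the record that actually holds that minimum.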
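
The difference between min and minBy can be sketched in plain Scala, without any Flink dependency. This is only a simulation of the behavior described above for a single key; the `SensorReading` fields are inferred from the listing further down, and `MinVsMinBy`, `minStep`, `minByStep`, `rolling`, and the sample readings are made-up names and values for illustration.

```scala
// Plain-Scala simulation of rolling min vs. minBy for one key (not Flink code).
// SensorReading's fields are assumed from how the listing below uses them.
case class SensorReading(id: String, timestamp: Long, temperature: Double)

object MinVsMinBy {
  // min-like step: keep the first record and overwrite only the temperature field.
  def minStep(acc: SensorReading, next: SensorReading): SensorReading =
    if (next.temperature < acc.temperature) acc.copy(temperature = next.temperature) else acc

  // minBy-like step: keep the whole record that holds the smallest temperature.
  def minByStep(acc: SensorReading, next: SensorReading): SensorReading =
    if (next.temperature < acc.temperature) next else acc

  // Emit one result per input record, as a rolling aggregation does.
  // Assumes a non-empty sequence.
  def rolling(records: Seq[SensorReading])(
      step: (SensorReading, SensorReading) => SensorReading): Seq[SensorReading] =
    records.tail.scanLeft(records.head)(step)

  // Hypothetical sample readings for a single sensor.
  val sample: Seq[SensorReading] = Seq(
    SensorReading("sensor_1", 100L, 35.0),
    SensorReading("sensor_1", 200L, 30.0),
    SensorReading("sensor_1", 300L, 32.0)
  )
}
```

With the sample above, the min-like run ends with timestamp 100 (taken from the first record) next to temperature 30.0, while the minBy-like run ends with timestamp 200, the record that really had 30.0.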

sum adds the field of each incoming record to the running total and emits the updated total.
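
A rolling sum can be simulated the same way in plain Scala. This sketch assumes that the fields other than the summed one stay as they were in the first record of the key; `RollingSum`, `sumStep`, and the sample values are hypothetical names for illustration, not part of the Flink API.

```scala
// Plain-Scala simulation of a rolling sum over temperature for one key (not Flink code).
case class SensorReading(id: String, timestamp: Long, temperature: Double)

object RollingSum {
  // Add the new temperature to the running total.
  // Assumption: the remaining fields are carried over from the first record.
  def sumStep(acc: SensorReading, next: SensorReading): SensorReading =
    acc.copy(temperature = acc.temperature + next.temperature)

  // One output per input record: the total so far. Assumes a non-empty sequence.
  def rolling(records: Seq[SensorReading]): Seq[SensorReading] =
    records.tail.scanLeft(records.head)(sumStep)

  // Hypothetical sample readings for a single sensor.
  val sample: Seq[SensorReading] = Seq(
    SensorReading("sensor_1", 100L, 35.0),
    SensorReading("sensor_1", 200L, 30.0),
    SensorReading("sensor_1", 300L, 32.0)
  )
}
```

For the sample readings the emitted temperatures are 35.0, 65.0, and 97.0, one per incoming record.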

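3. The reduce method. The requirement here is to get the minimum temperature together with the maximum timestamp. The first argument of the lambda passed to reduce is the previously aggregated result and the second is the newest record; the operator is stateful.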
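
The reduce requirement can be sketched in plain Scala as a fold that emits every intermediate result. The listing below takes the timestamp of the newest record, which equals the maximum only when records arrive in timestamp order; this sketch uses `math.max` instead so that it also holds for out-of-order input. `ReduceSketch`, `combine`, and the sample data are hypothetical names for illustration.

```scala
// Plain-Scala simulation of a stateful reduce for one key (not Flink code).
case class SensorReading(id: String, timestamp: Long, temperature: Double)

object ReduceSketch {
  // acc is the previously aggregated result, next is the newest record.
  def combine(acc: SensorReading, next: SensorReading): SensorReading =
    SensorReading(
      acc.id,
      math.max(acc.timestamp, next.timestamp),     // largest timestamp so far
      math.min(acc.temperature, next.temperature)  // smallest temperature so far
    )

  // One output per input record. Assumes a non-empty sequence.
  def rolling(records: Seq[SensorReading]): Seq[SensorReading] =
    records.tail.scanLeft(records.head)(combine)

  // Hypothetical readings that arrive out of timestamp order.
  val sample: Seq[SensorReading] = Seq(
    SensorReading("sensor_1", 100L, 35.0),
    SensorReading("sensor_1", 300L, 30.0),
    SensorReading("sensor_1", 200L, 32.0)
  )
}
```

The intermediate `acc` value plays the role of the per-key state that the operator keeps between records.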

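4. The map method of ConnectedStreams is a CoMap: it processes each of the two connected streams with its own function. The two streams passed to connect may have different element types.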
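
The CoMap idea can be sketched in plain Scala by modelling an element of the connected stream as an `Either`: a `Left` comes from the first stream, a `Right` from the second, and each side has its own mapping function. This is only an illustration of the concept, not how Flink implements it; `CoMapSketch`, `Connected`, `route`, and `coMap` are hypothetical names, while `map1` and `map2` mirror `MyCoMapFunction` from the listing below.

```scala
// Plain-Scala simulation of connect + CoMap (not Flink code).
case class SensorReading(id: String, timestamp: Long, temperature: Double)

object CoMapSketch {
  // An element of the "connected" stream comes from one of two differently typed inputs.
  type Connected = Either[(String, Double), SensorReading]

  // Handles elements of the first stream (high-temperature warnings).
  def map1(in: (String, Double), info: String): (String, Double, String) =
    (in._1, in._2, info)

  // Handles elements of the second stream (normal readings).
  def map2(in: SensorReading): (String, String) =
    (in.id, "healthy")

  // Dispatch each element to the function for its side; both results share the type Product.
  def route(element: Connected, info: String): Product = element match {
    case Left(warning)  => map1(warning, info)
    case Right(reading) => map2(reading)
  }

  def coMap(elements: Seq[Connected], info: String): Seq[Product] =
    elements.map(route(_, info))
}
```

For example, `CoMapSketch.coMap(Seq(Left(("sensor_1", 35.0)), Right(SensorReading("sensor_2", 200L, 20.0))), "warn you")` yields a three-field tuple for the warning and a two-field tuple for the normal reading.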

package flinkSourse

import org.apache.flink.api.common.functions.{MapFunction, ReduceFunction, RichMapFunction}
import org.apache.flink.api.java.tuple.Tuple
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.CoMapFunction
import org.apache.flink.streaming.api.scala.{ConnectedStreams, _}


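// Simple transformation operators (map, flatMap, filter, and the like) are available on both DataStream and KeyedStream;
// a plain DataStream has no aggregation operators, only a KeyedStream does.
// Rolling aggregation operators available after keyBy: sum(), min(), max(), minBy(), maxBy(), reduce()
// Multi-stream transformations: split, select, connect, CoMap, union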
object FlinkTransform {
  def main(args: Array[String]): Unit = {
    val executionEnvironment: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    executionEnvironment.setParallelism(1)
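    // Bounded stream: env.readTextFile
    val stream2: DataStream[String] = executionEnvironment.readTextFile("src/main/resources/sensorReading.txt")
    // Only the KeyedStream produced by keyBy supports aggregations such as min, max, and sum
    // 1. Simple transformation: flatMap
    //    stream2.flatMap(data => data.split(",")).print()
    // 2. Rolling aggregation operators sum(), min(), max(), minBy(), maxBy() are available only after keyBy
    // min compares each incoming record with the result so far and emits the minimum, still as a SensorReading;
    // with min("temperature") the temperature is the smallest seen so far and the other fields come from the first record of the key
    // minBy does the same comparison, but with minBy("temperature") the other fields come from the record that holds the minimum
    // sum adds each incoming record to the running total and emits the updated total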
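    // SensorReading is assumed to be a case class (id: String, timestamp: Long, temperature: Double) defined elsewhere in this package
    val transforStream: DataStream[SensorReading] = stream2.map(data => {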
      val tmpList: Array[String] = data.split(",")
      SensorReading(tmpList(0), tmpList(1).toLong, tmpList(2).toDouble)
    })
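    val keyedStream: KeyedStream[SensorReading, Tuple] = transforStream.keyBy("id")
    //    keyedStream.minBy("temperature").print()
    // 3. reduce: get the minimum temperature together with the maximum timestamp
    // The first argument of the lambda passed to reduce is the previously aggregated result, the second is the newest record
    // Option 1: a lambda expression
    //    keyedStream.reduce((valueState,newData)=>{
    //      SensorReading(valueState.id,newData.timestamp,valueState.temperature.min(newData.temperature))
    //    }).print()
    // Option 2: a custom class; in Scala a class is used to implement the Java interface rather than to inherit from a parent class
    //    keyedStream.reduce(new MyReduceFunction()).print()
    //    aggregate cannot be called on the KeyedStream directly, because it is declared as private def aggregate

    // 4.1 Splitting: divide the sensor stream into two streams by temperature. split tags each record, select picks records by tag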
    val splitStream: SplitStream[SensorReading] = transforStream.split((data) => {
      if (data.temperature >= 32) Seq("high") else Seq("low")
    })
    val highStream: DataStream[SensorReading] = splitStream.select("high")
    val lowStream: DataStream[SensorReading] = splitStream.select("low")
    val allStream: DataStream[SensorReading] = splitStream.select("high", "low")
    // 4.2 The map method of ConnectedStreams is a CoMap: it processes each of the two connected streams with its own function.
    // The two streams passed to connect may have different element types
    val warningStream: DataStream[(String, Double)] = highStream.map(data => (data.id, data.temperature))
    val connectedStreams: ConnectedStreams[(String, Double), SensorReading] = warningStream.connect(lowStream)
    val comapDataStream: DataStream[Product] = connectedStreams.map(new MyCoMapFunction("warn you"))
    //    comapDataStream.print("comapDataStream")
    // 4.3 union: the streams must have the same element type
    val unionStream: DataStream[SensorReading] = highStream.union(lowStream)
    unionStream.print("union")


    executionEnvironment.execute("transform")
  }
}

class MyReduceFunction extends ReduceFunction[SensorReading] {
  override def reduce(t: SensorReading, t1: SensorReading): SensorReading = {
    SensorReading(t.id, t1.timestamp, t.temperature.min(t1.temperature))
  }
}


// Constructor parameters can be passed in, or omitted
class MyCoMapFunction(val info: String) extends CoMapFunction[(String, Double), SensorReading, Product] {
  override def map1(in1: (String, Double)): Tuple3[String,Double,String] = {
    (in1._1, in1._2, info)
  }

  override def map2(in2: SensorReading): Tuple2[String,String] = {
    (in2.id, "healthy")
  }
}

// A rich function adds lifecycle methods and access to the runtime context, from which state can be obtained
class MyRichMapFunction extends RichMapFunction[SensorReading, String] {
  override def open(parameters: Configuration): Unit = {
    // Initialization such as opening a database connection; runs only once
    //    val value: ListState[Nothing] = getRuntimeContext.getListState()
  }

  override def close(): Unit = {
    // Cleanup such as closing the connection
  }

  override def map(in: SensorReading): String = {
    in.id + in.temperature + in.timestamp
  }
}


class MyMapFunction extends MapFunction[SensorReading, String] {
  override def map(t: SensorReading): String = {"D"}
}
