Flink Stream Processing

I. Three Kinds of Environment

1. getExecutionEnvironment

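// Streaming environment: resolves to a local or a cluster environment depending on how the job is launched
val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
// Batch (DataSet API) environment
val env1: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment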

2. createLocalEnvironment

// Local environment; the argument is the parallelism
val env2: StreamExecutionEnvironment = StreamExecutionEnvironment.createLocalEnvironment(1)

3. createRemoteEnvironment (rarely used)

II. Four Kinds of Data Source

1. Collection source

val CollectionDS: DataStream[String] = env.fromCollection(List("hadoop","spark","flink","hive"))

2. File source

val fileDS: DataStream[String] = env.readTextFile("D:\\ideaProject\\flink-base\\test.txt")

3. Kafka source

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011

val prop = new Properties()
// Comma-separated list of Kafka brokers (host:port)
prop.setProperty("bootstrap.servers", "node01:9092")
prop.setProperty("group.id", "consumer-group")
prop.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
prop.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
prop.setProperty("auto.offset.reset", "latest")
// Read the "sensor" topic as a stream of strings
val kafkaDS: DataStream[String] = env.addSource(new FlinkKafkaConsumer011[String]("sensor", new SimpleStringSchema(), prop))

4. Custom source

class MySensorSource extends SourceFunction[String] {
  override def run(sourceContext: SourceFunction.SourceContext[String]): Unit = {
    // Produce data here and emit it with sourceContext.collect(...)
  }

  override def cancel(): Unit = {
    // Stop producing data here
  }
}
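The run/cancel contract above is usually implemented with a flag that `run` polls and `cancel` clears. The following is a minimal plain-Scala sketch of that pattern, not Flink code: `CountingSource`, its `collect` callback, and the `sensor_N` values are hypothetical stand-ins for a real `SourceFunction` and its `SourceContext`.

```scala
import scala.collection.mutable.ListBuffer

object CancellableSourceSketch {
  // Hypothetical stand-in for a source; it does NOT extend Flink's SourceFunction
  class CountingSource {
    // Polled by run() on every iteration, cleared by cancel()
    @volatile private var running = true

    // Emits "sensor_0", "sensor_1", ... through the callback until cancelled
    def run(collect: String => Unit): Unit = {
      var i = 0
      while (running) {
        collect(s"sensor_$i")
        i += 1
      }
    }

    def cancel(): Unit = running = false
  }

  def main(args: Array[String]): Unit = {
    val source = new CountingSource
    val seen = ListBuffer.empty[String]
    // Cancel from inside the callback once three elements have been collected
    source.run { s =>
      seen += s
      if (seen.size == 3) source.cancel()
    }
    println(seen.toList)
  }
}
```

In a real job the framework calls `cancel()` from another thread, which is why the flag is marked `@volatile` in this sketch.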

III. Transformation Operators

1. map

// map emits exactly one output per input: pair each line with a count of 1
fileDS.map { x => (x, 1) }.print()

2. flatMap

// flatMap emits zero or more outputs per input: one (word, 1) pair per word in the line
fileDS.flatMap { x =>
  x.split(" ").map((_, 1))
}.print()

3. filter

// Split each line into words, then keep only the words equal to "hello"
fileDS.flatMap(_.split(" ")).filter(_ == "hello").print()
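To make the difference in output shape between the three operators concrete, here is a small sketch using plain Scala collections rather than a Flink DataStream. The input lines are hypothetical and merely stand in for the contents of test.txt.

```scala
object OperatorShapes {
  // Hypothetical input lines, standing in for the contents of test.txt
  val lines: Seq[String] = Seq("hello flink", "hello kylin")

  // map: exactly one output element per input element
  val mapped: Seq[(String, Int)] = lines.map(line => (line, 1))

  // flatMap: zero or more output elements per input element
  val words: Seq[String] = lines.flatMap(_.split(" "))

  // filter: keeps only the elements that satisfy the predicate
  val hellos: Seq[String] = words.filter(_ == "hello")

  def main(args: Array[String]): Unit = {
    println(mapped)
    println(words)
    println(hellos)
  }
}
```

The DataStream operators of the same names behave analogously, element by element, on an unbounded stream.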

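4. keyBy: logically splits a stream into disjoint partitions; all elements with the same key land in the same partition.

fileDS.flatMap(_.split(" "))
  .filter(_ == "kylin")
  .map((_, 1))
  .keyBy(0)  // key by the word (tuple field 0)
  .sum(1)    // rolling sum of the count (tuple field 1)
  .print()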
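On an unbounded stream, `keyBy(0).sum(1)` does not wait for the input to end; it emits an updated running total for the key of every incoming element. The following plain-Scala sketch models that behaviour on a finite sequence. It is an illustration only: `rollingSum` is a hypothetical helper, not part of the Flink API.

```scala
import scala.collection.mutable

object RollingSumSketch {
  // Models keyBy(0).sum(1): for each incoming (word, count) pair,
  // emit the running total seen so far for that word
  def rollingSum(events: Seq[(String, Int)]): Seq[(String, Int)] = {
    val totals = mutable.Map.empty[String, Int]
    events.map { case (word, n) =>
      val updated = totals.getOrElse(word, 0) + n
      totals(word) = updated
      (word, updated)
    }
  }

  def main(args: Array[String]): Unit = {
    val words = Seq("kylin", "flink", "kylin", "kylin")
    // Prints one running total per input element
    rollingSum(words.map((_, 1))).foreach(println)
  }
}
```

Each key keeps its own total, which is why a new word starts again from 1 while a repeated word keeps counting up.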

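5. connect: after two data streams are connected, they are merely placed in the same stream. Internally each keeps its own data and type unchanged, and the two streams remain independent of each other.

// high and low are not defined in this post; they are assumed to be DataStream[SensorReading] values created earlier
val warning = high.map( sensorData => (sensorData.id, sensorData.temperature) )
val connected = warning.connect(low)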

6. CoMap, CoFlatMap: work like map and flatMap, but operate on a ConnectedStreams and apply a separate map or flatMap function to each of the two streams.

val coMap = connected.map(
    warningData => (warningData._1, warningData._2, "warning"),
    lowData => (lowData.id, "healthy")
)

7. union: combines two or more DataStreams into a new DataStream that contains all elements of all the inputs.

val fileDS: DataStream[String] = env.readTextFile("D:\\ideaProject\\flink-base\\test.txt")
val fileDS1: DataStream[String] = env.readTextFile("D:\\ideaProject\\flink-base\\test.txt")
fileDS.union(fileDS1).print()

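With union, the streams must already have the same type; with connect, the types may differ and can be brought to a common type in the subsequent coMap.

connect works on exactly two streams; union can combine more than two.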
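The two constraints above can be read directly from the type signatures. The following plain-Scala sketch models them with ordinary sequences; `union` and `connectAndCoMap` here are hypothetical helpers written for illustration, not the Flink methods. Real streams also interleave their elements in arrival order, whereas this sketch simply concatenates.

```scala
object UnionVsConnectSketch {
  // union: any number of inputs, but all must share one element type A
  def union[A](streams: Seq[A]*): Seq[A] = streams.flatten

  // connect + coMap: exactly two inputs whose types may differ;
  // one function per input brings both sides to a common output type R
  def connectAndCoMap[A, B, R](first: Seq[A], second: Seq[B])(f1: A => R, f2: B => R): Seq[R] =
    first.map(f1) ++ second.map(f2)

  def main(args: Array[String]): Unit = {
    // Three inputs of the same type
    println(union(Seq(1), Seq(2, 3), Seq(4)))

    // Two inputs of different types, unified to String
    val warnings = Seq(("sensor_1", 40.0))
    val healthy = Seq("sensor_2")
    println(connectAndCoMap(warnings, healthy)(w => s"${w._1} warning", id => s"$id healthy"))
  }
}
```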

IV. Sinks

1. Kafka sink
fileDS1.addSink(new FlinkKafkaProducer011[String]("localhost:9092", "test", new SimpleStringSchema()))

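2. Redis sink

// RedisMapper takes a single type parameter: the element type of the stream being written
class MyRedisMapper extends RedisMapper[SensorReading] {
  // Write with HSET into the Redis hash named "sensor"
  override def getCommandDescription: RedisCommandDescription = {
    new RedisCommandDescription(RedisCommand.HSET, "sensor")
  }

  // Return the hash field for this reading (for example, the sensor id)
  override def getKeyFromData(t: SensorReading): String = ???

  // Return the value to store (for example, the temperature as a string)
  override def getValueFromData(t: SensorReading): String = ???
}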

Usage in the main function:

val conf = new FlinkJedisPoolConfig.Builder().setHost("localhost").setPort(6379).build()
dataStream.addSink( new RedisSink[SensorReading](conf, new MyRedisMapper) )

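3. Custom sink (JDBC)

import java.sql.{Connection, DriverManager, PreparedStatement}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.{RichSinkFunction, SinkFunction}

class MyJdbcSink() extends RichSinkFunction[SensorReading]{
  var conn: Connection = _
  var insertStmt: PreparedStatement = _
  var updateStmt: PreparedStatement = _

  // open: create the connection and prepare the statements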
  override def open(parameters: Configuration): Unit = {
    super.open(parameters)

    conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test", "root", "123456")
    insertStmt = conn.prepareStatement("INSERT INTO temperatures (sensor, temp) VALUES (?, ?)")
    updateStmt = conn.prepareStatement("UPDATE temperatures SET temp = ? WHERE sensor = ?")
  }
  // invoke: use the connection to run the SQL; try an UPDATE first and INSERT only if no row was updated
  override def invoke(value: SensorReading, context: SinkFunction.Context[_]): Unit = {
    updateStmt.setDouble(1, value.temperature)
    updateStmt.setString(2, value.id)
    updateStmt.execute()

    if (updateStmt.getUpdateCount == 0) {
      insertStmt.setString(1, value.id)
      insertStmt.setDouble(2, value.temperature)
      insertStmt.execute()
    }
  }

  override def close(): Unit = {
    insertStmt.close()
    updateStmt.close()
    conn.close()
  }
}

 
