2021-11-09

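This post shows how to use Scala and Spark to read data from a PostgreSQL database and then send it to a Kafka topic through KafkaProducer. It covers the query, the DataFrame operations, the use of KafkaProducer, and the definition and serialization of the custom event class RelationshipPeer.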

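Problem solved: moving data from gp to Kafka, covering how the dataset is fetched and how it is transformed into the messages sent to Kafka.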

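// Assumes `spark` (SparkSession), `props` (Kafka producer settings) and `topic` are defined elsewhere.
import java.sql.Timestamp
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.json4s.DefaultFormats
// Assumption: the Jackson backend of json4s; use org.json4s.native.Serialization for the native one.
import org.json4s.jackson.Serialization

// Every sub-query covers the same time window; only the alias t_tmp_$index differs,
// so the union below contains each matching row ten times.
val sourceDF = Range(0, 10)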
  .map(index => {
    val dbtable =
      s"""(
         |select
         |aid1,
         |aid2,
         |t1 ,
         |t2 ,
         |along_interval ,
         |source_id ,
         |create_time,
         |dt ,
         |thumbnail_id1,
         |thumbnail_url1,
         |image_id1,
         |image_url1 ,
         |thumbnail_id2 ,
         |thumbnail_url2,
         |image_id2 ,
         |image_url2 ,
         |score1 ,
         |score2
         |from dwd_bigdata_relation_peer_day_5030
         |WHERE  create_time > '2020-06-27 00:00:00' AND create_time <= '2020-06-28 00:00:00') AS t_tmp_$index""".stripMargin
    println(dbtable)
    spark
      .read
      .format("jdbc")
      .option("driver", "org.postgresql.Driver")
      .option("url","jdbc:postgresql://192.168.11.33:2222/bigdata_dwd" )
      .option("dbtable", dbtable)
      .option("user", "张三")
      .option("password", "123456")
      .option("fetchsize","5000")
      .load()
  })
  // Combine the ten DataFrames into one (union keeps duplicates).
  .reduce((df1, df2) => df1.union(df2))
println("Loading peer events")
sourceDF.show()
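
The ten sub-queries above all use the same `create_time` window, so they do not divide the work between them. If the goal were to split the read into non-overlapping slices, one option is to generate one predicate per slice. The following is only a sketch under that assumption: the object `TimeSlicePredicates` and its `predicates` method are hypothetical names, not part of the original code.

```scala
import java.time.{Duration, LocalDateTime}
import java.time.format.DateTimeFormatter

// Hypothetical helper: builds non-overlapping WHERE predicates over a time window.
object TimeSlicePredicates {
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  /** Splits the half-open window (start, end] into `slices` consecutive predicates on `column`. */
  def predicates(column: String, start: LocalDateTime, end: LocalDateTime, slices: Int): Array[String] = {
    require(slices > 0, "slices must be positive")
    val totalSeconds = Duration.between(start, end).getSeconds
    (0 until slices).map { i =>
      val lo = start.plusSeconds(totalSeconds * i / slices)
      // The last slice ends exactly at `end`, so integer division never leaves a gap.
      val hi = if (i == slices - 1) end else start.plusSeconds(totalSeconds * (i + 1) / slices)
      s"$column > '${lo.format(fmt)}' AND $column <= '${hi.format(fmt)}'"
    }.toArray
  }
}
```

The resulting array could then be passed to `spark.read.jdbc(url, table, predicates, connectionProperties)`, which reads one Spark partition per predicate, with the user, password, driver, and fetchsize options moved into the `java.util.Properties` argument.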

val producer = new KafkaProducer[String, String](props)

// json4s formats used by Serialization.write; declared once, outside the loop.
implicit val formats: DefaultFormats.type = org.json4s.DefaultFormats

// collect pulls every row to the driver, so this only suits result sets that fit in driver memory.
val peerArray = sourceDF.collect
for ((row, i) <- peerArray.zipWithIndex) {
  val sourceAid = row.getAs[String]("aid1")
  val targetAid = row.getAs[String]("aid2")
  val t1 = row.getAs[String]("t1")
  val t2 = row.getAs[String]("t2")
  val source_id = row.getAs[String]("source_id") // used to look up the camera by its source_id
  val along_interval = row.getAs[Int]("along_interval")
  val create_time = row.getAs[Timestamp]("create_time")
  val dt = row.getAs[String]("dt")
  val thumbnail_id1 = row.getAs[String]("thumbnail_id1")
  val thumbnail_url1 = row.getAs[String]("thumbnail_url1")
  val image_id1 = row.getAs[String]("image_id1")
  val image_url1 = row.getAs[String]("image_url1")
  val thumbnail_id2 = row.getAs[String]("thumbnail_id2")
  val thumbnail_url2 = row.getAs[String]("thumbnail_url2")
  val image_id2 = row.getAs[String]("image_id2")
  val image_url2 = row.getAs[String]("image_url2")
  val score1 = row.getAs[String]("score1")
  val score2 = row.getAs[String]("score2")
  val event = RelationshipPeer(sourceAid, targetAid, t1, t2, along_interval, source_id, create_time, dt, thumbnail_id1, thumbnail_url1, image_id1, image_url1, thumbnail_id2, thumbnail_url2, image_id2, image_url2, score1, score2)

  // Serialize the event to JSON and send it with a per-row key.
  val data: String = Serialization.write(event)
  val message = new ProducerRecord[String, String](topic, s"key$i", data)
  producer.send(message)
}

// send is asynchronous; close flushes any buffered records before the job exits.
producer.close()

// Event payload; field names mirror the columns selected from the source table.
case class RelationshipPeer(
  aid1: String, aid2: String, t1: String, t2: String,
  along_interval: Int, source_id: String, create_time: Timestamp, dt: String,
  thumbnail_id1: String, thumbnail_url1: String, image_id1: String, image_url1: String,
  thumbnail_id2: String, thumbnail_url2: String, image_id2: String, image_url2: String,
  score1: String, score2: String)
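
The code above uses `props` and `topic` without showing how they are defined. A minimal sketch of what they might look like follows; the object name `ProducerSettings`, the topic name, and the broker address in the usage note are all placeholders assumed for illustration, while the configuration keys are standard Kafka producer settings.

```scala
import java.util.Properties

// Hypothetical holder for the settings the producer code expects.
object ProducerSettings {
  // Placeholder topic name; the original post does not state the real one.
  val topic: String = "relationship_peer"

  /** Builds producer properties for String keys and String (JSON) values. */
  def build(bootstrapServers: String): Properties = {
    val props = new Properties()
    props.put("bootstrap.servers", bootstrapServers)
    // Both key and value are plain strings, matching KafkaProducer[String, String].
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Wait for all in-sync replicas to acknowledge each record.
    props.put("acks", "all")
    props
  }
}
```

With this in place, `val props = ProducerSettings.build("localhost:9092")` and `val topic = ProducerSettings.topic` would supply the two values, with the broker address replaced by the real cluster's.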