Spark Streaming Real-Time Processing

This post explores real-time data processing with Spark Streaming, covering real-time daily and hourly statistics of order sales. It also tackles a concrete requirement: detecting a device on which three or more different accounts log in and claim coupons within five minutes. It further shows how to create tables in Phoenix for storage, outlines real-time daily active user (DAU) statistics in SQL, and demonstrates flexible analysis of purchase details, covering the supporting Bean and Util classes along the way.

POM dependencies

<dependencies>


        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.12</artifactId>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka-0-10_2.12</artifactId>
        </dependency>

        <dependency>
            <groupId>redis.clients</groupId>
            <artifactId>jedis</artifactId>
            <version>2.9.0</version>
        </dependency>

        <dependency>
            <groupId>com.wm.realtime</groupId>
            <artifactId>gmall-common</artifactId>
            <version>1.0-SNAPSHOT</version>
        </dependency>

        <dependency>
            <groupId>org.apache.phoenix</groupId>
            <artifactId>phoenix-spark</artifactId>
            <version>5.0.0-HBase-2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
        </dependency>

        <!-- Elasticsearch dependencies -->
        <dependency>
            <groupId>io.searchbox</groupId>
            <artifactId>jest</artifactId>
            <version>6.3.1</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.47</version>
        </dependency>
    </dependencies>

Real-time daily and hourly statistics of order sales

import com.alibaba.fastjson.JSON
import com.atguigu.gmall.realtime.bean.OrderInfo
import com.atguigu.realtime.gmall.common.Constant
import org.apache.spark.streaming.dstream.DStream


/**
 * Real-time daily and hourly statistics of order sales
 */
object OrderAppV2 extends BaseApp {
   
    override val topics: Set[String] = Set(Constant.ORDER_INFO_TOPIC)
    override val groupId: String = "OrderApp"
    override val appName: String = "OrderApp"
    override val master: String = "local[2]"
    override val bachTime: Int = 3
    
    
    override def run(sourceStream: DStream[String]): Unit = {
   
        sourceStream
            .map(json => JSON.parseObject(json, classOf[OrderInfo]))
            .foreachRDD(rdd => {
   
                import org.apache.phoenix.spark._
                rdd.saveToPhoenix("gmall_order_info0421",
                    Seq("ID", "PROVINCE_ID", "CONSIGNEE", "ORDER_COMMENT", "CONSIGNEE_TEL", "ORDER_STATUS", "PAYMENT_WAY", "USER_ID", "IMG_URL", "TOTAL_AMOUNT", "EXPIRE_TIME", "DELIVERY_ADDRESS", "CREATE_TIME", "OPERATE_TIME", "TRACKING_NO", "PARENT_ORDER_ID", "OUT_TRADE_NO", "TRADE_BODY", "CREATE_DATE", "CREATE_HOUR"),
                    zkUrl = Option("hadoop102,hadoop103,hadoop104:2181"))
            })
    }
    
}
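The Phoenix columns CREATE_DATE and CREATE_HOUR drive the daily and hourly aggregation, so the OrderInfo bean has to derive them from the raw create_time. A minimal sketch of that derivation (the helper name and the "yyyy-MM-dd HH:mm:ss" timestamp format are assumptions, not shown in the original post):

```scala
// Hypothetical helper: split a "yyyy-MM-dd HH:mm:ss" timestamp into the
// (date, hour) pair stored in CREATE_DATE / CREATE_HOUR.
def splitCreateTime(createTime: String): (String, String) = {
  val date = createTime.take(10)          // "yyyy-MM-dd"
  val hour = createTime.substring(11, 13) // "HH"
  (date, hour)
}
```

With these two columns materialized at write time, the daily and hourly sums become simple GROUP BY queries on the Phoenix table.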

Requirement: the same device uses three or more different accounts within 5 minutes to log in and claim coupons

import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.{Seconds, StreamingContext}

abstract class BaseApp {
   
    val topics: Set[String]
    val groupId: String
    val master: String
    val appName: String
    val bachTime: Int
    
    def run(sourceStream: DStream[String]): Unit
    
    def main(args: Array[String]): Unit = {
   
        val conf: SparkConf = new SparkConf().setMaster(master).setAppName(appName)
        val ssc = new StreamingContext(conf, Seconds(bachTime))
        
        val sourceStream: DStream[String] = MyKafkaUtil
            .getKafkaStream(ssc, groupId, topics)
        
        run(sourceStream)
        
        
        // 4. Start the streaming context
        ssc.start()
        // 5. Block until termination
        ssc.awaitTermination()
    }
}
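BaseApp delegates all Kafka wiring to a MyKafkaUtil helper that the post never shows. A minimal sketch of what it might look like under the spark-streaming-kafka-0-10 API pulled in by the POM above (only the object name and method signature come from the code; the bootstrap servers and consumer settings are assumptions):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object MyKafkaUtil {

    // Subscribe to the given topics and expose the record values as a DStream[String]
    def getKafkaStream(ssc: StreamingContext,
                       groupId: String,
                       topics: Set[String]): DStream[String] = {
        val kafkaParams = Map[String, Object](
            // broker list assumed from the hadoop102-104 hosts used elsewhere
            "bootstrap.servers" -> "hadoop102:9092,hadoop103:9092,hadoop104:9092",
            "key.deserializer" -> classOf[StringDeserializer],
            "value.deserializer" -> classOf[StringDeserializer],
            "group.id" -> groupId,
            "auto.offset.reset" -> "latest",
            "enable.auto.commit" -> (true: java.lang.Boolean)
        )
        KafkaUtils.createDirectStream[String, String](
            ssc,
            LocationStrategies.PreferConsistent,
            ConsumerStrategies.Subscribe[String, String](topics, kafkaParams)
        ).map(_.value())
    }
}
```

Mapping to `_.value()` right away keeps the subclasses of BaseApp working with plain JSON strings, which is exactly what `JSON.parseObject` expects.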
import java.{util => ju}

import com.alibaba.fastjson.JSON
import com.atguigu.gmall.realtime.bean.{AlertInfo, EventLog}
import com.atguigu.gmall.realtime.util.ESUtil
import com.atguigu.realtime.gmall.common.Constant
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.{Minutes, Seconds}

import scala.util.control.Breaks._

/**
 * Real-time alert: the same device logs in with three or more different
 * accounts and claims coupons within 5 minutes
 */
object AlertApp extends BaseApp {
   
    override val topics: Set[String] = Set(Constant.EVENT_TOPIC)
    override val groupId: String = "AlertApp"
    override val master: String = "local[2]"
    override val appName: String = "AlertApp"
    override val bachTime: Int = 3
    
    override def run(sourceStream: DStream[String]): Unit = {
   
        val eventLogStream = sourceStream
            .map(json => {
   
                val log = JSON.parseObject(json, classOf[EventLog])
                (log.mid, log)
            })
            .window(Minutes(5), Seconds(6))

        val alertInfoStream = eventLogStream
            .groupByKey()
            .map {
   
                case (mid, it: Iterable[EventLog]) =>
                    // From this device's logs in the window, collect:
                    // 1. user ids that claimed a coupon on this device
                    val uids = new ju.HashSet[String]()
                    // 2. every event id that occurred on this device
                    val events = new ju.ArrayList[String]()
                    // 3. items whose coupons were claimed
                    val items = new ju.HashSet[String]()
                    // whether the device also viewed an item's details
                    var isClickItem = false
                    breakable {
                        it.foreach(log => {
                            events.add(log.eventId)
                            log.eventId match {
                                case "coupon" =>
                                    uids.add(log.uid)
                                    items.add(log.itemId)
                                case "clickItem" =>
                                    isClickItem = true
                                    break
                                case _ =>
                            }
                        })
                    }
                    // alert: 3+ distinct accounts claimed coupons, no item viewed
                    (uids.size() >= 3 && !isClickItem,
                        AlertInfo(mid, uids, items, events, System.currentTimeMillis()))
            }
            .filter(_._1)
            .map(_._2)

        // Write the alerts to Elasticsearch
        // (AlertInfo field order, ESUtil API and index name are assumed)
        alertInfoStream.foreachRDD((rdd: RDD[AlertInfo]) => {
            ESUtil.insertBulk("gmall_coupon_alert", rdd.collect().toIterator)
        })
    }
}
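Stripped of the Spark plumbing, the alert decision per device is a pure function over the windowed events, which makes it easy to check in isolation. A distilled sketch (`Ev` and `shouldAlert` are hypothetical names standing in for EventLog; the "coupon"/"clickItem" event ids follow the event-log convention used above):

```scala
// Simplified event record (hypothetical; stands in for EventLog)
case class Ev(eventId: String, uid: String)

// Alert when 3+ distinct accounts claimed coupons on one device within
// the window, and the device never viewed an item's details.
def shouldAlert(events: Seq[Ev]): Boolean = {
  val viewedItem = events.exists(_.eventId == "clickItem")
  val couponUids = events.filter(_.eventId == "coupon").map(_.uid).toSet
  couponUids.size >= 3 && !viewedItem
}
```

The "never viewed an item" condition is what separates a bot farming coupons from a real shopper: a genuine user claiming a coupon normally browses the item as well.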