Two APIs for partitioned JDBC reads from MySQL in Spark

1. sparkSession.read.jdbc with a predicates array

Each string in the array is used as the WHERE condition of one partition, so the array length determines the number of partitions.

    val modelDataSql = "select pay_time, ...   "
    
    // Partition by time: one predicate per month of pay_time
    val partitionClause = Array(
                            "EXTRACT(month FROM pay_time) = 1 "
                          , "EXTRACT(month FROM pay_time) = 2 "
                          , "EXTRACT(month FROM pay_time) = 3 "
                          , "EXTRACT(month FROM pay_time) = 4 "
                          , "EXTRACT(month FROM pay_time) = 5 "
                          , "EXTRACT(month FROM pay_time) = 6 "
                          , "EXTRACT(month FROM pay_time) = 7 "
                          , "EXTRACT(month FROM pay_time) = 8 "
                          , "EXTRACT(month FROM pay_time) = 9 "
                          , "EXTRACT(month FROM pay_time) = 10 "
                          , "EXTRACT(month FROM pay_time) = 11 "
                          , "EXTRACT(month FROM pay_time) = 12 "
            )

    sparkSession
        .read
        .jdbc(
            "jdbc.xxxx.url",
            s"( $modelDataSql ) T",
            partitionClause,
            jdbcProperties
        )
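Since the twelve month predicates above differ only in the month number, they can also be generated programmatically. A minimal sketch (plain Scala, no Spark dependency):

```scala
// Sketch: build one predicate per month; each string becomes the
// WHERE condition of one Spark partition when passed to read.jdbc.
val partitionClause: Array[String] =
  (1 to 12).map(m => s"EXTRACT(month FROM pay_time) = $m").toArray
```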

2. Specify the partitioning via option()

With this API, `partitionColumn` must be a numeric, date, or timestamp column, and the four options `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` must be set together. Note that `lowerBound` and `upperBound` only control how the partition ranges are split; they do not filter rows.

    val dataDf: DataFrame = spark.read.format("jdbc")
      .option("url", jdbcArgs("url"))
      .option("driver", jdbcArgs("driver"))
      .option("user", jdbcArgs("user"))
      .option("password", jdbcArgs("password"))
      .option("fetchSize", jdbcArgs("fetchSize"))
      .option("dbtable", "(" + jdbcArgs("sql") + ") temp")
      .option("partitionColumn", "xxxxx")
      .option("lowerBound", "xxxxx")
      .option("upperBound", "xxxxx")
      .option("numPartitions", xxx)
      .load()
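Unlike the predicates overload, this API splits the range `[lowerBound, upperBound)` into `numPartitions` strides of roughly equal width and derives one WHERE clause per stride. The sketch below illustrates the idea (it is a simplified illustration, not Spark's source; the column name `id` and the bounds are assumed values, and real Spark additionally routes NULLs into the first partition):

```scala
// Illustration of stride-based partitioning: given a column name and
// bounds, produce one range predicate per partition. The first and last
// partitions are open-ended so no rows outside the bounds are dropped.
def strideClauses(col: String, lower: Long, upper: Long, numPartitions: Int): Seq[String] = {
  val stride = (upper - lower) / numPartitions
  (0 until numPartitions).map { i =>
    val lo = lower + i * stride
    if (i == 0) s"$col < ${lo + stride}"
    else if (i == numPartitions - 1) s"$col >= $lo"
    else s"$col >= $lo AND $col < ${lo + stride}"
  }
}

// Example: 4 partitions over [0, 1000) gives strides of 250.
val clauses = strideClauses("id", 0L, 1000L, 4)
```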
