Hive 分区表 进行动态插入

创建分区表

spark.sql(
          """
            |create table mro_ns2_hive_db.tmp_mro_msisdn_njy_xgboots_105_ts_partitions
            |(
            |sc_longitude double
            |,sc_latitude double
            |,sc_pci      double
            |,sc_freq     double
            |,scrsrp      double
            |,scrsrq      double
            |,nc1pci      double
            |,nc1freq     double
            |,nc1rsrp     double
            |,nc1rsrq     double
            |,nc2pci      double
            |,nc2freq     double
            |,nc2rsrp     double
            |,nc2rsrq     double
            |,nc3pci      double
            |,nc3freq     double
            |,nc3rsrp     double
            |,nc3rsrq     double
            |,sctadv      double
            |,longitude   double
            |,latitude	 double
            |,time_stamp string
            |,msisdn      string
            |)
            |partitioned by (key Int)
            |ROW FORMAT DELIMITED
            |FIELDS TERMINATED BY '\t'
            |STORED AS TEXTFILE
          """.stripMargin)

注意这里的分区字段不能在()中出现

动态插入

spark.sql(
      """
        |set hive.exec.dynamic.partition.mode=nonstrict
        |set hive.exec.dynamic.partition=true
      """.stripMargin)

这里是设置动态插入

spark.sql(
        """
          |insert overwrite table mro_ns2_hive_db.tmp_mro_msisdn_njy_xgboots_105_ts_partitions
          |PARTITION(key)
          |select
          |CAST( sc_longitude as DECIMAL(10,7)) as sc_longitude
          |,CAST( sc_latitude as DECIMAL(10,7)) as sc_latitude
          |,CAST( sc_pci as decimal) as sc_pci
          |,CAST( sc_freq as  decimal ) as sc_freq
          |,CAST( scrsrp as  decimal ) as scrsrp
          |,CAST( scrsrq as  decimal ) as scrsrq
          |,CAST( nc1pci as  decimal ) as nc1pci
          |,CAST( nc1freq as  decimal ) as nc1freq
          |,CAST( nc1rsrp as  decimal ) as nc1rsrp
          |,CAST( nc1rsrq as  decimal ) as nc1rsrq
          |,CAST( nc2pci as  decimal ) as nc2pci
          |,CAST( nc2freq as  decimal ) as nc2freq
          |,CAST( nc2rsrp as  decimal ) as nc2rsrp
          |,CAST( nc2rsrq as  decimal ) as nc2rsrq
          |,CAST( nc3pci as  decimal ) as nc3pci
          |,CAST( nc3freq as  decimal ) as nc3freq
          |,CAST( nc3rsrp as  decimal ) as nc3rsrp
          |,CAST( nc3rsrq as  decimal ) as nc3rsrq
          |,CAST(sctadv AS decimal)   as sctadv
          |,case when instr(longitude ,'.')>0 then CAST(longitude AS DECIMAL(10,7))
          |else (CAST(longitude AS DECIMAL(10,7))  * 360 / 16777216.0) end as longitude
          |,case when instr(latitude ,'.')>0 then CAST( latitude as DECIMAL(10,7))
          |else (CAST(latitude as DECIMAL(10,7)) * 90 / 8388608.0) end as latitude
          |,time_stamp
          |,msisdn
          |,key(CAST( sc_longitude as DECIMAL(10,7)),CAST( sc_latitude as DECIMAL(10,7))) as key
          |from mro_ns2_hive_db.tmp_mro_msisdn_njy_xgboots_105_ts
          |where city_id=571
          |distribute by cast(rand()*4 as int)
        """.stripMargin)

注意这里分区字段需要在最后

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
对于Hive分区表插入数据,有两种方式:动态分区插入和静态分区插入动态分区插入是指在插入数据时,根据数据中的某个列的值来动态创建分区,并将数据插入到相应的分区中。要使用动态分区插入,需要先设置以下两个参数: set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=strict; 然后使用以下语句进行插入操作: insert into table <table_name> partition(<partition_column>) select <columns> from <source_table>; 其中,<table_name>是目标表的名称,<partition_column>是用于分区的列名,<columns>是要插入的列,可以是源表的列或其他计算得到的值,<source_table>是源数据表的名称。 静态分区插入是指在插入数据时,直接指定分区的值,并将数据插入到指定的分区中。要使用静态分区插入,需要先设置以下两个参数: set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=nonstrict; 然后使用以下语句进行插入操作: insert into table <table_name> partition(<partition_columns>) select <columns> from <source_table>; 其中,<table_name>是目标表的名称,<partition_columns>是用于分区的列名及其对应的值,<columns>是要插入的列,可以是源表的列或其他计算得到的值,<source_table>是源数据表的名称。 这是Hive分区表插入数据的常用方法,可以根据实际需求选择使用动态分区插入还是静态分区插入

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值