抽取ds_db01库中customer_inf的增量数据进入Hive的ods库中表customer_inf。
根据ods.customer_inf表中modified_time作为增量字段,只将新增的数据抽入,字段名称、类型不变,同时添加静态分区,
分区字段为etl_date,类型为String,且值为当前日期的前一天日期(分区字段格式为yyyyMMdd)。
使用hive cli执行show partitions ods.customer_inf命令;
先在hive中建表:
create table ods.customer_inf(
customer_inf_id int,
customer_id int,
customer_name string,
identity_card_type int,
identity_card_no string,
mobile_phone string,
customer_email string,
gender string,
customer_point int,
register_time timestamp,
birthday timestamp,
customer_level int,
customer_money double,
modified_time timestamp
)partitioned by (etl_date string);
抽取ds_db01库中customer_inf的增量数据进入Hive的ods库中表customer_inf:
spark.read.format("jdbc")
.option("driver","com.mysql.jdbc.Driver")
.option("url","jdbc:mysql://192.168.45.17:3306/ds_db01?characterEncoding=UTF-8")
.option("user","root")
.option("password","123456")
.option("dbtable","customer_inf")
.load()
.createOrReplaceTempView("mysql_customer_inf")
insert overwrite table ods.customer_inf partition(etl_date=)
select * from mysql_customer_inf
where modified_time >
(select
if(max(modified_time) is null,cast(-1 as timestamp),max(modified_time))
from ods.customer_inf)
在hive客户端输入:
show partitions ods.customer_inf;