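Business requirement:
Write each day's newly generated data into a Hive partitioned table (partitioned by date).
Analysis:
Writing data into a Hive table with MapReduce really just means writing files into that table's directory on HDFS. The catch is that the data has to land in the current day's partition, so the problem becomes: how do we create the current day's partition of the Hive table ahead of time?
Solution: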
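To make the analysis concrete, here is a minimal shell sketch of how a day maps to a partition and to an HDFS directory. The table name comes from this post; the `YYYY-MM-DD` format for `ds` and the warehouse path are assumptions, and the two helper functions are hypothetical names used only for illustration. Your cluster's table location may differ.

```shell
#!/usr/bin/env bash
# Sketch only: derive the partition DDL and target directory for one day.
# Assumptions (not stated in the original post):
#   - ds values look like YYYY-MM-DD
#   - the table sits under the default warehouse layout shown below

TABLE="pms.test_rcmd_valid_path"
WAREHOUSE_DIR="/user/hive/warehouse/pms.db/test_rcmd_valid_path"  # assumed location

# Build the statement that registers one day's partition in the metastore.
# "if not exists" makes it safe to run more than once for the same day.
build_add_partition_sql() {
    local ds="$1"
    printf "alter table %s add if not exists partition (ds='%s');" "$TABLE" "$ds"
}

# Build the directory a MapReduce job would write into for that partition.
build_partition_dir() {
    local ds="$1"
    printf "%s/ds=%s" "$WAREHOUSE_DIR" "$ds"
}

today="$(date +%Y-%m-%d)"
echo "SQL: $(build_add_partition_sql "$today")"
echo "Dir: $(build_partition_dir "$today")"

# To actually register the partition, one would run something like:
#   hive -e "$(build_add_partition_sql "$today")"
```

The point of the sketch is that files placed in a `ds=...` directory are not visible to queries until the partition is registered in the metastore, which is why the partition has to exist before (or be added right after) the MapReduce job writes its output.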
1. Create the Hive table
# First create the partitioned table pms.test_rcmd_valid_path
hive -e "set mapred.job.queue.name=pms;
drop table if exists pms.test_rcmd_valid_path;
create table if not exists pms.test_rcmd_valid_path
(
track_id string,
track_time string,
session_id string,
gu_id string,
end_user_id string,
page_category_id bigint,
algorithm_id int,
is_add_cart int,
rcmd_product_id bigint,
product_id bigint,
path_id string,
path_type string,
path_length int,
path_list string,
order_code string,
groupon_id bigint
)
partitioned by (ds string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY