注意:Hive建表时,要加上Hive分区和字段分割符。
create table *** (
**** int ,
**** varchar(255)
)
PARTITIONED BY (
`store_day` int COMMENT '存储日期')
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim'='\t',
'line.delim'='\n',
'serialization.format'='\t');
1、将mysql的数据导出。
mysql -hmysqlhost -uweb_hive -pmysqlpassword -N -e "select *,enter_day from w37_tj_h5.login_logs where enter_day < 20171204" > localfile
2、将这些数据插入到一个指定的分区中。
hive -e "load data local inpath 'localfile' overwrite into table database.table partition(store_day=20171101)";
3、将数据重新插入到所有分区中
登陆到hive的命令行:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=3000;
set hive.exec.max.dynamic.partitions=6000;
set mapreduce.map.memory.mb=2048;
set mapreduce.reduce.memory.mb=3072;
设置好以上配置后,执行以下语句:
insert overwrite table login_logs partition(store_day)
select enter_time,enter_day,uid,register_game,register_server,register_app,register_time,register_day,
game_id,server_id,appid,app_param,ad_type,ip,dev,nettype,phone_model,os_version,deviceplate,
browser,37_is_old,game_is_old,auth_type,enter_day from login_logs where store_day = 20171101 ;
注意:一定要查出每个字段,然后最后补上动态分区的字段。
4、查看数据是否已经插入完成
在Linux的命令行下查看:
hadoop fs -du -h /user/hive/warehouse/h5_game_platform_publish.db/login_logs