从文本文件导入数据到hive表中
1.数据保存为CSV格式,不能带标题行,以逗号分隔,用tr命令将逗号替换成hive默认的\001分隔
tr ',' '\001' < /home/zengsiwei362/zhoulixin54520150730.csv > /home/zengsiwei362/zhoulixin54520150730.txt
2.建一张与目标表一样的临时表,可以是分区表也可以不是分区表,导入的语句不同而已,但是文件格式一定是textfile
3.导入数据到tmp表,再从tmp表加载到正式表,文件可以放在本地,也可以放到hdfs,本地:load data local inpath ... hdfs:去掉local
另外”overwrite into “和”into“的区别没验证过
hive -e " use pad_hdp;
load data local inpath '/home/zengsiwei362/zhoulixin54520150730.txt' overwrite into table pad_hdp.DWY2_CUST_VEHICLE_import_tmp ;
--load data local inpath '/home/zengsiwei362/zhoulixin54520150730.txt' overwrite into table pad_hdp.DWY2_CUST_VEHICLE_import_tmp partition(op_day='20150707');
set mapred.job.queue.name=queue02;
INSERT overwrite TABLE pad_hdp.DWY2_CUST_VEHICLE PARTITION (OP_DAY = '20150707')
SELECT *
FROM pad_hdp.DWY2_CUST_VEHICLE_import_tmp ;"
建表语句:
USE PAD_HDP;
DROP TABLE DWY2_CUST_VEHICLE_import_tmp;
CREATE TABLE DWY2_CUST_VEHICLE_import_tmp(
list_no string comment '',
vehicle_no string comment '',
tcims_cust_id string comment '',
asset_id double comment '',
contact_name string comment '',
contact_email string comment '',
zip_code string comment '',
address string comment '',
remark string comment '',
driver_license_no string comment '',
driver_license_fst_issue_date string comment '',
drive_vehicle_type_code string comment '',
brand_type_code string comment '',
usage_attribute string comment '',
usage_code string comment '',
attribute_code string comment '',
vehicle_type string comment '',
vehicle_type_code string comment '',
vehicle_body_color string comment '',
automodel_name string comment '',
brand_type_first_sale_date string comment '',
seat_number