1. Upload the CSV file to the /test/marco/ directory in HDFS
hdfs dfs -put /app/scripts/feaure/xxx/schedule.csv /test/marco/
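Before uploading, it can help to confirm that the file matches the two-column layout the table expects. The following is a minimal sketch, not part of the original procedure: check_schedule_csv is a hypothetical helper name, and it assumes the CSV has no header row and no quoted fields. A header line would otherwise be read by a text-format table as an ordinary data row.

```shell
# Hypothetical helper (not from the original steps): succeeds only if every
# line has exactly two comma-separated fields and the second one is an
# integer, matching day_dt STRING and schedule_ix INT.
check_schedule_csv() {
  awk -F',' '
    NF != 2 || $2 !~ /^-?[0-9]+$/ {
      printf "line %d looks invalid: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

# Usage, assuming the hdfs client is on PATH:
#   check_schedule_csv /app/scripts/feaure/xxx/schedule.csv \
#     && hdfs dfs -put /app/scripts/feaure/xxx/schedule.csv /test/marco/
```

A header row or a line with Windows line endings fails the integer check, so both are reported before anything reaches HDFS.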
2. Create an external table in Impala
CREATE EXTERNAL TABLE if not exists feature.fea_schedule (
day_dt STRING COMMENT 'date',
schedule_ix int COMMENT 'sort order of the schedule period'
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
WITH SERDEPROPERTIES ('field.delim'=',', 'serialization.format'=',', 'serialization.null.format'='NULL')
STORED AS TEXTFILE
LOCATION '/test/marco/'
TBLPROPERTIES ('serialization.null.format'='NULL');
Note: LOCATION must match the HDFS directory the file was uploaded to.
3. Load the data from the CSV file into the external table
load data inpath '/test/marco/schedule.csv' into table feature.fea_schedule;
That's it.
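One way to sanity-check the result is to compare the number of rows in the local CSV with the row count Impala reports. This is only a sketch under assumptions: count_csv_rows is a hypothetical helper, the local copy of the file is assumed to still exist, and impala-shell is assumed to be installed and able to reach the cluster.

```shell
# Hypothetical helper: count the non-blank lines in a local CSV file.
count_csv_rows() {
  awk 'NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Usage, assuming impala-shell can connect with its default settings:
#   expected=$(count_csv_rows /app/scripts/feaure/xxx/schedule.csv)
#   actual=$(impala-shell -B --quiet -q "SELECT COUNT(*) FROM feature.fea_schedule")
#   [ "$expected" = "$actual" ] && echo "row counts match"
```

If the counts differ, a header row or trailing blank lines in the CSV are common things to check first.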