Fayson's GitHub: https://github.com/fayson/cdhproject
Recommended: follow the WeChat official account "Hadoop实操", ID: gh_c4c535955d0f
1 Environment Preparation
- Test environment:
1. CDH 6.2
2. Kerberos is enabled on the cluster
3. Red Hat 7.4
1. Prepare a text table; the data file is about 6 GB.
create table if not exists hive_table_test (
s1 string,
s2 string,
s3 string,
s4 string,
s5 string,
s6 string,
s7 string,
s8 string,
s9 string,
s10 string,
s11 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ","
stored as textfile location '/fayson/hive_table_test';
hadoop fs -put hbase_data.csv /fayson/hive_table_test
select * from hive_table_test limit 1;
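The article does not show how hbase_data.csv was produced. To reproduce the test without the original file, a minimal sketch such as the following could generate a stand-in with the same shape: 11 comma-separated string fields per line, matching columns s1 to s11. The function name gen_rows and the synthetic field values are assumptions made for illustration, not the author's data.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original article): print N lines,
# each with 11 comma-separated string fields, to match the s1..s11
# columns of hive_table_test. Field values are synthetic placeholders.
gen_rows() {
  local rows="$1"
  awk -v n="$rows" 'BEGIN {
    for (i = 1; i <= n; i++) {
      # Start the line with the first field, then append fields 2..11.
      line = "r" i "_c1"
      for (c = 2; c <= 11; c++) {
        line = line ",r" i "_c" c
      }
      print line
    }
  }'
}
```

For example, `gen_rows 1000 > hbase_data.csv` writes a small sample file; the row count would need to be raised considerably to approach the roughly 6 GB used in this test. On a Kerberos-enabled cluster, a valid ticket is needed before the `hadoop fs -put` step.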
2. Create a Parquet table, then insert the data from the text table into it.