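File format | Compression codecs | Can Impala create it directly? | Can Impala insert directly? |
Parquet | Snappy (default), GZIP | Yes | Supported: CREATE TABLE, INSERT, and queries |
TextFile | LZO, gzip, bzip2, Snappy | Yes. A CREATE TABLE statement with no STORED AS clause defaults to uncompressed text | Supported: CREATE TABLE, INSERT, and queries. With LZO compression, the table must be created and the data loaded in Hive |
RCFile | Snappy, GZIP, deflate, BZIP2 | Yes | Supported: CREATE TABLE and queries; the data must be loaded in Hive |
SequenceFile | Snappy, GZIP, deflate, BZIP2 | Yes | Supported: CREATE TABLE, INSERT, and queries. INSERT requires a setting (see example 2) |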
Note: Impala does not support the ORC format (in the version used here).
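The RCFile row says the table can be created in Impala while the data is loaded in Hive. A minimal sketch of that flow follows. The table name student_rc and the hive> prompt are illustrative assumptions (a Hive CLI session sharing the same metastore), not part of the original notes.

-- In impala-shell: create the RCFile table (hypothetical name student_rc)
[hadoop104:21000] > create table student_rc(id int, name string) stored as rcfile;
-- In Hive (assumed session): write the data, since Impala cannot insert into RCFile
hive> insert into table student_rc values(1001,'zhangsan');
-- Back in impala-shell: pick up the files Hive wrote, then query
[hadoop104:21000] > refresh student_rc;
[hadoop104:21000] > select * from student_rc;

Without the REFRESH, Impala would likely keep using its cached file list and not see the rows written by Hive.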
1. Create a Parquet table, insert data, and query it
[hadoop104:21000] > create table student2(id int, name string)
> row format delimited
> fields terminated by '\t'
> stored as PARQUET;
[hadoop104:21000] > insert into table student2 values(1001,'zhangsan');
[hadoop104:21000] > select * from student2;
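To check what was actually written, a sketch like the following may help. DESCRIBE FORMATTED and SHOW TABLE STATS are standard Impala statements. The COMPRESSION_CODEC step and the table name student2_gzip are illustrative assumptions, not part of the original walkthrough. The ROW FORMAT clause is left out of the new table because Parquet is a binary columnar format, so the field delimiter is not expected to affect it.

-- Inspect the storage format of the table created above
[hadoop104:21000] > describe formatted student2;
[hadoop104:21000] > show table stats student2;
-- Hypothetical second table written with GZIP instead of the default Snappy
[hadoop104:21000] > create table student2_gzip(id int, name string) stored as parquet;
[hadoop104:21000] > set COMPRESSION_CODEC=gzip;
[hadoop104:21000] > insert into table student2_gzip select * from student2;
-- Restore the default codec for the rest of the session
[hadoop104:21000] > set COMPRESSION_CODEC=snappy;

The Format column of SHOW TABLE STATS should read PARQUET for both tables. The codec is a property of the data files, not of the table definition.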
2. Create a SequenceFile table; inserting data fails with an error
[hadoop104:21000] > create table student3(id int, name string)
> row format delimited
> fields terminated by '\t'
> stored as sequenceFile;
[hadoop104:21000] > insert into table student3 values(1001,'zhangsan');
Query: insert into table student3 values(1001,'zhangsan')
Query submitted at: 2018-10-25 20:59:31 (Coordinator: http://hadoop104:25000)
Query progress can be monitored at: http://hadoop104:25000/query_plan?query_id=da4c59eb23481bdc:26f012ca00000000
WARNINGS: Writing to table format SEQUENCE_FILE is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS to override.
Set the query option named in the warning, then retry the insert:
[hadoop104:21000] > set ALLOW_UNSUPPORTED_FORMATS=true;
[hadoop104:21000] > insert into table student3 values(1001,'zhangsan');
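Once the retry succeeds, the result can be checked and the option switched back off. This is a sketch only; the original notes stop at the second insert, so the statements below are assumed follow-up steps.

-- Confirm the row landed and that the table is stored as a sequence file
[hadoop104:21000] > select * from student3;
[hadoop104:21000] > show table stats student3;
-- Query options set in impala-shell last for the session, so turn this one off again
[hadoop104:21000] > set ALLOW_UNSUPPORTED_FORMATS=false;

Leaving the option enabled would let later inserts into other unsupported formats go through without the warning, which is probably not what you want.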