1 文件格式
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
https://cwiki.apache.org/confluence/display/Hive/SerDe
file_format:
: SEQUENCEFILE
| TEXTFILE -- (Default, depending on hive.default.fileformat configuration)
| RCFILE -- (Note: Available in Hive 0.6.0 and later)
| ORC -- (Note: Available in Hive 0.11.0 and later)
| PARQUET -- (Note: Available in Hive 0.13.0 and later)
| AVRO -- (Note: Available in Hive 0.14.0 and later)
| JSONFILE -- (Note: Available in Hive 4.0.0 and later)
| INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
2 总结
实际的项目开发中,hive 表的数据,推荐
- 存储格式 :
orcfile
,parquet
- 数据压缩:
snappy