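Hive has three kinds of tables: managed (internal) tables (dropping the table also deletes its HDFS files), external tables (dropping the table removes only the metadata and leaves the HDFS files in place), and temporary tables (visible only in the current session and gone when the session ends).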
There are three ways to create a table in Hive:
- Create the table directly
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)
[(col_name data_type [column_constraint_specification] [COMMENT col_comment], ... [constraint_specification])]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]
ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
[STORED AS DIRECTORIES]
[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)
[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
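The above is the syntax from the official documentation. Some notes:
Items in [] are optional
TEMPORARY: keyword for a temporary table, which is visible only in the current session and disappears when the session ends
EXTERNAL: keyword for an external table; dropping an external table removes only the metadata and leaves the underlying HDFS files in place
If neither keyword is given (only one of the two may be used), the table is a managed (internal) table; dropping it removes both the metadata and the underlying HDFS files
column_constraint_specification: column constraints, similar to column constraints in MySQL
COMMENT: description of a column or of the table
PARTITIONED BY: the partition columns
CLUSTERED BY ... SORTED BY: bucketing (CLUSTERED BY names the bucketing columns, SORTED BY the sort order inside each bucket, INTO num_buckets BUCKETS the number of buckets)
SKEWED BY: the skewed columns and their skewed values
ROW FORMAT: how a row is laid out, for example the field and line delimiters
STORED AS: the storage file format
LOCATION: the HDFS path of the table data
TBLPROPERTIES: table-level properties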
create table temp.temp_create_table_test (
col1 string comment "column 1",
col2 int comment "column 2"
)
comment "test table"
partitioned by (part_col string comment "partition column")
clustered by (col1) sorted by (col1 desc) into 2 buckets
skewed by (col1) on ((2), (3))
row format delimited fields terminated by "\t"
stored as parquet
location "/user/hive/warehouse-3.1.1/temp.db/temp_create_table_test"
tblproperties("parquet.compression"="snappy");
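The difference between the three table kinds described above is easiest to see side by side. The following is a minimal sketch rather than a definitive recipe: the table names `temp.demo_managed`, `temp.demo_external`, `demo_tmp` and the path `/tmp/demo_external` are hypothetical and chosen only for illustration.

```sql
-- Managed (internal) table: Hive owns both the metadata and the data files.
CREATE TABLE IF NOT EXISTS temp.demo_managed (id INT, name STRING);

-- External table: Hive owns only the metadata; the files under LOCATION
-- belong to whoever wrote them. The path below is a hypothetical example.
CREATE EXTERNAL TABLE IF NOT EXISTS temp.demo_external (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/tmp/demo_external';

-- Temporary table: visible only in the current session.
CREATE TEMPORARY TABLE demo_tmp (id INT, name STRING);

-- The "Table Type" row of the output shows MANAGED_TABLE or EXTERNAL_TABLE.
DESCRIBE FORMATTED temp.demo_managed;
DESCRIBE FORMATTED temp.demo_external;

-- Dropping the managed table removes metadata and data files;
-- dropping the external table removes the metadata only.
DROP TABLE IF EXISTS temp.demo_managed;
DROP TABLE IF EXISTS temp.demo_external;
```

After the two DROP statements, listing `/tmp/demo_external` on HDFS should still show any files that were placed there, whereas the managed table's warehouse directory is gone.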
- Create a new table by copying the structure of an existing table
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
LIKE existing_table_or_view_name
[LOCATION hdfs_path];
create table temp.temp_create_table_like_test like temp.temp_create_table_test;
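CREATE TABLE ... LIKE copies the table definition but not the rows. As a sketch of how the optional keywords from the syntax above combine, the statement below creates an external copy at an explicit location; the table name `temp.temp_create_table_like_ext` and the path `/tmp/temp_create_table_like_ext` are hypothetical.

```sql
-- Copy only the definition of the source table into a new external table.
-- The target name and the LOCATION path are illustrative assumptions.
CREATE EXTERNAL TABLE IF NOT EXISTS temp.temp_create_table_like_ext
LIKE temp.temp_create_table_test
LOCATION '/tmp/temp_create_table_like_ext';

-- Inspect the generated DDL to confirm that the columns, the partition
-- column and the storage format were carried over from the source table.
SHOW CREATE TABLE temp.temp_create_table_like_ext;

-- No data is copied, so the new table has no partitions yet.
SHOW PARTITIONS temp.temp_create_table_like_ext;
```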
- Create a table from a query (CREATE TABLE ... AS SELECT)
create table temp.temp_create_table_as_test as
select col1, col2 from temp.temp_create_table_test where part_col = '2';
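A CTAS statement can also carry storage clauses for the new table. The sketch below assumes the source table from the first example exists; the target name `temp.temp_create_table_as_orc` is hypothetical.

```sql
-- Create and populate a table in one step, choosing the storage format.
-- The target table name is an illustrative assumption.
CREATE TABLE IF NOT EXISTS temp.temp_create_table_as_orc
STORED AS ORC
AS
SELECT col1, col2
FROM temp.temp_create_table_test
WHERE part_col = '2';

-- The new table takes its columns from the SELECT list (col1, col2).
-- It does not inherit the partitioning or bucketing of the source table.
DESCRIBE FORMATTED temp.temp_create_table_as_orc;
```

As the syntax notes above say, the target of a CTAS cannot be an external table.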
Official documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL