Final version
- Recommended storage format is ORC with Snappy compression (ORC's default compression is ZLIB)
COMMENT 'table comment'
PARTITIONED BY (dt string COMMENT 'daily partition')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\007'
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');
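The fragment above leaves out the CREATE TABLE header and column list. A minimal complete statement might look like the sketch below. The database, table, and column names are hypothetical. The codec is set through the `orc.compress` table property, and the delimiter clause is left out here on the assumption that it has no effect on ORC's binary layout.

```sql
-- Sketch only: demo_db, orders_orc and the columns are made-up names.
CREATE TABLE IF NOT EXISTS demo_db.orders_orc (
    order_id BIGINT        COMMENT 'order id',
    c_id     BIGINT        COMMENT 'customer id',
    amount   DECIMAL(18,2) COMMENT 'order amount'
)
COMMENT 'table comment'
PARTITIONED BY (dt STRING COMMENT 'daily partition')
STORED AS ORC
-- Without this property ORC falls back to its default codec, ZLIB.
TBLPROPERTIES ('orc.compress'='SNAPPY');
```

Running SHOW CREATE TABLE demo_db.orders_orc afterwards is a quick way to confirm that the property was stored as written.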
- Bucketing (partitioning is common; bucketing is rarely used)
COMMENT 'table comment'
PARTITIONED BY (dt string COMMENT 'daily partition')
CLUSTERED BY (c_id) INTO 4 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\007'
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');
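As a rough illustration of the bucketed variant, the sketch below creates a partitioned, bucketed ORC table, loads one partition, and samples a single bucket. All table and column names, the source table demo_db.orders_src, and the partition value are hypothetical.

```sql
-- Sketch only: every name and the date below are placeholders.
CREATE TABLE IF NOT EXISTS demo_db.orders_bucketed (
    order_id BIGINT,
    c_id     BIGINT,
    amount   DECIMAL(18,2)
)
COMMENT 'table comment'
PARTITIONED BY (dt STRING COMMENT 'daily partition')
CLUSTERED BY (c_id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');

-- Load one partition from a hypothetical source table.
INSERT OVERWRITE TABLE demo_db.orders_bucketed PARTITION (dt = '2024-01-01')
SELECT order_id, c_id, amount
FROM demo_db.orders_src
WHERE dt = '2024-01-01';

-- Read roughly one quarter of the rows by sampling bucket 1 of 4.
SELECT *
FROM demo_db.orders_bucketed TABLESAMPLE (BUCKET 1 OUT OF 4 ON c_id) s
WHERE s.dt = '2024-01-01';
```

Bucketing on c_id mainly pays off when the table is sampled or joined on that column, which may be why the notes describe it as rarely used.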
Writing CREATE TABLE statements
- Partitioned, \007 delimiter, textfile
comment 'table comment'
partitioned by (dt string)
row format delimited
fields terminated by '\007'
stored as textfile;
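- Partitioned, \007 delimiter, orc (ORC is a binary columnar format, so the delimiter is recorded in the table metadata but not used to lay out the data)
comment 'table comment'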
partitioned by (dt string)
row format delimited
fields terminated by '\007'
stored as orc;
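- row format delimited
ROW FORMAT DELIMITED NULL DEFINED AS ''
Purpose: store NULL values in this table as an empty string.
Note: a NULL in Hive is not a real NULL at the storage level. Hive's data lives in plain files, and in text files a NULL is actually written as \N.
Why set this explicitly?
Exporting Hive NULLs to MySQL as they are does not succeed.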
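A sketch of how the NULL DEFINED AS clause might sit inside a full statement for a text-format staging table that feeds an export. The table and column names are hypothetical. The ALTER TABLE form sets the serialization.null.format SerDe property, which is the property this clause controls.

```sql
-- Sketch only: demo_db.export_stage and its columns are made-up names.
CREATE TABLE IF NOT EXISTS demo_db.export_stage (
    id   BIGINT,
    name STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\007'
    NULL DEFINED AS ''          -- write NULL as an empty string instead of \N
STORED AS TEXTFILE;

-- For a text table that already exists, the same setting can be applied
-- through the SerDe property.
ALTER TABLE demo_db.export_stage
SET SERDEPROPERTIES ('serialization.null.format' = '');
```

One side effect to keep in mind: with this setting, empty fields in the files are read back as NULL, so a genuinely empty string and a NULL can no longer be told apart in this table.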
SHOW CREATE TABLE output
- Partitioned, \t delimiter (use with caution), textfile
PARTITIONED BY (dt string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim'='\t',
'serialization.format'='\t')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
- Partitioned, \t delimiter, orc
partitioned by (dt date)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
WITH SERDEPROPERTIES (
'field.delim'='\t',
'serialization.format'='\t')
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
- Partitioned, default delimiter, orc
partitioned by (dt date)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
- Partitioned, unknown delimiter (use with caution), textfile
PARTITIONED BY (dt string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim'='',
'serialization.format'='')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'