Converting NULL to an empty string in storage can save disk space. There are several ways to do it:
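1) Specify it when creating the table (two ways)
a. With ROW FORMAT SERDE and WITH SERDEPROPERTIES, for example:
CREATE TABLE text_1 (
id int
,name STRING)
PARTITIONED BY (partation_date string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('field.delim'='\t', 'escape.delim'='\\', 'serialization.null.format'='')
-- LazySimpleSerDe is a text SerDe, so the table is stored as a text file
STORED AS TEXTFILE;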
b. Or with ROW FORMAT DELIMITED NULL DEFINED AS '', for example:
CREATE TABLE text_2 (
id int,
name STRING)
PARTITIONED BY (partation_date string)
ROW FORMAT DELIMITED NULL DEFINED AS ''
STORED AS TEXTFILE;
2) Alter an existing table
ALTER TABLE hive_tb SET SERDEPROPERTIES ('serialization.null.format' = '');
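For a partitioned table, the table-level statement above generally does not touch partitions that already exist, because each partition keeps its own SerDe properties. The sketch below shows one way to handle that. It assumes hive_tb is partitioned by a column named partation_date and uses a made-up partition value; both are placeholders, not taken from a real table. Data files that were already written are not rewritten, so old rows still contain \N on disk.

```sql
-- Table-level change: used by partitions created from now on.
ALTER TABLE hive_tb SET SERDEPROPERTIES ('serialization.null.format' = '');

-- Partition-level change for one existing partition.
-- The partition column and value are hypothetical placeholders.
ALTER TABLE hive_tb PARTITION (partation_date = '2020-01-01')
  SET SERDEPROPERTIES ('serialization.null.format' = '');

-- Inspect the storage parameters to see which value is in effect.
DESCRIBE FORMATTED hive_tb;
DESCRIBE FORMATTED hive_tb PARTITION (partation_date = '2020-01-01');
```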
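Note: this setting only changes the underlying storage. By default NULL is stored as \N; with this setting it is stored as ''. In the CLI, however, the value is still displayed as NULL, even though '' is what is actually stored.
select * from text_2; the affected values are all shown as NULL.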
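A minimal sketch for checking this behaviour on a throwaway table. The table name null_demo and its rows are assumptions made up for illustration. One side effect worth checking: with the null format set to '', a real empty string in a string column is written the same way as NULL, so it is expected to read back as NULL as well.

```sql
-- Hypothetical demo table, not part of the examples above.
CREATE TABLE null_demo (
  id int,
  name string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  NULL DEFINED AS ''
STORED AS TEXTFILE;

-- One ordinary value, one NULL, one empty string.
INSERT INTO TABLE null_demo SELECT 1, 'a';
INSERT INTO TABLE null_demo SELECT 2, CAST(NULL AS string);
INSERT INTO TABLE null_demo SELECT 3, '';

-- Compare how the NULL row and the empty-string row are read back.
SELECT id, name, name IS NULL AS name_is_null FROM null_demo;

-- Shows the table Location and the serialization.null.format value,
-- which helps when inspecting the raw files on HDFS.
DESCRIBE FORMATTED null_demo;
```

If the application needs to tell empty strings apart from NULL, this setting is probably not a good fit for that table.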