Hive table:
select * from tmp.zsh_test1;
c1  c2  c3    c4    c5    c6    topicdate
11  12  13    NULL  NULL  NULL  2018-09-01
21  22  23    NULL  NULL  NULL  2018-09-02
31  32  33    NULL  NULL  NULL  2018-09-03
41  42  43    44    NULL  NULL  2018-09-04
51  52  NULL  54    55    NULL  2019-02-13
(Note: CREATE TABLE `zsh_test1`(
`c1` int,
`c2` int,
`c3` int,
`c4` string,
`c5` string,
`c6` string)
PARTITIONED BY (
`topicdate` string)
)
HDFS files:
hadoop fs -text hdfs:///user/hive/warehouse/tmp.db/zsh_test1/topicdate=2018-09-01/* | less
hadoop fs -text hdfs:///user/hive/warehouse/tmp.db/zsh_test1/topicdate=2019-02-13/* | less
Questions:
1. Why are the NULLs not stored in the file for 2018-09-01?
2. Why are the NULLs in int c3 and string c6 both stored as '\N' in the file for 2019-02-13?
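Answer: In Hive's default text storage, NULL is written as the literal '\N' for every column type, INT and STRING alike. The marker can be changed with the table property serialization.null.format. Question 1 is a different case. One likely cause (an assumption, not verified here) is that c4, c5 and c6 were added to the table after the 2018-09-01 partition was written: a stored row with fewer fields than the current schema has no place for '\N', and Hive reads the missing trailing columns as NULL.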
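The two cases can be imitated with a small Python sketch. This is not Hive code, only a rough model of how a delimited text row is written and read back. It assumes the table uses Hive's default field delimiter (Ctrl-A, \x01) and the default null marker '\N', neither of which is shown in the CREATE TABLE above. The helper names encode_row and decode_row are hypothetical, and the three-field row for 2018-09-01 is an assumed file content, not something read from HDFS.

```python
from typing import List, Optional

FIELD_DELIM = "\x01"  # assumed: Hive's default field delimiter (Ctrl-A)
NULL_TOKEN = "\\N"    # assumed: default value of serialization.null.format


def encode_row(values: List[Optional[object]]) -> str:
    """Write one row as delimited text; a NULL value becomes the \\N marker."""
    return FIELD_DELIM.join(NULL_TOKEN if v is None else str(v) for v in values)


def decode_row(line: str, num_columns: int) -> List[Optional[str]]:
    """Read one stored line against a schema with num_columns columns."""
    fields: List[Optional[str]] = list(line.split(FIELD_DELIM))
    # Columns that have no field in the line (row written under a narrower
    # schema) are padded with None, so they read back as NULL.
    fields += [None] * (num_columns - len(fields))
    # The \N marker also reads back as NULL, whatever the column type.
    return [None if f is None or f == NULL_TOKEN else f for f in fields[:num_columns]]


# Partition 2019-02-13: all six columns exist at write time, so NULLs are stored as \N.
new_line = encode_row([51, 52, None, "54", "55", None])

# Partition 2018-09-01 (assumed): written when only c1..c3 existed, so no \N appears.
old_line = encode_row([11, 12, 13])

print(repr(new_line))
print(decode_row(new_line, 6))
print(repr(old_line))
print(decode_row(old_line, 6))
```

Both lines decode to six columns with NULL in the right places, but only the newer line contains '\N' on disk, which matches what the two hadoop fs -text commands are meant to compare.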
ref:
https://blog.csdn.net/weixin_38750084/article/details/82873171
https://blog.csdn.net/lvhuiyin/article/details/77894289
https://blog.csdn.net/SunnyYoona/article/details/78276551