Loading Data into a Hive Table Without INSERT

Latest recommended article published on 2023-05-29 08:35:07

黑星bm

Category: hive. Tags: hive, hadoop, hdfs

Article link: https://blog.csdn.net/weixin_45425054/article/details/112306352

Create a test table named test in Hive:

create table test(
name string,
friends array<string>,
children map<string, int>,
address struct<street:string,city:string,email:int>
)
row format delimited fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';

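Explanation of the clauses:
row format delimited fields terminated by ',' -- column (field) delimiter
collection items terminated by '_' -- delimiter between the elements of a MAP, STRUCT, or ARRAY value
map keys terminated by ':' -- delimiter between the key and the value inside a MAP
lines terminated by '\n'; -- row delimiter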

Note: at this point the table test exists in Hive (the table itself is stored on HDFS).
We will put data into this table without using INSERT.
The data to upload is:
songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing_10010
yangyang,caicai_susu,xiao yang:18_xiaoxiao yang:19,chao yang_beijing_10011
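There are two ways to do this.
Method 1:
Load the data from within Hive. The data must match the format of the test table (here it is stored in testdata.txt in the datas directory).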
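The original screenshot for Method 1 is not available, so the following is only a minimal sketch of what such a load might look like. The absolute path /opt/module/datas/testdata.txt is an assumption (the post only names testdata.txt in a datas directory), as is the use of the LOCAL keyword for a file on the local filesystem.

```sql
-- Assumed local path; adjust it to wherever testdata.txt actually lives.
load data local inpath '/opt/module/datas/testdata.txt' into table test;

-- Spot-check that the delimiters split the row as intended:
-- second friend, one child's age, and the city from the struct.
select friends[1], children['xiao song'], address.city
from test
where name = 'songsong';
```

For the sample data above, the query should return lili, 18, and beijing for the songsong row, since array indexes start at 0.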
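Method 2:
Use Hadoop to upload the file directly into the table's directory on HDFS.
Note: this method bypasses the metastore, so the newly added file does not show up in the metadata.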
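As a rough sketch of Method 2, the upload can be issued from the Hive CLI with its dfs command (running hadoop fs -put from a shell is equivalent). Both paths below are assumptions: /user/hive/warehouse is only the default warehouse location, so check the Location field reported for your table first.

```sql
-- Find the table's HDFS directory (see the Location field in the output).
describe formatted test;

-- Copy the local file into that directory. Both paths are assumed.
dfs -put /opt/module/datas/testdata.txt /user/hive/warehouse/test;

-- Hive reads whatever files are in the table directory at query time,
-- so a plain select sees the new rows even though the metastore was not updated.
select name from test;
```

Because the metastore is not updated, answers that Hive derives from stored table statistics (for example a count(*) served from statistics) may not reflect the new file.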

At this point, in HDFS:

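Note:
If the uploaded file does not match the data format defined by the table, the affected fields are filled with NULL automatically.
For example: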
The data uploaded here is

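Important:

The load command does not validate the data against the table schema. When data is loaded into a Hive table with load data, field types are not checked; a value whose type does not match its column is returned as NULL. Extra fields are discarded, and missing fields are filled with NULL.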
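The behaviour described above can be tried with a small throwaway table. Everything in this sketch is hypothetical: the table load_check, the file bad.txt, and its two lines are made up for illustration and do not come from the original post.

```sql
-- Hypothetical local file /opt/module/datas/bad.txt containing two lines:
--   alice,abc,extra
--   bob
create table load_check(name string, age int)
row format delimited fields terminated by ',';

-- The load succeeds even though the file does not fit the schema.
load data local inpath '/opt/module/datas/bad.txt' into table load_check;

-- Row 1: 'abc' is not an int, so age is NULL; the third field is ignored.
-- Row 2: the age field is missing, so age is NULL.
select * from load_check;
```

The file is stored unchanged; the NULL values only appear when the rows are read and parsed against the table definition.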
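If the file is already in HDFS, it is moved into the filesystem namespace controlled by Hive; that is, the data is loaded into the table from the HDFS file or directory.
Note that loading from HDFS moves the file or directory rather than copying it, which is why the operation is almost instantaneous.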
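A minimal sketch of the move semantics, run from the Hive CLI. The staging directory /tmp/staging and the local source path are assumptions chosen for illustration.

```sql
-- Stage the file in HDFS first (both paths are hypothetical).
dfs -mkdir -p /tmp/staging;
dfs -put /opt/module/datas/testdata.txt /tmp/staging;

-- No LOCAL keyword: the source is an HDFS path, so the file is moved, not copied.
load data inpath '/tmp/staging/testdata.txt' into table test;

-- The file should no longer be listed in the staging directory.
dfs -ls /tmp/staging;
```

If the source file must be kept, load from a copy of it instead.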