Hive: A Hands-On Example of Front-End/Back-End Data Transfer

Latest recommended article published on 2024-11-12 20:45:37

程序员无羡

Article tags: hive, hadoop, data warehouse

Article link: https://blog.csdn.net/weixin_45427648/article/details/131840027

A brief look at front-end/back-end data transfer

Data structure mapping

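(1) Suppose a table contains the row below. We use JSON to represent its data structure; this is the shape in which the row is accessed in Hive: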

{
    "name": "songsong",
    "friends": ["bingbing", "lili"],       // ARRAY (list)
    "children": {                          // MAP (key-value pairs)
        "xiao song": 18,
        "xiaoxiao song": 19
    },
    "address": {                           // STRUCT
        "street": "hui long guan",
        "city": "beijing"
    }
}
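To make the mapping concrete, here is a minimal Python sketch of how a back end might hold this row and serialize it for a front end. It is only an illustration: the variable names `row`, `payload`, and `decoded` are assumptions, and the children's ages follow the data file personInfo.txt used later in this article.

```python
import json

# One table row modelled with plain Python types:
#   ARRAY  -> list
#   MAP    -> dict with arbitrary keys
#   STRUCT -> dict with a fixed set of field names
row = {
    "name": "songsong",
    "friends": ["bingbing", "lili"],
    "children": {"xiao song": 18, "xiaoxiao song": 19},
    "address": {"street": "hui long guan", "city": "beijing"},
}

# Serialize the row for transfer, then parse it back as a receiver would.
payload = json.dumps(row)
decoded = json.loads(payload)
```

After the round trip, `decoded` has the same nested structure as `row`, so the receiver can read `decoded["friends"][1]` or `decoded["address"]["city"]` directly.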

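(2) Based on this data structure, we create the corresponding table in Hive and import the data.
First, create a local test file named personInfo.txt in the directory /opt/module/hive/datas: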
[atguigu@hadoop102 datas]$ vim personInfo.txt
songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing
yangyang,caicai_susu,xiao yang:18_xiaoxiao yang:19,chao yang_beijing
Note: the separator between elements inside a MAP, a STRUCT, and an ARRAY can be one and the same character; here it is "_".

Test case

(1) Create the test table personInfo in Hive

hive (default)> create table personInfo (
name string,
friends array<string>,
children map<string, int>,
address struct<street:string, city:string>
)
row format delimited
fields terminated by ','
collection items terminated by '_'
map keys terminated by ':'
lines terminated by '\n';

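row format delimited: declares the delimiters used in each row of the data file
fields terminated by ',': fields (columns) are separated by ','
collection items terminated by '_': elements of collection types are separated by '_'
map keys terminated by ':': the key and value of each MAP entry are separated by ':'
lines terminated by '\n': rows are separated by '\n'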
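The effect of these delimiters can be sketched in a few lines of Python. This is not how Hive parses the file internally; it is a hedged illustration of the splitting rules, and the function `parse_line` and its parameter names are hypothetical.

```python
def parse_line(line, field_sep=",", item_sep="_", key_sep=":"):
    """Split one text line according to the table's three delimiters."""
    # fields terminated by ','
    name, friends, children, address = line.rstrip("\n").split(field_sep)
    # collection items terminated by '_' (the STRUCT fields use it as well)
    street, city = address.split(item_sep)
    return {
        "name": name,
        "friends": friends.split(item_sep),
        # map keys terminated by ':'
        "children": {
            key: int(value)
            for key, value in (kv.split(key_sep) for kv in children.split(item_sep))
        },
        "address": {"street": street, "city": city},
    }


row = parse_line(
    "songsong,bingbing_lili,xiao song:18_xiaoxiao song:19,hui long guan_beijing"
)
print(row["friends"][1], row["children"]["xiao song"], row["address"]["city"])
# prints: lili 18 beijing
```

Because one character ("_") separates the items of all three collection types, a single `item_sep` parameter is enough in this sketch.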

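(2) Upload the data to the table's directory on HDFS. Hive stores table names in lowercase, so under the default warehouse location the table directory is personinfo:
[atguigu@hadoop102 ~]$ hadoop fs -put /opt/module/hive/datas/personInfo.txt /user/hive/warehouse/personinfo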

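(3) Access the data in the three collection columns. The query below shows, in order, the access syntax for ARRAY (by index, starting at 0), MAP (by key), and STRUCT (by field name):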

hive (default)>
select
friends[1],
children['xiao song'],
address.city
from personInfo
where name="songsong";
Result:
_c0     _c1     city
lili    18      beijing