json_tuple函数的利用

最新推荐文章于 2024-08-02 21:00:00 发布

weixin_52116589

最新推荐文章于 2024-08-02 21:00:00 发布

阅读量5.8k

点赞数 2

分类专栏： Hive 文章标签： hive json

本文链接：https://blog.csdn.net/weixin_52116589/article/details/114946998

版权

本文介绍了在Hive中如何使用json_tuple函数处理json格式的电影评分数据。通过创建表、加载数据，展示了一系列练习题，包括统计每部电影的平均得分、电影数量、评分次数以及用户平均评分，最后讲解了如何查询每个用户评分最高的20部电影。

摘要由CSDN通过智能技术生成

首先在Hive 中查看json_tuple 函数的详细信息

desc function extended json_tuple;

出现解释：

json_tuple(jsonStr, p1, p2, …, pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string. |

jsonStr,就是我们自己的json数据文件名称，p1,p2,p3…等是json 数据的参数

例题

： json格式的电影评分数据##

首先解析json
create table t_json(json string);
加载数据到表中
select json_tuple(json,‘movie’,‘rate’,‘timeStamp’,‘uid’) as
(movie,rate,ts,uid) from tb_json limit 20;

查看建表数据

|+--------+-------+------------+------+
| movie  | rate  |     ts     | uid  |
+--------+-------+------------+------+
| 1193   | 5     | 978300760  | 1    |
| 661    | 3     | 978302109  | 1    |
| 914    | 3     | 978301968  | 1    |
| 3408   | 4     | 978300275  | 1    |
| 2355   | 5     | 978824291  | 1    |
| 1197   | 3     | 978302268  | 1    |
| 1287   | 5     | 978302039  | 1    |
| 2804   | 5     | 978300719  | 1    |
| 594    | 4     | 978302268  | 1    |
| 919    | 4     | 978301368  | 1    |
+--------+-------+------------+------+|  |

将表数据创建到新表

create table tb_movie_rate as select json_tuple
(json,‘movie’,‘rate’,‘timeStamp’,‘uid’) as (movie,rate,st,uid) from
tb_json;

以上是将建立新表到后面练习可以用
接下来就是练习题：
1、统计每部电影的平均得分

create table tb_avg_rate as select movie, avg(rate) from
tb_movie_rate group by movie;

2、统计数据中一共有多少部电影

create table tb_cout_movie as select count(distinct movie) from
tb_movie_rate;

+--------------------+
| tb_cout_movie._c0  |
+--------------------+
| 3706               |
+--------------------+

3、统计每部电影的被评分次数（热门度）

create table tb_sum_movie as select movie, count(distinct uid) from
tb_movie_rate group by movie;

+---------------------+-------------------+
| tb_sum_movie.movie  | tb_sum_movie._c1  |
+---------------------+-------------------+
| 1                   | 2077              |
| 10                  | 888               |
| 100                 | 128               |
| 1000                | 20                |
| 1002                | 8                 |
| 1003                | 121               |
| 1004                | 101               |
| 1005                | 142               |