Hive SQL: storing a query result in a variable for later use (not fully solved)

无用理想家

Category: Data Analysis. Tags: hive sql

Article link: https://blog.csdn.net/sinat_41663922/article/details/116982456

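Problem scenario: some results, such as a total count, need to be reused again and again. Joining against them every time felt inelegant, so I wondered whether the result could be written into a variable and then simply referenced through that variable.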

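After a long search on Baidu, every result looked like the example below: set a variable first, then use it in a condition. That is more or less the opposite of what I wanted.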

set cur_time = '2012-01-01';
select a
from table_name
where date_time = ${hiveconf:cur_time};

Out of options, I switched languages for the search. Stack Overflow to the rescue: I found the answer I was looking for in a downvoted answer.

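The answer:
Assign the query to a variable the same way you would initialize a variable in any other code, then check the result with ${hiveconf:xxx} (or ${hivevar:xxx} if you set it as a user variable).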

set num = select count(1) from table_name where visitTime = current_date;
-- usage
${hiveconf:num};

Multiple result fields can be stored the same way:

set total = select (case when user_type = '1' then 'Apple'
else 'Orange' end) as fruit, count(1) as cnt
from store_list
group by (case when user_type = '1' then 'Apple' else 'Orange' end);

${hiveconf:total};

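The problem is that a value assigned this way, whether it is a single field or several, appears to be stored in the variable as something like a dataframe, and I do not know how to pull individual values out of it. It feels like the only option is to rewrite this in Python? (Waiting here for an expert to answer.) Note that Hive's set does not run the query: it stores the query text, and ${hiveconf:total} is replaced by that text, so the query is executed again at each reference rather than read back from a stored result. That would explain the table-like output.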
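One way to check what the variable actually holds, and to reuse it inside a larger query, is sketched below. This is only an illustration under assumptions: num_query is a hypothetical variable name, table_name, a and visitTime are the placeholder names used above, and the count is given an alias (cnt) so the outer query can refer to it. It does not cache the result; it only avoids retyping the subquery.

```sql
-- Assumption: num_query is a hypothetical name; table_name, a and visitTime
-- are the placeholders from the examples above.
-- Store the query text. The alias cnt lets an outer query name the column.
set num_query = select count(1) as cnt from table_name where visitTime = current_date;

-- Print the stored value. This shows the query text, not a number.
set num_query;

-- Reuse the stored text as a derived table. After substitution Hive sees an
-- ordinary subquery, so the count is recomputed as part of this statement.
select t.a, c.cnt
from table_name t
cross join (${hiveconf:num_query}) c;
```

Because the substitution is purely textual, the parentheses around ${hiveconf:num_query} are what turn the stored text into a subquery. If the cluster runs Hive in strict mode, the cross join may be rejected as a Cartesian product.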

Addendum: hiveconf vs. hivevar vs. system
(Pasting the explanation from the linked answer.)

Most of the answers here have suggested to either use hiveconf or hivevar namespace to store the variable. And all those answers are right. However, there is one more namespace.
There are total three namespaces available for holding variables.

hiveconf - hive started with this, all the hive configuration is stored as part of this conf. Initially, variable substitution was not part of hive and when it got introduced, all the user-defined variables were stored as part of this as well. Which is definitely not a good idea. So two more namespaces were created.
hivevar: To store user variables
system: To store system variables.

And so if you are storing a variable as part of a query (i.e. date or product_number) you should use hivevar namespace and not hiveconf namespace.
And this is how it works.
hiveconf is still the default namespace, so if you don’t provide any namespace it will store your variable in hiveconf namespace.
However, when it comes to referring a variable, it’s not true. By default it refers to hivevar namespace.
If you do not provide namespace as mentioned below, variable var will be stored in hiveconf namespace.
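The namespace behaviour described in the quote can be tried with a short script. This is a minimal sketch rather than the snippet from the original answer: the variable name var and both values are made up for illustration, and a select without a from clause assumes a reasonably recent Hive version.

```sql
-- Assumption: the name var and both values are illustrative only.
-- No namespace on set: the value goes into hiveconf.
set var = stored_in_hiveconf;
-- Explicit hivevar namespace: a separate variable with the same name.
set hivevar:var = stored_in_hivevar;

-- Per the quoted explanation, a reference without a namespace should
-- resolve against hivevar, not hiveconf.
select '${hiveconf:var}' as from_hiveconf,
       '${hivevar:var}'  as from_hivevar,
       '${var}'          as no_namespace;
```

Writing the namespace explicitly on both the set and the reference avoids depending on these two different defaults.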
