Passing Parameters to HiveSQL Scripts and Using Them

江畔独步

Last modified 2022-03-25 16:23:49


First published 2021-10-26 11:19:58

Article link: https://blog.csdn.net/liuwei0376/article/details/120967185


1. Basic Syntax

When running the hive command, there are two ways to pass parameters in dynamically:

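Option	Description
`--hivevar`	Passes parameters; intended specifically for user-defined variables
`--hiveconf`	(1) Passes parameters; (2) overrides Hive configuration properties set in hive-site.xml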

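2. Test and Verification

Test goal

Query a different province's population database depending on the province parameter passed in at run time.

Prepare the test script test.hql: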

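-- Province prefix of the database name, supplied by an external parameter
use ${<region prefix passed in>}db_population;

-- No database qualifier; the database is determined by the parameter above
select count(1) from basic_population_detail;

-- Reference: population of Shanghai
select count(1) from shanghai_db_population.basic_population_detail;
-- Reference: population of Henan province
select count(1) from henan_db_population.basic_population_detail;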

2.1 Using `--hivevar`: user-defined variables

Note:

The -d option does the same thing as --hivevar and can be thought of as its short form.

CASE 1). Pass the parameter shanghai_ to the hql script:

hive --hivevar v_location=shanghai_ -S -f test.hql
hive --hivevar 'v_location=shanghai_' -S -f test.hql
hive --hivevar "v_location=shanghai_" -S -f test.hql

The -S option enables silent mode.

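-- Province prefix of the database name, supplied by an external parameter; here the Shanghai database is queried:
use ${hivevar:v_location}db_population;

-- No database qualifier; the database is determined by the parameter above
select count(1) from basic_population_detail; -- no database name given; in this context it returns the Shanghai population

-- Reference: population of Shanghai
select count(1) from shanghai_db_population.basic_population_detail;
-- Reference: population of Henan province
select count(1) from henan_db_population.basic_population_detail;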

CASE 2). Pass the parameter henan_ to the hql script:

hive --hivevar v_location=henan_ -S -f test.hql

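-- Province prefix of the database name, supplied by an external parameter; here the Henan database is queried:
use ${hivevar:v_location}db_population;

-- No database qualifier; the database is determined by the parameter above
select count(1) from basic_population_detail; -- no database name given; in this context it returns the Henan population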
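To check what a script will look like once the variable is filled in, the substitution can be previewed outside Hive. The following is a minimal sketch that uses sed as a stand-in; it is not Hive's own substitution mechanism, and the helper name preview_hql is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): replace ${hivevar:v_location} in the
# text read from stdin with the value given as $1, so the result can be inspected.
# Assumption: the value contains no "/" or "&", which sed would treat specially.
preview_hql() {
    sed 's/[$]{hivevar:v_location}/'"$1"'/g'
}

# Preview the first statement of test.hql for the Henan case.
printf '%s\n' 'use ${hivevar:v_location}db_population;' | preview_hql henan_
# prints: use henan_db_population;
```

With a real file, the same helper can be fed the script on stdin, for example preview_hql shanghai_ < test.hql.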

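CASE 3). Plain variable substitution: general syntax

hivevar defines variables for Hive's run-time variable substitution and is used together with "${}". It is loosely comparable to a placeholder in a Java "PreparedStatement", although here the substitution is purely textual and happens before the statement is parsed. Example:

Define a variable and start the Hive CLI
hive --hivevar my="202105" --database default -e 'select * from a1 where concat(year, month) = ${my} limit 10';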

CASE 4). Plain variable substitution: alternative syntax

define

define serves exactly the same purpose as hivevar and also has a short form, "-d". Example:

Define a variable
hive --hiveconf "mapred.job.queue.name=root.default" -d my="202105" --database default -e \
'select * from mydb where concat(year, month) = ${my} limit 10';
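The -e examples above wrap the query in single quotes, and that choice matters: inside double quotes the shell itself would expand ${my} before Hive ever sees it. A small sketch of the difference, assuming no shell variable named my is set (the variable names single and double are only for illustration):

```shell
#!/bin/sh
# Make sure no shell variable "my" exists, to mimic a fresh shell.
unset my

# Single quotes: the shell leaves ${my} untouched, so Hive can substitute it.
single='select * from a1 where concat(year, month) = ${my} limit 10'

# Double quotes: the shell expands ${my} (to an empty string here),
# so the placeholder is gone before the query reaches Hive.
double="select * from a1 where concat(year, month) = ${my} limit 10"

echo "$single"
echo "$double"
```

The first line printed still contains the ${my} placeholder; the second does not.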

2.2 Using `--hiveconf`: user-defined variables

CASE 1). Pass the parameter shanghai_ to the hql script:

hive --hiveconf v_location=shanghai_ -f test.hql
hive --hiveconf 'v_location=shanghai_' -f test.hql
hive --hiveconf "v_location=shanghai_" -f test.hql

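-- Province prefix of the database name, supplied by an external parameter; here the Shanghai database is queried:
use ${hiveconf:v_location}db_population;

-- No database qualifier; the database is determined by the parameter above
select count(1) from basic_population_detail; -- no database name given; in this context it returns the Shanghai population

-- Reference: population of Shanghai
select count(1) from shanghai_db_population.basic_population_detail;
-- Reference: population of Henan province
select count(1) from henan_db_population.basic_population_detail;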

CASE 2). Pass the parameter henan_ to the hql script:

hive --hiveconf v_location=henan_ -f test.hql

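-- Province prefix of the database name, supplied by an external parameter; here the Henan database is queried:
use ${hiveconf:v_location}db_population;

-- No database qualifier; the database is determined by the parameter above
select count(1) from basic_population_detail; -- no database name given; in this context it returns the Henan population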
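Since only the parameter value changes between the cases, the invocations can be generated in a loop. The sketch below is a dry run that only prints the commands; the function name run_for_locations is hypothetical, and actually running the commands (by removing the echo) assumes a working Hive installation and the test.hql script from this article.

```shell
#!/bin/sh
# Dry run: print one hive invocation per province prefix instead of executing it.
# Remove "echo" to run the commands for real (requires Hive and test.hql).
run_for_locations() {
    for loc in "$@"; do
        echo hive --hiveconf "v_location=${loc}" -S -f test.hql
    done
}

run_for_locations shanghai_ henan_
# prints:
# hive --hiveconf v_location=shanghai_ -S -f test.hql
# hive --hiveconf v_location=henan_ -S -f test.hql
```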

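2.3 Using `--hiveconf` to override Hive configuration properties set in hive-site.xml

In this scenario:

hiveconf defines properties (configuration parameters) of the Hive execution context and can override the values in hive-site.xml (hive-default.xml), such as the user's working directory, the log level, and the execution queue. Commonly used configuration properties are: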

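Property	Description
hive.metastore.warehouse.dir	Warehouse directory specified at startup; different users can use different directories
hive.cli.print.current.db	Show the current database
hive.root.logger	Log output settings
hive.cli.print.header	Show column names
mapred.job.queue.name	Name of the execution queue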

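When working inside Hive, properties can be set with the "set" command (session settings take precedence over the configuration file), as follows: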

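Scenario A).
(1) Open the Hive CLI terminal;
(2) then set the property:
set hive.root.logger=ERROR;

Scenario B).
The same property can be passed on the command line with "hive --hiveconf":
hive --hiveconf "hive.root.logger=ERROR"
Note: according to the Hive documentation, logging properties are determined at initialization, so for hive.root.logger in particular the --hiveconf form is the one that changes logging; for most other properties the two forms are equivalent.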

Scenario C). Pass values dynamically into the session of the current hql script
hive --hiveconf "hive.root.logger=ERROR" --hiveconf v_location=henan_ -S -f test.hql

Scenario D). Pass values dynamically from the shell command line into the session of the current hql script

[hdfs@hadoop test]$ hive --hivevar year=$(date +%Y) --hivevar month=`expr $(date +%m) + 0` -f /data/program/hive_partition_test.hql

Where the variables are referenced in hive_partition_test.hql:


-- ///
-- When used in Azkaban, pass the parameters as follows:
--     hive --hivevar year=$(date +%Y) --hivevar month=`expr $(date +%m) + 0` -f /data/program/hive_partition_test.hql
-- The expr expression converts a month with a leading 0 into a plain number
-- ///
ALTER TABLE default.hive_partition_table_test DROP IF EXISTS PARTITION(year=${hivevar:year}, month=${hivevar:month});
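The leading-zero handling can be tried on its own, without Hive. The sketch below wraps the same expr call in a function (the name month_number is hypothetical). expr reads its operand as a decimal integer, whereas shell arithmetic such as $((08)) treats a leading 0 as octal and rejects 08 and 09, which is one reason to prefer expr here.

```shell
#!/bin/sh
# Strip the leading zero from a two-digit month, mirroring the expr call
# used in the hive command line above.
month_number() {
    expr "$1" + 0
}

month_number 08    # prints 8
month_number 12    # prints 12

# With the current date, as in the original command:
month_number "$(date +%m)"
```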
