hive之hplsql学习

最近需要搞这方面的工作,首先百度了一下,发现重复率太高,于是自己去官网查看了,附上官网地址

http://www.hplsql.org/doc

这个人应该最开始发表文章的,写的不错

http://lxw1234.com/archives/category/hive

这段话翻译一下 注意时间是2016年2月就有了

1、hive2.0.0 实现了hpl/sql功能,实际上就是类似于mysql、oracle的存储过程

2、hpl/sql很牛b

一、commandline

hplsql -e 'query' | -f file 
      [-main procname]
      [-d | --define | -hiveconf | -hivevar var=value ...] 
      [-version | --version]
      [-trace | --trace]
      [-H | --help]

Parameters:

ParameterDescriptionHPL/SQL Version
-e 'query'SQL statements to execute0.1
-f fileExecute SQL statements from file0.1
-main procnameEntry point (procedure or function name)0.3.7
-d var=valueVariable definition0.1
--define var=value0.1
-hiveconf var=value0.1
-hivevar var=value0.1
-version--versionPrint HPL/SQL version0.1
-trace--tracePrint trace information0.1
-H--helpPrint help information and exit0.1

进入hive的bin安装目录就可以看到hplsql这个脚本,版本是cdh6.3.0,HPL/SQL 0.3.31

Notes:

  • -e and -f cannot be specified together

  • if -main option is not specified, HPL/SQL starts executing all statements from the beginning of the script

  • You can use single 'query' and double “query” quotes with -e option

  • Currently -d, --define, -hivevar and -hiveconf are equivalent and allow you to define input variables.

翻译下

1、-e 和-f 不能同时使用

2、如果-main选项没有指定,hplsql也就是你写的存储过程将执行所有的内容

3、-e的时候,你可以使用单引号和双引号

4、-d, --define, -hivevar 和-hiveconf都可以用来传递参数

Example 1:

Executing HPL/SQL statements from a script:

hplsql -f script.sql 

Example 2:

Executing HPL/SQL statements from command line:

hplsql -e "NVL(MAX_PARTITION_DATE(db.sales, local_dt, code='A'), CURRENT_DATE)" 

这句话暂时还不太理解,这个函数目前没看到,而且里面的参数也不明白,这个官方示例还不太懂

Example 3:

Using variables:

hplsql -e "PRINT a || ', ' || b" -d a=Hello -d b=world 

Result:

Hello, world

就目前来说./hplsql 和./hive的功能基本一样

——————————————————————————————————————————————————————

下面主要学习存储过程基础、写法和使用

Operators

 

操作符

ExpressionResultResult Type
3 + 14Integer
'Ab' + 'c''Abc'String
DATE '2014-12-31' + 1DATE '2015-01-01'DATE
3 = 3TrueBoolean
3 = 1FalseBoolean
'CA' = NULLNULLBoolean
3 == 3TrueBoolean
3 <> 3FalseBoolean
3 != 3FalseBoolean
3 > 1TrueBoolean
3 < 1FalseBoolean
3 >= 1TrueBoolean
3 <= 1FalseBoolean
'a' || 'b' || 'c''abc'String
'a' || 1 || 'c''a1c'String
'a' || NULL || 'c''ac'String
NULL || NULLNULLString

 许多示例使用

./hplsql -e "PRINT a || ', ' || b" --hivevar a=Hello --hivevar b=lxw1234.com

也没人说使用原因,hplsql中 a b是字符串,然后用||连接符连接 ', '注意这个,是字符串,所以一定要引起来

使用位置:

比如我现在用hplsql写个脚本,每天把a表的数据插入到b表

vim test.sh

hive -e 'insert into tableA as select * from  impala.'||tableB||' limit 10';

—————————————————————————————————————————————————————————

Attributes and Built-in Variables

HPL/SQL attributes and built-in variables:

ACTIVITY_COUNTNumber of rows affected by last SQL statement //是否有数据,有为1没有为0
ERRORCODEReturn code of the last SQL statement //上句sql写的对不对 0正确,正数warning 负数error
HOSTCODEReturn code of the last OS command  //
SQLCODEReturn code of the last SQL statement  //0正确,正数warning 负数error
SQLSTATEReturn status of the last SQL statement //算了感觉都差不多

 

这个的具体作用主要就是判断上一句sql 的查询插入删除等情况是否正常运行,从而来输出日志,同时判断下一步走向

————————————————————————————————————————————————————————

 

Statements

HPL/SQL statements:

ALLOCATE CURSORAllocate cursor for procedure result set
ASSOCIATE RESULT SET LOCATORDefine locators for procedure result sets
BREAKExit a loop
CALLExecute a stored procedure
CLOSEClose a cursor
CMPCompare data in tables
COPYCopy data between tables and files
COPY FROM FTPCopy FTP files to Hadoop compatible file system
COPY FROM LOCALCopy local files to Hadoop compatible file system
CREATE DATABASECreate a database
CREATE FUNCTIONCreate a user-defined SQL function
CREATE LOCAL TEMPORARY TABLECreate a session-level temporary table
CREATE PACKAGECreate a program package
CREATE PROCEDURECreate a user-defined SQL procedure
CREATE TABLECreate a table
CREATE VOLATILE TABLECreate a session-level temporary table
DECLAREDeclare a variable
DECLARE CONDITIONDeclare a condition
DECLARE CURSORDeclare a cursor
DECLARE HANDLERDeclare a condition handler
DECLARE TEMPORARY TABLEDeclare a temporary table
DESCRIBEDescribe a database object
DROP DATABASEDrop a database
DROP TABLEDrop a table
EXECEXECUTEEXECUTE IMMEDIATEExecute a dynamic SQL statement or procedure
EXIT WHENExit a loop
FETCHFetch the next row from a cursor
FOR cursorFOR statement (Cursor loop)
FOR rangeFOR statement (Integer range)
GET DIAGNOSTICSGet execution information
HOSTExecute an OS command or run an external process
IFIF statement
INCLUDEInclude statements from another script
INSERTINSERT statement
INSERT DIRECTORYWrite query results to a file
LEAVEExit a loop
OPENOpen a cursor
LOOPUnconditional loop
MAP OBJECTMap object name to a connection profile
NULLNo operation (no-op) statement
PRINTPrint a line
RESIGNALResignal the exception
RETURNReturn from a routine
SELECTSELECT statement
SELECT INTOAssign values from a query
SIGNALRaise a condition or exception
SUMMARYSummary stats for a table or result set
SETAssign a value to a variable
SET Session OptionSet a session option
TRUNCATETruncate a table
UPDATEUPDATE statement
USEChange the default database
VALUES INTOAssign a value to a variable
WHILEWhile loop

这个是存储过程中的语法,以前看过oracle的存储过程感觉挺像的。

CREATE PROCEDURE set_message(IN arg STRING)
BEGIN
 SET result = 'Hello, lxw的大数据田地!';
 print arg;
 PRINT result || ', ' || arg;
END;
 
PRINT 'Hello, lxw1234.com! at beginning .. ';
CALL set_message(' by CALL ..');

借用文章开头那位大神的例子说明下(大神觉得简单,基本没介绍)

首先CREATE PROCEDURE 声明了一个存储过程,其实类似与java里的方法一样,调用的方式是call procedure,这个就相当于main方法。(中文可能有乱码,无所谓)

第一种 hplsql -f xxx.sql

          该方法会执行print 和call set_message(' by CALL ..');

          输出Hello, lxw1234.com! at beginning ..

                 by CALL ..

                 Hello, lxw的大数据田地! , by CALL ..

第二种 hplsql -f xxx.sql -main set_message

          直接调用存储过程 call set_message(null)

          输出 null

                 Hello, lxw的大数据田地! ,

 这里调用的时候是没有参数的,这个时候我就在想肯定不行吧,一般作为脚本肯定需要时间传递

那么肯定有解决办法:内置的时间参数类似与${YYYY-mm-dd} 或者可以传入时间参数类似 call set_message(2020-04-13)

1、使用-d参数传递

./hplsql -f cc_test/hplsql_test -d yyyy=2020

脚本里的print yyyy 就会在屏幕上输出 2020

2、内置时间参数暂时还没找到。。。。。。。。。。。。。。

一些简单的使用demo

hplsql的使用demo

查询数据库里的内容

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat  select.sql 
select * from impala.impala_test_middle limit 3 --hive的库名和表名

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f select.sql

Starting query
Query executed successfully (893 ms)
col1_2 col2_2 col3_2 col4_2 col5_2 col6_2 col7_2 col8_2 col9_2 part10
col1_2 col2_2 col3_2 col4_2 col5_2 col6_2 col7_2 col8_2 col9_2 part1
col1_4 col2_4 col3_4 col4_4 col5_4 col6_4 col7_4 col8_4 col9_4 part10

使用简单的函数

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat function_hello.sql 
CREATE FUNCTION hello2(text STRING)
RETURNS STRING
BEGIN
RETURN 'Hello, ' || text || '!';
END;
-- Call the function
PRINT hello2('world');

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f function_hello.sql

Hello, world!

创建存储过程

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat proc_print.sql 
CREATE PROCEDURE set_message(IN arg STRING)
BEGIN
SET result = 'Hello chenchi!';
print arg;
PRINT result || ', ' || arg;
END;

PRINT 'Hello, chenchi! at beginning .. ';
CALL set_message(' by CALL ..');

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f proc_print.sql

Hello, chenchi! at beginning .. 
by CALL ..
Hello chenchi!, by CALL ..

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f proc_print.sql -main set_message

null
Hello chenchi!,

使用include引用其他文件里的存储过程

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat set_message.sql 
CREATE PROCEDURE set_message(IN name STRING, OUT result STRING)
BEGIN
print 'name='||name
SET result = 'Hello, ' || name || '!';
END;

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat include.sql 
INCLUDE set_message.sql
DECLARE str STRING;
CALL set_message('world', str);
PRINT str;

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f include.sql

name=world
Hello, world!

使用package

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat package_users.sql
create or replace package users as
session_count int := 0;
function get_count() return int; 
procedure add(name varchar(100));
end;

create or replace package body users as
function get_count() return int
is
begin
return session_count;
end; 
procedure add(name varchar(100))
is 
begin
-- ...
session_count = session_count + 1;end;
end;
users.add('John');
users.add('Sarah');
users.add('Paul');
print 'Number of users: ' || users.get_count();

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f package_users.sql

Number of users: 3

使用游标

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ cat cursor.sql 
CREATE PROCEDURE spOpenIssues 
DYNAMIC RESULT SETS 1
BEGIN
DECLARE cur CURSOR WITH RETURN FOR
SELECT col1,col2 FROM impala.impala_test_middle;
OPEN cur;
END;

DECLARE id INT;
DECLARE name VARCHAR(30);

CALL spOpenIssues;
ALLOCATE c1 CURSOR FOR PROCEDURE spOpenIssues;

FETCH c1 INTO id, name;
WHILE (SQLCODE = 0)
DO
PRINT id || ' - ' || name;
FETCH c1 INTO id, name;
END WHILE;
CLOSE c1;

[devuser@node04 ~/hplsql/hplsql-0.3.31/cc_test_hplsql]$ ../hplsql -f cursor.sql

- col2_2
- col2_4
- col2_4

更多demo自己看http://www.hplsql.org/doc

 

       

 

  • 1
    点赞
  • 11
    收藏
    觉得还不错? 一键收藏
  • 3
    评论
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值