Hive String Processing

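This article covers the common string operations in Hive: extracting substrings, concatenating, splitting, and replacing. It explains concat() and concat_ws() in detail and shows how to replace strings. It also looks at greedy versus non-greedy regular-expression matching, and at ways to parse JSON strings with get_json_object(), json_tuple(), and str_to_map() when dealing with complex data structures.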

Substring Extraction

| Return Type | Name(Signature) | Description |
|---|---|---|
| string | substr(string A, int start) | Returns the substring or slice of the byte array of A starting from position start to the end of A. For example, substr('foobar', 4) results in 'bar'. |
| string | substr(string A, int start, int len) | Returns the substring or slice of the byte array of A starting from position start with length len. For example, substr('foobar', 4, 1) results in 'b'. |
| string | substring_index(string A, string delim, int count) | Returns the substring of A before count occurrences of the delimiter delim (as of Hive 1.3.0). If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The search for delim is case-sensitive. Example: substring_index('www.apache.org', '.', 2) = 'www.apache'. |
| int | instr(string str, string substr) | Returns the position of the first occurrence of substr in str. Returns null if either argument is null, and 0 if substr is not found in str. Positions are 1-based, not 0-based: the first character of str has index 1. |
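The queries below are a minimal sketch of the functions in the table above, run against literals so that no table is needed. The e-mail address in the last query is a made-up value used only for illustration; the expected results in the comments follow from the rules stated in the table.

```sql
-- substr: positions are 1-based
SELECT substr('foobar', 4);                      -- bar
SELECT substr('foobar', 4, 1);                   -- b

-- substring_index: positive count keeps the left part,
-- negative count keeps the right part
SELECT substring_index('www.apache.org', '.', 2);   -- www.apache
SELECT substring_index('www.apache.org', '.', -1);  -- org

-- instr: 1-based position, 0 when not found
SELECT instr('foobar', 'bar');                   -- 4
SELECT instr('foobar', 'xyz');                   -- 0

-- Combining instr and substr: keep everything after the '@'
-- ('user@example.com' is a hypothetical sample value)
SELECT substr('user@example.com', instr('user@example.com', '@') + 1);  -- example.com
```

Note that substring_index requires Hive 1.3.0 or later, as stated in the table; on older versions the instr plus substr combination in the last query is one possible fallback.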

Concatenation

| Return Type | Name(Signature) | Description |
|---|---|---|
| string | concat(string A, string B…) | Returns the string or bytes resulting from concatenating the strings or bytes passed in as parameters, in order. For example, concat('foo', 'bar') results in 'foobar'. This function accepts any number of input strings. |
| string | concat_ws(string SEP, string A, string B…) | Like concat() above, but joins the inputs with the custom separator SEP. |
| string | concat_ws(string SEP, array<string>) | Like concat_ws() above, but takes an array of strings and joins its elements with SEP (as of Hive 0.9.0). |

1. concat()
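
A short sketch of concat() on literal values. One behavior the table does not mention: concat() returns NULL as soon as any argument is NULL, so nullable columns usually need a guard such as coalesce(). Non-string values are cast explicitly here to keep the example unambiguous.

```sql
-- Any number of arguments, joined in order
SELECT concat('foo', 'bar');                 -- foobar
SELECT concat('a', 'b', 'c', 'd');           -- abcd

-- A single NULL argument makes the whole result NULL
SELECT concat('foo', NULL);                  -- NULL

-- Guard nullable inputs with coalesce()
SELECT concat('foo', coalesce(NULL, ''));    -- foo

-- Cast non-string values explicitly before concatenating
SELECT concat('id_', cast(42 AS string));    -- id_42
```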

2. concat_ws()

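Joins multiple strings with the specified separator. Combined with GROUP BY and collect_set, it can collapse the values of a column across several rows into a single delimited string per group (commonly described as converting a column into a row).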

hive > select concat_ws('+','a','b','c');
OK
a+b+c
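
The array form of concat_ws() listed in the table can be sketched the same way. The literals below are illustrative only. Unlike concat(), concat_ws() skips NULL arguments instead of returning NULL.

```sql
-- Array overload: join the elements of an array<string>
SELECT concat_ws('+', array('a', 'b', 'c'));    -- a+b+c

-- NULL arguments are skipped rather than propagated
SELECT concat_ws('+', 'a', NULL, 'c');          -- a+c

-- split() returns array<string>, so the two compose directly
SELECT concat_ws(',', split('x y z', ' '));     -- x,y,z
```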

hive > select aa, bb, cc from jj_tmp.user_list;
OK
c	d	1
c	d	2
c	d	3
e	f	4
e	f	5
e	f	6

hive > select aa, bb, concat_ws(',' , collect_set(cast(cc as string))) from user_list group by aa, bb;
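
To try the GROUP BY plus collect_set pattern without the user_list table, the sketch below rebuilds the six sample rows inline with stack() in a CTE; the CTE name sample_rows is hypothetical. collect_set() does not guarantee element order, so sort_array() is added here as an optional extra step to make the joined string deterministic.

```sql
-- Inline copy of the sample rows shown above (assumes Hive 0.13+ for CTEs)
WITH sample_rows AS (
  SELECT stack(6,
               'c', 'd', 1,
               'c', 'd', 2,
               'c', 'd', 3,
               'e', 'f', 4,
               'e', 'f', 5,
               'e', 'f', 6) AS (aa, bb, cc)
)
SELECT aa,
       bb,
       -- cast to string because concat_ws() expects string or array<string>
       concat_ws(',', sort_array(collect_set(cast(cc AS string)))) AS cc_list
FROM sample_rows
GROUP BY aa, bb;
-- Expected groups: (c, d, 1,2,3) and (e, f, 4,5,6)
```

If duplicates should be kept rather than removed, collect_list() can be used in place of collect_set().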