Spark版本号:2.4.3,Spark SQL命令行输入
show functions;
示例
- abs:取绝对值
> select abs(-10);
+------+
| _c0 |
+------+
| 10 |
+------+
- acos:反余弦函数
> select acos(0.5);
+---------------------+
| _c0 |
+---------------------+
| 1.0471975511965979 |
+---------------------+
- add_months(start_date, num_months),增加月份
> select add_months('2016-02-29', 1);
+-------------+
| _c0 |
+-------------+
| 2016-03-31 |
+-------------+
-
aggregate
-
and
-
approx_count_distinct(expr[, relativeSD])
-
approx_percentile(col, percentage [, accuracy])
-
array(expr, …) - 返回给定值组成的数组。
> select array(1, 2 , 3, 4, 5, 6);
+----------------+
| _c0 |
+----------------+
| [1,2,3,4,5,6] |
+----------------+
- array_contains(array, value) - 如果数组包含了 value,则返回 true。
> select array_contains(array(1, 2, 3, 4, 5, 6), 1);
+------------------------------------------+
|array_contains(array(1, 2, 3, 4, 5, 6), 1)|
+------------------------------------------+
| true|
+------------------------------------------+
- array_distinct(array) - 数组去重
> select array_distinct(array(1, 1, 3, 3));
+---------------------------------+
|array_distinct(array(1, 1, 3, 3))|
+---------------------------------+
| [1, 3]|
+---------------------------------+
- array_except(array1, array2) - 数组移除
> select array_except(array(1, 2, 3) , array(1));
+--------------------------------------+
|array_except(array(1, 2, 3), array(1))|
+--------------------------------------+
| [2, 3]|
+--------------------------------------+
- array_intersect(array1, array2) - 数组求交
> select array_intersect(array(1, 2, 3) , array(1, 2));
+--------------------------------------------+
|array_intersect(array(1, 2, 3), array(1, 2))|
+--------------------------------------------+
| [1, 2]|
+--------------------------------------------+
- array_join(array, String[, String]): String:使用分隔符将数组的元素进行拼接。另外,我们还可以指定可选的字符来替换 null 值。如果我们没有指定 null 值的替换字符串,那么结果中将会把 null 值替换掉。
> select array_join(array('hello', null, 'world'), ',', 'null');
+--------------------------------------------------------------+
|array_join(array(hello, CAST(NULL AS STRING), world), ,, null)|
+--------------------------------------------------------------+
| hello,null,world|
+--------------------------------------------------------------+
第一个参数为:stirng的数组
第二个参数为:拼接字符串
第三个参数为:null替换为字符串
- array_max(array): T:返回数组中的最大值, null 元素将会被忽略。
> select array_max(array(1, 2, 10, null, 3));
+------------------------------------------------+
|array_max(array(1, 2, 10, CAST(NULL AS INT), 3))|
+------------------------------------------------+
| 10|
+------------------------------------------------+
- array_min(array): T:返回数组中的最小值, null 元素将会被忽略。
> select array_min(array(1, 2, 10, null, 3));
+------------------------------------------------+
|array_min(array(1, 2, 10, CAST(NULL AS INT), 3))|
+------------------------------------------------+
| 1|
+------------------------------------------------+
- array_position(array, T): Long:返回元素 T 在数组第一次出现的位置,从1开始算。
> select array_position(array(1, 3, 4, 1), 1);
+------------------------------------+
|array_position(array(1, 3, 4, 1), 1)|
+------------------------------------+
| 1|
+------------------------------------+
- array_remove(array, T): array:从给定数组中删除所有与给定元素相等的元素。
> select array_remove(array(1, 3, 4, 1, 1), 1);
+-------------------------------------+
|array_remove(array(1, 3, 4, 1, 1), 1)|
+-------------------------------------+
| [3, 4]|
+-------------------------------------+
- array_repeat(T, Int): array:返回包含元素计数次数的数组。
> select array_repeat('1', 2);
+------------------+
|array_repeat(1, 2)|
+------------------+
| [1, 1]|
+------------------+
- array_sort(array): array:按升序对输入数组进行排序。输入数组的元素必须是可排序的。null 元素将放置在返回数组的末尾。
> select array_sort(array(3, 4, 1, null, 0));
+------------------------------------------------+
|array_sort(array(3, 4, 1, CAST(NULL AS INT), 0))|
+------------------------------------------------+
| [0, 1, 3, 4,]|
+------------------------------------------------+
- array_union(array, array): array:返回给定两个数组元素的并集,结果已经去重了。使用如下
> select array_union(array(1, 2, 3), array(1, 3, 5));
+-------------------------------------------+
|array_union(array(1, 2, 3), array(1, 3, 5))|
+-------------------------------------------+
| [1, 2, 3, 5]|
+-------------------------------------------+
- arrays_overlap(array, array): array:如果数组1最少包含数组2中一个非空的元素,则返回 true;如果两个数组没有公共元素,而且两个数组均为空则返回 false,如果其中任何一个数组包含 null 元素则返回 null。
> select arrays_overlap(array(1, 2, 3), array(3, 4, 5));
+----------------------------------------------+
|arrays_overlap(array(1, 2, 3), array(3, 4, 5))|
+----------------------------------------------+
| true|
+----------------------------------------------+
- array_zip(array …):数组组合
> select arrays_zip(array(1, 2), array(2, 3));
+------------------------------------+
|arrays_zip(array(1, 2), array(2, 3))|
+------------------------------------+
| [[1, 2], [2, 3]]|
+------------------------------------+
- ascii(string):返回str中第一个字符的ascii值
> select ascii('China 100');
+----------------+
|ascii(China 100)|
+----------------+
| 67|
+----------------+
- asin(string):反正弦
> select ascii('0.5');
+----------+
|ascii(0.5)|
+----------+
| 48|
+----------+
- assert_true(expr):如果 expr 表达式的返回值不是 true 则抛出异常。
> select assert_true(0 < 1);
+--------------------+
|assert_true((0 < 1))|
+--------------------+
| null|
+--------------------+
- atan(double):反正切函数
> select atan(0.5);
spark.sql("select atan(0.5)").show();
+-------------------------+
|ATAN(CAST(0.5 AS DOUBLE))|
+-------------------------+
| 0.4636476090008061|
+-------------------------+
- atan2(double, double): 返回平面的正 x 轴与由坐标(expr1,expr2)点之间的弧度角度。
> select atan2(1, 1);
+-------------------------------------------+
|ATAN2(CAST(1 AS DOUBLE), CAST(1 AS DOUBLE))|
+-------------------------------------------+
| 0.7853981633974483|
+-------------------------------------------+
- avg:求平均
- base64(bin):将参数从二进制文件转换为 base64 的字符串
> select base64('hello world');
+-----------------------------------+
|base64(CAST(hello world AS BINARY))|
+-----------------------------------+
| aGVsbG8gd29ybGQ=|
+-----------------------------------+
- bigint(expr):将值 expr 转换为 bigint 数据类型
> select bigint('121');
+-------------------+
|CAST(121 AS BIGINT)|
+-------------------+
| 121|
+-------------------+
- bin(expr):返回 long 类型的参数 expr 的二进制字符串表示形式
> select bin(100);
+------------------------+
|bin(CAST(100 AS BIGINT))|
+------------------------+
| 1100100|
+------------------------+
- binary(expr): 将值 expr 转换为 binary 数据类型。
> select binary(bin(100));
+----------------------------------------+
|CAST(bin(CAST(100 AS BIGINT)) AS BINARY)|
+----------------------------------------+
| [31 31 30 30 31 3...|
+----------------------------------------+
- bit_length(expr):返回字符串数据的位长度或二进制数据的位数。
> select bit_length(binary(bin(100)));
+----------------------------------------------------+
|bit_length(CAST(bin(CAST(100 AS BIGINT)) AS BINARY))|
+----------------------------------------------------+
| 56|
+----------------------------------------------------+
> select bit_length('hello world');
+-----------------------+
|bit_length(hello world)|
+-----------------------+
| 88|
+-----------------------+
- boolean(expr):将值 expr 转换为 boolean 数据类型
> select boolean(0);
+------------------+
|CAST(0 AS BOOLEAN)|
+------------------+
| false|
+------------------+
- bround(expr, len):四舍五入,保留len长度的小数位
> select bround(3.1415926, 2);
+--------------------+
|bround(3.1415926, 2)|
+--------------------+
| 3.14|
+--------------------+
cardinality- cast(expr AS type):将 expr 转换成 type 类型的数据
> select cast('10' as integer);
+---------------+
|CAST(10 AS INT)|
+---------------+
| 10|
+---------------+
- cbrt(expr):返回expr的立方根
> select cbrt(8);
+-----------------------+
|CBRT(CAST(8 AS DOUBLE))|
+-----------------------+
| 2.0|
+-----------------------+
- ceil(expr):向上取整
> select ceil(-3.1415926);
+----------------+
|CEIL(-3.1415926)|
+----------------+
| -3|
+----------------+
- ceiling(expr):向上取整
> select ceiling(-3.1415926);
+----------------+
|CEIL(-3.1415926)|
+----------------+
| -3|
+----------------+
- char(expr):返回二进制等效于 expr 的 ASCII 字符。 如果 n 大于256,则结果等于 chr(n%256)
> select char(100);
+------------------------+
|chr(CAST(100 AS BIGINT))|
+------------------------+
| d|
+------------------------+
- char_length(expr):返回字符串数据的字符长度或二进制数据的字节数。 字符串数据的长度包括尾随空格,二进制数据的长度包括二进制零。
> select char_length('hello world');
+-------------------+
|length(hello world)|
+-------------------+
| 11|
+-------------------+
- character_length(expr):返回字符串数据的字符长度或二进制数据的字节数。 字符串数据的长度包括尾随空格,二进制数据的长度包括二进制零。
> select character_length('hello world');
+-------------------+
|length(hello world)|
+-------------------+
| 11|
+-------------------+
- chr(expr) : 返回二进制等效于 expr 的 ASCII 字符。 如果 n 大于256,则结果等于 chr(n%256)
> select chr(100);
+------------------------+
|chr(CAST(100 AS BIGINT))|
+------------------------+
| d|
+------------------------+
- coalesce(expr1, expr2, …): 返回第一个非空参数(如果存在)。 否则,返回 null。
> select coalesce(null, 1, null, 2);
+----------------------------------------------------+
|coalesce(CAST(NULL AS INT), 1, CAST(NULL AS INT), 2)|
+----------------------------------------------------+
| 1|
+----------------------------------------------------+
- collect_list(expr):返回list,不去重
> select collect_list(carstate) from gps_car;
+----------------------+
|collect_list(carstate)|
+----------------------+
| [2.0, 2.0, 2.0, 2...|
+----------------------+
- collect_set(expr):返回set,去重
> select collect_set(carstate) from gps_car;
+---------------------+
|collect_set(carstate)|
+---------------------+
| [2.0, 0.0, 1.0]|
+---------------------+
- concat(str1, str2,…,strN):返回由 str1, str2, …, strN 组成的字符串
> select concat('hello', 'world');
+--------------------+
|concat(hello, world)|
+--------------------+
| helloworld|
+--------------------+
- concat_ws(sep, [str | array(str)]+) : 返回由 sep 分隔组成的字符串连接。
> select concat_ws(' ', 'hello', 'world');
+--------------------------+
|concat_ws( , hello, world)|
+--------------------------+
| hello world|
+--------------------------+
> select concat_ws(' ', array('hello', 'world'));
+---------------------------------+
|concat_ws( , array(hello, world))|
+---------------------------------+
| hello world|
+---------------------------------+
- conv(num, from_base, to_base) : 将 num 从 from_base 进制转换为 to_base 进制。
> select conv(100, 2, 10);
+--------------------------------+
|conv(CAST(100 AS STRING), 2, 10)|
+--------------------------------+
| 4|
+--------------------------------+
corr(expr1, expr2)- cos(expr):余弦函数
> select cos(0.5);
+------------------------+
|COS(CAST(0.5 AS DOUBLE))|
+------------------------+
| 0.8775825618903728|
+------------------------+
- cosh(expr): 双曲余弦
> select cosh(0);
+-----------------------+
|COSH(CAST(0 AS DOUBLE))|
+-----------------------+
| 1.0|
+-----------------------+
- cot(expr): 余切
> select cot(1);
+----------------------+
|COT(CAST(1 AS DOUBLE))|
+----------------------+
| 0.6420926159343306|
+----------------------+
- count
- count_min_sketch
- covar_pop
- covar_samp
- crc32
- cube
- cume_dist
- current_database:查询当前数据库
> select current_database();
+------------------+
|current_database()|
+------------------+
| default|
+------------------+
- current_date:查询当前时间
> select current_date();
+--------------+
|current_date()|
+--------------+
| 2020-03-11|
+--------------+
- current_timestamp():当前时间戳
> select current_timestamp();
+--------------------+
| current_timestamp()|
+--------------------+
|2020-03-11 04:26:...|
+--------------------+
- date(expr):
> select date('2010-10-10');
+------------------------+
|CAST(2010-10-10 AS DATE)|
+------------------------+
| 2010-10-10|
+------------------------+
- date_add(start_date, num_days):增加时间
> select date_add('2010-10-10', 1);
+-------------------------------------+
|date_add(CAST(2010-10-10 AS DATE), 1)|
+-------------------------------------+
| 2010-10-11|
+-------------------------------------+
- date_format(timestamp, fmt):时间格式化
> select date_format('2010-10-10', 'yyyy');
+------------------------------------------------+
|date_format(CAST(2010-10-10 AS TIMESTAMP), yyyy)|
+------------------------------------------------+
| 2010|
+------------------------------------------------+
- date_sub(start_date, num_days):时间减少天数
> select date_sub('2020-10-10', 1);
+-------------------------------------+
|date_sub(CAST(2020-10-10 AS DATE), 1)|
+-------------------------------------+
| 2020-10-09|
+-------------------------------------+
- date_trunc(fmt, ts):时间精确,fmt:[“YEAR”, “YYYY”, “YY”, “MON”, “MONTH”, “MM”, “DAY”, “DD”, “HOUR”, “MINUTE”, “SECOND”, “WEEK”, “QUARTER”]
> select date_trunc('yyyy', '2011-11-11 11:11:11.111');
+------------------------------------------------------------+
|date_trunc(yyyy, CAST(2011-11-11 11:11:11.111 AS TIMESTAMP))|
+------------------------------------------------------------+
| 2011-01-01 00:00:00|
+------------------------------------------------------------+
> select date_trunc('month', '2011-11-11 11:11:11.111');
+-------------------------------------------------------------+
|date_trunc(month, CAST(2011-11-11 11:11:11.111 AS TIMESTAMP))|
+-------------------------------------------------------------+
| 2011-11-01 00:00:00|
+-------------------------------------------------------------+
> select date_trunc('day', '2011-11-11 11:11:11.111');
+-----------------------------------------------------------+
|date_trunc(day, CAST(2011-11-11 11:11:11.111 AS TIMESTAMP))|
+-----------------------------------------------------------+
| 2011-11-11 00:00:00|
+-----------------------------------------------------------+
- datediff(start_date, end_date):返回天数差
> select datediff('2011-11-11', '2011-11-10');
+------------------------------------------------------------+
|datediff(CAST(2011-11-11 AS DATE), CAST(2011-11-10 AS DATE))|
+------------------------------------------------------------+
| 1|
+------------------------------------------------------------+
- day(date):返回天数
> select day('2020-2-11');
+-----------------------------------+
|dayofmonth(CAST(2020-2-11 AS DATE))|
+-----------------------------------+
| 11|
+-----------------------------------+
- dayofmonth(date):返回天数
> select dayofmonth('2020-2-11');
+-----------------------------------+
|dayofmonth(CAST(2020-2-11 AS DATE))|
+-----------------------------------+
| 11|
+-----------------------------------+
- dayofweek(date):返回天数
> select dayofmonth('2020-2-11');
+-----------------------------------+
|dayofmonth(CAST(2020-2-11 AS DATE))|
+-----------------------------------+
| 11|
+-----------------------------------+
- dayofyear(date):返回天数
> select dayofyear('2020-2-11');
+----------------------------------+
|dayofyear(CAST(2020-2-11 AS DATE))|
+----------------------------------+
| 42|
+----------------------------------+
- decimal(expr):转换为decimal
> select decimal(1.0);
+--------------------------+
|CAST(1.0 AS DECIMAL(10,0))|
+--------------------------+
| 1|
+--------------------------+
- decode(bin, charset):解码
> select decode('abcd', 'utf-8');
+-----------------------------------+
|decode(CAST(abcd AS BINARY), utf-8)|
+-----------------------------------+
| abcd|
+-----------------------------------+
- degrees
- dense_rank
- double(expr):转成double
> select double(1);
+-----------------+
|CAST(1 AS DOUBLE)|
+-----------------+
| 1.0|
+-----------------+
- e():返回e的值
> select e();
+-----------------+
| E()|
+-----------------+
|2.718281828459045|
+-----------------+
- element_att(n, input1, input2, …):返回第几个
- elt
> select elt(2, 'java', 'php', 'R');
+--------------------+
|elt(2, java, php, R)|
+--------------------+
| php|
+--------------------+
- encode(str, charset):编码
> select encode('abcd', 'utf-8');
+-------------------+
|encode(abcd, utf-8)|
+-------------------+
| [61 62 63 64]|
+-------------------+
- exists
- exp
- explode(expr):拆分数组等
> select explode(array(1, 3, 5));
+---+
|col|
+---+
| 1|
| 3|
| 5|
+---+
- explode_outer(expr):拆分数组等
> select explode_outer(array(1, 3, 5));
+---+
|col|
+---+
| 1|
| 3|
| 5|
+---+
- expm1
- factorial(expr):阶乘
> select factorial(5);
+------------+
|factorial(5)|
+------------+
| 120|
+------------+
- filter
- find_in_set(str, str_array):查询索引
> select find_in_set('ab','abc,b,ab,c,def');
+-------------------------------+
|find_in_set(ab, abc,b,ab,c,def)|
+-------------------------------+
| 3|
+-------------------------------+
- first
- first_value
- flatten
- float(expr):转成float
> select float(1);
+----------------+
|CAST(1 AS FLOAT)|
+----------------+
| 1.0|
+----------------+
- floor(expr):向下取整
> select floor(-0.1);
+-----------+
|FLOOR(-0.1)|
+-----------+
| -1|
+-----------+
- format_number(expr1, expr2):数字格式化
> select format_number(3141.5926, 3);
+---------------------------+
|format_number(3141.5926, 3)|
+---------------------------+
| 3,141.593|
+---------------------------+
- format_string(strfmt, obj, …):字符串格式化
> select format_string('Hello World %d %s', 100, 'days');
+-------------------------------------------+
|format_string(Hello World %d %s, 100, days)|
+-------------------------------------------+
| Hello World 100 days|
+-------------------------------------------+
- from_json(jsonStr, schema[, options]):
> select from_json("{'id': 1, 'name': 'zhangsan'}", "id int, name string");
+--------------------------------------------+
|jsontostructs({'id': 1, 'name': 'zhangsan'})|
+--------------------------------------------+
| [1, zhangsan]|
+--------------------------------------------+
- from_unixtime(unix_time, format):unix_time为java.util.Date().getTime() / 1000
> select from_unixtime(1583888136, 'yyyy-MM-dd HH:mm:ss');
+--------------------------------------------------------------+
|from_unixtime(CAST(1583888136 AS BIGINT), yyyy-MM-dd HH:mm:ss)|
+--------------------------------------------------------------+
| 2020-03-11 08:55:36|
+--------------------------------------------------------------+
- from_utc_timestamp(timestamp, timezone)
> select from_utc_timestamp('2020-01-01', 'Asia/Seoul');
+-------------------------------------------------------------+
|from_utc_timestamp(CAST(2020-01-01 AS TIMESTAMP), Asia/Seoul)|
+-------------------------------------------------------------+
| 2020-01-01 09:00:00|
+-------------------------------------------------------------+
- get_json_object(json_txt, path):取最大值
- greatest(expr, …):
> select greatest(1, 3, 4, 10);
+---------------------+
|greatest(1, 3, 4, 10)|
+---------------------+
| 10|
+---------------------+
- grouping
- grouping_id
- hash(expr1, …):
> select hash('hello', 'world');
+------------------+
|hash(hello, world)|
+------------------+
| -165041762|
+------------------+
- hex(expr):转成hexadecimal
> select hex('hello world');
+--------------------+
| hex(hello world)|
+--------------------+
|68656C6C6F20776F7...|
+--------------------+
- hour(timestamp):返回小时
> select hour('2020-11-11 11:11:11');
+--------------------------------------------+
|hour(CAST(2020-11-11 11:11:11 AS TIMESTAMP))|
+--------------------------------------------+
| 11|
+--------------------------------------------+
- hypot(expr1, expr2):返回值为
sqrt(expr^2 + expr2^2)
> select hypot(3, 4);
+-------------------------------------------+
|HYPOT(CAST(3 AS DOUBLE), CAST(4 AS DOUBLE))|
+-------------------------------------------+
| 5.0|
+-------------------------------------------+
- if(expr1, expr2, expr3):expr1=true返回expr2,否则返回expr3
> select if(1 > 2, '1', '2');
+-------------------+
|(IF((1 > 2), 1, 2))|
+-------------------+
| 2|
+-------------------+
- ifnull(expr1, expr2):如果expr1为返回expr2
> select ifnull(null, 'expr2');
+---------------------+
|ifnull(NULL, 'expr2')|
+---------------------+
| expr2|
+---------------------+
- in:expr1 in(expr2, expr3, …)
> select 1 in(1, 3, 4);
+----------------+
|(1 IN (1, 3, 4))|
+----------------+
| true|
+----------------+
- initcap(str):每个单词首字母大写
> select initcap('hello world');
+--------------------+
|initcap(hello world)|
+--------------------+
| Hello World|
+--------------------+
- inline
> select inline(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
| 1| a|
| 2| b|
+----+----+
- inline_outer
> select inline_outer(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
| 1| a|
| 2| b|
+----+----+
- input_file_block_length()
- input_file_block_start
- input_file_name
- instr(str, substr):查询索引
> select instr('hello world', 'world');
+-------------------------+
|instr(hello world, world)|
+-------------------------+
| 7|
+-------------------------+
- int(expr):转成int
> select int(3.14);
+-----------------+
|CAST(3.14 AS INT)|
+-----------------+
| 3|
+-----------------+
- isnan
- isnotnull(expr):非空返回true
> select isnotnull(1);
+---------------+
|(1 IS NOT NULL)|
+---------------+
| true|
+---------------+
- isnull:空返回true
> select isnull(null);
+--------------+
|(NULL IS NULL)|
+--------------+
| true|
+--------------+
- java_method(class, method[, arg1[, arg2 …]]):调用java方法
> select java_method('java.util.UUID', 'randomUUID');
+-----------------------------------+
|reflect(java.util.UUID, randomUUID)|
+-----------------------------------+
| 3aaebaa8-6335-440...|
+-----------------------------------+
- json_tuple
- kurtosis
- lag
- last
> select last(array(1, 3, null, 5));
+----------------------------------------------+
|last(array(1, 3, CAST(NULL AS INT), 5), false)|
+----------------------------------------------+
| [1, 3,, 5]|
+----------------------------------------------+
- last_day(date):返回最后一天
> select last_day('2020-02-01');
+----------------------------------+
|last_day(CAST(2020-02-01 AS DATE))|
+----------------------------------+
| 2020-02-29|
+----------------------------------+
- last_value
- lcase(str):转换为小写
> select lcase('Hello World');
+------------------+
|lower(Hello World)|
+------------------+
| hello world|
+------------------+
- lead
- least(expr, …):返回最小值
> select least(null, 10, 3, 4);
+----------------------------------+
|least(CAST(NULL AS INT), 10, 3, 4)|
+----------------------------------+
| 3|
+----------------------------------+
- left(str, len):返回左边几个字符
> select left('hello world', 3);
+----------------------+
|left('hello world', 3)|
+----------------------+
| hel|
+----------------------+
- length(expr):返回长度
> select length('hello world');
+-------------------+
|length(hello world)|
+-------------------+
| 11|
+-------------------+
- levenshtein
- like
- ln
- locate(substr, str[, pos]):
- log
- log10
- log1p
- log2
- lower(str):转成小写
> select lower('Hello World');
+------------------+
|lower(Hello World)|
+------------------+
| hello world|
+------------------+
- lpad
- ltrim(trimStr, str):去除前后trimStr,不传责去除前空格
> select ltrim('ha', 'haHello worldha');
+--------------------------+
|ltrim(haHello worldha, ha)|
+--------------------------+
| Hello worldha|
+--------------------------+
- map(key0, value0, key1, value1, …):创建map
> select map(1.0, '2', 3.0, '4');
+--------------------+
| map(1.0, 2, 3.0, 4)|
+--------------------+
|[1.0 -> 2, 3.0 -> 4]|
+--------------------+
- map_concat(map<K, V>, …):返回多个 map 的并集
> select map_concat(map(1, 'a', 2, 'b'), map(2, 'c', 3, 'd'));
+--------------------------------------------+
|map_concat(map(1, a, 2, b), map(2, c, 3, d))|
+--------------------------------------------+
| [1 -> a, 2 -> b, ...|
+--------------------------------------------+
- map_from_arrays(array, array)
> select map_from_arrays(array(1.0, 3.0), array('2', '4'));
+---------------------------------------------+
|map_from_arrays(array(1.0, 3.0), array(2, 4))|
+---------------------------------------------+
| [1.0 -> 2, 3.0 -> 4]|
+---------------------------------------------+
- map_from_entries(array<struct<K, V>>)
> select map_from_entries(array(struct(1, 'a'), struct(2, 'b')));
+---------------------------------------------------------------------------------------+
|map_from_entries(array(named_struct(col1, 1, col2, a), named_struct(col1, 2, col2, b)))|
+---------------------------------------------------------------------------------------+
| [1 -> a, 2 -> b]|
+---------------------------------------------------------------------------------------+
- map_keys(map)
> select map_keys(map(1, 'a', 2, 'b'));
+-------------------------+
|map_keys(map(1, a, 2, b))|
+-------------------------+
| [1, 2]|
+-------------------------+
- map_values(map)
> select map_values(map(1, 'a', 2, 'b'));
+---------------------------+
|map_values(map(1, a, 2, b))|
+---------------------------+
| [a, b]|
+---------------------------+
- max
- md5(expr):md5加密
> select md5('hello world');
+--------------------------------+
|md5(CAST(hello world AS BINARY))|
+--------------------------------+
| 5eb63bbbe01eeed09...|
+--------------------------------+
- mean
- min
- minute(timestamp):返回分钟数
> select minute('2011-11-11 11:11:11');
+----------------------------------------------+
|minute(CAST(2011-11-11 11:11:11 AS TIMESTAMP))|
+----------------------------------------------+
| 11|
+----------------------------------------------+
- mod(expr1, expr2):%
> select mod(11, 2);
+--------+
|(11 % 2)|
+--------+
| 1|
+--------+
- monotonically_increasing_id
- month(date):返回月份
> select month('2011-11-11');
+-------------------------------+
|month(CAST(2011-11-11 AS DATE))|
+-------------------------------+
| 11|
+-------------------------------+
- months_between(timestamp1, timestamp2):月份差
> select months_between('2000-10-1', '2000-11-2');
+--------------------------------------------------------------------------------+
|months_between(CAST(2000-10-1 AS TIMESTAMP), CAST(2000-11-2 AS TIMESTAMP), true)|
+--------------------------------------------------------------------------------+
| -1.03225806|
+--------------------------------------------------------------------------------+
- named_struct(name1, val1, name2, val2, …)
> select named_struct('a', 1, 'b', 2, 'c', 3);
+------------------------------+
|named_struct(a, 1, b, 2, c, 3)|
+------------------------------+
| [1, 2, 3]|
+------------------------------+
- nanvl
- negative(expr):取反
> select negative(100);
+-------+
|(- 100)|
+-------+
| -100|
+-------+
- next_day(start_date, day_of_week)
> select next_day('2020-03-11', 'FRI');
+---------------------------------------+
|next_day(CAST(2020-03-11 AS DATE), FRI)|
+---------------------------------------+
| 2020-03-13|
+---------------------------------------+
- not
- now()
> select now();
+--------------------+
| current_timestamp()|
+--------------------+
|2020-03-11 23:07:...|
+--------------------+
- ntile
- nullif
- nvl(expr1, expr2):expr1为空返回expr2
> select nvl(null, 'hello');
+------------------+
|nvl(NULL, 'hello')|
+------------------+
| hello|
+------------------+
- nvl2(expr1, expr2, expr3):
- octet_length(expr):长度
> select octet_length('hello world');
+-------------------------+
|octet_length(hello world)|
+-------------------------+
| 11|
+-------------------------+
- or
- parse_url
- percent_rank
- percentile
- percentile_approx
- pi():圆周率
> select pi();
+-----------------+
| PI()|
+-----------------+
|3.141592653589793|
+-----------------+
- pmod
- posexplode(expr):分割数组
> select posexplode(array(1, 3, 4));
+---+---+
|pos|col|
+---+---+
| 0| 1|
| 1| 3|
| 2| 4|
+---+---+
- posexplode_outer(expr):分割数组
> select posexplode_outer(array(1, 3, 4));
+---+---+
|pos|col|
+---+---+
| 0| 1|
| 1| 3|
| 2| 4|
+---+---+
- position(substr, str[, pos]):返回位置
> select position('he', 'world hello', 1);
+--------------------------+
|locate(he, world hello, 1)|
+--------------------------+
| 7|
+--------------------------+
- positive
- pow(expr1, expr2):乘方
> select pow(2, 4);
+-------------------------------------------+
|POWER(CAST(2 AS DOUBLE), CAST(4 AS DOUBLE))|
+-------------------------------------------+
| 16.0|
+-------------------------------------------+
- power(expr1, expr2):乘方
> select power(2, 4);
+-------------------------------------------+
|POWER(CAST(2 AS DOUBLE), CAST(4 AS DOUBLE))|
+-------------------------------------------+
| 16.0|
+-------------------------------------------+
- printf(strfmt, obj, …):字符串格式化
> select printf('Hello World %d %s', 100, 'days');
+-------------------------------------------+
|format_string(Hello World %d %s, 100, days)|
+-------------------------------------------+
| Hello World 100 days|
+-------------------------------------------+
- quarter(date):
> select quarter('2020-11-11');
+---------------------------------+
|quarter(CAST(2020-11-11 AS DATE))|
+---------------------------------+
| 4|
+---------------------------------+
- radians(expr):
> select radians(180);
+----------------------------+
|RADIANS(CAST(180 AS DOUBLE))|
+----------------------------+
| 3.141592653589793|
+----------------------------+
- rand([seed]):随机数
> select rand();
+-------------------------+
|rand(7896507467301193574)|
+-------------------------+
| 0.1643759947938076|
+-------------------------+
- randn([seed]):随机数
> select randn();
+---------------------------+
|randn(-2498753337637632801)|
+---------------------------+
| -0.4827279803917336|
+---------------------------+
- rank
- reflect(class, method[, arg1[, arg2 …]]),反射调用java方法
> select reflect('java.util.UUID', 'randomUUID');
+-----------------------------------+
|reflect(java.util.UUID, randomUUID)|
+-----------------------------------+
| 3930dab2-d5e0-49e...|
+-----------------------------------+
- regexp_extract
- regexp_replace(str, regexp, rep):
- repeat(str, n):重复
> select repeat('ha', 4);
+-------------+
|repeat(ha, 4)|
+-------------+
| hahahaha|
+-------------+
- replace(str, search[, replace]):替换
> select replace('hello world', 'lo', 'ol');
+----------------------------+
|replace(hello world, lo, ol)|
+----------------------------+
| helol world|
+----------------------------+
- reverse(str):反转
> select reverse('hello world');
+--------------------+
|reverse(hello world)|
+--------------------+
| dlrow olleh|
+--------------------+
- right(str, len):从右截取字符串
> select right('hello world', 3);
+-----------------------+
|right('hello world', 3)|
+-----------------------+
| rld|
+-----------------------+
- rint(expr):返回int部分的值
> select rint(3.14159);
+------------------------------+
|ROUND(CAST(3.14159 AS DOUBLE))|
+------------------------------+
| 3.0|
+------------------------------+
- rlike
- rollup
- round(expr, d):
> select round(2.5, 0);
+-------------+
|round(2.5, 0)|
+-------------+
| 3|
+-------------+
- row_number
- rpad
- rtrim(trimStr, str):字符串右边截取
> select rtrim('ha', 'hahello worldha');
+--------------------------+
|rtrim(hahello worldha, ha)|
+--------------------------+
| hahello world|
+--------------------------+
- schema_of_json
- second(timestamp):查询秒
> select second('2011-11-11 11:11:11');
+----------------------------------------------+
|second(CAST(2011-11-11 11:11:11 AS TIMESTAMP))|
+----------------------------------------------+
| 11|
+----------------------------------------------+
- sentences(str[, lang, country])
> select sentences('hello world!my dreams');
+------------------------------------+
|sentences(hello world!my dreams, , )|
+------------------------------------+
| [[hello, world], ...|
+------------------------------------+
- sequence
- sha(expr):加密
> select sha('hello world');
+---------------------------------+
|sha1(CAST(hello world AS BINARY))|
+---------------------------------+
| 2aae6c35c94fcfb41...|
+---------------------------------+
- sha1(expr):加密
> select sha1('hello world');
+---------------------------------+
|sha1(CAST(hello world AS BINARY))|
+---------------------------------+
| 2aae6c35c94fcfb41...|
+---------------------------------+
- sha2(expr, bitLength):加密
> select sha2('hello world', 256);
+--------------------------------------+
|sha2(CAST(hello world AS BINARY), 256)|
+--------------------------------------+
| b94d27b9934d3e08a...|
+--------------------------------------+
- shiftleft
- shiftright
- shiftrightunsigned
- shuffle
- sign
- signum
- sin(expr):正弦函数
> select sin(0);
+----------------------+
|SIN(CAST(0 AS DOUBLE))|
+----------------------+
| 0.0|
+----------------------+
- sinh
- size(expr):数据大小
> select size(array(1, 2, 3));
+--------------------+
|size(array(1, 2, 3))|
+--------------------+
| 3|
+--------------------+
- skewness
- slice
- smallint(expr):转成smallint
> select smallint(100);
+---------------------+
|CAST(100 AS SMALLINT)|
+---------------------+
| 100|
+---------------------+
- sort_array(array[, ascendingOrder]):数组排序
> select sort_array(array(2, 3, 1, 0), true);
+-----------------------------------+
|sort_array(array(2, 3, 1, 0), true)|
+-----------------------------------+
| [0, 1, 2, 3]|
+-----------------------------------+
- soundex
- space
- spark_partition_id()
- split(str, regex):字符串分割
> select split('hehahehahehaa', 'ha');
+------------------------+
|split(hehahehahehaa, ha)|
+------------------------+
| [he, he, he, a]|
+------------------------+
- sqrt(expr):平方根
> select sqrt(9);
+-----------------------+
|SQRT(CAST(9 AS DOUBLE))|
+-----------------------+
| 3.0|
+-----------------------+
- stack
- std
- stddev
- stddev_pop
- stddev_samp
- str_to_map(text[, pairDelim[, keyValueDelim]]):转成map
> select str_to_map('a:1,b:2,c:3', ',', ':');
+-----------------------------+
|str_to_map(a:1,b:2,c:3, ,, :)|
+-----------------------------+
| [a -> 1, b -> 2, ...|
+-----------------------------+
- string(expr):转成string
- struct(col1, col2, col3, …)
> select struct(1, 3, 4);
+---------------------------------------+
|named_struct(col1, 1, col2, 3, col3, 4)|
+---------------------------------------+
| [1, 3, 4]|
+---------------------------------------+
- substr(str, pos[, len]):字符串截取
> select substr('hello world', 5, 1);
+----------------------------+
|substring(hello world, 5, 1)|
+----------------------------+
| o|
+----------------------------+
- substring(str, pos[, len]):字符串截取
> select substring('hello world', 5, 1);
+----------------------------+
|substring(hello world, 5, 1)|
+----------------------------+
| o|
+----------------------------+
- substring_index
- sum(expr):求和
- tan(expr):正切函数
> select tan(0);
+----------------------+
|TAN(CAST(0 AS DOUBLE))|
+----------------------+
| 0.0|
+----------------------+
- tanh
- timestamp(expr):转成时间戳
- tinyint(expr):转成tinyint
- to_date(date_str[, fmt]):转成时间
> select to_date('2011-11-11 11:11:11');
+------------------------------+
|to_date('2011-11-11 11:11:11')|
+------------------------------+
| 2011-11-11|
+------------------------------+
- to_json(expr[, options]):生成json
> select to_json(named_struct('a', 1, 'b', 2));
+---------------------------------------+
|structstojson(named_struct(a, 1, b, 2))|
+---------------------------------------+
| {"a":1,"b":2}|
+---------------------------------------+
- to_timestamp(timestamp[, fmt]):
> select to_timestamp('2011-11-11 11:11:11', 'yyyy-MM-dd HH:mm:ss');
+----------------------------------------------------------+
|to_timestamp('2011-11-11 11:11:11', 'yyyy-MM-dd HH:mm:ss')|
+----------------------------------------------------------+
| 2011-11-11 11:11:11|
+----------------------------------------------------------+
- to_unix_timestamp(expr[, pattern])
> select to_unix_timestamp('2011-11-11 11:11:11', 'yyyy-MM-dd HH:mm:ss');
+-----------------------------------------------------------+
|to_unix_timestamp(2011-11-11 11:11:11, yyyy-MM-dd HH:mm:ss)|
+-----------------------------------------------------------+
| 1320981071|
+-----------------------------------------------------------+
- to_utc_timestamp(timestamp, timezone):
> select to_utc_timestamp('2011-11-11 11:11:11', 'Asia/Seoul');
+--------------------------------------------------------------------+
|to_utc_timestamp(CAST(2011-11-11 11:11:11 AS TIMESTAMP), Asia/Seoul)|
+--------------------------------------------------------------------+
| 2011-11-11 02:11:11|
+--------------------------------------------------------------------+
- transform
- translate
- trim(str):字符串去除前后空格
> select trim('ha', 'hahello worldha');
+-------------------------+
|trim(hahello worldha, ha)|
+-------------------------+
| ello world|
+-------------------------+
- trunc(date, fmt):fmt - [“year”, “yyyy”, “yy”, “mon”, “month”, “mm”]
> select trunc('2011-11-11 02:11:11', 'yyyy');
+----------------------------------------------+
|trunc(CAST(2011-11-11 02:11:11 AS DATE), yyyy)|
+----------------------------------------------+
| 2011-01-01|
+----------------------------------------------+
- ucase(str):转换为大写
> select ucase('hello world');
+------------------+
|upper(hello world)|
+------------------+
| HELLO WORLD|
+------------------+
- unbase64(str):
> select unbase64('aGVsbG8gd29ybGQ=');
+--------------------------+
|unbase64(aGVsbG8gd29ybGQ=)|
+--------------------------+
| [68 65 6C 6C 6F 2...|
+--------------------------+
- unhex(str):
> select unhex('68656C6C6F');
+-----------------+
|unhex(68656C6C6F)|
+-----------------+
| [68 65 6C 6C 6F]|
+-----------------+
- unix_timestamp([expr[, pattern]]):
> select unix_timestamp('2020-01-01', 'yyyy-MM-dd');
+--------------------------------------+
|unix_timestamp(2020-01-01, yyyy-MM-dd)|
+--------------------------------------+
| 1577808000|
+--------------------------------------+
- upper(str):转换为大写
> select upper('hello world');
+------------------+
|upper(hello world)|
+------------------+
| HELLO WORLD|
+------------------+
- uuid()
> select uuid();
+--------------------+
| uuid()|
+--------------------+
|bd8aa547-9d08-46c...|
+--------------------+
- var_pop
- var_samp
- variance
- weekday
- weekofyear(date):
> select weekofyear('2011-11-11');
+------------------------------------+
|weekofyear(CAST(2011-11-11 AS DATE))|
+------------------------------------+
| 45|
+------------------------------------+
- when
> select case when 1 > 0 then 1 when 2 > 0 then 2.0 else 1.2 end;
+--------------------------------------------------------------------------------------------------------------------------------+
|CASE WHEN (1 > 0) THEN CAST(1 AS DECIMAL(11,1)) WHEN (2 > 0) THEN CAST(2.0 AS DECIMAL(11,1)) ELSE CAST(1.2 AS DECIMAL(11,1)) END|
+--------------------------------------------------------------------------------------------------------------------------------+
| 1.0|
+--------------------------------------------------------------------------------------------------------------------------------+
- window
- xpath
- xpath_boolean
- xpath_double
- xpath_float
- xpath_int
- xpath_long
- xpath_number
- xpath_short
- xpath_string
- year(date):
> select year('2011-11-11');
+------------------------------+
|year(CAST(2011-11-11 AS DATE))|
+------------------------------+
| 2011|
+------------------------------+
- zip_with