大数据 - Doris系列《四》- Doris常用函数

最新推荐文章于 2025-09-18 15:17:23 发布

原创最新推荐文章于 2025-09-18 15:17:23 发布 · 3.7k 阅读

25 ·

CC 4.0 BY-SA版权

文章标签：

#大数据

Doris 专栏收录该内容

4 篇文章

订阅专栏

往期文章：

大数据 - Doris系列《一》- Doris简介_doris详细介绍-CSDN博客

大数据 - Doris系列《二》- Doris安装（亲测成功版）_java8 please set vm.max_map_count to be 2000000 un-CSDN博客

大数据 - Doris系列《三》- 数据表设计之表的基本概念_doris 一张表最多字段-CSDN博客

4.1.1条件函数

4.1.1.1 if函数

4.1.1.2 ifnull,nvl,coalesce,nullif函数

4.1.1.3 case

4.1.1.4 练习题

4.1.2聚合函数

4.1.2.1 min,max,sum,avg,count

4.1.2.2 min_by和max_by

🥙4.1.2.3 group_concat

4.1.2.4 collect_list，collect_set (1.2版本上线)

4.1.3 字符串函数

4.1.3.1 length，lower，upper，reverse

4.1.3.2 lpad，rpad

4.1.3.3 concat，concat_ws

4.1.3.4 substr

4.1.3.5 ends_with，starts_with

4.1.3.6 trim，ltrim，rtrim

4.1.3.7 null_or_empty，not_null_or_empty

4.1.3.8 replace

4.1.3.9 split_part

4.1.3.10 money_format

4.1.1条件函数

4.1.1.1 if函数

语法示例：

if(boolean condition, type valueTrue, type valueFalseOrNull)
--如果表达式 condition 成立，返回结果 valueTrue；否则，返回结果 valueFalseOrNull
--返回值类型：valueTrue 表达式结果的类型

示例：

mysql> select  user_id, if(user_id = 1, "true", "false") as test_if from test;
+---------+---------+
| user_id | test_if |
+---------+---------+
| 1       | true    |
| 2       | false   |
+---------+---------+

4.1.1.2 ifnull,nvl,coalesce,nullif函数

语法示例：

ifnull(expr1, expr2)
--如果 expr1 的值不为 NULL 则返回 expr1，否则返回 expr2

nvl(expr1, expr2)
--如果 expr1 的值不为 NULL 则返回 expr1，否则返回 expr2

coalesce(expr1, expr2, ...., expr_n))
--返回参数中的第一个非空表达式（从左向右）

nullif(expr1, expr2)
-- 如果两个参数相等，则返回NULL。否则返回第一个参数的值

示例：

mysql> select ifnull(1,0);
+--------------+
| ifnull(1, 0) |
+--------------+
|            1 |
+--------------+

mysql> select nvl(null,10);
+------------------+
| nvl(null,10)     |
+------------------+
|               10 |
+------------------+

mysql> select coalesce(NULL, null, '0000');
+--------------------------------+
| coalesce(NULL, '1111', '0000') |
+--------------------------------+
| 1111                           |
+--------------------------------+

mysql> select coalesce(NULL, NULL,NULL,'0000', NULL);
+----------------------------------------+
| coalesce(NULL, NULL,NULL,'0000', NULL) |
+----------------------------------------+
| 0000                                   |
+----------------------------------------+

mysql> select nullif(1,1);
+--------------+
| nullif(1, 1) |
+--------------+
|         NULL |
+--------------+

mysql> select nullif(1,0);
+--------------+
| nullif(1, 0) |
+--------------+
|            1 |
+--------------+

4.1.1.3 case

语法示例：

-- 方式一
CASE expression
    WHEN condition1 THEN result1
    [WHEN condition2 THEN result2]
    ...
    [WHEN conditionN THEN resultN]
    [ELSE result]
END


-- 方式二
CASE WHEN condition1 THEN result1
    [WHEN condition2 THEN result2]
    ...
    [WHEN conditionN THEN resultN]
    [ELSE result]
END

-- 将表达式和多个可能的值进行比较，当匹配时返回相应的结果

示例：

mysql> select user_id, 
case user_id 
when 1 then 'user_id = 1' 
when 2 then 'user_id = 2' 
else 'user_id not exist' 
end as test_case 
from test;
+---------+-------------+
| user_id | test_case   |
+---------+-------------+
| 1       | user_id = 1 |
| 2       | user_id = 2 |
| 3       | 'user_id not exist' |
+---------+-------------+
 
mysql> select user_id, 
case 
when user_id = 1 then 'user_id = 1' 
when user_id = 2 then 'user_id = 2' 
else 'user_id not exist' 
end as test_case 
from test;
+---------+-------------+
| user_id | test_case   |
+---------+-------------+
| 1       | user_id = 1 |
| 2       | user_id = 2 |
+---------+-------------+

4.1.1.4 练习题

SQL1158 市场分析I

4.1.2聚合函数

4.1.2.1 min,max,sum,avg,count

4.1.2.2 min_by和max_by

MAX_BY(expr1, expr2)
返回expr2最大值所在行的 expr1 （求分组top1的简介函数）

语法示例：

MySQL > select * from tbl;
+------+------+------+------+
| k1   | k2   | k3   | k4   |
+------+------+------+------+
|    0 | 3    | 2    |  100 |
|    1 | 2    | 3    |    4 |
|    4 | 3    | 2    |    2 |
|    3 | 4    | 2    |    1 |
+------+------+------+------+

MySQL > select max_by(k1, k4) from tbl;

select max_by(k1, k4) from tbl;
--取k4这个列中的最大值对应的k1这个列的值
+--------------------+
| max_by(`k1`, `k4`) |
+--------------------+
|                  0 |
+--------------------+

练习：

name   subject   score 
zss,chinese,99
zss,math,89
zss,English,79
lss,chinese,88
lss,math,88
lss,English,22
www,chinese,99
www,math,45
zll,chinese,23
zll,math,88
zll,English,80
www,English,94



-- 建表语句
create table score
(
name varchar(50),
subject varchar(50),
score double
)
DUPLICATE KEY(name)
DISTRIBUTED BY HASH(name) BUCKETS 1;

-- 通过本地文件的方式导入数据
curl \
 -u root: \
 -H "label:salary" \
 -H "column_separator:," \
 -T /root/data/sorce.txt \
 http://linux01:8040/api/test/score/_stream_load


-- 求每门课程成绩最高分的那个人
select  
subject,max_by(name,score) as name
from score
group by subject


+---------+------+
| subject | name |
+---------+------+
| English | www  |
| math    | lss  |
| chinese | www  |
+---------+------+

🥙4.1.2.3 group_concat

求：每一个人有考试成绩的所有科目

select name， group_concat(subject,',') as all_subject from score group by name

语法示例：

VARCHAR GROUP_CONCAT([DISTINCT] VARCHAR 列名[, VARCHAR sep]

该函数是类似于hive中collect_set()和collect_list()函数，group_concat 将结果集中的多行结果连接成一个字符串

-- group_concat对于收集的字段只能是string，varchar，char类型  
--当不指定分隔符的时候，默认使用 ','

VARCHAR ：代表GROUP_CONCAT函数返回值类型
[DISTINCT]：可选参数，针对需要拼接的列的值进行去重  
[, VARCHAR sep]：拼接成字符串的分隔符，默认是 ','

示例：

--建表
create table example(
id int,
name varchar(50),
age int,
gender string,
is_marry boolean,
marry_date date,
marry_datetime datetime
)engine = olap
distributed by hash(id) buckets 3;

--插入数据
insert into example values \
(1,'zss',18,'male',0,null,null),\
(2,'lss',28,'female',1,'2022-01-01','2022-01-01 11:11:11'),\
(3,'ww',38,'male',1,'2022-02-01','2022-02-01 11:11:11'),\
(4,'zl',48,'female',0,null,null),\
(5,'tq',58,'male',1,'2022-03-01','2022-03-01 11:11:11'),\
(6,'mly',18,'male',1,'2022-04-01','2022-04-01 11:11:11'),\
(7,null,18,'male',1,'2022-05-01','2022-05-01 11:11:11');


+--------+---------------+

当收集的那一列，有值为null时，他会自动将null的值过滤掉

select 
gender,
group_concat(name,',') as gc_name
from example 
group by gender;
+--------+---------------+
| gender | gc_name       |
+--------+---------------+
| female | zl,lss        |
| male   | zss,ww,tq,mly |
+--------+---------------+

select 
gender,
group_concat(DISTINCT cast(age as string)) as gc_age
from example 
group by gender;

+--------+------------+
| gender | gc_age     |
+--------+------------+
| female | 48, 28     |
| male   | 58, 38, 18 |
+--------+------------+

4.1.2.4 collect_list，collect_set (1.2版本上线)

语法示例：

ARRAY<T> collect_list(expr)
--返回一个包含 expr 中所有元素(不包括NULL)的数组，数组中元素顺序是不确定的。

ARRAY<T> collect_set(expr)
--返回一个包含 expr 中所有去重后元素(不包括NULL)的数组，数组中元素顺序是不确定的。

4.1.3 字符串函数

4.1.3.1 length，lower，upper，reverse

示例：

获取到字符串的长度，对字符串转大小写和字符串的反转

4.1.3.2 lpad，rpad

语法：

VARCHAR rpad(VARCHAR str, INT len, VARCHAR pad)

VARCHAR lpad(VARCHAR str, INT len, VARCHAR pad)

-- 返回 str 中长度为 len（从首字母开始算起）的字符串。
--如果 len 大于 str 的长度，则在 str 的后面不断补充 pad  字符，
--直到该字符串的长度达到 len 为止。如果 len 小于 str 的长度，
--该函数相当于截断 str 字符串，只返回长度为 len  的字符串。
--len 指的是字符长度而不是字节长度。

示例：

-- 向左边补齐
SELECT lpad("1", 5, "0");
+---------------------+
| lpad("1", 5, "0") |
+---------------------+
| 00001             |
+---------------------+

-- 向右边补齐
SELECT rpad('11', 5, '0');
+---------------------+
| rpad('11', 5, '0')  |
+---------------------+
| 11000               |
+---------------------+

4.1.3.3 concat，concat_ws

语法：

select concat("a", "b");
+------------------+
| concat('a', 'b') |
+------------------+
| ab               |
+------------------+

select concat("a", "b", "c");
+-----------------------+
| concat('a', 'b', 'c') |
+-----------------------+
| abc                   |
+-----------------------+

-- concat中，如果有一个值为null，那么得到的结果就是null
mysql> select concat("a", null, "c");
+------------------------+
| concat('a', NULL, 'c') |
+------------------------+
| NULL                   |
+------------------------+


--使用第一个参数 sep 作为连接符
--将第二个参数以及后续所有参数(或ARRAY中的所有字符串)拼接成一个字符串。
-- 如果分隔符是 NULL，返回 NULL。 concat_ws函数不会跳过空字符串，会跳过 NULL 值。
mysql> select concat_ws("_", "a", "b");
+----------------------------+
| concat_ws("_", "a", "b")   |
+----------------------------+
| a_b                        |
+----------------------------+

mysql> select concat_ws(NULL, "d", "is");
+----------------------------+
| concat_ws(NULL, 'd', 'is') |
+----------------------------+
| NULL                       |
+----------------------------+

4.1.3.4 substr

语法：

--求子字符串，返回第一个参数描述的字符串中从start开始长度为len的部分字符串。
--首字母的下标为1。
mysql> select substr("Hello doris", 3, 5);
+-----------------------------+
| substr('Hello doris', 2, 1) |
+-----------------------------+
| e                           |
+-----------------------------+
mysql> select substr("Hello doris", 1, 2);
+-----------------------------+
| substr('Hello doris', 1, 2) |
+-----------------------------+
| He                          |
+-----------------------------+

4.1.3.5 ends_with，starts_with

语法：

BOOLEAN ENDS_WITH (VARCHAR str, VARCHAR suffix)
--如果字符串以指定后缀结尾，返回true。否则，返回false。
--任意参数为NULL，返回NULL。

BOOLEAN STARTS_WITH (VARCHAR str, VARCHAR prefix)
--如果字符串以指定前缀开头，返回true。否则，返回false。
--任意参数为NULL，返回NULL。

示例：

mysql> select ends_with("Hello doris", "doris");
+-----------------------------------+
| ends_with('Hello doris', 'doris') |
+-----------------------------------+
|                                 1 | 
+-----------------------------------+

mysql> select ends_with("Hello doris", "Hello");
+-----------------------------------+
| ends_with('Hello doris', 'Hello') |
+-----------------------------------+
|                                 0 | 
+-----------------------------------+


MySQL [(none)]> select starts_with("hello world","hello");
+-------------------------------------+
| starts_with('hello world', 'hello') |
+-------------------------------------+
|                                   1 |
+-------------------------------------+

MySQL [(none)]> select starts_with("hello world","world");
+-------------------------------------+
| starts_with('hello world', 'world') |
+-------------------------------------+
|                                   0 |
+-------------------------------------+

4.1.3.6 trim，ltrim，rtrim

语法：

VARCHAR trim(VARCHAR str)
-- 将参数 str 中左侧和右侧开始部分连续出现的空格去掉
mysql> SELECT trim('   ab d   ') str;
+------+
| str  |
+------+
| ab d |
+------+

VARCHAR ltrim(VARCHAR str)
-- 将参数 str 中从左侧部分开始部分连续出现的空格去掉
mysql> SELECT ltrim('   ab d') str;
+------+
| str  |
+------+
| ab d |
+------+

VARCHAR rtrim(VARCHAR str)
--将参数 str 中从右侧部分开始部分连续出现的空格去掉

mysql> SELECT rtrim('ab d   ') str;
+------+
| str  |
+------+
| ab d |
+------+

4.1.3.7 null_or_empty，not_null_or_empty

BOOLEAN NULL_OR_EMPTY (VARCHAR str)

-- 如果字符串为空字符串或者NULL，返回true。否则，返回false。
MySQL [(none)]> select null_or_empty(null);
+---------------------+
| null_or_empty(NULL) |
+---------------------+
|                   1 |
+---------------------+

MySQL [(none)]> select null_or_empty("");
+-------------------+
| null_or_empty('') |
+-------------------+
|                 1 |
+-------------------+

MySQL [(none)]> select null_or_empty("a");
+--------------------+
| null_or_empty('a') |
+--------------------+
|                  0 |
+--------------------+

BOOLEAN NOT_NULL_OR_EMPTY (VARCHAR str)
如果字符串为空字符串或者NULL，返回false。否则，返回true。

MySQL [(none)]> select not_null_or_empty(null);
+-------------------------+
| not_null_or_empty(NULL) |
+-------------------------+
|                       0 |
+-------------------------+

MySQL [(none)]> select not_null_or_empty("");
+-----------------------+
| not_null_or_empty('') |
+-----------------------+
|                     0 |
+-----------------------+

MySQL [(none)]> select not_null_or_empty("a");
+------------------------+
| not_null_or_empty('a') |
+------------------------+
|                      1 |
+------------------------+

4.1.3.8 replace

VARCHAR REPLACE (VARCHAR str, VARCHAR old, VARCHAR new)
-- 将str字符串中的old子串全部替换为new串

mysql> select replace("http://www.baidu.com:9090", "9090", "");
+------------------------------------------------------+
| replace('http://www.baidu.com:9090', '9090', '') |
+------------------------------------------------------+
| http://www.baidu.com:                                |
+------------------------------------------------------+

4.1.3.9 split_part

VARCHAR split_part(VARCHAR content, VARCHAR delimiter, INT field)
-- 根据分割符拆分字符串, 返回指定的分割部分(从一开始计数)。

mysql> select split_part("hello world", " ", 1);
+----------------------------------+
| split_part('hello world', ' ', 1) |
+----------------------------------+
| hello                            |
+----------------------------------+


mysql> select split_part("hello world", " ", 2);
+----------------------------------+
| split_part('hello world', ' ', 2) |
+----------------------------------+
| world                             |
+----------------------------------+

mysql> select split_part("2019年7月8号", "月", 1);
+-----------------------------------------+
| split_part('2019年7月8号', '月', 1)     |
+-----------------------------------------+
| 2019年7                                 |
+-----------------------------------------+

mysql> select split_part("abca", "a", 1);
+----------------------------+
| split_part('abca', 'a', 1) |
+----------------------------+
|                            |
+----------------------------+

4.1.3.10 money_format

VARCHAR money_format(Number)
-- 将数字按照货币格式输出，整数部分每隔3位用逗号分隔，小数部分保留2位

mysql> select money_format(17014116);
+------------------------+
| money_format(17014116) |
+------------------------+
| 17,014,116.00          |
+------------------------+

mysql> select money_format(1123.456);
+------------------------+
| money_format(1123.456) |
+------------------------+
| 1,123.46               |
+------------------------+

mysql> select money_format(1123.4);
+----------------------+
| money_format(1123.4) |
+----------------------+
| 1,123.40             |
+----------------------+