ClickHouse SQL Syntax and Functions in Detail

Latest recommended article published on 2024-08-01 17:55:09

ChenPD27595

Article link: https://blog.csdn.net/weixin_46011754/article/details/110788151

WITH syntax

with '1' as v select * from tb_user where id=v;
-- compute the average age
with (select count(1) from tb_user) as cnt select (sum(age)/cnt) from tb_user;
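
The same two forms of WITH (a constant alias and a scalar subquery) can be tried without any table. This is a minimal sketch that uses the built-in numbers() table function in place of tb_user; the alias names threshold and cnt are only illustrative.

```sql
-- Constant alias: `threshold` can be used anywhere in the query
WITH 7 AS threshold
SELECT number
FROM numbers(10)
WHERE number > threshold;          -- returns 8 and 9

-- Scalar subquery: `cnt` is evaluated once and reused as a single value
WITH (SELECT count() FROM numbers(10)) AS cnt
SELECT sum(number) / cnt AS average
FROM numbers(10);                  -- 45 / 10 = 4.5
```

For a plain average, avg(age) gives the same result as sum(age)/cnt; the WITH form is mainly useful when the scalar is needed in several places.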

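ARRAY JOIN syntax

ARRAY JOIN is roughly equivalent to explode + LATERAL VIEW in Hive: it expands each array into one row per element.

select id,arr from tb_arr_join array join arr;  -- the arr column is replaced by its individual elements
select id,arr,x from tb_arr_join array join arr as x ;  -- arr keeps the original array, x holds one element per row
select id,arr,x from tb_arr_join array join [1,2,3,4,5] as x; -- every row is combined with each element of the literal array
┌─id─┬─arr──────────────┬─x─┐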
│  1 │ ['a1','a2']      │ 1 │
│  1 │ ['a1','a2']      │ 2 │
│  1 │ ['a1','a2']      │ 3 │
│  1 │ ['a1','a2']      │ 4 │
│  1 │ ['a1','a2']      │ 5 │
│  2 │ ['b1','b2','b3'] │ 1 │
│  2 │ ['b1','b2','b3'] │ 2 │
│  2 │ ['b1','b2','b3'] │ 3 │
│  2 │ ['b1','b2','b3'] │ 4 │
│  2 │ ['b1','b2','b3'] │ 5 │
└────┴──────────────────┴───┘
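
One detail worth knowing is that a plain ARRAY JOIN drops rows whose array is empty, while LEFT ARRAY JOIN keeps them and fills the element with the type's default value. The sketch below uses inline rows instead of tb_arr_join, so the data is an assumption made for illustration.

```sql
SELECT id, x
FROM
(
    SELECT 1 AS id, ['a1', 'a2'] AS arr
    UNION ALL
    SELECT 2 AS id, emptyArrayString() AS arr   -- id 2 has an empty array
)
LEFT ARRAY JOIN arr AS x                        -- plain ARRAY JOIN would drop id 2
ORDER BY id, x;
-- rows: (1, 'a1'), (1, 'a2'), (2, '')
```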

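Example requirement: transform the data as shown in the figure.
[Figure: data before and after the transformation]

Implementation: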

select id, e, i
from (
    select id,
           groupArray(name) arr,
           arrayEnumerate(arr) arr_index
    from tb_test_arr
    group by id
) t
array join arr as e, arr_index as i;
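
To see what the query above does step by step, here is a self-contained sketch with three inline rows standing in for tb_test_arr (the values are assumptions). groupArray collects the names per id, arrayEnumerate produces the matching positions [1, 2, ...], and joining both arrays in one ARRAY JOIN walks them in parallel.

```sql
SELECT id, e, i
FROM
(
    SELECT
        id,
        groupArray(name) AS arr,          -- e.g. ['a', 'b'] for id 1
        arrayEnumerate(arr) AS arr_index  -- [1, 2] for a two-element array
    FROM
    (
        SELECT 1 AS id, 'a' AS name
        UNION ALL SELECT 1, 'b'
        UNION ALL SELECT 2, 'c'
    )
    GROUP BY id
)
ARRAY JOIN arr AS e, arr_index AS i       -- arrays of equal length are expanded together
ORDER BY id, i;
```

Note that groupArray does not guarantee the order of elements inside each group, so the index reflects the order in which rows happened to be read unless the input is sorted first.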

FORMAT: specify the data format for output and input

clickhouse-client  -q  "select * from tb_user  FORMAT  XML"
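
The FORMAT clause can also be written at the end of an interactive query. A small sketch, using numbers() so that it does not depend on tb_user; the column names id and name are illustrative.

```sql
-- One JSON object per row
SELECT number AS id, concat('user_', toString(number)) AS name
FROM numbers(3)
FORMAT JSONEachRow;

-- Comma-separated values with a header line
SELECT number AS id, concat('user_', toString(number)) AS name
FROM numbers(3)
FORMAT CSVWithNames;
```

The same format names are used on the input side, for example in an INSERT ... FORMAT CSV statement fed from a file.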

LIMIT / LIMIT BY syntax

select * from tb_user limit 2,2 ; -- skip the first 2 rows, then return the next 2 rows

LIMIT BY example:

select * from tb_limit limit 2 by name; -- return at most 2 rows for each distinct name
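
LIMIT n BY is often combined with ORDER BY to get a "top n per group" result. The sketch below uses inline rows in place of tb_limit; the column score is an assumption added for illustration.

```sql
SELECT name, score
FROM
(
    SELECT 'a' AS name, 1 AS score
    UNION ALL SELECT 'a', 2
    UNION ALL SELECT 'a', 3
    UNION ALL SELECT 'b', 10
)
ORDER BY name, score DESC
LIMIT 2 BY name;     -- keeps the 2 highest scores per name
-- rows: ('a', 3), ('a', 2), ('b', 10)
```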

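create  view  v_limit as   select * from tb_limit ;   -- a normal view stores only the query, no data on disk; the select runs each time the view is read
create  table  t_limit engine=Log as   select * from tb_limit ;  -- creates the table and fills it with the query result
create  table  tb_limit2  like  tb_limit ;  -- not supported (MySQL-style LIKE)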
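
Where MySQL would use LIKE, ClickHouse copies a table structure with CREATE TABLE ... AS other_table. A short sketch; tb_limit3 is a hypothetical table name.

```sql
-- Copy the column definitions of tb_limit (the engine is copied too when none is given)
CREATE TABLE tb_limit2 AS tb_limit;

-- Copy the column definitions but choose a different engine
CREATE TABLE tb_limit3 AS tb_limit ENGINE = Log;

-- Neither statement copies rows; load them separately if needed
INSERT INTO tb_limit2 SELECT * FROM tb_limit;
```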

Creating partitioned tables

Partition by a specified column

create table  tb_p(
oid String ,
money  Float64 ,
cDate  Date
)  engine = MergeTree 
order by oid 
partition by cDate ;  
insert into  tb_p values ('002',99,'2020-12-01') ,('001',98,'2020-12-01') ,('003',199,'2020-12-02');

┌─oid─┬─money─┬──────cDate─┐
│ 003 │   199 │ 2020-12-02 │
└─────┴───────┴────────────┘
┌─oid─┬─money─┬──────cDate─┐
│ 001 │    98 │ 2020-12-01 │
│ 002 │    99 │ 2020-12-01 │
└─────┴───────┴────────────┘
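
The partitions that a MergeTree table actually holds can be checked through the system.parts table. A minimal sketch, assuming tb_p lives in the current database:

```sql
SELECT partition, name, rows
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'tb_p'
  AND active              -- ignore parts already replaced by merges
ORDER BY partition;
```

Each returned row is one data part; parts that share the same partition value may later be merged into one.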

Partition by month

create table  tb_p2(
oid String ,
money  Float64 ,
cDate  Date
)  engine = MergeTree 
order by oid 
partition by toMonth(cDate) ;
insert into  tb_p2 values ('002',99,'2020-12-01') ,('001',98,'2020-12-01') ,('003',199,'2020-12-02'),('004',299,'2020-11-02');

┌─oid─┬─money─┬──────cDate─┐
│ 001 │    98 │ 2020-12-01 │
│ 002 │    99 │ 2020-12-01 │
│ 003 │   199 │ 2020-12-02 │
└─────┴───────┴────────────┘
┌─oid─┬─money─┬──────cDate─┐
│ 004 │   299 │ 2020-11-02 │
└─────┴───────┴────────────┘

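Partitioning by toMonth alone puts the same month of different years into one partition. To separate them, specify two partition expressions, year and month: rows land in different partitions whenever either value differs.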

create table  tb_p3(
oid String ,
money  Float64 ,
cDate  Date
)  engine = MergeTree 
order by oid 
partition by (toYear(cDate) , toMonth(cDate)) ; -- partition by year and month
insert into  tb_p3 values ('002',99,'2020-12-01') ,('001',98,'2020-12-01') ,('003',199,'2020-12-02'),('004',299,'2020-11-02'),('005',299,'2019-11-02');

Resulting part directories on disk, one per year and month:
drwxr-x---. 2 clickhouse clickhouse 228 Dec  6 04:33 2019-11_3_3_0
drwxr-x---. 2 clickhouse clickhouse 228 Dec  6 04:33 2020-11_2_2_0
drwxr-x---. 2 clickhouse clickhouse 228 Dec  6 04:33 2020-12_1_1_0
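
A common alternative to the two-expression key is a single toYYYYMM(cDate) expression, which also separates the same month of different years. The sketch below mirrors tb_p3; the table name tb_p4 is hypothetical.

```sql
CREATE TABLE tb_p4
(
    oid   String,
    money Float64,
    cDate Date
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(cDate)   -- e.g. 202012 for 2020-12-01
ORDER BY oid;

INSERT INTO tb_p4 VALUES
    ('001', 98, '2020-12-01'),
    ('004', 299, '2020-11-02'),
    ('005', 299, '2019-11-02');   -- three rows, three different partitions
```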

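Array functions

Higher-order functions
Method: an encapsulation of a piece of logic that implements some functionality, can be reused, and is easy to call.
Function: likewise an encapsulation of a piece of logic that implements some functionality, can be reused, and is easy to call.
public int add(int a, int b) { return a + b; }

A method is part of an object.
A function is more general than a method: it can exist on its own as a special kind of object and be passed around as a value.

arrayMap is a higher-order function

Its first argument is a function, conceptually function f(T x){}, and its second argument is an array of type Array(T).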

select arrayMap(x->x*x ,[1,2,3,4]) ;
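
A few more calls in the same style may help show how the lambda argument works. These are standard ClickHouse array functions, shown here as a short sketch on literal arrays.

```sql
SELECT
    arrayMap(x -> x * x, [1, 2, 3, 4])                     AS squares,  -- [1,4,9,16]
    arrayFilter(x -> x % 2 = 0, [1, 2, 3, 4])              AS evens,    -- [2,4]
    arrayMap((x, y) -> x + y, [1, 2, 3], [10, 20, 30])     AS sums;     -- [11,22,33]
```

When the lambda takes several parameters, arrayMap expects the same number of arrays, all of equal length.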

... to be continued

String functions

Splitting strings

select splitByString(':','hello:world:kitty'); -- split on ':'

Concatenating strings

select concat('hello,','jj!');
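
Splitting and joining are often used together. As a small sketch: splitByChar is the single-character counterpart of splitByString, and arrayStringConcat joins an array back into one string.

```sql
SELECT
    splitByChar(':', 'hello:world:kitty') AS parts,    -- ['hello','world','kitty']
    length(parts)                         AS n,        -- 3
    arrayStringConcat(parts, '-')         AS joined;   -- 'hello-world-kitty'
```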

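JSON functions

①: visitParamHas(params, name): the first argument is the raw JSON string and the second is the field name; it returns 1 if the field exists and 0 otherwise.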

select visitParamHas(cont,'movie') from tb_json;

②: visitParamExtractString(params, name): extracts the string value of the named field from the JSON stored in a column.

select visitParamExtractString(cont,'rate') from tb_json ;

③: Parsing several fields at once

SELECT JSONExtract(cont, 'Tuple(movie String , rate String , uid String)')
FROM tb_json

④: Reading elements of the resulting tuple

select JSONExtract(cont,'Tuple(movie String,rate String,timeStamp String)').2 from tb_json;


select cast(tupleElement(JSONExtract(cont , 'Tuple(movie String , rate String , uid String)'),2) as Float64)+12 from tb_json;
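
The JSONExtract* family can be tried on a literal string without the tb_json table. In the sketch below the JSON document is an assumption modelled on the field names used above (movie, rate, timeStamp, uid).

```sql
WITH '{"movie":"1193","rate":"5","timeStamp":"978300760","uid":"1"}' AS cont
SELECT
    JSONHas(cont, 'movie')                          AS has_movie,      -- 1
    JSONExtractString(cont, 'rate')                 AS rate,           -- '5'
    toFloat64(JSONExtractString(cont, 'rate')) + 12 AS rate_plus_12;   -- 17
```

The visitParam* functions are faster but make strong assumptions about how the JSON is laid out, while the JSONExtract* functions parse the document fully.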

Enum data type

Create the table

create table tb_color(color Enum('RED'=1,'BLUE'=2,'GREEN'=3))engine=Log;

Insert data

insert into tb_color values(1); -- the stored value is 'RED'
insert into tb_color values('GREEN');

Query data

select cast(color as Int8) from tb_color;
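
The mapping between an Enum's name and its number can also be checked without a table. A minimal sketch using the same three values as tb_color:

```sql
SELECT
    CAST('GREEN' AS Enum('RED' = 1, 'BLUE' = 2, 'GREEN' = 3)) AS color,  -- shown as GREEN
    CAST(color AS Int8)                                       AS code,   -- 3
    toTypeName(color)                                         AS type;   -- the Enum type as ClickHouse stores it
```

Only the declared names and numbers are accepted; inserting any other value into an Enum column raises an error.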
