Presto/Trino的explain和system catalog的使用、各种数据类型(基础、集合、日期时间)

最新推荐文章于 2023-03-14 11:31:20 发布

Bulut0907

最新推荐文章于 2023-03-14 11:31:20 发布

阅读量2.2k

点赞数

分类专栏： # Presto/Trino 文章标签： explain system catalog 基础类型集合类型 trino

本文链接：https://blog.csdn.net/yy8623977/article/details/122375769

版权

Presto/Trino 专栏收录该内容

14 篇文章 7 订阅

订阅专栏

1. explain

查看查询执行计划，简单例子是检查查询在语法上是否正确

trino>
trino> explain (type validate) select name from tpch.tiny.region;
 Valid 
-------
 true  
(1 row)

Query 20220106_183905_00013_b26ia, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.26 [0 rows, 0B] [0 rows/s, 0B/s]

trino>

2. system catalog

2.1 system.runtime数据库

trino>
trino> show tables from system.runtime;
        Table         
----------------------
 nodes                
 optimizer_rule_stats 
 queries              
 tasks                
 transactions         
(5 rows)

Query 20220106_184602_00014_b26ia, FINISHED, 3 nodes
Splits: 6 total, 6 done (100.00%)
0.82 [5 rows, 134B] [6 rows/s, 164B/s]

trino>

查看queries和tasks表对性能优化最重要。queries表提供了当前和历史查询。tasks表则提供了task运行时的底层细节

可以使用下面的方法，来终结某个查询

trino>
trino> call system.runtime.kill_query(query_id  => '20220106_185627_00018_b26ia', message => 'killed');
CALL
trino>

nodes表可以查看集群中的节点信息

2.2 system.information_schema数据库

每个catalog都有自己的information_schema数据库，其中的表是基于视图实现的

每个information_schema数据库都有9个视图，如果某个连接器不支持某个功能，查询对应的视图会报错

trino>
trino> show tables from system.information_schema;
             Table              
--------------------------------
 applicable_roles               
 columns                        
 enabled_roles                  
 role_authorization_descriptors 
 roles                          
 schemata                       
 table_privileges               
 tables                         
 views                          
(9 rows)

Query 20220107_031226_00015_b26ia, FINISHED, 3 nodes
Splits: 6 total, 6 done (100.00%)
0.34 [9 rows, 358B] [26 rows/s, 1.04KB/s]

trino>

3. 数据类型

Trino中的数据类型不是所有的连接器都支持，数据源中的数据类型可能Trino也不支持

可以使用cast (value as data_type)和try_cast(value as data_type)进行数据类型转换，但try_cast转换失败则返回NULL

3.1 基础数据类型

布尔类型

类型	描述	示例
boolean	true或false	true

整数类型

类型	描述	示例
tinyint	8位有符号整数，最小值 $2^7$ ，最大值 $2^{7-1}$	42
smallint	16位有符号整数，最小值 $2^{15}$ ，最大值 $2^{15-1}$	42
integer、int	32位有符号整数，最小值 $2^{31}$ ，最大值 $2^{31-1}$	42
bigint	64位有符号整数，最小值 $2^{63}$ ，最大值 $2^{63-1}$	42

浮点数类型

类型	描述	示例
real	32位浮点数，遵循IEEE 754二进制浮点数运算标准	2.71828
double	64位浮点数，遵循IEEE 754二进制浮点数运算标准	2.71828
decimal	固定精度小数	123456.7890

字符串类型

类型	描述	示例
varchar、varchar(n)	可变长度字符串。字符长度为m(m < n)，则分配m个字符	“hello world”
char、char(n)	固定长度字符串。总是分配n个字符，不管字符长度是多少。char表示char(1)	“hello world”

当字符串cast为char(n)，不足的字符用空格填充，多的字符被截断
当插入字符串到类型为char(n)的列，不足的字符用空格填充，多了就报错
当插入字符串到类型为varchar(n)的列，多了就报错

trino> 
trino> select length(cast('hello world' as char(100)));
 _col0 
-------
   100 
(1 row)

Query 20220108_132257_00002_b26ia, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.29 [0 rows, 0B] [0 rows/s, 0B/s]

trino>

下面的应该是一个bug，结果应该为false

trino>
trino> select cast('hello world' as char(14)) = cast('hello world' as char(15));
 _col0 
-------
 true  
(1 row)

Query 20220108_133452_00040_b26ia, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.23 [0 rows, 0B] [0 rows/s, 0B/s]

trino>

3.2 集合数据类型

类型	示例
array	array[‘apples’, ‘oranges’, ‘pears’]
map	map(array[‘a’, ‘b’, ‘c’], array[1, 2, 3])
json	{‘a’:1, ‘b’:2, ‘c’:3}
row	row(1, 2, 3)

3.3 日期时间数据类型

类型	描述	示例
date	包含年、月、日的日期	2022-01-10
time	包含时、分、秒、毫秒的时间, 时区可选	16:26:08.123 +08:00
timestamp	包含日期和时间, 时区可选	2022-01-10 16:26:08.123 Asia/Shanghai
interval year to month	间隔时间跨度为年、月	interval ‘1-2’ year to month
interval day to second	间隔时间跨度为天、时、分、秒、毫秒	interval ‘5’ day to second

查询示例如下：

trino> 
trino> select timestamp '2022-01-10 16:26:08.123 Asia/Shanghai';
                 _col0                 
---------------------------------------
 2022-01-10 16:26:08.123 Asia/Shanghai 
(1 row)

Query 20220109_020002_00037_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.51 [0 rows, 0B] [0 rows/s, 0B/s]

trino> 
trino> 
trino> select to_unixtime(timestamp '2022-01-10 16:26:08.123 Asia/Shanghai');
      _col0       
------------------
 1.641803168123E9 
(1 row)

Query 20220209_233903_00154_jx84g, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.25 [0 rows, 0B] [0 rows/s, 0B/s]

trino>
trino> select interval '2-3' year to month;
 _col0 
-------
 2-3   
(1 row)

Query 20220109_021402_00055_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.74 [0 rows, 0B] [0 rows/s, 0B/s]

trino> 
trino> select interval '2' year;
 _col0 
-------
 2-0   
(1 row)

Query 20220109_021900_00057_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.53 [0 rows, 0B] [0 rows/s, 0B/s]

trino>
trino> select interval '3' month;
 _col0 
-------
 0-3   
(1 row)

Query 20220109_021911_00058_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.33 [0 rows, 0B] [0 rows/s, 0B/s]

trino>
trino> select interval '2 08:06:18.888' day to second;
     _col0      
----------------
 2 08:06:18.888 
(1 row)

Query 20220109_021501_00056_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.39 [0 rows, 0B] [0 rows/s, 0B/s]

trino> 
trino> select interval '2' day;
     _col0      
----------------
 2 00:00:00.000 
(1 row)

Query 20220109_021928_00059_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.45 [0 rows, 0B] [0 rows/s, 0B/s]

trino>
trino> select interval '08' hour;
     _col0      
----------------
 0 08:00:00.000 
(1 row)

Query 20220109_021946_00060_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.34 [0 rows, 0B] [0 rows/s, 0B/s]

trino>
trino> select interval '06' minute;
     _col0      
----------------
 0 00:06:00.000 
(1 row)

Query 20220109_022007_00061_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.33 [0 rows, 0B] [0 rows/s, 0B/s]

trino>
trino> select interval '18.888' second;
     _col0      
----------------
 0 00:00:18.888 
(1 row)

Query 20220109_022019_00062_jiymz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.31 [0 rows, 0B] [0 rows/s, 0B/s]

trino>

Bulut0907

关注

0
点赞
踩
3

收藏

觉得还不错? 一键收藏
0
评论
Presto/Trino的explain和system catalog的使用、各种数据类型(基础、集合、日期时间)

目录1. explain2. system catalog2.1 system.runtime数据库2.2 system.information_schema数据库3. 创建、更新、删除schema(未执行)4. 创建table4.1 表和列with子句的使用(未执行)4.2 复制现有的表结构(未执行)4.3 从查询结果中新建表(未执行)4.4 修改表和删除表(未执行)1. explain查看查询执行计划，简单例子是检查查询在语法上是否正确trino>trino> explain (ty
复制链接

扫一扫