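Materialized Views and Asynchronous Materialized Views in Doris

1. Materialized views

1. How materialized views differ from MySQL views

  • Materialized view: a dataset that is precomputed (from the SELECT statement in its definition) and stored in a special table inside Doris. Materialized views exist so that users can both analyze the raw detail data along any dimension and run fast analytical queries on fixed dimensions.
  • MySQL view: a view in MySQL stores only the definition of a query over one or more tables in the database.
    In short: a materialized view stores the actual result data, whereas a MySQL view stores only the query statement.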
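To make the contrast concrete, here is a minimal sketch. The sales_records table and every name in it are assumptions made up for illustration, and the exact syntax rules for materialized views may differ between Doris versions.

```sql
-- Hypothetical base table, for illustration only.
CREATE TABLE sales_records (
    record_id INT,
    seller_id INT,
    store_id  INT,
    sale_date DATE,
    sale_amt  BIGINT
)
DUPLICATE KEY(record_id)
DISTRIBUTED BY HASH(record_id) BUCKETS 1
PROPERTIES ("replication_num" = "1");

-- Plain (MySQL-style) view: only the SELECT text is saved,
-- and it is re-executed every time the view is read.
CREATE VIEW store_amt_view AS
SELECT store_id, SUM(sale_amt) AS total_amt
FROM sales_records
GROUP BY store_id;

-- Doris materialized view: the grouped result itself is computed and stored.
CREATE MATERIALIZED VIEW store_amt AS
SELECT store_id, SUM(sale_amt)
FROM sales_records
GROUP BY store_id;
```

Reading from store_amt_view costs as much as running the underlying query, while a query that matches store_amt can be served from the stored rows.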

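2. When should you use a materialized view?

  • Operations that are expensive to compute
  • Subqueries that are reused repeatedly

3. Materialized views vs. Rollup

Materialized views support aggregate functions and the Aggregate model.
A materialized view is an enhanced version of Rollup.

4. Requirements

The idea is to factor out what your queries repeat in common, because every materialized view consumes resources.

  • From your query statements, abstract the grouping and aggregation shared by multiple queries and use that as the materialized view definition.
  • You do not need to create a materialized view for every combination of dimensions.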
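As a sketch of this guideline, suppose two frequent queries run against a hypothetical sales_records table (the table, columns, and view name are assumptions, not from the Doris documentation). Rather than one materialized view per query, a single view at the finer grain may cover both.

```sql
-- Two frequent queries (hypothetical) over the same table.
-- Q1: total per store
SELECT store_id, SUM(sale_amt)
FROM sales_records
GROUP BY store_id;

-- Q2: total per store and seller
SELECT store_id, seller_id, SUM(sale_amt)
FROM sales_records
GROUP BY store_id, seller_id;

-- Both use SUM(sale_amt) and group on store_id, so one materialized view
-- at the (store_id, seller_id) grain is a candidate for both queries.
-- Q1 would then be answered by aggregating the stored rows further.
CREATE MATERIALIZED VIEW store_seller_amt AS
SELECT store_id, seller_id, SUM(sale_amt)
FROM sales_records
GROUP BY store_id, seller_id;
```

Whether the optimizer actually picks the view for a given query is worth confirming with EXPLAIN rather than assuming.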

5. Creation syntax

Materialized view

CREATE MATERIALIZED VIEW [IF NOT EXISTS] mv_name
[PARTITION BY (partition_column)]
[DISTRIBUTED BY HASH(distribution_column) [BUCKETS n]]
[PROPERTIES ("key" = "value", ...)]
AS SELECT ...
FROM base_table
[WHERE conditions]
[GROUP BY ...]
[HAVING ...];

Cancel (stops a materialized view build that has not finished yet)

CANCEL ALTER TABLE MATERIALIZED VIEW FROM db_name.table_name
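Before cancelling, it helps to see what state the build is in. The statements below are a small sketch of the related housekeeping commands for synchronous materialized views; db_name, table_name, and mv_name are placeholders.

```sql
-- Show the progress of materialized view build jobs in a database.
SHOW ALTER TABLE MATERIALIZED VIEW FROM db_name;

-- List the base table together with the materialized views built on it.
DESC table_name ALL;

-- Drop a materialized view that is no longer needed.
DROP MATERIALIZED VIEW IF EXISTS mv_name ON table_name;
```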

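2. Asynchronous materialized views

1. Introduction (this is the official wording; honestly, I don't fully understand it myself)

Doris asynchronous materialized views use an algorithm based on SPJG (SELECT-PROJECT-JOIN-GROUP-BY) pattern structure information to perform transparent rewriting.

2. Join rewrite

Join rewrite applies when the query and the materialized view use the same tables. In that case the optimizer attempts a transparent rewrite, either by joining the materialized view's input with the query or by placing the join in the outer layer of the query's WHERE clause.

This rewrite pattern supports multi-table joins and the inner join and left join types. Support for other join types is still being extended.

3. Examples from the official site

The materialized view definitions are based on the lineitem table.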

1. Join rewrite example

1. Materialized view creation statement
CREATE MATERIALIZED VIEW mv1
BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour
DISTRIBUTED BY RANDOM BUCKETS 12
PROPERTIES ('replication_num' = '1')
AS
SELECT t1.l_linenumber,
       o_custkey,
       o_orderdate
FROM (SELECT * FROM lineitem WHERE l_linenumber > 1) t1
         LEFT OUTER JOIN orders
                         ON l_orderkey = o_orderkey;
2. The SQL that defines the view
SELECT t1.l_linenumber,
       o_custkey,
       o_orderdate
FROM (SELECT * FROM lineitem WHERE l_linenumber > 1) t1
LEFT OUTER JOIN orders
ON l_orderkey = o_orderkey;
-- Note: the statement above and the one below are equivalent
SELECT t1.l_linenumber,
       o_custkey,
       o_orderdate
FROM  lineitem  t1
LEFT OUTER JOIN orders
ON l_orderkey = o_orderkey
WHERE l_linenumber > 1;
3. Query statement
SELECT l_linenumber,
       o_custkey
FROM lineitem
LEFT OUTER JOIN orders
ON l_orderkey = o_orderkey
WHERE l_linenumber > 1 and o_orderdate = '2023-12-31';
4. The query after join rewrite (done by the optimizer behind the scenes)
select 
       l_linenumber,
       o_custkey
from mv1
where o_orderdate = '2023-12-31';
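One way to check whether the rewrite really happens is to look at the plan. The sketch below assumes Doris 2.1 or later, where transparent rewriting is controlled by a session variable. The plan output lists which materialized views were matched and chosen, though its exact wording varies by version.

```sql
-- Make sure transparent rewriting is switched on for this session.
SET enable_materialized_view_rewrite = true;

-- Explain the original query; if the rewrite succeeds, the plan scans mv1
-- instead of joining lineitem and orders.
EXPLAIN
SELECT l_linenumber,
       o_custkey
FROM lineitem
LEFT OUTER JOIN orders
ON l_orderkey = o_orderkey
WHERE l_linenumber > 1 AND o_orderdate = '2023-12-31';
```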

2. JOIN derivation

1. Materialized view definition (let's call it rv2)
SELECT
    l_shipdate, l_suppkey, o_orderdate,
    sum(o_totalprice) AS sum_total,
    max(o_totalprice) AS max_total,
    min(o_totalprice) AS min_total,
    count(*) AS count_all,
    count(distinct CASE WHEN o_shippriority > 1 AND o_orderkey IN (1, 3) THEN o_custkey ELSE null END) AS bitmap_union_basic
FROM lineitem
LEFT OUTER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
GROUP BY
l_shipdate,
l_suppkey,
o_orderdate;
2. Query statement
SELECT
    l_shipdate, l_suppkey, o_orderdate,
    sum(o_totalprice) AS sum_total,
    max(o_totalprice) AS max_total,
    min(o_totalprice) AS min_total,
    count(*) AS count_all,
    count(distinct CASE WHEN o_shippriority > 1 AND o_orderkey IN (1, 3) THEN o_custkey ELSE null END) AS bitmap_union_basic
FROM lineitem
INNER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
WHERE o_orderdate = '2023-12-11' AND l_suppkey = 3
GROUP BY
l_shipdate,
l_suppkey,
o_orderdate;
3. The query after rewriting (conceptually)
-- rv2 already stores one row per (l_shipdate, l_suppkey, o_orderdate),
-- so the precomputed aggregate columns are read directly.
SELECT
    l_shipdate, l_suppkey, o_orderdate,
    sum_total,
    max_total,
    min_total,
    count_all,
    bitmap_union_basic
FROM rv2
WHERE o_orderdate = '2023-12-11' AND l_suppkey = 3;
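The query uses INNER JOIN while rv2 is built with LEFT OUTER JOIN, so it is worth spelling out why the view can still answer it. The sketch below is my own illustration, not taken from the official example: with the query's filter on o_orderdate, the two join types select the same rows, so the two counts should match.

```sql
-- Inner-join form, as written in the query.
SELECT count(*)
FROM lineitem
INNER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
WHERE o_orderdate = '2023-12-11' AND l_suppkey = 3;

-- Left-join form, as in the materialized view, with the same filter.
-- Unmatched lineitem rows carry o_orderdate = NULL, and NULL = '2023-12-11'
-- is never true, so the WHERE clause discards exactly those extra rows.
SELECT count(*)
FROM lineitem
LEFT OUTER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
WHERE o_orderdate = '2023-12-11' AND l_suppkey = 3;
```

In other words, the materialized view holds a superset of the rows the query needs, and the predicate on the orders side removes the difference.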

3. Aggregate rewrite

1. Materialized view definition (rv3)
SELECT
    l_shipdate, o_orderdate, l_partkey, l_suppkey,
    sum(o_totalprice) AS sum_total,
    max(o_totalprice) AS max_total,
    min(o_totalprice) AS min_total,
    count(*) AS count_all,
    bitmap_union(to_bitmap(CASE WHEN o_shippriority > 1 AND o_orderkey IN (1, 3) THEN o_custkey ELSE null END)) AS bitmap_union_basic
FROM lineitem
LEFT OUTER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
GROUP BY
l_shipdate,
o_orderdate,
l_partkey,
l_suppkey;
2. Query statement
SELECT
    l_shipdate, l_suppkey,
    sum(o_totalprice) AS sum_total,
    max(o_totalprice) AS max_total,
    min(o_totalprice) AS min_total,
    count(*) AS count_all,
    count(distinct CASE WHEN o_shippriority > 1 AND o_orderkey IN (1, 3) THEN o_custkey ELSE null END) AS bitmap_union_basic
FROM lineitem
LEFT OUTER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
WHERE o_orderdate = '2023-12-11' AND l_partkey = 3
GROUP BY
l_shipdate,
l_suppkey;
3. The query after rewriting (conceptually)
-- rv3 is grouped more finely than the query, so its rows are rolled up again.
SELECT
    l_shipdate, l_suppkey,
    sum(sum_total) AS sum_total,
    max(max_total) AS max_total,
    min(min_total) AS min_total,
    sum(count_all) AS count_all,
    bitmap_union_count(bitmap_union_basic) AS bitmap_union_basic
FROM rv3
WHERE o_orderdate = '2023-12-11' AND l_partkey = 3
GROUP BY
l_shipdate,
l_suppkey;
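The rv3 definition earlier in this section is only the SELECT part. As a sketch, it could be wrapped into a complete statement like the one below. The build, refresh, and distribution options are simply copied from mv1 as an assumption; nothing in the official example prescribes them for rv3.

```sql
-- Options below are borrowed from the mv1 example, not from an official rv3 example.
CREATE MATERIALIZED VIEW rv3
BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 1 hour
DISTRIBUTED BY RANDOM BUCKETS 12
PROPERTIES ('replication_num' = '1')
AS
SELECT
    l_shipdate, o_orderdate, l_partkey, l_suppkey,
    sum(o_totalprice) AS sum_total,
    max(o_totalprice) AS max_total,
    min(o_totalprice) AS min_total,
    count(*) AS count_all,
    bitmap_union(to_bitmap(CASE WHEN o_shippriority > 1 AND o_orderkey IN (1, 3) THEN o_custkey ELSE null END)) AS bitmap_union_basic
FROM lineitem
LEFT OUTER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
GROUP BY
l_shipdate,
o_orderdate,
l_partkey,
l_suppkey;
```

Storing the distinct-count input as a bitmap instead of a finished count is what lets the value be merged across groups when a query groups more coarsely than the view.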

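4. Summary

Queries are rewritten automatically, the rewrite can still reuse a materialized view when a query touches only some of the joined columns, and refresh is asynchronous. As I understand it, asynchronous refresh resembles MySQL's read/write mechanism: changes are recorded in a log and the updates are then applied from that log. The refresh runs on a background thread, so the statement returns without waiting for the work to finish.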
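To see the asynchronous behaviour for yourself, the statements below are a brief sketch of how a refresh can be triggered and observed, assuming Doris 2.1 or later and the mv1 view from the join rewrite example. The refresh statement only submits a task and returns; the work itself shows up in the task list.

```sql
-- Trigger a refresh by hand. AUTO lets Doris decide what needs refreshing;
-- COMPLETE forces a full recomputation.
REFRESH MATERIALIZED VIEW mv1 AUTO;
REFRESH MATERIALIZED VIEW mv1 COMPLETE;

-- Inspect the refresh jobs and the individual task runs.
SELECT * FROM jobs("type" = "mv");
SELECT * FROM tasks("type" = "mv");

-- Suspend and later resume the scheduled refresh.
PAUSE MATERIALIZED VIEW JOB ON mv1;
RESUME MATERIALIZED VIEW JOB ON mv1;
```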
