paimon表读优化-Read-optimized Table

概述

paimon 版本 : 0.7

测试目标: 类似 hudi ro

实践

文档

Read-optimized Table

在这里插入图片描述

测试

0: jdbc:hive2://10.32.36.142:10009/> select  * from `trace_log_refdes_hive_ro$ro` limit 10;

24/02/28 14:24:33 INFO ExecuteStatement: Execute in full collect mode
24/02/28 14:24:33 INFO V2ScanRelationPushDown: 
Output: pcbid#139, rid#140, refdes#141, bm_circuit_no#142, timestamp#143, pickupstatus#144, serial_number#145, flag#146, kitid#147, id#148, createdate#149, etl#150, opt1#151, opt2#152, opt3#153, opt4#154, opt5#155, nozzleid#156, laneno#157, componentbarcode#158, pn#159, lotcode#160, datecode#161, verdor#162, workorder#163, dt#164
         

2024-02-28 14:24:34.138 INFO KyuubiSessionManager-exec-pool: Thread-1643 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[defb80f5-4309-449d-b9a6-606fccf3fde8]: RUNNING_STATE -> FINISHED_STATE, time taken: 0.417 seconds
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
|      pcbid      |             xxx              | yyyyff  | zzzzzxxxxxxxx  |      timestamp       | gggggggfffff  |   ddddddddddddd    | cccc  | aaaaa  |                id                 |      createdate      | etl  | opt1  | opt2  | opt3  | opt4  | opt5  | nozzleid  | laneno  | componentbarcode  |  pn   | xyzzzzx  | xzzzzzzz  | txxxxx  | zxxxxxxxx  |     dt      |
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
| E23MPM42203175  | 514S00292-11420240109000060  | J0200   | 5              | 2024-02-23 18:16:42  | 0             | DLC4084004RPQVLAG  | 0     | NXT    | 11C0928D5A26E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240109000060  | J0200   | 4              | 2024-02-23 18:16:42  | 0             | DLC4084004SPQVLAF  | 0     | NXT    | 11C0928D5A27E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240117000057  | J0200   | 7              | 2024-02-23 18:16:42  | 0             | DLC4084004PPQVLAJ  | 0     | NXT    | 11C0928D5A28E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240117000057  | J0200   | 9              | 2024-02-23 18:16:42  | 0             | DLC4084004MPQVLAL  | 0     | NXT    | 11C0928D5A29E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240117000056  | J0200   | 12             | 2024-02-23 18:16:42  | 0             | DLC4084004VPQVLAC  | 0     | NXT    | 11C0928D5A2CE048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203176  | 514S00292-11420240117000056  | J0200   | 1              | 2024-02-23 18:16:42  | 0             | DLC4084005NPQVLAG  | 0     | NXT    | 11C0928D5A31E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203176  | 514S00292-11420240117000059  | J0200   | 4              | 2024-02-23 18:16:42  | 0             | DLC4084005GPQVLAN  | 0     | NXT    | 11C0928D5A34E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203176  | 514S00292-11420240116000073  | J0200   | 10             | 2024-02-23 18:16:42  | 0             | DLC4084005MPQVLAH  | 0     | NXT    | 11C0928D5A3AE048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42201540  | 117S0158-A420240115001804    | R0514   | 7              | 2024-02-23 18:16:42  | 0             | DLC40860DTVPQVLAW  | 0     | NXT    | 11C0928D5A59E048E063AA2C200ABEF3  | 2024-02-23 18:14:51  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42201540  | 117S0158-A420240115001804    | R0401   | 1              | 2024-02-23 18:16:42  | 0             | DLC40860DU4PQVLAJ  | 0     | NXT    | 11C0928D5A5DE048E063AA2C200ABEF3  | 2024-02-23 18:14:51  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
10 rows selected (0.435 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select * from `trace_log_refdes_hive_ro$ro` where id='11C0928D5A2CE048E063AA2C200ABEF3';
2024-02-28 14:25:14.794 INFO KyuubiSessionManager-exec-pool: Thread-1646 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[7cf0bdca-b375-4287-8f3b-64968d09b69c]: PENDING_STATE -> RUNNING_STATE, statement:
select * from `trace_log_refdes_hive_ro$ro` where id='11C0928D5A2CE048E063AA2C200ABEF3'

24/02/28 14:25:14 INFO V2ScanRelationPushDown: 
Pushing operators to trace_log_refdes_hive_ro$ro
Pushed Filters: IsNotNull(id), EqualTo(id,11C0928D5A2CE048E063AA2C200ABEF3)
Post-Scan Filters: isnotnull(id#253),(id#253 = 11C0928D5A2CE048E063AA2C200ABEF3)
         
24/02/28 14:25:14 INFO V2ScanRelationPushDown: 
Output: pcbid#244, rid#245, refdes#246, bm_circuit_no#247, timestamp#248, pickupstatus#249, serial_number#250, flag#251, kitid#252, id#253, createdate#254, etl#255, opt1#256, opt2#257, opt3#258, opt4#259, opt5#260, nozzleid#261, laneno#262, componentbarcode#263, pn#264, lotcode#265, datecode#266, verdor#267, workorder#268, dt#269
         

2024-02-28 14:25:16.622 INFO KyuubiSessionManager-exec-pool: Thread-1646 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[7cf0bdca-b375-4287-8f3b-64968d09b69c]: RUNNING_STATE -> FINISHED_STATE, time taken: 1.828 seconds
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
|      pcbid      |             xxx              | yyyyff  | zzzzzxxxxxxxx  |      timestamp       | gggggggfffff  |   ddddddddddddd    | cccc  | aaaaa  |                id                 |      createdate      | etl  | opt1  | opt2  | opt3  | opt4  | opt5  | nozzleid  | laneno  | componentbarcode  |  pn   | xyzzzzx  | xzzzzzzz  | txxxxx  | zxxxxxxxx  |     dt      |
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
| E23MPM42203175  | 514S00292-11420240117000056  | J0200   | 12             | 2024-02-23 18:16:42  | 0             | DLC4084004VPQVLAC  | 0     | NXT    | 11C0928D5A2CE048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
1 row selected (1.839 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select count(*) from  `trace_log_refdes_hive_ro`
. . . . . . . . . . . . . . . . . .> ;
2024-02-28 14:30:14.554 INFO KyuubiSessionManager-exec-pool: Thread-1649 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[fbf94a7a-ac81-4e08-9316-63179fab1658]: PENDING_STATE -> RUNNING_STATE, statement:
select count(*) from  `trace_log_refdes_hive_ro`
24/02/28 14:30:14 INFO ExecuteStatement: Processing root's query[fbf94a7a-ac81-4e08-9316-63179fab1658]: PENDING_STATE -> RUNNING_STATE, statement:
select count(*) from  `trace_log_refdes_hive_ro`
24/02/28 14:30:14 INFO ExecuteStatement: 
           Spark application name: kyuubi_USER_SPARK_SQL_root_default_6fdc3b8a-61da-4d54-80bc-74a358c64f6e
                 application ID: application_1708505130791_0031
                 application web UI: http://hadoop01:8088/proxy/application_1708505130791_0031
                 master: yarn
                 deploy mode: cluster
                 version: 3.4.2
           Start time: 2024-02-28T14:23:25.247
           User: root
24/02/28 14:30:14 INFO ExecuteStatement: Execute in full collect mode
24/02/28 14:30:14 INFO V2ScanRelationPushDown: 
Output: 
         

24/02/28 14:30:36 INFO AdaptiveSparkPlanExec: Final plan:
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#376L])
+- ShuffleQueryStage 0
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=87]
      +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#379L])
         +- *(1) Project
            +- BatchScan trace_log_refdes_hive_ro[] PaimonScan: [trace_log_refdes_hive_ro] RuntimeFilters: []

24/02/28 14:30:36 INFO CodeGenerator: Code generated in 6.663188 ms
24/02/28 14:30:36 INFO ExecuteStatement: Processing root's query[fbf94a7a-ac81-4e08-9316-63179fab1658]: RUNNING_STATE -> FINISHED_STATE, time taken: 21.644 seconds
2024-02-28 14:30:36.200 INFO KyuubiSessionManager-exec-pool: Thread-1649 org.apache.kyuubi.operation.ExecuteStatement: Query[fbf94a7a-ac81-4e08-9316-63179fab1658] in FINISHED_STATE
2024-02-28 14:30:36.200 INFO KyuubiSessionManager-exec-pool: Thread-1649 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[fbf94a7a-ac81-4e08-9316-63179fab1658]: RUNNING_STATE -> FINISHED_STATE, time taken: 21.646 seconds
+-----------+
| count(1)  |
+-----------+
| 77134590  |
+-----------+
1 row selected (21.667 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select count(*) from  `trace_log_refdes_hive_ro$ro`
. . . . . . . . . . . . . . . . . .> ;

         
24/02/28 14:30:45 INFO AdaptiveSparkPlanExec: Final plan:
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#409L])
+- ShuffleQueryStage 0
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=122]
      +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#412L])
         +- *(1) Project
            +- BatchScan trace_log_refdes_hive_ro$ro[] PaimonScan: [trace_log_refdes_hive_ro$ro] RuntimeFilters: []

24/02/28 14:30:45 INFO SQLOperationListener: Query [f50c8cb8-165a-4f9b-9c48-8a0149208754]: Job 7 succeeded, 0 active jobs running
+-----------+
| count(1)  |
+-----------+
| 73545958  |
+-----------+
1 row selected (4.424 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select count(*) from  `trace_log_refdes_hive_ro$ro`;

24/02/28 14:31:07 INFO StatsReportListener: task runtime:(count: 1, mean: 31.000000, stdev: 0.000000, max: 31.000000, min: 31.000000)
24/02/28 14:31:07 INFO DAGScheduler: Job 9 finished: collect at ExecuteStatement.scala:72, took 0.039294 s
24/02/28 14:31:07 INFO AdaptiveSparkPlanExec: Final plan:
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#442L])
+- ShuffleQueryStage 0
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=157]
      +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#445L])
         +- *(1) Project
            +- BatchScan trace_log_refdes_hive_ro$ro[] PaimonScan: [trace_log_refdes_hive_ro$ro] RuntimeFilters: []

defe-4000-bbb1-bcc3c08fec7f]: RUNNING_STATE -> FINISHED_STATE, time taken: 4.045 seconds
+-----------+
| count(1)  |
+-----------+
| 73545958  |
+-----------+
1 row selected (4.061 seconds)
0: jdbc:hive2://10.32.36.142:10009/> 

由上可以看出,在涉及需要数据合并时,性能提升巨大。

配合 full-compaction.delta-commits 控制延迟时间。

结束

paimon表读优化-Read-optimized Table 至此结束。

  • 10
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

流月up

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值