Filter | Postgresql(default) /ms | Postgresql(Index+view) /ms | ClickHouse (default) /ms | ClickHouse +table, index /ms | ClickHouse Cluster /ms |
---|---|---|---|---|---|
Taxonomy (default) | 22017 | 1921 | 343 | 150 | |
Taxonomy (ngrams) | 5745 | 136 | 217 | 91 | 122 |
Taxonomy Score (default) | 65642 | 14605 | 800 | 431 | |
Taxonomy node (ngrams) | 1387 | 367 | 433 | 95 | 307 |
Taxonomy (default)
Postgresql | ClickHouse | ClickHouse Cluster | |
---|---|---|---|
default | 22017 | 351 | 164 |
table index | 21903 | 347 | 172 |
table index+view | 20784 | 438 | 279 |
table index+view, index | 1921 | ❌ (view not support add index) | ❌ (view not support add index) |
table index + unnest table, index | 397 |
Taxonomy (ngrams)
Postgresql | ClickHouse | ClickHouse Cluster | |
---|---|---|---|
default | 5745 | 238 | 112 |
table index | 4200 | 249 | 119 |
table index+view | 3847 | 254 | 135 |
table index+view, index | 136 | ||
table index + unnest table, index | 101 |
Taxonomy node ngram
Postgresql | ClickHouse | ClickHouse Cluster | |
---|---|---|---|
default | 1387 | 440 | |
table index | 337 | 519 | |
table index+view | ### blocking | ||
table index+view, index | 367 | ||
table index + unnest table, index |
Taxonomy score default
Postgresql | ClickHouse | ClickHouse Cluster | |
---|---|---|---|
default | 65642 | 812 | 473 |
table index | 875 | 489 | |
table index+view | 60970 | 1007 | |
table index+view, index | 14605 | ||
table index + unnest table, index |
default
EXPLAIN
SELECT
col0 AS diffcol0,
count(*) AS count
FROM stream_616d2986bd17682e66aedbb3 AS tab0,
(
SELECT
tab0._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS col0
FROM stream_616d2986bd17682e66aedbb3 AS tab0
) AS tab1
WHERE tab1._id = tab0._id
GROUP BY diffcol0
ORDER BY count DESC NULLS LAST
LIMIT 0, 50
Query id: edb8e6a4-d562-4513-bcc0-4cc07013b454
┌─explain──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Limit (preliminary LIMIT (without OFFSET)) │
│ Sorting (Sorting for ORDER BY) │
│ Expression (Before ORDER BY) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression (Before JOIN) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
14 rows in set. Elapsed: 0.008 sec.
50 rows in set. Elapsed: 0.351 sec. Processed 1.79 million rows, 171.82 MB (5.11 million rows/s., 489.56 MB/s.)
docs: https://clickhouse.com/docs
Clickhouse: 4核16G
Postgres:
Clickhouse Cluster: 4Core 16G * 3
create ngrams table, index
create table stream_ngrams_index (_id Int32, vw_unnested Text, INDEX f_n61_af35_ngrams$textual_string vw_unnested TYPE bloom_filter GRANULARITY 3) ENGINE=MergeTree order by vw_unnested
taxonomy (default) Use arrayJoin function:
PGSQL: 22017.586 ms → 1921.623 ms
ClickHouse: 343 ms
ClickHouse: Cluster: 150ms
EXPLAIN SELECT
col0 AS diffcol0,
count(*) AS count
FROM stream_616d2986bd17682e66aedbb3 AS tab0,
(
SELECT
tab0._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS col0
FROM stream_616d2986bd17682e66aedbb3 AS tab0
) AS tab1
WHERE tab1._id = tab0._id
GROUP BY diffcol0
ORDER BY count DESC NULLS LAST
LIMIT 0, 50
┌─explain─────────────────────────────────────────────────────┐
│ (Expression) │
│ ExpressionTransform │
│ (Limit) │
│ Limit │
│ (Sorting) │
│ MergeSortingTransform │
│ LimitsCheckingTransform │
│ PartialSortingTransform │
│ (Expression) │
│ ExpressionTransform │
│ (Aggregating) │
│ Resize 4 → 1 │
│ AggregatingTransform × 4 │
│ StrictResize 4 → 4 │
│ (Expression) │
│ ExpressionTransform × 4 │
│ (Filter) │
│ FilterTransform × 4 │
│ (Join) │
│ JoiningTransform × 4 2 → 1 │
│ Resize 1 → 4 │
│ FillingRightJoinSide │
│ Resize 4 → 1 │
│ (Expression) │
│ ExpressionTransform × 4 │
│ (SettingQuotaAndLimits) │
│ (ReadFromMergeTree) │
│ MergeTreeThread × 4 0 → 1 │
│ (Expression) │
│ ExpressionTransform × 4 │
│ (SettingQuotaAndLimits) │
│ (ReadFromMergeTree) │
│ MergeTreeThread × 4 0 → 1 │
└─────────────────────────────────────────────────────────────┘
Explain Time:33 rows in set. Elapsed: 0.005 sec.
Execution Time: 50 rows in set. Elapsed: 0.343 sec. Processed 1.79 million rows, 171.82 MB (5.22 million rows/s., 500.44 MB/s.)
Cluster Execution: 50 rows in set. Elapsed: 0.150 sec. Processed 1.79 million rows, 171.82 MB (11.93 million rows/s., 1.14 GB/s.)
EXPLAIN
SELECT
col0 AS diffcol0,
count(*) AS count
FROM stream_617fc79a355b7f54684a9c8b AS tab0,
(
SELECT
tab0._id,
arrayJoin(`f_jq2_20d7_t_labels$textual`) AS col0
FROM stream_617fc79a355b7f54684a9c8b AS tab0
) AS tab1
WHERE tab1._id = tab0._id
GROUP BY diffcol0
ORDER BY count DESC NULLS LAST
LIMIT 0, 50
Query id: ac9a0578-cf17-4e5c-8e44-34320d454380
┌─explain──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Limit (preliminary LIMIT (without OFFSET)) │
│ Sorting (Sorting for ORDER BY) │
│ Expression (Before ORDER BY) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression (Before JOIN) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Explain Time: 14 rows in set. Elapsed: 0.003 sec.
Execution Time: 50 rows in set. Elapsed: 0.585 sec. Processed 5.33 million rows, 244.40 MB (9.10 million rows/s., 417.47 MB/s.)
Cluster Execution Time: 50 rows in set. Elapsed: 0.287 sec. Processed 5.33 million rows, 244.40 MB (18.58 million rows/s., 851.80 MB/s.)
taxonomy (ngrams) Use arrayJoin function
EXPLAIN
SELECT
col0 AS diffcol0,
count('*') AS count
FROM stream_616d2986bd17682e66aedbb3 AS tab0,
(
SELECT
tab0._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS col0
FROM stream_616d2986bd17682e66aedbb3 AS tab0
) AS tab1
WHERE hasAny(`f_n61_af35_ngrams$textual`, ['日期_新鲜', '值得_信赖']) AND (tab1._id = tab0._id)
GROUP BY diffcol0
ORDER BY diffcol0 ASC NULLS FIRST
LIMIT 0, 200
Query id: 56501396-937e-478d-a8e2-09635a2104dc
┌─explain──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Limit (preliminary LIMIT (without OFFSET)) │
│ Sorting (Sorting for ORDER BY) │
│ Expression (Before ORDER BY) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression (Before JOIN) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
EXPLAIN Time: 14 rows in set. Elapsed: 0.004 sec.
Execution Time: 125 rows in set. Elapsed: 0.217 sec. Processed 1.79 million rows, 204.71 MB (8.25 million rows/s., 941.35 MB/s.)
Cluster Execution Time: 50 rows in set. Elapsed: 0.122 sec. Processed 1.79 million rows, 204.71 MB (14.74 million rows/s., 1.68 GB/s.)
taxonomy (ngrams) Use arrayJoin function + table, index
EXPLAIN
SELECT
col0 AS diffcol0,
count('*') AS count
FROM stream_616d2986bd17682e66aedbb3_ngrams_index_1 AS tab0,
(
SELECT
tab0._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS col0
FROM stream_616d2986bd17682e66aedbb3_ngrams_index_1 AS tab0
) AS tab1
WHERE (tab1._id IN (
SELECT _id
FROM stream_ngrams_index
WHERE vw_unnested IN ['日期_新鲜', '值得_信赖']
)) AND (tab1._id = tab0._id)
GROUP BY diffcol0
ORDER BY diffcol0 ASC NULLS FIRST
LIMIT 0, 200
Query id: e54603a8-2370-417e-b038-d728b5051eea
┌─explain──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Limit (preliminary LIMIT (without OFFSET)) │
│ Sorting (Sorting for ORDER BY) │
│ Expression (Before ORDER BY) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression (Before JOIN) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
14 rows in set. Elapsed: 0.011 sec.
125 rows in set. Elapsed: 0.091 sec. Processed 1.79 million rows, 82.19 MB (19.61 million rows/s., 898.72 MB/s.)
taxonomy (ngrams) Use arrayJoin function+ View
EXPLAIN
SELECT
col0 AS diffcol0,
count('*') AS count
FROM stream_616d2986bd17682e66aedbb3 AS tab0,
(
SELECT
vw_view._id,
vw_view.vw_unnested AS col0
FROM `vw_s_dbb3_f_7d1_ff35_t_labels$textual_` AS vw_view
) AS tab1
WHERE hasAny(`f_n61_af35_ngrams$textual`, ['日期_新鲜', '值得_信赖']) AND (tab1._id = tab0._id)
GROUP BY diffcol0
ORDER BY diffcol0 ASC NULLS FIRST
LIMIT 0, 200
Query id: 5bf23233-02a3-4be5-b233-912badedf908
┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Limit (preliminary LIMIT (without OFFSET)) │
│ Sorting (Sorting for ORDER BY) │
│ Expression (Before ORDER BY) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression (Before JOIN) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ Expression ((Convert VIEW subquery result to VIEW table structure + (Materialize constants after VIEW subquery + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
16 rows in set. Elapsed: 0.010 sec.
125 rows in set. Elapsed: 0.227 sec. Processed 1.79 million rows, 204.71 MB (7.91 million rows/s., 903.07 MB/s.)
taxonomy score default
EXPLAIN
SELECT
col0 AS diffcol0,
colnest1 AS diffcolnest1,
count('*') AS col2
FROM stream_616d2986bd17682e66aedbb3 AS tab0,
(
SELECT
tab0._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS col0
FROM stream_616d2986bd17682e66aedbb3 AS tab0
) AS tab2,
(
SELECT
tab0._id,
arrayJoin(`f_8d1_ff35_t_scores$numerical`) AS colnest1
FROM stream_616d2986bd17682e66aedbb3 AS tab0
) AS tab3
WHERE (tab2._id = tab0._id) AND (tab3._id = tab0._id)
GROUP BY
diffcol0,
diffcolnest1
ORDER BY diffcol0 ASC NULLS FIRST
LIMIT 0, 200
Query id: 76bacd84-e295-492c-a9e1-390b98767dc3
┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Limit (preliminary LIMIT (without OFFSET)) │
│ Sorting (Sorting for ORDER BY) │
│ Expression (Before ORDER BY) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression ((Before JOIN + (Projection + Before ORDER BY))) │
│ Filter (WHERE) │
│ Join (JOIN) │
│ Expression (Before JOIN) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
│ Expression ((Joined actions + (Rename joined columns + (Projection + Before ORDER BY)))) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Explain Time: 20 rows in set. Elapsed: 0.008 sec.
Execution Time: 200 rows in set. Elapsed: 0.800 sec. Processed 2.69 million rows, 190.64 MB (3.36 million rows/s., 238.25 MB/s.)
Cluster Execution Time: 200 rows in set. Elapsed: 0.431 sec. Processed 2.09 million rows, 178.10 MB (4.85 million rows/s., 413.09 MB/s.)
taxonomy node (ngrams) Use arrayJoin function
EXPLAIN
SELECT count(*)
FROM stream_616d2986bd17682e66aedbb3 AS tab0
WHERE hasAny(`f_n61_af35_ngrams$textual`, ['日期_新鲜', '值得_信赖']) AND (_id IN (
SELECT al._id
FROM
(
SELECT
unwhe._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS oo,
`f_7d1_ff35_t_labels$textual` AS af
FROM stream_616d2986bd17682e66aedbb3 AS unwhe
) AS al
WHERE ((oo[1]) = '消费洞察话题') AND (_id = al._id) AND hasAll(arrayFlatten(af), ['消费洞察话题'])
))
Query id: fd6721e8-4cdd-47b1-bdcc-b05e874c949b
┌─explain─────────────────────────────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└─────────────────────────────────────────────────────────────────────────────────┘
Explain Time: 6 rows in set. Elapsed: 0.390 sec.
Execution Time: 1 rows in set. Elapsed: 0.433 sec. Processed 896.55 thousand rows, 36.47 MB (2.05 million rows/s., 83.46 MB/s.)
Cluster Execution Time: 1 rows in set. Elapsed: 0.307 sec. Processed 896.55 thousand rows, 36.47 MB (2.92 million rows/s., 118.84 MB/s.)
taxonomy node (ngrams) Use arrayJoin function + table , index
EXPLAIN
SELECT count(*)
FROM stream_616d2986bd17682e66aedbb3_ngrams_index_1 AS tab0
WHERE _id IN (
SELECT al._id
FROM
(
SELECT
unwhe._id,
arrayJoin(`f_7d1_ff35_t_labels$textual`) AS oo,
`f_7d1_ff35_t_labels$textual` AS af
FROM stream_616d2986bd17682e66aedbb3_ngrams_index_1 AS unwhe
) AS al
WHERE (_id IN (
SELECT _id
FROM stream_ngrams_index
WHERE vw_unnested IN ['日期_新鲜', '值得_信赖']
)) AND ((oo[1]) = '消费洞察话题') AND (_id = al._id) AND hasAll(arrayFlatten(af), ['消费洞察话题'])
)
Query id: 77bc34ab-d1e3-4ee9-9956-0410e88dbfa8
┌─explain─────────────────────────────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ Filter (WHERE) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromMergeTree │
└─────────────────────────────────────────────────────────────────────────────────┘
6 rows in set. Elapsed: 0.088 sec.
1 rows in set. Elapsed: 0.095 sec. Processed 896.55 thousand rows, 3.59 MB (9.46 million rows/s., 37.83 MB/s.)