目录
结论摘要
小数据量(线程10万行),无论单多线程,有无batch,PG和TSDB性能相仿,PG略强。
增加一条索引,两者插入性能都有所下降
WAL增长约为插入数据的2倍,但是WAL有上限。
大数据量(2000万、1亿行),BATCHSIZE=1000,PG插入性能依旧强于TSDB。
CPU占用,TSDB占用略高,以8线程batch1000为例(TSDB:80%, PG: %60)
总体上单独PG 性能和资源消耗都优于TSDB。
下一步需要找到在何种场景下TSDB插入性能会优于PG。
测试环境
Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
32G memory
SSD harddisk.
postgresql.conf
max_connections = 256
shared_buffers = 3GB # min 128kB
work_mem = 16MB # min 64kB
maintenance_work_mem = 256MB # min 1MB
wal_buffers = 4MB
checkpoint_timeout = 10min # range 30s-1d
max_wal_size = 2GB
checkpoint_completion_target = 0.9
---------------------
作者:Kun_Tsai
来源:CSDN
原文:https://blog.csdn.net/jacicson1987/article/details/82988556?utm_source=copy
版权声明:本文为博主原创文章,转载请附上博文链接!
数据构造
timestamp 每行递增1s
id 每行递增1
10个 integer 随机数 + 9个随机长度字符串
CASE 1 单TIME索引
timescaledb 不设置索引(默认time索引)
PG 设置time索引
ts1=# \d+ ckts1
Table "public.ckts1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
id | integer | | | | plain | |
col2 | integer | | | | plain | |
col3 | integer | | | | plain | |
col4 | integer | | | | plain | |
col5 | integer | | | | plain | |
col6 | integer | | | | plain | |
col7 | integer | | | | plain | |
col8 | integer | | | | plain | |
col9 | integer | | | | plain | |
col10 | integer | | | | plain | |
col11 | integer | | | | plain | |
col12 | character varying(30) | | | | extended | |
col13 | character varying(30) | | | | extended | |
col14 | character varying(30) | | | | extended | |
col15 | character varying(30) | | | | extended | |
col16 | character varying(30) | | | | extended | |
col17 | character varying(30) | | | | extended | |
col18 | character varying(30) | | | | extended | |
col19 | character varying(30) | | | | extended | |
col20 | character varying(30) | | | | extended | |
Indexes:
"ckts1_time_idx" btree ("time" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON ckts1 FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
postgres=# \d+ cts1
Table "public.cts1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
id | integer | | | | plain | |
col2 | integer | | | | plain | |
col3 | integer | | | | plain | |
col4 | integer | | | | plain | |
col5 | integer | | | | plain | |
col6 | integer | | | | plain | |
col7 | integer | | | | plain | |
col8 | integer | | | | plain | |
col9 | integer | | | | plain | |
col10 | integer | | | | plain | |
col11 | integer | | | | plain | |
col12 | character varying(30) | | | | extended | |
col13 | character varying(30) | | | | extended | |
col14 | character varying(30) | | | | extended | |
col15 | character varying(30) | | | | extended | |
col16 | character varying(30) | | | | extended | |
col17 | character varying(30) | | | | extended | |
col18 | character varying(30) | | | | extended | |
col19 | character varying(30) | | | | extended | |
col20 | character varying(30) | | | | extended | |
Indexes:
"cts1_time_index" btree ("time")
单行写入
每线程10W行数据。每行约180bytes. 一个线程写入约18MB。
看起来两者性能相差不大,PG稍强5%~10%, 之前此场景下influxdb的测试数据也放在一起做个对比
线程平均速率(行/s) | 单线程 | 2线程 | 4线程 | 8线程 | 16线程 | 32线程 |
timescaledb | 1075 | 1008 | 1026 | 1045 | 923 | 654 |
PG | 1092 | 1020 | 1070 | 1120 | 982 | 737 |
infuxdb | 422 | 378 | 398 | 370 | 341 | 305 |
总速率(行/s) | 单线程 | 2线程 | 4线程 | 8线程 | 16线程 | 32线程 |
timescaledb | 1075 | 2011 | 4000 | 8166 | 14358 | 20337 |
PG | 1092 | 2037 | 4037 | 8734 | 15087 | 22445 |
infuxdb | 422 | 752 | 1588 | 2947 | 5444 | 9732 |
WAL文件大小两者基本一致,都是大约数据量的 2倍。WAL文件增加
线程数 | 字节数 | PG_WAL_SIZE | TS_WAL_SIZE |
1 | 18697562 | 37428272 | 37464824 |
2 | 37379223 | 74547144 | 75026672 |
4 | 74785548 | 149803536 | 149525488 |
8 | 149548561 | 298455240 | 298400040 |
16 | 299134911 | 596227840 | 596302792 |
32 | 598385115 | 1191610952 | 1194038480 |
BATCH写入
单线程行数设为1000万, BATCHSIZE 设为1000
PG优于TIMESCALEDB。
每线程1000W行数据 | batchsize=1000 | |
线程平均速率(行/s) | 单线程 | 8线程 |
timescaledb | 26732 | 11916 |
PG | 39113 | 14972 |
总速率(行/s) | 单线程 | 8线程 |
timescaledb | 26732 | 95140 |
PG | 39113 | 119528 |
资源占用
TDSB 32线程, CPU平均占用 57%。 比PG高10%。
PG 32线程, CPU平均占用 46%.
CASE 2 增加一个索引
增加id为索引
ts1=# \d+ ckts1
Table "public.ckts1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
id | integer | | | | plain | |
col2 | integer | | | | plain | |
col3 | integer | | | | plain | |
col4 | integer | | | | plain | |
col5 | integer | | | | plain | |
col6 | integer | | | | plain | |
col7 | integer | | | | plain | |
col8 | integer | | | | plain | |
col9 | integer | | | | plain | |
col10 | integer | | | | plain | |
col11 | integer | | | | plain | |
col12 | character varying(30) | | | | extended | |
col13 | character varying(30) | | | | extended | |
col14 | character varying(30) | | | | extended | |
col15 | character varying(30) | | | | extended | |
col16 | character varying(30) | | | | extended | |
col17 | character varying(30) | | | | extended | |
col18 | character varying(30) | | | | extended | |
col19 | character varying(30) | | | | extended | |
col20 | character varying(30) | | | | extended | |
Indexes:
"ckts1_id_idx" btree (id)
"ckts1_time_idx" btree ("time" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON ckts1 FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
ts1=#
postgres=# \d+ cts1
Table "public.cts1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
id | integer | | | | plain | |
col2 | integer | | | | plain | |
col3 | integer | | | | plain | |
col4 | integer | | | | plain | |
col5 | integer | | | | plain | |
col6 | integer | | | | plain | |
col7 | integer | | | | plain | |
col8 | integer | | | | plain | |
col9 | integer | | | | plain | |
col10 | integer | | | | plain | |
col11 | integer | | | | plain | |
col12 | character varying(30) | | | | extended | |
col13 | character varying(30) | | | | extended | |
col14 | character varying(30) | | | | extended | |
col15 | character varying(30) | | | | extended | |
col16 | character varying(30) | | | | extended | |
col17 | character varying(30) | | | | extended | |
col18 | character varying(30) | | | | extended | |
col19 | character varying(30) | | | | extended | |
col20 | character varying(30) | | | | extended | |
Indexes:
"cst1_id_index" btree (id)
"cts1_time_index" btree ("time")
单行写入
10万行数据,单条写入
和没有添加索引时性能差不多,PG看起来性能下降更多一点。
总速率(行/s) | 单线程 |
timescaledb | 1051 |
PG | 1019 |
BATCH写入
每线程1000万行数据,batchsize设为1000, 8线程处理,13.9G数据
增加一个索引以后,PG和TSDB比之前性能都有所降低,PG写入仍旧比TSDB快。
每线程1000W行数据 | batchsize=1000 | |
线程平均速率(行/s) | 单线程 | 8线程 |
timescaledb | 26142 | 10591 |
PG | 36861 | 13037 |
总速率(行/s) | 单线程 | 8线程 |
timescaledb | 26142 | 84588 |
PG | 36861 | 103806 |
资源占用
8线程,batchsize=1000, TSDB CPU占用80%左右
8线程,batchsize=1000, PG CPU占用60%左右
CASE 3 大量数据
设置PG的表cts1索引time为DESC(倒序),测试结果与上面并无不同
postgres=# \d+ cts1
Table "public.cts1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
id | integer | | | | plain | |
col2 | integer | | | | plain | |
col3 | integer | | | | plain | |
col4 | integer | | | | plain | |
col5 | integer | | | | plain | |
col6 | integer | | | | plain | |
col7 | integer | | | | plain | |
col8 | integer | | | | plain | |
col9 | integer | | | | plain | |
col10 | integer | | | | plain | |
col11 | integer | | | | plain | |
col12 | character varying(30) | | | | extended | |
col13 | character varying(30) | | | | extended | |
col14 | character varying(30) | | | | extended | |
col15 | character varying(30) | | | | extended | |
col16 | character varying(30) | | | | extended | |
col17 | character varying(30) | | | | extended | |
col18 | character varying(30) | | | | extended | |
col19 | character varying(30) | | | | extended | |
col20 | character varying(30) | | | | extended | |
Indexes:
"cts1_id_idx" btree (id)
"cts1_time_idx" btree ("time" DESC)
ts1=# \d+ ckts1
Table "public.ckts1"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+--------------------------+-----------+----------+---------+----------+--------------+-------------
time | timestamp with time zone | | not null | | plain | |
id | integer | | | | plain | |
col2 | integer | | | | plain | |
col3 | integer | | | | plain | |
col4 | integer | | | | plain | |
col5 | integer | | | | plain | |
col6 | integer | | | | plain | |
col7 | integer | | | | plain | |
col8 | integer | | | | plain | |
col9 | integer | | | | plain | |
col10 | integer | | | | plain | |
col11 | integer | | | | plain | |
col12 | character varying(30) | | | | extended | |
col13 | character varying(30) | | | | extended | |
col14 | character varying(30) | | | | extended | |
col15 | character varying(30) | | | | extended | |
col16 | character varying(30) | | | | extended | |
col17 | character varying(30) | | | | extended | |
col18 | character varying(30) | | | | extended | |
col19 | character varying(30) | | | | extended | |
col20 | character varying(30) | | | | extended | |
Indexes:
"ckts1_id_idx" btree (id)
"ckts1_time_idx" btree ("time" DESC)
Triggers:
ts_insert_blocker BEFORE INSERT ON ckts1 FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
怀疑因为数据量不够大,导致TSDB写入性能优势没能体现
增大数据量到1亿行
batchsize = 1000,单线程写入
行数 | 字节数 | PG执行时间 | TSDB执行时间 | PG行速率 | TSDB行速率(行/s) | TSDB TRUNK数量 |
2000万 | 3.45GB | 565 s | 817 s | 35398 (行/s) | 24480 (行/s) | 30 |
1亿 | 17.4GB | 2782 s | 3992 s | 35945 (行/s) | 25050 (行/s) | 166 |
结果仍然是PG写入速度更快。
To be continued...