PostgreSQL IoT,车联网 - 实时轨迹、行程实践 2 - (含index only scan类聚簇表效果)

标签

PostgreSQL , 实时轨迹 , IoT , 车联网 , GIS , 离散IO , 顺序IO , IO放大 , 聚合 , PGLZ压缩解压缩 , IN OUT函数消耗


背景

车联网场景,一个行程由若干个点组成,业务需要按行程查询,按。

但是问题来了:

一个行程包含多个点,那么一次查询意味着需要查询并返回多个点(多条记录)。

由于有多个设备(例如汽车),大概率可能同时活跃,同时上传点,因此以堆表存储的话,一个行程中的多条记录,实际上在数据库的表里面,存储在多个BLOCK里面。

如上分析,性能问题:IO放大。

如何优化?

1、行程记录,按行聚集存储。

类似cluster操作。

2、行程记录,合并到单条,聚集存储。

类似将一个行程多条记录聚合。

例子

下面分别测试几种优化方法带来的性能优化效果。

1、cluster

2、array 聚合带压缩

3、array 聚合不带压缩

4、jsonb 聚合带压缩

5、jsonb 聚合不带压缩

6、text 聚合带压缩

7、text 聚合不带压缩

8、类聚簇表

《PostgreSQL index include - 类聚簇表与应用(append only, IoT时空轨迹, 离散多行扫描与返回)》

1 原始状态

create unlogged table t_sensor(id serial8 primary key, sid int8, att text, crt_time timestamp);  
create index idx_t_sensor_1 on t_sensor (sid, crt_time);  
vi test.sql  
  
\set sid random(1,10000)  
insert into t_sensor(sid,att,crt_time) values (:sid, md5(random()::text), now());  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 50 -j 50 -t 2000000  

查看一个行程,需要访问11227个数据块。

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t_sensor where sid=1 order by crt_time;  
                                                                 QUERY PLAN                                                                   
--------------------------------------------------------------------------------------------------------------------------------------------  
 Index Scan using idx_t_sensor_1 on public.t_sensor  (cost=0.49..253.36 rows=11121 width=57) (actual time=0.020..47.591 rows=11189 loops=1)  
   Output: id, sid, att, crt_time  
   Index Cond: (t_sensor.sid = 1)  
   Buffers: shared hit=3406 read=7821  
 Planning Time: 0.092 ms  
 Execution Time: 48.303 ms  
(6 rows)  
  
postgres=# select 119*11227*8/1024.0;  
      ?column?        
--------------------  
 10437.601562500000  
(1 row)  

压测性能

vi test.sql  
  
\set sid random(1,10000)  
select * from t_sensor where sid=:sid order by crt_time ;  
  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 14307  
latency average = 470.175 ms  
latency stddev = 43.500 ms  
tps = 119.037842 (including connections establishing)  
tps = 119.098221 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set sid random(1,10000)  
       470.178  select * from t_sensor where sid=:sid order by crt_time ;  

2 cluster

按行程,时间索引聚集。一次查询返回一条聚合后的记录。

postgres=# cluster t_sensor USING idx_t_sensor_1 ;  

查询一个行程,扫描174个数据块。IO 骤降。

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t_sensor where sid=1 order by crt_time;  
                                                                QUERY PLAN                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------  
 Index Scan using idx_t_sensor_1 on public.t_sensor  (cost=0.49..253.39 rows=11149 width=57) (actual time=0.018..2.212 rows=11189 loops=1)  
   Output: id, sid, att, crt_time  
   Index Cond: (t_sensor.sid = 1)  
   Buffers: shared hit=174  
 Planning Time: 0.094 ms  
 Execution Time: 2.816 ms  
(6 rows)  

压测性能,见末尾。

3 array 带压缩

create unlogged table t_sensor_agg(sid int8 primary key, agg t_sensor[]);  
insert into t_sensor_agg select sid,array_agg(t_sensor order by crt_time) from t_sensor group by sid;  

瓶颈:array类型的INPUT OUTPUT接口,以及压缩解压缩接口

Samples: 3M of event 'cpu-clock', Event count (approx.): 559749813667                   
Overhead  Shared Object                           Symbol                                
  24.59%  postgres                                [.] array_out                         
  20.70%  postgres                                [.] record_out                        
   6.74%  postgres                                [.] pglz_decompress                   
   3.78%  libc-2.17.so                            [.] __memcpy_ssse3_back               
   2.72%  postgres                                [.] pg_ltostr_zeropad                 
   2.34%  [kernel]                                [k] run_timer_softirq                 
   2.23%  postgres                                [.] pg_lltoa                          
   2.13%  postgres                                [.] 0x000000000035c350                
   1.76%  postgres                                [.] heap_deform_tuple                 
   1.49%  libc-2.17.so                            [.] __strlen_sse2_pminub              
   1.36%  postgres                                [.] appendStringInfoChar              
   1.36%  [kernel]                                [k] copy_user_enhanced_fast_string    
   1.29%  postgres                                [.] 0x000000000035c36c                
   1.28%  postgres                                [.] 0x000000000035c362                
   1.17%  postgres                                [.] FunctionCall1Coll                 
   0.92%  postgres                                [.] hash_search_with_hash_value       
   0.86%  [kernel]                                [k] _raw_spin_unlock_irqrestore       
   0.84%  postgres                                [.] j2date                            
   0.82%  postgres                                [.] 0x000000000035c357                
   0.76%  postgres                                [.] palloc                            
   0.76%  postgres                                [.] lookup_type_cache                 
   0.67%  postgres                                [.] 0x000000000035c360                
   0.66%  postgres                                [.] timestamp2tm                      
   0.64%  [kernel]                                [k] rcu_process_callbacks             
   0.64%  [kernel]                                [k] __do_softirq                     
vi test.sql  
  
\set sid random(1,10000)  
select * from t_sensor_agg4 where sid=:sid ;  

4 array 不带压缩

create unlogged table t_sensor_agg1(sid int8 primary key, agg t_sensor[]);  
alter table t_sensor_agg1 alter column agg set storage external;  
insert into t_sensor_agg1 select sid,array_agg(t_sensor order by crt_time) from t_sensor group by sid;  

瓶颈:array类型的INPUT OUTPUT接口

Samples: 1M of event 'cpu-clock', Event count (approx.): 310127790569                  
Overhead  Shared Object                           Symbol                               
  26.06%  postgres                                [.] array_out                        
  21.44%  postgres                                [.] record_out                       
   4.20%  libc-2.17.so                            [.] __memcpy_ssse3_back              
   2.86%  [kernel]                                [k] run_timer_softirq                
   2.75%  postgres                                [.] pg_ltostr_zeropad                
   2.65%  postgres                                [.] heap_deform_tuple                
   2.28%  postgres                                [.] pg_lltoa                         
   2.14%  postgres                                [.] 0x000000000035c350               
   1.87%  [kernel]                                [k] copy_user_enhanced_fast_string   
   1.52%  libc-2.17.so                            [.] __strlen_sse2_pminub             
   1.47%  postgres                                [.] appendStringInfoChar             
   1.32%  postgres                                [.] 0x000000000035c36c               
   1.30%  postgres                                [.] 0x000000000035c362               
   1.20%  postgres                                [.] FunctionCall1Coll                
   1.11%  postgres                                [.] hash_search_with_hash_value      
   0.87%  postgres                                [.] j2date                           
   0.85%  postgres                                [.] 0x000000000035c357               
   0.81%  [kernel]                                [k] _raw_spin_unlock_irqrestore      
   0.76%  postgres                                [.] lookup_type_cache                
   0.75%  postgres                                [.] 0x000000000046d33b               
   0.74%  postgres                                [.] palloc                           
   0.72%  [kernel]                                [k] rcu_process_callbacks            
   0.68%  postgres                                [.] timestamp2tm                     
   0.68%  postgres                                [.] pfree                         

5 jsonb 带压缩

create unlogged table t_sensor_agg2(sid int8 primary key, agg jsonb);  
insert into t_sensor_agg2 select sid,jsonb_agg(t_sensor order by crt_time) from t_sensor group by sid;  

6 jsonb 不带压缩

create unlogged table t_sensor_agg3(sid int8 primary key, agg jsonb);  
alter table t_sensor_agg3 alter column agg set storage external;  
insert into t_sensor_agg3 select sid,jsonb_agg(t_sensor order by crt_time) from t_sensor group by sid;  

7 text 带压缩

create unlogged table t_sensor_agg4(sid int8 primary key, agg text);  
insert into t_sensor_agg4 select sid,string_agg(t_sensor::text, '|' order by crt_time) from t_sensor group by sid;  

8 text 不带压缩

create unlogged table t_sensor_agg5(sid int8 primary key, agg text);  
alter table t_sensor_agg5 alter column agg set storage external;  
insert into t_sensor_agg5 select sid,string_agg(t_sensor::text, '|' order by crt_time) from t_sensor group by sid;  

9 index only scan 类似聚集表效果

注意:除了本文提到的full index达到类似聚簇表效果,还可以使用index include,达到同样效果,而且索引更加有效。

《PostgreSQL index include - 类聚簇表与应用(append only, IoT时空轨迹, 离散多行扫描与返回)》

所有内容作为INDEX的KEY,类似聚集表的效果(相邻内容在同一个INDEX PAGE里面)。查询时走INDEX ONLY SCAN扫描方法,扫描的BLOCK最少。

注意:btree 索引内容不能超过1/3 PAGE (因为BTREE是双向链表,一个PAGE至少要有一条有效记录,所以有这个限制。)。

写入数据

create table t_sensor (id serial8 primary key, sid int8, att text, crt_time timestamp);  
  
create index idx_t_sensor_1 on t_sensor (sid, crt_time, att, id);  
  
  
vi test.sql  
\set sid random(1,10000)    
insert into t_sensor(sid,att,crt_time) values (:sid, md5(random()::text), now());    
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 50 -j 50 -t 2000000  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 50  
number of threads: 50  
number of transactions per client: 2000000  
number of transactions actually processed: 100000000/100000000  
latency average = 0.193 ms  
latency stddev = 0.461 ms  
tps = 257995.418591 (including connections establishing)  
tps = 258024.212148 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.001  \set sid random(1,10000)    
         0.192  insert into t_sensor(sid,att,crt_time) values (:sid, md5(random()::text), now());  

生成VM文件(autovacuum触发时会自动生成,但是为了立马看效果,手工执行一下。)

vacuum analyze t_sensor;  

INDEX ONLY SCAN, IO减少效果如下:

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from t_sensor where sid=2 order by crt_time;  
                                                                 QUERY PLAN                                                                    
---------------------------------------------------------------------------------------------------------------------------------------------  
 Index Only Scan using idx_t_sensor_1 on public.t_sensor  (cost=0.60..70.41 rows=9960 width=57) (actual time=0.019..2.109 rows=9978 loops=1)  
   Output: id, sid, att, crt_time  
   Index Cond: (t_sensor.sid = 2)  
   Heap Fetches: 0  
   Buffers: shared hit=235  
 Planning Time: 0.090 ms  
 Execution Time: 2.652 ms  
(7 rows)  

查询性能:

vi test.sql  
  
\set sid random(1,10000)  
select * from t_sensor where sid=:sid order by crt_time;  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 28 -j 28 -T 120  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 283638  
latency average = 11.844 ms  
latency stddev = 1.931 ms  
tps = 2363.410561 (including connections establishing)  
tps = 2363.913145 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set sid random(1,10000)    
        11.842  select * from t_sensor where sid=:sid order by crt_time;  

小结

目标数据分散在多个BLOCK中,引起IO放大的问题,通过聚集存储,或者通过聚合存储,可以很好的解决这个问题。

聚合后,另一个瓶颈则是聚合后的类型(array,jsonb,text)的IN OUT接口。

/表存储行程查询 qps
原始(IO 放大)8880 MB119
顺序(无IO 放大)8880 MB2057
index only scan(类似聚集表)(无IO 放大)8880 MB2363
聚合array(压缩)4523 MB2362
聚合array(不压缩)8714 MB2515
聚合json(压缩)5052 MB3102
聚合json(不压缩)13 GB3184
聚合text(压缩)4969 MB6057
聚合text(不压缩)7692 MB5997

从上面的测试,可以看到IN OUT函数接口的开销,text<jsonb<array(composite array)。

实际的优化例子,可参考末尾的几篇文章。例如:

1、按时间分区,旧的分区使用cluster,按行程整理数据,使用AB表切换,解决IO放大的问题。

2、异步聚合,将点数据准实时按行程ID,聚合到聚合后的行程表。

3、使用INDEX ONLY SCAN, 达到聚集表效果。对业务无任何侵入性。(例如按天分区,加全量(业务需要查询的字段)索引。),相当于两倍存储空间(一份在堆表,一份在索引中)。

参考

《PostgreSQL IoT,车联网 - 实时轨迹、行程实践》

《PostgreSQL AB表切换最佳实践 - 提高切换成功率,杜绝雪崩 - 珍藏级》

《PostgreSQL 时序最佳实践 - 证券交易系统数据库设计 - 阿里云RDS PostgreSQL最佳实践》

《PostgreSQL index include - 类聚簇表与应用(append only, IoT时空轨迹, 离散多行扫描与返回)》

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值