本文对比一下MonetDB 和 PostgreSQL在某些SQL下的性能.
(MonetDB 使用列存储, 支持单SQL使用多核, 线程模式, 支持数据压缩, 还有较多的OLAP相关的插件, 更适合OLAP的应用场景)
注意, PostgreSQL注重的是高并发, 而MonetDB注重数据分析, 所以两者没有什么可比性, 这里拿MonetDB的强项来VS PostgreSQL的软肋, 主要是给一些在使用PostgreSQL做数据分析并且遇到瓶颈的朋友一些启发, 换个产品试试, 不要使用一个产品的软肋(这里指PostgreSQL在统计方面目前略差, 未来9.4+版本出来单SQL可以用到多核再来谈统计效率)
PostgreSQL支持的列存储插件cstore_fdw性能, 如果要玩的话可以
参考 :
对于定长类型, 可以提升较高的性能.
测试环境
DELL R610
CPU 1.6G 8核
CentOS 6.5 x64
PostgreSQL 9.3.5 (主要参数 shared_buffer=1G, block_size=8KB, checkpoint_segment=32, wal_segsize=16MB, synchronous_commit=off, autovacuum=on)
MonetDB 11.17.21
块设备 OCZ RevoDrive3X2 240G
文件系统 ext4
内存 96GB
测试数据2.6亿
MonetDB 占用 1.9G. (含PK)
PostgreSQL 占用 22G.
(含PK)
插入性能对比, PostgreSQL比MonetDB几乎差了9倍性能.
插入条数 / 时间(毫秒) | MonetDB | PostgreSQL |
1024 | 6 | 11 |
2048 | 7 | 21 |
4096 | 10 | 42 |
8192 | 17 | 84 |
16384 | 29 | 166 |
32768 | 51 | 332 |
65536 | 96 | 661 |
131072 | 184 | 1361 |
262144 | 361 | 2760 |
524288 | 701 | 5550 |
1048576 | 1300 | 11221 |
2097152 | 2800 | 22581 |
4194304 | 5500 | 45018 |
8388608 | 11200 | 91126 |
16777216 | 22100 | 183895 |
33554432 | 45300 | 376365 |
67108864 | 97000 | 764031 |
134217728 | 178000 | 1533433 |
平均每秒插入条数 | 754000/s | 87000/s |
查询性能对比, MonetDB完胜, 完全不是一个数量级的差别.
当然PostgreSQL count(*)性能本来就不咋地, 更适合OLTP, 曾经用了大量的篇幅来写使用PostgreSQL实现准实时数据统计, 可参阅
SQL / 时间(毫秒) | MonetDB | PostgreSQL |
select count(*) from a; | 1 | 46484 |
select count(*) from a where id=1; | 1 | 1 |
select count(*) from a where id>1; | 2 | 68813 |
select count(*) from a where id<1; | 2 | 0.6 |
select count(*) from a where id<>1; | 3500 | 68710 |
select count(*) from a where id is null; | 752 | 0.5 |
select count(*) from a where id is not null; | 710 | 64771 |
select count(*) from a where info='2014-08-13 20:20:14.000000+08:00'; | 0.4 | 68006 |
select count(*) from a where info>'2014-08-13 20:20:14.000000+08:00'; | 0.8 | 91155 |
select count(*) from a where info<'2014-08-13 20:20:14.000000+08:00'; | 1.7 | 72517 |
select count(*) from a where info<>'2014-08-13 20:20:14.000000+08:00'; | 2800 | 86810 |
select count(*) from a where info is null; | 773 | 43593 |
select count(*) from a where info is not null; | 748 | 65161 |
select count(*) from (select info from a group by info having count(*) >1) as t; | 4700 | 144180 |
空间占用对比,
2.6亿数据(含索引), MonetDB 1.9G, PostgreSQL 22G.
当然这里和测试数据有关, 因为字符串大量重复了.
后期做一下宽 表的测试. 对比GreenPlum.
MonetDB还有很多强大的功能等待挖掘, 大家一起来玩吧.
详细数据如下 :
MonetDB
插入性能
空间占用大约1.9G.
通过函数storage可以查看详细大小. 这里包含了count和columnsize. info字段占用500mb左右, id字段占用1G左右.
查询测试
PostgreSQL
sql>create table a(id int auto_increment primary key, info varchar(32));
operation successful (10.112ms)
sql>insert into a (info) values (now());
1 affected rows, last generated key: 1 (3.274ms)
sql>select * from a;
+------+----------------------------------+
| id | info |
+======+==================================+
| 1 | 2014-08-13 20:18:53.000000+08:00 |
+------+----------------------------------+
1 tuple (2.448ms)
sql>insert into a (info) values (now());
1 affected rows, last generated key: 2 (3.023ms)
sql>insert into a (info) select now() from a;
2 affected row (6.033ms)
sql>insert into a (info) select now() from a;
4 affected row (3.798ms)
sql>insert into a (info) select now() from a;
8 affected row (2.893ms)
sql>insert into a (info) select now() from a;
16 affected row (2.700ms)
sql>insert into a (info) select now() from a;
32 affected row (5.064ms)
sql>insert into a (info) select now() from a;
64 affected row (4.054ms)
sql>insert into a (info) select now() from a;
128 affected row (4.138ms)
sql>insert into a (info) select now() from a;
256 affected row (4.304ms)
sql>insert into a (info) select now() from a;
512 affected row (2.563ms)
sql>insert into a (info) select now() from a;
1024 affected row (6.094ms)
sql>insert into a (info) select now() from a;
2048 affected row (6.956ms)
sql>insert into a (info) select now() from a;
4096 affected row (9.618ms)
sql>insert into a (info) select now() from a;
8192 affected row (17.475ms)
sql>insert into a (info) select now() from a;
16384 affected row (29.347ms)
sql>insert into a (info) select now() from a;
32768 affected row (50.941ms)
sql>insert into a (info) select now() from a;
65536 affected row (95.736ms)
sql>insert into a (info) select now() from a;
131072 affected row (183.726ms)
sql>insert into a (info) select now() from a;
262144 affected row (361.436ms)
sql>insert into a (info) select now() from a;
524288 affected row (701.202ms)
sql>insert into a (info) select now() from a;
1048576 affected row (1.3s)
sql>insert into a (info) select now() from a;
2097152 affected row (2.8s)
sql>insert into a (info) select now() from a;
4194304 affected row (5.5s)
sql>insert into a (info) select now() from a;
8388608 affected row (11.2s)
sql>insert into a (info) select now() from a;
16777216 affected row (22.1s)
sql>insert into a (info) select now() from a;
33554432 affected row (45.3s)
sql>insert into a (info) select now() from a;
67108864 affected row (1m 27s)
sql>insert into a (info) select now() from a;
134217728 affected row (2m 58s)
空间占用大约1.9G.
[root@150 ~]# cd /data02/mdb1/
[root@150 mdb1]# ll
total 16
-rw------- 1 root root 7196 Aug 13 21:08 merovingian.log
-rw-r--r-- 1 root root 6 Aug 13 20:08 merovingian.pid
drwx------ 5 root root 4096 Aug 13 20:09 test
[root@150 mdb1]# du -sh *
8.0K merovingian.log
4.0K merovingian.pid
1.9G test
[root@150 mdb1]# cd test/
[root@150 test]# ll
total 12
-rw------- 1 root root 0 Aug 13 20:09 58ec944a-5cff-dc51-4873-986923c64567
drwx------ 14 root root 4096 Aug 13 20:33 bat
drwx------ 2 root root 4096 Aug 13 20:09 box
drwx------ 3 root root 4096 Aug 13 20:09 sql_logs
[root@150 test]# du -sh *
0 58ec944a-5cff-dc51-4873-986923c64567
1.9G bat
4.0K box
12K sql_logs
[root@150 test]# cd bat
[root@150 bat]# du -sh *
492K 01
288K 02
92K 04
1.1G 05
513M 07
572K 10
4.0K 11
4.0K 12
65M 13
323M 14
772K 15
4.0K 3.head
4.0K 3.tail
12K 3.theap
4.0K 4.head
4.0K 4.tail
12K 4.theap
28K BACKUP
[root@150 bat]# ll
total 88
drwx------ 2 root root 4096 Aug 13 21:27 01
drwx------ 2 root root 4096 Aug 13 20:21 02
drwx------ 2 root root 4096 Aug 13 21:15 04
drwx------ 2 root root 4096 Aug 13 21:12 05
drwx------ 2 root root 4096 Aug 13 21:27 07
drwx------ 2 root root 4096 Aug 13 21:27 10
drwx------ 2 root root 4096 Aug 13 21:15 11
drwx------ 2 root root 4096 Aug 13 21:27 12
drwx------ 2 root root 4096 Aug 13 21:27 13
drwx------ 2 root root 4096 Aug 13 21:27 14
drwx------ 2 root root 4096 Aug 13 21:27 15
-rw------- 1 root root 8 Aug 13 20:09 3.head
-rw------- 1 root root 1 Aug 13 20:09 3.tail
-rw------- 1 root root 8216 Aug 13 20:09 3.theap
-rw------- 1 root root 8 Aug 13 20:09 4.head
-rw------- 1 root root 1 Aug 13 20:09 4.tail
-rw------- 1 root root 8343 Aug 13 20:09 4.theap
drwx------ 2 root root 4096 Aug 13 21:27 BACKUP
通过函数storage可以查看详细大小. 这里包含了count和columnsize. info字段占用500mb左右, id字段占用1G左右.
sql>select * from storage();
+-------+-------------------+----------------+-----------+----------+-----------+-------+------------+----------+---------+--------+
| schem | table | column | type | location | count | typew | columnsize | heapsize | indices | sorted |
: a : : : : : : idth : : : : :
+=======+===================+================+===========+==========+===========+=======+============+==========+=========+========+
| sys | a | id | int | 05/573 | 268435456 | 4 | 1073741824 | 0 | 0 | true |
| sys | a | info | varchar | 07/734 | 268435456 | 32 | 536870912 | 16384 | 0 | true |
| sys | a | a_id_pkey | oid | 05/567 | 0 | 8 | 0 | 0 | 0 | true |
查询测试
sql>select count(*) from a;
+-----------+
| L1 |
+===========+
| 268435456 |
+-----------+
1 tuple (0.979ms)
sql>select count(*) from a where id=1;
+------+
| L1 |
+======+
| 1 |
+------+
1 tuple (0.920ms)
sql>select count(*) from a where id>1;
+-----------+
| L1 |
+===========+
| 268435455 |
+-----------+
1 tuple (2.347ms)
sql>select count(*) from a where id<1;
+------+
| L1 |
+======+
| 0 |
+------+
1 tuple (2.270ms)
sql>select count(*) from a where id<>1;
+-----------+
| L1 |
+===========+
| 268435455 |
+-----------+
1 tuple (3.5s)
sql>select count(*) from a where id is null;
+------+
| L1 |
+======+
| 0 |
+------+
1 tuple (751.665ms)
sql>select count(*) from a where id is not null;
+-----------+
| L1 |
+===========+
| 268435456 |
+-----------+
1 tuple (710.477ms)
sql>select * from a limit 10;
+------+----------------------------------+
| id | info |
+======+==================================+
| 1 | 2014-08-13 20:18:53.000000+08:00 |
| 2 | 2014-08-13 20:18:59.000000+08:00 |
| 3 | 2014-08-13 20:20:06.000000+08:00 |
| 4 | 2014-08-13 20:20:06.000000+08:00 |
| 5 | 2014-08-13 20:20:14.000000+08:00 |
| 6 | 2014-08-13 20:20:14.000000+08:00 |
| 7 | 2014-08-13 20:20:14.000000+08:00 |
| 8 | 2014-08-13 20:20:14.000000+08:00 |
| 9 | 2014-08-13 20:20:14.000000+08:00 |
| 10 | 2014-08-13 20:20:14.000000+08:00 |
+------+----------------------------------+
10 tuples (1.581ms)
sql>select count(*) from a where info='2014-08-13 20:20:14.000000+08:00';
+------+
| L1 |
+======+
| 12 |
+------+
1 tuple (0.442ms)
sql>select count(*) from a where info>'2014-08-13 20:20:14.000000+08:00';
+-----------+
| L1 |
+===========+
| 268435440 |
+-----------+
1 tuple (0.844ms)
sql>select count(*) from a where info<'2014-08-13 20:20:14.000000+08:00';
+------+
| L1 |
+======+
| 4 |
+------+
1 tuple (1.734ms)
sql>select count(*) from a where info<>'2014-08-13 20:20:14.000000+08:00';
+-----------+
| L1 |
+===========+
| 268435444 |
+-----------+
1 tuple (2.8s)
sql>select count(*) from a where info is null;
+------+
| L1 |
+======+
| 0 |
+------+
1 tuple (772.987ms)
sql>select count(*) from a where info is not null;
+-----------+
| L1 |
+===========+
| 268435456 |
+-----------+
1 tuple (747.712ms)
sql>select count(*) from (select info from a group by info having count(*) >1) as t;
+------+
| L2 |
+======+
| 18 |
+------+
1 tuple (4.7s)
PostgreSQL
插入测试
查询测试
[参考]
postgres=# create table a(id serial primary key, info text);
CREATE TABLE
postgres=# insert into a(info) values (now());
INSERT 0 1
postgres=# insert into a(info) select now() from a;
INSERT 0 1
Time: 0.954 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 2
Time: 0.606 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 4
Time: 0.352 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 8
Time: 0.393 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 16
Time: 0.476 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 32
Time: 0.840 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 64
Time: 1.166 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 128
Time: 1.632 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 256
Time: 3.214 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 512
Time: 5.519 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 1024
Time: 10.924 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 2048
Time: 21.212 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 4096
Time: 41.930 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 8192
Time: 84.242 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 16384
Time: 165.705 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 32768
Time: 332.269 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 65536
Time: 661.579 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 131072
Time: 1361.281 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 262144
Time: 2760.690 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 524288
Time: 5550.096 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 1048576
Time: 11221.544 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 2097152
Time: 22581.858 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 4194304
Time: 45018.124 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 8388608
Time: 91126.619 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 16777216
Time: 183895.102 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 33554432
Time: 376365.578 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 67108864
Time: 764031.565 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 134217728
Time: 1533433.507 ms
postgres=# select pg_total_relation_size('a')/1024/1024/1024||'GB';
?column?
----------
22GB
(1 row)
Time: 1.027 ms
postgres=# select pg_relation_size('a')/1024/1024/1024||'GB';
?column?
----------
17GB
(1 row)
Time: 0.702 ms
查询测试
postgres=# select count(*) from a;
count
-----------
268435456
(1 row)
Time: 60818.321 ms
postgres=# select count(*) from a;
count
-----------
268435456
(1 row)
Time: 46484.165 ms
postgres=# select count(*) from a where id=1;
count
-------
1
(1 row)
Time: 0.996 ms
postgres=# select count(*) from a where id>1;
count
-----------
268435455
(1 row)
Time: 68812.883 ms
postgres=# select count(*) from a where id<1;
count
-------
0
(1 row)
Time: 0.647 ms
postgres=# select count(*) from a where id<>1;
count
-----------
268435455
(1 row)
Time: 68710.001 ms
postgres=# select count(*) from a where id is null;
count
-------
0
(1 row)
Time: 0.537 ms
postgres=# select count(*) from a where id is not null;
count
-----------
268435456
(1 row)
Time: 64771.858 ms
postgres=# select * from a limit 10;
id | info
----+-------------------------------
1 | 2014-08-13 20:37:52.165199+08
2 | 2014-08-13 20:38:06.945296+08
3 | 2014-08-13 20:38:07.965092+08
4 | 2014-08-13 20:38:07.965092+08
5 | 2014-08-13 20:38:08.502181+08
6 | 2014-08-13 20:38:08.502181+08
7 | 2014-08-13 20:38:08.502181+08
8 | 2014-08-13 20:38:08.502181+08
9 | 2014-08-13 20:38:08.966175+08
10 | 2014-08-13 20:38:08.966175+08
(10 rows)
Time: 0.629 ms
postgres=# select count(*) from a where info='2014-08-13 20:38:08.966175+08';
count
-------
8
(1 row)
Time: 68005.864 ms
postgres=# select count(*) from a where info>'2014-08-13 20:38:08.966175+08';
count
-----------
268435440
(1 row)
Time: 91154.757 ms
postgres=# select count(*) from a where info<'2014-08-13 20:38:08.966175+08';
count
-------
8
(1 row)
Time: 72516.503 ms
postgres=# select count(*) from a where info<>'2014-08-13 20:38:08.966175+08';
count
-----------
268435448
(1 row)
Time: 86810.115 ms
postgres=# select count(*) from a where info is null;
count
-------
0
(1 row)
Time: 43593.035 ms
postgres=# select count(*) from a where info is not null;
count
-----------
268435456
(1 row)
Time: 65161.258 ms
postgres=# select count(*) from (select info from a group by info having count(*) >1) as t;
count
-------
27
(1 row)
Time: 144180.678 ms
[参考]