MonetDB vs PostgreSQL a lite case

本文对比一下MonetDB 和 PostgreSQL在某些SQL下的性能. 
(MonetDB 使用列存储, 支持单SQL使用多核, 线程模式, 支持数据压缩, 还有较多的OLAP相关的插件, 更适合OLAP的应用场景)

注意, PostgreSQL注重的是高并发, 而MonetDB注重数据分析, 所以两者没有什么可比性, 这里拿MonetDB的强项来VS PostgreSQL的软肋, 主要是给一些在使用PostgreSQL做数据分析并且遇到瓶颈的朋友一些启发, 换个产品试试, 不要使用一个产品的软肋(这里指PostgreSQL在统计方面目前略差, 未来9.4+版本出来单SQL可以用到多核再来谈统计效率)
PostgreSQL支持的列存储插件cstore_fdw性能, 如果要玩的话可以 参考 : 
对于定长类型, 可以提升较高的性能.

测试环境
DELL R610
CPU 1.6G 8核
CentOS 6.5 x64
PostgreSQL 9.3.5 (主要参数 shared_buffer=1G, block_size=8KB, checkpoint_segment=32, wal_segsize=16MB, synchronous_commit=off, autovacuum=on)
MonetDB 11.17.21
块设备 OCZ RevoDrive3X2 240G
文件系统 ext4
内存 96GB
测试数据2.6亿
MonetDB 占用 1.9G. (含PK)
PostgreSQL 占用 22G.  (含PK)

插入性能对比, PostgreSQL比MonetDB几乎差了9倍性能.
 插入条数 
/ 时间(毫秒)
 MonetDB PostgreSQL
 1024 6 11
 2048 7 21
 4096 10 42
 8192 17 84
 16384 29 166
 32768 51 332
 65536 96 661
 131072 184 1361
 262144 361 2760
 524288 701 5550
 1048576 1300 11221
 2097152 2800 22581
 4194304 5500 45018
 8388608 11200 91126
 16777216 22100 183895
 33554432 45300 376365
 67108864 97000 764031
 134217728 178000 1533433
 平均每秒插入条数 754000/s 87000/s
   
查询性能对比, MonetDB完胜, 完全不是一个数量级的差别. 
当然PostgreSQL count(*)性能本来就不咋地, 更适合OLTP, 曾经用了大量的篇幅来写使用PostgreSQL实现准实时数据统计, 可参阅
 SQL
/
时间(毫秒)
 MonetDB PostgreSQL
 select count(*) from a; 1 46484
 select count(*) from a where id=1; 1 1
 select count(*) from a where id>1; 2 68813
 select count(*) from a where id<1; 2 0.6
 select count(*) from a where id<>1; 3500 68710
 select count(*) from a where id is null; 752 0.5
 select count(*) from a where id is not null; 710 64771
 select count(*) from a where info='2014-08-13 20:20:14.000000+08:00'; 0.4 68006
 select count(*) from a where info>'2014-08-13 20:20:14.000000+08:00'; 0.8 91155
 select count(*) from a where info<'2014-08-13 20:20:14.000000+08:00'; 1.7 72517
 select count(*) from a where info<>'2014-08-13 20:20:14.000000+08:00'; 2800 86810
 select count(*) from a where info is null; 773 43593
 select count(*) from a where info is not null; 748 65161
 select count(*) from (select info from a group by info having count(*) >1) as t; 4700 144180

空间占用对比,
2.6亿数据(含索引), MonetDB 1.9G, PostgreSQL 22G.
当然这里和测试数据有关, 因为字符串大量重复了.
后期做一下宽 表的测试. 对比GreenPlum.
MonetDB还有很多强大的功能等待挖掘, 大家一起来玩吧.

详细数据如下 : 
MonetDB
插入性能
sql>create table a(id int auto_increment primary key, info varchar(32));
operation successful (10.112ms)
sql>insert into a (info) values (now());
1 affected rows, last generated key: 1 (3.274ms)
sql>select * from a;
+------+----------------------------------+
| id   | info                             |
+======+==================================+
|    1 | 2014-08-13 20:18:53.000000+08:00 |
+------+----------------------------------+
1 tuple (2.448ms)
sql>insert into a (info) values (now());
1 affected rows, last generated key: 2 (3.023ms)
sql>insert into a (info) select now() from a;
2 affected row (6.033ms)
sql>insert into a (info) select now() from a;
4 affected row (3.798ms)
sql>insert into a (info) select now() from a;
8 affected row (2.893ms)
sql>insert into a (info) select now() from a;
16 affected row (2.700ms)
sql>insert into a (info) select now() from a;
32 affected row (5.064ms)
sql>insert into a (info) select now() from a;
64 affected row (4.054ms)
sql>insert into a (info) select now() from a;
128 affected row (4.138ms)
sql>insert into a (info) select now() from a;
256 affected row (4.304ms)
sql>insert into a (info) select now() from a;
512 affected row (2.563ms)
sql>insert into a (info) select now() from a;
1024 affected row (6.094ms)
sql>insert into a (info) select now() from a;
2048 affected row (6.956ms)
sql>insert into a (info) select now() from a;
4096 affected row (9.618ms)
sql>insert into a (info) select now() from a;
8192 affected row (17.475ms)
sql>insert into a (info) select now() from a;
16384 affected row (29.347ms)
sql>insert into a (info) select now() from a;
32768 affected row (50.941ms)
sql>insert into a (info) select now() from a;
65536 affected row (95.736ms)
sql>insert into a (info) select now() from a;
131072 affected row (183.726ms)
sql>insert into a (info) select now() from a;
262144 affected row (361.436ms)
sql>insert into a (info) select now() from a;
524288 affected row (701.202ms)
sql>insert into a (info) select now() from a;
1048576 affected row (1.3s)
sql>insert into a (info) select now() from a;
2097152 affected row (2.8s)
sql>insert into a (info) select now() from a;
4194304 affected row (5.5s)
sql>insert into a (info) select now() from a;
8388608 affected row (11.2s)
sql>insert into a (info) select now() from a;
16777216 affected row (22.1s)
sql>insert into a (info) select now() from a;
33554432 affected row (45.3s)
sql>insert into a (info) select now() from a;
67108864 affected row (1m 27s)
sql>insert into a (info) select now() from a;
134217728 affected row (2m 58s)


空间占用大约1.9G.
[root@150 ~]# cd /data02/mdb1/
[root@150 mdb1]# ll
total 16
-rw------- 1 root root 7196 Aug 13 21:08 merovingian.log
-rw-r--r-- 1 root root    6 Aug 13 20:08 merovingian.pid
drwx------ 5 root root 4096 Aug 13 20:09 test
[root@150 mdb1]# du -sh *
8.0K    merovingian.log
4.0K    merovingian.pid
1.9G    test
[root@150 mdb1]# cd test/
[root@150 test]# ll
total 12
-rw-------  1 root root    0 Aug 13 20:09 58ec944a-5cff-dc51-4873-986923c64567
drwx------ 14 root root 4096 Aug 13 20:33 bat
drwx------  2 root root 4096 Aug 13 20:09 box
drwx------  3 root root 4096 Aug 13 20:09 sql_logs
[root@150 test]# du -sh *
0       58ec944a-5cff-dc51-4873-986923c64567
1.9G    bat
4.0K    box
12K     sql_logs
[root@150 test]# cd bat
[root@150 bat]# du -sh *
492K    01
288K    02
92K     04
1.1G    05
513M    07
572K    10
4.0K    11
4.0K    12
65M     13
323M    14
772K    15
4.0K    3.head
4.0K    3.tail
12K     3.theap
4.0K    4.head
4.0K    4.tail
12K     4.theap
28K     BACKUP

[root@150 bat]# ll
total 88
drwx------ 2 root root 4096 Aug 13 21:27 01
drwx------ 2 root root 4096 Aug 13 20:21 02
drwx------ 2 root root 4096 Aug 13 21:15 04
drwx------ 2 root root 4096 Aug 13 21:12 05
drwx------ 2 root root 4096 Aug 13 21:27 07
drwx------ 2 root root 4096 Aug 13 21:27 10
drwx------ 2 root root 4096 Aug 13 21:15 11
drwx------ 2 root root 4096 Aug 13 21:27 12
drwx------ 2 root root 4096 Aug 13 21:27 13
drwx------ 2 root root 4096 Aug 13 21:27 14
drwx------ 2 root root 4096 Aug 13 21:27 15
-rw------- 1 root root    8 Aug 13 20:09 3.head
-rw------- 1 root root    1 Aug 13 20:09 3.tail
-rw------- 1 root root 8216 Aug 13 20:09 3.theap
-rw------- 1 root root    8 Aug 13 20:09 4.head
-rw------- 1 root root    1 Aug 13 20:09 4.tail
-rw------- 1 root root 8343 Aug 13 20:09 4.theap
drwx------ 2 root root 4096 Aug 13 21:27 BACKUP

通过函数storage可以查看详细大小. 这里包含了count和columnsize. info字段占用500mb左右, id字段占用1G左右.
sql>select * from storage();
+-------+-------------------+----------------+-----------+----------+-----------+-------+------------+----------+---------+--------+
| schem | table             | column         | type      | location | count     | typew | columnsize | heapsize | indices | sorted |
: a     :                   :                :           :          :           : idth  :            :          :         :        :
+=======+===================+================+===========+==========+===========+=======+============+==========+=========+========+
| sys   | a                 | id             | int       | 05/573   | 268435456 |     4 | 1073741824 |        0 |       0 | true   |
| sys   | a                 | info           | varchar   | 07/734   | 268435456 |    32 |  536870912 |    16384 |       0 | true   |
| sys   | a                 | a_id_pkey      | oid       | 05/567   |         0 |     8 |          0 |        0 |       0 | true   |


查询测试
sql>select count(*) from a;
+-----------+
| L1        |
+===========+
| 268435456 |
+-----------+
1 tuple (0.979ms)
sql>select count(*) from a where id=1;
+------+
| L1   |
+======+
|    1 |
+------+
1 tuple (0.920ms)
sql>select count(*) from a where id>1;
+-----------+
| L1        |
+===========+
| 268435455 |
+-----------+
1 tuple (2.347ms)
sql>select count(*) from a where id<1;
+------+
| L1   |
+======+
|    0 |
+------+
1 tuple (2.270ms)
sql>select count(*) from a where id<>1;
+-----------+
| L1        |
+===========+
| 268435455 |
+-----------+
1 tuple (3.5s)
sql>select count(*) from a where id is null;
+------+
| L1   |
+======+
|    0 |
+------+
1 tuple (751.665ms)
sql>select count(*) from a where id is not null;
+-----------+
| L1        |
+===========+
| 268435456 |
+-----------+
1 tuple (710.477ms)
sql>select * from a limit 10;
+------+----------------------------------+
| id   | info                             |
+======+==================================+
|    1 | 2014-08-13 20:18:53.000000+08:00 |
|    2 | 2014-08-13 20:18:59.000000+08:00 |
|    3 | 2014-08-13 20:20:06.000000+08:00 |
|    4 | 2014-08-13 20:20:06.000000+08:00 |
|    5 | 2014-08-13 20:20:14.000000+08:00 |
|    6 | 2014-08-13 20:20:14.000000+08:00 |
|    7 | 2014-08-13 20:20:14.000000+08:00 |
|    8 | 2014-08-13 20:20:14.000000+08:00 |
|    9 | 2014-08-13 20:20:14.000000+08:00 |
|   10 | 2014-08-13 20:20:14.000000+08:00 |
+------+----------------------------------+
10 tuples (1.581ms)
sql>select count(*) from a where info='2014-08-13 20:20:14.000000+08:00';
+------+
| L1   |
+======+
|   12 |
+------+
1 tuple (0.442ms)
sql>select count(*) from a where info>'2014-08-13 20:20:14.000000+08:00';
+-----------+
| L1        |
+===========+
| 268435440 |
+-----------+
1 tuple (0.844ms)
sql>select count(*) from a where info<'2014-08-13 20:20:14.000000+08:00';
+------+
| L1   |
+======+
|    4 |
+------+
1 tuple (1.734ms)
sql>select count(*) from a where info<>'2014-08-13 20:20:14.000000+08:00';
+-----------+
| L1        |
+===========+
| 268435444 |
+-----------+
1 tuple (2.8s)
sql>select count(*) from a where info is null;
+------+
| L1   |
+======+
|    0 |
+------+
1 tuple (772.987ms)
sql>select count(*) from a where info is not null;
+-----------+
| L1        |
+===========+
| 268435456 |
+-----------+
1 tuple (747.712ms)
sql>select count(*) from (select info from a group by info having count(*) >1) as t;
+------+
| L2   |
+======+
|   18 |
+------+
1 tuple (4.7s)


PostgreSQL
插入测试
postgres=# create table a(id serial primary key, info text);
CREATE TABLE
postgres=# insert into a(info) values (now());
INSERT 0 1
postgres=# insert into a(info) select now() from a;
INSERT 0 1
Time: 0.954 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 2
Time: 0.606 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 4
Time: 0.352 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 8
Time: 0.393 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 16
Time: 0.476 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 32
Time: 0.840 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 64
Time: 1.166 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 128
Time: 1.632 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 256
Time: 3.214 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 512
Time: 5.519 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 1024
Time: 10.924 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 2048
Time: 21.212 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 4096
Time: 41.930 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 8192
Time: 84.242 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 16384
Time: 165.705 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 32768
Time: 332.269 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 65536
Time: 661.579 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 131072
Time: 1361.281 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 262144
Time: 2760.690 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 524288
Time: 5550.096 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 1048576
Time: 11221.544 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 2097152
Time: 22581.858 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 4194304
Time: 45018.124 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 8388608
Time: 91126.619 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 16777216
Time: 183895.102 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 33554432
Time: 376365.578 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 67108864
Time: 764031.565 ms
postgres=# insert into a(info) select now() from a;
INSERT 0 134217728
Time: 1533433.507 ms

postgres=# select pg_total_relation_size('a')/1024/1024/1024||'GB';
 ?column? 
----------
 22GB
(1 row)
Time: 1.027 ms
postgres=# select pg_relation_size('a')/1024/1024/1024||'GB';
 ?column? 
----------
 17GB
(1 row)
Time: 0.702 ms


查询测试
postgres=# select count(*) from a;
   count   
-----------
 268435456
(1 row)

Time: 60818.321 ms
postgres=# select count(*) from a;
   count   
-----------
 268435456
(1 row)

Time: 46484.165 ms
postgres=# select count(*) from a where id=1;
 count 
-------
     1
(1 row)

Time: 0.996 ms
postgres=# select count(*) from a where id>1;
   count   
-----------
 268435455
(1 row)

Time: 68812.883 ms
postgres=# select count(*) from a where id<1;
 count 
-------
     0
(1 row)

Time: 0.647 ms
postgres=# select count(*) from a where id<>1;
   count   
-----------
 268435455
(1 row)

Time: 68710.001 ms
postgres=# select count(*) from a where id is null;
 count 
-------
     0
(1 row)

Time: 0.537 ms
postgres=# select count(*) from a where id is not null;
   count   
-----------
 268435456
(1 row)

Time: 64771.858 ms
postgres=# select * from a limit 10;
 id |             info              
----+-------------------------------
  1 | 2014-08-13 20:37:52.165199+08
  2 | 2014-08-13 20:38:06.945296+08
  3 | 2014-08-13 20:38:07.965092+08
  4 | 2014-08-13 20:38:07.965092+08
  5 | 2014-08-13 20:38:08.502181+08
  6 | 2014-08-13 20:38:08.502181+08
  7 | 2014-08-13 20:38:08.502181+08
  8 | 2014-08-13 20:38:08.502181+08
  9 | 2014-08-13 20:38:08.966175+08
 10 | 2014-08-13 20:38:08.966175+08
(10 rows)

Time: 0.629 ms
postgres=# select count(*) from a where info='2014-08-13 20:38:08.966175+08';
 count 
-------
     8
(1 row)

Time: 68005.864 ms
postgres=# select count(*) from a where info>'2014-08-13 20:38:08.966175+08';
   count   
-----------
 268435440
(1 row)

Time: 91154.757 ms
postgres=# select count(*) from a where info<'2014-08-13 20:38:08.966175+08';
 count 
-------
     8
(1 row)

Time: 72516.503 ms
postgres=# select count(*) from a where info<>'2014-08-13 20:38:08.966175+08';
   count   
-----------
 268435448
(1 row)

Time: 86810.115 ms
postgres=# select count(*) from a where info is null;
 count 
-------
     0
(1 row)

Time: 43593.035 ms
postgres=# select count(*) from a where info is not null;
   count   
-----------
 268435456
(1 row)

Time: 65161.258 ms
postgres=# select count(*) from (select info from a group by info having count(*) >1) as t;
 count 
-------
    27
(1 row)
Time: 144180.678 ms


[参考]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值