php mongodb查询效率,为什么PostgreSQL比MongoDB还快之完结篇(深挖单点索引查询)

之前两篇测试中发现:单点索引查询中PostgreSQL的速度是MongoDB(WiredTiger引擎)的4倍。

http://blog.chinaunix.net/xmlrpc.php?r=blog/article&uid=20726500&id=4960138

http://blog.chinaunix.net/xmlrpc.php?r=blog/article&uid=20726500&id=4981629

虽然本人很偏好PG,但也对这个结果表示不能理解。按照常理,纯NoSQL的MongoDB应该比PG快或和PG差不多快才比较合理。

所以,在之前测试的基础上再进行一次深入的挖掘。

之前的0匹配查询,由于执行的时间太短,没有采集到CPU利用率,而且时间值太小,对比的准确性也值得怀疑。

所以,现在构造一个循环的单点索引查询,并在单并发和多并发场景下对比PG和MongoDB的性能。

1. MongoDB(WiredTiger引擎)的测试

单并发测试

循环1万次单点索引查询,平均一次大约返回一条记录。

点击(此处)折叠或打开

-bash-4.1$ cat batchselect_mongo.sh

for ((i=0;i

do

echo "db.json_tables.find({ brand: 'ACME${i}'})"

done

-bash-4.1$ sh batchselect_mongo.sh 10000|time -p mongo benchmark >/dev/null

real 8.53

user 4.59

sys 2.62

-bash-4.1$ sh batchselect_mongo.sh 10000|time -p mongo benchmark|grep ACME|wc

real 8.33

user 5.33

sys 1.51

9091 3281851 22061914

看看top的资源使用

点击(此处)折叠或打开

[root@hanode1 bin]# top

top - 08:29:13 up 11 days, 11:29, 6 users, load average: 0.09, 0.03, 0.01

Tasks: 159 total, 2 running, 157 sleeping, 0 stopped, 0 zombie

Cpu(s): 16.8%us, 1.3%sy, 0.0%ni, 75.6%id, 0.0%wa, 0.0%hi, 6.4%si, 0.0%st

Mem: 1019320k total, 942260k used, 77060k free, 134736k buffers

Swap: 2064376k total, 64712k used, 1999664k free, 265524k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

3951 postgres 20 0 746m 63m 9200 R 85.5 6.4 0:06.97 mongo

26391 postgres 20 0 617m 339m 12m S 19.3 34.1 10:01.76 mongod

算下来mongod进程占用CPU 1.6s(8.33*19.3%=1.6)

3并发的测试

前面一直是单并发测试,现在看看3并发的场景(测试机是4核)。点击(此处)折叠或打开

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[1] 5398

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[2] 5401

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[3] 5412

-bash-4.1$ real 8.61

user 6.30

sys 0.64

real 8.61

user 6.37

sys 0.59

real 8.49

user 6.36

sys 0.62

[1] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[2]- Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[3]+ Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

3并发时的top结果

点击(此处)折叠或打开

[root@hanode1 bin]# top

top - 08:42:13 up 11 days, 11:42, 6 users, load average: 1.26, 0.95, 0.39

Tasks: 166 total, 4 running, 162 sleeping, 0 stopped, 0 zombie

Cpu(s): 67.4%us, 5.3%sy, 0.0%ni, 25.0%id, 0.0%wa, 0.0%hi, 2.3%si, 0.0%st

Mem: 1019320k total, 570600k used, 448720k free, 1008k buffers

Swap: 2064376k total, 136636k used, 1927740k free, 38632k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

5399 postgres 20 0 746m 63m 9192 R 82.5 6.3 0:05.27 mongo

5402 postgres 20 0 744m 61m 9192 R 82.5 6.1 0:05.27 mongo

5413 postgres 20 0 744m 61m 9192 R 82.1 6.2 0:04.96 mongo

26391 postgres 20 0 625m 266m 3048 S 47.5 26.8 10:27.54 mongod

可以注意到几点1)每个客户端的执行时间和单并发时差不多一样。3个mongo进程和1个mongod进程几乎各占了一个CPU核,没有CPU争用。

2)mongod进程的CPU实际占用时间是8.6*47.5%=4s,是单并发时的2.5倍(接近理论效果的3倍,里面会有测量误差)

3)性能瓶颈还是在客户端

10并发的测试

10并发时,忙碌的进程数超过CPU核心数,会有CPU争用。

点击(此处)折叠或打开

-bash-4.1$

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[1] 5156

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[2] 5159

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[3] 5161

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[4] 5163

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[5] 5166

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[6] 5170

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[7] 5172

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[8] 5174

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[9] 5177

-bash-4.1$ sh batchselect_mongo.sh 10000 |time -p mongo benchmark >/dev/null &

[10] 5179

-bash-4.1$

-bash-4.1$ real 25.15

user 7.97

sys 1.06

real 25.54

user 7.93

sys 1.02

real 26.32

user 7.94

sys 1.08

real 26.63

user 8.39

sys 0.93

real 27.01

user 8.44

sys 1.04

real 28.06

user 8.52

sys 1.02

real 28.16

user 8.11

sys 1.24

real 28.27

user 8.47

sys 1.03

real 28.64

user 8.23

sys 1.13

real 29.14

user 8.11

sys 1.12

[1] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[2] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[3] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[4] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[5] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[6] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[7] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[8] Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[9]- Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

[10]+ Done sh batchselect_mongo.sh 10000 | time -p mongo benchmark > /dev/null

10并发时的top结果

点击(此处)折叠或打开

[root@hanode1 bin]# top

top - 08:40:22 up 11 days, 11:40, 6 users, load average: 1.60, 0.40, 0.14

Tasks: 187 total, 11 running, 176 sleeping, 0 stopped, 0 zombie

Cpu(s): 73.7%us, 24.9%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 1.5%si, 0.0%st

Mem: 1019320k total, 969800k used, 49520k free, 77180k buffers

Swap: 2064376k total, 66948k used, 1997428k free, 36160k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

26391 postgres 20 0 625m 332m 5220 S 50.5 33.4 10:10.47 mongod

5168 postgres 20 0 745m 50m 9200 R 40.5 5.0 0:02.16 mongo

5175 postgres 20 0 745m 58m 9192 R 40.5 5.9 0:02.19 mongo

5167 postgres 20 0 745m 48m 9192 R 34.9 4.9 0:01.97 mongo

5157 postgres 20 0 745m 48m 9192 R 34.6 4.8 0:01.96 mongo

5184 postgres 20 0 744m 44m 9192 R 34.6 4.5 0:01.68 mongo

5164 postgres 20 0 744m 44m 9192 R 33.6 4.5 0:01.67 mongo

5182 postgres 20 0 744m 44m 9192 R 31.2 4.5 0:01.58 mongo

5180 postgres 20 0 744m 44m 9192 R 30.9 4.5 0:01.58 mongo

5181 postgres 20 0 744m 44m 9192 R 29.2 4.5 0:01.52 mongo

5183 postgres 20 0 744m 44m 9192 R 28.6 4.5 0:01.54 mongo

1)mongod进程的CPU实际占用时间大约是27*50.5%=13.5s,是单并发时的8.4倍(接近理论效果的10倍,里面会有测量误差)

2)CPU被撑满,客户端消耗的大部分的CPU资源。

2. PostgreSQL的测试

单并发测试

循环1万次单点索引查询,平均一次大约返回一条记录。

点击(此处)折叠或打开

-bash-4.1$ cat batchselect_pg.sh

for ((i=0;i

do

echo "SELECT data FROM json_tables WHERE data @> '{\"brand\":\"ACME${i}\"}';"

done

-bash-4.1$ sh batchselect_pg.sh 10000|time -p psql -qAt benchmark >/dev/null

real 1.48

user 0.13

sys 0.14

由于PG太快了,不好收集top信息,改成10万次循环。

点击(此处)折叠或打开

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null

real 10.72

user 0.96

sys 1.01

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark |wc

real 10.91

user 0.61

sys 1.49

9091 3172759 21561900

看看top结果

点击(此处)折叠或打开

[root@hanode1 bin]# top

top - 09:18:00 up 11 days, 12:18, 6 users, load average: 0.35, 0.14, 0.05

Tasks: 166 total, 2 running, 164 sleeping, 0 stopped, 0 zombie

Cpu(s): 27.5%us, 4.5%sy, 0.0%ni, 67.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 1019320k total, 349168k used, 670152k free, 9200k buffers

Swap: 2064376k total, 68376k used, 1996000k free, 243836k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

9042 postgres 20 0 587m 77m 76m R 81.8 7.8 0:04.35 postgres

9038 postgres 20 0 103m 1224 1064 S 31.9 0.1 0:01.60 sh

9041 postgres 20 0 105m 1204 1016 S 18.6 0.1 0:01.08 psql

算下来postgres进程占用CPU 8.9s(10.9*81.1%=8.9)。也就是说mongod进程的CPU实际占用时间是postgres进程的1.8倍,而不是之前简单测试得出的4倍。

3并发的测试

点击(此处)折叠或打开

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[1] 9740

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[2] 9743

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[3] 9746

-bash-4.1$

-bash-4.1$

-bash-4.1$ real 14.15

user 1.21

sys 1.03

real 14.34

user 1.20

sys 1.01

real 14.41

user 1.23

sys 1.01

[1] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[2]- Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[3]+ Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

top的结果

点击(此处)折叠或打开

[root@hanode1 bin]# top

top - 09:25:01 up 11 days, 12:25, 6 users, load average: 0.97, 0.24, 0.08

Tasks: 176 total, 4 running, 172 sleeping, 0 stopped, 0 zombie

Cpu(s): 68.6%us, 13.3%sy, 0.0%ni, 18.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 1019320k total, 354280k used, 665040k free, 9512k buffers

Swap: 2064376k total, 68376k used, 1996000k free, 243848k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

9749 postgres 20 0 587m 77m 76m R 70.2 7.8 0:08.37 postgres

9750 postgres 20 0 587m 77m 76m R 67.5 7.8 0:08.60 postgres

9748 postgres 20 0 587m 77m 76m R 65.8 7.8 0:08.44 postgres

9739 postgres 20 0 103m 1224 1064 S 31.9 0.1 0:03.39 sh

9745 postgres 20 0 103m 1224 1064 S 28.9 0.1 0:03.35 sh

9742 postgres 20 0 103m 1228 1064 S 25.3 0.1 0:03.32 sh

9741 postgres 20 0 105m 1200 1016 S 16.3 0.1 0:01.95 psql

9744 postgres 20 0 105m 1200 1016 S 15.6 0.1 0:02.01 psql

9747 postgres 20 0 105m 1200 1016 S 15.6 0.1 0:01.95 psql

算下来postgres进程占用CPU 大约28.3s(14*(70%+67%+65%+)=28.3),是单并发时的3.2倍。

10并发的测试

点击(此处)折叠或打开

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[1] 10628

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[2] 10631

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[3] 10634

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[4] 10637

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[5] 10639

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[6] 10641

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[7] 10644

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[8] 10646

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[9] 10648

-bash-4.1$ sh batchselect_pg.sh 100000|time -p psql -qAt benchmark >/dev/null &

[10] 10651

-bash-4.1$

-bash-4.1$

-bash-4.1$ real 38.35

user 1.47

sys 0.85

real 38.72

user 1.39

sys 0.90

real 38.87

user 1.45

sys 0.91

real 39.13

user 1.50

sys 0.82

real 39.23

user 1.41

sys 0.94

real 39.63

user 1.42

sys 0.93

real 39.75

user 1.39

sys 0.90

real 39.84

user 1.36

sys 0.98

real 40.24

user 1.42

sys 0.92

real 40.58

user 1.40

sys 0.93

[1] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[2] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[3] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[4] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[5] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[6] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[7] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[8] Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[9]- Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

[10]+ Done sh batchselect_pg.sh 100000 | time -p psql -qAt benchmark > /dev/null

10并发时的top结果

点击(此处)折叠或打开

[root@hanode1 bin]# top

top - 09:34:12 up 11 days, 12:34, 6 users, load average: 5.21, 1.33, 0.47

Tasks: 201 total, 12 running, 189 sleeping, 0 stopped, 0 zombie

Cpu(s): 84.7%us, 14.9%sy, 0.0%ni, 0.3%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st

Mem: 1019320k total, 370468k used, 648852k free, 9952k buffers

Swap: 2064376k total, 68376k used, 1996000k free, 243860k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

10664 postgres 20 0 587m 77m 76m R 26.8 7.8 0:08.77 postgres

10660 postgres 20 0 587m 77m 76m R 26.2 7.8 0:08.83 postgres

10665 postgres 20 0 587m 77m 76m R 26.2 7.8 0:08.46 postgres

10642 postgres 20 0 587m 77m 76m R 25.5 7.8 0:08.48 postgres

10656 postgres 20 0 587m 77m 76m R 25.5 7.8 0:08.15 postgres

10663 postgres 20 0 587m 77m 76m R 24.8 7.8 0:08.48 postgres

10662 postgres 20 0 587m 77m 76m R 23.8 7.8 0:08.73 postgres

10635 postgres 20 0 587m 77m 76m R 23.5 7.8 0:09.20 postgres

10666 postgres 20 0 587m 77m 76m R 23.5 7.8 0:08.87 postgres

10654 postgres 20 0 587m 77m 76m R 22.5 7.8 0:07.94 postgres

10636 postgres 20 0 103m 1224 1064 S 9.9 0.1 0:03.15 sh

10638 postgres 20 0 103m 1228 1064 S 9.9 0.1 0:03.08 sh

10650 postgres 20 0 103m 1228 1064 S 9.6 0.1 0:03.18 sh

10627 postgres 20 0 103m 1228 1064 S 9.3 0.1 0:03.00 sh

10640 postgres 20 0 103m 1228 1064 R 9.3 0.1 0:02.84 sh

10645 postgres 20 0 103m 1224 1064 S 9.3 0.1 0:03.14 sh

10630 postgres 20 0 103m 1228 1064 S 8.6 0.1 0:03.22 sh

10633 postgres 20 0 103m 1228 1064 S 8.6 0.1 0:02.98 sh

10647 postgres 20 0 103m 1228 1064 S 8.6 0.1 0:03.17 sh

10643 postgres 20 0 103m 1228 1064 S 8.3 0.1 0:02.82 sh

10658 postgres 20 0 105m 1200 1016 S 7.0 0.1 0:02.05 psql

10655 postgres 20 0 105m 1200 1016 S 6.6 0.1 0:02.09 psql

10629 postgres 20 0 105m 1200 1016 S 6.3 0.1 0:02.02 psql

10657 postgres 20 0 105m 1200 1016 S 6.3 0.1 0:01.93 psql

10652 postgres 20 0 105m 1204 1016 S 6.0 0.1 0:01.88 psql

10649 postgres 20 0 105m 1204 1016 S 5.6 0.1 0:01.84 psql

10659 postgres 20 0 105m 1200 1016 S 5.6 0.1 0:01.97 psql

10632 postgres 20 0 105m 1204 1016 S 5.3 0.1 0:02.09 psql

10653 postgres 20 0 105m 1204 1016 S 5.3 0.1 0:01.96 psql

10661 postgres 20 0 105m 1200 1016 S 5.3 0.1 0:02.04 psql

1)大致算下来postgres进程占用CPU 大约97.5s(39*25%*10=97.5),是单并发时的10.9倍(97.5/8.9)。

2)CPU被撑满,postgres进程占用了大部分的CPU。

3.总结

测试结果总结如下

97b2eb73cbc9aae493f19f0b133a72be.png

从上面的结果可以看出:

1)在多并发场景下,MongoDB和PostgreSQL的服务端进程占用的总CPU时间和并发数基本成正比。说明负载在CPU多核间分担的比较好。

2)多并发时MongoDB在单点索引查询占用的CPU时间大约是PostgreSQL的1.4倍。

3)这个1.4倍的比率基本可以代表了实际场景(高并发,且客户端和服务端不在同一台机器上)下它们的单点索引查询性能差异,而不是之前简单测试显示的4倍。

经过这样严格的比较,我终于可以相信PostgreSQL在单点索引查询上比MongoDB(WiredTiger引擎)快了那么一点点。

结合前两次测试结果,最终的总结如下

1)加载

WiredTiger的性能是PG的3倍(注1,注2)

2)插入

相差不大,WiredTiger小胜(注1,注2)

3)全表扫描(0匹配)

WiredTiger的性能是PG的4倍

4)单点索引扫描

PG的性能是WiredTiger的1.4倍(注1)

5)数据大小

PG的数据大小是WiredTiger的3倍

注1)以服务端进程CPU消耗作为衡量指标的,忽略了MongoDB客户端的高CPU消耗。

注2)仅仅是单并发的测试数据

这个结果虽然和开头那个EnterpriseDB的流传较广的测试结果有很大出入,但PG的NoSQL特性在单机环境下仍然有巨大的优势(即:NoSQL+SQL+ACID)。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值