mongoDB VS PostgreSQL dml performance use python (pymongo & py-postgresql)

前面测试了mongodb和postgresql的插入性能对比, 参考如下 : 
本文将测试对比一下select, update, 以及select, insert, update混合场景的性能.
同样使用并行8个线程测试.
(因为使用的驱动问题, python测试结果性能较差, mongo改用motor驱动的异步调用后, 性能提升明显, 测试结果仅供参考, 如果是PG的话, 建议使用pgbench测试)
mongoDB : 
更新测试 :
# vi test.py
import threading
import time
import pymongo
import random

c=pymongo.MongoClient('/tmp/mongodb-5281.sock')
db = c.test_database
db.drop_collection('test_collection')
collection = db.test_collection
print(collection.count())
for i in range(0,1000000):
  collection.insert({'id':i, 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})
collection.create_index("id")

class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread
  def __init__(self, num):
    threading.Thread.__init__(self)
    self.thread_num = num

  def run(self): #Overwrite run() method, put what you want the thread do here
    c=pymongo.MongoClient('/tmp/mongodb-5281.sock')
    db = c.test_database
    collection = db.test_collection
    start_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(start_t))
    for i in range(0,125000):
      collection.update({'id':random.randrange(0,1000000)}, {'$set': {'age': random.randrange(0,1000000)}})
    stop_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(stop_t))
    print(stop_t-start_t)

def test():
  t_names = dict()
  for i in range(0,8):
    t_names[i] = n_t(i) 
    t_names[i].start()
  return

if __name__ == '__main__':
  test()

测试结果379秒 : 
[root@localhost ~]# python test.py
0
TID:0 1423070907.7699816
TID:1 1423070907.770696
TID:2 1423070907.7744105
TID:3 1423070907.7760801
TID:4 1423070907.779555
TID:7 1423070907.78037
TID:6 1423070907.7860947
TID:5 1423070907.78793
TID:6 1423071285.5971715
377.81107687950134
TID:7 1423071286.6500263
378.8696563243866
TID:0 1423071286.9464445
379.17646288871765
TID:1 1423071287.1227949
379.352098941803
TID:3 1423071287.1230247
379.3469445705414
TID:5 1423071287.2262568
379.4383268356323
TID:4 1423071287.2609653
379.4814102649689
TID:2 1423071287.4058232
379.6314127445221


查询测试 : 
import threading
import time
import pymongo
import random

c=pymongo.MongoClient('/tmp/mongodb-5281.sock')
db = c.test_database
# db.drop_collection('test_collection')
collection = db.test_collection
print(collection.count())
# for i in range(0,1000000):
#   collection.insert({'id':i, 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})
# collection.create_index("id")

class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread
  def __init__(self, num):
    threading.Thread.__init__(self)
    self.thread_num = num

  def run(self): #Overwrite run() method, put what you want the thread do here
    c=pymongo.MongoClient('/tmp/mongodb-5281.sock')
    db = c.test_database
    collection = db.test_collection
    start_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(start_t))
    for i in range(0,125000):
      collection.find_one({'id': random.randrange(0,1000000)})
    stop_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(stop_t))
    print(stop_t-start_t)

def test():
  t_names = dict()
  for i in range(0,8):
    t_names[i] = n_t(i) 
    t_names[i].start()
  return

if __name__ == '__main__':
  test()

测试结果361秒 : 
[root@localhost ~]# python test.py
2000000
TID:2 1423079875.4572093
TID:3 1423079875.4576375
TID:1 1423079875.4596934
TID:0 1423079875.4600854
TID:4 1423079875.4589622
TID:5 1423079875.4653761
TID:6 1423079875.463017
TID:7 1423079875.4694664
TID:7 1423080235.7239776
360.2545111179352
TID:3 1423080236.109339
360.6517014503479
TID:4 1423080236.1194305
360.66046833992004
TID:1 1423080236.260948
360.8012545108795
TID:2 1423080236.5218844
361.06467509269714
TID:5 1423080236.6404896
361.17511343955994
TID:0 1423080236.6446981
361.1846127510071
TID:6 1423080236.6607506
361.1977336406708


更新, 插入, 查询综合测试 : 
import threading
import time
import pymongo
import random

c=pymongo.MongoClient('/tmp/mongodb-5281.sock')
db = c.test_database
# db.drop_collection('test_collection')
collection = db.test_collection
print(collection.count())
# for i in range(0,1000000):
#   collection.insert({'id':i, 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})
# collection.create_index("id")

class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread
  def __init__(self, num):
    threading.Thread.__init__(self)
    self.thread_num = num

  def run(self): #Overwrite run() method, put what you want the thread do here
    c=pymongo.MongoClient('/tmp/mongodb-5281.sock')
    db = c.test_database
    collection = db.test_collection
    start_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(start_t))
    for i in range(0,125000):
      collection.insert({'id':random.randrange(1000001,2000000), 'username': 'digoal.zhou', 'age':32, 'email':'digoal@126.com', 'qq':'276732431'})
      collection.update({'id':random.randrange(0,1000000)}, {'$set': {'age': random.randrange(0,1000000)}})
      collection.find_one({'id': random.randrange(0,1000000)})
    stop_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(stop_t))
    print(stop_t-start_t)

def test():
  t_names = dict()
  for i in range(0,8):
    t_names[i] = n_t(i) 
    t_names[i].start()
  return

if __name__ == '__main__':
  test()

测试结果1150秒 : 
[root@localhost ~]# python test.py
2000000
TID:0 1423080359.006871
TID:1 1423080359.0083587
TID:2 1423080359.009925
TID:4 1423080359.0124109
TID:3 1423080359.015088
TID:5 1423080359.0179524
TID:6 1423080359.0209677
TID:7 1423080359.0235417
TID:4 1423081508.11507
1149.1026592254639
TID:7 1423081508.1888452
1149.1653034687042
TID:2 1423081508.2641344
1149.2542095184326
TID:1 1423081508.3973265
1149.3889677524567
TID:0 1423081508.5400703
1149.5331993103027
TID:6 1423081508.594207
1149.573239326477
TID:5 1423081509.0279126
1150.0099601745605
TID:3 1423081509.0943637
1150.0792756080627


PostgreSQL : 
更新测试 :
import threading
import time
import postgresql
import random

conn = { "user": "postgres",
         "database": "postgres",
         "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"
       }

db = postgresql.open(**conn)
db.execute("drop table if exists tt")
db.execute("create table tt(id int, username name, age int2, email text, qq text)")
ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")
for i in range(0,1000000):
  ins(i,'digoal.zhou',32,'digoal@126.com','276732431')
db.execute("create index idx_tt_id on tt(id)")
print(db.query("select count(1) as a from tt"))

class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread
  def __init__(self, num):
    threading.Thread.__init__(self)
    self.thread_num = num

  def run(self): #Overwrite run() method, put what you want the thread do here
    conn = { "user": "postgres", 
             "database": "postgres",
             "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"
           }

    db = postgresql.open(**conn)
    upd = db.prepare("update tt set age=$1 where id=$2")
    start_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(start_t))
    for i in range(0,125000):
      upd(random.randrange(0,100), random.randrange(0,1000000))
    stop_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(stop_t))
    print(stop_t-start_t)

def test():
  t_names = dict()
  for i in range(0,8):
    t_names[i] = n_t(i) 
    t_names[i].start()
  return

if __name__ == '__main__':
  test()

测试结果244秒 : 
TID:0 1423072792.0481002
TID:1 1423072792.050467
TID:3 1423072792.0514963
TID:2 1423072792.051693
TID:5 1423072792.059382
TID:4 1423072792.0605848
TID:7 1423072792.0643597
TID:6 1423072792.0657377
TID:2 1423073034.8827112
242.8310182094574
TID:5 1423073035.024978
242.96559596061707
TID:4 1423073035.2550452
243.1944603919983
TID:7 1423073035.5245414
243.46018171310425
TID:1 1423073036.0639975
244.0135304927826
TID:3 1423073036.3519847
244.30048847198486
TID:0 1423073036.5292883
244.48118805885315
TID:6 1423073036.5383787
244.47264099121094


查询测试 : 
import threading
import time
import postgresql
import random

conn = { "user": "postgres",
         "database": "postgres",
         "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"
       }

db = postgresql.open(**conn)
# db.execute("drop table if exists tt")
# db.execute("create table tt(id int, username name, age int2, email text, qq text)")
# ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")
# for i in range(0,1000000):
#   ins(i,'digoal.zhou',32,'digoal@126.com','276732431')
# db.execute("create index idx_tt_id on tt(id)")
print(db.query("select count(1) as a from tt"))

class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread
  def __init__(self, num):
    threading.Thread.__init__(self)
    self.thread_num = num

  def run(self): #Overwrite run() method, put what you want the thread do here
    conn = { "user": "postgres", 
             "database": "postgres",
             "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"
           }

    db = postgresql.open(**conn)
    sel = db.prepare("select * from tt where id=$1 limit 1")
    start_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(start_t))
    for i in range(0,125000):
      sel(random.randrange(0,1000000))
    stop_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(stop_t))
    print(stop_t-start_t)

def test():
  t_names = dict()
  for i in range(0,8):
    t_names[i] = n_t(i) 
    t_names[i].start()
  return

if __name__ == '__main__':
  test()

测试结果438秒 : 
postgres@localhost-> python test.py
[(1000000,)]
TID:2 1423081634.6041436
TID:1 1423081634.6072564
TID:5 1423081634.6098883
TID:0 1423081634.6110475
TID:6 1423081634.611464
TID:7 1423081634.6147678
TID:3 1423081634.617597
TID:4 1423081634.6184704
TID:7 1423082070.8112974
436.1965296268463
TID:4 1423082071.5796437
436.96117329597473
TID:5 1423082071.6695313
437.0596430301666
TID:0 1423082071.8521369
437.24108934402466
TID:1 1423082072.5634701
437.95621371269226
TID:2 1423082072.678791
438.0746474266052
TID:3 1423082072.9825838
438.3649866580963
TID:6 1423082072.9963892
438.3849251270294


更新, 插入, 查询综合测试 : 
import threading
import time
import postgresql
import random

conn = { "user": "postgres",
         "database": "postgres",
         "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"
       }

db = postgresql.open(**conn)
# db.execute("drop table if exists tt")
# db.execute("create table tt(id int, username name, age int2, email text, qq text)")
# ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")
# for i in range(0,1000000):
#   ins(i,'digoal.zhou',32,'digoal@126.com','276732431')
# db.execute("create index idx_tt_id on tt(id)")
print(db.query("select count(1) as a from tt"))

class n_t(threading.Thread):   #The timer class is derived from the class threading.Thread
  def __init__(self, num):
    threading.Thread.__init__(self)
    self.thread_num = num

  def run(self): #Overwrite run() method, put what you want the thread do here
    conn = { "user": "postgres", 
             "database": "postgres",
             "unix": "/data01/pgdata/pg_root/.s.PGSQL.1921"
           }

    db = postgresql.open(**conn)
    ins = db.prepare("insert into tt values($1,$2,$3,$4,$5)")
    upd = db.prepare("update tt set age=$1 where id=$2")
    sel = db.prepare("select * from tt where id=$1")

    start_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(start_t))
    for i in range(0,125000):
      ins(random.randrange(1000001,2000000),'digoal.zhou',32,'digoal@126.com','276732431')
      upd(random.randrange(0,100), random.randrange(0,1000000))
      sel(random.randrange(0,1000000))
    stop_t = time.time()
    print("TID:" + str(self.thread_num) + " " + str(stop_t))
    print(stop_t-start_t)

def test():
  t_names = dict()
  for i in range(0,8):
    t_names[i] = n_t(i) 
    t_names[i].start()
  return

if __name__ == '__main__':
  test()

测试结果938秒 : 
postgres@localhost-> python test.py
[(1000000,)]
TID:0 1423083626.888068
TID:2 1423083626.8912995
TID:1 1423083626.8920445
TID:3 1423083626.893638
TID:4 1423083626.8974612
TID:6 1423083626.9039218
TID:5 1423083626.9061637
TID:7 1423083626.908666

TID:5 1423084561.3804135
934.4742498397827
TID:6 1423084563.4344044
936.5304825305939
TID:2 1423084564.1677904
937.2764909267426
TID:0 1423084564.5768228
937.6887547969818
TID:4 1423084564.839536
937.9420747756958
TID:1 1423084564.9242597
938.0322151184082
TID:7 1423084565.0638845
938.1552186012268
TID:3 1423084565.345857
938.4522190093994


PostgreSQL使用pgbench的测试结果 : 
# 初始化数据
psql
truncate tt;
insert into tt select generate_series(1,1000000), 'digoal.zhou',32,'digoal@126.com','276732431';

更新
postgres@localhost-> vi test.sql
\setrandom v_id 0 1000000
\setrandom v_age 0 100
update tt set age=:v_age where id=:v_id;

测试结果32秒 : 
postgres@localhost-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -t 125000
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 4
number of transactions per client: 125000
number of transactions actually processed: 1000000/1000000
tps = 31631.177435 (including connections establishing)
tps = 31637.366682 (excluding connections establishing)
statement latencies in milliseconds:
        0.002963        \setrandom v_id 0 1000000
        0.000671        \setrandom v_age 0 100
        0.232106        update tt set age=:v_age where id=:v_id;

查询
postgres@localhost-> vi test.sql
\setrandom v_id 0 1000000
select * from tt where id=:v_id;

测试结果15秒 : 
postgres@localhost-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -t 125000
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 4
number of transactions per client: 125000
number of transactions actually processed: 1000000/1000000
tps = 66487.929382 (including connections establishing)
tps = 66514.422913 (excluding connections establishing)
statement latencies in milliseconds:
        0.002007        \setrandom v_id 0 1000000
        0.104391        select * from tt where id=:v_id;

插入, 更新, 查询
postgres@localhost-> vi test.sql
\setrandom v_newid 1000001 2000000
\setrandom v_id 0 1000000
\setrandom v_age 0 100
insert into tt values(:v_newid, 'digoal.zhou',32,'digoal@126.com','276732431');
update tt set age=:v_age where id=:v_id;
select * from tt where id=:v_id;

测试结果耗时69秒 : 
postgres@localhost-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -t 125000
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 4
number of transactions per client: 125000
number of transactions actually processed: 1000000/1000000
tps = 14429.720005 (including connections establishing)
tps = 14431.006796 (excluding connections establishing)
statement latencies in milliseconds:
        0.003031        \setrandom v_newid 1000001 2000000
        0.000816        \setrandom v_id 0 1000000
        0.000733        \setrandom v_age 0 100
        0.150249        insert into tt values(:v_newid, 'digoal.zhou',32,'digoal@126.com','276732431');
        0.180603        update tt set age=:v_age where id=:v_id;
        0.190850        select * from tt where id=:v_id;


[小结]
python 测试结果 : 
mongoDB 
插入100W记录耗时 - 364秒
更新100W记录耗时 - 379
100W记录索引检索100W次耗时 - 361
综合测试, 插入 100W记录 , 更新 100W记录 , 查询 100W次耗时 - 1150

PostgreSQL
插入100W记录耗时 - 226秒.
更新100W记录耗时 - 244 秒.
100W记录索引检索100W次耗时 - 438 秒.
综合测试, 插入 100W记录 , 更新 100W记录 , 查询 100W次耗时 - 938 秒.

PostgreSQL 使用pgbench测试结果 : 
插入100W记录耗时 - 16秒
更新100W记录耗时 - 32 秒.
100W记录索引检索100W次耗时 - 15 秒.
综合测试, 插入 100W记录 , 更新 100W记录 , 查询 100W次耗时 - 69 秒.

[参考]
1.  http://blog.163.com/digoal@126/blog/static/16387704020151435825593/
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值