The Postgres 2015 China User Conference will be held November 20-21 at the Liting Huayuan Hotel in Beijing. The speaker lineup is strong: China's top PostgreSQL experts will all attend, joined by invited database experts from Europe, Russia, Japan, the United States, and elsewhere:
- SUZUKI Koichi, founder of the Postgres-XC project
- Mason Sharp, founder of the Postgres-XL project
- Tatsuo Ishii, author of pgpool
- Kaigai Kohei, author of PG-Strom
- Yao Yandong, R&D director at Greenplum
- Zhou Zhengzhong (digoal), one of the founders of the PostgreSQL China user group
- Wang Yang, database technology manager at Ping An Technology
- …
On the same host and storage, I measured the small-transaction throughput of PostgreSQL 9.4.1 and Oracle 12c.
The results are for reference only; interested readers are welcome to rerun the tests or modify the test cases.
(The two tests use different tools with different overheads, so the results may not be directly comparable.)
(Even with the same tool, driver performance can differ, so a perfectly unbiased comparison is hard to achieve.)
(The goal of this article is therefore to push each product to its own limits and surface its own problems and defects, not to stage a head-to-head comparison. It is purely for fun.)
Stress test results, summarized:
PostgreSQL 9.4.1:
- UPDATE: average TPS 95021, minimum TPS 90017, maximum TPS 113981
- SELECT: average TPS 328895, minimum TPS 327336, maximum TPS 330360
- INSERT: average TPS 70433, minimum TPS 57417.4, maximum TPS 75758.9
Oracle 12c:
Details:
- UPDATE: average TPS 32xxx, minimum TPS 29000, maximum TPS 33900
- SELECT: average TPS 36xxx, minimum TPS 36300, maximum TPS 36620
- INSERT: average TPS 9xxx, minimum TPS 8750, maximum TPS 10500
[ Test details ]
Workload:
Queries and updates keyed on the primary key.
Inserts into a table with a primary key.
Test environment:
Server: IBM X3950 purchased in 2009; its CPUs are slow by today's standards.
CPU: 4 × 6-core Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
Memory: 32 × 4GB DDR2 533MHz
Disk: Shanghai Baocun 1.2TB Direct-IO PCI-E SSD
Database: PostgreSQL 9.4.1
OS: CentOS 6.6 x64
Filesystem: EXT4, mounted with noatime,nodiratime,nobarrier
Update/query data set: 50 million rows
Insert data set: 10 billion rows
PostgreSQL database parameters:
listen_addresses = '0.0.0.0'  # what IP address(es) to listen on;
port = 1921  # (change requires restart)
max_connections = 56  # (change requires restart)
superuser_reserved_connections = 13  # (change requires restart)
unix_socket_directories = '.'  # comma-separated list of directories
tcp_keepalives_idle = 60  # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 10  # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 10  # TCP_KEEPCNT;
shared_buffers = 8GB  # min 128kB; with ample RAM, sizing this close to the database's active data set gives the best performance
huge_pages = try  # on, off, or try
maintenance_work_mem = 1GB  # min 1MB
autovacuum_work_mem = 1GB  # min 1MB, or -1 to use maintenance_work_mem
dynamic_shared_memory_type = posix  # the default is the first option
bgwriter_delay = 10ms  # 10-10000ms between rounds
synchronous_commit = off  # synchronization level;
wal_sync_method = fdatasync  # the default is the first option
wal_buffers = 16MB  # min 32kB, -1 sets based on shared_buffers
wal_writer_delay = 10ms  # 1-10000 milliseconds
checkpoint_timeout = 10min  # for databases that generate XLOG very frequently, set this longer than the time needed to fill checkpoint_segments, to reduce the performance impact.
# E.g. if producing 512 XLOG files takes 8 minutes, configure more than 8 minutes here.
# The longer this is, the longer crash recovery takes if the database goes down.
checkpoint_completion_target = 0.9
checkpoint_segments = 512  # in logfile segments, min 1, 16MB each; configure to >= shared_buffers
random_page_cost = 2.0  # same scale as above
effective_cache_size = 100GB
log_destination = 'csvlog'  # Valid values are combinations of
logging_collector = on  # Enable capturing of stderr and csvlog
log_truncate_on_rotation = on  # If on, an existing log file with the
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose  # terse, default, or verbose messages
log_timezone = 'PRC'
autovacuum = on  # Enable autovacuum subprocess? 'on'
log_autovacuum_min_duration = 0  # -1 disables, 0 logs all actions and
autovacuum_vacuum_scale_factor = 0.002  # fraction of table size before vacuum; the smaller the better for databases that generate garbage very frequently
autovacuum_analyze_scale_factor = 0.001  # fraction of table size before analyze
autovacuum_vacuum_cost_delay = 0ms  # default vacuum cost delay; do not delay on storage with very good IO capability
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C'  # locale for system error message
lc_monetary = 'C'  # locale for monetary formatting
lc_numeric = 'C'  # locale for number formatting
lc_time = 'C'  # locale for time formatting
default_text_search_config = 'pg_catalog.english'
Generate the data for the query and update stress tests:
digoal=> create table tbl(id int, info text, crt_time timestamptz default now()) tablespace tbs_digoal;
CREATE TABLE
digoal=> insert into tbl select generate_series(1,50000000),now(),now();
INSERT 0 50000000
digoal=> set maintenance_work_mem='4GB';
SET
digoal=> alter table tbl add constraint tbl_pkey primary key(id) using index tablespace tbs_digoal_idx;
ALTER TABLE
digoal=> \dt+ tbl
                   List of relations
 Schema | Name | Type  | Owner  |  Size   | Description
--------+------+-------+--------+---------+-------------
 digoal | tbl  | table | digoal | 3634 MB |
(1 row)
digoal=> \di+ tbl_pkey
                        List of relations
 Schema |   Name   | Type  | Owner  | Table |  Size   | Description
--------+----------+-------+--------+-------+---------+-------------
 digoal | tbl_pkey | index | digoal | tbl   | 1063 MB |
(1 row)
Update test keyed on the primary key; the test ran for more than 8 hours.
$ vi test.sql
\setrandom id 1 50000000
update tbl set crt_time=now() where id=:id;
nohup pgbench -M prepared -n -f test.sql -P 10 -c 26 -j 26 -T 30000000 >./log 2>&1 &
After more than 8 hours of testing, the table had grown by a little over 100 MB; the index size was unchanged.
digoal=> \dt+
                   List of relations
 Schema | Name | Type  | Owner  |  Size   | Description
--------+------+-------+--------+---------+-------------
 digoal | tbl  | table | digoal | 3842 MB |
(1 row)
digoal=> \di+
                        List of relations
 Schema |   Name   | Type  | Owner  | Table |  Size   | Description
--------+----------+-------+--------+-------+---------+-------------
 digoal | tbl_pkey | index | digoal | tbl   | 1063 MB |
(1 row)
The statistics show that tbl has been updated more than 2.1 billion times.
digoal=> select * from pg_stat_all_tables where relname='tbl';
-[ RECORD 1 ]-------+------------------------------
relid               | 16387
schemaname          | digoal
relname             | tbl
seq_scan            | 2
seq_tup_read        | 100000000
idx_scan            | 2136267592
idx_tup_fetch       | 2136267592
n_tup_ins           | 100278348
n_tup_upd           | 2136267592
n_tup_del           | 0
n_tup_hot_upd       | 2097129671
n_live_tup          | 50081001
n_dead_tup          | 135956
n_mod_since_analyze | 3111673
last_vacuum         |
last_autovacuum     | 2015-05-02 08:27:02.690159+08
last_analyze        |
last_autoanalyze    | 2015-05-02 08:27:05.800603+08
vacuum_count        | 0
autovacuum_count    | 580
analyze_count       | 0
autoanalyze_count   | 579
The test results can be loaded into the database for analysis, or analyzed with R.
digoal=> create table az(tps numeric);
CREATE TABLE
digoal=# \copy digoal.az from program 'awk ''NR>2 {print $4}'' /home/postgres/log'
COPY 3057
digoal=> select avg(tps),min(tps),max(tps),count(*) from az;
        avg         |   min   |   max   | count
--------------------+---------+---------+-------
 60217.684494602552 | 27666.0 | 65708.7 |  3057
(1 row)
Average TPS: 60217
Minimum TPS: 27666
Maximum TPS: 65708
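The awk-plus-SQL step above can also be done without a database round trip. The sketch below (a hypothetical helper, not from the original post) pulls the per-interval TPS column out of a pgbench -P progress log and summarizes it; it keys on the `progress:` prefix instead of awk's `NR>2`, and assumes pgbench's progress-line layout (`progress: 10.0 s, 60217.3 tps, ...`), which may vary slightly between pgbench versions.

```python
def summarize_pgbench_log(lines):
    """Summarize the per-interval TPS values from pgbench -P output.

    Progress lines look like:
        progress: 10.0 s, 60217.3 tps, lat 0.431 ms stddev 0.102
    The TPS value is the 4th whitespace-separated field (awk's $4).
    Returns (avg, min, max, count), or None if no progress lines are found.
    """
    tps = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 5 and fields[0] == "progress:" and fields[4].startswith("tps"):
            tps.append(float(fields[3]))
    if not tps:
        return None
    return (sum(tps) / len(tps), min(tps), max(tps), len(tps))
```

Usage: `summarize_pgbench_log(open("/home/postgres/log"))` returns the same four aggregates as the `select avg(tps),min(tps),max(tps),count(*)` query above.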
Chart: update-test TPS over time (image not reproduced here).
Every TPS dip correlates with a checkpoint. The first time a data block is dirtied after a checkpoint, a full page image must be written to WAL, which puts pressure on WAL buffer writes. (What actually matters is the ability to write several consecutive WAL-block-sized chunks: with block_size=32K and wal_block_size=8K, one newly dirtied data block costs 4 WAL blocks. If WAL fsync capability is 10000 8K blocks per second, and the writes right after a checkpoint all land on distinct data blocks, writing WAL can become the bottleneck and TPS can drop below 2500.) For the root-cause analysis, see:
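The back-of-the-envelope bound in the paragraph above can be checked in a few lines (all numbers are the article's stated assumptions, not measurements):

```python
# Assumptions taken from the text above, not measured values.
block_size = 32 * 1024          # data page size, bytes
wal_block_size = 8 * 1024       # WAL block size, bytes
fsync_blocks_per_sec = 10000    # WAL fsync capability: 8K blocks per second

# One full-page image spans block_size / wal_block_size WAL blocks.
wal_blocks_per_full_page = block_size // wal_block_size  # 4

# Worst case: every transaction right after a checkpoint dirties a
# previously clean page, so each one carries a full-page image.
tps_bound = fsync_blocks_per_sec // wal_blocks_per_full_page
print(tps_bound)  # 2500
```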
With full page writes disabled, the TPS scatter chart looks like this; the checkpoint effect disappears:
Query test, run for more than 8 hours:
$ vi test.sql
\setrandom id 1 50000000
select * from tbl where id=:id;
nohup pgbench -M prepared -n -f test.sql -P 10 -c 38 -j 38 -T 30000000 >./log 2>&1 &
Load the test results:
digoal=> create table az(tps numeric);
CREATE TABLE
digoal=> \c digoal postgres
You are now connected to database "digoal" as user "postgres".
digoal=# \copy digoal.az from program 'awk ''NR>2 {print $4}'' /home/postgres/log'
COPY 3027
digoal=> select avg(tps),min(tps),max(tps),count(*) from digoal.az;
         avg         |   min    |   max    | count
---------------------+----------+----------+-------
 328895.445688800793 | 327336.0 | 330360.6 |  3027
(1 row)
Average TPS: 328895
Minimum TPS: 327336
Maximum TPS: 330360
Chart: query-test TPS over time (image not reproduced here).
Insert test, run for more than 8 hours:
digoal=> drop table tbl;
digoal=> create table tbl(id serial primary key using index tablespace tbs_digoal_idx, info text, crt_time timestamptz default now()) tablespace tbs_digoal;
$ vi test.sql
insert into tbl(info) values ('hello world');
nohup pgbench -M prepared -n -f test.sql -P 10 -c 20 -j 20 -T 30000000 >./log 2>&1 &
Load the test results:
After about 4 hours, the volume of inserted data was as follows:
digoal=> \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+--------+--------+-------------
digoal | tbl | table | digoal | 69 GB |
(1 row)
digoal=> \di+
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+----------+-------+--------+-------+-------+-------------
digoal | tbl_pkey | index | digoal | tbl | 20 GB |
(1 row)
digoal=> create table az(tps numeric);
CREATE TABLE
digoal=> \c digoal postgres
You are now connected to database "digoal" as user "postgres".
digoal=# \copy digoal.az from program 'awk ''NR>2 {print $4}'' /home/postgres/log'
COPY 1385
digoal=# select avg(tps),min(tps),max(tps),count(*) from digoal.az;
        avg         |   min   |   max   | count
--------------------+---------+---------+-------
 69839.050685920578 | 65283.7 | 72175.5 |  1385
(1 row)
Average TPS: 70433
Minimum TPS: 57417.4
Maximum TPS: 75758.9
Chart: insert-test TPS over time (image not reproduced here).
Checkpoints also affect inserts, but far less than updates, because there are fewer concurrent full-page XLOG writes: a page is filled completely before a new one is extended.
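A toy model makes the difference concrete (the numbers `rows_per_page` and `writes_per_interval` are illustrative assumptions, not measurements from this test):

```python
# Toy model: full-page images generated per checkpoint interval,
# random updates vs sequential inserts. Illustrative numbers only.
rows_per_page = 150           # assumed rows per 32K heap page
writes_per_interval = 500000  # row writes between two checkpoints

# Random updates over a table far larger than the write volume touch
# mostly distinct pages: close to one full-page image per update.
fpw_updates = writes_per_interval  # upper bound

# Sequential inserts fill one new page at a time: only the first row
# written to each page triggers a full-page image.
fpw_inserts = writes_per_interval // rows_per_page

print(fpw_updates // fpw_inserts)  # 150: inserts need ~150x fewer images
```

The exact ratio depends on row width and block size, but the shape of the result is why the insert chart shows much shallower checkpoint dips.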
[ Other notes ]
1. Rerunning the update test on PostgreSQL 9.5, performance is likewise affected by checkpoints:
listen_addresses = '0.0.0.0' # what IP address(es) to listen on;
port = 1922 # (change requires restart)
max_connections = 100 # (change requires restart)
superuser_reserved_connections = 13 # (change requires restart)
unix_socket_directories = '.' # comma-separated list of directories
tcp_keepalives_idle = 60 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 10 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 10 # TCP_KEEPCNT;
shared_buffers = 8GB # min 128kB
huge_pages = try # on, off, or try
maintenance_work_mem = 1GB # min 1MB
autovacuum_work_mem = 1GB # min 1MB, or -1 to use maintenance_work_mem
dynamic_shared_memory_type = posix # the default is the first option
vacuum_cost_delay = 0 # 0-100 milliseconds
vacuum_cost_limit = 10000 # 1-10000 credits
bgwriter_delay = 10ms # 10-10000ms between rounds
synchronous_commit = off # synchronization level;
wal_sync_method = fdatasync # the default is the first option
wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers
wal_writer_delay = 10ms # 1-10000 milliseconds
checkpoint_timeout = 10min # range 30s-1h
max_wal_size = 16GB # sized relative to shared_buffers; works well for databases with frequent DML
min_wal_size = 512MB
random_page_cost = 2.0 # same scale as above
effective_cache_size = 64GB
log_destination = 'csvlog' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_truncate_on_rotation = on # If on, an existing log file with the
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose # terse, default, or verbose messages
log_timezone = 'PRC'
autovacuum = on # Enable autovacuum subprocess? 'on'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
autovacuum_vacuum_scale_factor = 0.002 # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.001 # fraction of table size before analyze
autovacuum_vacuum_cost_delay = 0 # default vacuum cost delay for
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
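The 9.4 run used checkpoint_segments = 512 while the 9.5 config above uses max_wal_size = 16GB. The two knobs can be related by a rule of thumb (a sketch based on the 9.4 documentation's estimate that roughly (2 + checkpoint_completion_target) × checkpoint_segments WAL files are kept around; treat the result as an approximation, not an official formula):

```python
# Rule of thumb relating the 9.4 and 9.5 checkpoint knobs.
# 9.4 retained roughly (2 + checkpoint_completion_target) *
# checkpoint_segments WAL segments of 16MB each; 9.5's max_wal_size
# caps approximately the same quantity directly.
SEGMENT_MB = 16

def equivalent_max_wal_size_mb(checkpoint_segments, completion_target=0.9):
    return int((2 + completion_target) * checkpoint_segments * SEGMENT_MB)

print(equivalent_max_wal_size_mb(512) // 1024)  # 23 (GB)
```

By this estimate the 9.4 settings correspond to roughly 23GB of WAL, so the 16GB cap above checkpoints somewhat more often than the 9.4 configuration did.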
As for why checkpoints have such a large impact, a later article will analyze the cause from the checkpoint source code.
[ Summary ]
1. The results reflect a weakness in PostgreSQL's checkpointing: the impact is too large. (It actually has little to do with the IO issued by the checkpointer itself; the update test case generates a burst of FULL PAGE WRITEs right after each checkpoint, which increases WAL write-buffer latency and hurts TPS.)
Interested readers can find the analysis in my other articles.
2. If you do not want to use pgbench to test PG, you can also use python, but because psycopg2 currently does not support bind variables, it is much less efficient.
For the reason, see:
import threading
import time
import psycopg2
import random

xs = 12000
tps = dict()

class n_t(threading.Thread):  # The timer class is derived from the class threading.Thread
    def __init__(self, num):
        threading.Thread.__init__(self)
        self.thread_num = num

    def run(self):  # Overwrite run() method, put what you want the thread do here
        conn = psycopg2.connect(database="digoal", user="digoal", password="digoal", host="/data01/pgdata/pg_root", port="1922")
        curs = conn.cursor()
        conn.autocommit = True
        tps[self.thread_num] = dict()
        f = open("/tmp/pg_test." + str(self.thread_num), "w")
        for x in range(1,3001):
            start_t = time.time()
            for i in range(0,xs):
                curs.execute("update tbl set info=now(),crt_time=now() where id=%(id)s", {"id": random.randrange(1,50000000)})
            stop_t = time.time()
            tps[self.thread_num][x] = round(xs/(stop_t-start_t),2)
            res = "Round: " + str(x) + " TID: " + str(self.thread_num) + " Sec: " + str(round((stop_t-start_t),2)) + " tps: " + str(tps[self.thread_num][x])
            print >> f, res
            f.flush()
        f.close()

def test():
    t_names = []
    for i in xrange(0,27):
        t_names.append(n_t(i))
    for t in t_names:
        t.start()
    return

if __name__ == '__main__':
    test()
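Because psycopg2 sends plain query strings, the script above pays for parsing and planning on every UPDATE. One workaround (a sketch, not from the original article; the `PreparedUpdater` class and the statement name `upd` are hypothetical) is to route through PostgreSQL's server-side PREPARE/EXECUTE, which caches the plan per session while still using ordinary `execute()` calls:

```python
class PreparedUpdater:
    """Wrap a DB-API cursor so the UPDATE is planned once per session.

    PostgreSQL's server-side PREPARE/EXECUTE gives plan caching even
    though psycopg2 itself has no prepared-statement API.
    """

    def __init__(self, curs, stmt_name="upd"):
        self.curs = curs
        self.stmt_name = stmt_name
        # Planned once; later EXECUTE calls reuse the cached plan.
        curs.execute(
            "PREPARE %s (int) AS "
            "UPDATE tbl SET info = now(), crt_time = now() WHERE id = $1"
            % stmt_name
        )

    def update(self, row_id):
        # Only the parameter value travels to the already-planned statement.
        self.curs.execute("EXECUTE " + self.stmt_name + "(%s)", (row_id,))
```

In the script's `run()` method this would be used as `upd = PreparedUpdater(curs)` after connecting, with the inner loop body replaced by `upd.update(random.randrange(1,50000000))`. Note the prepared statement lives only as long as the session.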