PostgreSQL 11 1000亿 tpcb、1000W tpcc 性能测试 - on 阿里云ECS + ESSD (含quoru...

标签

PostgreSQL , pgbench , tpcb , tpcc , tpch


背景

https://help.aliyun.com/knowledge_detail/64950.html

阿里云ESSD提供了单盘32TB容量,100万IOPS,4GB/s读写吞吐的能力,单台ECS可以挂载16块ESSD盘,组成512 TB的大容量存储。非常适合数据库这类IO密集应用。

PostgreSQL 作为一款优秀的企业级开源数据库产品,阿里云ESSD的加入,可以带给用户什么样的体验呢?

《PostgreSQL 11 100亿 tpcb 性能 on ECS》

《PostgreSQL 11 tpcc 测试(103万tpmC on ECS) - use sysbench-tpcc by Percona-Lab》

《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》

ESSD云盘部署

parted -s /dev/vdb mklabel gpt          
parted -s /dev/vdb mkpart primary 1MiB 100%         
mkfs.ext4 /dev/vdb1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L data01          
mkdir /data01            
          
vi /etc/fstab            
            
LABEL=data01 /data01     ext4        defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback    0 0          
          
          
mount -a      

fsync RT测试

1 ECS本地SSD

使用PostgreSQL提供的pg_test_fsync进行测试

5 seconds per test    
O_DIRECT supported on this platform for open_datasync and open_sync.    
    
Compare file sync methods using one 8kB write:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                     46355.824 ops/sec      22 usecs/op    
        fdatasync                         39213.835 ops/sec      26 usecs/op    
        fsync                             35912.478 ops/sec      28 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                         42426.737 ops/sec      24 usecs/op    
    
Compare file sync methods using two 8kB writes:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                     17109.945 ops/sec      58 usecs/op    
        fdatasync                         26316.089 ops/sec      38 usecs/op    
        fsync                             24202.679 ops/sec      41 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                         15760.721 ops/sec      63 usecs/op    
    
Compare open_sync with different write sizes:    
(This is designed to compare the cost of writing 16kB in different write    
open_sync sizes.)    
         1 * 16kB open_sync write         29108.820 ops/sec      34 usecs/op    
         2 *  8kB open_sync writes        15674.805 ops/sec      64 usecs/op    
         4 *  4kB open_sync writes         9942.061 ops/sec     101 usecs/op    
         8 *  2kB open_sync writes         5637.484 ops/sec     177 usecs/op    
        16 *  1kB open_sync writes         3076.057 ops/sec     325 usecs/op    
    
Test if fsync on non-write file descriptor is honored:    
(If the times are similar, fsync() can sync data written on a different    
descriptor.)    
        write, fsync, close               32581.863 ops/sec      31 usecs/op    
        write, close, fsync               32512.798 ops/sec      31 usecs/op    
    
Non-sync'ed 8kB writes:    
        write                            350232.219 ops/sec       3 usecs/op    

2 ESSD

5 seconds per test    
O_DIRECT supported on this platform for open_datasync and open_sync.    
    
Compare file sync methods using one 8kB write:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                      8395.592 ops/sec     119 usecs/op    
        fdatasync                          7722.692 ops/sec     129 usecs/op    
        fsync                              5619.389 ops/sec     178 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                          5685.669 ops/sec     176 usecs/op    
    
Compare file sync methods using two 8kB writes:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                      3858.783 ops/sec     259 usecs/op    
        fdatasync                          5396.356 ops/sec     185 usecs/op    
        fsync                              4214.546 ops/sec     237 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                          3025.366 ops/sec     331 usecs/op    
    
Compare open_sync with different write sizes:    
(This is designed to compare the cost of writing 16kB in different write    
open_sync sizes.)    
         1 * 16kB open_sync write          4506.749 ops/sec     222 usecs/op    
         2 *  8kB open_sync writes         3099.963 ops/sec     323 usecs/op    
         4 *  4kB open_sync writes         1763.684 ops/sec     567 usecs/op    
         8 *  2kB open_sync writes          429.923 ops/sec    2326 usecs/op    
        16 *  1kB open_sync writes          198.005 ops/sec    5050 usecs/op    
    
Test if fsync on non-write file descriptor is honored:    
(If the times are similar, fsync() can sync data written on a different    
descriptor.)    
        write, fsync, close                5393.927 ops/sec     185 usecs/op    
        write, close, fsync                5470.240 ops/sec     183 usecs/op    
    
Non-sync'ed 8kB writes:    
        write                            385505.858 ops/sec       3 usecs/op    

fio 专业IO测试

测试项

vi test    
    
[global]        
thread        
numjobs=64        
ramp_time=6        
size=10g        
exitall        
time_based        
runtime=180        
group_reporting        
randrepeat=0        
norandommap        
bs=8k        
rwmixwrite=35        
        
[rw-rand-libaio-mysql-ext4]        
stonewall        
direct=1        
iodepth=16        
iodepth_batch=8        
iodepth_low=8        
iodepth_batch_complete=8        
rw=randrw        
ioengine=libaio        
filename=/data01/ext4        
        
[rw-seq-libaio-mysql-ext4]        
stonewall        
direct=1        
iodepth=16        
iodepth_batch=8        
iodepth_low=8        
iodepth_batch_complete=8        
rw=rw        
ioengine=libaio        
filename=/data01/ext4        
    
[rw-rand-sync-pgsql-ext4]        
stonewall        
direct=0        
rw=randrw        
ioengine=sync        
filename=/data01/ext4        
        
[rw-seq-sync-pgsql-ext4]        
stonewall        
direct=0        
rw=rw        
ioengine=sync        
filename=/data01/ext4    
fio test --output ./cfq-raw.log    

1 ECS本地SSD

rw-rand-libaio-mysql-ext4: (g=0): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-seq-libaio-mysql-ext4: (g=1): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-rand-sync-pgsql-ext4: (g=2): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
rw-seq-sync-pgsql-ext4: (g=3): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
fio-3.1    
Starting 256 threads    
    
rw-rand-libaio-mysql-ext4: (groupid=0, jobs=64): err= 0: pid=27005: Tue Sep 18 15:18:42 2018    
   read: IOPS=146k, BW=1140MiB/s (1196MB/s)(200GiB/180004msec)    
    slat (usec): min=22, max=114828, avg=2266.17, stdev=2520.66    
    clat (nsec): min=1759, max=115061k, avg=2316672.58, stdev=2575502.75    
     lat (usec): min=70, max=164337, avg=4582.93, stdev=3639.95    
    clat percentiles (usec):    
     |  1.00th=[  603],  5.00th=[ 1352], 10.00th=[ 1549], 20.00th=[ 1713],    
     | 30.00th=[ 1811], 40.00th=[ 1876], 50.00th=[ 1942], 60.00th=[ 2008],    
     | 70.00th=[ 2057], 80.00th=[ 2114], 90.00th=[ 2245], 95.00th=[ 2376],    
     | 99.00th=[16581], 99.50th=[19792], 99.90th=[27919], 99.95th=[32900],    
     | 99.99th=[49021]    
   bw (  KiB/s): min= 6000, max=42120, per=1.54%, avg=17968.83, stdev=2125.54, samples=22976    
   iops        : min=  750, max= 5265, avg=2245.75, stdev=265.70, samples=22976    
  write: IOPS=78.6k, BW=614MiB/s (644MB/s)(108GiB/180004msec)    
    slat (usec): min=24, max=114826, avg=2266.44, stdev=2515.63    
    clat (nsec): min=1056, max=114853k, avg=2225812.28, stdev=2478372.79    
     lat (usec): min=43, max=164125, avg=4492.34, stdev=3574.60    
    clat percentiles (usec):    
     |  1.00th=[   19],  5.00th=[ 1123], 10.00th=[ 1500], 20.00th=[ 1696],    
     | 30.00th=[ 1795], 40.00th=[ 1876], 50.00th=[ 1942], 60.00th=[ 1991],    
     | 70.00th=[ 2057], 80.00th=[ 2114], 90.00th=[ 2212], 95.00th=[ 2311],    
     | 99.00th=[16057], 99.50th=[19268], 99.90th=[27132], 99.95th=[31327],    
     | 99.99th=[47449]    
   bw (  KiB/s): min= 3235, max=22265, per=1.54%, avg=9680.55, stdev=1180.58, samples=22976    
   iops        : min=  404, max= 2783, avg=1209.72, stdev=147.58, samples=22976    
  lat (usec)   : 2=0.01%, 4=0.06%, 10=0.04%, 20=0.33%, 50=0.37%    
  lat (usec)   : 100=0.16%, 250=0.21%, 500=0.37%, 750=0.59%, 1000=0.84%    
  lat (msec)   : 2=56.76%, 4=36.91%, 10=0.82%, 20=2.08%, 50=0.44%    
  lat (msec)   : 100=0.01%, 250=0.01%    
  cpu          : usr=0.50%, sys=82.17%, ctx=8285760, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=103.4%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=26268555,14152416,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-seq-libaio-mysql-ext4: (groupid=1, jobs=64): err= 0: pid=27075: Tue Sep 18 15:18:42 2018    
   read: IOPS=147k, BW=1150MiB/s (1206MB/s)(202GiB/180004msec)    
    slat (usec): min=22, max=97402, avg=2245.77, stdev=2561.30    
    clat (nsec): min=1597, max=97414k, avg=2299719.84, stdev=2622137.64    
     lat (usec): min=37, max=121819, avg=4545.57, stdev=3688.31    
    clat percentiles (usec):    
     |  1.00th=[  529],  5.00th=[ 1287], 10.00th=[ 1516], 20.00th=[ 1680],    
     | 30.00th=[ 1778], 40.00th=[ 1844], 50.00th=[ 1909], 60.00th=[ 1975],    
     | 70.00th=[ 2040], 80.00th=[ 2114], 90.00th=[ 2212], 95.00th=[ 2376],    
     | 99.00th=[16712], 99.50th=[19792], 99.90th=[28181], 99.95th=[32375],    
     | 99.99th=[47973]    
   bw (  KiB/s): min= 6822, max=34080, per=1.53%, avg=18044.54, stdev=2237.76, samples=22979    
   iops        : min=  852, max= 4260, avg=2255.24, stdev=279.74, samples=22979    
  write: IOPS=79.3k, BW=620MiB/s (650MB/s)(109GiB/180004msec)    
    slat (usec): min=24, max=97396, avg=2247.37, stdev=2563.13    
    clat (nsec): min=965, max=97414k, avg=2198816.49, stdev=2518690.40    
     lat (usec): min=40, max=121817, avg=4446.27, stdev=3625.49    
    clat percentiles (usec):    
     |  1.00th=[   17],  5.00th=[  996], 10.00th=[ 1434], 20.00th=[ 1647],    
     | 30.00th=[ 1762], 40.00th=[ 1844], 50.00th=[ 1909], 60.00th=[ 1958],    
     | 70.00th=[ 2024], 80.00th=[ 2089], 90.00th=[ 2180], 95.00th=[ 2278],    
     | 99.00th=[16188], 99.50th=[19268], 99.90th=[27132], 99.95th=[31327],    
     | 99.99th=[46400]    
   bw (  KiB/s): min= 3073, max=18400, per=1.53%, avg=9720.67, stdev=1243.91, samples=22979    
   iops        : min=  384, max= 2300, avg=1214.75, stdev=155.52, samples=22979    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.01%, 4=0.09%, 10=0.06%, 20=0.50%, 50=0.28%    
  lat (usec)   : 100=0.13%, 250=0.20%, 500=0.54%, 750=0.71%, 1000=0.98%    
  lat (msec)   : 2=61.58%, 4=31.54%, 10=0.78%, 20=2.16%, 50=0.46%    
  lat (msec)   : 100=0.01%    
  cpu          : usr=0.50%, sys=81.14%, ctx=8462673, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=103.3%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=26505435,14277757,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-rand-sync-pgsql-ext4: (groupid=2, jobs=64): err= 0: pid=27150: Tue Sep 18 15:18:42 2018    
   read: IOPS=228k, BW=1785MiB/s (1872MB/s)(314GiB/180002msec)    
    clat (nsec): min=945, max=14282k, avg=5002.95, stdev=48977.41    
     lat (nsec): min=1247, max=14282k, avg=5374.56, stdev=49037.52    
    clat percentiles (usec):    
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    4],    
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],    
     | 70.00th=[    4], 80.00th=[    4], 90.00th=[    5], 95.00th=[    5],    
     | 99.00th=[   12], 99.50th=[   90], 99.90th=[  212], 99.95th=[  586],    
     | 99.99th=[ 2180]    
   bw (  KiB/s): min= 4824, max=77689, per=1.55%, avg=28302.49, stdev=3542.33, samples=22979    
   iops        : min=  603, max= 9711, avg=3537.39, stdev=442.79, samples=22979    
  write: IOPS=123k, BW=961MiB/s (1008MB/s)(169GiB/180002msec)    
    clat (usec): min=2, max=216155, avg=504.28, stdev=1314.99    
     lat (usec): min=2, max=216155, avg=504.72, stdev=1315.00    
    clat percentiles (usec):    
     |  1.00th=[    8],  5.00th=[   77], 10.00th=[  285], 20.00th=[  392],    
     | 30.00th=[  408], 40.00th=[  420], 50.00th=[  469], 60.00th=[  519],    
     | 70.00th=[  545], 80.00th=[  562], 90.00th=[  586], 95.00th=[  603],    
     | 99.00th=[  685], 99.50th=[ 2089], 99.90th=[15270], 99.95th=[23462],    
     | 99.99th=[46924]    
   bw (  KiB/s): min= 2661, max=42215, per=1.55%, avg=15240.76, stdev=1812.18, samples=22979    
   iops        : min=  332, max= 5276, avg=1904.70, stdev=226.51, samples=22979    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.01%, 4=56.34%, 10=8.49%, 20=1.04%, 50=0.35%    
  lat (usec)   : 100=0.43%, 250=1.42%, 500=16.05%, 750=15.58%, 1000=0.04%    
  lat (msec)   : 2=0.07%, 4=0.05%, 10=0.07%, 20=0.04%, 50=0.02%    
  lat (msec)   : 100=0.01%, 250=0.01%    
  cpu          : usr=0.93%, sys=89.47%, ctx=5631403, majf=0, minf=0    
  IO depths    : 1=102.3%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=41127965,22146325,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
rw-seq-sync-pgsql-ext4: (groupid=3, jobs=64): err= 0: pid=27221: Tue Sep 18 15:18:42 2018    
   read: IOPS=357k, BW=2786MiB/s (2922MB/s)(490GiB/180001msec)    
    clat (nsec): min=1006, max=20336k, avg=2745.46, stdev=22291.35    
     lat (nsec): min=1307, max=20336k, avg=3110.59, stdev=22301.63    
    clat percentiles (nsec):    
     |  1.00th=[ 1592],  5.00th=[ 1800], 10.00th=[ 1944], 20.00th=[ 2160],    
     | 30.00th=[ 2352], 40.00th=[ 2480], 50.00th=[ 2608], 60.00th=[ 2736],    
     | 70.00th=[ 2864], 80.00th=[ 3024], 90.00th=[ 3216], 95.00th=[ 3440],    
     | 99.00th=[ 4016], 99.50th=[ 8896], 99.90th=[15552], 99.95th=[17280],    
     | 99.99th=[21376]    
   bw (  KiB/s): min=21099, max=151871, per=1.56%, avg=44588.93, stdev=5219.41, samples=22983    
   iops        : min= 2637, max=18983, avg=5573.29, stdev=652.43, samples=22983    
  write: IOPS=192k, BW=1500MiB/s (1573MB/s)(264GiB/180001msec)    
    clat (usec): min=2, max=97210, avg=322.71, stdev=828.04    
     lat (usec): min=2, max=97210, avg=323.13, stdev=828.04    
    clat percentiles (usec):    
     |  1.00th=[    6],  5.00th=[   97], 10.00th=[  182], 20.00th=[  269],    
     | 30.00th=[  297], 40.00th=[  306], 50.00th=[  310], 60.00th=[  314],    
     | 70.00th=[  322], 80.00th=[  326], 90.00th=[  334], 95.00th=[  343],    
     | 99.00th=[  400], 99.50th=[  930], 99.90th=[12911], 99.95th=[19792],    
     | 99.99th=[32113]    
   bw (  KiB/s): min=11433, max=81619, per=1.56%, avg=24008.70, stdev=2707.40, samples=22983    
   iops        : min= 1429, max=10202, avg=3000.77, stdev=338.43, samples=22983    
  lat (usec)   : 2=7.89%, 4=56.48%, 10=1.25%, 20=0.37%, 50=0.25%    
  lat (usec)   : 100=0.57%, 250=4.14%, 500=28.80%, 750=0.06%, 1000=0.03%    
  lat (msec)   : 2=0.05%, 4=0.03%, 10=0.04%, 20=0.03%, 50=0.02%    
  lat (msec)   : 100=0.01%    
  cpu          : usr=1.34%, sys=90.37%, ctx=8645391, majf=0, minf=0    
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=64196338,34566565,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
    
Run status group 0 (all jobs):    
   READ: bw=1140MiB/s (1196MB/s), 1140MiB/s-1140MiB/s (1196MB/s-1196MB/s), io=200GiB (215GB), run=180004-180004msec    
  WRITE: bw=614MiB/s (644MB/s), 614MiB/s-614MiB/s (644MB/s-644MB/s), io=108GiB (116GB), run=180004-180004msec    
    
Run status group 1 (all jobs):    
   READ: bw=1150MiB/s (1206MB/s), 1150MiB/s-1150MiB/s (1206MB/s-1206MB/s), io=202GiB (217GB), run=180004-180004msec    
  WRITE: bw=620MiB/s (650MB/s), 620MiB/s-620MiB/s (650MB/s-650MB/s), io=109GiB (117GB), run=180004-180004msec    
    
Run status group 2 (all jobs):    
   READ: bw=1785MiB/s (1872MB/s), 1785MiB/s-1785MiB/s (1872MB/s-1872MB/s), io=314GiB (337GB), run=180002-180002msec    
  WRITE: bw=961MiB/s (1008MB/s), 961MiB/s-961MiB/s (1008MB/s-1008MB/s), io=169GiB (181GB), run=180002-180002msec    
    
Run status group 3 (all jobs):    
   READ: bw=2786MiB/s (2922MB/s), 2786MiB/s-2786MiB/s (2922MB/s-2922MB/s), io=490GiB (526GB), run=180001-180001msec    
  WRITE: bw=1500MiB/s (1573MB/s), 1500MiB/s-1500MiB/s (1573MB/s-1573MB/s), io=264GiB (283GB), run=180001-180001msec    
    
Disk stats (read/write):    
    dm-0: ios=56484741/52000097, merge=0/0, ticks=55721482/45621907, in_queue=102865767, util=67.65%, aggrios=6930165/6249265, aggrmerge=130427/250746, aggrticks=6648860/4061887, aggrin_queue=10712820, aggrutil=66.53%    
  vdb: ios=6928564/6249448, merge=130186/250641, ticks=6629128/3928950, in_queue=10559055, util=65.93%    
  vdc: ios=6930486/6248991, merge=130413/250950, ticks=6643790/3962244, in_queue=10605847, util=66.02%    
  vdd: ios=6928089/6250855, merge=130732/250764, ticks=6472207/4009640, in_queue=10493342, util=66.18%    
  vde: ios=6929909/6250351, merge=130382/250303, ticks=6661137/4040922, in_queue=10701353, util=66.20%    
  vdf: ios=6932429/6245563, merge=130328/251129, ticks=6681549/4088606, in_queue=10769793, util=66.29%    
  vdg: ios=6930521/6249106, merge=130467/250956, ticks=6697543/4113859, in_queue=10811547, util=66.35%    
  vdh: ios=6930993/6249837, merge=130681/250844, ticks=6698661/4151500, in_queue=10851206, util=66.41%    
  vdi: ios=6930333/6249975, merge=130228/250384, ticks=6706869/4199379, in_queue=10910423, util=66.53%    

2 ESSD

rw-rand-libaio-mysql-ext4: (g=0): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-seq-libaio-mysql-ext4: (g=1): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-rand-sync-pgsql-ext4: (g=2): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
rw-seq-sync-pgsql-ext4: (g=3): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
fio-3.1    
Starting 256 threads    
    
rw-rand-libaio-mysql-ext4: (groupid=0, jobs=64): err= 0: pid=21221: Tue Sep 18 15:19:03 2018    
   read: IOPS=112k, BW=879MiB/s (922MB/s)(154GiB/180008msec)    
    slat (usec): min=16, max=119403, avg=1880.80, stdev=2119.90    
    clat (nsec): min=940, max=512641k, avg=4098774.34, stdev=3181162.93    
     lat (usec): min=158, max=514427, avg=5979.64, stdev=3696.51    
    clat percentiles (usec):    
     |  1.00th=[    3],  5.00th=[  486], 10.00th=[  922], 20.00th=[ 1729],    
     | 30.00th=[ 2474], 40.00th=[ 2933], 50.00th=[ 3425], 60.00th=[ 4080],    
     | 70.00th=[ 5014], 80.00th=[ 6128], 90.00th=[ 7832], 95.00th=[ 9503],    
     | 99.00th=[14091], 99.50th=[16450], 99.90th=[23200], 99.95th=[29230],    
     | 99.99th=[61604]    
   bw (  KiB/s): min= 8615, max=31056, per=1.57%, avg=14172.78, stdev=990.53, samples=23040    
   iops        : min= 1076, max= 3882, avg=1771.17, stdev=123.81, samples=23040    
  write: IOPS=60.6k, BW=473MiB/s (496MB/s)(83.2GiB/180008msec)    
    slat (usec): min=17, max=119400, avg=1881.51, stdev=2122.86    
    clat (nsec): min=1019, max=510353k, avg=3892737.90, stdev=3154201.30    
     lat (usec): min=118, max=512897, avg=5774.32, stdev=3688.63    
    clat percentiles (usec):    
     |  1.00th=[    3],  5.00th=[  388], 10.00th=[  799], 20.00th=[ 1500],    
     | 30.00th=[ 2278], 40.00th=[ 2802], 50.00th=[ 3261], 60.00th=[ 3851],    
     | 70.00th=[ 4752], 80.00th=[ 5866], 90.00th=[ 7504], 95.00th=[ 9241],    
     | 99.00th=[13829], 99.50th=[16188], 99.90th=[22676], 99.95th=[28181],    
     | 99.99th=[58459]    
   bw (  KiB/s): min= 4137, max=17712, per=1.57%, avg=7634.48, stdev=590.18, samples=23040    
   iops        : min=  517, max= 2214, avg=953.86, stdev=73.79, samples=23040    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.71%, 4=1.40%, 10=0.17%, 20=0.04%, 50=0.11%    
  lat (usec)   : 100=0.20%, 250=0.80%, 500=2.06%, 750=2.89%, 1000=3.31%    
  lat (msec)   : 2=12.56%, 4=35.84%, 10=36.00%, 20=3.73%, 50=0.17%    
  lat (msec)   : 100=0.01%, 250=0.01%, 750=0.01%    
  cpu          : usr=0.31%, sys=49.41%, ctx=17838909, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=103.3%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=20248219,10907423,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-seq-libaio-mysql-ext4: (groupid=1, jobs=64): err= 0: pid=21285: Tue Sep 18 15:19:03 2018    
   read: IOPS=133k, BW=1037MiB/s (1087MB/s)(182GiB/180005msec)    
    slat (usec): min=10, max=40443, avg=861.52, stdev=1493.93    
    clat (nsec): min=953, max=508067k, avg=4239522.11, stdev=2801559.16    
     lat (usec): min=188, max=510200, avg=5101.11, stdev=3015.85    
    clat percentiles (usec):    
     |  1.00th=[  420],  5.00th=[  914], 10.00th=[ 1319], 20.00th=[ 2024],    
     | 30.00th=[ 2671], 40.00th=[ 3261], 50.00th=[ 3851], 60.00th=[ 4490],    
     | 70.00th=[ 5211], 80.00th=[ 6194], 90.00th=[ 7635], 95.00th=[ 8979],    
     | 99.00th=[11731], 99.50th=[12911], 99.90th=[15795], 99.95th=[17171],    
     | 99.99th=[21890]    
   bw (  KiB/s): min=  602, max=22672, per=1.53%, avg=16220.02, stdev=2446.89, samples=23020    
   iops        : min=   75, max= 2834, avg=2027.07, stdev=305.86, samples=23020    
  write: IOPS=71.4k, BW=558MiB/s (585MB/s)(98.1GiB/180005msec)    
    slat (usec): min=12, max=40442, avg=871.28, stdev=1501.73    
    clat (nsec): min=913, max=509492k, avg=3964521.95, stdev=2769947.35    
     lat (usec): min=129, max=511537, avg=4835.88, stdev=2992.88    
    clat percentiles (usec):    
     |  1.00th=[  223],  5.00th=[  791], 10.00th=[ 1172], 20.00th=[ 1827],    
     | 30.00th=[ 2409], 40.00th=[ 2999], 50.00th=[ 3589], 60.00th=[ 4228],    
     | 70.00th=[ 4883], 80.00th=[ 5866], 90.00th=[ 7242], 95.00th=[ 8455],    
     | 99.00th=[11207], 99.50th=[12387], 99.90th=[15008], 99.95th=[16319],    
     | 99.99th=[20317]    
   bw (  KiB/s): min=  309, max=12471, per=1.53%, avg=8733.75, stdev=1342.51, samples=23020    
   iops        : min=   38, max= 1558, avg=1091.29, stdev=167.81, samples=23020    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.31%, 4=0.36%, 10=0.02%, 20=0.01%, 50=0.01%    
  lat (usec)   : 100=0.01%, 250=0.08%, 500=0.82%, 750=2.09%, 1000=2.82%    
  lat (msec)   : 2=14.32%, 4=33.12%, 10=43.55%, 20=2.47%, 50=0.01%    
  lat (msec)   : 100=0.01%, 250=0.01%, 750=0.01%    
  cpu          : usr=0.41%, sys=9.90%, ctx=21948046, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=102.1%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=23881029,12859729,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-rand-sync-pgsql-ext4: (groupid=2, jobs=64): err= 0: pid=21355: Tue Sep 18 15:19:03 2018    
   read: IOPS=304k, BW=2374MiB/s (2489MB/s)(417GiB/180001msec)    
    clat (nsec): min=659, max=20344k, avg=4937.11, stdev=53635.32    
     lat (nsec): min=830, max=20344k, avg=5147.67, stdev=53639.04    
    clat percentiles (nsec):    
     |  1.00th=[   1928],  5.00th=[   2096], 10.00th=[   2192],    
     | 20.00th=[   2352], 30.00th=[   2448], 40.00th=[   2544],    
     | 50.00th=[   2640], 60.00th=[   2736], 70.00th=[   2864],    
     | 80.00th=[   2992], 90.00th=[   3248], 95.00th=[   3504],    
     | 99.00th=[   8640], 99.50th=[  16512], 99.90th=[ 585728],    
     | 99.95th=[1122304], 99.99th=[2375680]    
   bw (  KiB/s): min= 8661, max=312564, per=1.57%, avg=38145.72, stdev=5837.53, samples=23040    
   iops        : min= 1082, max=39070, avg=4767.84, stdev=729.68, samples=23040    
  write: IOPS=164k, BW=1278MiB/s (1340MB/s)(225GiB/180001msec)    
    clat (usec): min=2, max=517110, avg=377.98, stdev=1501.97    
     lat (usec): min=2, max=517110, avg=378.25, stdev=1501.97    
    clat percentiles (usec):    
     |  1.00th=[    6],  5.00th=[   99], 10.00th=[  265], 20.00th=[  330],    
     | 30.00th=[  343], 40.00th=[  351], 50.00th=[  355], 60.00th=[  363],    
     | 70.00th=[  371], 80.00th=[  379], 90.00th=[  396], 95.00th=[  416],    
     | 99.00th=[  545], 99.50th=[ 1045], 99.90th=[11469], 99.95th=[16581],    
     | 99.99th=[28967]    
   bw (  KiB/s): min= 4709, max=168193, per=1.57%, avg=20539.14, stdev=3068.53, samples=23040    
   iops        : min=  588, max=21024, avg=2567.01, stdev=383.55, samples=23040    
  lat (nsec)   : 750=0.01%, 1000=0.01%    
  lat (usec)   : 2=1.33%, 4=62.24%, 10=1.91%, 20=0.50%, 50=0.18%    
  lat (usec)   : 100=0.32%, 250=1.63%, 500=31.34%, 750=0.27%, 1000=0.06%    
  lat (msec)   : 2=0.07%, 4=0.04%, 10=0.06%, 20=0.03%, 50=0.01%    
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%    
  cpu          : usr=0.53%, sys=89.73%, ctx=7630423, majf=0, minf=0    
  IO depths    : 1=102.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=54689028,29447045,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
rw-seq-sync-pgsql-ext4: (groupid=3, jobs=64): err= 0: pid=21420: Tue Sep 18 15:19:03 2018    
   read: IOPS=400k, BW=3128MiB/s (3280MB/s)(550GiB/180002msec)    
    clat (nsec): min=655, max=77256k, avg=1908.86, stdev=16100.37    
     lat (nsec): min=818, max=77256k, avg=2119.98, stdev=16102.12    
    clat percentiles (nsec):    
     |  1.00th=[  988],  5.00th=[ 1144], 10.00th=[ 1256], 20.00th=[ 1416],    
     | 30.00th=[ 1576], 40.00th=[ 1704], 50.00th=[ 1816], 60.00th=[ 1928],    
     | 70.00th=[ 2040], 80.00th=[ 2192], 90.00th=[ 2384], 95.00th=[ 2576],    
     | 99.00th=[ 3184], 99.50th=[ 5472], 99.90th=[13504], 99.95th=[15424],    
     | 99.99th=[18304]    
   bw (  KiB/s): min=27335, max=205560, per=1.57%, avg=50224.99, stdev=4880.87, samples=23040    
   iops        : min= 3416, max=25695, avg=6277.74, stdev=610.11, samples=23040    
  write: IOPS=216k, BW=1684MiB/s (1766MB/s)(296GiB/180002msec)    
    clat (nsec): min=1980, max=344807k, avg=289811.56, stdev=750064.44    
     lat (usec): min=2, max=344807, avg=290.08, stdev=750.06    
    clat percentiles (usec):    
     |  1.00th=[    5],  5.00th=[  106], 10.00th=[  194], 20.00th=[  258],    
     | 30.00th=[  273], 40.00th=[  277], 50.00th=[  285], 60.00th=[  289],    
     | 70.00th=[  293], 80.00th=[  297], 90.00th=[  306], 95.00th=[  310],    
     | 99.00th=[  351], 99.50th=[  799], 99.90th=[ 8356], 99.95th=[12387],    
     | 99.99th=[24249]    
   bw (  KiB/s): min=13581, max=112633, per=1.57%, avg=27046.57, stdev=2507.74, samples=23040    
   iops        : min= 1697, max=14079, avg=3380.43, stdev=313.47, samples=23040    
  lat (nsec)   : 750=0.01%, 1000=0.74%    
  lat (usec)   : 2=42.27%, 4=21.80%, 10=1.17%, 20=0.26%, 50=0.13%    
  lat (usec)   : 100=0.33%, 250=4.51%, 500=28.59%, 750=0.03%, 1000=0.02%    
  lat (msec)   : 2=0.05%, 4=0.04%, 10=0.05%, 20=0.02%, 50=0.01%    
  lat (msec)   : 100=0.01%, 500=0.01%    
  cpu          : usr=0.63%, sys=92.37%, ctx=9426487, majf=0, minf=0    
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=72069547,38810440,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
    
Run status group 0 (all jobs):    
   READ: bw=879MiB/s (922MB/s), 879MiB/s-879MiB/s (922MB/s-922MB/s), io=154GiB (166GB), run=180008-180008msec    
  WRITE: bw=473MiB/s (496MB/s), 473MiB/s-473MiB/s (496MB/s-496MB/s), io=83.2GiB (89.4GB), run=180008-180008msec    
    
Run status group 1 (all jobs):    
   READ: bw=1037MiB/s (1087MB/s), 1037MiB/s-1037MiB/s (1087MB/s-1087MB/s), io=182GiB (196GB), run=180005-180005msec    
  WRITE: bw=558MiB/s (585MB/s), 558MiB/s-558MiB/s (585MB/s-585MB/s), io=98.1GiB (105GB), run=180005-180005msec    
    
Run status group 2 (all jobs):    
   READ: bw=2374MiB/s (2489MB/s), 2374MiB/s-2374MiB/s (2489MB/s-2489MB/s), io=417GiB (448GB), run=180001-180001msec    
  WRITE: bw=1278MiB/s (1340MB/s), 1278MiB/s-1278MiB/s (1340MB/s-1340MB/s), io=225GiB (241GB), run=180001-180001msec    
    
Run status group 3 (all jobs):    
   READ: bw=3128MiB/s (3280MB/s), 3128MiB/s-3128MiB/s (3280MB/s-3280MB/s), io=550GiB (590GB), run=180002-180002msec    
  WRITE: bw=1684MiB/s (1766MB/s), 1684MiB/s-1684MiB/s (1766MB/s-1766MB/s), io=296GiB (318GB), run=180002-180002msec    
    
Disk stats (read/write):    
  vdb: ios=32558186/26432830, merge=13683969/3967228, ticks=103112470/71513817, in_queue=174719639, util=60.48%    

PostgreSQL 11 测试

1、参数

listen_addresses = '0.0.0.0'      
port = 1921      
max_connections = 2000      
superuser_reserved_connections = 3      
unix_socket_directories = '., /var/run/postgresql, /tmp'      
tcp_keepalives_idle = 60      
tcp_keepalives_interval = 10      
tcp_keepalives_count = 10      
shared_buffers = 64GB      
max_prepared_transactions = 2000      
work_mem = 8MB      
maintenance_work_mem = 2GB      
dynamic_shared_memory_type = posix      
vacuum_cost_delay = 0      
bgwriter_delay = 10ms      
bgwriter_lru_maxpages = 1000      
bgwriter_lru_multiplier = 10.0      
effective_io_concurrency = 0      
max_worker_processes = 128      
max_parallel_maintenance_workers = 64      
max_parallel_workers_per_gather = 0      
parallel_leader_participation = on      
max_parallel_workers = 64      
wal_level = minimal        
synchronous_commit = off      
wal_writer_delay = 10ms      
checkpoint_timeout = 35min      
max_wal_size = 128GB      
min_wal_size = 32GB      
checkpoint_completion_target = 0.1      
max_wal_senders = 0      
effective_cache_size = 400GB      
log_destination = 'csvlog'      
logging_collector = on      
log_directory = 'log'      
log_filename = 'postgresql-%a.log'      
log_truncate_on_rotation = on      
log_rotation_age = 1d      
log_rotation_size = 0      
log_checkpoints = on       
log_connections = on      
log_disconnections = on      
log_error_verbosity = verbose       
log_line_prefix = '%m [%p] '      
log_timezone = 'PRC'      
log_autovacuum_min_duration = 0      
autovacuum_max_workers = 16      
autovacuum_freeze_max_age = 1200000000      
autovacuum_multixact_freeze_max_age = 1400000000      
autovacuum_vacuum_cost_delay = 0ms      
vacuum_freeze_table_age = 1150000000      
vacuum_multixact_freeze_table_age = 1150000000      
datestyle = 'iso, mdy'      
timezone = 'PRC'      
lc_messages = 'C'      
lc_monetary = 'C'      
lc_numeric = 'C'      
lc_time = 'C'      
default_text_search_config = 'pg_catalog.english'      
jit = off      
cpu_tuple_cost=0.00018884145574257426        
cpu_index_tuple_cost = 0.00433497085216479990        
cpu_operator_cost = 0.00216748542608239995        
seq_page_cost=0.014329        
random_page_cost = 0.016     
parallel_tuple_cost = 0      
parallel_setup_cost = 0     
min_parallel_table_scan_size = 0    
min_parallel_index_scan_size = 0    

1000W tpcc 测试

16072 * 60 = 96.4万 tpmC    

详细结果

......    
[ 2993s ] thds: 64 tps: 15107.81 qps: 431171.53 (r/w/o: 196624.50/204331.41/30215.62) lat (ms,95%): 10.65 err/s 51.00 reconn/s: 0.00    
[ 2994s ] thds: 64 tps: 15454.20 qps: 434439.71 (r/w/o: 198195.61/205335.70/30908.41) lat (ms,95%): 10.46 err/s 77.00 reconn/s: 0.00    
[ 2995s ] thds: 64 tps: 15480.57 qps: 438798.81 (r/w/o: 200298.44/207538.23/30962.14) lat (ms,95%): 10.46 err/s 72.00 reconn/s: 0.00    
[ 2996s ] thds: 64 tps: 15341.97 qps: 434496.22 (r/w/o: 198027.64/205784.63/30683.94) lat (ms,95%): 10.65 err/s 73.00 reconn/s: 0.00    
[ 2997s ] thds: 64 tps: 15208.54 qps: 433973.96 (r/w/o: 197975.05/205581.82/30417.08) lat (ms,95%): 10.65 err/s 75.01 reconn/s: 0.00    
[ 2998s ] thds: 64 tps: 15300.14 qps: 431763.76 (r/w/o: 196862.95/204300.53/30600.28) lat (ms,95%): 10.65 err/s 84.00 reconn/s: 0.00    
[ 2999s ] thds: 64 tps: 15108.49 qps: 426253.35 (r/w/o: 194171.57/201864.80/30216.98) lat (ms,95%): 10.65 err/s 56.99 reconn/s: 0.00    
[ 3000s ] thds: 64 tps: 15046.89 qps: 428187.32 (r/w/o: 195463.56/202631.98/30091.78) lat (ms,95%): 10.84 err/s 70.00 reconn/s: 0.00    
SQL statistics:    
    queries performed:    
        read:                            625427855    
        write:                           649118720    
        other:                           96478628    
        total:                           1371025203    
    transactions:                        48223282 (16072.47 per sec.)    
    queries:                             1371025203 (456952.75 per sec.)    
    ignored errors:                      210005 (69.99 per sec.)    
    reconnects:                          0      (0.00 per sec.)    
    
General statistics:    
    total time:                          3000.3636s    
    total number of events:              48223282    
    
Latency (ms):    
         min:                                    0.28    
         avg:                                    3.98    
         max:                                  912.95    
         95th percentile:                        9.91    
         sum:                            191859179.61    
    
Threads fairness:    
    events (avg/stddev):           753488.7812/3072.19    
    execution time (avg/stddev):   2997.7997/0.05    

1000亿 tpcb 测试

1、生成1000亿数据

nohup pgbench -i -s 1000000 -I dtg >./pgbench_ins.log 2>&1 &    
    
    
99998900000 of 100000000000 tuples (99%) done (elapsed 93180.83 s, remaining 1.03 s)    
99999000000 of 100000000000 tuples (99%) done (elapsed 93181.05 s, remaining 0.93 s)    
99999100000 of 100000000000 tuples (99%) done (elapsed 93181.13 s, remaining 0.84 s)    
99999200000 of 100000000000 tuples (99%) done (elapsed 93181.21 s, remaining 0.75 s)    
99999300000 of 100000000000 tuples (99%) done (elapsed 93181.30 s, remaining 0.65 s)    
99999400000 of 100000000000 tuples (99%) done (elapsed 93182.01 s, remaining 0.56 s)    
99999500000 of 100000000000 tuples (99%) done (elapsed 93182.09 s, remaining 0.47 s)    
99999600000 of 100000000000 tuples (99%) done (elapsed 93182.17 s, remaining 0.37 s)    
99999700000 of 100000000000 tuples (99%) done (elapsed 93182.25 s, remaining 0.28 s)    
99999800000 of 100000000000 tuples (99%) done (elapsed 93182.33 s, remaining 0.19 s)    
99999900000 of 100000000000 tuples (99%) done (elapsed 93182.42 s, remaining 0.09 s)    
100000000000 of 100000000000 tuples (100%) done (elapsed 93182.50 s, remaining 0.00 s)    
done.    

生成1000亿数据耗时: 93182 秒。 (约25小时 52分钟。)

2、给1000亿的单表创建索引(64 parallel)

postgres=# analyze pgbench_accounts ;    
ANALYZE    
postgres=# alter table pgbench_accounts set (parallel_workers =64);    
ALTER TABLE    
nohup pgbench -i -s 1000000 -I p > ./pk.log 2>&1 &    

1000亿单表创建索引耗时: 10小时 50分钟。

1000亿单表、索引容量大小

1000亿单表:12.5 TB。

1000亿单表索引: 2 TB。

postgres=# \di+ pgbench*  
                                      List of relations  
 Schema |         Name          | Type  |  Owner   |      Table       |  Size   | Description   
--------+-----------------------+-------+----------+------------------+---------+-------------  
 public | pgbench_accounts_pkey | index | postgres | pgbench_accounts | 2092 GB |   
 public | pgbench_branches_pkey | index | postgres | pgbench_branches | 21 MB   |   
 public | pgbench_tellers_pkey  | index | postgres | pgbench_tellers  | 214 MB  |   
(3 rows)  
  
postgres=# \dt+ pgbench*  
                          List of relations  
 Schema |       Name       | Type  |  Owner   |  Size   | Description   
--------+------------------+-------+----------+---------+-------------  
 public | pgbench_accounts | table | postgres | 12 TB   |   
 public | pgbench_branches | table | postgres | 35 MB   |   
 public | pgbench_history  | table | postgres | 0 bytes |   
 public | pgbench_tellers  | table | postgres | 422 MB  |   
(4 rows)  

索引深度

1、1000亿行,INT8类型索引,深度为4(不包括ROOT PAGE)。

postgres=# select * from bt_metap('pgbench_accounts_pkey');  
 magic  | version |   root   | level | fastroot | fastlevel | oldest_xact | last_cleanup_num_tuples   
--------+---------+----------+-------+----------+-----------+-------------+-------------------------  
 340322 |       3 | 23149704 |     4 | 23149704 |         4 |           0 |                      -1  
(1 row)  

2、索引查询,索引需要搜索5个BLOCK。

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from pgbench_accounts where aid=10000000;  
                                                                   QUERY PLAN                                                                      
-------------------------------------------------------------------------------------------------------------------------------------------------  
 Index Scan using pgbench_accounts_pkey on public.pgbench_accounts  (cost=0.62..0.66 rows=1 width=101) (actual time=0.020..0.021 rows=1 loops=1)  
   Output: aid, bid, abalance, filler  
   Index Cond: (pgbench_accounts.aid = 10000000)  
   Buffers: shared hit=6  -- 5个index block, 1个heap block  
 Planning Time: 0.049 ms  
 Execution Time: 0.033 ms  
(6 rows)  

《深入浅出PostgreSQL B-Tree索引结构》

tpcb 1000亿 性能测试

使用高斯分布,生成测试数据。

《生成泊松、高斯、指数、随机分布数据 - PostgreSQL 9.5 new feature - pgbench improve, gaussian (standard normal) & exponential distribution》

只读

vi test.sql  
  
\set aid random_gaussian(1, :range, 10.0)  
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

读写

vi rw.sql  
  
\set aid random_gaussian(1, :range, 10.0)  
\set bid random(1, 1 * :scale)  
\set tid random(1, 10 * :scale)  
\set delta random(-5000, 5000)  
BEGIN;  
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
END;  

tpcb 1000亿 只读测试

1、活跃数据10亿

QPS: 998818

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=1000000000  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 359606534  
latency average = 0.064 ms  
latency stddev = 0.046 ms  
tps = 998777.462686 (including connections establishing)  
tps = 998818.121681 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.062  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
2、活跃数据100亿

QPS: 597877

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=10000000000  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 215257932  
latency average = 0.107 ms  
latency stddev = 0.526 ms  
tps = 597861.125133 (including connections establishing)  
tps = 597877.469245 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.105  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

前期IO

Total DISK READ :       2.32 G/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:       2.32 G/s | Actual DISK WRITE:       0.00 B/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
46798 be/4 postgres   31.72 M/s    0.00 B/s  0.00 % 72.60 % postgres: postgres postgres [local] BIND    
46774 be/4 postgres   37.42 M/s    0.00 B/s  0.00 % 71.91 % postgres: postgres postgres [local] SELECT  
46792 be/4 postgres   35.54 M/s    0.00 B/s  0.00 % 71.89 % postgres: postgres postgres [local] idle    
46708 be/4 postgres   35.08 M/s    0.00 B/s  0.00 % 71.59 % postgres: postgres postgres [local] SELECT  
46730 be/4 postgres   46.84 M/s    0.00 B/s  0.00 % 70.99 % postgres: postgres postgres [local] SELECT  
46704 be/4 postgres   34.51 M/s    0.00 B/s  0.00 % 70.84 % postgres: postgres postgres [local] SELECT  
46716 be/4 postgres   46.05 M/s    0.00 B/s  0.00 % 70.84 % postgres: postgres postgres [local] SELECT  
46788 be/4 postgres   33.83 M/s    0.00 B/s  0.00 % 70.84 % postgres: postgres postgres [local] SELECT  
46807 be/4 postgres   33.78 M/s    0.00 B/s  0.00 % 70.41 % postgres: postgres postgres [local] SELECT  
46815 be/4 postgres   35.21 M/s    0.00 B/s  0.00 % 70.33 % postgres: postgres postgres [local] SELECT  
46812 be/4 postgres   45.95 M/s    0.00 B/s  0.00 % 70.18 % postgres: postgres postgres [local] SELECT  
46752 be/4 postgres   34.21 M/s    0.00 B/s  0.00 % 70.09 % postgres: postgres postgres [local] SELECT  

加热后IO

... ...    
  
Total DISK READ :     527.32 M/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:     527.24 M/s | Actual DISK WRITE:      30.77 K/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
47108 be/4 postgres   11.77 M/s    0.00 B/s  0.00 %  4.71 % postgres: postgres postgres [local] SELECT  
47025 be/4 postgres   10.55 M/s    0.00 B/s  0.00 %  4.39 % postgres: postgres postgres [local] SELECT  
47115 be/4 postgres    9.28 M/s    0.00 B/s  0.00 %  4.30 % postgres: postgres postgres [local] SELECT  
47061 be/4 postgres   13.27 M/s    0.00 B/s  0.00 %  4.23 % postgres: postgres postgres [local] SELECT  
47082 be/4 postgres   10.49 M/s    0.00 B/s  0.00 %  4.21 % postgres: postgres postgres [local] SELECT  
47111 be/4 postgres    6.54 M/s    0.00 B/s  0.00 %  4.18 % postgres: postgres postgres [local] idle    
47071 be/4 postgres    6.46 M/s    0.00 B/s  0.00 %  4.15 % postgres: postgres postgres [local] idle    
47018 be/4 postgres    9.13 M/s    0.00 B/s  0.00 %  4.11 % postgres: postgres postgres [local] idle    
47087 be/4 postgres    5.77 M/s    0.00 B/s  0.00 %  4.09 % postgres: postgres postgres [local] idle    
47105 be/4 postgres    8.89 M/s    0.00 B/s  0.00 %  4.08 % postgres: postgres postgres [local] BINDCT  
47069 be/4 postgres    8.46 M/s    0.00 B/s  0.00 %  4.05 % postgres: postgres postgres [local] SELECT  
47106 be/4 postgres    8.89 M/s    0.00 B/s  0.00 %  3.91 % postgres: postgres postgres [local] idle    
47053 be/4 postgres    6.48 M/s    0.00 B/s  0.00 %  3.91 % postgres: postgres postgres [local] SELECT  
47028 be/4 postgres    9.71 M/s    0.00 B/s  0.00 %  3.83 % postgres: postgres postgres [local] idle    
47112 be/4 postgres    6.72 M/s    0.00 B/s  0.00 %  3.82 % postgres: postgres postgres [local] SELECT  
47039 be/4 postgres    7.63 M/s    0.00 B/s  0.00 %  3.81 % postgres: postgres postgres [local] BIND    
3、活跃数据500亿

QPS: 66678

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=50000000000  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 3600 s  
number of transactions actually processed: 240046184  
latency average = 0.960 ms  
latency stddev = 1.660 ms  
tps = 66678.433880 (including connections establishing)  
tps = 66678.672147 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.001  \set aid random_gaussian(1, :range, 10.0)  
         0.958  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

IO

Total DISK READ :       2.45 G/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:       2.45 G/s | Actual DISK WRITE:       0.00 B/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
47230 be/4 postgres   39.87 M/s    0.00 B/s  0.00 % 87.93 % postgres: postgres postgres [local] SELECT  
47218 be/4 postgres   32.12 M/s    0.00 B/s  0.00 % 87.85 % postgres: postgres postgres [local] SELECT  
47196 be/4 postgres   38.54 M/s    0.00 B/s  0.00 % 87.77 % postgres: postgres postgres [local] SELECT  
47250 be/4 postgres   32.52 M/s    0.00 B/s  0.00 % 87.73 % postgres: postgres postgres [local] SELECT  
47210 be/4 postgres   35.25 M/s    0.00 B/s  0.00 % 87.64 % postgres: postgres postgres [local] SELECT  
47173 be/4 postgres   35.29 M/s    0.00 B/s  0.00 % 87.63 % postgres: postgres postgres [local] SELECT  
47220 be/4 postgres   36.14 M/s    0.00 B/s  0.00 % 87.63 % postgres: postgres postgres [local] SELECT  
47243 be/4 postgres   44.79 M/s    0.00 B/s  0.00 % 87.61 % postgres: postgres postgres [local] SELECT  
47149 be/4 postgres   48.33 M/s    0.00 B/s  0.00 % 87.55 % postgres: postgres postgres [local] SELECT  
47245 be/4 postgres   44.83 M/s    0.00 B/s  0.00 % 87.54 % postgres: postgres postgres [local] SELECT  
47254 be/4 postgres   29.74 M/s    0.00 B/s  0.00 % 87.53 % postgres: postgres postgres [local] SELECT  
47253 be/4 postgres   41.24 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47162 be/4 postgres   30.31 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47229 be/4 postgres   29.40 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47234 be/4 postgres   37.08 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47242 be/4 postgres   40.28 M/s    0.00 B/s  0.00 % 87.46 % postgres: postgres postgres [local] SELECT  
47186 be/4 postgres   36.05 M/s    0.00 B/s  0.00 % 87.44 % postgres: postgres postgres [local] SELECT  
47165 be/4 postgres   33.66 M/s    0.00 B/s  0.00 % 87.43 % postgres: postgres postgres [local] SELECT  
4、活跃数据1000亿

QPS: 67295

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=100000000000  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 3600 s  
number of transactions actually processed: 242265704  
latency average = 0.951 ms  
latency stddev = 2.313 ms  
tps = 67295.523254 (including connections establishing)  
tps = 67295.778158 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.949  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

IO

Total DISK READ :       2.24 G/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:       2.24 G/s | Actual DISK WRITE:      54.79 K/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
47932 be/4 postgres   33.54 M/s    0.00 B/s  0.00 % 88.36 % postgres: postgres postgres [local] SELECT  
48010 be/4 postgres   33.41 M/s    0.00 B/s  0.00 % 88.27 % postgres: postgres postgres [local] SELECT  
48021 be/4 postgres   34.10 M/s    0.00 B/s  0.00 % 88.21 % postgres: postgres postgres [local] SELECT  
48049 be/4 postgres   32.14 M/s    0.00 B/s  0.00 % 88.20 % postgres: postgres postgres [local] SELECT  
48048 be/4 postgres   33.34 M/s    0.00 B/s  0.00 % 88.18 % postgres: postgres postgres [local] SELECT  
47988 be/4 postgres   31.79 M/s    0.00 B/s  0.00 % 88.11 % postgres: postgres postgres [local] SELECT  
48007 be/4 postgres   26.25 M/s    0.00 B/s  0.00 % 88.07 % postgres: postgres postgres [local] SELECT  
48013 be/4 postgres   35.37 M/s    0.00 B/s  0.00 % 88.07 % postgres: postgres postgres [local] SELECT  
47949 be/4 postgres   36.25 M/s    0.00 B/s  0.00 % 88.04 % postgres: postgres postgres [local] SELECT  
47979 be/4 postgres   44.90 M/s    0.00 B/s  0.00 % 88.02 % postgres: postgres postgres [local] SELECT  
48047 be/4 postgres   39.64 M/s    0.00 B/s  0.00 % 87.97 % postgres: postgres postgres [local] SELECT  
48038 be/4 postgres   39.24 M/s    0.00 B/s  0.00 % 87.93 % postgres: postgres postgres [local] SELECT  
48034 be/4 postgres   38.02 M/s    0.00 B/s  0.00 % 87.89 % postgres: postgres postgres [local] SELECT  
48019 be/4 postgres   35.99 M/s    0.00 B/s  0.00 % 87.88 % postgres: postgres postgres [local] SELECT  
48046 be/4 postgres   32.00 M/s    0.00 B/s  0.00 % 87.88 % postgres: postgres postgres [local] SELECT  

tpcb 1000亿 读写测试

1、活跃数据10亿

TPS: 95119

QPS: 475595

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=1000000000   
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 34244287  
latency average = 0.673 ms  
latency stddev = 0.394 ms  
tps = 95116.186279 (including connections establishing)  
tps = 95119.886927 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.046  BEGIN;  
         0.133  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.077  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.104  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.088  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.074  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.146  END;  
2、活跃数据100亿

TPS: 85278

QPS: 426390

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=10000000000  
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 30702466  
latency average = 0.750 ms  
latency stddev = 1.518 ms  
tps = 85275.759706 (including connections establishing)  
tps = 85278.402619 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.047  BEGIN;  
         0.193  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.082  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.108  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.093  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.078  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.144  END;  

IO

Total DISK READ :     124.77 M/s | Total DISK WRITE :     846.78 M/s  
Actual DISK READ:     124.01 M/s | Actual DISK WRITE:     820.10 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
31430 be/4 postgres   27.03 K/s   44.03 M/s  0.00 % 22.13 % postgres: walwriter  
49767 be/4 postgres 1629.40 K/s    8.68 M/s  0.00 %  3.03 % postgres: postgres postgres [local] UPDATE  
49771 be/4 postgres    2.73 M/s    8.63 M/s  0.00 %  2.34 % postgres: postgres postgres [local] UPDATE               
49742 be/4 postgres    2.77 M/s    8.26 M/s  0.00 %  2.31 % postgres: postgres postgres [local] UPDATE               
49787 be/4 postgres 1343.68 K/s    9.27 M/s  0.00 %  2.29 % postgres: postgres postgres [local] UPDATE               
49785 be/4 postgres    3.05 M/s    8.64 M/s  0.00 %  2.27 % postgres: postgres postgres [local] UPDATE               
49776 be/4 postgres 1783.85 K/s   11.64 M/s  0.00 %  2.05 % postgres: postgres postgres [local] UPDATE               
49774 be/4 postgres    4.22 M/s    8.13 M/s  0.00 %  2.05 % postgres: postgres postgres [local] UPDATE  
49775 be/4 postgres  671.84 K/s    8.79 M/s  0.00 %  2.04 % postgres: postgres postgres [local] UPDATE               
49786 be/4 postgres 1220.12 K/s    8.15 M/s  0.00 %  2.04 % postgres: postgres postgres [local] UPDATE               
49772 be/4 postgres 1003.90 K/s    8.77 M/s  0.00 %  2.02 % postgres: postgres postgres [local] UPDATE               
49697 be/4 postgres    2.56 M/s    8.69 M/s  0.00 %  2.01 % postgres: postgres postgres [local] UPDATE  
49803 be/4 postgres  733.62 K/s    8.22 M/s  0.00 %  2.00 % postgres: postgres postgres [local] UPDATE  
49806 be/4 postgres    2.84 M/s   10.23 M/s  0.00 %  1.99 % postgres: postgres postgres [local] UPDATE               
49804 be/4 postgres 1783.85 K/s    8.60 M/s  0.00 %  1.98 % postgres: postgres postgres [local] UPDATE               
49766 be/4 postgres  478.78 K/s   14.62 M/s  0.00 %  1.97 % postgres: postgres postgres [local] UPDATE               
49770 be/4 postgres    2.29 M/s    8.82 M/s  0.00 %  1.96 % postgres: postgres postgres [local] UPDATE  
49715 be/4 postgres    3.20 M/s    8.37 M/s  0.00 %  1.96 % postgres: postgres postgres [local] UPDATE               
49810 be/4 postgres    3.35 M/s    9.68 M/s  0.00 %  1.93 % postgres: postgres postgres [local] UPDATE  
49780 be/4 postgres    2.56 M/s    8.22 M/s  0.00 %  1.92 % postgres: postgres postgres [local] UPDATE               
49784 be/4 postgres    3.39 M/s    8.35 M/s  0.00 %  1.92 % postgres: postgres postgres [local] UPDATE               
49734 be/4 postgres    2.31 M/s    8.66 M/s  0.00 %  1.91 % postgres: postgres postgres [local] UPDATE               
49800 be/4 postgres 2023.24 K/s    8.55 M/s  0.00 %  1.90 % postgres: postgres postgres [local] UPDATE  
49794 be/4 postgres 1629.40 K/s    8.88 M/s  0.00 %  1.85 % postgres: postgres postgres [local] UPDATE               
49765 be/4 postgres    3.24 M/s    8.67 M/s  0.00 %  1.80 % postgres: postgres postgres [local] UPDATE               
49724 be/4 postgres 1173.79 K/s   10.08 M/s  0.00 %  1.79 % postgres: postgres postgres [local] UPDATE               
49728 be/4 postgres 1413.18 K/s    8.63 M/s  0.00 %  1.78 % postgres: postgres postgres [local] UPDATE  
49781 be/4 postgres 1436.34 K/s    7.93 M/s  0.00 %  1.76 % postgres: postgres postgres [local] UPDATE               
49790 be/4 postgres 1096.56 K/s    8.60 M/s  0.00 %  1.76 % postgres: postgres postgres [local] UPDATE               
3、活跃数据500亿

TPS: 38301

QPS: 191505

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=50000000000   
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 13790704  
latency average = 1.671 ms  
latency stddev = 2.620 ms  
tps = 38299.935890 (including connections establishing)  
tps = 38301.102322 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.031  BEGIN;  
         1.274  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.068  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.103  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.076  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.058  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.056  END;  

IO

Total DISK READ :    1508.55 M/s | Total DISK WRITE :     618.92 M/s  
Actual DISK READ:    1507.72 M/s | Actual DISK WRITE:     450.33 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
49510 be/4 postgres   21.45 M/s    5.40 M/s  0.00 % 61.29 % postgres: postgres postgres [local] UPDATE               
49507 be/4 postgres   26.85 M/s    5.61 M/s  0.00 % 60.79 % postgres: postgres postgres [local] UPDATE               
49456 be/4 postgres   28.59 M/s    5.67 M/s  0.00 % 60.42 % postgres: postgres postgres [local] UPDATE               
49436 be/4 postgres   23.76 M/s    5.23 M/s  0.00 % 60.31 % postgres: postgres postgres [local] UPDATE               
49516 be/4 postgres   21.82 M/s    5.40 M/s  0.00 % 59.84 % postgres: postgres postgres [local] UPDATE               
49414 be/4 postgres   20.77 M/s    5.44 M/s  0.00 % 59.84 % postgres: postgres postgres [local] UPDATE               
49503 be/4 postgres   20.13 M/s    6.15 M/s  0.00 % 59.81 % postgres: postgres postgres [local] UPDATE               
49410 be/4 postgres   29.23 M/s    5.52 M/s  0.00 % 59.73 % postgres: postgres postgres [local] UPDATE               
49427 be/4 postgres   18.61 M/s    5.18 M/s  0.00 % 59.71 % postgres: postgres postgres [local] idle in transaction  
49501 be/4 postgres   17.22 M/s    5.60 M/s  0.00 % 59.70 % postgres: postgres postgres [local] UPDATE               
49493 be/4 postgres   24.60 M/s    7.22 M/s  0.00 % 59.66 % postgres: postgres postgres [local] UPDATE               
49512 be/4 postgres   23.08 M/s    5.53 M/s  0.00 % 59.65 % postgres: postgres postgres [local] UPDATE               
49509 be/4 postgres   24.04 M/s    5.64 M/s  0.00 % 59.55 % postgres: postgres postgres [local] UPDATE               
49490 be/4 postgres   17.89 M/s    5.62 M/s  0.00 % 59.55 % postgres: postgres postgres [local] UPDATE             
4、活跃数据1000亿

TPS: 35189

QPS: 175945

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=100000000000   
  
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 12670591  
latency average = 1.818 ms  
latency stddev = 3.928 ms  
tps = 35188.224787 (including connections establishing)  
tps = 35189.625697 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.032  BEGIN;  
         1.392  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.072  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.112  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.086  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.061  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.059  END;  

IO

Total DISK READ :    1824.52 M/s | Total DISK WRITE :     241.53 M/s  
Actual DISK READ:    1821.44 M/s | Actual DISK WRITE:     237.31 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
48918 be/4 postgres   27.19 M/s    0.00 B/s  0.00 % 67.37 % postgres: postgres postgres [local] UPDATE  
48893 be/4 postgres   32.32 M/s    0.00 B/s  0.00 % 67.29 % postgres: postgres postgres [local] UPDATE               
48914 be/4 postgres   23.47 M/s    0.00 B/s  0.00 % 67.29 % postgres: postgres postgres [local] UPDATE  
48889 be/4 postgres   33.45 M/s    0.00 B/s  0.00 % 67.23 % postgres: postgres postgres [local] UPDATE               
48904 be/4 postgres   34.80 M/s    0.00 B/s  0.00 % 67.21 % postgres: postgres postgres [local] UPDATE  
48861 be/4 postgres   30.88 M/s    7.81 K/s  0.00 % 67.06 % postgres: postgres postgres [local] UPDATE  
48910 be/4 postgres   27.16 M/s  328.11 K/s  0.00 % 67.02 % postgres: postgres postgres [local] UPDATE               
48821 be/4 postgres   27.54 M/s    7.81 K/s  0.00 % 67.01 % postgres: postgres postgres [local] UPDATE  
48825 be/4 postgres   35.88 M/s    0.00 B/s  0.00 % 66.89 % postgres: postgres postgres [local] UPDATE               
48930 be/4 postgres   31.68 M/s    7.81 K/s  0.00 % 66.82 % postgres: postgres postgres [local] UPDATE  
48867 be/4 postgres   26.99 M/s    7.81 K/s  0.00 % 66.81 % postgres: postgres postgres [local] idle in transaction  
48929 be/4 postgres   25.61 M/s    7.81 K/s  0.00 % 66.77 % postgres: postgres postgres [local] UPDATE  
48894 be/4 postgres   24.08 M/s    0.00 B/s  0.00 % 66.67 % postgres: postgres postgres [local] UPDATE               
48921 be/4 postgres   32.90 M/s  640.60 K/s  0.00 % 66.66 % postgres: postgres postgres [local] UPDATE  
48925 be/4 postgres   27.30 M/s    0.00 B/s  0.00 % 66.63 % postgres: postgres postgres [local] UPDATE  
48829 be/4 postgres   24.85 M/s    0.00 B/s  0.00 % 66.63 % postgres: postgres postgres [local] idle                 
48901 be/4 postgres   29.57 M/s    0.00 B/s  0.00 % 66.62 % postgres: postgres postgres [local] UPDATE  
48927 be/4 postgres   24.33 M/s    7.81 K/s  0.00 % 66.59 % postgres: postgres postgres [local] UPDATE  
48933 be/4 postgres   27.85 M/s    7.81 K/s  0.00 % 66.57 % postgres: postgres postgres [local] BINDTE  
48890 be/4 postgres   27.79 M/s    0.00 B/s  0.00 % 66.56 % postgres: postgres postgres [local] UPDATE  
48931 be/4 postgres   30.29 M/s    0.00 B/s  0.00 % 66.55 % postgres: postgres postgres [local] UPDATE  

其他测试

同步多副本环境

《PostgreSQL 一主多从(多副本,强同步)简明手册 - 配置、压测、监控、切换、防脑裂、修复、0丢失 - 珍藏级》

其中一个备库使用zfs存储,开启lz4压缩,提供闪回,备份能力。

创建备库,单个备库的创建速度约500MB/s,15TB的库,需要9个半小时创建完成。

[root@pg11 ~]# dstat
You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  4   7  85   4   0   0| 289M  164M|   0     0 |   0     0 |  41k   80k
  0   1  99   0   0   0| 508M    0 |1218k 1007M|   0     0 |  40k 1559 
  0   1  99   0   0   0| 500M    0 |1233k 1004M|   0     0 |  41k 1673 
  0   1  99   0   0   0| 492M    0 |1206k  994M|   0     0 |  40k 1576 
  0   1  99   0   0   0| 508M    0 |1245k 1015M|   0     0 |  41k 1601 
  0   1  99   0   0   0| 516M    0 |1257k 1021M|   0     0 |  42k 1576 
  0   2  98   0   0   0| 520M    0 |1300k 1044M|   0     0 |  44k 1891 

[root@pg11 ~]# top -c -u postgres

top - 15:09:33 up 2 days,  4:48,  2 users,  load average: 0.41, 3.98, 22.70
Tasks: 516 total,   2 running, 514 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.8 sy,  0.0 ni, 99.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 52819500+total, 29681988 free,  3079916 used, 49543308+buff/cache
KiB Swap:        0 total,        0 free,        0 used. 52057548+avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                  
50168 postgres  20   0 66.436g  14264  12248 R  35.1  0.0   0:37.57 postgres: walsender postgres 172.17.20.28(48412) sending backup "pg_basebackup base backup"                                                                              
50176 postgres  20   0 66.436g  20416  18400 S  24.2  0.0   0:18.35 postgres: walsender postgres 172.17.20.29(65032) sending backup "pg_basebackup base backup"                                                                              
50154 postgres  20   0 66.433g 1.979g 1.978g S   0.0  0.4   0:01.41 /usr/pgsql-11/bin/postgres                                                                                                                                               
50155 postgres  20   0  245148   2036    596 S   0.0  0.0   0:00.00 postgres: logger                                                                                                                                                         
50157 postgres  20   0 66.434g 531212 529720 S   0.0  0.1   0:00.43 postgres: checkpointer                                                                                                                                                   
50158 postgres  20   0 66.434g 530592 529120 S   0.0  0.1   0:00.54 postgres: background writer                                                                                                                                              
50159 postgres  20   0 66.433g 526884 525420 S   0.0  0.1   0:00.28 postgres: walwriter                                                                                                                                                      
50160 postgres  20   0 66.436g   3224   1392 S   0.0  0.0   0:00.00 postgres: autovacuum launcher                                                                                                                                            
50161 postgres  20   0  247404   2252    692 S   0.0  0.0   0:00.00 postgres: stats collector                                                                                                                                                
50162 postgres  20   0 66.436g   2844   1096 S   0.0  0.0   0:00.00 postgres: logical replication launcher                                                                                                                                   
50169 postgres  20   0 66.436g   4124   1944 S   0.0  0.0   0:00.01 postgres: walsender postgres 172.17.20.28(48414) streaming 101/44000140                                                                                                  
50177 postgres  20   0 66.436g   3796   1872 S   0.0  0.0   0:00.00 postgres: walsender postgres 172.17.20.29(65034) streaming 101/44000140                                                                                                  

配置步骤

1、配置pg_hba.conf

host replication all xxx.xxx.xxx.xxx/32 md5

2、创建replication角色用户

create role repxxx login replication encrypted password 'xxx';

3、pg_basebackup拉取数据,注意开启SLOT,否则对于很大的数据库,可能数据备份结束后,WAL已经在主库被清除了。(有WAL归档的情况下,可以不开启SLOT。开启SLOT后,未被备库拉取的WAL不会被清除。)

export PGPASSWORD=xxx

nohup pg_basebackup -F p -D $PGDATA -h 172.17.20.30 -p 1921 -U repxxx -X stream -C -S standby1 >/dev/null 2>&1 &
  
  
nohup pg_basebackup -F p -D $PGDATA -h 172.17.20.30 -p 1921 -U repxxx -X stream -C -S standby2 >/dev/null 2>&1 &

4、配置recovery.conf

recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.20.30 port=1921 user=repxxx password=xxx'          # e.g. 'host=localhost port=5432'
primary_slot_name = 'standby1'


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.20.30 port=1921 user=repxxx password=xxx'          # e.g. 'host=localhost port=5432'
primary_slot_name = 'standby2'

5、启动备库

6、主库配置多副本参数

synchronous_standby_names = 'ANY 1 (*)'
synchronous_commit = remote_write 


postgres=# show synchronous_commit ;
 synchronous_commit 
--------------------
 remote_write
(1 row)

postgres=# show synchronous_standby_names ;
 synchronous_standby_names 
---------------------------
 ANY 1 (*)
(1 row)

7、tpcb 1000亿(活跃10亿) rw 测试, QPS 5.34万。

transaction type: ./rw.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 360 s
number of transactions actually processed: 19239053
latency average = 1.198 ms
latency stddev = 1.208 ms
tps = 53432.922774 (including connections establishing)
tps = 53435.051257 (excluding connections establishing)
statement latencies in milliseconds:
         0.003  \set aid random_gaussian(1, :range, 10.0)
         0.001  \set bid random(1, 1 * :scale)
         0.001  \set tid random(1, 10 * :scale)
         0.001  \set delta random(-5000, 5000)
         0.043  BEGIN;
         0.122  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
         0.068  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
         0.089  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
         0.080  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
         0.064  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
         0.726  END;

8、tpcc 1000W , tps : 11000 , tpmC : 66W.

SQL statistics:
    queries performed:
        read:                            513698640
        write:                           533163795
        other:                           79242254
        total:                           1126104689
    transactions:                        39605095 (11000.27 per sec.)
    queries:                             1126104689 (312774.25 per sec.)
    ignored errors:                      172646 (47.95 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          3600.3729s
    total number of events:              39605095

Latency (ms):
         min:                                    0.29
         avg:                                    5.81
         max:                                 1388.75
         95th percentile:                       11.87
         sum:                            230276426.25

Threads fairness:
    events (avg/stddev):           618829.6094/1712.80
    execution time (avg/stddev):   3598.0692/0.06

flashback 闪回

《PostgreSQL 最佳实践 - 块级增量备份(ZFS篇)验证 - recovery test script for zfs snapshot clone + postgresql stream replication + archive》

主备切换

《PostgreSQL 一主多从(多副本,强同步)简明手册 - 配置、压测、监控、切换、防脑裂、修复、0丢失 - 珍藏级》

修复主库, pg_rewind

《PostgreSQL primary-standby failback tools : pg_rewind》

《PostgreSQL 9.5 new feature - pg_rewind fast sync Split Brain Primary & Standby》

《PostgreSQL 9.5 add pg_rewind for Fast align for PostgreSQL unaligned primary & standby》

小结

1、8K fsync IO RT

1 ecs本地ssd
22 us    
2 essd
119 us    

2、8K directIO

1 ecs本地ssd

离散读

IOPS=146k, BW=1140MiB/s (1196MB/s)(200GiB/180004msec)    

离散写

IOPS=78.6k, BW=614MiB/s (644MB/s)(108GiB/180004msec)    

顺序读

IOPS=147k, BW=1150MiB/s (1206MB/s)(202GiB/180004msec)    

顺序写

IOPS=79.3k, BW=620MiB/s (650MB/s)(109GiB/180004m    
2 essd

离散读

IOPS=112k, BW=879MiB/s (922MB/s)(154GiB/180008msec)    

离散写

IOPS=60.6k, BW=473MiB/s (496MB/s)(83.2GiB/180008msec)    

顺序读

IOPS=133k, BW=1037MiB/s (1087MB/s)(182GiB/180005msec)    

顺序写

IOPS=71.4k, BW=558MiB/s (585MB/s)(98.1GiB/180005msec)    

3、tpcc 1000W (ESSD)

96.4万 tpmC

4、tpcb 1000亿 只读 (ESSD)

1、活跃数据10亿

QPS: 998818

2、活跃数据100亿

QPS: 597877

3、活跃数据500亿

QPS: 66678

4、活跃数据1000亿

QPS: 67295

5、tpcb 1000亿 读写 (ESSD)

1、活跃数据10亿

TPS: 95119

QPS: 475595

2、活跃数据100亿

TPS: 85278

QPS: 426390

3、活跃数据500亿

TPS: 38301

QPS: 191505

4、活跃数据1000亿

TPS: 35189

QPS: 175945

性能小结

环境:阿里云 ECS + 32T ESSD

表SIZE: 12.5 TB 写入耗时 25h52min

索引SIZE: 2 TB 创建耗时 10h50min

索引深度: 5级

单表数据量TEST CASEQPSTPS
10 * 100Wtpcc 1000W-96.4万 tpmC
10 * 100Wtpcc 1000W(同步多副本)-66万 tpmC
1000亿tpcb 活跃数据10亿 只读998818998818
1000亿tpcb 活跃数据100亿 只读597877597877
1000亿tpcb 活跃数据500亿 只读6667866678
1000亿tpcb 活跃数据1000亿 只读6729567295
1000亿tpcb 活跃数据10亿 读写47559595119
1000亿tpcb 活跃数据10亿 读写(同步多副本)26716053432
1000亿tpcb 活跃数据100亿 读写42639085278
1000亿tpcb 活跃数据500亿 读写19150538301
1000亿tpcb 活跃数据1000亿 读写17594535189

阿里云ESSD的引入,结合PostgreSQL企业级开源数据库(良好的性能、可管理海量数据、功能对齐Oracle,不仅ESSD层面提供多副本,同时数据库层面也支持通过quorum based replication多副本提供金融级的可靠性,使用ZFS秒级快照,闪回等企业特性),给企业大容量关系数据库上云提供了便利。

对于PG企业用户,可以选择阿里云PG企业版PPAS,拥有以上所有特性的同时,提供ORACLE兼容性。

参考

《fio测试IO性能》

ECS 本地SSD云盘(8*1.8TB条带)测试:

《PostgreSQL 100亿 tpcb 性能 on ECS》

《PostgreSQL 11 tpcc 测试(103万tpmC on ECS) - use sysbench-tpcc by Percona-Lab》

《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》

《PostgreSQL 10 on ECS 实施 流复制备库镜像+自动快照备份+自动备份验证+自动清理备份与归档 - 珍藏级》

https://help.aliyun.com/knowledge_detail/64950.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值