gzip vs bzip2 vs xz vs pbzip2: Performance Comparison

Overview

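Two days ago I wrote a short post on whether bzip2 or pbzip2 compresses faster. At the time I was load-testing Elasticsearch with esrally and did not look closely at how the various compression tools perform. This post covers several commonly used compression commands and summarizes their compression and decompression performance side by side.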

Preparation

The help output of gzip, bzip2, pbzip2, and xz shows that the four commands take very similar options.

A single test file was prepared for the tests:

shell

root@node244:/mnt/disk/compress_test# ll
total 557704
drwxr-xr-x 2 root root      4096 Jun  3 10:34 ./
drwxr-xr-x 5 root root      4096 Jun  3 10:32 ../
-rw-r--r-- 1 root root 571073631 Jun  3 10:33 ceph-client.radosgw.0.log
root@node244:/mnt/disk/compress_test# 

Given how similar the options are, a simple shell one-liner is enough to run the tests, for example:

shell

for i in {1..9}; do echo "============================  Compress Level $i  ==========================="; echo " ----- Compress -----"; time gzip -k -f -$i ceph-client.radosgw.0.log; echo " ----- Compress info -----"; gzip -l -v ceph-client.radosgw.0.log.gz; echo "-----  Uncompress -----"; time gzip -d -f ceph-client.radosgw.0.log.gz; done
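
For readability, the one-liner above can be unrolled into a small function. This is only a sketch: bench_levels is a hypothetical helper name, it needs bash (time is a shell keyword there), and it assumes the tool accepts -k, -f, -d and a numeric level the same way the gzip command above does.

```shell
# Hypothetical helper: run one compress/decompress cycle per level (1..9).
# Arguments: tool name, output suffix, input file.
bench_levels() {
    local tool=$1 suffix=$2 file=$3
    local level
    for level in $(seq 1 9); do
        echo "==== Compress Level $level ===="
        echo "----- Compress -----"
        # -k keeps the input file, -f overwrites output left by an earlier run
        time "$tool" -k -f "-$level" "$file"
        echo "----- Uncompress -----"
        # -d decompresses; -f lets it overwrite the kept original
        time "$tool" -d -f "$file.$suffix"
    done
}
```

Calling bench_levels gzip gz ceph-client.radosgw.0.log would repeat the gzip run, minus the gzip -l step.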

Test Procedure

gzip compression and decompression

shell

root@node244:/mnt/disk/compress_test# for i in {1..9}; do echo "============================  Compress Level $i  ==========================="; echo " ----- Compress -----"; time gzip -k -f -$i ceph-client.radosgw.0.log; echo " ----- Compress info -----"; gzip -l -v ceph-client.radosgw.0.log.gz; echo "-----  Uncompress -----"; time gzip -d -f ceph-client.radosgw.0.log.gz; done
============================  Compress Level 1  ===========================
 ----- Compress -----

real	0m3.389s
user	0m3.261s
sys	0m0.128s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            27542300           571073631  95.2% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.502s
user	0m1.962s
sys	0m0.472s
============================  Compress Level 2  ===========================
 ----- Compress -----

real	0m3.385s
user	0m3.257s
sys	0m0.128s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            27158636           571073631  95.2% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.782s
user	0m1.893s
sys	0m0.444s
============================  Compress Level 3  ===========================
 ----- Compress -----

real	0m3.407s
user	0m3.262s
sys	0m0.144s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            26820084           571073631  95.3% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.403s
user	0m1.919s
sys	0m0.417s
============================  Compress Level 4  ===========================
 ----- Compress -----

real	0m4.302s
user	0m4.150s
sys	0m0.152s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            22196434           571073631  96.1% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.413s
user	0m1.914s
sys	0m0.433s
============================  Compress Level 5  ===========================
 ----- Compress -----

real	0m4.385s
user	0m4.237s
sys	0m0.149s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            20643438           571073631  96.4% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.370s
user	0m1.937s
sys	0m0.349s
============================  Compress Level 6  ===========================
 ----- Compress -----

real	0m5.298s
user	0m5.142s
sys	0m0.156s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            19503464           571073631  96.6% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.367s
user	0m1.910s
sys	0m0.400s
============================  Compress Level 7  ===========================
 ----- Compress -----

real	0m5.518s
user	0m5.402s
sys	0m0.116s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            19396724           571073631  96.6% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.435s
user	0m1.842s
sys	0m0.486s
============================  Compress Level 8  ===========================
 ----- Compress -----

real	0m6.961s
user	0m6.853s
sys	0m0.108s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            18411146           571073631  96.8% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.480s
user	0m1.884s
sys	0m0.399s
============================  Compress Level 9  ===========================
 ----- Compress -----

real	0m7.233s
user	0m7.077s
sys	0m0.156s
 ----- Compress info -----
method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla d30eafa2 Jun  3 10:33            18186392           571073631  96.8% ceph-client.radosgw.0.log
-----  Uncompress -----

real	0m2.877s
user	0m1.820s
sys	0m0.456s
root@node244:/mnt/disk/compress_test# 

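bzip2 compression and decompression

Note: the command below also passes ceph-client.radosgw.0.log.bz2 as a second input file. bzip2 skips it, which is where the "already has .bz2 suffix" line in each round of output comes from.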

shell

root@node244:/mnt/disk/compress_test# for i in {1..9}; do echo "============================  Compress Level $i  ==========================="; echo " ----- Compress -----"; time bzip2 -k -f -v -$i ceph-client.radosgw.0.log ceph-client.radosgw.0.log.bz2; echo "-----  Uncompress -----"; time bzip2 -d -f ceph-client.radosgw.0.log.bz2; done
============================  Compress Level 1  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     28.266:1,  0.283 bits/byte, 96.46% saved, 571073631 in, 20203523 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m14.986s
user	1m14.115s
sys	0m0.212s
-----  Uncompress -----

real	0m8.986s
user	0m7.820s
sys	0m0.492s
============================  Compress Level 2  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     34.344:1,  0.233 bits/byte, 97.09% saved, 571073631 in, 16627875 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m26.956s
user	1m25.741s
sys	0m0.232s
-----  Uncompress -----

real	0m8.858s
user	0m8.241s
sys	0m0.488s
============================  Compress Level 3  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     37.331:1,  0.214 bits/byte, 97.32% saved, 571073631 in, 15297380 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m35.014s
user	1m34.232s
sys	0m0.140s
-----  Uncompress -----

real	0m9.019s
user	0m8.451s
sys	0m0.512s
============================  Compress Level 4  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     38.943:1,  0.205 bits/byte, 97.43% saved, 571073631 in, 14664242 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m41.031s
user	1m40.805s
sys	0m0.160s
-----  Uncompress -----

real	0m9.702s
user	0m8.491s
sys	0m0.596s
============================  Compress Level 5  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     40.177:1,  0.199 bits/byte, 97.51% saved, 571073631 in, 14214101 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m46.857s
user	1m46.109s
sys	0m0.172s
-----  Uncompress -----

real	0m9.378s
user	0m8.565s
sys	0m0.536s
============================  Compress Level 6  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     41.332:1,  0.194 bits/byte, 97.58% saved, 571073631 in, 13816797 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m50.821s
user	1m50.619s
sys	0m0.196s
-----  Uncompress -----

real	0m9.688s
user	0m8.611s
sys	0m0.556s
============================  Compress Level 7  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     41.987:1,  0.191 bits/byte, 97.62% saved, 571073631 in, 13601242 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	1m55.011s
user	1m54.842s
sys	0m0.168s
-----  Uncompress -----

real	0m9.197s
user	0m8.560s
sys	0m0.575s
============================  Compress Level 8  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     42.637:1,  0.188 bits/byte, 97.65% saved, 571073631 in, 13393739 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	2m0.052s
user	1m59.319s
sys	0m0.100s
-----  Uncompress -----

real	0m9.232s
user	0m8.650s
sys	0m0.516s
============================  Compress Level 9  ===========================
 ----- Compress -----
  ceph-client.radosgw.0.log:     43.064:1,  0.186 bits/byte, 97.68% saved, 571073631 in, 13261043 out.
bzip2: Input file ceph-client.radosgw.0.log.bz2 already has .bz2 suffix.

real	2m1.284s
user	2m1.072s
sys	0m0.204s
-----  Uncompress -----

real	0m9.246s
user	0m8.667s
sys	0m0.512s
root@node244:/mnt/disk/compress_test# 

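pbzip2 compression and decompression

Because a varying number of CPUs could skew the results, the CPU count is pinned to 24 here. Without the -p option, pbzip2 defaults to 32 on this machine. The environment has 32 cores, but not all of them take part in every run: sometimes 32, sometimes 29, sometimes 26. Pinning a lower value ensures that every run uses the same number of cores.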
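
A fixed thread count like this can also be derived from the core count instead of being hard-coded. The sketch below is an assumption, not what was used in this test: pick_threads is a hypothetical helper and the 75% share is an arbitrary heuristic that happens to give 24 on a 32-core machine.

```shell
# Hypothetical helper: return 75% of the given core count, at least 1.
pick_threads() {
    total=$1
    threads=$(( total * 3 / 4 ))
    [ "$threads" -ge 1 ] || threads=1
    echo "$threads"
}

pick_threads 32   # prints 24
```

The result could then be passed to pbzip2 as -p"$(pick_threads "$(nproc)")".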

shell

root@node244:/mnt/disk/compress_test# for i in {1..9}; do echo "============================  Compress Level $i  ==========================="; echo " ----- Compress -----"; time pbzip2 -k -f -v -p24 -$i ceph-client.radosgw.0.log; echo "----- Uncompress -----"; time pbzip2 -p24 -d -f ceph-client.radosgw.0.log.bz2; done
============================  Compress Level 1  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 100 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 20251314 bytes
-------------------------------------------

     Wall Clock: 5.553146 seconds

real	0m5.557s
user	2m10.868s
sys	0m0.600s
----- Uncompress -----

real	0m0.675s
user	0m11.626s
sys	0m0.710s
============================  Compress Level 2  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 200 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 17034671 bytes
-------------------------------------------

     Wall Clock: 6.295284 seconds

real	0m6.299s
user	2m28.521s
sys	0m0.892s
----- Uncompress -----

real	0m0.706s
user	0m12.163s
sys	0m0.780s
============================  Compress Level 3  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 300 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 15336308 bytes
-------------------------------------------

     Wall Clock: 7.094542 seconds

real	0m7.098s
user	2m46.324s
sys	0m1.328s
----- Uncompress -----

real	0m1.047s
user	0m12.899s
sys	0m0.723s
============================  Compress Level 4  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 400 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 15289111 bytes
-------------------------------------------

     Wall Clock: 7.390034 seconds

real	0m7.393s
user	2m53.305s
sys	0m1.428s
----- Uncompress -----

real	0m0.738s
user	0m13.124s
sys	0m0.806s
============================  Compress Level 5  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 500 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 14424635 bytes
-------------------------------------------

     Wall Clock: 7.929576 seconds

real	0m7.933s
user	3m5.904s
sys	0m1.616s
----- Uncompress -----

real	0m0.783s
user	0m14.003s
sys	0m0.780s
============================  Compress Level 6  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 600 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 14320561 bytes
-------------------------------------------

     Wall Clock: 8.235693 seconds

real	0m8.240s
user	3m12.339s
sys	0m2.512s
----- Uncompress -----

real	0m0.770s
user	0m13.975s
sys	0m0.789s
============================  Compress Level 7  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 700 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 14284009 bytes
-------------------------------------------

     Wall Clock: 8.463171 seconds

real	0m8.467s
user	3m17.862s
sys	0m2.209s
----- Uncompress -----

real	0m0.828s
user	0m14.793s
sys	0m0.932s
============================  Compress Level 8  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 800 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 14161949 bytes
-------------------------------------------

     Wall Clock: 9.343830 seconds

real	0m9.348s
user	3m37.798s
sys	0m3.541s
----- Uncompress -----

real	0m0.844s
user	0m15.348s
sys	0m0.937s
============================  Compress Level 9  ===========================
 ----- Compress -----
Parallel BZIP2 v1.1.9     - by: Jeff Gilchrist [http://compression.ca]
[Apr. 13, 2014]               (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov <nikolov.javor+pbzip2@gmail.com>

         # CPUs: 24
 BWT Block Size: 900 KB
File Block Size: 900 KB
 Maximum Memory: 100 MB
-------------------------------------------
         File #: 1 of 1
     Input Name: ceph-client.radosgw.0.log
    Output Name: ceph-client.radosgw.0.log.bz2

     Input Size: 571073631 bytes
Compressing data...
    Output Size: 13295516 bytes
-------------------------------------------

     Wall Clock: 10.434447 seconds

real	0m10.438s
user	4m3.675s
sys	0m4.175s
----- Uncompress -----

real	0m0.796s
user	0m15.991s
sys	0m0.911s
root@node244:/mnt/disk/compress_test#

xz compression and decompression

shell

root@node244:/mnt/disk/compress_test# for i in {0..9}; do echo "============================  Compress Level $i  ==========================="; echo " ----- Compress -----"; rm -rf *.xz; time xz -k -f -v -$i ceph-client.radosgw.0.log; echo " ----- Compress info -----"; xz -l -v ceph-client.radosgw.0.log.xz; echo "----- Uncompress -----"; time xz -d -f ceph-client.radosgw.0.log.xz; done
============================  Compress Level 0  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        17.9 MiB / 544.6 MiB = 0.033    71 MiB/s       0:07             

real	0m7.703s
user	0m7.567s
sys	0m0.136s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    17.9 MiB (18,732,968 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.033
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      18,732,968     571,073,631  0.033  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      18,732,928     571,073,631  0.033  CRC64
----- Uncompress -----

real	0m2.886s
user	0m1.895s
sys	0m0.444s
============================  Compress Level 1  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        17.2 MiB / 544.6 MiB = 0.032    56 MiB/s       0:09             

real	0m9.666s
user	0m9.514s
sys	0m0.152s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    17.2 MiB (18,010,756 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.032
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      18,010,756     571,073,631  0.032  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      18,010,716     571,073,631  0.032  CRC64
----- Uncompress -----

real	0m2.282s
user	0m1.719s
sys	0m0.496s
============================  Compress Level 2  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        16.3 MiB / 544.6 MiB = 0.030    45 MiB/s       0:12             

real	0m12.128s
user	0m11.964s
sys	0m0.164s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    16.3 MiB (17,065,980 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.030
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      17,065,980     571,073,631  0.030  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      17,065,940     571,073,631  0.030  CRC64
----- Uncompress -----

real	0m2.277s
user	0m1.694s
sys	0m0.517s
============================  Compress Level 3  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        16.1 MiB / 544.6 MiB = 0.030    32 MiB/s       0:16             

real	0m16.986s
user	0m16.778s
sys	0m0.208s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    16.1 MiB (16,899,448 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.030
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      16,899,448     571,073,631  0.030  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      16,899,408     571,073,631  0.030  CRC64
----- Uncompress -----

real	0m2.252s
user	0m1.664s
sys	0m0.532s
============================  Compress Level 4  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        16.4 MiB / 544.6 MiB = 0.030    17 MiB/s       0:31             

real	0m31.449s
user	0m31.220s
sys	0m0.228s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    16.4 MiB (17,154,260 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.030
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      17,154,260     571,073,631  0.030  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      17,154,220     571,073,631  0.030  CRC64
----- Uncompress -----

real	0m2.349s
user	0m1.868s
sys	0m0.412s
============================  Compress Level 5  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        15.6 MiB / 544.6 MiB = 0.029    12 MiB/s       0:45             

real	0m45.585s
user	0m45.152s
sys	0m0.420s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    15.6 MiB (16,360,932 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.029
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      16,360,932     571,073,631  0.029  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      16,360,892     571,073,631  0.029  CRC64
----- Uncompress -----

real	0m2.398s
user	0m1.851s
sys	0m0.488s
============================  Compress Level 6  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        15.1 MiB / 544.6 MiB = 0.028   7.1 MiB/s       1:17             

real	1m17.152s
user	1m16.763s
sys	0m0.388s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    15.1 MiB (15,786,352 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.028
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      15,786,352     571,073,631  0.028  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      15,786,312     571,073,631  0.028  CRC64
----- Uncompress -----

real	0m2.302s
user	0m1.752s
sys	0m0.487s
============================  Compress Level 7  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        15.1 MiB / 544.6 MiB = 0.028   6.7 MiB/s       1:20             

real	1m20.962s
user	1m20.478s
sys	0m0.460s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    15.1 MiB (15,862,848 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.028
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      15,862,848     571,073,631  0.028  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      15,862,808     571,073,631  0.028  CRC64
----- Uncompress -----

real	0m2.368s
user	0m1.803s
sys	0m0.497s
============================  Compress Level 8  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        15.4 MiB / 544.6 MiB = 0.028   6.4 MiB/s       1:25             

real	1m25.153s
user	1m24.488s
sys	0m0.660s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    15.4 MiB (16,130,916 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.028
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      16,130,916     571,073,631  0.028  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      16,130,876     571,073,631  0.028  CRC64
----- Uncompress -----

real	0m2.359s
user	0m1.768s
sys	0m0.535s
============================  Compress Level 9  ===========================
 ----- Compress -----
ceph-client.radosgw.0.log (1/1)
  100 %        15.7 MiB / 544.6 MiB = 0.029   5.9 MiB/s       1:32             

real	1m32.645s
user	1m31.211s
sys	0m1.424s
 ----- Compress info -----
ceph-client.radosgw.0.log.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    15.7 MiB (16,510,384 B)
  Uncompressed size:  544.6 MiB (571,073,631 B)
  Ratio:              0.029
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      16,510,384     571,073,631  0.029  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check
         1         1              12               0      16,510,344     571,073,631  0.029  CRC64
----- Uncompress -----

real	0m2.394s
user	0m1.763s
sys	0m0.500s
root@node244:/mnt/disk/compress_test# 

Test Results

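Compression Size

Unit: bytes

| Compress Level | gzip | bzip2 | pbzip2 | xz |
|---|---|---|---|---|
| 0 | - | - | - | 18,732,968 |
| 1 | 27,542,300 | 20,203,523 | 20,251,314 | 18,010,756 |
| 2 | 27,158,636 | 16,627,875 | 17,034,671 | 17,065,980 |
| 3 | 26,820,084 | 15,297,380 | 15,336,308 | 16,899,448 |
| 4 | 22,196,434 | 14,664,242 | 15,289,111 | 17,154,260 |
| 5 | 20,643,438 | 14,214,101 | 14,424,635 | 16,360,932 |
| 6 | 19,503,464 | 13,816,797 | 14,320,561 | 15,786,352 |
| 7 | 19,396,724 | 13,601,242 | 14,284,009 | 15,862,848 |
| 8 | 18,411,146 | 13,393,739 | 14,161,949 | 16,130,916 |
| 9 | 18,186,392 | 13,261,043 | 13,295,516 | 16,510,384 |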

Compression Time

Starting with compression time, the table below shows how long each compression level from 1 to 9 took to complete.

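Unit: minutes and seconds (the real value reported by time)

| Compress Level | gzip | bzip2 | pbzip2 | xz |
|---|---|---|---|---|
| 0 | - | - | - | 0m7.703s |
| 1 | 0m3.389s | 1m14.986s | 0m5.557s | 0m9.666s |
| 2 | 0m3.385s | 1m26.956s | 0m6.299s | 0m12.128s |
| 3 | 0m3.407s | 1m35.014s | 0m7.098s | 0m16.986s |
| 4 | 0m4.302s | 1m41.031s | 0m7.393s | 0m31.449s |
| 5 | 0m4.385s | 1m46.857s | 0m7.933s | 0m45.585s |
| 6 | 0m5.298s | 1m50.821s | 0m8.240s | 1m17.152s |
| 7 | 0m5.518s | 1m55.011s | 0m8.467s | 1m20.962s |
| 8 | 0m6.961s | 2m0.052s | 0m9.348s | 1m25.153s |
| 9 | 0m7.233s | 2m1.284s | 0m10.438s | 1m32.645s |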

The data shows that as the compression level rises, bzip2 takes noticeably longer to finish, pbzip2 and gzip change little, and xz grows sharply after level 3.

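Compression Ratio

Unit: %

| Compress Level | gzip | bzip2 | pbzip2 | xz |
|---|---|---|---|---|
| 0 | - | - | - | 3.3 |
| 1 | 4.8 | 3.54 | 3.55 | 3.2 |
| 2 | 4.8 | 2.91 | 2.98 | 3.0 |
| 3 | 4.7 | 2.68 | 2.69 | 3.0 |
| 4 | 3.9 | 2.57 | 2.68 | 3.0 |
| 5 | 3.6 | 2.49 | 2.53 | 2.9 |
| 6 | 3.4 | 2.42 | 2.51 | 2.8 |
| 7 | 3.4 | 2.38 | 2.50 | 2.8 |
| 8 | 3.2 | 2.35 | 2.48 | 2.8 |
| 9 | 3.2 | 2.32 | 2.33 | 2.9 |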

For the compression ratio, smaller is better. For example, if the source file is 100 MiB and the ratio is 5%, the compressed file is 5 MiB.
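
The ratio is simply compressed size divided by original size. A minimal sketch with awk, where ratio_pct is a hypothetical helper name and the sizes are taken from the gzip level 1 run:

```shell
# Hypothetical helper: compressed size as a percentage of the original size.
ratio_pct() {
    awk -v c="$1" -v o="$2" 'BEGIN { printf "%.2f\n", c / o * 100 }'
}

ratio_pct 27542300 571073631   # gzip -1: prints 4.82
ratio_pct 5 100                # the 5 MiB / 100 MiB example: prints 5.00
```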

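The trend is that the higher the compression level, the lower the ratio, meaning the file is compressed smaller. xz delivers a fairly steady ratio throughout (although, as the compression time table shows, it takes much longer to get there after level 3), closely followed by pbzip2 and bzip2. gzip's ratio only starts to drop noticeably after level 3.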

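Compression Speed

Unit: MiB/s

| Compress Level | gzip | bzip2 | pbzip2 | xz |
|---|---|---|---|---|
| 0 | - | - | - | 70.702 |
| 1 | 160.702 | 7.263 | 98.006 | 56.344 |
| 2 | 160.892 | 6.263 | 86.461 | 44.906 |
| 3 | 159.853 | 5.732 | 76.728 | 32.063 |
| 4 | 126.597 | 5.391 | 73.667 | 17.318 |
| 5 | 124.200 | 5.097 | 68.652 | 11.947 |
| 6 | 102.797 | 4.914 | 66.094 | 7.059 |
| 7 | 98.698 | 4.735 | 64.322 | 6.727 |
| 8 | 78.239 | 4.536 | 58.260 | 6.396 |
| 9 | 75.296 | 4.490 | 52.176 | 5.879 |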
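
The speed figures can be reproduced from the input size and the real time of each run. The sketch below assumes 1 MiB = 1048576 bytes; to_seconds and mib_per_sec are hypothetical helper names.

```shell
# Convert time output such as "1m14.986s" into seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Throughput in MiB/s: bytes / 1048576 / seconds.
mib_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.3f\n", b / 1048576 / s }'
}

to_seconds 1m14.986s            # prints 74.986
mib_per_sec 571073631 3.389     # gzip -1: prints 160.702
```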

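Decompression Time

Unit: minutes and seconds (the real value reported by time)

| Compress Level | gzip | bzip2 | pbzip2 | xz |
|---|---|---|---|---|
| 0 | - | - | - | 0m2.886s |
| 1 | 0m2.502s | 0m8.986s | 0m0.675s | 0m2.282s |
| 2 | 0m2.782s | 0m8.858s | 0m0.706s | 0m2.277s |
| 3 | 0m2.403s | 0m9.019s | 0m1.047s | 0m2.252s |
| 4 | 0m2.413s | 0m9.702s | 0m0.738s | 0m2.349s |
| 5 | 0m2.370s | 0m9.378s | 0m0.783s | 0m2.398s |
| 6 | 0m2.367s | 0m9.688s | 0m0.770s | 0m2.302s |
| 7 | 0m2.435s | 0m9.197s | 0m0.828s | 0m2.368s |
| 8 | 0m2.480s | 0m9.232s | 0m0.844s | 0m2.359s |
| 9 | 0m2.877s | 0m9.246s | 0m0.796s | 0m2.394s |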

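Because the values are small, little can be read from a chart directly. The table shows that decompression time barely changes with the level the file was compressed at. xz's decompression time is almost a flat line and very steady; pbzip2 is slightly less steady but decompresses faster than xz.

Taking compression time and compression ratio together with decompression time, pbzip2 is the overall recommendation for compressing and decompressing files.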

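Decompression Speed

Unit: MiB/s (compressed file size divided by decompression time)

| Compress Level | gzip | bzip2 | pbzip2 | xz |
|---|---|---|---|---|
| 0 | - | - | - | 6.202 |
| 1 | 10.498 | 2.144 | 28.612 | 7.537 |
| 2 | 9.310 | 1.790 | 23.011 | 7.159 |
| 3 | 10.644 | 1.618 | 13.970 | 7.149 |
| 4 | 8.773 | 1.441 | 19.757 | 6.982 |
| 5 | 8.307 | 1.445 | 17.569 | 6.505 |
| 6 | 7.858 | 1.360 | 17.737 | 6.560 |
| 7 | 7.598 | 1.410 | 16.452 | 6.558 |
| 8 | 7.080 | 1.384 | 16.002 | 6.528 |
| 9 | 6.028 | 1.368 | 15.929 | 6.558 |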

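Performance Differences and Comparison

By default, when no compression level is specified, gzip uses -6, bzip2 and pbzip2 use -9, and xz uses -6.

The test results make the reason clear. For gzip and xz, -6 as the default gives a good level of compression without taking too long, a reasonable balance point, since higher levels take longer to process. pbzip2, on the other hand, is best left at its default level of 9, as its man page recommends, and the results here confirm it: the ratio improves while the time taken stays low, rising only from about 5.6 s to 10.4 s between levels 1 and 9. bzip2, by contrast, shows fairly large swings in compression time across levels.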
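
The gzip default can be checked on a small generated input. This is a sketch that assumes GNU gzip: gzip_default_is is a hypothetical helper, and -n is used so that no name or timestamp ends up in the header and the two outputs can be compared byte for byte.

```shell
# Hypothetical helper: succeed if gzip without a level flag produces the
# same bytes as gzip with the given level.
gzip_default_is() {
    level=$1
    tmpdir=$(mktemp -d)
    seq 1 20000 | gzip -n -c           > "$tmpdir/default.gz"
    seq 1 20000 | gzip -n -c "-$level" > "$tmpdir/level.gz"
    cmp -s "$tmpdir/default.gz" "$tmpdir/level.gz"
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

gzip_default_is 6 && echo "gzip default matches -6"
```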

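In general, xz achieves very good and very steady compression, followed by bzip2 and pbzip2, then gzip. That compression comes at a cost, though: at higher levels xz takes a long time to finish, while pbzip2 and gzip are far quicker, and bzip2 is the slowest of all.

xz defaults to level 6, while pbzip2 at level 9 takes only slightly longer than gzip and compresses better. The gap between pbzip2 and xz is smaller than the gap between pbzip2 and gzip, which makes pbzip2 the preferred choice for compression.

Based on these results, pbzip2 is a good middle ground for compression. gzip is only slightly faster, and xz may not really be worth using, especially at higher compression levels (>= 6), because it takes much longer to finish.

Decompression with bzip2, however, takes longer than with xz, gzip, or pbzip2. xz sits in a good middle ground for decompression, and pbzip2 is the fastest.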

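Which Compression Method to Choose?

So which compression/decompression method should you choose? That depends entirely on the purpose; pick the method that fits the situation.

  • For interactive compression, pxz can be used, since it shows the compression progress;

  • If you just want to compress and decompress as fast as possible and care little about the ratio, gzip is a good choice;

  • If you want a better compression ratio to save disk space and are willing to spend more time compressing, xz is a good choice, followed by pbzip2.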
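
The rule of thumb in the list above could be wrapped in a small helper for scripts. This is only an illustration: suggest_tool and its goal names (interactive, speed, ratio) are hypothetical, and the mapping just restates the list.

```shell
# Hypothetical helper: map a goal to the tool suggested in the list above.
suggest_tool() {
    case "$1" in
        interactive) echo "pxz" ;;
        speed)       echo "gzip" ;;
        ratio)       echo "xz" ;;
        *)           echo "unknown goal: $1" >&2; return 1 ;;
    esac
}

suggest_tool speed   # prints gzip
```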


Source: Transcendent
Author: Gavin Wang
Article link: gzip vs bzip2 vs xz vs pbzip2 性能对比 | Transcendent
Copyright belongs to the author; please credit the source for any form of reproduction.
