Linux command: pigz, a multi-threaded compression tool

When learning Linux you typically pick up a handful of compression tools: gzip, bzip2, zip, and xz, along with their matching decompression tools. For usage of these tools and a comparison of their compression ratios and times, see: Linux中归档压缩工具学习.

So what is pigz? In short, it is gzip with parallel compression. By default pigz runs one thread per online logical CPU; if the count cannot be detected, it falls back to 8 threads. You can also set the thread count explicitly with -p. Be aware that its CPU usage is correspondingly high.
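To see what that default will be on your machine, you can check the online processor count yourself with `nproc` (GNU coreutils) and pass it explicitly; the file name below is just a placeholder:

```shell
# Number of online processors -- this is what pigz defaults to for -p:
nproc
# Equivalent explicit invocation (bigfile.log is a placeholder name):
# pigz -p "$(nproc)" bigfile.log
```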

Enough preamble; let's run some tests.

$ yum install pigz


$ pigz --help

Usage: pigz [options] [files ...]

will compress files in place, adding the suffix '.gz'. If no files are

specified, stdin will be compressed to stdout. pigz does what gzip does,

but spreads the work over multiple processors and cores when compressing.

Options:

-0 to -9, -11 Compression level (11 is much slower, a few % better)

--fast, --best Compression levels 1 and 9 respectively

-b, --blocksize mmm Set compression block size to mmmK (default 128K)

-c, --stdout Write all processed output to stdout (won't delete)

-d, --decompress Decompress the compressed input

-f, --force Force overwrite, compress .gz, links, and to terminal

-F --first Do iterations first, before block split for -11

-h, --help Display a help screen and quit

-i, --independent Compress blocks independently for damage recovery

-I, --iterations n Number of iterations for -11 optimization

-k, --keep Do not delete original file after processing

-K, --zip Compress to PKWare zip (.zip) single entry format

-l, --list List the contents of the compressed input

-L, --license Display the pigz license and quit

-M, --maxsplits n Maximum number of split blocks for -11

-n, --no-name Do not store or restore file name in/from header

-N, --name Store/restore file name and mod time in/from header

-O --oneblock Do not split into smaller blocks for -11

-p, --processes n Allow up to n compression threads (default is the

number of online processors, or 8 if unknown)

-q, --quiet Print no messages, even on error

-r, --recursive Process the contents of all subdirectories

-R, --rsyncable Input-determined block locations for rsync

-S, --suffix .sss Use suffix .sss instead of .gz (for compression)

-t, --test Test the integrity of the compressed input

-T, --no-time Do not store or restore mod time in/from header

-v, --verbose Provide more verbose output

-V --version Show the version of pigz

-z, --zlib Compress to zlib (.zz) instead of gzip format

-- All arguments after "--" are treated as files


Original directory size

$ du -sh /tmp/hadoop

2.3G /tmp/hadoop


Compressing with gzip (single thread)

# Compression time:

$ time tar -zvcf hadoop.tar.gz /tmp/hadoop

real 0m49.935s

user 0m46.205s

sys 0m3.449s

# Compressed size:

$ du -sh hadoop.tar.gz

410M hadoop.tar.gz


Decompressing the gzip archive

$ time tar xf hadoop.tar.gz

real 0m17.226s

user 0m14.647s

sys 0m4.957s


Compressing with pigz (4 threads)

# Compression time:

$ time tar -cf - /tmp/hadoop | pigz -p 4 > hadoop.tgz

real 0m13.596s

user 0m48.181s

sys 0m2.045s

# Compressed size:

$ du -sh hadoop.tgz

411M hadoop.tgz


Decompressing the pigz archive

$ time pigz -p 4 -d hadoop.tgz

real 0m17.508s

user 0m12.973s

sys 0m5.037s

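Note that `pigz -p 4 -d hadoop.tgz` only undoes the gzip layer, leaving `hadoop.tar` to be extracted in a second step. GNU tar can hand (de)compression to an external program via `-I`/`--use-compress-program`, making archive and extract one-step operations. The sketch below uses gzip as the compressor so it runs even where pigz is not installed; substitute `'pigz -p 4'` on a real system:

```shell
# Create a small test tree, archive it through an external compressor,
# then extract in one step. gzip stands in for pigz here.
mkdir -p demo/sub
echo 'hello' > demo/sub/file.txt
tar --use-compress-program=gzip -cf demo.tgz demo   # with pigz: -I 'pigz -p 4'
rm -rf demo
tar --use-compress-program=gzip -xf demo.tgz        # with pigz: -I 'pigz -p 4'
cat demo/sub/file.txt                               # prints: hello
```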

As the numbers show, pigz cut the compression time by more than two thirds compared with gzip (roughly 50s down to 14s), but burned several times the CPU: on this 4-thread VM, pigz pushed CPU usage to essentially 100%. (The decompression times are nearly identical because gzip decompression itself cannot be parallelized; pigz only uses extra threads for reading, writing, and checksums.) So pigz is an excellent fit when compression speed matters and a short burst of high CPU usage is acceptable.

Of course, pigz does not keep getting faster as threads are added; there is a point of diminishing returns. One comparison found online: 8 threads were 41.2% faster than 4, 16 threads 27.9% faster than 8, and 32 threads only 3% faster than 16. The higher the thread count, the smaller the gain. Test with your own workload to find the sweet spot.
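One format detail helps explain why parallel compression can still produce output that plain gzip and tar understand: the gzip format allows a file to consist of multiple concatenated members, each compressed independently, and standard gzip decompresses them all in sequence. (pigz itself stitches its parallel blocks together more cleverly than plain concatenation, so this is an illustration of the format property, not of pigz internals.) A minimal demonstration with stock gzip:

```shell
# Compress two chunks independently and concatenate the results;
# the combined file is still valid input for gzip.
printf 'hello ' > part1
printf 'world\n' > part2
gzip -c part1  > combined.gz
gzip -c part2 >> combined.gz
gzip -dc combined.gz    # prints: hello world
```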
