How are zlib, gzip and zip related? What do they have in common and how are they different?


The compression algorithm used in zlib is essentially the same as that in gzip and zip. What are gzip and zip? How are they different, and how are they the same?


#1

Reference: https://stackoom.com/question/1P7AU/如何将zlib-gzip和zip相关联-它们有什么共同之处-它们有何不同


#2

ZIP is a file format used for storing an arbitrary number of files and folders together with lossless compression. It makes no strict assumptions about the compression methods used, but is most frequently used with DEFLATE.

Gzip is both a compression utility based on DEFLATE, which is less encumbered by potential patents than older alternatives, and a file format for storing a single compressed file. On its own it handles one file; combined with tar, it can compress an arbitrary number of files and folders. The resulting file has an extension of .tgz or .tar.gz and is commonly called a tarball.

zlib is a library of functions that implements DEFLATE, which combines LZ77-style matching with Huffman coding.


#3

The most important difference is that gzip can only compress a single file, while zip compresses multiple files one by one and then archives them into a single file. Thus gzip is used together with tar most of the time (though there are other possibilities). This brings some advantages and disadvantages.

If you have a big archive and you need only a single file out of it, you have to decompress the whole gzip file to get to that file. This is not required if you have a zip file.

On the other hand, if you compress 10 similar or even identical files, the zip archive will be much bigger because each file is compressed individually. With gzip in combination with tar, a single file is compressed, which is much more effective if the files are similar or identical.
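The size difference described above can be checked with a small experiment. The following is only a sketch using Python's standard zipfile and tarfile modules; the file names and the 4 KiB pseudo-random payload are made up for illustration, and the exact byte counts will vary with the zlib build.

```python
import io
import random
import tarfile
import zipfile


def make_payload(size=4096, seed=0):
    """Pseudo-random bytes: nearly incompressible on their own, so only
    redundancy between files can shrink the archive."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def zip_size(files):
    """Size of an in-memory .zip where each member is deflated separately."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())


def tar_gz_size(files):
    """Size of an in-memory .tar.gz where one gzip stream covers all members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


payload = make_payload()
files = {f"copy_{i}.bin": payload for i in range(10)}  # ten identical files
print("zip:   ", zip_size(files), "bytes")
print("tar.gz:", tar_gz_size(files), "bytes")
```

One caveat on this sketch: Deflate can only refer back 32 KiB, so the tar.gz advantage shows up here because the identical copies sit close together in the tar stream. With much larger files the duplicates would fall outside that window.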


#4

Short form:

.zip is an archive format using, usually, the Deflate compression method. The .gz gzip format is for single files, also using the Deflate compression method. Often gzip is used in combination with tar to make a compressed archive format, .tar.gz. The zlib library provides Deflate compression and decompression code for use by zip, gzip, png (which uses the zlib wrapper on deflate data), and many other applications.

Long form:

The ZIP format was developed by Phil Katz as an open format with an open specification, where his implementation, PKZIP, was shareware. It is an archive format that stores files and their directory structure, where each file is individually compressed. The file type is .zip. The files, as well as the directory structure, can optionally be encrypted.

The ZIP format supports several compression methods:

0 - The file is stored (no compression)
1 - The file is Shrunk
2 - The file is Reduced with compression factor 1
3 - The file is Reduced with compression factor 2
4 - The file is Reduced with compression factor 3
5 - The file is Reduced with compression factor 4
6 - The file is Imploded
7 - Reserved for Tokenizing compression algorithm
8 - The file is Deflated
9 - Enhanced Deflating using Deflate64(tm)
10 - PKWARE Data Compression Library Imploding (old IBM TERSE)
11 - Reserved by PKWARE
12 - File is compressed using BZIP2 algorithm
13 - Reserved by PKWARE
14 - LZMA (EFS)
15 - Reserved by PKWARE
16 - Reserved by PKWARE
17 - Reserved by PKWARE
18 - File is compressed using IBM TERSE (new)
19 - IBM LZ77 z Architecture (PFS)
97 - WavPack compressed data
98 - PPMd version I, Rev 1

Methods 1 to 7 are historical and are not in use. Methods 9 through 98 are relatively recent additions, and are in varying, small amounts of use. The only method in truly widespread use in the ZIP format is method 8, Deflate, and to some smaller extent method 0, which is no compression at all. Virtually every .zip file that you will come across in the wild will use exclusively methods 8 and 0, likely just method 8. (Method 8 also has a means to effectively store the data with no compression and relatively little expansion, and Method 0 cannot be streamed whereas Method 8 can be.)
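To see the method numbers in practice, here is a minimal sketch with Python's standard zipfile module, whose constants ZIP_STORED and ZIP_DEFLATED have the values 0 and 8 from the list above. The member names and contents are invented for the example.

```python
import io
import zipfile


def build_two_method_archive():
    """Write one stored (method 0) and one deflated (method 8) member."""
    text = b"hello " * 100  # 600 bytes of very repetitive data
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("stored.txt", text, compress_type=zipfile.ZIP_STORED)
        zf.writestr("deflated.txt", text, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_methods(archive_bytes):
    """Map each member name to (method number, original size, stored size)."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {
            info.filename: (info.compress_type, info.file_size, info.compress_size)
            for info in zf.infolist()
        }


for name, (method, size, packed) in list_methods(build_two_method_archive()).items():
    print(f"{name}: method {method}, {size} -> {packed} bytes")
```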

The ISO/IEC 21320-1:2015 standard for file containers is a restricted zip format, such as used in Java archive files (.jar), Office Open XML files (Microsoft Office .docx, .xlsx, .pptx), OpenDocument Format files (.odt, .ods, .odp), and EPUB files (.epub). That standard limits the compression methods to 0 and 8, and adds other constraints such as no encryption or signatures.

Around 1990, the Info-ZIP group wrote portable, free, open source implementations of zip and unzip utilities, supporting compression with the Deflate format, and decompression of that and the earlier formats. This greatly expanded the use of the .zip format.

In the early '90s, the gzip format was developed as a replacement for the Unix compress utility, derived from the Deflate code in the Info-ZIP utilities. Unix compress was designed to compress a single file or stream, appending a .Z to the file name. compress uses the LZW compression algorithm, which at the time was under patent and its free use was in dispute by the patent holders. Though some specific implementations of Deflate were patented by Phil Katz, the format was not, and so it was possible to write a Deflate implementation that did not infringe on any patents. That implementation has not been so challenged in the last 20+ years. The Unix gzip utility was intended as a drop-in replacement for compress, and in fact is able to decompress compress-compressed data (assuming that you were able to parse that sentence). gzip appends a .gz to the file name. gzip uses the Deflate compressed data format, which compresses quite a bit better than Unix compress, has very fast decompression, and adds a CRC-32 as an integrity check for the data. The header format also permits the storage of more information than the compress format allowed, such as the original file name and the file modification time.
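The header and trailer fields mentioned above can be inspected directly. The following is a rough sketch with Python's standard gzip, struct and zlib modules; the file name "notes.txt" and the timestamp are arbitrary example values, and the parser assumes a simple stream with no optional extra field.

```python
import gzip
import io
import struct
import zlib

FEXTRA = 0x04  # header flag: optional extra field present
FNAME = 0x08   # header flag: original file name present


def gzip_bytes(data, name, mtime):
    """Compress data into an in-memory gzip stream with a stored name and mtime."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def parse_gzip(raw):
    """Read the fixed header fields, the stored name, and the 8-byte trailer."""
    magic, method, flags, mtime = struct.unpack("<2sBBI", raw[:8])
    if flags & FEXTRA:
        raise ValueError("this sketch does not handle the optional extra field")
    name = None
    if flags & FNAME:
        end = raw.index(b"\x00", 10)  # the name is zero-terminated
        name = raw[10:end].decode("latin-1")
    crc32, isize = struct.unpack("<II", raw[-8:])  # CRC-32 and length of the original data
    return {"magic": magic, "method": method, "mtime": mtime,
            "name": name, "crc32": crc32, "isize": isize}


data = b"example payload\n" * 50
info = parse_gzip(gzip_bytes(data, "notes.txt", 1000000000))
print(info)
print("CRC-32 matches:", info["crc32"] == zlib.crc32(data))
```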

Though compress only compresses a single file, it was common to use the tar utility to create an archive of files, their attributes, and their directory structure into a single .tar file, and then to compress it with compress to make a .tar.Z file. In fact the tar utility had and still has an option to do the compression at the same time, instead of having to pipe the output of tar to compress. This all carried forward to the gzip format, and tar has an option to compress directly to the .tar.gz format. The .tar.gz format compresses better than the .zip approach, since the compression of a .tar can take advantage of redundancy across files, especially many small files. .tar.gz is the most common archive format in use on Unix due to its very high portability, but there are more effective compression methods in use as well, so you will often see .tar.bz2 and .tar.xz archives.

Unlike .tar, .zip has a central directory at the end, which provides a list of the contents. That and the separate compression provide random access to the individual entries in a .zip file. A .tar file would have to be decompressed and scanned from start to end in order to build a directory, which is how a .tar file is listed.
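As an illustration of that random access, here is a minimal sketch using Python's standard zipfile module. The member names and contents are hypothetical; the point is that listing and extracting one entry goes through the central directory and never decompresses the other members.

```python
import io
import zipfile


def pack(files):
    """Build an in-memory .zip from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def list_members(archive_bytes):
    """Names come from the central directory; no member data is decompressed."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.namelist()


def read_member(archive_bytes, name):
    """Seek to one entry and decompress only that entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name)


archive = pack({
    "a.txt": b"alpha",
    "dir/b.txt": b"beta" * 1000,
    "c.txt": b"gamma",
})
print(list_members(archive))
print(len(read_member(archive, "dir/b.txt")), "bytes read from dir/b.txt")
```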

Shortly after the introduction of gzip, in the mid-1990s, the same patent dispute called into question the free use of the .gif image format, very widely used on bulletin boards and the World Wide Web (a new thing at the time). So a small group created the PNG losslessly compressed image format, with file type .png, to replace .gif. That format also uses the Deflate format for compression, which is applied after filters on the image data expose more of the redundancy. In order to promote widespread usage of the PNG format, two free code libraries were created: libpng and zlib. libpng handled all of the features of the PNG format, and zlib provided the compression and decompression code for use by libpng, as well as for other applications. zlib was adapted from the gzip code.

All of the mentioned patents have since expired.

The zlib library supports Deflate compression and decompression, and three kinds of wrapping around the deflate streams. Those are: no wrapping at all ("raw" deflate), zlib wrapping, which is used in the PNG format data blocks, and gzip wrapping, to provide gzip routines for the programmer. The main difference between zlib and gzip wrapping is that the zlib wrapping is more compact, six bytes vs. a minimum of 18 bytes for gzip, and the integrity check, Adler-32, runs faster than the CRC-32 that gzip uses. Raw deflate is used by programs that read and write the .zip format, which is another format that wraps around deflate compressed data.
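The three wrappings can be compared side by side. This is a sketch using Python's zlib module, where the wrapper is chosen through the wbits argument (-15 for raw deflate, 15 for zlib, 31 for gzip); the sample data is arbitrary, and the overhead figures printed are what one would expect from the six and 18 bytes quoted above rather than a guarantee for every zlib build.

```python
import zlib

# wbits values accepted by Python's zlib module for a 32 KiB window
RAW, ZLIB, GZIP = -15, 15, 31


def deflate(data, wbits, level=6):
    """Compress data with the wrapper selected by wbits."""
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()


def inflate(blob, wbits):
    """Decompress a stream that uses the wrapper selected by wbits."""
    return zlib.decompress(blob, wbits)


data = b"the same input, three different wrappers\n" * 200
raw = deflate(data, RAW)
zl = deflate(data, ZLIB)
gz = deflate(data, GZIP)

print("raw deflate:", len(raw), "bytes")
print("zlib wrap:  ", len(zl), "bytes, overhead", len(zl) - len(raw))
print("gzip wrap:  ", len(gz), "bytes, overhead", len(gz) - len(raw))

# The trailers carry the integrity checks: Adler-32 (big-endian) for zlib,
# CRC-32 (little-endian) followed by the input length for gzip.
print("Adler-32 ok:", int.from_bytes(zl[-4:], "big") == zlib.adler32(data))
print("CRC-32 ok:  ", int.from_bytes(gz[-8:-4], "little") == zlib.crc32(data))
```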

zlib is now in wide use for data transmission and storage. For example, most HTTP transactions by servers and browsers compress and decompress the data using zlib.

Different implementations of deflate can result in different compressed output for the same input data, as evidenced by the existence of selectable compression levels that allow trading off compression effectiveness for CPU time. zlib and PKZIP are not the only implementations of deflate compression and decompression. Both the 7-Zip archiving utility and Google's zopfli library have the ability to use much more CPU time than zlib in order to squeeze out the last few bits possible when using the deflate format, reducing compressed sizes by a few percent as compared to zlib's highest compression level. The pigz utility, a parallel implementation of gzip, includes the option to use zlib (compression levels 1-9) or zopfli (compression level 11), and somewhat mitigates the time impact of using zopfli by splitting the compression of large files over multiple processors and cores.
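The effect of compression levels is easy to observe with zlib itself. The sketch below uses Python's zlib module on synthetic log-like text (the line format is invented for the example); the sizes it prints depend on the input and the zlib build, but every level decompresses back to the same data.

```python
import random
import zlib


def sample_log(n_lines=5000, seed=1):
    """Synthetic, repetitive text standing in for real data."""
    rng = random.Random(seed)
    lines = [
        f"GET /item/{rng.randrange(1000)} HTTP/1.1 200 {rng.randrange(100000)}\n"
        for _ in range(n_lines)
    ]
    return "".join(lines).encode("ascii")


def sizes_by_level(data, levels=(1, 6, 9)):
    """Compressed size in bytes at each requested zlib level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


data = sample_log()
sizes = sizes_by_level(data)
print("input:", len(data), "bytes")
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
```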
