Chinese translated version of Documentation/xz.txt
If you have any comment or update to the content, please contact the
original document maintainer directly. However, if you have a problem
communicating in English you can also ask the Chinese maintainer for
help. Contact the Chinese maintainer if this translation is outdated
or if there is a problem with the translation.
Chinese maintainer: zhuweijing <313997199@qq.com>
---------------------------------------------------------------------
Chinese translator: Zhu Weijing (朱伟婧) <313997199@qq.com>
Chinese proofreader: Zhu Weijing (朱伟婧) <313997199@qq.com>
XZ data compression in Linux
============================
Introduction
XZ is a general-purpose data compression format with a high compression
ratio and relatively fast decompression. The primary compression
algorithm (filter) is LZMA2. Additional filters can be used to improve
the compression ratio even further. For example, Branch/Call/Jump (BCJ)
filters improve the compression ratio of executable data.
The XZ decompressor in Linux is called XZ Embedded. It supports
the LZMA2 filter and optionally also BCJ filters. CRC32 is supported
for integrity checking. The home page of XZ Embedded is at
<http://tukaani.org/xz/embedded.html>, where you can find the
latest version and also information about using the code outside
the Linux kernel.
For userspace, XZ Utils provide a zlib-like compression library
and a gzip-like command line tool. XZ Utils can be downloaded from
<http://tukaani.org/xz/>.
XZ related components in the kernel
The xz_dec module provides an XZ decompressor with single-call (buffer
to buffer) and multi-call (stateful) APIs. The usage of the xz_dec
module is documented in include/linux/xz.h.
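The difference between the two API styles can be illustrated from
userspace. The following is only a sketch using Python's standard lzma
module (a binding to liblzma from XZ Utils), not the in-kernel xz_dec
API; the helper names and the chunk size are illustrative assumptions.
include/linux/xz.h remains the authoritative description of the kernel
interface.

```python
import lzma


def decompress_single_call(blob: bytes) -> bytes:
    """Buffer to buffer: whole input in, whole output out, one call."""
    return lzma.decompress(blob, format=lzma.FORMAT_XZ)


def decompress_multi_call(blob: bytes, chunk_size: int = 4096) -> bytes:
    """Stateful: one decoder object is fed the input piece by piece."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    out = bytearray()
    for pos in range(0, len(blob), chunk_size):
        # The decoder keeps its state between calls.
        out += decoder.decompress(blob[pos:pos + chunk_size])
    if not decoder.eof:
        raise ValueError("truncated .xz stream")
    return bytes(out)


payload = b"example data " * 1000
blob = lzma.compress(payload, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)
same = decompress_single_call(blob) == decompress_multi_call(blob, 64) == payload
```

Both helpers return the same bytes; they differ only in how much of the
input and output has to be available at once.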
The xz_dec_test module is for testing xz_dec. xz_dec_test is not
useful unless you are hacking the XZ decompressor. xz_dec_test
allocates a char device major dynamically to which one can write
.xz files from userspace. The decompressed output is thrown away.
Keep an eye on dmesg to see diagnostics printed by xz_dec_test.
See the xz_dec_test source code for the details.
For decompressing the kernel image, initramfs, and initrd, there
is a wrapper function in lib/decompress_unxz.c. Its API is the
same as in other decompress_*.c files, which is defined in
include/linux/decompress/generic.h.
scripts/xz_wrap.sh is a wrapper for the xz command line tool found
in XZ Utils. The wrapper sets compression options to values suitable
for compressing the kernel image.
For kernel makefiles, two commands are provided for use with
$(call if_changed). The kernel image should be compressed with
$(call if_changed,xzkern), which will use a BCJ filter and a big LZMA2
dictionary. It will also append a four-byte trailer containing the
uncompressed size of the file, which is needed by the boot code.
Other things should be compressed with $(call if_changed,xzmisc),
which will use no BCJ filter and a 1 MiB LZMA2 dictionary.
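As a rough userspace illustration of the two variants, the sketch below
builds similar filter chains with Python's lzma module. It is not what
the kernel build runs. The function names, the choice of the x86 BCJ
filter, the 4 MiB dictionary in the usage lines, and the little-endian
byte order of the trailer are all assumptions made for the example.

```python
import lzma
import struct


def xzkern_like(data: bytes, dict_size: int) -> bytes:
    """BCJ + LZMA2 with a caller-chosen (big) dictionary, plus a size trailer."""
    filters = [
        {"id": lzma.FILTER_X86},  # BCJ filter; x86 is an assumption
        {"id": lzma.FILTER_LZMA2, "dict_size": dict_size},
    ]
    xz = lzma.compress(data, format=lzma.FORMAT_XZ,
                       check=lzma.CHECK_CRC32, filters=filters)
    # Four-byte trailer holding the uncompressed size.
    # Little-endian byte order is assumed here.
    return xz + struct.pack("<I", len(data) & 0xFFFFFFFF)


def xzmisc_like(data: bytes) -> bytes:
    """No BCJ filter, 1 MiB LZMA2 dictionary, no trailer."""
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": 1024 * 1024}]
    return lzma.compress(data, format=lzma.FORMAT_XZ,
                         check=lzma.CHECK_CRC32, filters=filters)


image = bytes(range(256)) * 64
kern = xzkern_like(image, dict_size=4 * 1024 * 1024)
trailer_size = struct.unpack("<I", kern[-4:])[0]  # equals len(image)
misc = xzmisc_like(image)
```

The trailer is not part of the .xz stream, so a reader has to strip the
last four bytes before handing the rest to a decoder.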
Notes on compression options
Since XZ Embedded supports only streams with no integrity check or
CRC32, make sure that you don't use some other integrity check type
when encoding files that are supposed to be decoded by the kernel. With
liblzma, you need to use either LZMA_CHECK_NONE or LZMA_CHECK_CRC32
when encoding. With the xz command line tool, use --check=none or
--check=crc32.
Using CRC32 is strongly recommended unless there is some other layer
which will verify the integrity of the uncompressed data anyway.
Double checking the integrity would probably be a waste of CPU cycles.
Note that the headers will always have a CRC32 which will be validated
by the decoder; you can only change the integrity check type (or
disable it) for the actual uncompressed data.
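One way to check which integrity check an existing .xz file uses is to
decode it in userspace and ask the decoder. The sketch below does this
with Python's lzma module; the helper names are hypothetical, and the
test only covers the check type, not the filters or dictionary size.

```python
import lzma

# Check types that the text above says XZ Embedded can handle.
KERNEL_OK_CHECKS = {lzma.CHECK_NONE, lzma.CHECK_CRC32}


def stream_check_id(blob: bytes) -> int:
    """Decode the stream and return the ID of its integrity check."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    decoder.decompress(blob)
    return decoder.check


def has_kernel_compatible_check(blob: bytes) -> bool:
    return stream_check_id(blob) in KERNEL_OK_CHECKS


data = b"some payload" * 100
with_crc32 = lzma.compress(data, check=lzma.CHECK_CRC32)
with_none = lzma.compress(data, check=lzma.CHECK_NONE)
with_crc64 = lzma.compress(data, check=lzma.CHECK_CRC64)
```

Here has_kernel_compatible_check() accepts with_crc32 and with_none and
rejects with_crc64.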
In userspace, LZMA2 is typically used with dictionary sizes of several
megabytes. The decoder needs to have the dictionary in RAM, thus big
dictionaries cannot be used for files that are intended to be decoded
by the kernel. 1 MiB is probably the maximum reasonable dictionary
size for in-kernel use (maybe more is OK for initramfs). The presets
in XZ Utils may not be optimal when creating files for the kernel,
so don't hesitate to use custom settings. Example:
    xz --check=crc32 --lzma2=dict=512KiB inputfile
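The same settings can be expressed through liblzma bindings. The sketch
below uses Python's lzma module and also shows why the dictionary size
matters to a stateful decoder: the userspace decoder refuses a stream
whose dictionary does not fit under a given memory limit. The helper
names and the 2 MiB limit are assumptions for illustration, and the
limit is a liblzma feature, not something XZ Embedded provides.

```python
import lzma


def compress_for_kernel(data: bytes, dict_size: int = 512 * 1024) -> bytes:
    """CRC32 check and a single LZMA2 filter with a custom dictionary."""
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ,
                         check=lzma.CHECK_CRC32, filters=filters)


def decodes_within(blob: bytes, memlimit: int) -> bool:
    """True if liblzma can decode blob without exceeding memlimit bytes."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ, memlimit=memlimit)
    try:
        decoder.decompress(blob)
    except lzma.LZMAError:
        # Raised when the limit is exceeded (and also for corrupt input).
        return False
    return True


data = b"kernel data " * 4096
small_dict = compress_for_kernel(data)                   # 512 KiB dictionary
big_dict = compress_for_kernel(data, 4 * 1024 * 1024)    # 4 MiB dictionary
limit = 2 * 1024 * 1024
fits = decodes_within(small_dict, limit)
too_big = not decodes_within(big_dict, limit)
```

The input is identical in both cases; only the dictionary size declared
in the stream decides whether the decoder fits under the limit.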
An exception to the above dictionary size limitation is when the decoder
is used in single-call mode. Decompressing the kernel itself is an
example of this situation. In single-call mode, the memory usage
doesn't depend on the dictionary size, and it is perfectly fine to
use a big dictionary: for maximum compression, the dictionary should
be at least as big as the uncompressed data itself.
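For the single-call case, a dictionary that covers the whole input can
be picked mechanically. The sketch below is one possible approach in
Python, not what scripts/xz_wrap.sh does: it rounds the uncompressed
size up to a power of two, starting from 4096 bytes, the smallest
dictionary liblzma accepts. The helper names are hypothetical.

```python
import lzma
import os

MIN_DICT = 4096  # smallest LZMA2 dictionary size accepted by liblzma


def dict_size_covering(uncompressed_len: int) -> int:
    """Smallest power of two >= uncompressed_len, but at least MIN_DICT."""
    size = MIN_DICT
    while size < uncompressed_len:
        size *= 2
    return size


def compress_with_dict(data: bytes, dict_size: int) -> bytes:
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ,
                         check=lzma.CHECK_CRC32, filters=filters)


# Incompressible 64 KiB block repeated twice: the repeat can only be
# found if the dictionary reaches back at least 64 KiB.
block = os.urandom(64 * 1024)
data = block * 2
wide = compress_with_dict(data, dict_size_covering(len(data)))
narrow = compress_with_dict(data, MIN_DICT)
```

With the covering dictionary the second copy of the block is encoded as
a back-reference, so wide is considerably shorter than narrow.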
Future plans
Creating a limited XZ encoder may be considered if people think it is
useful. LZMA2 is slower to compress than e.g. Deflate or LZO even at
the fastest settings, so it isn't clear whether an LZMA2 encoder is
wanted in the kernel.
Support for limited random-access reading is planned for the
decompression code. I don't know if it could have any use in the
kernel, but I know that it would be useful in some embedded projects
outside the Linux kernel.
Conformance to the .xz file format specification
There are a couple of corner cases where things have been simplified
at the expense of detecting errors as early as possible. These should
not matter in practice at all, since they don't cause security issues.
But it is good to know this if testing the code, e.g. with the test
files from XZ Utils.
Reporting bugs
Before reporting a bug, please check that it hasn't already been
fixed upstream. See <http://tukaani.org/xz/embedded.html> to get the
latest code.
Report bugs to <lasse.collin@tukaani.org> or visit #tukaani on
Freenode and talk to Larhzu. I don't actively read LKML or other
kernel-related mailing lists, so if there's something I should know,
you should email me personally or use IRC.
Don't bother Igor Pavlov with questions about the XZ implementation
in the kernel or about XZ Utils. While these two implementations
include essential code that is directly based on Igor Pavlov's code,
these implementations are neither maintained nor supported by him.