innodb file format: compcat 和 dynamic

COMPACT Row Format

The COMPACT row format reduces row storage space by about 20% compared to the REDUNDANT row format, at the cost of increasing CPU use for some operations. If your workload is a typical one that is limited by cache hit rates and disk speed, COMPACT format is likely to be faster. If the workload is limited by CPU speed, compact format might be slower.

compact格式与redundant相比,节约了20%的存储空间,不过对于某些操作需要更多的CPU。如果使用者的存储和缓存比较紧缺,推荐使用compact格式。如果cpu比较紧缺,compact会比较慢。

The COMPACT row format is supported by both InnoDB file formats (Antelope and Barracuda). For more information, see Section 14.10, “InnoDB File-Format Management”.

compact支持两种文件格式(antelope和barracuda)

Tables that use the COMPACT row format store the first 768 bytes of variable-length column values (VARCHARVARBINARY, and BLOB and TEXT types) in the index record within the B-tree node, with the remainder stored on overflow pages. Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length columns, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.

对于变长列(text,varchar,blob等),前768B的数据是存储在B树节点中(注意这里的含义:innodb使用B+数索引,如果某一列有索引,不管是主索引还是二级所以,索引节点中都会保存此列的值),超过768B的数据,存储在额外的页中。

对于定长列,如果长度超过768B,则按照变长列来处理。

举例,如果一个列的长度超过了768B,则在B+数索引的节点中,该列的存储格式是  列部分数据(小于768B) + 额外页的地址。

如果超出的数据,一个页也放不完,则页存储格式是: 数据 + 下一页的地址。

If the value of a column is 768 bytes or less, an overflow page is not used, and some savings in I/O may result, since the value is stored entirely in the B-tree node. This works well for relatively short BLOB column values, but may cause B-tree nodes to fill with data rather than key values, reducing their efficiency. Tables with many BLOB columns could cause B-tree nodes to become too full, and contain too few rows, making the entire index less efficient than if rows were shorter or column values were stored off-page.

如果一列的数据量小于768B,则不需要额外的页存储,列的值可以放入索引中(这里的索引是该列的索引)。

如果列的值都比较大,则B+树的效率不高。举例,载入256级的B+树,如果每个节点的数据量比较大,则载入2层B+树需要的内存更多。

COMPACT Row Format Storage Characteristics

The COMPACT row format has the following storage characteristics:

  • Each index record contains a 5-byte header that may be preceded by a variable-length header. The header is used to link together consecutive records, and for row-level locking.

  • 每个B+树索引节点都有5B的头部,保存了行锁和指针,指针指向上一个和下一个记录(B+树)。这个头部在"变长度头部"之前。

  • The variable-length part of the record header contains a bit vector for indicating NULL columns. If the number of columns in the index that can be NULL is N, the bit vector occupies CEILING(N/8) bytes. (For example, if there are anywhere from 9 to 16 columns that can be NULL, the bit vector uses two bytes.) Columns that are NULL do not occupy space other than the bit in this vector. The variable-length part of the header also contains the lengths of variable-length columns. Each length takes one or two bytes, depending on the maximum length of the column. If all columns in the index are NOT NULL and have a fixed length, the record header has no variable-length part.

  • 变长头部:对于每个可为null值的列,有一个b存储了该列的值是否为null。如果每一列都不能为null,则不需要此头部。

  • For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes are only needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the 2-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.

  • 对于值不为null的变长列,头部还保存每一列的实际长度。

  • The record header is followed by the data contents of non-NULL columns.

  • 头部之后就是每一列的数据值。

  • Records in the clustered index contain fields for all user-defined columns. In addition, there is a 6-byte transaction ID field and a 7-byte roll pointer field.

  • 主索引的节点还包含了行的每一列数据。另外还有6B保存事务ID,7B保存回滚地址。

  • If no primary key is defined for a table, each clustered index record also contains a 6-byte row ID field.

  • Each secondary index record contains all the primary key columns defined for the clustered index key that are not in the secondary index. If any of the primary key columns are variable length, the record header for each secondary index has a variable-length part to record their lengths, even if the secondary index is defined on fixed-length columns.

  • 二级索引的节点除了保存自身的值,保存了主键的值。如果主键是变长的,还要保存主键的实际长度。

  • Internally, for nonvariable-length character sets, fixed-length character columns such as CHAR(10) are stored in a fixed-length format.

    Trailing spaces are not truncated from VARCHAR columns.

  • 定长列在行中使用固定的存储空间。

  • Internally, for variable-length character sets such as utf8mb3 and utf8mb4InnoDB attempts to store CHAR(N) in N bytes by trimming trailing spaces. If the byte length of a CHAR(N) column value exceeds N bytes, trailing spaces are trimmed to a minimum of the column value byte length. The maximum length of a CHAR(N) column is the maximum character byte length × N.

    A minimum of N bytes is reserved for CHAR(N). Reserving the minimum space N in many cases enables column updates to be done in place without causing index page fragmentation. By comparison, CHAR(N) columns occupy the maximum character byte length × N when using the REDUNDANT row format.

    Fixed-length columns greater than or equal to 768 bytes are encoded as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.

 

DYNAMIC Row Format

The DYNAMIC row format offers the same storage characteristics as the COMPACT row format but adds enhanced storage capabilities for long variable-length columns and supports large index key prefixes.

dynamic格式与compact格式很像,不过添加了对超长可变列和超长索引的支持。

The Barracuda file format supports the DYNAMIC row format. See Section 14.10, “InnoDB File-Format Management”.

Barracuda文件格式支持dynamic行格式

When a table is created with ROW_FORMAT=DYNAMICInnoDB can store long variable-length column values (for VARCHARVARBINARY, and BLOB and TEXT types) fully off-page, with the clustered index record containing only a 20-byte pointer to the overflow page. Fixed-length fields greater than or equal to 768 bytes are encoded as variable-length fields. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.

对于变长列,varchar,varbinary,blob,text,主索引的节点可以不保持它们的值,保持的是20B大小的额外页地址。超过768B的定长列,也是同样的处理。(注意这里说的是:可以不保存,并不是一定。或许这是dynamic这个名字的由来)

Whether columns are stored off-page depends on the page size and the total size of the row. When a row is too long, the longest columns are chosen for off-page storage until the clustered index record fits on the B-tree page. TEXT and BLOB columns that are less than or equal to 40 bytes are stored in line.

何时使用全额外地址的存储方式,取决于行的总长。如果总长太长,无法放入768B的节点,则取最长的列进行处理。

对于TEXT和blob列,如果长度不超过40B,则会存储在行内。

The DYNAMIC row format maintains the efficiency of storing the entire row in the index node if it fits (as do the COMPACT and REDUNDANT formats), but the DYNAMIC row format avoids the problem of filling B-tree nodes with a large number of data bytes of long columns. The DYNAMIC row format is based on the idea that if a portion of a long data value is stored off-page, it is usually most efficient to store the entire value off-page. With DYNAMIC format, shorter columns are likely to remain in the B-tree node, minimizing the number of overflow pages required for a given row.

dynamic格式在compact的基础上,可以减少了主索引节点的大小。有的时候,全额外页存储超长变量的方式效率更高(比如大部分查询没有涉及到使用全额外页存储的列)。

The DYNAMIC row format supports index key prefixes up to 3072 bytes. This feature is controlled by the innodb_large_prefix variable, which is enabled by default. See the innodb_large_prefix variable description for more information.

Tables that use the DYNAMIC row format can be stored in the system tablespace, file-per-table tablespaces, and general tablespaces. To store DYNAMIC tables in the system tablespace, either disable innodb_file_per_table and use a regular CREATE TABLE or ALTER TABLE statement, or use the TABLESPACE [=] innodb_system table option with CREATE TABLE or ALTER TABLE. The innodb_file_per_table and innodb_file_format variables are not applicable to general tablespaces, nor are they applicable when using the TABLESPACE [=] innodb_system table option to store DYNAMIC tables in the system tablespace.

DYNAMIC Row Format Storage Characteristics

The DYNAMIC row format is a variation of the COMPACT row format. For storage characteristics, see COMPACT Row Format Storage Characteristics.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值