Layout
Generally speaking, the journal has this format:
Superblock [(descriptor_block data_blocks|revocation_block) [more data or revocations] commit_block] [more transactions...]
|<---------------------------------- one transaction ----------------------------------->|
Notice that a transaction begins with either a descriptor and some data, or a block revocation list. A finished transaction always ends with a commit. If there is no commit record (or the checksums don't match), the transaction will be discarded during replay.
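The replay rule above can be sketched as a scan over an already-decoded block stream. This is only an illustration, not the kernel's recovery code: the string labels for block kinds and the function name are hypothetical, and checksum verification is left out entirely.

```python
def committed_transactions(blocks):
    """Return the sequence IDs of transactions that end with a commit block.

    `blocks` is an iterable of (kind, sequence) pairs in journal order.
    `kind` is one of "descriptor", "data", "revoke" or "commit"; these
    labels are hypothetical and exist only for this sketch.
    """
    committed = []
    pending = None  # sequence ID of the transaction currently being read
    for kind, seq in blocks:
        if kind == "commit":
            # A commit only finishes the transaction it belongs to.
            if pending == seq:
                committed.append(seq)
            pending = None
        elif kind in ("descriptor", "revoke"):
            # A transaction starts with a descriptor or a revocation block.
            pending = seq
        # "data" blocks simply belong to the pending transaction.
    return committed
```

A transaction whose commit block never appears is simply absent from the result, which mirrors it being discarded during replay.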
Block Header
Every block in the journal starts with a common 12-byte header struct journal_header_s:
Offset | Type | Name | Description |
0x0 | __be32 | h_magic | jbd2 magic number, 0xC03B3998. |
0x4 | __be32 | h_blocktype | Numeric code describing what this block contains. |
0x8 | __be32 | h_sequence | The transaction ID that goes with this block. |
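As a minimal sketch, the 12-byte header can be decoded with Python's `struct` module. The magic number and field order come from the table above. The block-type values in `BLOCK_TYPES` are my understanding of the kernel's `jbd2.h` and should be treated as assumptions. The function name is hypothetical.

```python
import struct

JBD2_MAGIC = 0xC03B3998
# h_magic, h_blocktype, h_sequence: three big-endian 32-bit integers.
HEADER = struct.Struct(">III")

# Assumed block-type codes (not listed in the table above).
BLOCK_TYPES = {
    1: "descriptor",
    2: "commit",
    3: "superblock v1",
    4: "superblock v2",
    5: "revoke",
}

def parse_header(block):
    """Decode the common header, or return None if the magic does not match."""
    magic, blocktype, sequence = HEADER.unpack_from(block, 0)
    if magic != JBD2_MAGIC:
        return None
    return {
        "h_blocktype": blocktype,
        "type_name": BLOCK_TYPES.get(blocktype, "unknown"),
        "h_sequence": sequence,
    }
```

Returning `None` on a magic mismatch lets a caller tell journal metadata blocks apart from data blocks that were written verbatim.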
Super Block
The journal's super block is much simpler than ext4's. The key data it holds are the size of the journal and the location of the start of the transaction log.
The journal superblock is recorded as struct journal_superblock_s, which is 1024 bytes long:
Offset | Type | Name | Description |
0x0 | journal_header_t (12 bytes) | s_header | Common header identifying this as a superblock. |
Static information describing the journal. |
0xC | __be32 | s_blocksize | Journal device block size. |
0x10 | __be32 | s_maxlen | Total number of blocks in this journal. |
0x14 | __be32 | s_first | First block of log information. |
Dynamic information describing the current state of the log. |
0x18 | __be32 | s_sequence | First commit ID expected in log. |
0x1C | __be32 | s_start | Block number of the start of log. Contrary to the comments, this field being zero does not imply that the journal is clean! |
0x20 | __be32 | s_errno | Error value, as set by jbd2_journal_abort(). |
The remaining fields are only valid in a version 2 superblock. |
0x24 | __be32 | s_feature_compat | Compatible feature set, as a bitmask. |
0x28 | __be32 | s_feature_incompat | Incompatible feature set, as a bitmask. |
0x2C | __be32 | s_feature_ro_compat | Read-only compatible feature set. There aren't any of these currently. |
0x30 | __u8 | s_uuid[16] | 128-bit UUID for the journal. This is compared against the copy in the ext4 super block at mount time. |
0x40 | __be32 | s_nr_users | Number of file systems sharing this journal. |
0x44 | __be32 | s_dynsuper | Location of dynamic super block copy. (Not used?) |
0x48 | __be32 | s_max_transaction | Limit of journal blocks per transaction. (Not used?) |
0x4C | __be32 | s_max_trans_data | Limit of data blocks per transaction. (Not used?) |
0x50 | __u32 | s_padding[44] | |
0x100 | __u8 | s_users[16*48] | IDs of all file systems sharing the log. (Not used?) |
In the example journal, the first expected commit ID is 0x670 and the log is located at 0x566000.
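A hedged sketch of reading the superblock fields at the offsets listed above. The function name is hypothetical, and the value `4` used to recognise a version 2 superblock is an assumption based on my reading of `jbd2.h`, not something stated in the table.

```python
import struct

JBD2_MAGIC = 0xC03B3998
# journal_header_t (three __be32) plus the six fields at 0xC..0x20.
SB_FIXED = struct.Struct(">9I")
SUPERBLOCK_V2 = 4  # assumed h_blocktype value for a version 2 superblock

def parse_journal_superblock(buf):
    """Decode the static and dynamic fields of a journal superblock."""
    (magic, blocktype, _hseq, blocksize, maxlen, first,
     sequence, start, errno) = SB_FIXED.unpack_from(buf, 0)
    if magic != JBD2_MAGIC:
        raise ValueError("not a jbd2 block")
    sb = {
        "s_blocksize": blocksize,
        "s_maxlen": maxlen,
        "s_first": first,
        "s_sequence": sequence,
        "s_start": start,
        "s_errno": errno,
    }
    # The fields from 0x24 onward are only valid in a version 2 superblock.
    if blocktype == SUPERBLOCK_V2:
        compat, incompat, ro_compat = struct.unpack_from(">3I", buf, 0x24)
        sb["s_feature_compat"] = compat
        sb["s_feature_incompat"] = incompat
        sb["s_feature_ro_compat"] = ro_compat
        sb["s_uuid"] = bytes(buf[0x30:0x40])
        (sb["s_nr_users"],) = struct.unpack_from(">I", buf, 0x40)
    return sb
```

Note that `s_start` is a block number, so converting it to a byte offset within the journal means multiplying by `s_blocksize`.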
Descriptor Block
The descriptor block contains an array of journal block tags that describe the final locations of the data blocks that follow in the journal. Descriptor blocks are open-coded instead of being completely described by a data structure, but here is the block structure anyway. Descriptor blocks consume at least 36 bytes, but use a full block:
Offset | Type | Name | Description |
0x0 | journal_header_t | (open coded) | Common block header. |
0xC | struct journal_block_tag_s | open coded array[] | Enough tags either to fill up the block or to describe all the data blocks that follow this descriptor block. |
Journal block tags have the following format, as recorded by struct journal_block_tag_s. They can be 8, 12, 24, or 28 bytes (8 or 12 bytes for the fixed fields, plus 16 bytes when the UUID is present):
Offset | Type | Name | Description |
0x0 | __be32 | t_blocknr | Lower 32 bits of the location where the corresponding data block should end up on disk. |
0x4 | __be32 | t_flags | Flags that go with the descriptor, as a bitmask. |
This next field is only present if the super block indicates support for 64-bit block numbers. |
0x8 | __be32 | t_blocknr_high | Upper 32 bits of the location where the corresponding data block should end up on disk. |
This field appears to be open coded. It always comes at the end of the tag, after t_flags or t_blocknr_high. This field is not present if the "same UUID" flag is set. |
0x8 or 0xC | char | uuid[16] | A UUID to go with this tag. This field appears to be copied from a field in struct journal_s that is never set, which means that the UUID is probably all zeroes. Or perhaps it will contain garbage. |
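For example, the data in the fourth block (at 0x56a000) should be written to block 1, which is where the GDT is located.
A flag value of 0x0A marks the last data block, at 0x570000.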
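The tag layout can be walked with a short loop. This is a sketch under stated assumptions: the flag values below are my understanding of `jbd2.h` (they are not listed in the table), and `parse_tags` and its `has_64bit` parameter are hypothetical names. Under those assumed values, 0x0A decomposes into "same UUID" (0x2) plus "last tag" (0x8).

```python
import struct

# Assumed t_flags bit values.
FLAG_ESCAPE = 0x1     # first four bytes of the data block were zeroed
FLAG_SAME_UUID = 0x2  # no uuid[16] follows this tag
FLAG_DELETED = 0x4
FLAG_LAST_TAG = 0x8   # last tag in this descriptor block

def parse_tags(block, has_64bit):
    """Return (blocknr, flags) for each tag in a descriptor block."""
    tags = []
    off = 12  # tags start right after the common header
    fixed = 12 if has_64bit else 8
    while off + fixed <= len(block):
        blocknr, flags = struct.unpack_from(">II", block, off)
        off += 8
        if has_64bit:
            (high,) = struct.unpack_from(">I", block, off)
            blocknr |= high << 32
            off += 4
        if not flags & FLAG_SAME_UUID:
            off += 16  # skip the open-coded uuid[16]
        tags.append((blocknr, flags))
        if flags & FLAG_LAST_TAG:
            break
    return tags
```

The n-th tag describes the n-th data block that follows the descriptor block in the journal.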
Data Block
In general, the data blocks being written to disk through the journal are written verbatim into the journal file after the descriptor block. However, if the first four bytes of the block match the jbd2 magic number then those four bytes are replaced with zeroes and the "escaped" flag is set in the descriptor block.
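The escaping rule can be illustrated with a pair of helper functions. Both function names are hypothetical, and the value 0x1 for the "escaped" flag is an assumption taken from my reading of `jbd2.h`.

```python
import struct

JBD2_MAGIC_BYTES = struct.pack(">I", 0xC03B3998)
FLAG_ESCAPE = 0x1  # assumed value of the "escaped" tag flag

def escape_for_journal(data, flags=0):
    """Return (journal_copy, flags) for a data block about to be journaled."""
    if data[:4] == JBD2_MAGIC_BYTES:
        # Zero the leading magic so the copy cannot be mistaken for metadata.
        return bytes(4) + bytes(data[4:]), flags | FLAG_ESCAPE
    return bytes(data), flags

def unescape_on_replay(journal_copy, flags):
    """Restore the original first four bytes of an escaped data block."""
    if flags & FLAG_ESCAPE:
        return JBD2_MAGIC_BYTES + bytes(journal_copy[4:])
    return bytes(journal_copy)
```

Without this step, a data block that happens to begin with the magic number could be misread as a journal header when the log is scanned.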
The descriptor block is then followed by a commit block. Because checksums are not enabled here, the checksum fields are all empty and only the commit time is set.
Revocation Block
A revocation block is used to record a list of data blocks in this transaction that supersede any older copies of those data blocks that might still be lurking in the journal. This can speed up recovery because those older copies don't have to be written out to disk.
Revocation blocks are described in struct jbd2_journal_revoke_header_s, are at least 16 bytes in length, but use a full block:
Offset | Type | Name | Description |
0x0 | journal_header_t | r_header | Common block header. |
0xC | __be32 | r_count | Number of bytes used in this block. |
0x10 | __be32 or __be64 | blocks[0] | Blocks to revoke. |
After r_count is a linear array of block numbers that are effectively revoked by this transaction. The size of each block number is 8 bytes if the superblock advertises 64-bit block number support, or 4 bytes otherwise.
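A minimal sketch of reading the revoked block numbers. It assumes that `r_count` is counted from the start of the block and therefore includes the 16 bytes of header and count. The function name and the `has_64bit` parameter are hypothetical.

```python
import struct

def parse_revoke_block(block, has_64bit):
    """Return the list of revoked block numbers in a revocation block."""
    (r_count,) = struct.unpack_from(">I", block, 0xC)
    width = 8 if has_64bit else 4
    fmt = ">Q" if has_64bit else ">I"
    end = min(r_count, len(block))  # never read past the buffer
    return [
        struct.unpack_from(fmt, block, off)[0]
        for off in range(0x10, end - width + 1, width)
    ]
```

During replay, a block number that appears in such a list would cause older journal copies of that block to be skipped.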
Commit Block
The commit block is a sentinel that indicates that a transaction has been completely written to the journal. Once this commit block reaches the journal, the data stored with this transaction can be written to its final locations on disk.
The commit block is described by struct commit_header, which is 60 bytes long according to the offsets below (but uses a full block):
Offset | Type | Name | Description |
0x0 | journal_header_s | (open coded) | Common block header. |
0xC | unsigned char | h_chksum_type | The type of checksum used to verify the integrity of the data blocks in the transaction. |
0xD | unsigned char | h_chksum_size | The number of bytes used by the checksum. Most likely 4. |
0xE | unsigned char | h_padding[2] | |
0x10 | __be32 | h_chksum[JBD2_CHECKSUM_BYTES] | 32 bytes of space to store checksums. |
0x30 | __be64 | h_commit_sec | The time that the transaction was committed, in seconds since the epoch. |
0x38 | __be32 | h_commit_nsec | Nanoseconds component of the above timestamp. |
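The commit block fields can be decoded with a single `struct` format that follows the offsets in the table. This is a sketch: the function name is hypothetical, and treating `h_chksum` as eight 32-bit words is inferred from the 32 bytes of space the table describes.

```python
import struct

# header (3 x __be32), chksum type, chksum size, 2 pad bytes,
# 8 x __be32 of checksum space, __be64 seconds, __be32 nanoseconds.
COMMIT = struct.Struct(">III BB2x 8I Q I")

def parse_commit_block(block):
    """Decode the fields of a commit block."""
    fields = COMMIT.unpack_from(block, 0)
    magic, blocktype, sequence, chksum_type, chksum_size = fields[:5]
    if magic != 0xC03B3998:
        raise ValueError("not a jbd2 block")
    return {
        "h_blocktype": blocktype,
        "h_sequence": sequence,
        "h_chksum_type": chksum_type,
        "h_chksum_size": chksum_size,
        "h_chksum": fields[5:13],
        "h_commit_sec": fields[13],
        "h_commit_nsec": fields[14],
    }
```

With checksums disabled, one would expect `h_chksum_type`, `h_chksum_size` and every word of `h_chksum` to be zero, leaving only the commit timestamp populated.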