MySQL专题四:InnoDB存储引擎
InnoDB是Mysql默认的存储引擎
CREATE TABLE t (i INT) ENGINE = InnoDB;
4.1.磁盘物理存储结构
Name
Size
Field Start Offsets
(F*1) or (F*2) bytes
Extra Bytes
6 bytes
Field Contents
depends on content
字段开始偏移量(Field Start Offset)
记录中每一个字段相对于原点(第一个Field Contents开始的位置)的偏移量的集合取反后的列表,每个Offset大小为1个或2个字节
额外的字节(Extra Bytes )
最重要的是1byte_offs_flag,标志偏移量的Offset大小是1个或2个字节,1 代表1-byte offsets,0代表2-byte offsets
Name
Size
Description
info_bits:
-
-
()
1 bit
unused or unknown
()
1 bit
unused or unknown
deleted_flag
1 bit
1 if record is deleted
min_rec_flag
1 bit
1 if record is predefined minimum record
n_owned
4 bits
number of records owned by this record
heap_no
13 bits
record's order number in heap of index page
n_fields
10 bits
number of fields in this record, 1 to 1023
1byte_offs_flag
1 bit
1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag)
next 16 bits
16 bits
pointer to next record in page
TOTAL
48 bits
-
字段内容(Field Contents )
用户自定义表后,DBMS会在表中额外增加三个系统字段:
row ID
transaction ID
rollback pointer
例,向T表中插入一条记录:
CREATE TABLE T (FIELD1 VARCHAR(3),
FIELD2 VARCHAR(3),
FIELD3 VARCHAR(3)) ;
INSERT INTO T VALUES ('PP', 'PP', 'PP');
ha_write_row19 17 15 13 0C 06 Field Start Offsets /* First Row */
00 00 78 0D 02 BF Extra Bytes
00 00 00 00 04 21 System Column #1
00 00 00 00 09 2A System Column #2
80 00 00 00 2D 00 84 System Column #3
50 50 Field1 'PP'
50 50 Field2 'PP'
50 50 Field3 'PP'
16 15 14 13 0C 06 Field Start Offsets /* Second Row */
00 00 80 0D 02 E1 Extra Bytes
00 00 00 00 04 22 System Column #1
00 00 00 00 09 2B 80 System Column #2
00 00 00 2D 00 84 System Column #3
51 Field1 'Q'
51 Field2 'Q'
51 Field3 'Q'
94 94 14 13 0C 06 Field Start Offsets /* Third Row */
00 00 88 0D 00 74 Extra Bytes
00 00 00 00 04 23 System Column #1
00 00 00 00 09 2C System Column #2
80 00 00 00 2D 00 84 System Column #3
52 Field1 'R'
Page用于存储记录,每个Page的大小固定为16KB,结构如下:
Fil Header
Page Header
Infimum + Supremum Records
User Records
Free Space
Page Directory
Fil Trailer
Fil Header
FIL_PAGE_PREV和 FIL_PAGE_NEXT :B+Tree数据结构中指针,指向Previous Page和Next Page
Name
Size
Remarks
FIL_PAGE_SPACE
4
4 ID of the space the page is in
FIL_PAGE_OFFSET
4
ordinal page number from start of space
FIL_PAGE_PREV
4
offset of previous page in key order
FIL_PAGE_NEXT
4
offset of next page in key order
FIL_PAGE_LSN
8
log serial number of page's latest log record
FIL_PAGE_TYPE
2
current defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
FIL_PAGE_FILE_FLUSH_LSN
8
"the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file
FIL_PAGE_ARCH_LOG_NO
4
the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log)
4.1.3. B+Tree树结构
InnoDB是以每一个Page为节点的B-Tree结构的存储引擎。
在B-树的结构上,InnoDB由于 FIL_PAGE_PREV和 FIL_PAGE_NEXT指针的存在,可以从一个叶节点出发访问另一个叶节点,而不必每次回到根节点,这就是为什么InnoDB应该被称为B+树。
4.2. 内存缓存存储模式
BufferPool
缓冲池是主内存中的一个区域,InnoDB在访问表和索引数据时将其缓存。缓冲池允许直接从内存中处理经常使用的数据加快了处理速度。
BufferPool被实现作为Page的列表,使用了LRU算法进行管理,很少使用的数据使用LRU算法会变淘汰。
对前面2种存储做一个总结:
4.3. 其它内置存储引擎
mysql> show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO |
| MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO |
| CSV | YES | CSV storage engine | NO | NO | NO |
| FEDERATED | NO | Federated MySQL storage engine | NULL | NULL | NULL |
| PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO |
| MyISAM | YES | MyISAM storage engine | NO | NO | NO |
| InnoDB | DEFAULT | Supports transactions, row-level locking, and foreign keys | YES | YES | YES |
| BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) | NO | NO | NO |
| ARCHIVE | YES | Archive storage engine | NO | NO | NO |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
4.4. InnoDB的事务与锁
4.4.1. 事务(Transactions):
Transactions是可以被提交(commit)和回滚(rollback)作业的原子单元,假设一个事务对数据库做了很多操作,要么当事务提交的时候操作数据库成功,要么当事务回滚的时候数据库不发生任何改变。
在Mysql数据库存储引擎中只有InnnoDB实现了事务,具有ACID属性,包括原子性(atomicity)、一致性(consistency)、隔离性(isolation)和持久性(durability)。
rw-lock等级为行的锁,多个事务可以并发地修改同一张表(table),但是当修改同一行(row),同一时间只能有一个事务可以修改,另一个必须等前面的事务完成或者释放行级锁(row locks)才能进行修改操作。
rw-lock包括三种类型的锁:(shared、exclusive)
共享锁(s-locks):可以对公共资源的读取访问
独占锁(x-locks): 可以对公共资源的写访问,但是不允许其它线程不一致地读取
共享独占锁(sx-locks):可以对公共资源的写访问,也允许其它线程不一致地读取
S
X
SX
S
兼容
兼容
冲突
X
兼容
冲突
冲突
SX
冲突
冲突
冲突