最近,在工作中遇到了MySQL中如何存储长度较长的字段类型问题,于是花了一周多的时间抽空学习了一下,并且记录下来。
MySQL大致的逻辑存储结构在这篇文章中有介绍,做为基本概念:InnoDB 逻辑存储结构
注:文中所指的大数据指的是长度较长的数据字段,包括varchar/varbinay/text/blob。
Compact行格式
我们首先来看一下行格式为Compact是如何存储大数据的:
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.1.73 |
+-----------+
1 row in set (0.01 sec)
mysql> show table status like 'row'\G;
*************************** 1. row ***************************
Name: row
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1
Avg_row_length: 81920
Data_length: 81920
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: NULL
Create_time: 2017-01-04 21:46:02
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
我们建立一张测试表,插入数据:
CREATE TABLE `row` ( `content` varchar(65532) NOT NULL DEFAULT '' ) ENGINE=InnoDB DEFAULT CHARSET=latin1 mysql> insert into row(content) select repeat('a',65532); Query OK, 1 row affected (0.03 sec) Records: 1 Duplicates: 0 Warnings: 0
我们使用工具来查看表中的页分布:
[root@localhost mysql]# python py_innodb_page_info.py -v com/row.ibd page offset 00000000, page type <File Space Header> page offset 00000001, page type <Insert Buffer Bitmap> page offset 00000002, page type <File Segment inode> page offset 00000003, page type <B-tree Node>, page level <0000> page offset 00000004, page type <Uncompressed BLOB Page> page offset 00000005, page type <Uncompressed BLOB Page> page offset 00000006, page type <Uncompressed BLOB Page> page offset 00000007, page type <Uncompressed BLOB Page> Total number of page: 8: Insert Buffer Bitmap: 1 Uncompressed BLOB Page: 4 File Space Header: 1 B-tree Node: 1 File Segment inode: 1
可以看出,第4页的<B-tree Node>, page level <0000>
格式为数据页,存放着MySQL的行数据。<Uncompressed BLOB Page>
可以理解为MySQL存放大数据的地方,暂且叫作外部存储页。Compact格式没有将大数据全部放在数据页中,而是将一部分数据放在了外部存储页中。那么,是全部数据在外部存储页中,还是一部分数据。假如是一部分数据,这一部分是多少呢?
我们使用hexdump -Cv row.ibd
查看一下数据页<B-tree Node>, page level <0000>
,也就是第4页:
3073 0000c000 8c 25 17 57 00 00 00 03 ff ff ff ff ff ff ff ff |.%.W....????????|
3074 0000c010 00 00 00 00 00 07 3a b8 45 bf 00 00 00 00 00 00 |......:?E?......|
3075 0000c020 00 00 00 00 00 02 00 02 03 a6 80 03 00 00 00 00 |.........?......|
3076 0000c030 00 7f 00 05 00 00 00 01 00 00 00 00 00 00 00 00 |................|
3077 0000c040 00 00 00 00 00 00 00 00 00 13 00 00 00 02 00 00 |................|
3078 0000c050 00 02 00 f2 00 00 00 02 00 00 00 02 00 32 01 00 |...?.........2..|
3079 0000c060 02 00 1c 69 6e 66 69 6d 75 6d 00 02 00 0b 00 00 |...infimum......|
3080 0000c070 73 75 70 72 65 6d 75 6d 14 c3 00 00 10 ff f1 00 |supremum.?...??.|
3081 0000c080 00 00 00 04 03 00 00 00 00 13 12 80 00 00 00 2d |...............-|
3082 0000c090 01 10 61 61 61 61 61 61 61 61 61 61 61 61 61 61 |..aaaaaaaaaaaaaa|
3083 0000c0a0 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 |aaaaaaaaaaaaaaaa|
3084 0000c0b0 61 61 61 61 61 61 61 61 61 61 61 6