什么是MySQL数据空洞？

最新推荐文章于 2025-05-23 21:09:29 发布

原创最新推荐文章于 2025-05-23 21:09:29 发布 · 2.2k 阅读

3 ·

CC 4.0 BY-SA版权

文章标签：

#MySQL数据空洞 #OPTIMIZE TABLE #MySQL Optimize

数据库同时被 2 个专栏收录

25 篇文章

订阅专栏

闲聊MySQL

16 篇文章

订阅专栏

本文解析了MySQL中数据空洞现象，即删除数据后空间不释放的问题，重点介绍了OPTIMIZETABLE命令的工作原理、适用场景和优化过程，包括其在线DDL机制和对InnoDB表索引的影响。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

背景

MySQL的数据存储在表空间中，有的时候我们删除了一张表的一半的数据，但是发现表空间文件的大小并没有减少，这是什么原因呢？

MySQL数据空洞

当对一条数据执行delete操作时，MySQL将数据删除后，并未将数据占用的空间返还给操作系统，而是将当前空间标记为"可复用"，当有新的数据插入时，则不会重新申请空间，而是插入到"可复用"空间中，这种"可复用"空间，称之为数据空洞。

MySQL官方文档对此的解释如下：

After deleting a large part of a MyISAM or ARCHIVE table, or making many changes to a MyISAM or ARCHIVE table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns).

Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions.

数据空洞的好处是尽可能的复用表空间结构，带来的问题也是显而易见，当删除数据后，表空间并未及时的释放，当长时间没有新的数据填充，会造成空间浪费的情况。

OPTIMIZE TABLE命令

MySQL提供了命令OPTIMIZE TABLE来整理表空间结构，来清理释放未使用的表空间结构，以此提升表的性能。

MySQL官方文档对此的解释如下：

You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file. After extensive changes to a table, this statement may also improve performance of statements that use the table, sometimes significantly.

OPTIMIZE TABLE works for InnoDB, MyISAM, and ARCHIVE tables.
OPTIMIZE TABLE is also supported for dynamic columns of in-memory NDB tables. It does not work for fixed-width columns of in-memory tables, nor does it work for Disk Data tables.

对于InnoDB，OPTIMIZE TABLE等价于ALTER TABLE...FORCE，相当于对数据表的聚簇索引进行重建，清理未使用的空间。

下面来看一下执行OPTIMIZE TABLE的效果：

mysql> OPTIMIZE TABLE foo;
+----------+----------+----------+-------------------------------------------------------------------+
| Table    | Op       | Msg_type | Msg_text                                                          |
+----------+----------+----------+-------------------------------------------------------------------+
| test.foo | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.foo | optimize | status   | OK                                                                |
+----------+----------+----------+-------------------------------------------------------------------+

OPTIMIZE TABLE背后细节

1、OPTIMIZE TABLE使用online DDL机制，以此来降低并发DML操作场景下的停顿时间。
2、OPTIMIZE TABLE操作会在prepare和commit两个阶段对表加锁。
3、在prepare阶段，会对数据表的元数据进行更新，并创建临时表。
4、在commit阶段，数据表的元数据更新会进行提交。
5、online DDL机制不支持InnoDB引擎的FULLTEXT索引，如果使用FULLTEXT索引，那么将会变为表数据拷贝的方式进行重建。

MySQL官方文档对此的解释如下：

OPTIMIZE TABLE uses online DDL for regular and partitioned InnoDB tables, 
which reduces downtime for concurrent DML operations. 
The table rebuild triggered by OPTIMIZE TABLE is completed in place. 
An exclusive table lock is only taken briefly during the prepare phase and the commit phase of the operation. 
During the prepare phase, metadata is updated and an intermediate table is created. 
During the commit phase, table metadata changes are committed.

参考文档：
OPTIMIZE TABLE Statement：
https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html

InnoDB and Online DDL
https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl.html

MYSQL 5.7 到底 OPTIMIZE Table 塞不塞 DML
https://cloud.tencent.com/developer/article/1606990