子分区
又称为复合分区(composite partitioning),是在一个分区表中对分区的进一步拆分。
CREATE TABLE ts_with_sub_partition (
id INT,
purchased DATE
)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) )
SUBPARTITIONS 2 (
PARTITION p0 VALUES LESS THAN (1990),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE
);
不管怎样,由于PARTITION BY RANGE语句的行为,分区中的前两个只存储purchased列的值小于1990的。MySQL5.6中可以对RANGE或LIST分区创建子分区,子分区类型可以为HASH或KEY分区。 表ts_with_sub_partition拥有3个RANGE分区(p0、p1、p2),每个分区又进一步拆分为两个子分区。实际上,整个表被分成了3*2=6个分区。
SUBPARTITION BY HASH和SUBPARTITION BY KEY遵循与PARTITION BY HASH和 PARTITION BY KEY相同的语法规则,只是SUBPARTITION BY KEY必须明确指定列。
同样可以通过SUBPARTITION语句来明确定义单独的子分区,更加详细的创建表ts_with_sub_partition的语句如下:
CREATE TABLE ts_with_sub_partition (
id INT,
purchased DATE
)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0,
SUBPARTITION s1
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s2,
SUBPARTITION s3
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s4,
SUBPARTITION s5
)
);
a、每个分区必须具有相同数目的子分区; 注意事项:
b、如果明确通过 SUBPARTITION 在任意已经创建的分区表上定义子分区时,必须对于所有的子分区都进行定义,如下语句即是错误的:
CREATE TABLE ts (id INT, purchased DATE)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0,
SUBPARTITION s1
),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s2,
SUBPARTITION s3
)
);
d、在整个分区表中,子分区的名称必须唯一,如s0、s1、s2、s3等。 c、每一个 SUBPARTITION 语句必须包括子分区的名称;
子分区可以用来处理特别大的表,通过将数据和索引分别写到不通的磁盘上。比如挂载了6块磁盘在 /disk0, /disk1, /disk2, disk3, disk4, disk5,那么可以:
CREATE TABLE ts_with_sub_partition_of_mount_odisk (
id INT,
purchased DATE
)
PARTITION BY RANGE( YEAR(purchased) )
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0
DATA DIRECTORY = '/disk0/data'
INDEX DIRECTORY = '/disk0/idx',
SUBPARTITION s1
DATA DIRECTORY = '/disk1/data'
INDEX DIRECTORY = '/disk1/idx'
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s2
DATA DIRECTORY = '/disk2/data'
INDEX DIRECTORY = '/disk2/idx',
SUBPARTITION s3
DATA DIRECTORY = '/disk3/data'
INDEX DIRECTORY = '/disk3/idx'
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s4
DATA DIRECTORY = '/disk4/data'
INDEX DIRECTORY = '/disk4/idx',
SUBPARTITION s5
DATA DIRECTORY = '/disk5/data'
INDEX DIRECTORY = '/disk5/idx'
)
);
在上面的例子中,每一个单独的磁盘用来存储RANGE分区的数据和索引,也可以将数据和索引通过单独的磁盘存储。
CREATE TABLE ts_with_sub_partition_of_mount_sdisk (
id INT,
purchased DATE
)
PARTITION BY RANGE(YEAR(purchased))
SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
PARTITION p0 VALUES LESS THAN (1990) (
SUBPARTITION s0a
DATA DIRECTORY = '/disk0'
INDEX DIRECTORY = '/disk1',
SUBPARTITION s0b
DATA DIRECTORY = '/disk2'
INDEX DIRECTORY = '/disk3'
),
PARTITION p1 VALUES LESS THAN (2000) (
SUBPARTITION s1a
DATA DIRECTORY = '/disk4/data'
INDEX DIRECTORY = '/disk4/idx',
SUBPARTITION s1b
DATA DIRECTORY = '/disk5/data'
INDEX DIRECTORY = '/disk5/idx'
),
PARTITION p2 VALUES LESS THAN MAXVALUE (
SUBPARTITION s2a,
SUBPARTITION s2b
)
);
由于1990之前的数据特别大(占据大量的存储空间),将分区p0的两个子分区s0a和s0b对应的数据和索引分别采用单独的磁盘来存储; 存储说明:
1990到1999数据量相对较少,因此分区p1的两个子分区通过两块硬盘(而非p0的四块)来存储;
2000年以后的数据并不需要占用较多的空间,因此存在在默认位置(MySQL数据文件存储的位置),当数据量增长到默认位置无法存储时可以再扩展分区。
The DATA DIRECTORY and INDEX DIRECTORY options are not permitted in partition definitions when the NO_DIR_IN_CREATE server SQL mode is in effect. In MySQL 5.6, these options are also not permitted when defining subpartitions (Bug #42954).
示例代码:
RANGE - HASH复合分区
略
RANGE- KEY复合分区
略
LIST - HASH复合分区
略
LIST - KEY 复合分区
略
MySQL分区如何处理NULL
MySQL不禁止在分区键上使用NULL,不论它是一个字段值或者一个用户定义的表达式。一般情况下,MySQL分区实现中认为NULL小于任何non-NULL值,类似于ORDER BY语句。
注意:RANGE分区中,NULL值会被当作最小值来处理;LIST分区中,NULL值必须出现在枚举列表中,否则不被接受;HASH/KEY分区中,NULL值会被当作零值来处理。
Handling of NULL with RANGE partitioning
RANGE分区中NULL值将插入到最小的分区中。
mysql> CREATE TABLE t1_range_part_with_null (
c1 INT,
c2 VARCHAR(20)
)
PARTITION BY RANGE(c1) (
PARTITION p0 VALUES LESS THAN (0),
PARTITION p1 VALUES LESS THAN (10),
PARTITION p2 VALUES LESS THAN MAXVALUE
);
Query OK, 0 rows affected (0.09 sec)
mysql> CREATE TABLE t2_range_part_with_null (
c1 INT,
c2 VARCHAR(20)
)
PARTITION BY RANGE(c1) (
PARTITION p0 VALUES LESS THAN (-5),
PARTITION p1 VALUES LESS THAN (0),
PARTITION p2 VALUES LESS THAN (10),
PARTITION p3 VALUES LESS THAN MAXVALUE
);
Query OK, 0 rows affected (0.09 sec)
-- 查看INFORMATION_SCHEMA数据库中的分区表定义PARTITIONS
mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
> FROM INFORMATION_SCHEMA.PARTITIONS
> WHERE TABLE_SCHEMA = 'big_data' AND TABLE_NAME LIKE '%_range_part_with_null';
-- 插入带有NULL值的数据
mysql> INSERT INTO t1 VALUES (NULL, 'mothra');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO t2 VALUES (NULL, 'mothra');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM t1;
+------+--------+
| id | name |
+------+--------+
| NULL | mothra |
+------+--------+
1 row in set (0.00 sec)
mysql> SELECT * FROM t2;
+------+--------+
| id | name |
+------+--------+
| NULL | mothra |
+------+--------+
1 row in set (0.00 sec
-- 查看添加的数据数据位于哪个分区:
mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
> FROM INFORMATION_SCHEMA.PARTITIONS
> WHERE TABLE_SCHEMA = 'big_data' AND TABLE_NAME LIKE '%_range_part_with_null';
-- 同样可以通过删除最小分区的形式验证数据存储于哪个分区:
mysql> ALTER TABLE t1 DROP PARTITION p0;
Query OK, 0 rows affected (0.16 sec)
mysql> ALTER TABLE t2 DROP PARTITION p0;
Query OK, 0 rows affected (0.16 sec)
mysql> SELECT * FROM t1;
Empty set (0.00 sec)
mysql> SELECT * FROM t2;
Empty set (0.00 sec)
-- NULL同样可以用于使用了SQL函数的分区表达式:
CREATE TABLE tndate (
id INT,
dt DATE
)
PARTITION BY RANGE( YEAR(dt) ) (
PARTITION p0 VALUES LESS THAN (1990),
PARTITION p1 VALUES LESS THAN (2000),
PARTITION p2 VALUES LESS THAN MAXVALUE
);
-- 可以插入数据验证添加到了哪个分区
Handling of NULL with LIST partitioning
LIST分区中必须明确指定NULL,否则将无法添加数据。
mysql> CREATE TABLE ts1 (
c1 INT,
c2 VARCHAR(20)
)
PARTITION BY LIST(c1) (
PARTITION p0 VALUES IN (0, 3, 6),
PARTITION p1 VALUES IN (1, 4, 7),
PARTITION p2 VALUES IN (2, 5, 8)
);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO ts1 VALUES (9, 'mothra');
ERROR 1504 (HY000): Table has no partition for value 9
mysql> INSERT INTO ts1 VALUES (NULL, 'mothra');
ERROR 1504 (HY000): Table has no partition for value NULL
-- 创建包含NULL的List分区:
mysql> CREATE TABLE ts2 (
c1 INT,
c2 VARCHAR(20)
)
PARTITION BY LIST(c1) (
PARTITION p0 VALUES IN (0, 3, 6),
PARTITION p1 VALUES IN (1, 4, 7),
PARTITION p2 VALUES IN (2, 5, 8),
PARTITION p3 VALUES IN (NULL)
);
Query OK, 0 rows affected (0.01 sec)
mysql> CREATE TABLE ts3 (
c1 INT,
c2 VARCHAR(20)
)
PARTITION BY LIST(c1) (
PARTITION p0 VALUES IN (0, 3, 6),
PARTITION p1 VALUES IN (1, 4, 7, NULL),
PARTITION p2 VALUES IN (2, 5, 8)
);
Query OK, 0 rows affected (0.01 sec)
-- 再次添加包含空值的数据
mysql> INSERT INTO ts2 VALUES (NULL, 'mothra');
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO ts3 VALUES (NULL, 'mothra');
Query OK, 1 row affected (0.00 sec)
-- 查看数据添加到了哪个分区
mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
> FROM INFORMATION_SCHEMA.PARTITIONS
> WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME LIKE 'ts_';
-- 同样可以通过删除分区后查询的方式验证数据添加到了哪个分区。
Handling of NULL with HASH and KEY partitioning
mysql> CREATE TABLE th (
c1 INT,
c2 VARCHAR(20)
)
PARTITION BY HASH(c1)
PARTITIONS 2;
Query OK, 0 rows affected (0.00 sec)
-- 查询分区情况:
mysql> SELECT TABLE_NAME,PARTITION_NAME,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH
> FROM INFORMATION_SCHEMA.PARTITIONS
> WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME ='th';
+------------+----------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------+----------------+-------------+
| th | p0 | 0 | 0 | 0 |
| th | p1 | 0 | 0 | 0 |
+------------+----------------+------------+----------------+-------------+
-- 能看到每个分区的TABLE_ROWS均为0,插入两条测试数据,验证数据存储于哪个分区。
mysql> INSERT INTO th VALUES (NULL, 'mothra'), (0, 'gigan');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM th;
+------+---------+
| c1 | c2 |
+------+---------+
| NULL | mothra |
+------+---------+
| 0 | gigan |
+------+---------+
2 rows in set (0.01 sec)
-- 由于NULL对N取MOD同样得到的值为0,HASH分区或者KEY分区均将NULL值作为0存储到相应分区。
mysql> SELECT TABLE_NAME, PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
> FROM INFORMATION_SCHEMA.PARTITIONS
> WHERE TABLE_SCHEMA = 'p' AND TABLE_NAME ='th';
+------------+----------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------+----------------+-------------+
| th | p0 | 2 | 20 | 20 |
| th | p1 | 0 | 0 | 0 |
+------------+----------------+------------+----------------+-------------+
2 rows in set (0.00 sec)
-- 如果在表的定义中使用PARTITION BY KEY而非PARTITION BY HASH,
-- 同样可以发现在该分区类型NULL被作为0值处理。
分区管理
对于已经过期或者不需要保存的数据,可以通过删除与这些数据有关的分区来快速删除数据。
1、从RANGE或List分区表中删除一个分区
ALTER TABLE table_name DROP PARTITION partition_name;
注意:删除分区的命令执行之后,并不显示实际从表中删除的行数,并不是真的没有记录被删除
-- 如:
ALTER TABLE tr DROP PARTITION p2;
2、为RANGE或List分区表增加分区
ALTER TABLE table_name ADD PARTITION partition_define;
-- 如:
ALTER TABLE tt ADD PARTITION (PARTITION np VALUES IN (4, 8, 12));
3、RANGE或List分区表重新定义分区
MySQL也提供了在不丢失数据的情况下,通过重新定义分区的语句ALTER TABLE REORGANIZE PARTITION INTO重定义分区。
语法如下:
CREATE TABLE emp_date_range_part (
`id` int(11) NOT NULL,
`ename` varchar(30) DEFAULT NULL,
`hired` date NOT NULL DEFAULT '1970-01-01',
`separated` date NOT NULL DEFAULT '9999-12-31',
`job` varchar(30) NOT NULL,
`store_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='重定义分区测试'
PARTITION BY RANGE (YEAR(separated))
(
PARTITION p0 VALUES LESS THAN (1995) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (2000) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (2015) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN (2030) ENGINE = InnoDB
)
-- 重定义分区:
ALTER TABLE members REORGANIZE PARTITION p0 INTO (
PARTITION s0 VALUES LESS THAN (1960),
PARTITION s1 VALUES LESS THAN (1970)
);
ALTER TABLE tt ADD PARTITION (PARTITION np VALUES IN (4, 8));
ALTER TABLE tt REORGANIZE PARTITION p1,np INTO (
PARTITION p1 VALUES IN (6, 18),
PARTITION np VALUES in (4, 8, 12)
);
注意:重新定义RANGE分区时,只能够重新定义相邻的分区,不能跳过某个RANGE分区进行重新定义,同时重新定义的分区区间必须和原分区区间覆盖相同的区间;也不能使用重新定义分区来改变表分区的类型,例如,不能把RANGE分区变为HASH分区,也不能把HASH分区变成RANGE分区。
合并分区:
alter table tt reorganize partition p4,p6 into (partition p4 values in (6,11));
注意:只能合并相邻的分区。
4、管理Hash分区和Key分
区
CREATE TABLE clients (
id INT,
fname VARCHAR(30),
lname VARCHAR(30),
signed DATE
)
PARTITION BY HASH( MONTH(signed) )
PARTITIONS 12;
-- 合并分区(从12到4个)
ALTER TABLE clients COALESCE PARTITION 4;
-- COALESCE不能用来增加分区的数量,否则会出现以下错误:
mysql> ALTER TABLE clients COALESCE PARTITION 18;
ERROR 1478 (HY000): Cannot remove all partitions, use DROP TABLE instead
-- 但可以通过如下语句增加分区
ALTER TABLE clients ADD PARTITION PARTITIONS 6;