在网上看了很多文章,都太过于概念,注意集中在介绍分区表的优点,而不注重时间操作,跟大学教授论文似的,唯一由于的一片文章和大家分享一下吧http://fanqiang.chinaunix.net/db/mysql/2006-05-08/4135.shtml.
MySQL分区表操作代码(本案例按月分区):
<!--[if !supportLists]-->1. <!--[endif]-->创建分区表:
CREATE TABLE `表名` (
`EQUIPMENTID`char(17) NOT NULL,
`ATTRIBUTEID`char(4) NOT NULL,
`VALUE`varchar(20) NOT NULL,
`COLLECTTIME`datetime NOT NULL
) ENGINE=InnoDB(适用大部分引擎,可根据需要调整) DEFAULT CHARSET=latin1
PARTITION BY RANGE(to_days(`时间字段名`))
(PARTITION pminVALUES LESS THAN (to_days('2010-01-01')),
PARTITION p201001VALUES LESS THAN (to_days('2010-02-01')) ,
PARTITION p201002VALUES LESS THAN (to_days('2010-03-01')) ,
PARTITION p201003VALUES LESS THAN (to_days('2010-04-01')) ,
PARTITION p201004VALUES LESS THAN (to_days('2010-05-01')) ,
PARTITION p201005VALUES LESS THAN (to_days('2010-06-01')) ,
PARTITION p201006VALUES LESS THAN (to_days('2010-07-01')) ,
PARTITION p201007VALUES LESS THAN (to_days('2010-08-01')) ,
PARTITION p201008VALUES LESS THAN (to_days('2010-09-01')) ,
PARTITION p201009VALUES LESS THAN (to_days('2010-10-01')) ,
PARTITION p201010VALUES LESS THAN (to_days('2010-11-01')),
PARTITION p201011VALUES LESS THAN (to_days('2010-12-01')),
PARTITION p201012VALUES LESS THAN (to_days('2011-01-01')),
PARTITION pmax VALUESLESS THAN MAXVALUE );
<!--[if !supportLists]-->2. <!--[endif]-->为现有表创建分区:
alter table 表名
PARTITION BY RANGE(to_days(`时间字段名`))
(PARTITION pminVALUES LESS THAN (to_days('2010-01-01')),
PARTITION p201001VALUES LESS THAN (to_days('2010-02-01')) ,
PARTITION p201002VALUES LESS THAN (to_days('2010-03-01')) ,
PARTITION p201003VALUES LESS THAN (to_days('2010-04-01')) ,
PARTITION p201004 VALUESLESS THAN (to_days('2010-05-01')) ,
PARTITION p201005VALUES LESS THAN (to_days('2010-06-01')) ,
PARTITION p201006VALUES LESS THAN (to_days('2010-07-01')) ,
PARTITION p201007VALUES LESS THAN (to_days('2010-08-01')) ,
PARTITION p201008VALUES LESS THAN (to_days('2010-09-01')) ,
PARTITION p201009VALUES LESS THAN (to_days('2010-10-01')) ,
PARTITION p201010VALUES LESS THAN (to_days('2010-11-01')),
PARTITION p201011VALUES LESS THAN (to_days('2010-12-01')),
PARTITION p201012VALUES LESS THAN (to_days('2011-01-01')),
PARTITION pmax VALUESLESS THAN MAXVALUE );
<!--[if !supportLists]-->3. <!--[endif]-->删除表中的指定分区(删除分区会导致分区数据丢失,建议先备份):
ALTERTABLE 表名DROP PARTITION p0;
<!--[if !supportLists]-->4. <!--[endif]-->追加表分区
需要先删除MAXVALUE分区后增加分区后再重建MAXVALUE分区,删除前需要先备份MAXVALUE分区数据.
ALTER TABLE 表名 DROPPARTITION pmax;
ALTER TABLE表名
ADD PARTITION (
PARTITION p201201VALUES LESS THAN (to_days('2012-2-1')),
PARTITION pmax VALUESLESS THAN MAXVALUE);
<!--[if !supportLists]-->5. <!--[endif]-->查看标分区信息
SELECT
partition_namepart,
partition_expressionexpr,
partition_descriptiondescr,
table_rows
FROM
INFORMATION_SCHEMA.partitions
WHERE
TABLE_SCHEMA= schema()
ANDTABLE_NAME='表名';
<!--[if !supportLists]-->6. <!--[endif]-->查看查询语句涉及分区信息
explainpartitions
select …from 表名 where …;
性能对比:
<!--[if !supportLists]-->1. <!--[endif]-->测试环境
CPU: Intel 奔腾双核 E5300
硬盘: 西数(320GB/7200/16M 蓝盘)
内存: 南亚易胜 DDR2 800MHz 1GB + 三星 DDR2 800MHz 1GB
操作系统:Windows XP
MySQL版本: 5.1.57(5.1+版本支持分区表)
<!--[if !supportLists]-->2. <!--[endif]-->表信息
表结构:
名 | 类型 | 长度 |
|
EQUIPMENTID | char | 17 | 主键1 |
ATTRIBUTEID | char | 4 | 主键2 |
VALUE | varchar | 20 |
|
COLLECTTIME | datetime |
| 主键3 |
总记录数:580W
分区信息(红色为主要测试区域):
part | expr | descr | table_rows |
pmin | to_days(COLLECTTIME) | 734138 | 2686 |
p201001 | to_days(COLLECTTIME) | 734169 | 2511883 |
p201002 | to_days(COLLECTTIME) | 734197 | 192497 |
p201003 | to_days(COLLECTTIME) | 734228 | 811103 |
p201004 | to_days(COLLECTTIME) | 734258 | 82894 |
p201005 | to_days(COLLECTTIME) | 734289 | 109297 |
p201006 | to_days(COLLECTTIME) | 734319 | 555065 |
p201007 | to_days(COLLECTTIME) | 734350 | 742949 |
p201008 | to_days(COLLECTTIME) | 734381 | 525900 |
p201009 | to_days(COLLECTTIME) | 734411 | 89 |
p201010 | to_days(COLLECTTIME) | 734442 | 71665 |
p201011 | to_days(COLLECTTIME) | 734472 | 85964 |
p201012 | to_days(COLLECTTIME) | 734503 | 1612 |
p201101 | to_days(COLLECTTIME) | 734534 | 176 |
p201102 | to_days(COLLECTTIME) | 734562 | 253 |
p201103 | to_days(COLLECTTIME) | 734593 | 44824 |
p201104 | to_days(COLLECTTIME) | 734623 | 62324 |
p201105 | to_days(COLLECTTIME) | 734654 | 50658 |
p201106 | to_days(COLLECTTIME) | 734684 | 0 |
p201107 | to_days(COLLECTTIME) | 734715 | 0 |
p201108 | to_days(COLLECTTIME) | 734746 | 0 |
p201109 | to_days(COLLECTTIME) | 734776 | 0 |
p201110 | to_days(COLLECTTIME) | 734807 | 0 |
p201111 | to_days(COLLECTTIME) | 734837 | 0 |
p201112 | to_days(COLLECTTIME) | 734868 | 0 |
p201201 | to_days(COLLECTTIME) | 734899 | 0 |
p201202 | to_days(COLLECTTIME) | 734928 | 0 |
pmax | to_days(COLLECTTIME) | MAXVALUE | 921 |
<!--[if !supportLists]-->3. <!--[endif]-->查询效率对比
对比表:无分区表名nopart_data,有分区表名part_data
查询条件:select count(*) from 表名 where COLLECTTIME > 起始时间 and COLLECTTIME < 终止时间
查询耗时按照3次平均值统计
统计表:
开始时间 | 结束时间 | 查询结果 | 无分区耗时 | 有分区耗时 | 涉及分区 |
全部 | 5848859 | 6.26s | 9.58s | 全部 | |
2010-5-1 | 2010-6-1 | 109086 | 7.04s | 0.48s | pmin,p201005 |
2010-6-1 | 2010-7-1 | 554695 | 8.34s | 0.38s | pmin,p201006 |
2010-7-1 | 2010-8-1 | 742565 | 7.57s | 0.43s | pmin,p201007 |
2010-5-1 | 2010-7-1 | 663781 | 7.07s | 0.51s | pmin,p201005,p201006 |
2010-6-1 | 2010-8-1 | 1297260 | 6.84s | 1.93s | pmin,p201006,p201007 |
2010-5-1 | 2010-8-1 | 1406346 | 6.97s | 2.30s | pmin,p201006,p201007,p201008 |
小结:
<!--[if !supportLists]-->1) 分区表查询在查询上有明显优势.但在跨区查询时会有查询时间消耗,因此需要注意分区的疏密程度.
<!--[if !supportLists]-->2) 每次查询都会查询pmin(第一个分区),因此需要尽量减少这个分区的数据.
<!--[if !supportLists]-->4. <!--[endif]-->写入数据效率对比
COLLECTTIME | 无分区耗时 | 有分区耗时 |
2010-5-22 15:36 | 0.05s | 0.03s |
2010-6-22 15:36 | 0.02s | 0.05s |
2010-7-22 15:36 | 0.03s | 0.03s |
小结:
1) 分区对单条数据的插入操作无较大影响.
以上是我对MySQL的初体验总结,没啥心得体会,只有一点点成就感,希望和大家分享.
另外分区表尚存在问题:
1,是否可将分区表设置在不同硬盘,innodb可行?
2,是否可根据多条件进行水平分区,类似group by 列1,列2...
3,是否能将分区设置成不同引擎,例如当前使用中的分区为innodb,老的分区使用MyISAM.