MySQL table splitting: the fastest way to split a big table into small tables

I have a very big table, which has almost 300 million records. Since the select query is too slow for me, I want to split it into about 800 small tables.

The dataset looks like this:

XXXXXX column2 column3 column4 ...
XXXXXX column2 column3 column4 ...
XXXXXX column2 column3 column4 ...
YYYYYY column2 column3 column4 ...
YYYYYY column2 column3 column4 ...

I want to split the table based on the value of the first column (e.g. records with XXXXXX go into table XXXXXX). What's the quickest way to do it?

Note: I have already added 10 partitions to it, but that hasn't sped it up much.

Solution

Partitioning works as a performance strategy under two circumstances:

The primary query (or queries) against that table end up doing table or index scans, and the system has adequate resources and appropriate configuration to do a high level of parallelism. So if all of the partitions are on the same physical drive, partitioning doesn't buy you much; you're as I/O-bound as you were in the first place. But if you're on a 16-core system, with each partition on a physically distinct disk? Partitioning may produce startling improvements in system performance.

The partitioning rule uses an index that appears in the most prevalent queries against that table. If you're going for performance by that route, you should partition on an indexed value that is often used to filter or constrain the result set. The most frequent candidate is transaction date, since reporting is often done by calendar date range. The query optimizer can then use the partitioning rule to restrict the work to a single (smaller) partition, or to run two or more partition scans in parallel (subject to the same strictures mentioned above).
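As a concrete illustration, a date-partitioned table might be declared like the sketch below; the table and column names are invented for the example and are not from the question:

CREATE TABLE orders (
    id      BIGINT NOT NULL,
    tx_date DATE   NOT NULL,
    amount  DECIMAL(10,2),
    -- MySQL requires the partitioning column to be part of every unique key,
    -- hence the composite primary key
    PRIMARY KEY (id, tx_date)
)
PARTITION BY RANGE (YEAR(tx_date)) (
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

A query filtered on tx_date within a single year can then be pruned to one partition instead of scanning the whole table.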

I'm presuming that the primary reason you want to split up this table is performance. But 800 partitions? If performance improvement is what you're after, that may be the wrong approach. Enterprise databases keep as much of the top levels of a table's indexes in cache memory as possible for good performance. In a five-level B-tree on a moderately used table, it's quite possible that the top three levels are always kept in cache after their first access (this is a likely configuration for a 300M-row table with an integer primary key). By splitting your table into 800 pieces, you create 800 such data structures to try to keep cached (in addition to the table data itself). Chances are, if your access is more or less evenly distributed by primary key, searching one partition will end up pushing other partitions out of cache, to the ultimate detriment of overall performance.

Nevertheless, if you're determined to do this, the easiest way to partition a table into N pieces is to partition it by the modulus of the primary key against the number of partitions you want (primary_key % 800, in your case). Newer versions of MySQL also have hash partitioning support, making partitioning into an arbitrary number of sets fairly straightforward:

PARTITION BY HASH(some_column_value) PARTITIONS number_of_partitions
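A minimal sketch of what that could look like on an existing table, assuming it is named MasterTable with an integer primary key id (both names are placeholders, not from the question):

-- Hash-partition the existing table on its integer primary key;
-- MySQL spreads rows across partitions by id MOD 16.
ALTER TABLE MasterTable
    PARTITION BY HASH(id)
    PARTITIONS 16;

As with range partitioning, the column in the partitioning expression must be part of every unique key on the table, including the primary key.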

If you want to put your data into 800 actual tables instead, you'll have to do some editor magic, or use a scripting language, and do it in SQL:

CREATE TABLE table1 LIKE MasterTable;
CREATE TABLE table2 LIKE MasterTable;
CREATE TABLE table3 LIKE MasterTable;

..

INSERT INTO table1 SELECT * FROM MasterTable WHERE id MOD 800 = 0;
INSERT INTO table2 SELECT * FROM MasterTable WHERE id MOD 800 = 1;
INSERT INTO table3 SELECT * FROM MasterTable WHERE id MOD 800 = 2;

You could do this in a loop in your favorite programming language using dynamic SQL; that would probably be the easiest way to generate all 800 pairs of statements.
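If you'd rather stay entirely inside MySQL, the same loop can be expressed as a stored procedure that builds each statement with dynamic SQL. This is only a sketch: the procedure name, the MasterTable name, and the integer id column are assumptions, not details from the question.

DELIMITER $$
CREATE PROCEDURE split_master_table()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 800 DO
    -- build and run the CREATE TABLE for slice i
    SET @ddl = CONCAT('CREATE TABLE IF NOT EXISTS table', i, ' LIKE MasterTable');
    PREPARE stmt FROM @ddl;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    -- copy the matching slice of rows into it
    SET @dml = CONCAT('INSERT INTO table', i,
                      ' SELECT * FROM MasterTable WHERE id MOD 800 = ', i);
    PREPARE stmt FROM @dml;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL split_master_table();

Expect the copy itself to take a long time on 300 million rows regardless of how the statements are generated.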
