MySQL table splitting: the fastest way to split a big table into small tables

I have a very big table, which has almost 300 million records. Since the select query is too slow for me, I want to split it into about 800 small tables.

The dataset looks like this:

XXXXXX column2 column3 column4 ...
XXXXXX column2 column3 column4 ...
XXXXXX column2 column3 column4 ...
YYYYYY column2 column3 column4 ...
YYYYYY column2 column3 column4 ...

I want to split the table based on the value of the first column (e.g. records with XXXXXX go into table XXXXXX). What's the quickest way to do it?

Note: I have already added 10 partitions to it, but that hasn't sped it up much.

Solution

Partitioning works as a performance strategy under two circumstances:

The primary query (or queries) against that table end up doing table or index scans, and the system has adequate resources and appropriate configuration to do a high level of parallelism. So if all of the partitions are on the same physical drive, partitioning doesn't buy you much; you're as I/O-bound as you were in the first place. But if you're on a 16-core system, with each partition on a physically distinct disk? Partitioning may produce startling improvements in system performance.

The partitioning rule uses an index that appears in the most prevalent queries against that table. If you're going for performance by that route, you should partition on an indexed value that is often used to filter or constrain the result set. The most frequent candidate is transaction date, since reporting is often done by calendar date range. The query optimizer can then use the partitioning rule to restrict the work to a single (smaller) partition, or to run two or more partition scans in parallel (subject to the same strictures mentioned above).
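As a concrete illustration, a date-partitioned table might be declared like the sketch below; the table and column names are invented for the example and are not from the question:

CREATE TABLE orders (
    id      BIGINT NOT NULL,
    tx_date DATE   NOT NULL,
    amount  DECIMAL(10,2),
    -- MySQL requires the partitioning column to be part of every unique key,
    -- hence the composite primary key
    PRIMARY KEY (id, tx_date)
)
PARTITION BY RANGE (YEAR(tx_date)) (
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

A query filtered on tx_date within a single year can then be pruned to one partition instead of scanning the whole table.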

I'm presuming that the primary reason you want to split up this table is performance. But 800 partitions? If performance improvement is what you're after, that may be the wrong approach. Enterprise databases keep as much of the top levels of a table's indexes in cache memory as possible for good performance. In a five-level B-tree on a moderately used table, it's quite possible that the top three levels are always kept in cache after their first access (this is a likely configuration for a 300M-row table with an integer primary key). By splitting your table into 800 pieces, you create 800 such data structures to try to keep cached (in addition to the table data itself). Chances are, if your access is more or less evenly distributed by primary key, searching one partition will end up pushing other partitions out of cache, to the ultimate detriment of overall performance.

Nevertheless, if you're determined to do this, the easiest way to partition a table into N pieces is to partition it by the modulus of the primary key against the number of partitions you want (primary_key % 800, in your case). Newer versions of MySQL also have hash partitioning support, making partitioning into an arbitrary number of sets fairly straightforward:

PARTITION BY HASH(some_column_value) PARTITIONS number_of_partitions
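A minimal sketch of what that could look like on an existing table, assuming it is named MasterTable with an integer primary key id (both names are placeholders, not from the question):

-- Hash-partition the existing table on its integer primary key;
-- MySQL spreads rows across partitions by id MOD 16.
ALTER TABLE MasterTable
    PARTITION BY HASH(id)
    PARTITIONS 16;

As with range partitioning, the column in the partitioning expression must be part of every unique key on the table, including the primary key.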

If you want to put your data into 800 actual tables instead, you'll have to do some editor magic, or use a scripting language, and do it in SQL:

CREATE TABLE table1 LIKE MasterTable;
CREATE TABLE table2 LIKE MasterTable;
CREATE TABLE table3 LIKE MasterTable;

..

INSERT INTO table1 SELECT * FROM MasterTable WHERE id MOD 800 = 0;
INSERT INTO table2 SELECT * FROM MasterTable WHERE id MOD 800 = 1;
INSERT INTO table3 SELECT * FROM MasterTable WHERE id MOD 800 = 2;

You could do this in a loop in your favorite programming language using dynamic SQL; that would probably be the easiest way to generate all 800 pairs of statements.
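If you'd rather stay entirely inside MySQL, the same loop can be expressed as a stored procedure that builds each statement with dynamic SQL. This is only a sketch: the procedure name, the MasterTable name, and the integer id column are assumptions, not details from the question.

DELIMITER $$
CREATE PROCEDURE split_master_table()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 800 DO
    -- build and run the CREATE TABLE for slice i
    SET @ddl = CONCAT('CREATE TABLE IF NOT EXISTS table', i, ' LIKE MasterTable');
    PREPARE stmt FROM @ddl;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    -- copy the matching slice of rows into it
    SET @dml = CONCAT('INSERT INTO table', i,
                      ' SELECT * FROM MasterTable WHERE id MOD 800 = ', i);
    PREPARE stmt FROM @dml;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL split_master_table();

Expect the copy itself to take a long time on 300 million rows regardless of how the statements are generated.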
