MySQL data exceeding 1 billion rows, large primary key: more than 1 billion rows with MySQL + InnoDB?

I was wondering whether InnoDB would be the best storage engine for this table. The table contains a single column, which is the primary key, and it will grow by an estimated 816k rows a day, so it will get very large very quickly. I'm also working on a file-based storage approach; would that be faster? The table will store the Twitter IDs that have already been processed.

Also, is there any estimate of the memory usage of a SELECT MIN(id) statement? Any other ideas are greatly appreciated!

Solution

The only definitive answer is to try both engines, test them under your own workload, and see what happens.

Generally, MyISAM has been regarded as faster for workloads that are purely writes or purely reads, but not for both at the same time. When you write to a MyISAM table, the entire table is locked until the insert completes. InnoDB has more overhead but uses row-level locking, so reads and writes can happen concurrently without the contention that MyISAM's table locking incurs.

However, your problem, if I understand it correctly, is a little different. With only one column, and that column being the primary key, the important consideration is how differently MyISAM and InnoDB handle primary key indexes.

In MyISAM, the primary key index is just like any other secondary index. Internally each row has a row id, and the index entries simply point to the location of that row in the data file. A primary key index is not handled any differently from any other index.

In InnoDB, however, primary keys are clustered, meaning the rows are stored in the leaf pages of the primary key index itself, in primary key order. That ordering holds only within each data page; the pages themselves can be scattered on disk in any order.

This being the case, I would expect InnoDB to have an advantage, because MyISAM would essentially have to do double work: write the integer once in the data file, and then write it again in the index pages. InnoDB wouldn't do this. Its primary key index is the data, so each value is written only once and managed in one place, whereas MyISAM would needlessly have to manage two copies.
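To run the side-by-side test suggested earlier, the two candidate tables could be declared roughly as follows. This is only a sketch: the table names are placeholders, the column name id follows the question, and BIGINT UNSIGNED is an assumption based on Twitter IDs being 64-bit integers.

```sql
-- Hypothetical table names; the column name `id` follows the question.
-- BIGINT UNSIGNED is an assumption (Twitter IDs are 64-bit integers).

-- InnoDB: the rows live inside the clustered primary key index.
CREATE TABLE processed_ids_innodb (
    id BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- MyISAM: each value is stored in the data file and again in the index file.
CREATE TABLE processed_ids_myisam (
    id BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (id)
) ENGINE=MyISAM;

-- After loading the same sample of IDs into both tables,
-- compare the Data_length and Index_length columns.
SHOW TABLE STATUS LIKE 'processed_ids_%';
```

Loading an identical sample into both tables and timing the inserts, then comparing the reported sizes, should show whether the expected InnoDB advantage holds for this workload.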

For either storage engine, running min() or max() on an indexed column, or simply checking whether a number exists in the index, should be trivial. Since the table has only one column, no bookmark lookups would even be necessary, as the data is represented entirely within the index itself. This should be a very efficient index.
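The lookups described above might look like the following sketch. The table name processed_ids and the literal ID are made up for illustration. Note that the column name is written unquoted (backticks would also work): in MySQL, single quotes denote a string literal, so MIN('id') would return the text id rather than the smallest key.

```sql
-- Hypothetical table `processed_ids` with a single primary key column `id`.

-- Smallest processed ID; answered from one end of the index.
SELECT MIN(id) FROM processed_ids;

-- Has this ID already been processed? A single index probe.
SELECT EXISTS (
    SELECT 1 FROM processed_ids WHERE id = 1234567890123456789
) AS already_processed;

-- Record a newly processed ID, skipping it if it is already present.
INSERT IGNORE INTO processed_ids (id) VALUES (1234567890123456789);

-- Inspect the plan for the MIN() query.
EXPLAIN SELECT MIN(id) FROM processed_ids;
```

For the MIN() query, EXPLAIN typically reports "Select tables optimized away" in the Extra column, which indicates the answer was read directly from the index without scanning rows.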

I also wouldn't be all that worried about the size of the table. When a row is only one integer wide, you can fit a huge number of rows on each index/data page.
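As a rough back-of-the-envelope check, assuming 8-byte BIGINT keys (an assumption; the question does not state the column type) and the estimated 816k rows per day:

$$
\begin{aligned}
\text{days to reach } 10^9 \text{ rows} &= \frac{10^9\ \text{rows}}{816{,}000\ \text{rows/day}} \approx 1{,}225\ \text{days} \approx 3.4\ \text{years} \\
\text{raw key data at } 10^9 \text{ rows} &= 10^9 \times 8\ \text{bytes} = 8 \times 10^9\ \text{bytes} \approx 7.5\ \text{GiB}
\end{aligned}
$$

The real on-disk size will be larger, because each engine adds per-row and per-page overhead, so these figures are best treated as lower bounds rather than predictions.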
