MySQL vs MongoDB 1000 reads

问题:

I have been very excited about MongoDb and have been testing it lately. I had a table called posts in MySQL with about 20 million records indexed only on a field called 'id'.

对MongoDB感到非常兴奋,并且最近对其进行了测试。 有一个名为posts的MySQL表,约有2,000万条数据记录,仅在id字段上建了索引。

I wanted to compare speed with MongoDB and I ran a test which would get and print 15 records randomly from our huge databases. I ran the query about 1,000 times each for mysql and MongoDB and I am suprised that I do not notice a lot of difference in speed. Maybe MongoDB is 1.1 times faster. That's very disappointing. Is there something I am doing wrong? I know that my tests are not perfect but is MySQL on par with MongoDb when it comes to read intensive chores.

我想把MySQL与MongoDB的性能做一个比较,就运行了一个测试,该测试会从数据库中随机获取并输出15条数据记录。对MySQL与MongoDB分别运行了约1,000次查询,我很惊讶没有看到两者性能上有大的差异。也许MongoDB快1.1倍。 真令人失望。我做错什么了吗?我知道我的测试并不完美,但是在密集的读取数据方面,MySQL与MongoDB性能相当。

注:

  • I have dual core + ( 2 threads ) i7 cpu and 4GB ram
  • 双核 +(双线程)i7 CPU,及 4GB RAM
  • I have 20 partitions on MySQL each of 1 million records
  • 在MySQL上有20个分区,每个分区有1百万条记录

 

回答:

MongoDB is not magically faster. If you store the same data, organised in basically the same fashion, and access it exactly the same way, then you really shouldn't expect your results to be wildly different. After all, MySQL and MongoDB are both GPL, so if Mongo had some magically better IO code in it, then the MySQL team could just incorporate it into their codebase.

MongoDB 并不是神奇的更快。如果以相同的方式存储相同的数据,并以完全相同的方式对其进行访问,那么真的不应该期望结果会大相径庭。毕竟,MySQL和MongoDB都是GPL,因此,如果Mongo中包含一些神奇的更好的IO代码,则MySQL团队可以将其合并到他们的代码库中。

People are seeing real world MongoDB performance largely because MongoDB allows you to query in a different manner that is more sensible to your workload.

人们看到现实世界中MongoDB性能更好,是因为MongoDB允许以更适合工作负载的方式进行查询。

For example, consider a design that persisted a lot of information about a complicated entity in a normalised fashion. This could easily use dozens of tables in MySQL (or any relational db) to store the data in normal form, with many indexes needed to ensure relational integrity between tables.

例如,考虑一种设计,该设计以规范化方式保留了有关复杂实体的许多信息。这可以很容易的使用MySQL或者任何关系数据库中的数十个表以标准形式存储数据,并需要许多索引来确保表之间的关系完整性。

Now consider the same design with a document store. If all of those related tables are subordinate to the main table (and they often are), then you might be able to model the data such that the entire entity is stored in a single document. In MongoDB you can store this as a single document, in a single collection. This is where MongoDB starts enabling superior performance.

现在考虑与文档存储相同的设计。如果所有这些相关表都从属于主表(并且经常属于主表),那么您也许可以对数据建模,以便将整个实体存储在单个文档中。在MongoDB中,您可以将其作为单个文档存储在单个集合中。这是MongoDB提供卓越性能的地方。

表1:关系数据库概念与MongoDB概念

关系数据库概念MongoDB概念说明
database 数据库database 数据库 
table 表collection 集合 
row 记录 (行)document 文档 
column 字段 (列)field 域 
index 索引index 索引 
table joins 表连接 MongoDB不支持表连接
primary key 主键primary key 主键MongoDB自动将_id字段设置为主键

In MongoDB, to retrieve the whole entity, you have to perform:

在MongoDB中,要检索整个实体,必须执行:

  • One index lookup on the collection (assuming the entity is fetched by id)
  • 在集合上进行一次索引查找(假设通过id获取实体)
  • Retrieve the contents of one database page (the actual binary json document)
  • 检索一个数据库页的内容(实际的二进制json文档)

So a b-tree lookup, and a binary page read. Log(n) + 1 IOs. If the indexes can reside entirely in memory, then 1 IO.

因此,会有一次B树查找,以及一次读取数据库页。Log(n)+1次磁盘IO。如果索引可以完全驻留在内存中,则为1次磁盘IO。

 

In MySQL with 20 tables, you have to perform:

在有20个表的MySQL中,必须执行:

  • One index lookup on the root table (again, assuming the entity is fetched by id)
  • 在根表上进行一次索引查找(同样,假设该实体是通过id获取的)
  • With a clustered index, we can assume that the values for the root row are in the index
  • 对于聚集索引,我们可以假设根行的值在索引中
  • 20+ range lookups (hopefully on an index) for the entity's pk value
  • 实体主键值的20多次范围查找(希望在索引上)
  • These probably aren't clustered indexes, so the same 20+ data lookups once we figure out what the appropriate child rows are.
  • 这些可能不是聚集索引,因此一旦我们确定了合适的子行是什么,就进行了20多个相同的数据查找。

So the total for mysql, even assuming that all indexes are in memory (which is harder since there are 20 times more of them) is about 20 range lookups.

因此,即使假设所有索引都在内存中(这比较困难,因为它们的数量是20倍),MySQL的总数也大约为20个范围查找。

These range lookups are likely comprised of random IO — different tables will definitely reside in different spots on disk, and it's possible that different rows in the same range in the same table for an entity might not be contiguous (depending on how the entity has been updated, etc).

这些范围查找可能由随机IO组成——不同的表肯定会驻留在磁盘上的不同位置,并且同一实体在同一表的同一范围内的不同行可能不连续(取决于该实体的状态、更新等)。

So for this example, the final tally is about 20 times more IO with MySQL per logical access, compared to MongoDB.

因此,对于此示例,与MongoDB相比,每个逻辑访问的最终IO数比MySQL多20倍。

This is how MongoDB can boost performance in some use cases.

这是MongoDB在某些用例中可以提高性能的方式。

 

 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值