MongoDB: Index Size and Memory, and the Possible Performance Impact

This is a thread from the Google Groups mongodb-user list. The original poster asked the following question:

Hi,

On our main mongo box we have 4 GB of memory (woohoo). Our data size
is 1 GB and our index size is 5.6 GB. This box is one node of a
three-node replica set. We have one collection and a few indexes on
that collection. This one collection stores a lot of writes from our
website.

We are mostly concerned that our write speed does not degrade (right
now we're at ~3-4 ms/write). The (near) worst-case scenario is that
write speed degrades dramatically while mongo is paging.

What things should we be keeping an eye on? Any particular metrics to
comb for? Would writes (including index updates) remain efficient
even if, as in our case, index size is greater than RAM?

Thanks.


Below is the reply from a member called Gates; his comments are very good and well worth a careful read!


Here's the general advice on scaling:
 - *Replica Sets* are used for scaling reads and for providing
redundancy
 - *Sharding* is used for scaling writes

 

> What things should we be keeping an eye on? Any particular metrics to comb for?

Metric #1 is to check that index size < RAM size.
---
It looks like you're already past that.
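
For example, you can compare index size against RAM straight from the
driver. A minimal pymongo sketch; the host, database, and collection
names here are hypothetical:

    # Minimal sketch with pymongo; host/db/collection names are hypothetical.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client["mydb"]

    # collStats reports per-collection index size in bytes.
    coll_stats = db.command("collstats", "events")
    print("collection index size: %.2f GB" % (coll_stats["totalIndexSize"] / 2.0**30))

    # dbStats aggregates index size across the whole database.
    db_stats = db.command("dbstats")
    print("database index size:   %.2f GB" % (db_stats["indexSize"] / 2.0**30))

    # Compare these totals against physical RAM (4 GB in the poster's case).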

Metric #2 is to check IO utilization (see iostat).
---
You want to keep IO utilization under 100%. By default, MongoDB
flushes every 60 seconds, which may result in IO spikes every minute
or so. If you're seeing low utilization with spikes every minute, you
may want to set --syncdelay to a lower value when you run mongod, so
that flushes are smaller and more frequent.
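
(For reference, the flag is given at startup, e.g. "mongod --syncdelay 10"
for a 10-second flush interval.) If you want to watch flush cost over
time from the driver, here is a rough sketch; it assumes an MMAPv1-era
mongod whose serverStatus output includes a backgroundFlushing
section, and the host name is a placeholder:

    # Rough sketch: sample flush statistics every 10 seconds.
    # Assumes an MMAPv1-era mongod that reports backgroundFlushing
    # in serverStatus; the host is a placeholder.
    import time
    from pymongo import MongoClient

    admin = MongoClient("mongodb://localhost:27017")["admin"]

    prev_flushes = None
    while True:
        bf = admin.command("serverStatus")["backgroundFlushing"]
        if prev_flushes is not None:
            print("flushes since last sample: %d, last flush: %d ms"
                  % (bf["flushes"] - prev_flushes, bf["last_ms"]))
        prev_flushes = bf["flushes"]
        time.sleep(10)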

Metric #3: watch for paging.
---
You can often do this with a tool like top. We also have mongostat,
which should give you an idea of how often data is being "paged in".
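
If you'd rather sample the counter mongostat reads yourself, a sketch
along these lines works on Linux builds, where serverStatus reports
extra_info.page_faults (the host name is a placeholder):

    # Sketch: sample the page-fault counter every 5 seconds.
    # Assumes a Linux mongod, where serverStatus exposes
    # extra_info.page_faults; the host is a placeholder.
    import time
    from pymongo import MongoClient

    admin = MongoClient("mongodb://localhost:27017")["admin"]

    last = admin.command("serverStatus")["extra_info"]["page_faults"]
    while True:
        time.sleep(5)
        now = admin.command("serverStatus")["extra_info"]["page_faults"]
        print("page faults in last 5 s: %d" % (now - last))
        last = now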

The Problems:

> We are mostly concerned that our write speed does not degrade

---
This is really hard to guarantee when you've already overflowed RAM
and the data set is going to keep growing.

If you only have one index, then maybe you'll get lucky. MongoDB is
pretty good about "writing to the end" of the index. So old index data
would just flow out of memory and rarely if ever get re-used.
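
Concretely, "writing to the end" happens when the indexed key only
ever increases, such as an insert timestamp or the default ObjectId
_id; a random key touches pages all over the B-tree instead. A
hypothetical illustration (collection and field names are made up):

    # Hypothetical illustration; collection and field names are made up.
    import datetime, uuid
    from pymongo import MongoClient

    coll = MongoClient("mongodb://localhost:27017")["mydb"]["events"]

    coll.create_index("created_at")  # monotonically increasing: inserts land
                                     # at the right edge, old pages go cold
    coll.create_index("token")       # random: the whole tree must stay hot

    coll.insert_one({
        "created_at": datetime.datetime.utcnow(),  # always the max so far
        "token": str(uuid.uuid4()),                # uniformly random
    })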

However, I suspect that you have multiple indexes here. If this is the
case, then it's hard to guarantee that we won't be going to disk to
fetch those indexes.

The other problem here is queries. At some point you'll want to query
this data. When you do, it's quite likely that you're going to force
the whole index back into memory and this is going to slow down the
whole system.

Possible Solutions:
 #1: Add more RAM -> will delay the problem
 #2: Fewer indexes? -> if possible
 #3: Break out data

It sounds like you have lots of transactional data.
Is it possible to break out the data by hour, by day, or by week?
This will make queries a little more difficult, but it's going to
minimize the amount of index data you need in memory at any given time.
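
One way to sketch this (the collection naming scheme and field names
are hypothetical) is to route each write to a per-day collection, so
only the current day's indexes need to stay in memory:

    # Sketch: one collection (and index set) per day; names are hypothetical.
    import datetime
    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["mydb"]

    def events_for(ts):
        # e.g. "events_2011_06_14"
        return db["events_" + ts.strftime("%Y_%m_%d")]

    def record(doc):
        ts = datetime.datetime.utcnow()
        doc["created_at"] = ts
        events_for(ts).insert_one(doc)

    record({"path": "/signup", "user_id": 42})

The trade-off is exactly the one mentioned above: a query spanning
several days has to fan out over several collections.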

If you have more details on what data you're storing and how you're
indexing, we may be able to provide additional guidance.

- Gates
