Excessive Disk Space

You may notice that for a given set of data the MongoDB datafiles in /data/db are larger than the data set inserted into the database.  There are several reasons for this.

local.* files and replication

The replication oplog is preallocated as a capped collection in the local database. By default it is allocated approximately 5% of disk space on 64-bit installations. If you would like a smaller oplog, use the --oplogSize command line parameter.
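
As a rough illustration of the 5% figure, the sketch below estimates the default oplog size for a disk and builds a command line with an explicit, smaller value. The function names and the 500GB disk are hypothetical, the 5% rule is only the approximation stated above, and the sketch assumes --oplogSize is given in megabytes.

```javascript
const BYTES_PER_MB = 1024 * 1024;

// Rough estimate only: about 5% of the disk, expressed in MB.
function approxDefaultOplogMB(diskBytes) {
  return Math.round((diskBytes * 0.05) / BYTES_PER_MB);
}

// Build a mongod command line with an explicit oplog size (in MB).
function mongodWithOplogSize(sizeMB) {
  return "mongod --oplogSize " + sizeMB;
}

const diskBytes = 500 * 1024 * BYTES_PER_MB; // hypothetical 500GB volume
console.log(approxDefaultOplogMB(diskBytes)); // about 25600 MB by default
console.log(mongodWithOplogSize(1024));       // mongod --oplogSize 1024
```

On a disk of that size, passing a 1GB oplog instead of the default would save roughly 24GB in the local.* files.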

Datafile Preallocation

Each datafile is preallocated to a given size.  (This is done to prevent file system fragmentation, among other reasons.)  The first file for a database is <dbname>.0, then <dbname>.1, etc.  <dbname>.0 will be 64MB, <dbname>.1 128MB, etc., up to 2GB.  Once the files reach 2GB in size, each successive file is also 2GB.

Thus if the last datafile present is, say, 1GB, that file might be 90% empty if it was only recently started.
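
The doubling sequence above can be worked through with a small calculation. This is a sketch under stated assumptions, not how mongod itself is implemented: the function names are hypothetical, 1MB is taken as 1024 * 1024 bytes, and the extra background file and any per-file overhead are ignored.

```javascript
const MB = 1024 * 1024;
const FIRST_FILE_BYTES = 64 * MB;  // size of <dbname>.0
const MAX_FILE_BYTES = 2048 * MB;  // files stop doubling at 2GB

// Size in bytes of datafile <dbname>.<n>: 64MB, 128MB, ... capped at 2GB.
function datafileSize(n) {
  return Math.min(FIRST_FILE_BYTES * Math.pow(2, n), MAX_FILE_BYTES);
}

// Number of datafiles and total bytes preallocated to hold dataBytes.
// At least one file always exists, even for an empty database.
function allocatedFor(dataBytes) {
  let files = 0;
  let totalBytes = 0;
  while (totalBytes < dataBytes || files === 0) {
    totalBytes += datafileSize(files);
    files += 1;
  }
  return { files, totalBytes };
}

const result = allocatedFor(1000 * MB);
console.log(result.files);           // 5
console.log(result.totalBytes / MB); // 1984
```

With 1000MB of data, the first four files hold 960MB, so the fifth file (1GB) has just been started and holds only about 40MB. The data set occupies nearly twice its size on disk, which is the situation described above.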

Additionally, on Unix, mongod will preallocate an additional datafile in the background and do background initialization of this file.  These files are prefilled with zero bytes.  This initialization can take up to a minute (less on a fast disk subsystem) for larger datafiles; without prefilling in the background, this could result in significant delays whenever a new file must be prepopulated.

You can disable preallocation with the --noprealloc option to the server. This flag is nice for tests with small datasets where you drop the db after each test. It shouldn't be used on production servers.

For large databases (hundreds of GB or more) this is of no significant consequence, as the unused preallocated space is small relative to the total size.

Deleted Space

MongoDB maintains deleted lists of space within the datafiles when objects or collections are deleted.  This space is reused but never freed to the operating system.

To compact this space, run db.repairDatabase() from the mongo shell (note this operation will block and is slow).

When testing and investigating the size of datafiles, if your data is just test data, use db.dropDatabase() to clear all datafiles and start fresh.

Checking Size of a Collection

Use the validate command to check the size of a collection. From the shell, run:

    db.<collectionname>.validate()

This command returns info on the collection data but note there is also data allocated for associated indexes.
