Python: what is the default batchSize in pymongo?

I am using pymongo to fetch around 2M documents in one query. Each document contains only three string fields. The query is just a simple find(), without any limit() or batchSize().

While iterating through the cursor, I noticed that the script waits for about 30 to 40 seconds after processing around 25k documents.

So I am wondering: does mongo return all 2M results in one batch? What is the default batchSize() in pymongo?

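Solution

By default, the first batch of a MongoDB cursor returns up to 101 documents, or just enough documents to exceed 1 MB. Subsequent batches, fetched as you iterate through the cursor, are up to 4 MB each. So the server does not send all 2M results in one batch, and the number of documents per batch depends on how big your documents are. From the MongoDB documentation: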

Cursor Batches

The MongoDB server returns the query results in batches. Batch size will not exceed the maximum BSON document size. For most queries, the first batch returns 101 documents or just enough documents to exceed 1 megabyte. Subsequent batch size is 4 megabytes. To override the default size of the batch, see batchSize() and limit().

For queries that include a sort operation without an index, the server must load all the documents in memory to perform the sort and will return all documents in the first batch.

As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next() will perform a getmore operation to retrieve the next batch.
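
To get a feel for how many round trips the batch rules quoted above imply, here is a rough back-of-the-envelope sketch. It only encodes the 101-document / 1 MB / 4 MB figures from the quoted text. The function name estimate_batches and the 160-byte average document size are assumptions made for illustration, not measurements of the asker's data, and real batch boundaries depend on the server version and actual BSON sizes.

```python
import math

# Figures taken from the documentation quoted above.
FIRST_BATCH_DOCS = 101
FIRST_BATCH_BYTES = 1 * 1024 * 1024
NEXT_BATCH_BYTES = 4 * 1024 * 1024


def estimate_batches(total_docs, avg_doc_bytes):
    """Estimate (docs in first batch, docs per later batch, total batches).

    Hypothetical helper: assumes every document has the same size.
    """
    # "Just enough documents to exceed 1 MB" for the first batch.
    docs_to_exceed_1mb = FIRST_BATCH_BYTES // avg_doc_bytes + 1
    first = min(total_docs, FIRST_BATCH_DOCS, docs_to_exceed_1mb)

    # Later batches are capped by size, so the count depends on document size.
    per_later_batch = max(1, NEXT_BATCH_BYTES // avg_doc_bytes)

    remaining = total_docs - first
    later_batches = math.ceil(remaining / per_later_batch) if remaining > 0 else 0
    return first, per_later_batch, 1 + later_batches


# Assumed example: 2M documents of about 160 bytes each.
print(estimate_batches(2_000_000, 160))  # (101, 26214, 78)
```

Under that assumed document size, each later batch would hold roughly 26k documents, so a pause every few tens of thousands of documents is consistent with the cursor issuing a getmore for the next batch.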

You can call the batch_size() method on the pymongo cursor to override the default. However, a batch will not go above 16 MB (the maximum BSON document size). From the pymongo documentation:

batch_size(batch_size)

Limits the number of documents returned in one batch. Each batch requires a round trip to the server. It can be adjusted to optimize performance and limit data transfer.

Note

batch_size can not override MongoDB's internal limits on the amount of data it will return to the client in a single batch (i.e. if you set batch size to 1,000,000,000, MongoDB will currently only return 4-16MB of results per batch).

Raises TypeError if batch_size is not an integer. Raises ValueError if batch_size is less than 0. Raises InvalidOperation if this Cursor has already been used. The last batch_size applied to this cursor takes precedence.

Parameters:

batch_size: The size of each batch of results requested.
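
As a minimal sketch of how this might look in practice, the helper below applies batch_size() to a find() cursor before iterating. The helper name iterate_with_batch_size, the connection URI, and the database and collection names are all hypothetical placeholders. The value 5000 is an arbitrary example, not a recommended setting.

```python
def iterate_with_batch_size(collection, batch_size=1000):
    """Yield every document, requesting `batch_size` documents per batch.

    `collection` is expected to be a pymongo Collection (or anything with
    a compatible find() method).
    """
    # batch_size() must be called before iteration starts; it returns the
    # cursor itself, so the call can be chained onto find().
    cursor = collection.find({}).batch_size(batch_size)
    for doc in cursor:
        yield doc


def demo(uri="mongodb://localhost:27017"):
    """Example usage; requires pymongo and a reachable server (assumed)."""
    from pymongo import MongoClient

    client = MongoClient(uri)
    coll = client["test_db"]["test_coll"]  # placeholder names
    for i, doc in enumerate(iterate_with_batch_size(coll, 5000), start=1):
        if i % 100_000 == 0:
            print(f"processed {i} documents")
```

Whether a larger or smaller batch helps depends on document size and network latency, and the server-side size limits described in the note above still apply.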
