MongoDB: Why count Is Inaccurate

mmgithub123

Tags: java, database, python, mongodb, redis

Article link: https://blog.csdn.net/mmgithub123/article/details/124507808

Just a tip: we run version 3.6, where this problem exists. From 4.0 onward it is gone for counts that include a query predicate.

Background

Apart from replication lag, which can make reads from a secondary node return stale data, the MongoDB 4.0 official documentation says the following about the accuracy of count:
On a sharded cluster, db.collection.count() without a query predicate can result in an inaccurate count if orphaned documents exist or if a chunk migration is in progress.
To avoid these situations, on a sharded cluster, use the db.collection.aggregate() method.

The MongoDB 3.6 official documentation, however, puts it this way:
On a sharded cluster, db.collection.count() can result in an inaccurate count if orphaned documents exist or if a chunk migration is in progress.
To avoid these situations, on a sharded cluster, use the db.collection.aggregate() method.

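In other words, on a MongoDB 4.0 sharded cluster, a full-collection count without a query predicate can return an inaccurate result, mainly in the two scenarios below. In versions before 4.0, count can be inaccurate in these two scenarios even when a query predicate is supplied.
1. Orphaned documents exist.
2. The sharded cluster is performing a moveChunk operation internally.
This article looks at these two scenarios, analyzing why count is inaccurate and how to work around it.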
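The workaround the documentation recommends, counting through aggregate() instead of count(), can be sketched roughly as follows. This is a minimal sketch, not the author's code: the helper name count_via_aggregate is hypothetical, and `collection` is assumed to be a PyMongo-style collection object reached through mongos, exposing an aggregate(pipeline) method.

```python
from typing import Any, Dict, Optional


def count_via_aggregate(collection: Any, query: Optional[Dict[str, Any]] = None) -> int:
    """Count matching documents with an aggregation pipeline instead of count().

    `collection` is assumed to behave like a PyMongo collection, i.e. it
    exposes aggregate(pipeline) and yields result documents.
    """
    pipeline = [
        {"$match": query or {}},                      # an empty filter matches every document
        {"$group": {"_id": None, "n": {"$sum": 1}}},  # add 1 per document that reaches this stage
    ]
    rows = list(collection.aggregate(pipeline))
    # $group emits no document at all when nothing matched, so fall back to 0.
    return rows[0]["n"] if rows else 0
```

With a real deployment this might be called as count_via_aggregate(client["mydb"]["orders"]), where the database and collection names are placeholders. Expect it to be slower than count() on large collections, since the documents are actually examined.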

Orphaned documents make count inaccurate

Definition and causes of orphaned documents

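Orphaned documents arise when a chunk migration fails because a process shuts down abnormally during moveChunk, or when cleaning up the source chunk after a migration fails. The affected documents then exist on both the source and the destination shard, whereas a sharded cluster requires each document to belong to exactly one chunk and one shard.
Clearly, orphaned documents can make count inaccurate, and in large numbers they also take up extra disk space.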
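To see why duplicated documents inflate the result, here is a toy model in plain Python. It is not MongoDB's actual implementation. It assumes an integer shard key, two hypothetical shards named shardA and shardB, and an aborted migration of keys 90 to 99 that left copies on shardB while shardA still owns that range.

```python
from typing import Dict, List, Tuple

# (min key inclusive, max key exclusive, owning shard) -- hypothetical routing table
Chunk = Tuple[int, int, str]


def owning_shard(key: int, chunks: List[Chunk]) -> str:
    """Return the shard that owns the chunk containing `key`."""
    for lo, hi, shard in chunks:
        if lo <= key < hi:
            return shard
    raise KeyError(key)


def naive_count(shards: Dict[str, List[int]]) -> int:
    """Sum what every shard physically stores, orphans included."""
    return sum(len(keys) for keys in shards.values())


def filtered_count(shards: Dict[str, List[int]], chunks: List[Chunk]) -> int:
    """Count a document only on the shard that owns its chunk."""
    return sum(
        1
        for shard, keys in shards.items()
        for key in keys
        if owning_shard(key, chunks) == shard
    )


chunks = [(0, 100, "shardA"), (100, 200, "shardB")]
shards = {
    "shardA": list(range(0, 100)),
    # shardB also holds 10 orphaned copies of keys 90..99, which it does not own
    "shardB": list(range(100, 200)) + list(range(90, 100)),
}

print(naive_count(shards))             # 210: the 10 orphans are counted twice
print(filtered_count(shards, chunks))  # 200: the logical number of documents
```

The gap between the two numbers equals the number of orphaned documents, which is the kind of over-count described above.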

Generally speaking, a moveChunk operation goes through roughly the following steps:

The balancer, to the source shard…
