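I have 100 documents in my MongoDB, and any of them may be a duplicate of other documents under different conditions, e.g. firstName & lastName, email, or mobile.
I am trying to mapReduce these 100 documents into key-value pairs, like a grouping.
Everything works fine until there is a 101st duplicate record in the DB.
The mapReduce output for the other documents that duplicate the 101st record then comes out corrupted.
For example:
I am working on firstName & lastName for now.
When the DB contains 100 documents, I get a result like this: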
{
    _id: {
        firstName: "foo",
        lastName: "bar"
    },
    value: {
        count: 20,
        duplicate: [{
            id: ObjectId("/*an object id*/"),
            fullName: "foo bar",
            DOB: ISODate("2000-01-01T00:00:00.000Z")
        },{
            id: ObjectId("/*another object id*/"),
            fullName: "foo bar",
            DOB: ISODate("2000-01-02T00:00:00.000Z")
        },...]
    }
}
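This is exactly what I want, but...
when the DB contains more than 100 possibly duplicated documents, the result looks like the ones below.
Say the 101st document is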
{
    firstName: "foo",
    lastName: "bar",
    email: "foo@bar.com",
    mobile: "019894793"
}
With 101 documents:
{
    _id: {
        firstName: "foo",
        lastName: "bar"
    },
    value: {
        count: 21,
        duplicate: [{
            id: undefined,
            fullName: undefined,
            DOB: undefined
        },{
            id: ObjectId("/*another object id*/"),
            fullName: "foo bar",
            DOB: ISODate("2000-01-02T00:00:00.000Z")
        }]
    }
}
With 102 documents:
{
    _id: {
        firstName: "foo",
        lastName: "bar"
    },
    value: {
        count: 22,
        duplicate: [{
            id: undefined,
            fullName: undefined,
            DOB: undefined
        },{
            id: undefined,
            fullName: undefined,
            DOB: undefined
        }]
    }
}
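I found another topic on Stack Overflow describing a problem similar to mine, but the answer there does not work for me:
MapReduce results seem limited to 100?
Any ideas?
Edit:
Original source code: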
var map = function () {
    var value = {
        count: 1,
        userId: this._id
    };
    emit({lastName: this.lastName, firstName: this.firstName}, value);
};
var reduce = function (key, values) {
    var reducedObj = {
        count: 0,
        userIds: []
    };
    values.forEach(function (value) {
        reducedObj.count += value.count;
        reducedObj.userIds.push(value.userId);
    });
    return reducedObj;
};
Current source code:
var map = function () {
    var value = {
        count: 1,
        users: [this]
    };
    emit({lastName: this.lastName, firstName: this.firstName}, value);
};
var reduce = function (key, values) {
    var reducedObj = {
        count: 0,
        users: []
    };
    values.forEach(function (value) {
        reducedObj.count += value.count;
        // concat the current element's users (value, not the whole values array)
        reducedObj.users = reducedObj.users.concat(value.users); // or using the forEach method
        // value.users.forEach(function (user) {
        //     reducedObj.users.push(user);
        // });
    });
    return reducedObj;
};
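I don't understand why the original version fails, since it also pushes a value (userId) onto reducedObj.userIds.
Is there something wrong with the value I emit in the map function?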
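To narrow this down, here is a minimal sketch in plain JavaScript (run outside MongoDB) that seems to reproduce the symptom with the original reduce. It assumes, as the MongoDB mapReduce documentation describes, that reduce may be invoked more than once for the same key, with an earlier reduce result passed back in among the values. That this re-reduce is what kicks in past 100 documents is only my guess. The `emitted` helper and the string ids are hypothetical stand-ins for real map output and ObjectIds.

```javascript
// Original reduce from the question, unchanged.
var reduce = function (key, values) {
    var reducedObj = {
        count: 0,
        userIds: []
    };
    values.forEach(function (value) {
        reducedObj.count += value.count;
        reducedObj.userIds.push(value.userId);
    });
    return reducedObj;
};

// Hypothetical stand-in for what map emits (plain string ids, not ObjectIds).
function emitted(id) {
    return { count: 1, userId: id };
}

var key = { lastName: "bar", firstName: "foo" };

// First pass: reduce only sees map output, so the result looks right.
var firstPass = reduce(key, [emitted("a"), emitted("b")]);

// Simulated re-reduce: the earlier result is fed back in next to new map output.
// firstPass has `userIds` but no `userId`, so undefined gets pushed for it.
var secondPass = reduce(key, [firstPass, emitted("c")]);

console.log(firstPass);  // count is 2, userIds holds "a" and "b"
console.log(secondPass); // count is 3, but userIds holds undefined and "c"
```

The count stays correct while earlier entries turn into undefined, which matches the output above. If that assumption holds, the value returned by reduce would need the same shape as the value emitted by map, which is what the current version with `users: [this]` tries to do.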