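I went through the official documentation and other users' posts, but the sample code always failed for me, possibly because of version differences.
For example: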
db.myset.aggregate([
    {
        '$group': { '_id': {'docname': '$docname', 'hpname': '$hpname'}, 'count': {'$sum': 1}, 'dups': {'$addToSet': '$_id'} }
    },
    {
        '$match': {'count': {'$gt': 1}}
    }
]).forEach(function(doc){
    doc.dups.shift();
    db.myset.remove({_id: {$in: doc.dups}});
})
It always fails with a syntax error:
Traceback (most recent call last):
Python Shell, prompt 48, line 8
invalid syntax: <string>, line 8, pos 25
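A note on this error: the traceback comes from a Python shell, while the snippet above is written for the mongo shell (JavaScript). Line 8 of the snippet is `]).forEach(function(doc){`, which Python cannot parse, so the failure may have less to do with versions than with the language. Below is a minimal sketch of how the callback part might translate to Python. It uses made-up sample data in place of a live collection, and the names `groups` and `ids_to_remove` are hypothetical.

```python
# Sample data shaped like the output of the $group/$match pipeline above
# (the values are made up for illustration).
groups = [
    {"_id": {"docname": "a", "hpname": "x"}, "count": 3, "dups": [11, 12, 13]},
    {"_id": {"docname": "b", "hpname": "y"}, "count": 2, "dups": [21, 22]},
]


def ids_to_remove(groups):
    """Python counterpart of the forEach callback: skip the first id of
    each group (what doc.dups.shift() does) and collect the rest."""
    to_remove = []
    for doc in groups:                     # replaces .forEach(function(doc){...})
        to_remove.extend(doc["dups"][1:])  # replaces doc.dups.shift()
    return to_remove


print(ids_to_remove(groups))  # [12, 13, 22]
```

With pymongo, the list returned here would be what gets passed to a delete call with `{'_id': {'$in': ...}}`.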
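After an afternoon of frustration, I decided to write my own deduplication function, as follows.
For example, given two fields, docname and hpname, it finds the documents that share the same values for both and deletes the duplicates. options is optional: with "savemax" (the default), the document with the largest ID in each group is kept.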
def test(query={'docname': '$docname', 'hpname': '$hpname'}, options="savemax"):
    # _connect_mongo is my own helper (not shown here); it returns a pymongo Database
    db = _connect_mongo(host='localhost', port=27017, db='dianping')
    myset = db['soyoung2']
    # Group by the key fields and keep only groups that contain more than one document
    res = myset.aggregate([
        {
            '$group': {'_id': query, 'count': {'$sum': 1}, 'dups': {'$addToSet': '$_id'}}
        },
        {
            '$match': {'count': {'$gt': 1}}
        }
    ])
    for group in res:
        # $addToSet does not guarantee any order, so sort the ids explicitly
        ids = sorted(group['dups'])
        if options == "savemax":
            to_delete = ids[:-1]  # keep the largest _id
        else:
            to_delete = ids[1:]   # keep the smallest _id
        # delete_many (pymongo 3.0+) replaces the deprecated remove()
        myset.delete_many({'_id': {'$in': to_delete}})
        for sid in to_delete:
            print("%s was deleted" % sid)
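The code is on the long side, but it solves the problem. Suggestions from more experienced users on how to simplify it are welcome. Thanks!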
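One possible way to shorten it, offered as a sketch rather than a tested drop-in: let the `$group` stage pick the id to keep with the `$max` (or `$min`) accumulator, then delete every other id in a single `delete_many` call. The function name `dedupe`, the `fields` and `keep` parameters, and the `keep_id` field are all hypothetical; `collection` is assumed to be a pymongo Collection (pymongo 3.0 or later, where `delete_many` is available).

```python
def dedupe(collection, fields=("docname", "hpname"), keep="max"):
    """Delete duplicates on `fields`, keeping the largest (or smallest) _id.

    `collection` is assumed to behave like a pymongo Collection.
    Returns the list of _id values that were passed to delete_many.
    """
    accumulator = "$max" if keep == "max" else "$min"
    pipeline = [
        {"$group": {
            "_id": {name: "$" + name for name in fields},
            "keep_id": {accumulator: "$_id"},  # id chosen on the server side
            "dups": {"$addToSet": "$_id"},
            "count": {"$sum": 1},
        }},
        {"$match": {"count": {"$gt": 1}}},
    ]
    # Every id in a duplicate group except the one marked to keep
    to_delete = [
        _id
        for group in collection.aggregate(pipeline)
        for _id in group["dups"]
        if _id != group["keep_id"]
    ]
    if to_delete:
        collection.delete_many({"_id": {"$in": to_delete}})
    return to_delete
```

It might be called as `dedupe(db['soyoung2'])` to keep the largest ID, or `dedupe(db['soyoung2'], keep="min")` to keep the smallest. For a very large number of duplicates, the `$in` list may need to be split into batches.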