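I have a .csv file with several columns, one of which is filled with random numbers, and I want to find duplicated values in that column. If there are any (an odd case, but it is what I want to check for), I would like to display/store the complete rows that contain those values.
To make it clear, I have something like this: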
First, Whatever, 230, Whichever, etc
Second, Whatever, 11, Whichever, etc
Third, Whatever, 46, Whichever, etc
Fourth, Whatever, 18, Whichever, etc
Fifth, Whatever, 14, Whichever, etc
Sixth, Whatever, 48, Whichever, etc
Seventh, Whatever, 91, Whichever, etc
Eighth, Whatever, 18, Whichever, etc
Ninth, Whatever, 67, Whichever, etc
And I would like to have:
Fourth, Whatever, 18, Whichever, etc
Eighth, Whatever, 18, Whichever, etc
To find the duplicated values, I store that column in a dictionary and count each key to find out how many times it appears.
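import csv
from collections import Counter, defaultdict, OrderedDict

# file and col_2 are placeholders: the path to the .csv file and the index of the column holding the numbers
with open(file, 'rt') as inputfile:
    data = csv.reader(inputfile)
    seen = defaultdict(set)
    # Count how many times each value of the chosen column appears
    counts = Counter(row[col_2] for row in data)
    print("Numbers and times they appear: %s" % counts)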
And I see
Counter({' 18 ': 2, ' 46 ': 1, ' 67 ': 1, ' 48 ': 1,…})
Now here comes the problem: I cannot manage to link each key back to its repetitions and work with them afterwards. If I do
for value in counts:
    if counts > 1:
        print(counts)
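I would only be getting the keys, which is not what I want, and not the count for every value (not to mention that I want to print not just that but the whole row…)
Basically, I am looking for a way to do: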
If there's a repeated number:
    print the rows containing those numbers
else:
    print "No repetitions"
Thanks in advance.
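One possible way to get the behaviour described above is to group the complete rows by the value in the numeric column instead of only counting the values, and then keep the groups that hold more than one row. The following is a minimal sketch, not a definitive solution. It assumes Python 3, that the numbers sit in the third column (index 2), and that fields are separated by a comma followed by a space as in the sample. The name find_repeated_rows is hypothetical and was chosen only for illustration.

```python
import csv
import io
from collections import defaultdict


def find_repeated_rows(lines, col_index):
    """Group rows by the value in column col_index and keep groups of 2+ rows.

    find_repeated_rows is a hypothetical helper name, not part of any library.
    """
    rows_by_value = defaultdict(list)
    # skipinitialspace=True drops the blank after each comma, so " 18" is read as "18"
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) > col_index:  # ignore blank or short lines
            rows_by_value[row[col_index].strip()].append(row)
    # Keep only the values that were seen in more than one row
    return {value: rows for value, rows in rows_by_value.items() if len(rows) > 1}


# Sample data copied from the question. A real file opened with
# open(path, newline="") can be passed in place of the StringIO object.
sample = io.StringIO(
    "First, Whatever, 230, Whichever, etc\n"
    "Second, Whatever, 11, Whichever, etc\n"
    "Third, Whatever, 46, Whichever, etc\n"
    "Fourth, Whatever, 18, Whichever, etc\n"
    "Fifth, Whatever, 14, Whichever, etc\n"
    "Sixth, Whatever, 48, Whichever, etc\n"
    "Seventh, Whatever, 91, Whichever, etc\n"
    "Eighth, Whatever, 18, Whichever, etc\n"
    "Ninth, Whatever, 67, Whichever, etc\n"
)

repeated = find_repeated_rows(sample, col_index=2)

if repeated:
    for value, rows in repeated.items():
        for row in rows:
            print(", ".join(row))
else:
    print("No repetitions")
```

Keeping the rows in a dictionary keyed by the number means the count is simply the length of each list, so a separate Counter is not needed in this variant. The values are compared as text here; if "18" and "018" should count as the same number, converting the key with int() would be one option.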