Python, show only duplicate values: displaying rows with duplicate values in a CSV file

I have a .csv file with several columns, one of which is filled with random numbers, and I want to find duplicated values in that column. If there are any (an unusual case, but it is what I want to check), I would like to display or store the complete rows in which those values appear.

To make it clear, I have something like this:

First, Whatever, 230, Whichever, etc

Second, Whatever, 11, Whichever, etc

Third, Whatever, 46, Whichever, etc

Fourth, Whatever, 18, Whichever, etc

Fifth, Whatever, 14, Whichever, etc

Sixth, Whatever, 48, Whichever, etc

Seventh, Whatever, 91, Whichever, etc

Eighth, Whatever, 18, Whichever, etc

Ninth, Whatever, 67, Whichever, etc

And I would like to have:

Fourth, Whatever, 18, Whichever, etc

Eighth, Whatever, 18, Whichever, etc

To find duplicated values, I store that column in a dictionary and count each key to find out how many times it appears.

import csv
from collections import Counter, defaultdict, OrderedDict

with open(file, 'rt') as inputfile:
    data = csv.reader(inputfile)
    seen = defaultdict(set)
    counts = Counter(row[col_2] for row in data)

print "Numbers and times they appear: %s" % counts

And I see

Counter({' 18 ': 2, ' 46 ': 1, ' 67 ': 1, ' 48 ': 1,...})

This is where the problem starts: I can't manage to link each key with its number of repetitions and use that later. If I do

for value in counts:
    if counts > 1:
        print counts

I would only be getting the keys, and every one of them, which is not what I want (not to mention that I want to print the whole line, not just the value).

Basically I'm looking for a way of doing

if there's a repeated number:
    print rows containing those numbers
else:
    print "No repetitions"

Thanks in advance.

Solution

Try this; it may work for you.

entries = []            # values already seen in the third column
duplicate_entries = []  # values seen more than once

# First pass: collect the values that occur more than once.
with open('in.txt', 'r') as my_file:
    for line in my_file:
        columns = line.strip().split(',')
        if columns[2] not in entries:
            entries.append(columns[2])
        else:
            duplicate_entries.append(columns[2])

# Second pass: print and save every row holding one of those values.
if len(duplicate_entries) > 0:
    with open('out.txt', 'w') as out_file:
        with open('in.txt', 'r') as my_file:
            for line in my_file:
                columns = line.strip().split(',')
                if columns[2] in duplicate_entries:
                    print(line.strip())
                    out_file.write(line)
else:
    print("No repetitions")
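As an alternative, here is a minimal Python 3 sketch that reads the data only once. It is not the code above but a different technique: it groups whole rows by the value in the chosen column with a `defaultdict(list)`, then keeps the groups that hold more than one row. The function name `find_duplicate_rows`, the `column` parameter, and the in-memory `sample` are made up for illustration; the sketch assumes the number sits in the third column and that a space follows each comma, as in the question's example.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_rows(csv_file, column=2):
    """Return the rows whose value in `column` occurs more than once."""
    rows_by_value = defaultdict(list)
    # skipinitialspace drops the blank after each comma, so ' 18' becomes '18'.
    for row in csv.reader(csv_file, skipinitialspace=True):
        if len(row) > column:  # ignore blank or short lines
            rows_by_value[row[column].strip()].append(row)
    # Keep only the groups that contain two or more rows.
    return [row for rows in rows_by_value.values() if len(rows) > 1 for row in rows]


# Hypothetical in-memory stand-in for the file described in the question.
sample = io.StringIO(
    "First, Whatever, 230, Whichever, etc\n"
    "Second, Whatever, 11, Whichever, etc\n"
    "Third, Whatever, 46, Whichever, etc\n"
    "Fourth, Whatever, 18, Whichever, etc\n"
    "Fifth, Whatever, 14, Whichever, etc\n"
    "Sixth, Whatever, 48, Whichever, etc\n"
    "Seventh, Whatever, 91, Whichever, etc\n"
    "Eighth, Whatever, 18, Whichever, etc\n"
    "Ninth, Whatever, 67, Whichever, etc\n"
)

duplicates = find_duplicate_rows(sample)
if duplicates:
    for row in duplicates:
        print(", ".join(row))
else:
    print("No repetitions")
```

To run it against a real file, pass an open file object instead of `sample`, for example `open('in.txt', newline='')`. Grouping the rows up front avoids the second read and the repeated list lookups, at the cost of holding every row in memory.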
