How to count the top 10 most common values in Python dictionaries

I'm new to Python and programming in general, so please be kind. I'm trying to analyze a CSV file with music information and return the top n most-listened-to bands. In the code below, each song listen is stored as a dict in a list, formatted like this:

[{'album': 'Exile on Main Street', 'song': 'Happy', 'datetime': '3 Dec 2014 14:08', 'artist': 'The Rolling Stones'}, {'album': 'II', 'song': 'Black Dog', 'datetime': '1 Dec 2014 08:08', 'artist': 'Led Zepplin'}]

from collections import Counter

def count_artist_plays(filename):
    with open(filename, 'r') as data:
        header = data.readline().strip().split(',')
        entries = []
        for line in data:
            entry = line.strip().split(',')
            listens = {}
            for info, type in enumerate(header):
                listens[type] = entry[info]
            entries.append(listens)
    for d in entries:
        arts = d['artist']
        c = Counter(arts)
    print(c.most_common(10))

How do I get the most common strings (bands) instead of the character breakdown shown below?

[('s', 2), ('a', 1), (' ', 1), ('E', 1), ('l', 1), ('o', 1), ('n', 1), ('S', 1), ('v', 1), ('y', 1)]

Solution

Initialize the Counter once, before the loop, use the artists as keys, and increment the count for the current artist each time through the loop:

c = Counter()
for d in entries:
    arts = d['artist']
    c[arts] += 1
print(c.most_common(10))

When arts is a string, then c = Counter(arts) counts the characters in arts:

In [522]: collections.Counter('Led Zepplin')

Out[522]: Counter({'e': 2, 'p': 2, ' ': 1, 'd': 1, 'i': 1, 'L': 1, 'l': 1, 'n': 1, 'Z': 1})

In contrast:

In [523]: c = collections.Counter()

In [524]: c['Led Zepplin'] += 1

In [525]: c['The Rolling Stones'] += 1

In [526]: c.most_common()

Out[526]: [('Led Zepplin', 1), ('The Rolling Stones', 1)]
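The difference comes from the fact that Counter counts the elements of whatever iterable it is given: a string iterates character by character, while a list of strings iterates string by string. A minimal sketch of that rule (the names by_char and by_name are only illustrative):

```python
from collections import Counter

# A string is an iterable of characters, so each character is counted.
by_char = Counter('Led Zepplin')

# A list holding the string is an iterable of one element: the whole name.
by_name = Counter(['Led Zepplin'])

# update() follows the same rule, so a single name must be wrapped in a list.
by_name.update(['Led Zepplin'])

print(by_char['e'])            # count of the character 'e'
print(by_name['Led Zepplin'])  # count of the whole artist name
```

The same pitfall applies to c.update(arts): passing a bare string would add character counts, which is why c[arts] += 1 is the safer way to count one artist at a time.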

Alternatively, as Jon Clements points out, pass all the artists to Counter at once and let it do the counting:

c = Counter(d['artist'] for d in entries)

print(c.most_common(10))

Note that the above uses a generator expression to avoid building a (possibly) large temporary list, and at the same time has a much more succinct, readable syntax.
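Putting the pieces together, the whole task might look like the sketch below. It swaps the manual split(',') parsing for the standard library's csv.DictReader, which is not part of the original question but also copes with quoted fields that contain commas. The function name top_artists and the sample rows are hypothetical, and the header is assumed to contain a column named artist.

```python
import csv
import io
from collections import Counter


def top_artists(lines, n=10):
    """Return the n most common values of the 'artist' column."""
    # DictReader takes the first row as field names and yields one dict per row.
    reader = csv.DictReader(lines)
    # The generator expression feeds artist names to Counter one at a time.
    counts = Counter(row['artist'] for row in reader)
    return counts.most_common(n)


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO(
    "album,song,datetime,artist\n"
    "Exile on Main Street,Happy,3 Dec 2014 14:08,The Rolling Stones\n"
    "II,Black Dog,1 Dec 2014 08:08,Led Zepplin\n"
    "Exile on Main Street,Happy,4 Dec 2014 10:00,The Rolling Stones\n"
)
result = top_artists(sample, n=2)
print(result)
```

With a real file, the same function can be called as top_artists(open(filename, newline='')), ideally inside a with block so the file is closed afterwards.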