Data frame:
pair = collections.defaultdict(collections.Counter)
e.g.
pair = {'doc1': {'word1':4, 'word2':3},
        'doc2': {'word1':2, 'word3':4},
        'doc3': {'word2':2, 'word4':1},
        ...}
I want to keep the data frame but change the type of the inner parts, {'word1':4, 'word2':3}, {'word1':2, 'word3':4}, ... Each one is currently a Counter and I need a dict.
I tried this to get the data from pair, but I do not know how to create a dict for each doc:
new_pair = collections.defaultdict(collections.Counter)
for doc, tab in testing.form.items():
    for word, freq in tab.items():
        new_pair[doc][word] = freq
I do not want to change the output. I just need that in each doc, the data type is dict, not Counter.
Solution
A Counter is already a dict, or rather a subclass of it. But if you really need exactly a dict for some reason, then it's a one-liner:
>>> from collections import Counter
>>> c = Counter(word1=4, word2=3)
>>> c
Counter({'word1': 4, 'word2': 3})
>>> dict(c)
{'word1': 4, 'word2': 3}
Any Mapping (anything that behaves like a dictionary) can be passed into dict, and you will get a dict with the same contents. There is no need to iterate over it to construct it yourself.
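Applied to the structure in the question, a minimal sketch might look like the following. The sample data is only illustrative, and the outer container is assumed to be a plain dict here; if you need the defaultdict behaviour on the outside, collections.defaultdict(dict) should work the same way.

```python
import collections

# Sample data shaped like the question's `pair` (values are illustrative).
pair = collections.defaultdict(collections.Counter)
pair['doc1'].update({'word1': 4, 'word2': 3})
pair['doc2'].update({'word1': 2, 'word3': 4})

# Assumption: a plain dict is acceptable as the outer container.
new_pair = {}
for doc, tab in pair.items():
    # dict(tab) copies the Counter's contents into a plain dict.
    new_pair[doc] = dict(tab)

print(type(new_pair['doc1']))  # <class 'dict'>
```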
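In your loop, that means the inner loop can be replaced by a single assignment, new_pair[doc] = dict(tab). This gives you one loop, with one line in the body, instead of a nested loop. But any code of the form: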
thing = a new empty collection
for elem in old_thing:
    Add something to do with elem to thing
can usually be done in one line using a generator expression or a list, set or dict comprehension. We're building a dict, so a dict comprehension seems likely (in its documentation, the Examples section is what you're most interested in). I'll leave coming up with it as an exercise for the reader. ;-)
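For anyone who wants to check their answer to the exercise, one possible form is sketched below. It assumes the same shape of data as in the question, and the sample values are made up for illustration.

```python
from collections import Counter

# Illustrative input: document names mapped to Counters of word frequencies.
pair = {
    'doc1': Counter(word1=4, word2=3),
    'doc2': Counter(word1=2, word3=4),
}

# Dict comprehension: one entry per doc, each Counter copied into a plain dict.
new_pair = {doc: dict(tab) for doc, tab in pair.items()}

print(new_pair)
```

The contents are unchanged; only the type of each inner value goes from Counter to dict.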