I'm stuck on a parsing problem. I have a working solution, but I don't think it's very Pythonic.
I have a raw text output like this:
Key 1
    Value 1
Key 2
    Value 2
Key 3
    Value 3a
    Value 3b
    Value 3c
Key 4
    Value 4a
    Value 4b
I'm trying to make a dictionary:
{ 'Key 1': ['Value 1'], 'Key 2': ['Value 2'], 'Key 3': ['Value 3a', 'Value 3b', 'Value 3c'], 'Key 4': ['Value 4a', 'Value 4b'] }
The raw output can be made into a string and it looks something like this:
my_str = "
Key 1\n\tValue 1
\nKey 2\n\tValue 2
\nKey 3\n\tValue 3a \n\tValue 3b \n\tValue 3c
\nKey 4\n\tValue 4a \n\tValue 4b "
So the Values are separated by \n\t and the Keys are separated by \n
If I try to do something like this:
dict(item.split('\n\t') for item in my_str.split('\n'))
It doesn't parse correctly, because splitting on '\n' also splits on the '\n' inside every '\n\t'.
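To make the failure concrete, here is a minimal sketch on a shortened, hypothetical sample string (the name `sample` and its contents are assumptions, laid out the same way as my_str). It only shows what the two splits produce and why dict() rejects the result:

```python
# Hypothetical shortened sample, same layout as my_str.
sample = "Key 1\n\tValue 1\nKey 3\n\tValue 3a\n\tValue 3b"

# Splitting on '\n' first also consumes the '\n' inside each '\n\t' ...
pieces = sample.split('\n')

# ... so no piece still contains '\n\t', and the second split
# just wraps each piece in a one-element list.
pairs = [piece.split('\n\t') for piece in pieces]

# dict() needs two-element (key, value) items, so it rejects these.
try:
    dict(pairs)
    failed = False
except ValueError:
    failed = True
```

Every value ends up as its own piece with a leading tab, so the key/value grouping is lost before the second split runs.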
So far I have something like this:
#!/usr/bin/env python
my_str = "Key 1\n\tValue 1\nKey 2\n\tValue 2\nKey 3\n\tValue 3a \n\tValue 3b \n\tValue 3c\nKey 4\n\tValue 4a \n\tValue 4b"
# Mark value boundaries with ',' and key boundaries with ';'
output = my_str.replace('\n\t', ',').replace('\n', ';')
result = {}
for entry in output.split(';'):
    fields = entry.split(',')
    # First field is the key, the rest are its values
    result[fields[0]] = fields[1:]
print(result)
Which returns:
{'Key 1': ['Value 1'], 'Key 2': ['Value 2'], 'Key 3': ['Value 3a ', 'Value 3b ', 'Value 3c'], 'Key 4': ['Value 4a ', 'Value 4b']}
However, this looks quite gross to me. Is there a more Pythonic way to do this? Any help would be super appreciated!
Solution
Batteries are included: defaultdict automatically creates an empty list for each new key, and str's isspace method lets us check for indentation (a regular expression would also work):
from collections import defaultdict

data = """
Key 1
    Value 1
Key 2
    Value 2
Key 3
    Value 3a
    Value 3b
    Value 3c
Key 4
    Value 4a
    Value 4b
"""

result = defaultdict(list)
current_key = None
for line in data.splitlines():
    if not line: continue # Filter out blank lines
    # If the line is not indented then it is a key
    # Save it and move on
    if not line[0].isspace():
        current_key = line.strip()
        continue
    # Otherwise, add the value
    # (minus leading and trailing whitespace)
    # to our results
    result[current_key].append(line.strip())
# result is now a defaultdict (repr as shown by Python 3)
defaultdict(<class 'list'>,
            {'Key 1': ['Value 1'],
             'Key 2': ['Value 2'],
             'Key 3': ['Value 3a', 'Value 3b', 'Value 3c'],
             'Key 4': ['Value 4a', 'Value 4b']})
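For comparison, the regular-expression alternative mentioned above might look roughly like this. It is a sketch, not part of the original answer: it assumes Python 3 (for the starred unpacking), and it assumes every value line starts with exactly "\n\t" as in the question's string. The names `blocks`, `key` and `values` are illustrative.

```python
import re

# The string from the question: values follow '\n\t', keys follow '\n'.
data = ("Key 1\n\tValue 1\nKey 2\n\tValue 2\n"
        "Key 3\n\tValue 3a \n\tValue 3b \n\tValue 3c\n"
        "Key 4\n\tValue 4a \n\tValue 4b")

# Split only on newlines that are NOT followed by a tab, so each block
# holds one key together with its indented values.
blocks = re.split(r'\n(?!\t)', data)

result = {}
for block in blocks:
    # The first piece is the key; everything after it is a value.
    key, *values = block.split('\n\t')
    result[key.strip()] = [value.strip() for value in values]
```

The negative lookahead `(?!\t)` is what keeps the '\n' inside '\n\t' from being treated as a key separator. This version builds a plain dict directly, at the cost of being tied to tab indentation, whereas the isspace check accepts any leading whitespace.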