You split each line into a list of words, but then give the dictionary only one key and value.

This will work:

    with open('LIWC_words.txt', 'r') as document:
        answer = {}
        for line in document:
            line = line.split()
            if not line:  # empty line?
                continue
            answer[line[0]] = line[1:]

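Note that you don't need to give .split() an argument; without arguments it splits on runs of whitespace and strips the results for you. That saves you from having to call .strip() explicitly.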
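
To illustrate that behavior, here is a small sketch comparing .split() without an argument against an explicit single-space separator. The sample strings are made up for the demonstration and are not taken from the actual file.

```python
# Hypothetical sample line: leading space, a tab, a doubled space, a newline.
sample = " aanbad\t12  13 14\n"

# No argument: split on runs of any whitespace; no empty strings appear.
no_arg = sample.split()

# Explicit ' ' separator: every single space is a split point, so empty
# strings appear and the tab and newline stay inside the pieces.
single_space = sample.split(" ")

# A whitespace-only line splits to an empty list, which is why the
# `if not line` test catches empty lines.
blank = "   \n".split()

print(no_arg)        # ['aanbad', '12', '13', '14']
print(single_space)  # ['', 'aanbad\t12', '', '13', '14\n']
print(blank)         # []
```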

The alternative is to split only on the first whitespace:

    with open('LIWC_words.txt', 'r') as document:
        answer = {}
        for line in document:
            if line.strip():  # non-empty line?
                key, value = line.split(None, 1)  # None means 'all whitespace', the default
                answer[key] = value.split()

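The second argument to .split() limits the number of splits made, guaranteeing that at most 2 elements are returned, which makes it possible to unpack the result into key and value. Note that a line holding a key but no values yields only 1 element, so the unpacking raises a ValueError; the first approach stores an empty list for such a line instead.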
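
As a rough sketch of how the split limit behaves on different inputs, the helper below handles a key-only line as well. The name parse_line is hypothetical, and mapping a key without values to an empty list is an assumption about the desired behavior.

```python
def parse_line(line):
    """Return (key, values) for one line, or None for a blank line.

    Hypothetical helper: a key with no values maps to an empty list,
    which is an assumption, not something the input format dictates.
    """
    parts = line.split(None, 1)  # at most one split, so at most two parts
    if not parts:
        return None
    key = parts[0]
    values = parts[1].split() if len(parts) == 2 else []
    return key, values


print(parse_line("aanbad 12 13 14\n"))  # ('aanbad', ['12', '13', '14'])
print(parse_line("aan\n"))              # ('aan', [])
print(parse_line("\n"))                 # None
```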

Either method results in:

    {'aaien': ['12', '13', '39'],
     'aan': ['10'],
     'aanbad': ['12', '13', '14', '57', '58', '38'],
     'aanbaden': ['12', '13', '14', '57', '58', '38'],
     'aanbeden': ['12', '13', '14', '57', '58', '38'],
     'aanbid': ['12', '13', '14', '57', '58', '39'],
     'aanbidden': ['12', '13', '14', '57', '58', '39'],
     'aanbidt': ['12', '13', '14', '57', '58', '39'],
     'aanblik': ['27', '28'],
     'aanbreken': ['39']}

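If you still see only one key and the rest of the file as the (split) value, your input file may be using a non-standard line separator. On Python 2, open the file with universal line ending support by adding the U character to the mode. On Python 3, text mode already translates line endings by default, and the U flag was removed in Python 3.11, so plain 'r' is enough there.

    with open('LIWC_words.txt', 'rU') as document: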
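
For Python 3, the following minimal sketch shows that the default text mode already copes with bare '\r' line endings. It writes a made-up two-line sample to a temporary file rather than using the real LIWC_words.txt.

```python
import os
import tempfile

# Write a small made-up sample that uses bare '\r' line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"aan 10\raanblik 27 28\r")
    path = tmp.name

try:
    # Python 3 text mode (newline=None by default) translates '\r', '\n'
    # and '\r\n' to '\n' while reading, so no 'U' flag is needed.
    with open(path, "r") as document:
        answer = {}
        for line in document:
            words = line.split()
            if words:  # skip empty lines
                answer[words[0]] = words[1:]
finally:
    os.remove(path)  # clean up the temporary sample file

print(answer)  # {'aan': ['10'], 'aanblik': ['27', '28']}
```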