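embedId = {}    # word -> wordId
embedding = []  # an empty list cannot be indexed directly, but it can be appended to
wordId = 0
# embedFile is the path to the word-vector text file (UTF-8 encoding is assumed here)
with open(embedFile, "r", encoding="utf-8") as file:
    for line in file:
        parts = line.rstrip().split(' ')
        # The first line holds only summary statistics, so it fails this check and is skipped
        if len(parts) > 2:
            # key = word, value = wordId
            embedId[parts[0]] = wordId
            # Row wordId of embedding holds this word's 200-dimensional vector, as floats
            emb = [float(x) for x in parts[1:201]]
            embedding.append(emb)
            print(len(emb))  # sanity check: expect 200
            # Increment only for real vector rows so ids stay aligned with embedding
            wordId += 1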
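A note on the word-vector text file being read: its first line contains summary statistics for the file rather than a word vector. I have only just started learning Python and am not yet fluent in it, so this code still needs improvement.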
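One possible improvement is to read the statistics line explicitly instead of filtering it out by field count. The sketch below is only an illustration: the function name `load_embeddings` is hypothetical, and it assumes the first line has the form `<vocab_size> <dimension>`, which the note above does not actually confirm. The demo data is made up so the block can run on its own.

```python
import os
import tempfile


def load_embeddings(path, encoding="utf-8"):
    """Read a word-vector text file: one header line, then 'word v1 v2 ...' rows."""
    word_to_id = {}
    vectors = []
    with open(path, "r", encoding=encoding) as f:
        header = f.readline().split()
        # Assumption: the header line is "<vocab_size> <dimension>".
        vocab_size, dim = int(header[0]), int(header[1])
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip blank or malformed rows
            # The id is the row index of the vector about to be appended.
            word_to_id[parts[0]] = len(vectors)
            vectors.append([float(x) for x in parts[1:]])
    return word_to_id, vectors, vocab_size, dim


# Tiny demo file: 2 words of dimension 3 (made-up values).
demo_text = "2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(demo_text)
    demo_path = tmp.name

word_to_id, vectors, vocab_size, dim = load_embeddings(demo_path)
os.remove(demo_path)

# Look up a word's vector through its id.
print(vectors[word_to_id["world"]])
```

Taking the dimension from the header avoids hard-coding 200, and using `len(vectors)` as the id keeps the dictionary and the list aligned without a separate counter.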
Reading and processing a text file in Python