Today I needed Chinese word vectors. Searching for the keyword sgns.financial.word.bz2 turned up very few tutorials, so here is a simple one for reference.
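import gensim
# Path to the downloaded vectors; gensim reads the .bz2 archive directly, so there is no need to decompress it
fd = '../data/sgns.financial.word.bz2'
# The file is in word2vec text format, so the default binary=False applies
model = gensim.models.KeyedVectors.load_word2vec_format(fd, encoding="utf-8")
# gensim >= 4.0 renamed index2entity to index_to_key; on gensim 3.x use model.index2entity
vocab = model.index_to_key
# Show the first five words in the vocabulary
print(vocab[:5])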
The word vectors can be downloaded from: https://github.com/Embedding/Chinese-Word-Vectors
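If gensim is not available, the same kind of file can be read with the standard library alone. The sketch below assumes the archive follows the usual word2vec text layout (a header line with the vocabulary size and dimension, then one word and its values per line, separated by spaces). The helper names `read_word2vec_text`, `cosine`, and `make_demo_file` are hypothetical, and the three 2-dimensional vectors are made-up demo data written to a temporary file, not values from sgns.financial.word.bz2.

```python
import bz2
import math
import os
import tempfile


def read_word2vec_text(path, limit=None):
    """Read a bz2-compressed word2vec text file into a {word: vector} dict."""
    vectors = {}
    with bz2.open(path, "rt", encoding="utf-8") as f:
        # Header line: "<vocab_size> <dimension>"
        vocab_size, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip lines that do not match the declared dimension
            vectors[parts[0]] = [float(x) for x in parts[1:]]
            if limit is not None and len(vectors) >= limit:
                break  # stop early to keep memory use small
    return vectors, dim


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def make_demo_file():
    """Write a tiny made-up file in the assumed layout (demo data only)."""
    handle, path = tempfile.mkstemp(suffix=".bz2")
    os.close(handle)
    with bz2.open(path, "wt", encoding="utf-8") as f:
        f.write("3 2\n")
        f.write("银行 1.0 0.0\n")
        f.write("贷款 0.8 0.6\n")
        f.write("天气 0.0 1.0\n")
    return path


demo_path = make_demo_file()
demo_vectors, demo_dim = read_word2vec_text(demo_path)
os.remove(demo_path)

print(demo_dim, list(demo_vectors))
print(round(cosine(demo_vectors["银行"], demo_vectors["贷款"]), 3))
```

To try it on the real file, pass '../data/sgns.financial.word.bz2' to `read_word2vec_text`, ideally with a small `limit` first, since plain Python lists use far more memory than gensim's arrays.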