Step 1: Download Google's pretrained word2vec vectors
Download the file from the URL below. It is about 1.5 GB.
https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing
Step 2: Decompress the file and load the word vectors
- Decompress (in a terminal):
gzip -d GoogleNews-vectors-negative300.bin.gz
- Load:
Note: each word vector has 300 dimensions.
import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('./GoogleNews-vectors-negative300.bin', binary=True)
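If the gzip command is not available (for example on Windows), the decompression step can be done from Python with the standard library. The sketch below also reads the first line of the .bin file, which in the word2vec binary format is a text header of the form "vocab_size vector_size", so the vector length can be checked before loading the whole model into memory. The function names decompress_gz and read_word2vec_header are hypothetical helpers made up for this illustration; they are not part of gensim.

```python
import gzip
import shutil


def decompress_gz(src_path, dst_path):
    """Stream-decompress a .gz file to dst_path in chunks.

    Unlike `gzip -d`, this leaves the original .gz file in place.
    """
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def read_word2vec_header(bin_path):
    """Return (vocab_size, vector_size) from a word2vec binary file.

    Assumes the file starts with an ASCII header line holding two
    integers separated by a space, as in the word2vec binary format.
    """
    with open(bin_path, "rb") as f:
        header = f.readline()
    vocab_size, vector_size = (int(x) for x in header.split())
    return vocab_size, vector_size
```

For example, call decompress_gz('GoogleNews-vectors-negative300.bin.gz', 'GoogleNews-vectors-negative300.bin') and then read_word2vec_header('GoogleNews-vectors-negative300.bin'). The second value returned should match the vector length of 300 noted above.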
Reference:
https://mccormickml.com/2016/04/12/googles-pretrained-word2vec-model-in-python/