Getting started
- System and package versions
- Ubuntu 16.04
- Python 3.5
- NumPy 1.14.0
- SciPy 1.1.0
- gensim 3.4.0
- jieba 0.39
Install Gensim
pip3 install gensim
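After installing, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the original demo: the helper name `installed_versions` is hypothetical, and it assumes each package exposes a `__version__` attribute (packages that do not are reported as "unknown").

```python
import importlib


def installed_versions(names):
    """Map each package name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed in this environment
            versions[name] = None
        else:
            # Fall back to "unknown" if the package has no __version__ attribute
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = installed_versions(["numpy", "scipy", "gensim", "jieba"])
for name, version in versions.items():
    print(name, version if version is not None else "not installed")
```

If any package is reported as not installed, install it with pip3 before running the demo.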
Demo
- English
import gensim
from gensim.models import Word2Vec
# define training data
sentences = [['this', 'is', 'first', 'sentence', 'for', 'word2vec'],
             ['this', 'is', 'the', 'second', 'sentence'],
             ['yet', 'another', 'sentence'],
             ['one', 'more', 'sentence'],
             ['and', 'the', 'final', 'sentence']]
# train model: 2 approaches
print('----------------------------------------------')
# approach 1: using Word2Vec directly
model = Word2Vec(sentences, min_count=1, size=100)
# approach 2: 3 steps
model = Word2Vec(size=100, min_count=1)  # create an untrained model with an empty vocabulary
model.build_vocab(sentences)  # build the vocabulary from the training data
model.train(sentences, total_examples=model.corpus_count, epochs=model.epochs)  # train the model
print('------------------------