1 Training data
2 Preprocessing
jieba https://github.com/fxsjy/jieba
python -m jieba news.txt > cut_result.txt
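The same segmentation step can also be done from Python, which makes the token separator explicit. This matters because word2vec expects whitespace-separated tokens, while the jieba command line joins tokens with " / " unless the -d option is given. The following is a minimal sketch, not the author's script: the helper names segment_lines and segment_file are hypothetical, and UTF-8 input is an assumption. jieba.lcut is the library's list-returning segmentation call.

```python
def segment_lines(lines, cut=None):
    """Yield each non-empty input line as space-separated tokens.

    `cut` is any function mapping a string to a list of tokens.
    When omitted, jieba.lcut is used (requires `pip install jieba`).
    """
    if cut is None:
        import jieba  # third-party; imported lazily so a custom `cut` works without it
        cut = jieba.lcut
    for line in lines:
        # Drop whitespace-only tokens so the output has single-space separators.
        tokens = [t for t in cut(line.strip()) if t.strip()]
        if tokens:
            yield " ".join(tokens)


def segment_file(src_path, dst_path, cut=None):
    """Segment src_path line by line and write the result to dst_path.

    Assumes both files are UTF-8 (an assumption, not stated in the notes).
    """
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for segmented in segment_lines(src, cut):
            dst.write(segmented + "\n")
```

For example, segment_file("news.txt", "cut_result.txt") would produce a file in the shape word2vec reads: one sentence per line, tokens separated by single spaces.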
3 Training
word2vec https://github.com/svn2github/word2vec
./word2vec -train resultbig.txt -output vectors.bin -cbow 0 -size 200 -window 5 -negative 0 -hs 1 -sample 1e-3 -threads 12 -binary 1
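With -binary 1 the tool writes vectors.bin in its binary layout: a text header line "vocab_size dim", then for each word the word bytes, a space, dim 32-bit floats, and a newline. The sketch below reads such a file using only the standard library, as a quick check that training produced the expected vocabulary and the 200-dimensional vectors requested by -size 200. The function name read_word2vec_binary is hypothetical, and little-endian floats and UTF-8 words are assumptions (typical when the file is produced on x86 from UTF-8 text).

```python
import struct


def read_word2vec_binary(path):
    """Read a word2vec binary vector file into a dict {word: [float, ...]}."""
    vectors = {}
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <dim>\n"
        header = f.readline().split()
        vocab_size, dim = int(header[0]), int(header[1])
        for _ in range(vocab_size):
            # The word is terminated by a single space; skip the newline
            # that word2vec writes after the previous entry.
            word = bytearray()
            while True:
                ch = f.read(1)
                if ch == b" " or ch == b"":
                    break
                if ch != b"\n":
                    word.extend(ch)
            data = f.read(4 * dim)
            if len(data) < 4 * dim:
                break  # truncated file: stop rather than raise
            # Assumption: little-endian float32 values.
            values = struct.unpack("<%df" % dim, data)
            vectors[word.decode("utf-8", errors="replace")] = list(values)
    return vectors
```

For example, vectors = read_word2vec_binary("vectors.bin") followed by len(vectors) gives the number of words loaded. For large vocabularies a dedicated loader is likely to be faster and more memory-efficient than this dictionary of Python lists.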