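1. Prepare the corpus in advance, for example fenci.txt (already segmented into words, with tokens separated by whitespace).
2. Download the GloVe package from the official website and edit the parameters in demo.sh to set the corpus, the output files, and so on, as follows: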
CORPUS=fenci.txt                 ## corpus to load
VOCAB_FILE=vocab_sifa.txt        ## word-frequency counts are written to this file
COOCCURRENCE_FILE=cooccurrence.bin
COOCCURRENCE_SHUF_FILE=cooccurrence.shuf.bin
BUILDDIR=build
SAVE_FILE=vectors
VERBOSE=2
MEMORY=4.0
VOCAB_MIN_COUNT=5                ## frequency floor: tokens seen fewer than 5 times are left out of the vocabulary
VECTOR_SIZE=300
MAX_ITER=15                      ## 15 training iterations
WINDOW_SIZE=15
BINARY=2
NUM_THREADS=8
X_MAX=10
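To make the effect of VOCAB_MIN_COUNT concrete, here is a minimal sketch of frequency counting with a lower bound. It is an approximation written with awk and sort, not GloVe's own vocab_count tool; the function name count_vocab and the tiny inline corpus are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of GloVe): count whitespace-separated tokens
# read from stdin and print "token count" for every token whose frequency is
# at least the threshold given as $1, most frequent first.
count_vocab() {
    awk -v min="$1" '
        { for (i = 1; i <= NF; i++) count[$i]++ }    # tally every token on the line
        END {
            for (w in count)
                if (count[w] >= min) print w, count[w]    # apply the frequency floor
        }
    ' | sort -k2,2nr -k1,1    # by count descending, then by token
}

# Tiny inline corpus: "a" occurs 3 times, "b" twice, "c" once.
# With a threshold of 2, "c" is dropped.
printf 'a b a\nb a c\n' | count_vocab 2
```

Running the same function as `count_vocab 5 < fenci.txt` gives a rough preview of which tokens would survive the VOCAB_MIN_COUNT=5 setting before starting a full training run.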