https://towardsdatascience.com/an-implementation-guide-to-word2vec-using-numpy-and-google-sheets-13445eebd281
https://www.leiphone.com/news/201812/2o1E1Xh53PAfoXgD.html
Read the two links side by side.
The model implemented is skip-gram.
text = "natural language processing and machine learning is fun and exciting"
# Note the .lower(): upper vs. lowercase does not matter in this implementation
corpus = [[word.lower() for word in text.split()]]
# [['natural', 'language', 'processing', 'and', 'machine', 'learning', 'is', 'fun', 'and', 'exciting']]
Data processing: pack each target word together with its corresponding context words.
The format after processing
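The packing step can be sketched as follows. This is a minimal sketch, not the articles' exact code: it assumes a window size of 2 and represents each word as a one-hot vector, pairing every target word with the one-hot vectors of its neighbors.

```python
import numpy as np

def generate_training_data(corpus, window=2):
    # Build the vocabulary and a word -> index lookup
    vocab = sorted({w for sentence in corpus for w in sentence})
    word2idx = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)

    def one_hot(idx):
        vec = np.zeros(v)
        vec[idx] = 1.0
        return vec

    training_data = []
    for sentence in corpus:
        for i, target in enumerate(sentence):
            # Context = words within `window` positions of the target, excluding itself
            context = [one_hot(word2idx[sentence[j]])
                       for j in range(max(0, i - window),
                                      min(len(sentence), i + window + 1))
                       if j != i]
            training_data.append((one_hot(word2idx[target]), context))
    return training_data, word2idx

text = "natural language processing and machine learning is fun and exciting"
corpus = [[w.lower() for w in text.split()]]
training_data, word2idx = generate_training_data(corpus)
```

Each entry of `training_data` is one (target one-hot vector, list of context one-hot vectors) pair; the sentence above yields 10 pairs over a 9-word vocabulary ("and" repeats).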
Model training
Compute the loss
Update the parameters W1 and W2
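The three steps above can be sketched as one training loop. This is a hedged sketch, not the articles' exact code: the hyperparameters (embedding dimension 10, learning rate 0.01, 50 epochs) and the uniform initialization are assumptions, and `training_data` is taken to be the list of (target one-hot, context one-hots) pairs from the data-processing step.

```python
import numpy as np

def softmax(u):
    e = np.exp(u - np.max(u))  # shift for numerical stability
    return e / e.sum()

def train(training_data, vocab_size, dim=10, lr=0.01, epochs=50):
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-1, 1, (vocab_size, dim))  # input -> hidden weights
    W2 = rng.uniform(-1, 1, (dim, vocab_size))  # hidden -> output weights
    for _ in range(epochs):
        for x, contexts in training_data:
            # Forward pass: with one-hot x, h is just the target word's row of W1
            h = W1.T @ x
            u = W2.T @ h
            y = softmax(u)
            # Prediction error summed over all context words
            e = np.sum([y - c for c in contexts], axis=0)
            # Backpropagate and update both weight matrices
            W2 -= lr * np.outer(h, e)
            W1 -= lr * np.outer(x, W2 @ e)
    return W1, W2
```

After training, the rows of W1 serve as the word embeddings.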
For the details, follow the links given at the top.
Takeaways after reading: as I understand it, word2vec internally is two connected neural-network layers; the loss-function part is still not entirely clear to me.
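On the loss function: a sketch following the usual skip-gram derivation (notation mine, not copied from the articles). For one training pair, the objective is the negative log-likelihood of the C context words given the target; with u_j the j-th output-layer score and j*_c the index of the c-th context word in a vocabulary of size V:

```latex
E = -\log \prod_{c=1}^{C} p(w_c \mid w_t)
  = -\sum_{c=1}^{C} u_{j^{*}_{c}} + C \cdot \log \sum_{j=1}^{V} \exp(u_j)
```

Differentiating E with respect to the output scores gives the error term (y - t) summed over the context words, which is exactly the quantity backpropagated to update W2 and W1.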