Continued from the previous post
P16-P17
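Common Chinese word segmentation tools
Word vectorization
Word vectorization maps each word to a vector.
The key goal is that words with similar meanings are mapped to vectors that lie close to each other. (Nearly all of the latest advances in NLP build on word vectorization.)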
The cat jump over the dog
The → [0.0,0.0,0.4,0.0,1.0,0.0]
Cat → [0.5,0.1,0.2,0.0,0.2,0.5]
Jump → [0.4,0.0,0.0,0.6,1.0,0.0]
Over → [0.0,0.3,0.4,0.8,0.0,0.0]
The → [0.0,1.0,0.4,0.7,1.0,0.0]
dog → [0.2,0.0,0.5,0.0,1.3,0.0]
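To make "vectors that lie close to each other" concrete, here is a minimal sketch that measures closeness with cosine similarity, using the toy vectors from the example above. Cosine similarity is one common choice and is an assumption here; the original does not say which distance it has in mind. The names `sentence` and `cosine` are made up for illustration.

```python
import math

# Toy 6-dimensional vectors copied from the example above, in sentence order.
sentence = [
    ("The",  [0.0, 0.0, 0.4, 0.0, 1.0, 0.0]),
    ("Cat",  [0.5, 0.1, 0.2, 0.0, 0.2, 0.5]),
    ("Jump", [0.4, 0.0, 0.0, 0.6, 1.0, 0.0]),
    ("Over", [0.0, 0.3, 0.4, 0.8, 0.0, 0.0]),
    ("The",  [0.0, 1.0, 0.4, 0.7, 1.0, 0.0]),
    ("dog",  [0.2, 0.0, 0.5, 0.0, 1.3, 0.0]),
]

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pick vectors by position in the sentence (Cat, Over, dog).
cat, over, dog = sentence[1][1], sentence[3][1], sentence[5][1]

print("cat vs dog :", round(cosine(cat, dog), 3))
print("cat vs over:", round(cosine(cat, over), 3))
```

With these toy numbers, "Cat" scores higher against "dog" than against "Over", which is the behavior a good word vectorization aims for.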
Methods
✓ Matrix factorization methods (LSA)
✓ Word2Vec
✓ GloVe
✓ ELMo
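As a rough illustration of the first method in the list, the sketch below builds a word-by-document count matrix from a tiny made-up corpus and factorizes it with a truncated SVD, which is the core idea of LSA. The corpus, the choice of k = 2, and all variable names are assumptions for illustration. Real LSA pipelines usually also apply a weighting such as TF-IDF, which is omitted here.

```python
import numpy as np

# Hypothetical toy corpus: two documents about animals, two about finance.
docs = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-document count matrix: rows are words, columns are documents.
counts = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        counts[index[w], j] += 1

# Truncated SVD: keep only the k largest singular values.
k = 2
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
word_vectors = U[:, :k] * S[:k]  # one k-dimensional vector per word

def cosine(a, b):
    """Cosine similarity between two NumPy vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cat, dog, stocks = (word_vectors[index[w]] for w in ("cat", "dog", "stocks"))
print("cat vs dog   :", round(cosine(cat, dog), 3))
print("cat vs stocks:", round(cosine(cat, stocks), 3))
```

Words that occur in the same documents ("cat", "dog") end up pointing in nearly the same direction, while words from unrelated documents ("cat", "stocks") end up nearly orthogonal. Word2Vec, GloVe and ELMo learn their vectors differently and are not shown here.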
To be continued...