Word Vectors Explained (1)

In NLP we want to represent each word as a vector. There are several families of methods for doing this.

1 one-hot Vector

Represent every word as a |V|×1 vector of all 0s, with a single 1 at the index of that word in the sorted vocabulary, where V is the vocabulary.
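A minimal sketch of one-hot vectors (the toy vocabulary here is made up for illustration):

```python
# Toy vocabulary; in practice V is the full sorted vocabulary of the corpus.
vocab = sorted({"cat", "dog", "sat", "the"})
word_to_index = {w: i for i, w in enumerate(vocab)}  # word -> position in V

def one_hot(word):
    """Return the |V|-dimensional one-hot vector for `word`."""
    vec = [0] * len(vocab)
    vec[word_to_index[word]] = 1
    return vec
```

Every vector has exactly one nonzero entry, so no two word vectors share any information — one reason this representation is rarely used directly.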

2 SVD Based Methods

2.1 Window based Co-occurrence Matrix

Represent a word by means of its neighbors: count the number of times each word appears inside a window of a fixed size around the word of interest.

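As a concrete sketch, here is window-based co-occurrence counting on a made-up toy corpus (the sentences and window size are assumptions for illustration):

```python
from collections import defaultdict

# Toy corpus of tokenized sentences (illustrative only).
corpus = [["i", "like", "deep", "learning"],
          ["i", "like", "nlp"],
          ["i", "enjoy", "flying"]]
window = 1  # symmetric window of 1 word on each side

counts = defaultdict(int)
for sentence in corpus:
    for i, word in enumerate(sentence):
        # Count every neighbor within the window (excluding the word itself).
        for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
            if j != i:
                counts[(word, sentence[j])] += 1
```

The resulting `counts` dictionary holds the entries of the |V|×|V| co-occurrence matrix; note it is symmetric since the window is symmetric.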

The resulting matrix is very large and sparse, so we reduce its dimensionality with SVD:

  1. Generate a |V|×|V| co-occurrence matrix X.
  2. Apply SVD on X to get X = U S V^T.

    • Select the first k columns of U to get k-dimensional word vectors.
    • The ratio (σ_1 + ... + σ_k) / (σ_1 + ... + σ_|V|) indicates the amount of variance captured by the first k dimensions.
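The two steps above can be sketched with NumPy (the matrix X and the choice k = 2 are made-up examples):

```python
import numpy as np

# Toy symmetric |V| x |V| co-occurrence matrix (illustrative values).
X = np.array([[0., 2., 1.],
              [2., 0., 1.],
              [1., 1., 0.]])

U, S, Vt = np.linalg.svd(X)     # X = U S V^T; S holds singular values sigma_i

k = 2
word_vectors = U[:, :k]         # first k columns of U: k-dim word vectors

# Fraction of variance captured by the first k singular values.
variance_captured = S[:k].sum() / S.sum()
```

`numpy.linalg.svd` returns the singular values in descending order, so taking the first k columns of U keeps the directions that capture the most variance.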

2.2 Shortcomings

SVD-based methods do not scale well to large matrices, and it is hard to incorporate new words or documents. The computational cost of SVD for an m×n matrix is O(mn²).

3 Iteration Based Methods - Word2Vec

3.1 Language Models (Unigrams, Bigrams, etc.)

We need a model that assigns a probability to a sequence of tokens.

For example
* The cat jumped over the puddle. —high probability
* Stock boil fish is toy. —low probability

Unigrams:
We can take the unigram language model approach and break apart this probability by assuming the word occurrences are completely independent:

P(w1, w2, ..., wn) = P(w1) · P(w2) · ... · P(wn)
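A minimal sketch of this independence assumption, estimating unigram probabilities from a made-up six-token corpus:

```python
from collections import Counter

# Toy training corpus (illustrative only).
tokens = "the cat jumped over the puddle".split()
unigram_counts = Counter(tokens)
total = sum(unigram_counts.values())

def unigram_prob(sentence):
    """P(w1..wn) as a product of independent unigram probabilities."""
    p = 1.0
    for w in sentence.split():
        p *= unigram_counts[w] / total
    return p
```

Since "the" occurs 2 times out of 6 tokens, P("the") = 2/6; the probability of a whole sentence is just the product of such terms, which is exactly why this model cannot capture word order.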