1. Background
- Word representation: a process that transforms word symbols into machine-understandable meanings
- Definition of meaning (Webster Dictionary)
  - The thing one intends to convey
  - The logical extension of a word
- How to represent meaning so that machines can understand it?
  - Compute word similarity
    - WR(Star) ≈ WR(Sun)
    - WR(Motel) ≈ WR(Hotel)
  - Infer word relations (semantic relations), as in the vector-arithmetic sketch after this list
    - WR(China) - WR(Beijing) ≈ WR(Japan) - WR(Tokyo)
    - WR(Man) ≈ WR(King) - WR(Queen) + WR(Woman)
    - WR(Swimming) ≈ WR(Walking) - WR(Walk) + WR(Swim)
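Such analogy relations can be checked with simple vector arithmetic once each word has a vector. A minimal sketch with hand-picked toy vectors (real representations would be learned from a corpus, e.g., by Word2Vec, introduced later):

```python
import numpy as np

# Toy 3-d vectors chosen by hand for illustration only; real word
# representations would be learned from large-scale text.
WR = {
    "King":  np.array([0.9, 0.8, 0.1]),
    "Queen": np.array([0.9, 0.1, 0.8]),
    "Man":   np.array([0.1, 0.8, 0.1]),
    "Woman": np.array([0.1, 0.1, 0.8]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Analogy check: WR(King) - WR(Queen) + WR(Woman) should land near WR(Man)
predicted = WR["King"] - WR["Queen"] + WR["Woman"]
print(cosine(predicted, WR["Man"]))   # close to 1.0 for these toy vectors
```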
2. One-Hot Representation
2.1. Definition
- Regard words as discrete symbols
- Word ID or one-hot representation
  - E.g., the i-th word in the vocabulary is represented by a vector with a 1 at position i and 0s elsewhere
- Vector dimension = # words in the vocabulary
- The order of words in the vocabulary is not important
2.2. Problems of One-Hot Representation
- similarity(star, sun) = V_star · V_sun = 0 (see the sketch below)
- All the vectors are orthogonal, so there is no natural notion of similarity for one-hot vectors
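A minimal sketch of the one-hot representation and its similarity problem, using a made-up four-word vocabulary:

```python
import numpy as np

vocab = ["star", "sun", "motel", "hotel"]   # toy vocabulary
V = len(vocab)                              # vector dimension = # words in the vocabulary

def one_hot(word):
    vec = np.zeros(V)
    vec[vocab.index(word)] = 1.0            # a single 1 at the word's index
    return vec

v_star, v_sun = one_hot("star"), one_hot("sun")
print(v_star @ v_sun)   # 0.0 -- any two different one-hot vectors are orthogonal
```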
2.3. Represent Words by Context
- The meaning of a word is given by the words that frequently appear close by
  - "You shall know a word by the company it keeps." (J. R. Firth, 1957: 11)
  - One of the most successful ideas of modern statistical NLP
- Use context words to represent a word, e.g., stars
- Two ways to build such context-based representations: co-occurrence counts and word embeddings
  - Term-Term matrix: how often a word occurs with another word
  - Term-Document matrix: how often a word occurs in a document
E.g., term-term co-occurrence counts for the word stars:

| | shining | bright | trees | dark | look |
|---|---|---|---|---|---|
| stars | 38 | 45 | 2 | 27 | 12 |
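A minimal sketch of how such term-term counts could be collected; the two-sentence corpus and window size below are made up for illustration, and the real matrix would be built from a much larger corpus:

```python
from collections import defaultdict

# Count how often each pair of words occurs within a fixed-size window.
corpus = [
    "bright stars are shining in the dark sky".split(),
    "we look at the bright stars near the trees".split(),
]
window = 2

counts = defaultdict(int)
for sentence in corpus:
    for i, word in enumerate(sentence):
        for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
            if i != j:
                counts[(word, sentence[j])] += 1

print(counts[("stars", "bright")])   # co-occurrence count of "stars" with "bright"
```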
2.4. Problems of Count-Based Representation
- Increase in size with the vocabulary
- Require a lot of storage
- Sparsity issues for less frequent words
  - Subsequent classification models will be less robust
2.5. Word Embedding
- Distributed representation
  - Build a dense vector for each word, learned from large-scale text corpora (a toy sketch follows this list)
  - Learning method: Word2Vec
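A toy sketch of what a distributed representation looks like as a data structure: one dense, low-dimensional vector per word. The vectors here are randomly initialized purely for illustration; Word2Vec (next section) would learn them from text:

```python
import numpy as np

vocab = ["star", "sun", "motel", "hotel"]
dim = 8   # dense dimension; in practice a few hundred, far smaller than |V|

# Randomly initialized for illustration only; real values are learned from a corpus.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), dim))

def embed(word):
    return embeddings[vocab.index(word)]   # dense vector lookup

print(embed("star").shape)   # (8,) -- every word gets a compact dense vector
```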
3. Word2Vec
- Word2Vec uses shallow neural networks that associate words with distributed representations
- It can capture many linguistic regularities, such as the analogy relations shown in the Background section
- Word2Vec uses a sliding window of a fixed size moving along a sentence
- In each window, the middle word is the target word and the other words are the context words
  - Given the context words, CBOW predicts the probability of the target word
  - Given a target word, skip-gram predicts the probabilities of the context words (both variants are sketched below)
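A minimal sketch of both variants using the gensim library (assumed to be installed, 4.x API); the two-sentence corpus is made up and far too small to produce meaningful vectors:

```python
from gensim.models import Word2Vec   # assumes gensim 4.x is available

sentences = [
    "never too late to learn".split(),
    "you shall know a word by the company it keeps".split(),
]

# sg=0 -> CBOW (predict target from context), sg=1 -> skip-gram (predict context from target)
cbow     = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["learn"].shape)                     # a 50-dimensional dense vector
print(skipgram.wv.most_similar("learn", topn=3))  # nearest words under the toy model
```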
3.1. CBOW & Skip-Gram
- In the CBOW architecture, the model predicts the target word given a window of surrounding context words
- According to the bag-of-words assumption, the order of the context words does not influence the prediction
- Example (see the sketch after this list): suppose the window size is 5
  - Sentence: "Never too late to learn"
  - P(late | [never, too, to, learn]), ...
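A small sketch that generates the (context, target) pairs a CBOW model would be trained on for the example sentence above, with a window size of 5 (two context words on each side of the target):

```python
# Slide a window of size 5 along the sentence; the middle word is the target,
# the remaining words in the window are the context.
sentence = "never too late to learn".split()
window = 5
half = window // 2

pairs = []
for i, target in enumerate(sentence):
    context = [sentence[j]
               for j in range(max(0, i - half), min(len(sentence), i + half + 1))
               if j != i]
    pairs.append((context, target))

for context, target in pairs:
    print(f"P({target} | {context})")
# e.g. P(late | ['never', 'too', 'to', 'learn']) for the middle word
```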
3.2. Problems of Full Softmax
- When the vocabulary size is very large:
  - Computing the softmax over all words at every step depends on a huge number of model parameters, which is computationally impractical (see the sketch below)
  - We need to improve the computational efficiency
  - Two common remedies: negative sampling and hierarchical softmax
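A rough sketch of why the full softmax is expensive: every training step scores every word in the vocabulary. The sizes below are made up but typical:

```python
import numpy as np

V = 50_000                                  # vocabulary size (illustrative)
hidden = np.random.randn(300)               # 300-d representation of the context
output_weights = np.random.randn(V, 300)    # one output vector per vocabulary word

# A full softmax scores *every* word in the vocabulary at every step,
# and a gradient step would touch all V x 300 output weights.
logits = output_weights @ hidden
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.shape)   # (50000,) -- this per-step cost is why full softmax is impractical
```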
3.3. Improving Computational Efficiency
In fact, we do not need a full probabilistic model in word2vec.
There are two main improvement methods for word2vec:
- Negative sampling
  - As discussed before, the vocabulary is very large, which means the model has a tremendous number of weights that would need to be updated at every step
  - The idea of negative sampling is to update only a small percentage of the weights at every step
  - Since we have the vocabulary and know the context words, we can select a few words that are not in the context word list according to a probability distribution (see the sketch at the end of this section)
- Hierarchical softmax
  - Organizes the vocabulary as a tree so that computing a word's probability needs only a logarithmic number of operations instead of scoring the whole vocabulary
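To make the negative-sampling idea above concrete, a minimal sketch that picks a few negative words outside the context word list. Negatives are sampled uniformly here for simplicity; actual Word2Vec implementations typically use a smoothed unigram distribution, which is not covered above:

```python
import numpy as np

vocab = ["never", "too", "late", "to", "learn", "star", "sun", "hotel"]
rng = np.random.default_rng(0)

def sample_negatives(context_words, k=3):
    # Choose k words that are NOT in the context word list.
    candidates = [w for w in vocab if w not in context_words]
    return list(rng.choice(candidates, size=k, replace=False))

context = ["never", "too", "to", "learn"]   # context words for the target "late"
negatives = sample_negatives(context, k=3)
print(negatives)
# Only the target word, these few negatives (and the context) have their weights
# updated in this step, instead of the whole vocabulary.
```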