Deep Learning Course 5, Week 2: Natural Language Processing & Word Embeddings Quiz Review

Natural Language Processing & Word Embeddings

  1. True/False: Suppose you learn a word embedding for a vocabulary of 20000 words. Then the embedding vectors could be 1000 dimensional, so as to capture the full range of variation and meaning in those words.
  • False
  • True

Explanation: the dimension of word vectors is usually much smaller than the size of the vocabulary; the most common sizes range between 50 and 1000.

  2. True/False: t-SNE is a linear transformation that allows us to solve analogies on word vectors.
  • False
  • True

Explanation: t-SNE is a non-linear dimensionality reduction technique, not a linear transformation, and it is not used to solve analogies.
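As a quick illustration of that point, here is a minimal sketch (not part of the quiz) that projects some stand-in word vectors to 2-D with scikit-learn's TSNE; the vectors and parameters are arbitrary placeholders.

```python
# Minimal sketch: t-SNE is a non-linear dimensionality reduction, typically used
# to visualize embeddings in 2-D. The 300-dimensional vectors below are random
# stand-ins, not real word embeddings.
import numpy as np
from sklearn.manifold import TSNE

embeddings = np.random.rand(100, 300)          # pretend these are 100 word vectors
coords_2d = TSNE(n_components=2, perplexity=30, init="random").fit_transform(embeddings)
print(coords_2d.shape)                         # (100, 2) -- points you could scatter-plot
```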

  3. Suppose you download a pre-trained word embedding which has been trained on a huge corpus of text. You then use this word embedding to train an RNN for a language task of recognizing if someone is happy from a short snippet of text, using a small training set.
    [Figure: an RNN that reads the word embeddings of the input sentence and outputs a label y indicating whether the writer is happy]
    Then even if the word “ecstatic” does not appear in your small training set, your RNN might reasonably be expected to recognize “I’m ecstatic” as deserving a label $y = 1$.
  • False
  • True

Explanation: word vectors give your model a strong ability to generalize. The vector for “ecstatic” carries a positive/happy connotation, which will probably lead your model to classify the sentence as a “1” (stronger generalization).
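To make that generalization argument concrete, here is a minimal sketch with hand-made, purely hypothetical vectors: because a pretrained embedding places “ecstatic” near “happy”, a classifier that only ever saw “happy” during training still receives a very similar input for “ecstatic”.

```python
# Minimal sketch of the generalization argument, with made-up 4-dimensional vectors
# (illustration only, not a real pretrained embedding).
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

emb = {
    "happy":    np.array([0.9, 0.1, 0.0, 0.2]),
    "ecstatic": np.array([0.8, 0.2, 0.1, 0.3]),
    "broken":   np.array([-0.7, 0.6, 0.1, 0.0]),
}
print(cosine(emb["happy"], emb["ecstatic"]))   # high similarity -> likely labeled 1
print(cosine(emb["happy"], emb["broken"]))     # low / negative similarity
```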

  4. Which of these equations do you think should hold for a good word embedding? (Check all that apply)
  • $e_{man} - e_{uncle} \approx e_{woman} - e_{aunt}$
  • $e_{man} - e_{woman} \approx e_{uncle} - e_{aunt}$
  • $e_{man} - e_{woman} \approx e_{aunt} - e_{uncle}$
  • $e_{man} - e_{aunt} \approx e_{woman} - e_{uncle}$
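A minimal numeric sketch of the relation the correct options rely on (the vectors below are hand-picked toy values, not real embeddings): the man/woman difference vector should roughly equal the uncle/aunt difference vector.

```python
# Minimal sketch of the analogy test: e_man - e_woman ≈ e_uncle - e_aunt, i.e. the
# "gender direction" is roughly the same for both word pairs. Toy 3-D vectors only.
import numpy as np

e = {
    "man":   np.array([1.0, 0.2, 0.1]),
    "woman": np.array([0.2, 1.0, 0.1]),
    "uncle": np.array([1.1, 0.3, 0.9]),
    "aunt":  np.array([0.3, 1.1, 0.9]),
}
diff_1 = e["man"] - e["woman"]     # gender direction from the first pair
diff_2 = e["uncle"] - e["aunt"]    # gender direction from the second pair
print(np.allclose(diff_1, diff_2, atol=0.1))   # True for these toy vectors
```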
  5. Let $A$ be an embedding matrix, and let $o_{4567}$ be a one-hot vector corresponding to word 4567. Then to get the embedding of word 4567, why don’t we call $A * o_{4567}$ in Python?
  • It is computationally wasteful.
  • The correct formula is $A^T * o_{4567}$
  • None of the answers are correct: calling the Python snippet as described above is fine.
  • This doesn’t handle unknown words (<UNK>).

Explanation: the matrix-vector product is extremely inefficient, since almost every multiplication is by zero; in practice we simply look up the column of $A$ corresponding to word 4567.
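A minimal NumPy sketch of that point, assuming the course's (embedding_dim, vocab_size) layout for the embedding matrix: the matrix-vector product and the column lookup return the same vector, but the lookup skips all the multiplications by zero.

```python
# Minimal sketch of why A * o_4567 is computationally wasteful.
import numpy as np

vocab_size, emb_dim = 10000, 300
A = np.random.rand(emb_dim, vocab_size)        # embedding matrix (assumed layout)
o_4567 = np.zeros(vocab_size)
o_4567[4567] = 1.0                             # one-hot vector for word 4567

via_matmul = A @ o_4567                        # O(emb_dim * vocab_size) work
via_lookup = A[:, 4567]                        # O(emb_dim) work -- what we do in practice
print(np.allclose(via_matmul, via_lookup))     # True: same vector, far less work
```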

  6. When learning word embeddings, words are automatically generated along with the surrounding words.
  • True
  • False

Explanation: we pick a given word and try to predict its surrounding words, or vice versa.
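A minimal sketch of that sampling idea (the sentence, window size, and function name are illustrative choices, not the course's code): pick a word at random, then pick a target within a small window around it, skip-gram style.

```python
# Minimal sketch of sampling (context, target) pairs from text for embedding learning.
import random

sentence = "i want a glass of orange juice to go along with my cereal".split()
window = 2

def sample_pair(words, window_size):
    i = random.randrange(len(words))                       # pick a context word
    offsets = [o for o in range(-window_size, window_size + 1)
               if o != 0 and 0 <= i + o < len(words)]
    j = i + random.choice(offsets)                         # pick a nearby target word
    return words[i], words[j]

for _ in range(3):
    print(sample_pair(sentence, window))                   # e.g. ('orange', 'juice')
```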

  7. In the word2vec algorithm, you estimate $P(t \mid c)$, where $t$ is the target word and $c$ is a context word. How are $t$ and $c$ chosen from the training set? Pick the best answer.
  • $c$ is the sequence of all the words in the sentence before $t$
  • $c$ and $t$ are chosen to be nearby words.
  • $c$ is a sequence of several words immediately before $t$
  • $c$ is the one word that comes immediately before $t$
  8. Suppose you have a 10000 word vocabulary, and are learning 500-dimensional word embeddings. The word2vec model uses the following softmax function:
    $$P(t \mid c) = \frac{e^{\theta_t^T e_c}}{\sum_{t'=1}^{10000} e^{\theta_{t'}^T e_c}}$$
    Which of these statements are correct? Check all that apply.
  • $\theta_t$ and $e_c$ are both trained with an optimization algorithm such as Adam or gradient descent.
  • $\theta_t$ and $e_c$ are both 500 dimensional vectors.
  • After training, we should expect $\theta_t$ to be very close to $e_c$ when $t$ and $c$ are the same word.
  • $\theta_t$ and $e_c$ are both 10000 dimensional vectors.
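A minimal NumPy sketch of the softmax above with random stand-in parameters (the sizes come from the question, everything else is assumed): one 500-dimensional $\theta_t$ vector per target word and one 500-dimensional embedding $e_c$ per context word, both of which would be trained with gradient descent or Adam.

```python
# Minimal sketch of evaluating P(t | c) = exp(theta_t . e_c) / sum_t' exp(theta_t' . e_c)
# with random stand-in parameters.
import numpy as np

vocab_size, emb_dim = 10000, 500
theta = np.random.randn(vocab_size, emb_dim) * 0.01   # theta_t for every target word t
E = np.random.randn(vocab_size, emb_dim) * 0.01       # e_c for every context word c

def p_t_given_c(t, c):
    logits = theta @ E[c]                              # theta_t' . e_c for all t'
    logits -= logits.max()                             # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs[t]

print(p_t_given_c(t=4567, c=1234))                     # close to 1/10000 with random params
```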
  9. Suppose you have a 10000 word vocabulary, and are learning 500-dimensional word embeddings. The GloVe model minimizes this objective:
    $$\min \sum_{i=1}^{10000} \sum_{j=1}^{10000} f(X_{ij}) \left( \theta_i^T e_j + b_i + b_j' - \log X_{ij} \right)^2$$
    Which of these statements are correct? Check all that apply.
  • $\theta_i$ and $e_j$ should be initialized to 0 at the beginning of training.
  • $\theta_i$ and $e_j$ should be initialized randomly at the beginning of training.
  • $X_{ij}$ is the number of times word j appears in the context of word i.
  • Theoretically, the weighting function $f(\cdot)$ must satisfy $f(0) = 0$
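A minimal sketch of evaluating this objective on a tiny random co-occurrence matrix. The weighting function below follows the GloVe paper and is an assumption here; the property the quiz highlights, $f(0) = 0$, is what lets pairs with $X_{ij} = 0$ drop out so $\log X_{ij}$ is never taken at zero. Note also that $\theta_i$ and $e_j$ are initialized randomly, not to zero.

```python
# Minimal sketch of the GloVe objective on a tiny random co-occurrence matrix.
import numpy as np

vocab_size, emb_dim = 50, 10                  # tiny sizes for illustration only
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(vocab_size, vocab_size))   # co-occurrence counts (some zeros)

theta = rng.normal(0, 0.1, (vocab_size, emb_dim))       # random (not zero) initialization
e = rng.normal(0, 0.1, (vocab_size, emb_dim))           # random (not zero) initialization
b, b_prime = np.zeros(vocab_size), np.zeros(vocab_size)

def f(x, x_max=100, alpha=0.75):
    return (min(x, x_max) / x_max) ** alpha              # satisfies f(0) = 0

loss = 0.0
for i in range(vocab_size):
    for j in range(vocab_size):
        if X[i, j] > 0:                                   # terms with f(0) = 0 drop out
            inner = theta[i] @ e[j] + b[i] + b_prime[j] - np.log(X[i, j])
            loss += f(X[i, j]) * inner ** 2
print(loss)
```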
  10. You have trained word embeddings using a text dataset of $t_1$ words. You are considering using these word embeddings for a language task, for which you have a separate labeled dataset of $t_2$ words. Keeping in mind that using word embeddings is a form of transfer learning, under which of these circumstances would you expect the word embeddings to be helpful?
  • When $t_1$ is equal to $t_2$
  • When $t_1$ is smaller than $t_2$
  • When $t_1$ is larger than $t_2$

Explanation: transfer embeddings to new tasks with smaller training sets, i.e. when $t_1$ is larger than $t_2$.

