NLP Topic 1 Word Embeddings and Sentence Embeddings

This note explores how to represent the meaning of a word, moving from contextual representation to an introduction of the Word2vec framework, including its objective and prediction functions. Word vectors represent word meaning by capturing relationships between words that appear in similar contexts, addressing the limitations of traditional approaches.


cs224n-2019

  • lecture 1: Introduction and Word Vectors
  • lecture 2: Word Vectors 2 and Word Senses

slp

  • chapter 6: Vector Semantics
  • chapter 14: The Representation of Sentence Meaning

ruder.io/word-embeddings

Language is the medium through which information and knowledge are transmitted;
effective communication presupposes that both parties share a common base of knowledge.

How to represent the meaning of a word?

meaning: signifier (symbol) <=> signified (idea or thing)
common solution: WordNet, a thesaurus containing lists of synonym sets and hypernyms.
Drawbacks: it misses new meanings of words and cannot compute accurate word similarity.
Another approach: representing words as discrete symbols (one-hot vectors), but this suffers from the curse of dimensionality and provides no natural notion of similarity.
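The "no natural notion of similarity" problem can be seen directly: any two distinct one-hot vectors are orthogonal, so their dot product is zero regardless of how related the words are. A minimal sketch (the toy vocabulary is an illustrative assumption):

```python
import numpy as np

vocab = ["hotel", "motel", "cat"]

def one_hot(word, vocab):
    # One-hot: a |V|-dimensional vector with a single 1 at the word's index.
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

hotel = one_hot("hotel", vocab)
motel = one_hot("motel", vocab)

# Distinct one-hot vectors are always orthogonal: similarity is 0
# even for closely related words like "hotel" and "motel".
print(np.dot(hotel, motel))  # prints 0.0
```

With a real vocabulary of hundreds of thousands of words, these vectors are also enormous and sparse, which is the dimensionality problem noted above.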

Representing words by their context

It should learn to encode similarity in the vectors themselves.
The goal of word-vector encoding is to encode word similarity; all of the optimization objectives and practical uses revolve around similarity. As with any encoder, be clear about what the target of the encoding is!
Distributional semantics: a word's meaning is given by the words that frequently appear close by.
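The distributional idea can be sketched with a simple co-occurrence count matrix: each word's row records which context words appear near it, and words with similar contexts get similar rows. A minimal sketch, assuming a toy three-sentence corpus and a window of one word on each side:

```python
import numpy as np

# Toy corpus and window size 1 are illustrative assumptions.
corpus = [
    "i like deep learning".split(),
    "i like nlp".split(),
    "i enjoy flying".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Count how often each word pair co-occurs within the window.
M = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 1), min(len(sent), i + 2)):
            if j != i:
                M[idx[w], idx[sent[j]]] += 1

def cosine(u, v):
    # Cosine similarity between two row vectors.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# "like" and "enjoy" never co-occur, yet they share the context word "i",
# so their rows have positive cosine similarity.
print(cosine(M[idx["like"]], M[idx["enjoy"]]))
```

Word2vec keeps this distributional principle but replaces explicit counting with dense vectors learned by predicting context words.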
