NLP applications:
1) machine translation : automatically translate text from one language to another
2) information extraction : given text, produce a structured database representation
3) text summarization : given multiple text sources, generate a summary that captures the main ideas
4) dialogue system : interact with a computer to get answers
Tagging:
1) part-of-speech tagging : produce a sequence of tags (adjective, preposition, ...) for the words
2) named entity recognition : decide which entity class (person, location, ...) each word belongs to
Parsing:
map the input sentence to a parse tree that represents its hierarchical structure
Basic language modeling problem:
finite vocabulary set V = {....}
sentence s whose words come from V
sum over all sentences of p(s) = 1; learn p. More plausible sentences are assigned higher probability
--Naive model:
p(s) = count of sentence s / total count of sentences -> does not generalize (any unseen sentence gets probability 0)
--Trigram model:
based on a second-order Markov assumption
P(X1 = x1, X2 = x2, ..., Xn = xn) = product over i=1~n { P(Xi = xi | Xi-1 = xi-1, Xi-2 = xi-2) }
where x0 = x-1 = * (a special start symbol) and xn = STOP
Each random variable (word, including STOP) in the chain is conditioned only on the previous two variables (words).
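The product above can be sketched in Python; `trigram_prob` is a hypothetical function returning P(Xi = xi | Xi-1 = xi-1, Xi-2 = xi-2):

```python
def sentence_prob(words, trigram_prob):
    """P(x1..xn) under the second-order Markov assumption.

    words: sentence as a list of tokens (without * or STOP).
    trigram_prob(w_prev2, w_prev1, w): conditional probability of w
    given the two preceding words (assumed to be supplied elsewhere).
    """
    # pad with the start symbols and append STOP, as in the model
    padded = ["*", "*"] + list(words) + ["STOP"]
    p = 1.0
    for i in range(2, len(padded)):
        p *= trigram_prob(padded[i - 2], padded[i - 1], padded[i])
    return p
```

With a dummy model that assigns 0.5 to every trigram, a two-word sentence has three factors (two words plus STOP), giving 0.5^3 = 0.125.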
--How to get P(Xi = xi | Xi-1 = xi-1, Xi-2 = xi-2 ) ?
use a weighted combination of three maximum likelihood estimates: linear interpolation
trigram: est3 = count(Xi-2 = xi-2, Xi-1 = xi-1, Xi = xi) / count(Xi-2 = xi-2, Xi-1 = xi-1)
bigram: est2 = count(Xi-1 = xi-1, Xi = xi) / count(Xi-1 = xi-1)
unigram: est1 = count(Xi = xi) / count()   Note: count() = total number of tokens, including STOP
P(Xi = xi | Xi-1 = xi-1, Xi-2 = xi-2) = a3*est3 + a2*est2 + a1*est1, where a1 + a2 + a3 = 1
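A minimal sketch of the three ML estimates and their interpolation, computed from raw counts (the function and variable names are illustrative, not from the notes):

```python
from collections import Counter

def make_interpolated_prob(corpus, a1, a2, a3):
    """Linearly interpolated trigram estimate from raw counts.

    corpus: list of sentences, each a list of words; a1 + a2 + a3 == 1.
    Returns prob(w2, w1, w) = a3*est3 + a2*est2 + a1*est1.
    """
    uni, bi, tri = Counter(), Counter(), Counter()
    bi_ctx, tri_ctx = Counter(), Counter()
    total = 0  # count(): total tokens, including STOP
    for s in corpus:
        padded = ["*", "*"] + s + ["STOP"]
        for i in range(2, len(padded)):
            w2, w1, w = padded[i - 2], padded[i - 1], padded[i]
            total += 1
            uni[w] += 1
            bi[(w1, w)] += 1
            bi_ctx[w1] += 1
            tri[(w2, w1, w)] += 1
            tri_ctx[(w2, w1)] += 1

    def prob(w2, w1, w):
        est1 = uni[w] / total
        est2 = bi[(w1, w)] / bi_ctx[w1] if bi_ctx[w1] else 0.0
        est3 = tri[(w2, w1, w)] / tri_ctx[(w2, w1)] if tri_ctx[(w2, w1)] else 0.0
        return a3 * est3 + a2 * est2 + a1 * est1

    return prob
```

Note that even when est3 is zero (unseen trigram), the bigram and unigram terms keep the probability nonzero, which is the point of interpolating.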
--How to estimate a1, a2, a3? turn it into an optimization problem
pick validation data; let c'(w1, w2, w3) be the count of trigram (w1, w2, w3) there; choose a1, a2, a3 to maximize L(a1, a2, a3)
L(a1, a2, a3) = sum over trigrams of c'(w1, w2, w3) * log P(w3 | w1, w2)
where P(w3 | w1, w2) = a3*est3 + a2*est2 + a1*est1 and a1 + a2 + a3 = 1
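One simple way to carry out the maximization (not the only one) is a coarse grid search over the constrained simplex a1 + a2 + a3 = 1. This sketch assumes est3, est2, est1 are functions giving the ML estimates from training data:

```python
import itertools
import math

def pick_lambdas(valid_trigram_counts, est3, est2, est1):
    """Grid-search a1, a2, a3 (with a1 + a2 + a3 = 1) to maximize
    L = sum over trigrams of c'(u, v, w) * log P(w | u, v).

    valid_trigram_counts: dict mapping (u, v, w) -> count c' in validation data.
    est3/est2/est1: functions (u, v, w) -> ML estimate (assumed given).
    """
    grid = [i / 10 for i in range(11)]  # step 0.1
    best, best_L = None, float("-inf")
    for a1, a2 in itertools.product(grid, repeat=2):
        a3 = round(1 - a1 - a2, 10)
        if a3 < 0:
            continue
        L, ok = 0.0, True
        for (u, v, w), c in valid_trigram_counts.items():
            p = a3 * est3(u, v, w) + a2 * est2(u, v, w) + a1 * est1(u, v, w)
            if p <= 0:
                ok = False
                break
            L += c * math.log(p)
        if ok and L > best_L:
            best_L, best = L, (a1, a2, a3)
    return best
```

In practice the weights can also be fit with EM or a finer search; the grid just makes the objective concrete.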
Evaluate model : perplexity (lower is better)
l = (1/M) * sum over sentences of log2 p(s), M = total word count (including STOP) in the sentence set
(this can be viewed as the average log probability the model assigns per word)
if the model is uniform over the vocabulary, l = log2(1/N), N = size of the vocabulary set, so perplexity = N
perplexity = 2^-l
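Perplexity follows directly from the per-sentence probabilities; a minimal sketch:

```python
import math

def perplexity(sentence_probs, M):
    """perplexity = 2^(-l), l = (1/M) * sum_i log2 p(s_i).

    sentence_probs: probability of each sentence under the model.
    M: total word count (including STOP) across all the sentences.
    """
    l = sum(math.log2(p) for p in sentence_probs) / M
    return 2 ** (-l)
```

Sanity check: a uniform model over N = 4 words gives a 3-word sentence probability (1/4)^3 = 1/64, so l = -2 and perplexity = 4 = N, matching the note above.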