Week 5-5 Statistical parsing

PCFG

Need for PCFG

  • Time flies like an arrow
    • Many parses
    • Some more likely than others
    • Need for a probabilistic ranking method

Definition

Just like a CFG, a PCFG is a 4-tuple (N, Σ, R, S)

  • N: non-terminal symbols
  • Σ: terminal symbols (disjoint from N)
  • R: rules (A → β) [p]
    • β ∈ (Σ ∪ N)*
    • p is the probability P(β | A)
  • S: start symbol (from N)

Rules having the same left-hand side should have probabilities summing to 1.
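As a minimal sketch of this constraint, the check below groups rules by left-hand side and reports any group whose probabilities do not sum to 1. The toy grammar, the `(lhs, rhs, p)` rule encoding, and the name `check_normalized` are assumptions made for illustration, not part of the notes.

```python
from collections import defaultdict
import math

# Hypothetical toy grammar: each rule is (lhs, rhs, probability),
# where rhs is a tuple of terminals and/or non-terminals.
RULES = [
    ("S",   ("NP", "VP"), 1.0),
    ("NP",  ("Det", "N"), 0.5),
    ("NP",  ("N", "N"),   0.2),
    ("NP",  ("time",),    0.3),
    ("VP",  ("V", "PP"),  0.6),
    ("VP",  ("V", "NP"),  0.4),
    ("PP",  ("P", "NP"),  1.0),
    ("N",   ("time",),    0.4),
    ("N",   ("flies",),   0.2),
    ("N",   ("arrow",),   0.4),
    ("V",   ("flies",),   0.5),
    ("V",   ("like",),    0.5),
    ("P",   ("like",),    1.0),
    ("Det", ("an",),      1.0),
]

def check_normalized(rules, tol=1e-9):
    """Return {lhs: total} for each left-hand side whose rule
    probabilities do not sum to 1 (within a small tolerance)."""
    totals = defaultdict(float)
    for lhs, _rhs, p in rules:
        totals[lhs] += p
    return {lhs: total for lhs, total in totals.items()
            if not math.isclose(total, 1.0, abs_tol=tol)}

# An empty dict means every left-hand side is properly normalized.
print(check_normalized(RULES))
```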

Probability of a parse tree

$p(t) = \prod_{i=1}^{n} p(\alpha_i \rightarrow \beta_i)$
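The product over the rules used in a tree can be sketched as a short recursion. The nested-tuple tree encoding, the rule probabilities, and the name `tree_prob` are hypothetical choices made for this example.

```python
# Hypothetical rule probabilities p(alpha -> beta), keyed by (lhs, rhs).
RULE_PROB = {
    ("S",   ("NP", "VP")): 1.0,
    ("NP",  ("time",)):    0.5,
    ("NP",  ("Det", "N")): 0.5,
    ("VP",  ("V", "PP")):  1.0,
    ("PP",  ("P", "NP")):  1.0,
    ("V",   ("flies",)):   1.0,
    ("P",   ("like",)):    1.0,
    ("Det", ("an",)):      1.0,
    ("N",   ("arrow",)):   1.0,
}

def tree_prob(tree, rule_prob):
    """Multiply the probabilities of all rules used in the tree.

    A tree is (label, child, child, ...); a leaf is a plain string.
    """
    if isinstance(tree, str):          # a terminal applies no rule
        return 1.0
    label, *children = tree
    # The right-hand side is the sequence of child labels (or words).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob[(label, rhs)]
    for child in children:
        p *= tree_prob(child, rule_prob)
    return p

TREE = ("S",
        ("NP", "time"),
        ("VP", ("V", "flies"),
               ("PP", ("P", "like"),
                      ("NP", ("Det", "an"), ("N", "arrow")))))

print(tree_prob(TREE, RULE_PROB))  # 0.25 = 0.5 (NP -> time) * 0.5 (NP -> Det N)
```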

Most likely parse tree

$\arg\max_{t \in T(s)} p(t)$

Probability of the sentence

$p(s) = \sum_{i=1}^{n} p(t_i)$

Main tasks for PCFGs

  • Given a grammar G and sentence s, let T(s) be all parse trees that correspond to s

  • Task 1: find the most likely parse tree t in T(s)

  • Task 2: find p(s) as the sum of p(t) over all t in T(s)

Probabilistic parsing methods

  • Probabilistic Earley algorithm
    • Top-down parser with dynamic programming table
  • Probabilistic CKY algorithm
    • Bottom-up parser with a dynamic programming table
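A minimal sketch of probabilistic CKY, assuming a grammar already in Chomsky normal form. The toy grammar and the names `LEXICAL`, `BINARY`, and `pcky` are hypothetical. The same table pass keeps a maximum per cell (most likely tree) and a sum per cell (sentence probability).

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form.
LEXICAL = {   # word -> [(A, p)] for rules A -> word
    "time":  [("NP", 0.3), ("N", 0.4)],
    "flies": [("N", 0.2), ("V", 0.5)],
    "like":  [("V", 0.5), ("P", 1.0)],
    "an":    [("Det", 1.0)],
    "arrow": [("N", 0.4)],
}
BINARY = [    # (A, B, C, p) for rules A -> B C
    ("S",  "NP",  "VP", 1.0),
    ("NP", "Det", "N",  0.5),
    ("NP", "N",   "N",  0.2),
    ("VP", "V",   "PP", 0.6),
    ("VP", "V",   "NP", 0.4),
    ("PP", "P",   "NP", 1.0),
]

def pcky(words, lexical, binary, start="S"):
    """Return (best_tree, best_prob, sentence_prob) for the word list."""
    n = len(words)
    best = defaultdict(dict)                          # best[(i, j)][A] = (prob, tree)
    inside = defaultdict(lambda: defaultdict(float))  # inside[(i, j)][A] = summed prob
    # Fill the diagonal with lexical rules.
    for i, w in enumerate(words):
        for a, p in lexical.get(w, []):
            best[(i, i + 1)][a] = (p, (a, w))
            inside[(i, i + 1)][a] += p
    # Combine adjacent spans bottom-up, shortest spans first.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                 # split point
                for a, b, c, p in binary:
                    if b in best[(i, k)] and c in best[(k, j)]:
                        pb, tb = best[(i, k)][b]
                        pc, tc = best[(k, j)][c]
                        cand = p * pb * pc
                        if cand > best[(i, j)].get(a, (0.0, None))[0]:
                            best[(i, j)][a] = (cand, (a, tb, tc))
                        inside[(i, j)][a] += p * inside[(i, k)][b] * inside[(k, j)][c]
    prob, tree = best[(0, n)].get(start, (0.0, None))
    return tree, prob, inside[(0, n)].get(start, 0.0)

tree, best_p, total_p = pcky("time flies like an arrow".split(), LEXICAL, BINARY)
print(tree)
print(best_p, total_p)
```

Under this toy grammar the sentence has two parses: one with "like" as a preposition (probability about 0.018) and one with "time flies" as a noun compound (about 0.00064), so the sentence probability comes to about 0.01864.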

Probabilistic grammars

  • Probabilities can be learned from training data (a treebank)
  • Possible to do reranking
  • Possible to combine with other stages

MLE

$p_{ML}(\alpha \rightarrow \beta) = \dfrac{Count(\alpha \rightarrow \beta)}{Count(\alpha)}$
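The relative-frequency estimate can be sketched by counting rules in a small treebank. The two trees, the nested-tuple encoding, and the names `rules_in` and `mle` are hypothetical and serve only as illustration.

```python
from collections import Counter

def rules_in(tree):
    """Yield (lhs, rhs) for every rule used in a nested-tuple tree."""
    label, *children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        if not isinstance(child, str):   # recurse into non-terminal children
            yield from rules_in(child)

def mle(treebank):
    """Estimate p(alpha -> beta) = Count(alpha -> beta) / Count(alpha)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for tree in treebank:
        for lhs, rhs in rules_in(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: count / lhs_counts[rule[0]]
            for rule, count in rule_counts.items()}

# Hypothetical two-tree treebank.
TREEBANK = [
    ("S", ("NP", "time"), ("VP", ("V", "flies"))),
    ("S", ("NP", ("Det", "an"), ("N", "arrow")), ("VP", ("V", "flies"))),
]

probs = mle(TREEBANK)
print(probs[("NP", ("time",))])      # 0.5: NP -> time is used once out of 2 NP expansions
print(probs[("S", ("NP", "VP"))])    # 1.0: the only S expansion observed
```

By construction, the estimates for each left-hand side sum to 1, so the resulting grammar satisfies the normalization constraint.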
