Week 4-4: Earley Parser

Background

  • Developed by Jay Earley in 1970
  • No need to convert the grammar to Chomsky normal form (CNF)
  • Processes the input left to right

Complexity

O(n^3) in the worst case, but faster than O(n^3) in many cases (for example, O(n^2) for unambiguous grammars).

Earley Parser

  • look for both full and partial constituents
  • when reading word k, it has already identified all hypotheses that are consistent with words 1 to k-1

Data structure

  • It uses a dynamic programming table (a chart), just like CKY
  • Example entry in column 1:
    • [0:1] VP -> VP . PP
    • created when processing word 1
    • spans positions 0 to 1: the part to the left of the dot is what has been found so far (here a VP); if a PP is found next, the whole non-terminal VP is complete
    • the dot (.) separates the completed (known) part from the incomplete (possibly unattainable) part
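
A chart entry like the one above can be sketched as a small record. This is only an illustration: the class name `Item` and its field names are assumptions made for this example, not part of the original notes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Item:
    """A dotted rule plus the span of input it covers (hypothetical layout)."""
    lhs: str               # left-hand side, e.g. "VP"
    rhs: Tuple[str, ...]   # right-hand side symbols, e.g. ("VP", "PP")
    dot: int               # number of right-hand side symbols already found
    start: int             # position where this constituent starts
    end: int               # position reached so far (the chart column)

    def is_complete(self) -> bool:
        # The dot has moved past every symbol of the rule.
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        # The symbol the parser still needs to find, if any.
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"[{self.start}:{self.end}] {self.lhs} -> {' '.join(symbols)}"


item = Item("VP", ("VP", "PP"), dot=1, start=0, end=1)
print(item)                # [0:1] VP -> VP . PP
print(item.next_symbol())  # PP
```

Making the record immutable and hashable (`frozen=True`) is a design choice that makes it easy to avoid adding the same entry to a column twice.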

3 types of entries

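  • ‘scan’ - the symbol after the dot is a word (or its part of speech): match it against the next input word and advance the dot
  • ‘predict’ - the symbol after the dot is a non-terminal: add the rules for that non-terminal with the dot at the start
  • ‘complete’ - otherwise (the dot is at the end of the rule): advance the dot in every entry that was waiting for this constituent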
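
The three operations can be sketched as a small recognizer. This is a minimal sketch, not a reference implementation: the toy `GRAMMAR` and `LEXICON`, the `State` tuple, and the function names are all assumptions made for this example, and the sketch assumes the grammar has no empty (epsilon) rules.

```python
from collections import namedtuple

# A chart entry: rule lhs -> rhs, dot position, and start position.
State = namedtuple("State", "lhs rhs dot start")

# Toy grammar and lexicon, invented for illustration only.
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "N")],
    "VP": [("V",), ("V", "NP")],
}
LEXICON = {"take": {"V"}, "this": {"Det"}, "book": {"N", "V"}}


def earley_chart(words, grammar=GRAMMAR, lexicon=LEXICON, root="S"):
    chart = [[] for _ in range(len(words) + 1)]

    def add(k, state):
        if state not in chart[k]:  # never store the same entry twice
            chart[k].append(state)

    for rhs in grammar[root]:
        add(0, State(root, rhs, 0, 0))

    for k in range(len(words) + 1):
        i = 0
        while i < len(chart[k]):  # the column grows while it is processed
            st = chart[k][i]
            i += 1
            if st.dot < len(st.rhs):
                nxt = st.rhs[st.dot]
                if nxt in grammar:
                    # predict: expand the expected non-terminal at position k
                    for rhs in grammar[nxt]:
                        add(k, State(nxt, rhs, 0, k))
                elif k < len(words) and nxt in lexicon.get(words[k], ()):
                    # scan: the next word has the expected part of speech
                    add(k + 1, State(st.lhs, st.rhs, st.dot + 1, st.start))
            else:
                # complete: advance every entry that was waiting for st.lhs
                for prev in list(chart[st.start]):
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == st.lhs:
                        add(k, State(prev.lhs, prev.rhs, prev.dot + 1, prev.start))
    return chart


def recognizes(words, root="S"):
    # Accept if the last column holds a finished root rule starting at 0.
    last = earley_chart(words, root=root)[-1]
    return any(s.lhs == root and s.dot == len(s.rhs) and s.start == 0
               for s in last)


print(recognizes("take this book".split()))  # True
print(recognizes("this take book".split()))  # False
```

With this toy grammar, the last column for "take this book" contains both a finished `VP -> V NP` and a finished `S -> VP` starting at position 0.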

Example

Take this book.

Once the last word has been processed, the chart shows that the input can be analyzed either as a verb phrase or as a sentence.

Problems with CFGs

Agreement

  • Number
    • Chen is/ People are
  • Person
    • I am/ Chen is
  • Tense
    • was/ is/ will be
  • Case
  • Gender

Combinatorial explosion

  • Many combinations of rules are needed to express agreement
    • S -> NP VP
    • S -> 1sgNP 1sgVP
    • S -> 2sgNP 2sgVP
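
To make the growth concrete, the sketch below generates the agreement-specific variants of the single rule S -> NP VP. The feature inventories (three persons, two numbers) and the function name are assumptions chosen for illustration.

```python
from itertools import product

# Illustrative feature values; a real grammar may need more (case, gender, ...).
PERSONS = ("1", "2", "3")
NUMBERS = ("sg", "pl")


def agreement_rules(persons=PERSONS, numbers=NUMBERS):
    """One copy of S -> NP VP per person/number combination."""
    return [f"S -> {p}{n}NP {p}{n}VP" for p, n in product(persons, numbers)]


rules = agreement_rules()
for rule in rules:
    print(rule)
print(len(rules))  # 6
```

One rule has become six, and every other rule that mentions NP or VP has to be multiplied in the same way. Each additional agreement feature multiplies the count again.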

Subcategorization frames

Different words (verbs in particular) allow different kinds of complements, so each kind needs its own rules. Possible complements include:

  • direct object
  • prepositional phrase
  • predicative adjective
  • bare infinitive
  • to-infinitive
  • participial phrase
  • that-clause
  • question-form clause
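
As a rough sketch of what this means for the grammar, the snippet below expands a small table of frames into verb-specific VP rules. The verbs, the category names (`VPto`, `Sthat`), and the `V[verb]` notation are assumptions made for this example only.

```python
# Hypothetical frame table: each verb lists the complement sequences it allows.
FRAMES = {
    "sleep": [()],                 # no complement
    "find": [("NP",)],             # direct object
    "put": [("NP", "PP")],         # direct object + prepositional phrase
    "want": [("NP",), ("VPto",)],  # direct object, or to-infinitive
    "say": [("Sthat",)],           # that-clause
}


def vp_rules(frames=FRAMES):
    """Turn each (verb, frame) pair into its own VP rule."""
    rules = []
    for verb, complement_lists in frames.items():
        for complements in complement_lists:
            rules.append(" ".join(["VP ->", f"V[{verb}]", *complements]))
    return rules


for rule in vp_rules():
    print(rule)
```

Every verb class needs its own rule, so the number of VP rules grows with the number of distinct frames rather than staying at a handful of generic rules.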

CFG independence assumption

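A CFG assumes that the expansion of each non-terminal is independent of the context in which it appears. In practice this does not hold: the probability of an expansion depends on the surrounding rules.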

Remark: one solution is a lexicalized CFG (lexicalized PCFG).

Conclusion

Because rules can combine in so many ways, the number of parses of a sentence can be exponential in its length, so enumerating all parses takes exponential time.
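
To give a sense of the growth, the sketch below counts the parse trees of an n-word string under a maximally ambiguous toy grammar, X -> X X | word. The grammar and the function name are assumptions chosen for illustration; the counts are the Catalan numbers.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parses(n):
    """Number of parse trees for n words under the toy grammar X -> X X | word."""
    if n == 1:
        return 1  # a single word has exactly one tree
    # Split the string into a left part of k words and a right part of n - k.
    return sum(count_parses(k) * count_parses(n - k) for k in range(1, n))


for n in (2, 4, 8):
    print(n, count_parses(n))
# 2 1
# 4 5
# 8 429
```

Note that counting the parses this way is cheap because shared sub-results are reused, which is the same idea behind the chart. It is listing every tree individually that takes exponential time.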
