Natural Language Processing: Dependency Parsing

What structures does natural language have, and how do we model syntactic structure? Broadly, syntactic structure comes in two kinds:

  • phrase structure (context-free grammars), which organizes words into nested constituents;
  • dependency structure, which shows which words depend on (modify or are arguments of) which other words;

Phrase Structure Grammar

A sentence is built from progressively nested units: adjacent words/units combine into larger units called phrases, and those phrases in turn combine into still larger units:
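This nesting can be made concrete by representing a parse as nested tuples. A minimal sketch; the sentence and its bracketing are illustrative assumptions, not taken from the text above:

```python
# A phrase-structure parse as nested constituents: each tuple is
# (category, child, child, ...); strings are the words themselves.
# The bracketing of "the cat sat on the mat" here is an assumed example.
sentence = ("S",
            ("NP", ("Det", "the"), ("N", "cat")),
            ("VP", ("V", "sat"),
                   ("PP", ("P", "on"),
                          ("NP", ("Det", "the"), ("N", "mat")))))

def leaves(tree):
    """Recover the word sequence by walking the nested constituents."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:          # tree[0] is the category label
        words.extend(leaves(child))
    return words

print(" ".join(leaves(sentence)))   # the cat sat on the mat
```

Reading off the leaves left to right recovers the original sentence, which is exactly what it means for the constituents to nest.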


Why do we need dependency structure?

Dependency syntax explains how the different units of a sentence relate to one another. The same sentence can have different dependency structures, and different structures can carry substantially different meanings. Dependency parsing therefore helps us understand sentences better and improves accuracy on tasks such as machine translation.


Prepositional phrase attachment ambiguity

San Jose cops kill man with knife

This sentence has two readings:

  • the cops killed the man using a knife;
  • the man they killed was carrying a knife;
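The two readings correspond to two possible heads for the preposition "with". A sketch using 1-based head indices (0 would be ROOT); the analyses are illustrative assumptions in old Stanford style, where the preposition attaches to the verb or the noun:

```python
# PP attachment ambiguity as two competing head assignments for "with".
words = ["San", "Jose", "cops", "kill", "man", "with", "knife"]

# Reading 1: "with knife" modifies the verb -> the cops used a knife.
heads_instrument = {6: 4}   # "with" (word 6) attaches to "kill" (word 4)
# Reading 2: "with knife" modifies the noun -> the man carried a knife.
heads_possessive = {6: 5}   # "with" (word 6) attaches to "man" (word 5)

for heads, gloss in [(heads_instrument, "the cops used the knife"),
                     (heads_possessive, "the man carried the knife")]:
    print(f"with -> {words[heads[6] - 1]}  ({gloss})")
```

Everything below the attachment point is unchanged; the single choice of head for "with" flips the meaning of the whole sentence.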

Scientists count whales from space

This sentence also has two readings:

  • scientists count the whales from space, using something like a satellite;
  • the whales come from space;

Coordination scope ambiguity

Shuttle veteran and longtime NASA executive Fred Gregory appointed to board

This sentence has two readings:

  • one person, Fred Gregory, who is both a shuttle veteran and a longtime NASA executive, was appointed;
  • two people, a shuttle veteran and longtime NASA executive Fred Gregory, were both appointed to the board;

Dependency paths identify semantic relations

The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB

From the dependency analysis we can extract protein-protein interactions, such as KaiC interacting with the three other proteins.

The nominal subject KaiC attaches to the verb "interacts"; SasA sits beneath it as a modifier, and its conjuncts KaiA and KaiB are reached from there, so all three are identified as the things KaiC interacts with.
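The idea of following dependency paths between entities can be sketched as a shortest-path search over the parse, treated as an undirected graph. The toy edge set below is an assumed simplification of the parse of the example sentence:

```python
# Extracting a dependency path between two entities, the idea behind
# using parses for relation extraction. Edges are assumed/simplified.
from collections import deque

edges = {
    ("interacts", "KaiC"), ("interacts", "SasA"),
    ("SasA", "KaiA"), ("SasA", "KaiB"), ("interacts", "rhythmically"),
}

def neighbors(node):
    for a, b in edges:
        if a == node:
            yield b
        if b == node:
            yield a

def path(src, dst):
    """Breadth-first search for the shortest dependency path."""
    queue, seen = deque([[src]]), {src}
    while queue:
        p = queue.popleft()
        if p[-1] == dst:
            return p
        for n in neighbors(p[-1]):
            if n not in seen:
                seen.add(n)
                queue.append(p + [n])
    return None

print(path("KaiC", "KaiB"))   # ['KaiC', 'interacts', 'SasA', 'KaiB']
```

The path from KaiC to KaiB runs through "interacts", which is what signals the interaction relation.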


Dependency Structure

A dependency structure can be represented in two ways: as a linear (arrow) diagram or as a tree, as in the left and right figures below:
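Both views can be generated from the same underlying head array. A sketch; the toy sentence and its heads are assumptions for illustration:

```python
# One dependency parse, two renderings: linear arcs and an indented tree.
words = ["ROOT", "She", "saw", "the", "lecture"]
heads = [None, 2, 0, 4, 2]      # heads[i] = index of word i's head (0 = ROOT)

# Linear view: one arrow per arc.
for i in range(1, len(words)):
    print(f"{words[heads[i]]} -> {words[i]}")

# Tree view: depth-first from ROOT, indenting by depth.
def print_tree(node, depth=0):
    print("  " * depth + words[node])
    for i in range(1, len(words)):
        if heads[i] == node:
            print_tree(i, depth + 1)

print_tree(0)
```

The linear form lists the arcs; the tree form makes the nesting explicit. Since each word has exactly one head and all arcs lead back to ROOT, the structure is guaranteed to be a tree.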

The Rise of Annotated Data: Universal Dependencies Treebanks

Universal Dependencies is an open, community-built collection of annotated dependency treebanks covering many languages.


Methods of dependency parsing:

  • dynamic programming, with complexity O(n³);
  • graph algorithms;
  • constraint satisfaction;
  • “transition-based parsing” or “deterministic dependency parsing”;

Transition-based dependency parsers

Arc-standard transition-based parser

Analysis of "Happy children like to play with their friends."
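The transitions for this sentence can be simulated with a minimal arc-standard implementation. This is a sketch: the gold action sequence and the dependency labels below (roughly UD style) are illustrative assumptions, not taken from the text:

```python
# A minimal arc-standard transition system: SHIFT, LEFT-ARC, RIGHT-ARC.
def parse(words, actions):
    """Apply (action, label) pairs; words are 1-indexed, 0 is ROOT."""
    stack = [0]                               # start with ROOT on the stack
    buffer = list(range(1, len(words) + 1))   # word indices left to read
    arcs = []                                 # (head, dependent, label)
    for act, label in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act == "LEFT-ARC":               # top of stack heads the second
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep, label))
        elif act == "RIGHT-ARC":              # second on stack heads the top
            dep = stack.pop()
            arcs.append((stack[-1], dep, label))
    return arcs, stack, buffer

words = "Happy children like to play with their friends".split()
actions = [                                   # an assumed gold sequence
    ("SHIFT", None), ("SHIFT", None), ("LEFT-ARC", "amod"),
    ("SHIFT", None), ("LEFT-ARC", "nsubj"),
    ("SHIFT", None), ("SHIFT", None), ("LEFT-ARC", "mark"),
    ("SHIFT", None), ("SHIFT", None), ("SHIFT", None),
    ("LEFT-ARC", "poss"), ("LEFT-ARC", "case"),
    ("RIGHT-ARC", "obl"), ("RIGHT-ARC", "xcomp"), ("RIGHT-ARC", "root"),
]
arcs, stack, buffer = parse(words, actions)
assert stack == [0] and buffer == []          # finished: only ROOT remains
for head, dep, label in arcs:
    h = "ROOT" if head == 0 else words[head - 1]
    print(f"{h} -{label}-> {words[dep - 1]}")
```

The parse terminates when the buffer is empty and only ROOT remains on the stack, having produced exactly one arc per word.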

In general the parser faces choices of when to shift and when to reduce, and naively exploring every choice means searching an exponential number of possible parses, which cannot be done efficiently.

In the 1960s, researchers devised clever dynamic programming algorithms that explore the space of all possible parses relatively efficiently.

In the 2000s (MaltParser), the next action at each parser configuration is instead predicted by a discriminative classifier (e.g. a softmax classifier) over the legal moves:

  • at most 3 choices/actions when untyped; at most |R| × 2 + 1 when typed;
  • features: word and POS tag at the top of the stack; word and POS tag of the first word in the buffer; etc.

There is NO search (in the simplest form), but one can profitably do beam search if desired (slower but better), keeping k good parse prefixes at each time step.
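The feature templates above can be sketched as indicator strings extracted from a configuration. The encoding (indices into word/POS lists, `s1`/`b1` naming) is an assumption for illustration:

```python
# MaltParser-style feature extraction from a parser configuration:
# word and POS of the stack top (s1) and of the first buffer word (b1).
def features(stack, buffer, words, pos):
    """Return indicator-feature strings for the current configuration."""
    feats = []
    if stack:
        s1 = stack[-1]
        feats += [f"s1.w={words[s1]}", f"s1.t={pos[s1]}"]
    if buffer:
        b1 = buffer[0]
        feats += [f"b1.w={words[b1]}", f"b1.t={pos[b1]}"]
    return feats

words = ["children", "like", "play"]
pos = ["NNS", "VBP", "VB"]
print(features([0], [1, 2], words, pos))
# ['s1.w=children', 's1.t=NNS', 'b1.w=like', 'b1.t=VBP']
```

Each such string becomes one dimension of a huge sparse feature vector that the discriminative classifier scores against every legal move.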


Evaluation of dependency parsing

  • unlabeled attachment score (UAS): the fraction of words whose head is assigned correctly;
  • labeled attachment score (LAS): the fraction of words whose head and dependency label are both correct;
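Both metrics can be computed by comparing predicted (head, label) pairs against gold, word by word. A sketch; the toy sentence, heads, and labels below are assumed for illustration:

```python
# UAS counts correct heads; LAS additionally requires the correct label.
def attachment_scores(gold, pred):
    """gold/pred: one (head, label) pair per word. Returns (UAS, LAS)."""
    assert len(gold) == len(pred)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / len(gold)
    las = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return uas, las

# "She saw the video lecture": heads are 1-based word indices, 0 = ROOT.
gold = [(2, "nsubj"), (0, "root"), (5, "det"), (5, "compound"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (4, "det"), (5, "compound"), (2, "dobj")]
uas, las = attachment_scores(gold, pred)
print(uas, las)   # 0.8 0.6
```

Here one head is wrong (4/5 heads correct gives UAS 0.8), and one further word has the right head but the wrong label, so LAS drops to 0.6. LAS is always less than or equal to UAS.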

Why train a neural dependency parser?

  • the hand-engineered indicator features are very sparse and tend to be incomplete;
  • computing the millions of features is expensive;
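The sparsity problem is what dense embeddings address: instead of one indicator dimension per (template, value) pair, each word maps to a short learned vector. A toy sketch; the vocabulary, dimensions, and vector values are made-up assumptions:

```python
# Sparse indicator features vs. dense embeddings, side by side.
vocab = ["children", "like", "play", "friends"]

def indicator(word):
    """One-hot over the vocabulary: huge in practice, almost all zeros."""
    return [1 if w == word else 0 for w in vocab]

# Dense: a short vector per word (values here are arbitrary toy numbers;
# in a real parser they would be learned).
embedding = {
    "children": [0.2, -0.1, 0.4],
    "like":     [0.7,  0.3, -0.2],
    "play":     [0.6,  0.2, -0.1],
    "friends":  [0.1, -0.3, 0.5],
}

sparse = indicator("like")     # length == |vocab|, exactly one 1
dense = embedding["like"]      # fixed small length, all dimensions used
print(len(sparse), sum(sparse), len(dense))
```

A neural parser concatenates the dense vectors of a few configuration elements (stack/buffer words, their POS tags) and feeds them through a small network, replacing millions of sparse features with a few hundred dense dimensions.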