Contents
1. What is Part of Speech (POS)?
2. Information Extraction
2.2 POS Closed Classes (English)
2.4 POS Ambiguity in News Headlines
3.3 Derived Tags (Closed Class)
4.1 Why Automatically POS tag?
1. What is Part of Speech (POS)?
AKA word classes, morphological classes, syntactic categories
Nouns, verbs, adjectives, etc.
POS tells us quite a bit about a word and its neighbours:
- nouns are often preceded by determiners
- verbs are often preceded by nouns
- content as a noun is pronounced CONtent
- content as an adjective is pronounced conTENT
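The pronunciation point above can be sketched as a lookup keyed on both the word and its POS. This is a minimal illustration, not a real pronunciation lexicon: the `PRONUNCIATIONS` table, the `pronounce` helper and the tag names `NOUN` and `ADJ` are all assumptions made for this example.

```python
# Hypothetical pronunciation table keyed on (word, POS tag).
# Capital letters mark the stressed syllable, as in the notes above.
PRONUNCIATIONS = {
    ("content", "NOUN"): "CONtent",
    ("content", "ADJ"): "conTENT",
}


def pronounce(word: str, pos: str) -> str:
    """Return the stress pattern for (word, pos); fall back to the word itself."""
    return PRONUNCIATIONS.get((word.lower(), pos), word)


print(pronounce("content", "NOUN"))  # CONtent
print(pronounce("content", "ADJ"))   # conTENT
```

The spelling alone cannot select an entry; only the POS tag separates the two readings.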
2. Information Extraction
Given this:
- “Brasilia, the Brazilian capital, was founded in 1960.”
Obtain this:
- capital(Brazil, Brasilia)
- founded(Brasilia, 1960)
Many steps are involved, but first we need to identify the nouns (Brasilia, capital), adjectives (Brazilian), verbs (founded) and numbers (1960).
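To make the role of POS concrete, here is a toy sketch that recovers founded(Brasilia, 1960) from the example sentence. It rests on several assumptions: the tags are assigned by hand rather than by an automatic tagger, the tag names (`PROPN`, `VERB`, `NUM`, ...) are chosen for illustration, and `extract_founded` is a hypothetical helper that handles only this one pattern. Extracting capital(Brazil, Brasilia) would also require mapping "Brazilian" to "Brazil", which is not attempted here.

```python
# Hand-assigned POS tags for the example sentence (an assumption: a real
# system would obtain these from an automatic tagger).
TAGGED = [
    ("Brasilia", "PROPN"), (",", "PUNCT"), ("the", "DET"),
    ("Brazilian", "ADJ"), ("capital", "NOUN"), (",", "PUNCT"),
    ("was", "AUX"), ("founded", "VERB"), ("in", "ADP"),
    ("1960", "NUM"), (".", "PUNCT"),
]


def extract_founded(tagged):
    """Toy pattern: first proper noun, the verb 'founded', first number after it."""
    subject = next((w for w, t in tagged if t == "PROPN"), None)
    verb_idx = next(
        (i for i, (w, t) in enumerate(tagged) if t == "VERB" and w == "founded"),
        None,
    )
    if subject is None or verb_idx is None:
        return None
    year = next((w for w, t in tagged[verb_idx + 1:] if t == "NUM"), None)
    if year is None:
        return None
    return ("founded", subject, year)


relation = extract_founded(TAGGED)
if relation is not None:
    name, arg1, arg2 = relation
    print(f"{name}({arg1}, {arg2})")  # founded(Brasilia, 1960)
```

Even this crude pattern only works because the tags tell it which token is a proper noun, which is a verb and which is a number.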
2.1 POS Open Classes
Open vs closed classes: how readily do POS categories take on new words? There are just a few open classes:
Nouns
- Proper (Australia) versus common (wombat)
- Mass (rice) versus count (bowls)
Verbs
- Rich inflection (go/goes/going/gone/went)
- Auxiliary verbs (be, have, and do in English)
- Transitivity (wait versus hit versus give): the number of arguments the verb takes
Adjectives
- Gradable (happy) versus non-gradable (computational)
Adverbs
- Manner (slowly)
- Locative (here)
- Degree (really)
- Temporal (today)
2.2 POS Closed Classes (English)
Prepositions (in, on, with, for, of, over,…)
- on the table
Particles
- brushed himself off
Determiners
- Articles (a, an, the)
- Demonstratives (this, that, these, those)
- Quantifiers (each, every, some, two,…)
Pronouns
- Personal (I, me, she,…)
- Possessive (my, our,…)
- Interrogative or Wh (who, what, …)
Conjunctions
- Coordinating (and, or, but)
- Subordinating (if, although, that, …)
Modal verbs