Python Natural Language Processing Study Notes (48): Further Reading

  5.9   Further Reading

Extra materials for this chapter are posted at http://www.nltk.org/, including links to freely available resources on the web. For more examples of tagging with NLTK, please see the Tagging HOWTO at http://www.nltk.org/howto. Chapters 4 and 5 of (Jurafsky & Martin, 2008) contain more advanced material on n-grams and part-of-speech tagging. Other approaches to tagging involve machine learning methods (Chapter 6). In Chapter 7 we will see a generalization of tagging called chunking, in which a contiguous sequence of words is assigned a single tag.

For tagset documentation, see nltk.help.upenn_tagset() and nltk.help.brown_tagset(). Lexical categories are introduced in linguistics textbooks, including those listed in Chapter 1.

There are many other kinds of tagging. Words can be tagged with directives to a speech synthesizer, indicating which words should be emphasized. Words can be tagged with sense numbers, indicating which sense of the word was used. Words can also be tagged with morphological features. Examples of each of these kinds of tags are shown below. For space reasons, we only show the tag for a single word. Note also that the first two examples use XML-style tags, where elements in angle brackets enclose the word that is tagged.

  1. Speech Synthesis Markup Language (W3C SSML): That is a <emphasis>big</emphasis> car!
  2. SemCor: Brown Corpus tagged with WordNet senses: Space in any <wf pos="NN" lemma="form" wnsn="4">form</wf> is completely measured by the three dimensions. (WordNet form/nn sense 4: "shape, form, configuration, contour, conformation")
  3. Morphological tagging, from the Turin University Italian Treebank: E' italiano , come progetto e realizzazione , il primo (PRIMO ADJ ORDIN M SING) porto turistico dell' Albania .
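The three tag styles above can be read back out of the text with the Python standard library alone. The following is a minimal sketch, not the method used by any of these corpora: the `<s>` wrapper element and the helper names `tagged_words` and `morph_tag` are assumptions introduced here for illustration, and the Italian sentence is shortened to the part that carries the tag.

```python
import re
import xml.etree.ElementTree as ET

# Examples 1 and 2 from the list above. Each is wrapped in a hypothetical
# <s> root element (an assumption) so that it parses as well-formed XML.
ssml = "<s>That is a <emphasis>big</emphasis> car!</s>"
semcor = ('<s>Space in any <wf pos="NN" lemma="form" wnsn="4">form</wf> '
          'is completely measured by the three dimensions.</s>')

def tagged_words(xml_text):
    """Return (word, element name, attributes) for each tagged word."""
    root = ET.fromstring(xml_text)
    # Iterating over the root yields its direct child elements.
    return [(el.text, el.tag, dict(el.attrib)) for el in root]

# Example 3 is not XML: the features follow the word in parentheses.
morph = "il primo (PRIMO ADJ ORDIN M SING) porto turistico"

def morph_tag(text):
    """Return (word, list of features) for the first parenthesized tag."""
    match = re.search(r"(\S+) \(([^)]*)\)", text)
    return match.group(1), match.group(2).split()

print(tagged_words(ssml))
print(tagged_words(semcor))
print(morph_tag(morph))
```

In the XML-style examples the tag information lives in the element name and its attributes, whereas the treebank example packs it into a flat, space-separated feature list after the word.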

Note that tagging is also performed at higher levels. Here is an example of dialogue act tagging, from the NPS Chat Corpus (Forsyth & Martell, 2007) included with NLTK. Each turn of the dialogue is categorized as to its communicative function:

  Statement   User117  Dude..., I wanted some of that
  ynQuestion  User120  m I missing something?
  Bye         User117  I'm gonna go fix food, I'll be back later.
  System      User122  JOIN
  System      User2    slaps User122 around a bit with a large trout.
  Statement   User121  18/m pm me if u tryin to chat
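To make the structure of these turns concrete, here is a small sketch that splits each line into its dialogue act, user, and text, and then counts the acts. It works on the six turns copied as plain text rather than on the corpus files themselves; the helper name `parse_turn` and the whitespace-separated layout are assumptions made for illustration, not the corpus's actual storage format.

```python
from collections import Counter

# The six turns shown above, copied as plain text (one turn per line).
turns = """\
Statement User117 Dude..., I wanted some of that
ynQuestion User120 m I missing something?
Bye User117 I'm gonna go fix food, I'll be back later.
System User122 JOIN
System User2 slaps User122 around a bit with a large trout.
Statement User121 18/m pm me if u tryin to chat
"""

def parse_turn(line):
    """Split a turn into (dialogue act, user, text).

    Only the first two whitespace-separated fields are split off,
    so the text of the turn keeps its internal spaces.
    """
    act, user, text = line.split(None, 2)
    return act, user, text

parsed = [parse_turn(line) for line in turns.splitlines() if line.strip()]

# Count how often each dialogue act label occurs.
act_counts = Counter(act for act, _, _ in parsed)
print(act_counts)
```

Here the tag applies to a whole turn rather than to a single word, which is why each parsed item pairs one label with the full text of the utterance.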
