词干抽取java实现_Lemmatisation & Stemming 词干提取

Lemmatisation is closely related to stemming. The difference is that a stemmer operates on a single word without knowledge of the context, and therefore cannot discriminate between words which have different meanings depending on part of speech. However, stemmers are typically easier to implement and run faster, and the reduced accuracy may not matter for some applications. 1.Stemmer 抽取词的词干或词根形式(不一定能够表达完整语义) Porter Stemmer基于Porter词干提取算法

>>> from nltk.stem.porter import PorterStemmer

>>> porter_stemmer = PorterStemmer()

>>> porter_stemmer.stem(‘maximum’)

u’maximum’

>>> porter_stemmer.stem(‘presumably’)

u’presum’

>>> porter_stemmer.stem(‘multiply’)

u’multipli’

>>> porter_stemmer.stem(‘provision’)

u’provis’

>>> porter_stemmer.stem(‘owed’)

u’owe’

Lancaster Stemmer 基于Lancaste

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值