Java OpenNLP: OpenNLP vs. Stanford CoreNLP

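Comparing OpenNLP and Stanford CoreNLP, the post concludes that Stanford CoreNLP wins on named entity recognition, accuracy, and ease of use, while OpenNLP may be simpler for training new models. Although OpenNLP has seen few recent updates, its training API is comparatively more intuitive. For gender identification, neither is well documented. On the training API, OpenNLP may suit non-standard training better, but CoreNLP trains faster.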

I've been comparing these two packages and am not sure which direction to go in. Briefly, what I am looking for is:

Named Entity Recognition (people, places, organizations and such).

Gender identification.

A decent training API.

From what I can tell, OpenNLP and Stanford CoreNLP expose pretty similar capabilities. However, Stanford CoreNLP looks like it has a lot more activity whereas OpenNLP has only had a few commits in the last six months.

Based on what I saw, OpenNLP appears to make training new models easier and might be more attractive for that reason alone. However, my question is: what would others start with as the basis for adding NLP features to a Java app? I'm mostly worried about whether OpenNLP is "just mature" or semi-abandoned.

Solution

Full disclosure: I'm a contributor to CoreNLP, so this is a biased answer. That said, here is my view on your three criteria:

Named Entity Recognition: I think CoreNLP clearly wins here, both on accuracy and ease-of-use. For one, OpenNLP has a model per NER tag, whereas CoreNLP detects all tags with a single Annotator. Furthermore, temporal resolution with SUTime is a nice perk in CoreNLP. Accuracy-wise, my anecdotal experience is that CoreNLP does better on general-purpose text.
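To make the "model per NER tag" point concrete, here is a minimal sketch of what it means for calling code: each finder only knows one entity type, so the caller has to run all of them and merge the results itself. This uses neither library's API. `PerTypeNerSketch`, `EntityFinder`, `Span`, and `dictionaryFinder` are hypothetical names, and the word-list finder is a toy stand-in for a trained model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not OpenNLP or CoreNLP code.
class PerTypeNerSketch {

    /** A labeled token range: tokens[start, end) carry the given entity type. */
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    /** Stand-in for one per-type model: it finds entities of a single type. */
    interface EntityFinder {
        List<Span> find(String[] tokens);
    }

    /** Toy finder that labels single tokens from a fixed word list (a real model is statistical). */
    static EntityFinder dictionaryFinder(String type, String... words) {
        Set<String> known = new HashSet<>(Arrays.asList(words));
        return tokens -> {
            List<Span> spans = new ArrayList<>();
            for (int i = 0; i < tokens.length; i++) {
                if (known.contains(tokens[i])) {
                    spans.add(new Span(i, i + 1, type));
                }
            }
            return spans;
        };
    }

    /** Runs every per-type finder over the same tokens and merges the spans in token order. */
    static List<Span> findAll(String[] tokens, List<EntityFinder> finders) {
        List<Span> all = new ArrayList<>();
        for (EntityFinder finder : finders) {
            all.addAll(finder.find(tokens));
        }
        all.sort(Comparator.comparingInt((Span s) -> s.start));
        return all;
    }

    public static void main(String[] args) {
        String[] tokens = {"Alice", "joined", "Acme", "in", "Berlin"};
        List<EntityFinder> finders = Arrays.asList(
                dictionaryFinder("person", "Alice"),
                dictionaryFinder("location", "Berlin"),
                dictionaryFinder("organization", "Acme"));
        for (Span span : findAll(tokens, finders)) {
            System.out.println(span + " " + tokens[span.start]);
        }
    }
}
```

With a single annotator that emits all tags, the loop over finders and the merge step disappear from the caller. The sketch does not resolve overlapping spans from different finders, which a real per-type setup would also need to handle.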

Gender identification. I think both tools are kind of poorly documented on this front. OpenNLP seems to have a GenderModel class; CoreNLP has a gender Annotator.

Training API. I suspect the OpenNLP training API is easier to use for training that is not off-the-shelf. But if all you want to do is, e.g., train a model from a CoNLL file, both should be straightforward. Training speed tends to be faster with CoreNLP than with other tools I've tried, but I haven't benchmarked it formally, so take that with a grain of salt.
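For context on the CoNLL case: a CoNLL-style NER file puts one token per line with its label in another column, and separates sentences with blank lines. Below is a minimal sketch of reading such data in plain Java, so you can inspect or convert it before handing it to either tool. It uses neither library's trainer. `ConllSketch` and `readSentences` are hypothetical names, and the layout is an assumption: whitespace-separated columns, token first, label last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of OpenNLP or CoreNLP.
class ConllSketch {

    /** One token with its entity label, e.g. ("Paris", "LOCATION"). */
    static final class LabeledToken {
        final String word;
        final String label;

        LabeledToken(String word, String label) {
            this.word = word;
            this.label = label;
        }
    }

    /**
     * Groups CoNLL-style lines into sentences.
     * Assumes the token is the first column and the label is the last one;
     * blank lines end a sentence, and lines with fewer than two columns are skipped.
     */
    static List<List<LabeledToken>> readSentences(List<String> lines) {
        List<List<LabeledToken>> sentences = new ArrayList<>();
        List<LabeledToken> current = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            String[] cols = line.split("\\s+");
            if (cols.length < 2) {
                continue; // no label column on this line
            }
            current.add(new LabeledToken(cols[0], cols[cols.length - 1]));
        }
        if (!current.isEmpty()) {
            sentences.add(current); // file did not end with a blank line
        }
        return sentences;
    }

    public static void main(String[] args) {
        // In practice the lines would come from a file, e.g. Files.readAllLines(path).
        List<String> lines = Arrays.asList(
                "Barack\tPERSON", "Obama\tPERSON", "visited\tO", "Paris\tLOCATION",
                "",
                "He\tO", "left\tO");
        for (List<LabeledToken> sentence : readSentences(lines)) {
            StringBuilder sb = new StringBuilder();
            for (LabeledToken t : sentence) {
                sb.append(t.word).append('/').append(t.label).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

The exact column order and label scheme each trainer expects differ between the two libraries, so check their documentation before feeding the data in.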
