Natural Language Processing: A Chinese Sentiment Model for NLP (Open-Source Project)

Most recent recommended article published on 2024-08-17 15:35:20

Pinned post by weixin_40473141

Categories: NLP, Sentiment Analysis. Tags: Natural Language Processing, NLP, Artificial Intelligence, Sentiment Analysis, AI

Article link: https://blog.csdn.net/weixin_40473141/article/details/84190084

I would like to share a sentiment model I built myself. You are welcome to try calling it, and any feedback is appreciated. Thanks!

A Simple NLP Pipeline

This project is a deployable NLP pipeline that runs as a server-side service. The underlying toolkit is CoreNLP from the Stanford NLP Group. Officially, Chinese models are provided for many of the CoreNLP annotators, but not for the sentiment annotator. This project provides a Chinese sentiment model trained on more than 150k binary sentiment trees. For more details on the model and the API, see here

Query

To query a sentence, here is a sample command:

wget --post-data '无敌是多么的孤独' 'nlp.awesomevc.com:81' -O-
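For callers who prefer Python over wget, the same POST can be sketched with the standard library. This is a minimal sketch, not part of the project: the explicit `http://` scheme, the UTF-8 encoding of the body, and the helper names `build_request` and `query` are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: the endpoint from the wget example, with an explicit http:// scheme.
SERVICE_URL = "http://nlp.awesomevc.com:81"


def build_request(text: str, url: str = SERVICE_URL) -> urllib.request.Request:
    """Build a POST request whose body is the raw sentence (UTF-8 assumed)."""
    return urllib.request.Request(url, data=text.encode("utf-8"), method="POST")


def query(text: str, url: str = SERVICE_URL, timeout: float = 10.0) -> dict:
    """Send the sentence to the service and decode the JSON reply."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `query("无敌是多么的孤独")` should then return the parsed response as a dict, provided the service is reachable.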

Result

The output is in JSON format.

Sentiment

"sentimentValue": "3",
"sentiment": "Positive",
"sentimentDistribution": [
        0.04447204774245,
        0.13967615459976,
        0.2460704480222,
        0.38548481571688,
        0.18338223574722,
        0.00091429817149
      ],
"sentimentTree": "(ROOT|sentiment=3|prob=0.385 (VV|sentiment=2|prob=0.369 无敌)\n  (VP|sentiment=3|prob=0.470 (VC|sentiment=2|prob=0.348 是)\n    (VP|sentiment=2|prob=0.488 (ADVP|sentiment=2|prob=0.352 多么的) (VP|sentiment=2|prob=0.461 孤独))))"
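Judging from the sample, each node of the sentimentTree string carries a syntactic label, a sentiment value, and a probability. A rough way to pull those per-node annotations out with a regular expression is sketched here; the node pattern is inferred from this one sample only, and `tree_nodes` is a hypothetical helper name.

```python
import re

# Matches the head of each node, e.g. "(VP|sentiment=3|prob=0.470".
# Assumption: every node follows the "(LABEL|sentiment=N|prob=P" shape of the sample.
NODE_HEAD = re.compile(r"\(([^|()\s]+)\|sentiment=(\d+)\|prob=([0-9.]+)")


def tree_nodes(tree: str) -> list[tuple[str, int, float]]:
    """Return (label, sentiment, probability) for every node, in reading order."""
    return [(label, int(s), float(p)) for label, s, p in NODE_HEAD.findall(tree)]


# The sentimentTree value from the sample response.
sample_tree = (
    "(ROOT|sentiment=3|prob=0.385 (VV|sentiment=2|prob=0.369 无敌)\n"
    "  (VP|sentiment=3|prob=0.470 (VC|sentiment=2|prob=0.348 是)\n"
    "    (VP|sentiment=2|prob=0.488 (ADVP|sentiment=2|prob=0.352 多么的) "
    "(VP|sentiment=2|prob=0.461 孤独))))"
)

for label, sentiment, prob in tree_nodes(sample_tree):
    print(label, sentiment, prob)
```

The first tuple is the ROOT node, whose sentiment value and probability correspond to the sentence-level result.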

Data Fields Explanation:

sentimentValue: Integer value for "sentiment": 0-VeryNegative, 1-Negative, 2-Neutral, 3-Positive, 4-VeryPositive, 5-IgnoredForNow
sentiment: Sentiment level as a text label
sentimentDistribution: Probability distribution over the sentiment levels, listed in the same order as sentimentValue (index 0 to 5)
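To make the relationship between these fields concrete, here is a small sketch that picks the most likely level from sentimentDistribution and maps it to a label. The label list takes index 3 as Positive and index 4 as VeryPositive, consistent with the sample response where sentimentValue "3" is reported as "Positive". The name `most_likely_sentiment` is a hypothetical helper, not part of the service.

```python
# Labels indexed by sentimentValue (index 3 is Positive in the sample response).
SENTIMENT_LABELS = [
    "VeryNegative", "Negative", "Neutral",
    "Positive", "VeryPositive", "IgnoredForNow",
]


def most_likely_sentiment(distribution: list[float]) -> tuple[int, str, float]:
    """Return (index, label, probability) of the highest-probability level."""
    best = max(range(len(distribution)), key=distribution.__getitem__)
    return best, SENTIMENT_LABELS[best], distribution[best]


# Sentiment fields copied from the sample response.
sample = {
    "sentimentValue": "3",
    "sentiment": "Positive",
    "sentimentDistribution": [
        0.04447204774245,
        0.13967615459976,
        0.2460704480222,
        0.38548481571688,
        0.18338223574722,
        0.00091429817149,
    ],
}

index, label, prob = most_likely_sentiment(sample["sentimentDistribution"])
# sentimentValue arrives as a string, so convert it before comparing.
print(index == int(sample["sentimentValue"]), label, prob)
```

Note that sentimentValue is a JSON string in the sample, so it needs an `int()` conversion before it can be used as an index.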

POS tagging & NER & Other

Query:

wget --post-data '我家在北京有房, 就在东二环旁边' 'nlp.awesomevc.com:81' -O-

Output:

        {
          "index": 1,
          "word": "我家",
          "originalText": "我家",
          "lemma": "我家",
          "characterOffsetBegin": 0,
          "characterOffsetEnd": 2,
          "pos": "NN",
          "ner": "O"
        },
        {
          "index": 2,
          "word": "在",
          "originalText": "在",
          "lemma": "在",
          "characterOffsetBegin": 2,
          "characterOffsetEnd": 3,
          "pos": "P",
          "ner": "O"
        },
        {
          "index": 3,
          "word": "北京",
          "originalText": "北京",
          "lemma": "北京",
          "characterOffsetBegin": 3,
          "characterOffsetEnd": 5,
          "pos": "NR",
          "ner": "STATE_OR_PROVINCE"
        },
        {
          "index": 4,
          "word": "有",
          "originalText": "有",
          "lemma": "有",
          "characterOffsetBegin": 5,
          "characterOffsetEnd": 6,
          "pos": "VE",
          "ner": "O"
        },
        {
          "index": 5,
          "word": "房",
          "originalText": "房",
          "lemma": "房",
          "characterOffsetBegin": 6,
          "characterOffsetEnd": 7,
          "pos": "NN",
          "ner": "O"
        },
        {
          "index": 6,
          "word": ",",
          "originalText": ",",
          "lemma": ",",
          "characterOffsetBegin": 7,
          "characterOffsetEnd": 8,
          "pos": "PU",
          "ner": "O"
        },
        {
          "index": 7,
          "word": "就",
          "originalText": "就",
          "lemma": "就",
          "characterOffsetBegin": 9,
          "characterOffsetEnd": 10,
          "pos": "AD",
          "ner": "O"
        },
        {
          "index": 8,
          "word": "在",
          "originalText": "在",
          "lemma": "在",
          "characterOffsetBegin": 10,
          "characterOffsetEnd": 11,
          "pos": "P",
          "ner": "O"
        },
        {
          "index": 9,
          "word": "东二环",
          "originalText": "东二环",
          "lemma": "东二环",
          "characterOffsetBegin": 11,
          "characterOffsetEnd": 14,
          "pos": "NR",
          "ner": "FACILITY"
        },
        {
          "index": 10,
          "word": "旁边",
          "originalText": "旁边",
          "lemma": "旁边",
          "characterOffsetBegin": 14,
          "characterOffsetEnd": 16,
          "pos": "NN",
          "ner": "O"
        }
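The token records above can be post-processed with a few lines of Python. The sketch below works on a plain list of token dicts, since the wrapper that surrounds the tokens in the full response is not shown here; only four of the ten tokens are copied, and the helper names are hypothetical. It extracts the named entities (tokens whose ner tag is not "O") and checks that the character offsets index back into the query string.

```python
# The query string, including the space after the comma.
text = "我家在北京有房, 就在东二环旁边"

# A subset of the token records from the output above.
tokens = [
    {"word": "我家", "characterOffsetBegin": 0, "characterOffsetEnd": 2,
     "pos": "NN", "ner": "O"},
    {"word": "北京", "characterOffsetBegin": 3, "characterOffsetEnd": 5,
     "pos": "NR", "ner": "STATE_OR_PROVINCE"},
    {"word": "就", "characterOffsetBegin": 9, "characterOffsetEnd": 10,
     "pos": "AD", "ner": "O"},
    {"word": "东二环", "characterOffsetBegin": 11, "characterOffsetEnd": 14,
     "pos": "NR", "ner": "FACILITY"},
]


def named_entities(tokens: list[dict]) -> list[tuple[str, str]]:
    """Keep only tokens tagged with an entity type ("O" means no entity)."""
    return [(t["word"], t["ner"]) for t in tokens if t["ner"] != "O"]


def offsets_match(text: str, tokens: list[dict]) -> bool:
    """Check that each [begin, end) offset pair slices the token's word out of text."""
    return all(
        text[t["characterOffsetBegin"]:t["characterOffsetEnd"]] == t["word"]
        for t in tokens
    )


print(named_entities(tokens))
print(offsets_match(text, tokens))
```

The offsets are end-exclusive, which is why the token after the comma starts at 9 rather than 8: position 8 is the space in the query string.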