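Natural Language Processing (NLP): A Chinese Sentiment Model
I would like to share a sentiment model I built. Everyone is welcome to try calling it, and any feedback is appreciated. Thanks!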
A Simple NLP Pipeline
This project is a deployable NLP pipeline that runs as a server-side service. The underlying toolkit is CoreNLP from the Stanford NLP Group. Officially, Chinese models are supported for many of the CoreNLP annotators, but not for the sentiment annotator. This project provides a Chinese sentiment model trained on over 150k sentiment-labeled binary trees. For more model details and the API, see here
Query
To query a sentence, here is a sample command:
wget --post-data '无敌是多么的孤独' 'nlp.awesomevc.com:81' -O-
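The same request can also be sent from a script. The following is a minimal Python sketch, not part of the project itself: the function names build_request and query are hypothetical, the http:// scheme is assumed (the wget example omits it), and the response is assumed to be UTF-8 text. The server may not be reachable at any given time.

```python
import urllib.request

# Endpoint taken from the wget example above; the http:// scheme is an assumption.
SERVER_URL = "http://nlp.awesomevc.com:81"


def build_request(sentence: str, url: str = SERVER_URL) -> urllib.request.Request:
    """Build a POST request whose body is the raw UTF-8 sentence,
    mirroring `wget --post-data '<sentence>'`."""
    return urllib.request.Request(url, data=sentence.encode("utf-8"), method="POST")


def query(sentence: str, url: str = SERVER_URL, timeout: float = 10.0) -> str:
    """Send the sentence to the server and return the response body as text.
    Assumes the server answers with UTF-8 encoded JSON."""
    with urllib.request.urlopen(build_request(sentence, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling query("无敌是多么的孤独") should return the same JSON text that the wget command prints, provided the service is online.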
Result
The output is in JSON format.
Sentiment
"sentimentValue": "3",
"sentiment": "Positive",
"sentimentDistribution": [
0.04447204774245,
0.13967615459976,
0.2460704480222,
0.38548481571688,
0.18338223574722,
0.00091429817149
],
"sentimentTree": "(ROOT|sentiment=3|prob=0.385 (VV|sentiment=2|prob=0.369 无敌)\n (VP|sentiment=3|prob=0.470 (VC|sentiment=2|prob=0.348 是)\n (VP|sentiment=2|prob=0.488 (ADVP|sentiment=2|prob=0.352 多么的) (VP|sentiment=2|prob=0.461 孤独))))"
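The sentimentTree string carries a sentiment label and probability for every node of the parse tree. As a rough sketch, the per-node annotations can be pulled out with a regular expression. The helper name tree_nodes and the pattern are assumptions based only on the sample output above, not an official parser for this format.

```python
import re

# Matches the opening of each node, e.g. "(VP|sentiment=3|prob=0.470".
# The pattern is inferred from the sample output and may not cover every case.
NODE_RE = re.compile(r"\(([^|\s()]+)\|sentiment=(\d+)\|prob=([0-9.]+)")


def tree_nodes(tree: str) -> list[tuple[str, int, float]]:
    """Return (label, sentiment, prob) for every node, in left-to-right order."""
    return [(label, int(s), float(p)) for label, s, p in NODE_RE.findall(tree)]


# The sentimentTree value from the sample response above.
SAMPLE_TREE = (
    "(ROOT|sentiment=3|prob=0.385 (VV|sentiment=2|prob=0.369 无敌)\n"
    " (VP|sentiment=3|prob=0.470 (VC|sentiment=2|prob=0.348 是)\n"
    " (VP|sentiment=2|prob=0.488 (ADVP|sentiment=2|prob=0.352 多么的)"
    " (VP|sentiment=2|prob=0.461 孤独))))"
)

for label, sentiment, prob in tree_nodes(SAMPLE_TREE):
    print(label, sentiment, prob)
```

This flattens the tree and discards the nesting, which is enough for inspecting which phrases pushed the root label away from neutral.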
Data Fields Explanation:
- sentimentValue: Integer value for "sentiment": 0-VeryNegative, 1-Negative, 2-Neutral, 3-Positive, 4-VeryPositive, 5-IgnoredForNow
- sentiment: Level of Sentiment
- sentimentDistribution: Likelihood distribution for levels of sentiment
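To illustrate how these fields relate, here is a small Python sketch that picks the most likely level from sentimentDistribution and maps it to a label. The LABELS list and the function top_sentiment are hypothetical. Only "Positive" is confirmed by the sample response; the other label spellings are taken from the field list and may differ from what the server actually returns.

```python
# Index i of sentimentDistribution corresponds to sentimentValue i.
# Label spellings other than "Positive" are assumptions based on the field list.
LABELS = ["VeryNegative", "Negative", "Neutral", "Positive", "VeryPositive", "IgnoredForNow"]


def top_sentiment(distribution: list[float]) -> tuple[int, str, float]:
    """Return (sentimentValue, label, probability) for the most likely level."""
    index = max(range(len(distribution)), key=distribution.__getitem__)
    return index, LABELS[index], distribution[index]


# The sentimentDistribution from the sample response above.
sample = [
    0.04447204774245,
    0.13967615459976,
    0.2460704480222,
    0.38548481571688,
    0.18338223574722,
    0.00091429817149,
]

value, label, prob = top_sentiment(sample)
print(value, label, round(prob, 3))
```

For the sample, the largest entry sits at index 3, which matches the reported sentimentValue of "3" and the root probability of 0.385 in sentimentTree.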
POS tagging & NER & Other
Query:
wget --post-data '我家在北京有房, 就在东二环旁边' 'nlp.awesomevc.com:81' -O-
Output:
{
"index": 1,
"word": "我家",
"originalText": "我家",
"lemma": "我家",
"characterOffsetBegin": 0,
"characterOffsetEnd": 2,
"pos": "NN",
"ner": "O"
},
{
"index": 2,
"word": "在",
"originalText": "在",
"lemma": "在",
"characterOffsetBegin": 2,
"characterOffsetEnd": 3,
"pos": "P",
"ner": "O"
},
{
"index": 3,
"word": "北京",
"originalText": "北京",
"lemma": "北京",
"characterOffsetBegin": 3,
"characterOffsetEnd": 5,
"pos": "NR",
"ner": "STATE_OR_PROVINCE"
},
{
"index": 4,
"word": "有",
"originalText": "有",
"lemma": "有",
"characterOffsetBegin": 5,
"characterOffsetEnd": 6,
"pos": "VE",
"ner": "O"
},
{
"index": 5,
"word": "房",
"originalText": "房",
"lemma": "房",
"characterOffsetBegin": 6,
"characterOffsetEnd": 7,
"pos": "NN",
"ner": "O"
},
{
"index": 6,
"word": ",",
"originalText": ",",
"lemma": ",",
"characterOffsetBegin": 7,
"characterOffsetEnd": 8,
"pos": "PU",
"ner": "O"
},
{
"index": 7,
"word": "就",
"originalText": "就",
"lemma": "就",
"characterOffsetBegin": 9,
"characterOffsetEnd": 10,
"pos": "AD",
"ner": "O"
},
{
"index": 8,
"word": "在",
"originalText": "在",
"lemma": "在",
"characterOffsetBegin": 10,
"characterOffsetEnd": 11,
"pos": "P",
"ner": "O"
},
{
"index": 9,
"word": "东二环",
"originalText": "东二环",
"lemma": "东二环",
"characterOffsetBegin": 11,
"characterOffsetEnd": 14,
"pos": "NR",
"ner": "FACILITY"
},
{
"index": 10,
"word": "旁边",
"originalText": "旁边",
"lemma": "旁边",
"characterOffsetBegin": 14,
"characterOffsetEnd": 16,
"pos": "NN",
"ner": "O"
}
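Once the token objects have been loaded into Python dictionaries, the named entities and their positions are easy to extract. The sketch below assumes the tokens are already available as a list of dicts. How they are nested inside the full response is not shown in the excerpt above, so that step is left out. The helper names named_entities and span_text are hypothetical, and the character offsets are assumed to index directly into the query string, which holds for the sample.

```python
def named_entities(tokens: list[dict]) -> list[tuple[str, str]]:
    """Return (word, ner) pairs for tokens whose NER tag is not "O" (outside)."""
    return [(t["word"], t["ner"]) for t in tokens if t.get("ner", "O") != "O"]


def span_text(text: str, token: dict) -> str:
    """Slice the original query using the token's character offsets."""
    return text[token["characterOffsetBegin"]:token["characterOffsetEnd"]]


# The query string and a few of the tokens from the output above.
TEXT = "我家在北京有房, 就在东二环旁边"
TOKENS = [
    {"index": 1, "word": "我家", "characterOffsetBegin": 0, "characterOffsetEnd": 2,
     "pos": "NN", "ner": "O"},
    {"index": 3, "word": "北京", "characterOffsetBegin": 3, "characterOffsetEnd": 5,
     "pos": "NR", "ner": "STATE_OR_PROVINCE"},
    {"index": 9, "word": "东二环", "characterOffsetBegin": 11, "characterOffsetEnd": 14,
     "pos": "NR", "ner": "FACILITY"},
]

print(named_entities(TOKENS))
print([span_text(TEXT, t) for t in TOKENS])
```

Note that the offsets count the space after the comma in the query, which is why 就 starts at offset 9 rather than 8.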