Notes on SemEval-2016 Task 4: Sentiment Analysis

Task 4 of SemEval-2016 focuses on sentiment analysis in tweets, comprising message polarity classification, tweet classification on a two-point scale, tweet classification on a five-point scale, tweet quantification on a two-point scale, and tweet quantification on a five-point scale. Across the subtasks, most top-ranked teams employed deep learning techniques such as LSTMs, CNNs, and word embeddings. In the quantification subtasks, some teams specifically tuned their systems to the quantification nature of the problem.

SemEval-2016 Sentiment Analysis

Introduction

Key concepts to distinguish
  • Ordinal Classification
    The innovation here is extending the binary classification problem to a three-class or five-class problem over an ordered scale (see the sketch after this list).

  • Quantification
    Here the goal is not sentiment analysis of an individual user's posts, but estimating the sentiment distribution over the whole set of tweets about a given topic.
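
To make the ordinal-classification idea concrete, below is a minimal sketch of the classic threshold-decomposition approach (one binary "is the label above k?" model per threshold). It illustrates the general technique only, not any particular team's system, and the class count and learner are illustrative assumptions.

```python
# Sketch: ordinal classification via K-1 binary "is the label > k?" models
# (threshold decomposition). Illustrative only; not any team's system.
import numpy as np
from sklearn.linear_model import LogisticRegression

class OrdinalDecomposition:
    def __init__(self, n_classes=5):
        self.n_classes = n_classes  # e.g. 5 for HIGHLYNEGATIVE..HIGHLYPOSITIVE
        self.models = []

    def fit(self, X, y):
        # y: integer labels 0..n_classes-1 as a NumPy array.
        # One binary model per threshold k, with target "y > k".
        self.models = [
            LogisticRegression(max_iter=1000).fit(X, (y > k).astype(int))
            for k in range(self.n_classes - 1)
        ]
        return self

    def predict_proba(self, X):
        # P(y > k) for each threshold, then class probabilities by
        # differencing: P(y = k) = P(y > k-1) - P(y > k).
        gt = np.column_stack([m.predict_proba(X)[:, 1] for m in self.models])
        n = gt.shape[0]
        gt = np.hstack([np.ones((n, 1)), gt, np.zeros((n, 1))])
        return np.clip(gt[:, :-1] - gt[:, 1:], 0.0, None)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)
```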

Task Definition (definitions of the individual subtasks)

  • Subtask A: Given a tweet, predict whether it is of positive, negative, or neutral sentiment.
  • Subtask B: Given a tweet known to be about a given topic, predict whether it conveys a positive or a negative sentiment towards the topic.
  • Subtask C: Given a tweet known to be about a given topic, estimate the sentiment it conveys towards the topic on a five-point scale ranging from HIGHLYNEGATIVE to HIGHLYPOSITIVE.
  • Subtask D: Given a set of tweets known to be about a given topic, estimate the distribution of the tweets in the POSITIVE and NEGATIVE classes.
  • Subtask E: Given a set of tweets known to be about a given topic, estimate the distribution of the tweets across the five classes of a five-point scale, ranging from HIGHLYNEGATIVE to HIGHLYPOSITIVE.

Evaluation Measures
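
The task paper defines one official measure per subtask: F1 averaged over the positive and negative classes for Subtask A, macro-averaged recall for Subtask B, macro-averaged mean absolute error for Subtask C, Kullback-Leibler divergence (KLD) for Subtask D, and Earth Mover's Distance for Subtask E. Since KLD reappears in the Subtask D discussion below, here is a minimal sketch of it; the smoothing constant is an assumption (the official scorer uses a smoothed variant to avoid division by zero).

```python
# Sketch: KLD between true and predicted class prevalences (Subtask D's
# official measure). The eps smoothing value is an illustrative assumption.
import math

def kld(true_prev, pred_prev, eps=1e-6):
    def smooth(p):
        s = [pi + eps for pi in p]  # keep every class strictly positive
        z = sum(s)
        return [pi / z for pi in s]
    p, q = smooth(true_prev), smooth(pred_prev)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# e.g. kld([0.7, 0.3], [0.6, 0.4]) -> a small positive number; 0 means
# the predicted distribution matches the true one exactly.
```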

Participants and Results

  • Subtask A (34 teams)
  • Subtask B (19 teams)
  • Subtask C (11 teams)
  • Subtask D (14 teams)
  • Subtask E (10 teams)
Subtask A: Message polarity classification


  • The top-scoring team (SwissCheese1; here and below, the trailing number in a team name is the team's official rank) used an ensemble of convolutional neural networks, differing in their choice of filter shapes, pooling shapes, and usage of hidden layers. Word embeddings generated via word2vec were also used, and the neural networks were trained using distant supervision. A minimal sketch of this style of architecture follows this list.
  • Out of the 10 top-ranked teams, 5 teams (SwissCheese1, SENSEI-LIF2, UNIMELB3, INESC-ID4, INSIGHT-18) used deep NNs of some sort, and 7 teams (SwissCheese1, SENSEI-LIF2, UNIMELB3, INESC-ID4, aueb.twitter.sentiment5, I2RNTU7, INSIGHT-18) used either general-purpose or task-specific word embeddings, generated via word2vec or GloVe.
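
For readers unfamiliar with this family of models, below is a minimal PyTorch sketch of a Kim-style CNN sentence classifier of the kind these systems build on. It is not SwissCheese's ensemble; the vocabulary size, dimensions, and filter widths are illustrative assumptions.

```python
# Minimal CNN sentence classifier sketch (Kim-2014 style). Hyperparameters
# are illustrative, not any team's actual settings.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=50000, emb_dim=100, n_filters=100,
                 filter_sizes=(3, 4, 5), n_classes=3):
        super().__init__()
        # Embedding layer; in practice initialized with word2vec vectors.
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, kernel_size=k) for k in filter_sizes
        )
        self.fc = nn.Linear(n_filters * len(filter_sizes), n_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, emb_dim, seq_len)
        # Max-over-time pooling: one feature per convolutional filter.
        feats = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(feats, dim=1))    # (batch, n_classes) logits

# Usage sketch: logits = TextCNN()(torch.randint(1, 50000, (8, 40)))
```
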
Subtask B: Tweet classification according to a two-point scale

  • The top-scoring team (Tweester1) used a combination of convolutional neural networks, topic modeling, and word embeddings generated via word2vec. As in Subtask A, the main trend among participants was the widespread use of deep learning techniques.

  • Out of the 10 top-ranked participating teams, 5 teams (Tweester1, LYS2, INSIGHT-15, UNIMELB7, Finki10) used convolutional neural networks; 3 teams (thecerealkiller3, UNIMELB7, Finki10) submitted systems using recurrent neural networks; and 7 teams (Tweester1, LYS2, INSIGHT-15, UNIMELB7, Finki10) incorporated either general-purpose or task-specific word embeddings (generated via toolkits such as GloVe or word2vec) in their systems. A short sketch of training such embeddings follows this list.
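
As a concrete reference point for the embedding side, here is a sketch of training task-specific embeddings with gensim's word2vec implementation (gensim 4.x API); the toy corpus and hyperparameters are illustrative assumptions.

```python
# Sketch: training task-specific word embeddings on a tweet corpus with
# gensim's word2vec. Corpus and hyperparameters are illustrative.
from gensim.models import Word2Vec

tweets = [
    ["great", "match", "today"],      # each tweet pre-tokenized
    ["terrible", "refereeing", "today"],
]
model = Word2Vec(sentences=tweets, vector_size=100, window=5,
                 min_count=1, sg=1, epochs=10)  # sg=1 -> skip-gram
vec = model.wv["today"]               # 100-dim vector for a token
```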

Subtask C: Tweet classification according to a five-point scale


  • The top-scoring team (TwiSE1) used a single-label multi-class classifier to classify the tweets according to their overall polarity. In particular, they used logistic regression that minimizes the multinomial loss across the classes, with weights to cope with class imbalance. Note that they ignored the given topics altogether. A scikit-learn sketch of this recipe follows this list.
  • Only 2 of the 11 participating teams tuned their systems to exploit the ordinal (as opposed to binary, or single-label multi-class) nature of this subtask. The two teams who did exploit the ordinal nature of the problem are PUT3, which used an ensemble of ordinal regression approaches, and ISTI-CNR7, which used a tree-based approach to ordinal regression. All other teams used general-purpose approaches for single-label multi-class classification, in many cases relying (as for Subtask B) on convolutional neural networks, recurrent neural networks, and word embeddings.
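
The recipe described for TwiSE maps almost directly onto scikit-learn. The sketch below assumes pre-extracted feature vectors and shows the generic recipe, not the team's exact configuration.

```python
# Sketch: single-label multi-class logistic regression with multinomial
# (softmax) loss and class weighting for imbalance. The solver and feature
# representation are illustrative assumptions.
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(
    multi_class="multinomial",  # minimize the multinomial loss over classes
    solver="lbfgs",             # a solver that supports the multinomial loss
    class_weight="balanced",    # reweight classes to cope with imbalance
    max_iter=1000,
)
# clf.fit(X_train, y_train)    # X: feature vectors; y: labels on the 5-point scale
# preds = clf.predict(X_test)
```
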
Subtask D: Tweet quantification according to a two-point scale


  • The top-scoring team (Finki1) adopted an approach based on “classify and count”, a classification-oriented (instead of quantification-oriented) approach, using recurrent and convolutional neural networks and GloVe word embeddings.
  • Indeed, only 5 of the 14 participating teams tuned their systems to the fact that this subtask deals with quantification (as opposed to classification). Among the teams that did rely on quantification-oriented approaches, teams LYS2 and HSENN14 used an existing structured prediction method that directly optimizes KLD; teams QCRI5 and ISTI-CNR11 used existing probabilistic quantification methods; team NRU-HSE7 used an existing iterative quantification method based on cost-sensitive learning. Interestingly, team TwiSE2 used a “classify and count” approach after comparing it with a quantification-oriented method (similar to the one used by teams LYS2 and HSENN14) on the development set and concluding that the former works better than the latter.
    All other teams used “classify and count” approaches, mostly based on convolutional neural networks and word embeddings. A sketch contrasting “classify and count” with a probabilistic quantification variant follows this list.
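
To clarify the distinction this subtask turns on, here is a sketch contrasting plain “classify and count” with probabilistic classify-and-count, one of the existing probabilistic quantification methods; it is illustrative only, not any team's actual system.

```python
# Sketch: "classify and count" (CC) vs. probabilistic classify and count
# (PCC) for estimating class prevalences over a set of topic tweets.
import numpy as np

def classify_and_count(clf, X):
    # Prevalence = fraction of hard predictions assigned to each class.
    preds = clf.predict(X)
    return np.array([(preds == c).mean() for c in clf.classes_])

def prob_classify_and_count(clf, X):
    # Prevalence = mean posterior probability of each class.
    return clf.predict_proba(X).mean(axis=0)

# With any fitted probabilistic classifier clf (e.g. logistic regression):
# print(classify_and_count(clf, X_topic))
# print(prob_classify_and_count(clf, X_topic))
```
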
Subtask E: Tweet quantification according to a five-point scale


  • Only 3 of the 10 participants tuned their systems to the specific characteristics of this subtask, i.e., to the fact that it deals with quantification (as opposed to classification) and to the fact that it has an ordinal (as opposed to binary) nature.
  • The top-scoring team (QCRI1) used a novel algorithm explicitly designed for ordinal quantification, which leverages an ordinal hierarchy of binary probabilistic quantifiers (a toy sketch of the general idea follows this list).
  • Team NRU-HSE4 used an existing quantification approach based on cost-sensitive learning and adapted it to the ordinal case.
  • Team ISTI-CNR6 instead used a novel adaptation to quantification of a tree-based approach to ordinal regression.
  • Teams LYS7 and HSENN9 also used an existing quantification approach, but did not exploit the ordinal nature of the problem.
  • The other teams mostly used approaches based on “classify and count” (see Section 5.4), and viewed the problem as single-label multi-class (instead of ordinal) classification; some of these teams (notably, team Finki2) obtained very good results, which testifies to the quality of the (general-purpose) features and learning algorithm they used.
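
As a rough illustration of combining binary quantifiers over an ordered scale, the toy sketch below estimates cumulative prevalences P(y <= k) at each threshold via probabilistic classify-and-count and recovers per-class prevalences by differencing. This is a simplified stand-in for the general idea only, emphatically not QCRI's actual algorithm.

```python
# Toy sketch of ordinal quantification: one binary quantifier per threshold
# of the ordered scale, then differencing. NOT QCRI's algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ordinal_quantify(X_train, y_train, X_test, n_classes=5):
    # y_train: integer labels 0..n_classes-1 as a NumPy array.
    cum = []
    for k in range(n_classes - 1):
        m = LogisticRegression(max_iter=1000)
        m.fit(X_train, (y_train <= k).astype(int))
        # Probabilistic classify-and-count estimate of P(y <= k) on X_test.
        cum.append(m.predict_proba(X_test)[:, 1].mean())
    cum = np.clip(np.sort(cum), 0.0, 1.0)   # crude monotonicity fix
    cum = np.concatenate([cum, [1.0]])      # P(y <= top class) = 1
    return np.diff(np.concatenate([[0.0], cum]))  # per-class prevalences
```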

Conclusion

Papers worth reading:
Top-scoring system for each subtask
* A: SwissCheese (Deriu et al., 2016)
* B: Tweester (Palogiannidi et al., 2016)
* C: TwiSE (Balikas and Amini, 2016)
* D: Finki (Stojanovski et al., 2016)
* E: QCRI (Da San Martino et al., 2016)
Distinctive approaches
* PUT3 (Lango et al., 2016)
* ISTI-CNR (Esuli, 2016)
* LYS (Vilares et al., 2016)
* QCRI5 (Da San Martino et al., 2016)
* NRU-HSE (Karpov et al., 2016)

Although there are many related papers, few merit deep study: nearly all systems apply deep learning, differing mainly in how they construct their networks.
