Lexicon creation
1.1 Existing lexicons
1.1.1 NRC Emotion Lexicon (Mohammad & Turney, 2010): annotated for eight emotions (joy, sadness, anger, fear, disgust, surprise, trust, and anticipation) as well as for positive and negative sentiment.
1.1.2 Bing Liu’s Lexicon (Hu & Liu, 2004): provides a list of positive and negative words manually extracted from customer reviews.
1.1.3 MPQA Subjectivity Lexicon (Wilson et al., 2005): contains words marked with their prior polarity (positive or negative) and a discrete strength of evaluative intensity (strong or weak).
Entries in these lexicons do not come with a real-valued score indicating fine-grained evaluative intensity. In other words, none of these lexicons provides a numeric sentiment-strength value.
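To make the limitation concrete, here is a minimal Python sketch of how such a categorical lexicon is typically consulted. Everything in it is an assumption made for illustration: the `LexiconEntry` type, the `count_polarity` helper, and the toy entries are hypothetical and are not taken from any of the lexicon files above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LexiconEntry:
    """Hypothetical in-memory form of one lexicon entry."""
    polarity: str                   # "positive" or "negative"
    strength: Optional[str] = None  # "strong" / "weak" (MPQA-style), or None


# Toy entries for illustration only; not copied from any real lexicon.
TOY_LEXICON: Dict[str, LexiconEntry] = {
    "great": LexiconEntry("positive", "strong"),
    "fine": LexiconEntry("positive", "weak"),
    "awful": LexiconEntry("negative", "strong"),
}


def count_polarity(tokens: List[str],
                   lexicon: Dict[str, LexiconEntry]) -> Dict[str, int]:
    """Count how many tokens the lexicon marks as positive or negative."""
    counts = {"positive": 0, "negative": 0}
    for token in tokens:
        entry = lexicon.get(token.lower())
        if entry is not None:
            counts[entry.polarity] += 1
    return counts
```

With labels like these, "great" and "fine" each contribute the same single positive count. At best a coarse strong/weak flag separates them, and no real-valued score is available.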
1.2 Lexicons created by the authors
1.2.1 Hashtag Sentiment Lexicon
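Some tweets contain hashtags of the form "#word", where the word after the # indicates the topic or sentiment of the tweet. Tweets are therefore crawled from Twitter by searching for such hashtags, and each collected tweet is labeled with the polarity of the word that follows the #. The authors used a 74-word seed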