Word embedding and sentiment analysis

Article link: https://blog.csdn.net/m0_74756454/article/details/138476233

I.Sparse versus dense vectors

1)Difference

TF-IDF (or PMI) vectors are

- long (length |V|= 20,000 to 50,000)

- sparse (most elements are zero)

Alternative: learn vectors which are

- short (length 50-1000)

- dense (most elements are non-zero)

2)Why dense vectors?

- Short vectors may be easier to use as features in machine learning (fewer weights to tune)

- Dense vectors may generalize better than explicit counts

- Dense vectors may do better at capturing synonymy:

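• car and automobile are synonyms, but in a sparse vector they are distinct dimensions

• a word with car as a neighbor and a word with automobile as a neighbor should be similar, but the two counts fall in different dimensions and so do not make the words similar

- In practice, dense vectors work better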
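The synonymy point can be made concrete with a small sketch. Everything in it is an illustrative assumption: the four context dimensions, the placeholder words word_a and word_b, and the hand-picked 2-d context embeddings E (real embeddings are learned, not written by hand). It only shows that two words whose neighbors are car and automobile have cosine similarity 0 in the sparse space, and high similarity once car and automobile are placed close together in a dense space.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Sparse view: one dimension per context word (toy, hypothetical vocabulary).
contexts = ["car", "automobile", "road", "banana"]
word_a_sparse = np.array([1.0, 0.0, 0.0, 0.0])  # word_a was seen next to "car"
word_b_sparse = np.array([0.0, 1.0, 0.0, 0.0])  # word_b was seen next to "automobile"

# Dense view: hand-picked 2-d vectors for the context words (an assumption,
# chosen so that "car" and "automobile" lie close together).
E = np.array([
    [0.90, 0.10],  # car
    [0.85, 0.15],  # automobile
    [0.50, 0.50],  # road
    [0.00, 1.00],  # banana
])
word_a_dense = word_a_sparse @ E
word_b_dense = word_b_sparse @ E

sparse_sim = cosine(word_a_sparse, word_b_sparse)
dense_sim = cosine(word_a_dense, word_b_dense)
print(f"sparse: {sparse_sim:.3f}  dense: {dense_sim:.3f}")
```

In the sparse space the two vectors are orthogonal, so no similarity measure based on shared dimensions can relate them; the dense space can, because similar context words share directions.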

3)Classes of embedding

- Static embedding: one fixed embedding for each word in the vocabulary

- Dynamic (contextual) embedding: the vector for each word is different in different contexts

II.Word2vec

1)Definition:

- Popular embedding method

- Very fast to train

- Idea: predict rather than count

- Word2vec provides several variants; these notes cover skip-gram with negative sampling (SGNS)

2)Word2vec

Instead of counting how often each word w occurs near "apricot"

- Train a classifier on a binary prediction task:

Is w likely to show up near "apricot"?

- We don't actually care about this task itself:

• But we'll take the learned classifier weights as the word embeddings

- Big idea: self-supervision:

• A word c that occurs near apricot in the corpus acts as the gold "correct answer" for supervised learning

• No need for human labels

• Bengio et al. (2003); Collobert et al. (2011)

3)Approach:

1.Treat the target word t and a neighboring context word c as positive examples.

2.Randomly sample other words in the lexicon to get negative examples

3.Use logistic regression to train a classifier to distinguish those two cases

4.Use the learned weights as the embedding
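The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference word2vec implementation: the corpus is one toy sentence, the extra lexicon words (aardvark, seven, forever, coaxial) are hypothetical, negatives are drawn uniformly (word2vec uses a weighted unigram distribution), and the hyperparameters dim, k, lr and the epoch count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus plus a few hypothetical lexicon words that never occur in it,
# so they can only ever be drawn as negative examples.
tokens = "lemon a tablespoon of apricot jam a pinch".split()
lexicon = sorted(set(tokens)) + ["aardvark", "seven", "forever", "coaxial"]
idx = {w: i for i, w in enumerate(lexicon)}
V, dim, window, k, lr = len(lexicon), 16, 2, 2, 0.1  # arbitrary settings

W = rng.normal(scale=0.1, size=(V, dim))  # target-word vectors
C = rng.normal(scale=0.1, size=(V, dim))  # context-word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Step 1: (target, context) pairs inside the window are positive examples.
positives = [
    (idx[t], idx[tokens[j]])
    for i, t in enumerate(tokens)
    for j in range(max(0, i - window), min(len(tokens), i + window + 1))
    if j != i
]

for epoch in range(200):
    for t, c in positives:
        # Step 2: draw k negative context words at random from the lexicon
        # (uniformly here, which is a simplification).
        candidates = [i for i in range(V) if i not in (t, c)]
        negatives = rng.choice(candidates, size=k)
        examples = [(c, 1.0)] + [(int(n), 0.0) for n in negatives]
        # Step 3: one logistic-regression SGD step per example.
        for ctx, label in examples:
            w_old = W[t].copy()
            err = sigmoid(w_old @ C[ctx]) - label  # gradient of the log loss
            W[t] -= lr * err * C[ctx]
            C[ctx] -= lr * err * w_old

# Step 4: keep the learned target vectors as the embeddings.
embeddings = W

def p_positive(w, c):
    """Estimated probability that c is a real context word of w."""
    return float(sigmoid(W[idx[w]] @ C[idx[c]]))

p_jam = p_positive("apricot", "jam")
p_aardvark = p_positive("apricot", "aardvark")
print(f"P(+|apricot, jam) = {p_jam:.2f}, P(+|apricot, aardvark) = {p_aardvark:.2f}")
```

After training, a pair that was actually observed, such as (apricot, jam), should receive a higher probability than a pair that was only ever sampled as a negative, such as (apricot, aardvark).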

4)Skip-gram:

1, Training data:

Assume a +/- 2 word window, given training sentence:

…lemon, a [tablespoon of apricot jam, a] pinch…

c1 = tablespoon, c2 = of, target = apricot, c3 = jam, c4 = a
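Collecting the positive pairs from a window can be sketched as follows. The function name skipgram_pairs is hypothetical, and the sentence is tokenized by hand with punctuation and the elided parts dropped, which is an assumption rather than something the source specifies.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for every token within +/- window."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs

# Hand-tokenized fragment of the example sentence (punctuation removed).
tokens = ["lemon", "a", "tablespoon", "of", "apricot", "jam", "a", "pinch"]
pairs = skipgram_pairs(tokens, window=2)

apricot_pairs = [p for p in pairs if p[0] == "apricot"]
print(apricot_pairs)
```

For the target apricot this yields the four context words in the bracketed window: tablespoon, of, jam, a.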

2, Classifier:

Goal: train a classifier that is given a candidate (word, context) pair, e.g. (apricot, jam), (apricot, aardvark), …

and assigns each pair a probability that c is a real context word of w: P(+|w, c)

The probability that it is not is the complement: P(−|w, c) = 1 − P(+|w, c)
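The notes do not say how the classifier computes P(+|w, c). In the usual skip-gram formulation (an assumption here, with bold w and c denoting the embedding vectors of the word and of the context word), the probability is the sigmoid of their dot product, so similar vectors give a probability close to 1:

```latex
P(+ \mid w, c) = \sigma(\mathbf{c} \cdot \mathbf{w}) = \frac{1}{1 + \exp(-\mathbf{c} \cdot \mathbf{w})}
\qquad
P(- \mid w, c) = 1 - P(+ \mid w, c) = \frac{1}{1 + \exp(\mathbf{c} \cdot \mathbf{w})}
```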

3,Positive or negative (sentiment classification)

(+) zany characters and richly applied satire, and some great plot twists.

(−) It was pathetic. The worst part about it was the boxing scenes...

(+) ...awesome caramel sauce and sweet toasty almonds. I love this place!

(−) ...awful pizza and ridiculously overpriced...

4,Steps

- Vector representation

- Data sources: hand labeling; kaggle.com; the Internet

- Evaluation metrics:

i.Precision and recall

Precision = True positive / (True positive + False positive)

Recall = True positive / (True positive + False negative)

ii.F-score

- The harmonic mean of precision and recall: F1 = 2 × Precision × Recall / (Precision + Recall)

- F1 gives equal importance to precision and recall

iii. Accuracy

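- Binary classification: Accuracy = (TP + TN) / (TP + FP + TN + FN)

- Multi-class classification: Accuracy = number of correctly classified examples / total number of examples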

Note: TP = True positive; FP = False positive; TN = True negative; FN = False negative
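The metrics above can be computed directly from the four counts. The sketch below is illustrative: the function name binary_metrics and the example counts are made up, and returning 0.0 when a denominator is zero is one common convention, not something the notes prescribe.

```python
def binary_metrics(tp, fp, tn, fn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts for a sentiment classifier evaluated on 100 reviews.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts precision is 40/50, recall is 40/45, and accuracy is 85/100; F1 lies between precision and recall, closer to the smaller of the two.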

5,Scherer Typology of Affective States

Emotion: brief organically synchronized … evaluation of a major event - angry, sad, joyful, fearful, ashamed, proud, elated

Mood: diffuse non-caused low-intensity long-duration change in subjective feeling - cheerful, gloomy, irritable, listless, depressed, buoyant

Interpersonal stances: affective stance toward another person in a specific interaction - friendly, flirtatious, distant, cold, warm, supportive, contemptuous

Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons - liking, loving, hating, valuing, desiring

Personality traits: stable personality dispositions and typical behavior tendencies - nervous, anxious, reckless, morose, hostile, jealous