Notes on several speech emotion recognition datasets (EMO-DB, RAVDESS, SAVEE)

Wsyoneself

Last modified 2023-07-11 08:12:58


First published 2022-10-07 16:21:57

Article link: https://blog.csdn.net/weixin_45647721/article/details/127195641


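EMO-DB:

German; about 500 utterances from 10 speakers (5 male, 5 female) expressing 7 emotions. The second-to-last letter of the file name gives the emotion class: N = neutral, W = angry, A = fear, F = happy, T = sad, E = disgust, L = boredom. The letters are the initials of the German emotion words.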

What each character of the file name means:

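Some files also carry a 7th letter. In the EMO-DB documentation this position distinguishes multiple versions of the same utterance (a, b, c, ...), so it can usually be ignored for emotion labelling.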

Position 6 encodes the emotion:

W: anger
L: boredom
E: disgust
A: anxiety/fear
F: happiness
T: sadness
N: neutral version

Positions 3-5 encode the spoken text (code of texts; the sentences are spoken in German and are given here in English translation):

a01 The tablecloth is lying on the fridge.
a02 She will hand it in on Wednesday.
a04 Tonight I could tell him.
a05 The black sheet of paper is located up there beside the piece of timber.
a07 In seven hours it will be.
b01 What about the bags standing there under the table?
b02 They just carried it upstairs and now they are going down again.
b03 Currently at the weekends I always went home and saw Agnes.
b09 I will just discard this and then go for a drink with Karl.
b10 It will be in the place where we always store it.

Positions 1-2 identify the speaker. Gender and age of each speaker:

03 - male, 31 years old
08 - female, 34 years old
09 - female, 21 years old
10 - male, 32 years old
11 - male, 26 years old
12 - male, 30 years old
13 - female, 32 years old
14 - female, 35 years old
15 - male, 25 years old
16 - female, 31 years old
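
The naming scheme above can be decoded mechanically. The following is a minimal Python sketch, not part of the dataset's own tooling: the function name `parse_emodb_name` and the keys of the returned dictionary are hypothetical, and it assumes file names such as `03a01Fa.wav`, with the speaker in positions 1-2, the text code in positions 3-5 and the emotion letter in position 6. Anything after position 6 is returned unchanged as `suffix`.

```python
from pathlib import Path

# Position 6 letter -> emotion label, copied from the list above.
EMODB_EMOTIONS = {
    "W": "anger",
    "L": "boredom",
    "E": "disgust",
    "A": "fear",
    "F": "happiness",
    "T": "sadness",
    "N": "neutral",
}

# Positions 1-2 -> (gender, age), copied from the speaker list above.
EMODB_SPEAKERS = {
    "03": ("male", 31),
    "08": ("female", 34),
    "09": ("female", 21),
    "10": ("male", 32),
    "11": ("male", 26),
    "12": ("male", 30),
    "13": ("female", 32),
    "14": ("female", 35),
    "15": ("male", 25),
    "16": ("female", 31),
}


def parse_emodb_name(filename):
    """Split an EMO-DB file name into its labelled parts (hypothetical helper)."""
    stem = Path(filename).stem  # drop directory and extension
    if len(stem) < 6:
        raise ValueError(f"unexpected EMO-DB file name: {filename!r}")
    speaker = stem[0:2]      # positions 1-2
    text_code = stem[2:5]    # positions 3-5
    emotion_letter = stem[5]  # position 6
    if speaker not in EMODB_SPEAKERS or emotion_letter not in EMODB_EMOTIONS:
        raise ValueError(f"unknown speaker or emotion in: {filename!r}")
    gender, age = EMODB_SPEAKERS[speaker]
    return {
        "speaker": speaker,
        "gender": gender,
        "age": age,
        "text_code": text_code,
        "emotion": EMODB_EMOTIONS[emotion_letter],
        "suffix": stem[6:] or None,  # optional 7th letter, kept as-is
    }
```

Calling `parse_emodb_name("03a01Fa.wav")` yields the speaker `03` (male, 31), the text code `a01` and the emotion `happiness`. Unknown speakers or emotion letters raise `ValueError` rather than being silently mislabelled.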

RAVDESS:

Each file name is a 7-part numeric identifier (for example, 02-01-06-01-02-01-12.mp4). The parts describe the characteristics of the stimulus:
1. File name identifiers
  1. Modality (01 = full-AV, 02 = video-only, 03 = audio-only).
  2. Vocal channel (01 = speech, 02 = song).
  3. Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).
  4. Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.
  5. Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").
  6. Repetition (01 = 1st repetition, 02 = 2nd repetition).
  7. Actor (01 to 24. Odd numbered actors are male, even numbered actors are female).
2. File name example: 02-01-06-01-02-01-12.mp4
  1. Video-only (02)
  2. Speech (01)
  3. Fearful (06)
  4. Normal intensity (01)
  5. Statement "dogs" (02)
  6. 1st Repetition (01)
  7. 12th Actor (12)
  8. Female, as the actor ID number is even
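3. English; roughly 1,500 audio recordings from 24 actors (12 male, 12 female) expressing 8 emotions. The third field of the file name gives the emotion class: 01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised.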
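
The seven RAVDESS fields can be decoded by splitting the file name on hyphens. The sketch below is only an illustration under the naming scheme listed above: the function name `parse_ravdess_name` and the dictionary keys are hypothetical, and the lookup tables simply restate the identifier list.

```python
from pathlib import Path

# Lookup tables restating the identifier list above.
MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
VOCAL_CHANNEL = {"01": "speech", "02": "song"}
EMOTION = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}
INTENSITY = {"01": "normal", "02": "strong"}
STATEMENT = {
    "01": "Kids are talking by the door",
    "02": "Dogs are sitting by the door",
}


def parse_ravdess_name(filename):
    """Decode a RAVDESS file name into labelled fields (hypothetical helper)."""
    parts = Path(filename).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 hyphen-separated fields: {filename!r}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    actor_id = int(actor)
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": int(repetition),
        "actor": actor_id,
        # Odd-numbered actors are male, even-numbered actors are female.
        "gender": "male" if actor_id % 2 == 1 else "female",
    }
```

Applied to the example file name `02-01-06-01-02-01-12.mp4`, this returns video-only, speech, fearful, normal intensity, the "Dogs" statement, first repetition, actor 12, female, which matches the breakdown in item 2.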

SAVEE:
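1. Speakers: 'DC', 'JE', 'JK' and 'KL' are the four male speakers recorded for the SAVEE database.
2. Audio data:
  1. The audio files are WAV files sampled at 44.1 kHz.
  2. There are 15 sentences for each of the 7 emotion categories.
  3. The initial letter(s) of the file name give the emotion class, and the digits that follow give the sentence number.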
  4. The letters 'a', 'd', 'f', 'h', 'n', 'sa' and 'su' represent 'anger', 'disgust', 'fear', 'happiness', 'neutral', 'sadness' and 'surprise' emotion classes respectively.
  5. E.g., 'd03.wav' is the 3rd disgust sentence.
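
Because 'sa' and 'su' are two letters long, reading only the first character of a SAVEE file name would confuse sadness with surprise. A small Python sketch of one way to handle this with a regular expression follows. The function name `parse_savee_name` and the returned keys are hypothetical. The optional speaker prefix (for example `DC_a01.wav`) is an assumption about how some copies of the dataset flatten the per-speaker folders; it is not described in the notes above.

```python
import re
from pathlib import Path

# Emotion prefix -> emotion class, as listed above.
SAVEE_EMOTIONS = {
    "a": "anger",
    "d": "disgust",
    "f": "fear",
    "h": "happiness",
    "n": "neutral",
    "sa": "sadness",
    "su": "surprise",
}

# Optional "<speaker>_" prefix (assumption), then the emotion letters, then digits.
# The two-letter codes are listed first so that "sa"/"su" are matched as a unit.
_SAVEE_PATTERN = re.compile(r"^(?:(DC|JE|JK|KL)_)?(sa|su|a|d|f|h|n)(\d+)$")


def parse_savee_name(filename):
    """Decode a SAVEE file name such as 'd03.wav' (hypothetical helper)."""
    stem = Path(filename).stem
    match = _SAVEE_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected SAVEE file name: {filename!r}")
    speaker, code, number = match.groups()
    return {
        "speaker": speaker,  # None when the name carries no speaker prefix
        "emotion": SAVEE_EMOTIONS[code],
        "sentence": int(number),
    }
```

For `d03.wav` this gives the emotion `disgust` and sentence number 3, in line with the example above. When the files sit in per-speaker folders, the speaker can be taken from the parent directory name instead of a prefix.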