2018 Interspeech On Enhancing Speech Emotion Recognition using Generative Adversarial Networks

wangdapang_2

Category: 读顶会 (Reading Top Conferences)

Article link: https://blog.csdn.net/qq_38221026/article/details/104260884


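Two GAN networks (not described in detail here)

Two kinds of GAN are used to generate samples: a vanilla GAN and a conditional GAN. The vanilla GAN cannot synthesize 1582-dimensional features because its training does not converge. The conditional GAN uses label information and can generate 1582-dimensional features.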
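As a rough illustration of what "using label information" can look like, the sketch below builds the input of a conditional generator by appending a one-hot label code to a noise vector. This is one common way to condition a GAN, not necessarily the paper's exact design. The emotion list, the noise size, and the helper names `one_hot` and `generator_input` are assumptions made for this example.

```python
import random

# Hypothetical label set; the notes do not list the emotion classes.
EMOTIONS = ["angry", "happy", "neutral", "sad"]
NOISE_DIM = 8  # assumed noise size, chosen only for illustration


def one_hot(label, classes=EMOTIONS):
    """Encode a class label as a one-hot list of floats."""
    return [1.0 if c == label else 0.0 for c in classes]


def generator_input(label, noise_dim=NOISE_DIM, rng=random):
    """Build a conditional generator input: Gaussian noise followed by the label code."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(label)


z = generator_input("sad", rng=random.Random(0))
print(len(z))  # noise entries plus one entry per class
```

A real generator network would map this vector to a 1582-dimensional feature vector; that mapping is omitted here.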

How the generated data helps the final SER accuracy
Three sets of evaluations:
(i) in-domain evaluation with synthetic samples as training data, with and without real data
(ii) in-domain evaluation with synthetic samples as test data
(iii) a cross-corpus evaluation using a combination of real and synthetic data

First, both GANs are trained on four sessions of IEMOCAP to generate synthetic samples.
Then:
(i) Synthetic samples as training data. Specifically, an SVM is trained under three settings for the vanilla GAN (likewise, three SVMs are trained for the conditional GAN):
1. using only the synthetic samples generated by the vanilla GAN
2. using only the real samples in the four IEMOCAP training sessions
3. using a combination of both synthetic and real samples
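The three training settings can be sketched as three data sets built from the same two pools of samples. This is a minimal sketch: the function name `build_training_sets`, the setting names, and the toy data are assumptions for illustration, and the notes do not say in what ratio real and synthetic samples were mixed.

```python
def build_training_sets(real, synthetic):
    """Return the three training sets, keyed by setting name.

    `real` and `synthetic` are lists of (features, label) pairs.
    """
    return {
        "synthetic_only": list(synthetic),
        "real_only": list(real),
        "real_plus_synthetic": list(real) + list(synthetic),
    }


# Toy placeholder data, not taken from IEMOCAP.
real = [([0.1, 0.2], "happy"), ([0.9, 0.8], "sad")]
synthetic = [([0.15, 0.25], "happy")]

training_sets = build_training_sets(real, synthetic)
for name, data in training_sets.items():
    print(name, len(data))
```

One classifier would then be fitted per setting (for example with scikit-learn's `SVC`) and all three scored on the same held-out real data.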
[Image: results table]
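For the 2-D GAN, the conclusion is that real data plus synthetic samples performs best (the underlying Gaussian distribution is less complex than the distribution of the real 1582-dimensional data). For the 1582-dimensional conditional GAN, adding synthetic data slightly hurt the results instead. The mixing ratio is not stated, however.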

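The "improved-conditional" entry in the last row of the table refers to the following.
To improve the model, two training strategies are used: the learning rate of G is larger than that of D (0.001 vs. 0.0001), and G is trained five times for every one update of D. (Worth borrowing.)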
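The two strategies amount to an update schedule, which can be written out as below. This is only a sketch of the ordering and learning rates given in the notes; the actual gradient steps are omitted, and the names `update_schedule` and `G_STEPS_PER_D_STEP` are made up for this example.

```python
G_LR = 0.001   # generator learning rate, from the notes
D_LR = 0.0001  # discriminator learning rate, from the notes
G_STEPS_PER_D_STEP = 5  # G is updated five times per D update


def update_schedule(num_rounds):
    """Yield (network, learning_rate) for each update, in training order."""
    for _ in range(num_rounds):
        for _ in range(G_STEPS_PER_D_STEP):
            yield ("G", G_LR)
        yield ("D", D_LR)


schedule = list(update_schedule(2))
print([name for name, _ in schedule])
```

In a real training loop each yielded entry would correspond to one optimizer step on the named network, with the two networks using separate optimizers so that their learning rates can differ.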

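(ii) In-domain evaluation with synthetic samples as test data

A model trained on real data is used to classify the synthetic data, which checks how similar the two kinds of data are. The classifier is an SVM. Again there are two groups of results.
For the conditional GAN, the result is similar to the case where synthetic samples were used for training (34.09% vs. 35.23%).
Samples from the 2-D GAN are easier to predict than the high-dimensional ones.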
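The check described above can be sketched as scoring a real-data classifier on labeled synthetic samples. The sketch assumes each synthetic sample carries the label it was generated for and uses plain accuracy; the notes do not say which metric the paper reports. `toy_predict` is a hypothetical stand-in for the SVM trained on real data.

```python
def accuracy_on_synthetic(predict, synthetic):
    """Fraction of synthetic samples whose predicted label matches their intended label."""
    if not synthetic:
        raise ValueError("no synthetic samples to score")
    hits = sum(1 for features, label in synthetic if predict(features) == label)
    return hits / len(synthetic)


def toy_predict(features):
    """Hypothetical stand-in for a classifier trained on real data."""
    return "happy" if features[0] > 0.5 else "sad"


# Toy placeholder samples: (features, intended label).
synthetic = [([0.9], "happy"), ([0.2], "sad"), ([0.7], "sad"), ([0.1], "sad")]
score = accuracy_on_synthetic(toy_predict, synthetic)
print(score)
```

A high score suggests the synthetic samples fall in the regions the real-data classifier associates with their labels; a score near chance suggests they do not.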

(iii) Cross-corpus evaluation with synthetic and real data

IEMOCAP is used for training and MSP-IMPROV [14] for testing.
This part is not described in detail.