Latif, Siddique and Asim, Muhammad and Rana, Rajib ORCID: https://orcid.org/0000-0002-0506-2409 and Khalifa, Sara and Jurdak, Raja and Schuller, Bjorn W.
(2020)
Augmenting generative adversarial networks for speech emotion recognition.
In: 21st Annual Conference of the International Speech Communication Association: Cognitive Intelligence for Speech Processing (INTERSPEECH 2020), 25–29 Oct 2020, Shanghai, China.
Abstract
Generative adversarial networks (GANs) have shown potential in learning emotional attributes and generating new data samples. However, their performance is usually hindered by the lack of large speech emotion recognition (SER) datasets. In this work, we propose a framework that utilises the mixup data augmentation scheme to augment the GAN in feature learning and generation. To show the effectiveness of the proposed framework, we present results for SER on (i) synthetic feature vectors, (ii) augmentation of the training data with synthetic features, and (iii) encoded features in compressed representation. Our results show that the proposed framework can effectively learn compressed emotional representations and can generate synthetic samples that help improve performance in within-corpus and cross-corpus evaluation.
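For readers unfamiliar with mixup, the following is a minimal sketch of the general technique the abstract refers to: each training example is replaced by a convex combination of itself and a randomly chosen partner, with the mixing weight drawn from a Beta(alpha, alpha) distribution. It is not the authors' implementation, and the abstract does not say how mixup is wired into the GAN. The function names (`mixup_pair`, `mixup_batch`), the default `alpha=0.2`, and the toy feature vectors are all assumptions made for illustration.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Blend two (feature, one-hot label) pairs with mixing weight lam."""
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return mixed_x, mixed_y


def mixup_batch(features, labels, alpha=0.2, seed=None):
    """Apply mixup to a batch (hypothetical helper, illustrative only).

    features: list of equal-length feature vectors
    labels:   list of one-hot label vectors
    alpha:    Beta distribution parameter (assumed value, not from the paper)
    """
    rng = random.Random(seed)
    lam = rng.betavariate(alpha, alpha)  # one mixing weight for the whole batch
    partners = list(range(len(features)))
    rng.shuffle(partners)                # random partner for each example
    mixed = [
        mixup_pair(features[i], labels[i], features[j], labels[j], lam)
        for i, j in enumerate(partners)
    ]
    mixed_features = [m[0] for m in mixed]
    mixed_labels = [m[1] for m in mixed]
    return mixed_features, mixed_labels, lam


# Toy usage: three 2-dimensional "feature vectors" with two emotion classes.
toy_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
toy_labels = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
mixed_features, mixed_labels, lam = mixup_batch(toy_features, toy_labels, seed=0)
```

Because the labels are blended with the same weight as the features, each mixed label remains a valid soft label whose entries sum to one.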