Audiovisual classification of vocal outbursts in human conversation using long-short-term memory networks


F. Eyben

S. Petridis

B. Schuller


Georgios Tzimiropoulos

Stefanos Zafeiriou

Maja Pantic


We investigate the classification of non-linguistic vocalisations with a novel audiovisual approach, using Long Short-Term Memory (LSTM) recurrent neural networks, which are highly successful dynamic sequence classifiers. The evaluation database is the Audiovisual Interest Corpus of natural human-to-human conversation, as used in this year's Paralinguistic Challenge. For video-based analysis, we compare shape-based and appearance-based features. These are combined with typical audio descriptors by early fusion. The results show significant improvements of LSTM networks over a static approach based on Support Vector Machines. More importantly, we show a significant gain in performance when fusing audio and visual shape features.


Eyben, F., Petridis, S., Schuller, B., Tzimiropoulos, G., Zafeiriou, S., & Pantic, M. (2011). Audiovisual classification of vocal outbursts in human conversation using long-short-term memory networks.

Conference Name: ICASSP 2011 - 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
End Date: May 27, 2011
Publication Date: Jan 1, 2011
Deposit Date: Jan 29, 2016
Publicly Available Date: Jan 29, 2016
Peer Reviewed: Peer Reviewed
Keywords: Audio Signal Processing, Audio-Visual Systems, Recurrent Neural Nets, Support Vector Machines, Video Signal Processing
Additional Information: Published in: 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing: proceedings: May 22-27, 2011, Prague Congress Center, Prague, Czech Republic. Piscataway, NJ: IEEE, c2011. 978-1-4577-0538-0. pp. 5844-5847. doi: 10.1109/ICASSP.2011.5947690 ©2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

