A probabilistic multimodal approach for predicting listener backchannels |
| |
Authors: | Louis-Philippe Morency, Iwan de Kok, Jonathan Gratch |
| |
Affiliation: | 1. Institute for Creative Technologies, University of Southern California, Marina del Rey, USA; 2. Human Media Interaction Group, University of Twente, Enschede, The Netherlands |
| |
Abstract: | During face-to-face interactions, listeners use backchannel feedback such as head nods as a signal to the speaker that the
communication is working and that they should continue speaking. Predicting these backchannel opportunities is an important
milestone for building engaging and natural virtual humans. In this paper we show how sequential probabilistic models (e.g.,
Hidden Markov Models or Conditional Random Fields) can automatically learn from a database of human-to-human interactions to
predict listener backchannels using the speaker's multimodal output features (e.g., prosody, spoken words and eye gaze). The
main challenges addressed in this paper are automatic selection of the relevant features and optimal feature representation
for probabilistic models. For prediction of visual backchannel cues (i.e., head nods), our prediction model shows a statistically
significant improvement over a previously published approach based on hand-crafted rules. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|