Detecting emotional state of a child in a conversational computer game
Authors: Serdar Yildirim, Shrikanth Narayanan, Alexandros Potamianos
Affiliation: 1. Computer Engineering Department, Mustafa Kemal University, Antakya 31040, Turkey; 2. Signal Analysis and Interpretation Laboratory (SAIL), Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089, USA; 3. Department of ECE, Technical University of Crete, Chania 73100, Greece
Abstract: The automatic recognition of a user’s communicative style within a spoken dialog system framework, including its affective aspects, has received increased attention in the past few years. For dialog systems, it is important to know not only what was said but also how it was communicated, so that the system can engage the user in a richer and more natural interaction. This paper addresses the problem of automatically detecting “frustration”, “politeness”, and “neutral” attitudes from a child’s speech communication cues, elicited in spontaneous dialog interactions with computer characters. Several information sources, such as acoustic, lexical, and contextual features, as well as their combinations, are used for this purpose. The study is based on a Wizard-of-Oz dialog corpus of 103 children, 7–14 years of age, playing a voice-activated computer game. Three-way classification experiments, as well as pairwise classification between polite vs. others and frustrated vs. others, were performed. Experimental results show that lexical information has more discriminative power than acoustic and contextual cues for the detection of politeness, whereas contextual and acoustic features perform best for frustration detection. Furthermore, the fusion of acoustic, lexical, and contextual information provided significantly better classification results. Results also showed that classification performance varies with age and gender. Specifically, for the “politeness” detection task, higher classification accuracy was achieved for females and 10–11-year-olds, compared to males and other age groups, respectively.
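The fusion of acoustic, lexical, and contextual information described in the abstract can be illustrated with a minimal decision-level fusion sketch. This is not the authors' code: the synthetic feature streams, the choice of logistic regression per stream, and the averaging of class posteriors are all assumptions made only for illustration.

```python
# Illustrative sketch of decision-level fusion for three-way attitude
# classification (neutral / polite / frustrated). All data here is
# synthetic; one classifier is trained per information stream and their
# class posteriors are averaged.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
labels = rng.integers(0, 3, n)  # 0 = neutral, 1 = polite, 2 = frustrated

# Synthetic stand-ins for per-utterance feature streams (hypothetical
# dimensionalities), shifted by label so the classes are separable.
acoustic = rng.normal(size=(n, 10)) + labels[:, None] * 0.5
lexical  = rng.normal(size=(n, 20)) + labels[:, None] * 0.8
context  = rng.normal(size=(n, 5))  + labels[:, None] * 0.3

streams = [acoustic, lexical, context]
train, test = slice(0, 200), slice(200, n)

# One classifier per information source.
models = [LogisticRegression(max_iter=1000).fit(s[train], labels[train])
          for s in streams]

# Decision-level fusion: average the class posteriors across streams,
# then pick the class with the highest fused probability.
fused = np.mean([m.predict_proba(s[test]) for m, s in zip(models, streams)],
                axis=0)
pred = fused.argmax(axis=1)
accuracy = (pred == labels[test]).mean()
print(f"fused accuracy: {accuracy:.2f}")
```

Averaging posteriors is only one fusion strategy; feature-level fusion (concatenating the streams before training a single classifier) is an equally common alternative.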
This article is indexed by ScienceDirect and other databases.