Level of interest sensing in spoken dialog using decision-level fusion of acoustic and lexical evidence期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

Level of interest sensing in spoken dialog using decision-level fusion of acoustic and lexical evidence

Affiliation:	1. College of Business, Florida State University, Tallahassee, FL 32306, USA;2. School of Business Administration, University of Miami, Coral Gables, FL 33124, USA

Abstract:	Automatic detection of a user's interest in spoken dialog plays an important role in many applications, such as tutoring systems and customer service systems. In this study, we propose a decision-level fusion approach using acoustic and lexical information to accurately sense a user's interest at the utterance level. Our system consists of three parts: acoustic/prosodic model, lexical model, and a model that combines their decisions for the final output. We use two different regression algorithms to complement each other for the acoustic model. For lexical information, in addition to the bag-of-words model, we propose new features including a level-of-interest value for each word, length information using the number of words, estimated speaking rate, silence in the utterance, and similarity with other utterances. We also investigate the effectiveness of using more automatic speech recognition (ASR) hypotheses (n-best lists) to extract lexical features. The outputs from the acoustic and lexical models are combined at the decision level. Our experiments show that combining acoustic evidence with lexical information improves level-of-interest detection performance, even when lexical features are extracted from ASR output with high word error rate.

Keywords:	Level of interest Decision-level fusion Human–machine interaction
本文献已被 ScienceDirect 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏