Term-Dependent Confidence Normalisation for Out-of-Vocabulary Spoken Term Detection |
| |
Authors: | Dong Wang, Javier Tejedor, Simon King, Joe Frankel |
| |
Affiliation: | (1) Centre for Speech Technology Research, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9LW, U.K.; (2) Human Computer Technology Laboratory (HCTLab), School of Computer Engineering and Telecommunication, University Autonomous of Madrid, Avenue Francisco Tomás y Valiente 11, 28049 Madrid, Spain; (3) Nuance Communications, 1 Wayside Road, Burlington, MA, 01803, U.S.A. |
| |
Abstract: | An important component of a spoken term detection (STD) system is the estimation of confidence measures for hypothesised detections.
A potential problem with the widely used lattice-based confidence estimation, however, is that confidence scores are treated
uniformly across all search terms, regardless of how much the terms differ in phonetic or linguistic properties. This
problem is particularly evident for out-of-vocabulary (OOV) terms which tend to exhibit high intra-term diversity. To address
the impact of term diversity on confidence measures, we propose in this work a term-dependent normalisation technique which
compensates for term diversity in confidence estimation. We first derive an evaluation-metric-oriented normalisation that
optimises the evaluation metric by compensating for the diverse occurrence rates among terms, and then propose a linear bias
compensation and a discriminative compensation to deal with the bias problem that is inherent in lattice-based confidence
measurement and from which the Term Specific Threshold (TST) approach suffers. We tested the proposed technique on speech
data from the multi-party meeting domain with two state-of-the-art STD systems based on phonemes and words respectively. The
experimental results demonstrate that the confidence normalisation approach leads to a significant performance improvement
in STD, particularly for OOV terms with phoneme-based systems. |
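The abstract does not give the normalisation formulas, so the following is only a minimal sketch of the general idea behind a term-specific threshold, not the authors' implementation. It assumes the evaluation metric is NIST's actual term-weighted value (ATWV) with the customary weight beta = 999.9, and it estimates the number of true occurrences of a term as the sum of its detection posteriors. The function names `term_specific_threshold` and `decide` and the example figures are hypothetical.

```python
def term_specific_threshold(posteriors, speech_seconds, beta=999.9):
    """Per-term decision threshold under an assumed ATWV-style metric.

    posteriors     -- lattice-based confidence scores of all detections of one term
    speech_seconds -- total duration of the searched audio, in seconds
    beta           -- assumed false-alarm weight of the metric (999.9 in NIST STD 2006)
    """
    if speech_seconds <= 0:
        raise ValueError("speech_seconds must be positive")
    # Expected number of true occurrences, estimated from the scores themselves.
    n_true = sum(posteriors)
    # A detection with confidence p is accepted when its expected gain
    # p / n_true outweighs its expected cost beta * (1 - p) / (T - n_true).
    return n_true / (speech_seconds / beta + (beta - 1.0) / beta * n_true)


def decide(posteriors, speech_seconds, beta=999.9):
    """Return one accept/reject decision per detection of a single term."""
    threshold = term_specific_threshold(posteriors, speech_seconds, beta)
    return [p >= threshold for p in posteriors]


# Hypothetical example: ten hours of audio and two terms.
frequent_term = [0.9, 0.6, 0.5]
rare_term = [0.5]
thr_frequent = term_specific_threshold(frequent_term, 36000.0)
thr_rare = term_specific_threshold(rare_term, 36000.0)
decisions = decide([0.9, 0.02], 36000.0)
```

Under these assumptions, a term with fewer expected occurrences receives a lower threshold, which is how the differing occurrence rates among terms are compensated. Because the estimate of true occurrences is built from the lattice posteriors, any bias in those posteriors carries over into the threshold, which is the problem the bias compensation steps described in the abstract are meant to address.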
| |
Keywords: | confidence estimation; discriminative model; spoken term detection; speech recognition |
This article is indexed in CNKI, SpringerLink, and other databases. |
|