Acoustic quality normalization for robust automatic speech recognition |
| |
Authors: | Ghulam Muhammad |
| |
Affiliation: | (1) Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh, 11543, Saudi Arabia
| |
Abstract: | Automatic speech recognition (ASR) systems suffer from variations in the acoustic quality of input speech. Speech may be
produced in noisy environments, and each speaker has his or her own speaking style. Variations can be observed even
within the same utterance, or from the same speaker in different moods. All these uncertainties and variations should be normalized
to obtain a robust ASR system. In this paper, we apply and evaluate different approaches to acoustic quality normalization in
an utterance for robust ASR. Several HMM (hidden Markov model)-based systems using utterance-level, word-level, and monophone-level
normalization are compared with an HMM-SM (subspace method)-based system using monophone-level normalization to normalize
variations and uncertainties in an utterance. The SM can represent variations of fine structures in sub-words as a set of eigenvectors,
and therefore performs better at the monophone level than the HMM. Experimental results show that word accuracy is significantly improved
by the HMM-SM-based system with monophone-level normalization compared to that of the typical HMM-based system with utterance-level
normalization, in both clean and noisy conditions. The results also suggest that monophone-level normalization using
the SM outperforms that using the HMM. |
| |
Keywords: | Acoustic quality; Normalization; HMM; Subspace method; Automatic speech recognition |
This document is indexed in SpringerLink and other databases.
|