首页 | 本学科首页   官方微博 | 高级检索  
     


Application of hidden Markov models to automatic speech endpoint detection
Authors:J G Wilpon  L R Rabiner
Abstract:Accurate detection of the boundaries of a speech utterance during a recording interval has been shown to be crucial for reliable and robust automatic speech recognition. The endpoint detection problem is fairly straightforward for high-level speech signals spoken in low-level stationary noise environments (e.g. signal-to-noise ratios greater than 30 dB). However, these ideal conditions do not always exist. One example, where reliable word detection is difficult, is speech spoken in a mobile environment. Because of road, tire, fan noises, etc. detection of speech often becomes problematic.Currently, most endpoint detection algorithms use only signal energy and duration information to perform the endpoint detection task. These algorithms perform quite well with reasonable signal-to-noise ratios. However, under the harshest of conditions (e.g. in a car travelling at 60 mph with the fan on high) these algorithms begin to fail.In this paper, an endpoint detection algorithm is presented which is based on hidden Markov model (HMM) technology. The algorithm explicitly determines a set of speech endpoints based on the output of a Viterbi decoding algorithm. This algorithm was tested using a template-based speech recognition system and also using an HMM based system.Based on a speaker dependent speech database from four talkers, recorded in a mobile environment under five different driving conditions (including traveling at 60 mph with the fan on), we tested several endpoint detection schemes. The results showed that, under some conditions, the HMM-based approach to endpoint detection performed significantly better than the energy-based system. The overall accuracy of the system using the HMM endpoint detector, when trained with clean inputs and when tested on the 11 word digits vocabulary (zero through nine and oh) with speech recorded in various mobile environments, was 99.7%. The equivalent accuracy of the energy based endpoint detector was 95.2% in a template based recognizer.
Keywords:
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号