Weighted Finite State Transducer–Based Endpoint Detection Using Probabilistic Decision Logic |
| |
Authors: | Hoon Chung, Sung Joo Lee, Yun Keun Lee |
| |
Affiliation: | Hoon Chung (corresponding author, hchung@etri.re.kr), Sung Joo Lee (lee1862@etri.re.kr), and Yun Keun Lee (yklee@etri.re.kr) are with the SW·Content Research Laboratory, ETRI, Daejeon, Rep. of Korea. |
| |
Abstract: | In this paper, we propose the use of data‐driven probabilistic utterance‐level decision logic to improve Weighted Finite State Transducer (WFST)‐based endpoint detection. Endpoint detection is generally handled by two cascaded decision processes. The first is frame‐level speech/non‐speech classification based on statistical hypothesis testing, and the second is an utterance‐level speech boundary decision based on heuristic knowledge. To handle these two processes within a unified framework, we propose a WFST‐based approach. However, the WFST‐based approach shares two limitations with conventional approaches: the utterance‐level decision relies on heuristic knowledge, and the decision parameters are tuned sequentially. Therefore, to obtain the decision knowledge from a speech corpus and optimize the parameters jointly, we propose the use of data‐driven probabilistic utterance‐level decision logic. The proposed method reduces the average detection failure rate by about 14% on various noisy‐speech corpora collected for endpoint detection evaluation. |
| |
Keywords: | Endpoint detection, speech recognition, Weighted Finite State Transducer |
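As an illustration of the two cascaded decision processes described in the abstract, the following is a minimal sketch of a conventional endpoint detector, not the WFST-based or probabilistic method proposed in the paper. It assumes that each frame already has a scalar speech score, and a simple threshold stands in for the statistical hypothesis test. The function names and the `threshold`, `min_speech`, and `hangover` parameters are hypothetical, chosen only to show the kind of heuristic, separately tuned parameters the abstract refers to.

```python
from typing import List, Optional, Tuple


def classify_frames(scores: List[float], threshold: float) -> List[bool]:
    """Frame-level decision: True marks a frame as speech.

    Assumption: a plain threshold on a per-frame score stands in for
    the statistical hypothesis test mentioned in the abstract.
    """
    return [score > threshold for score in scores]


def detect_endpoints(
    labels: List[bool], min_speech: int = 3, hangover: int = 4
) -> Optional[Tuple[int, int]]:
    """Utterance-level decision driven by heuristic counters.

    Speech starts after `min_speech` consecutive speech frames and ends
    after `hangover` consecutive non-speech frames. Returns (start, end)
    frame indices with `end` exclusive, or None if no speech is found.
    """
    start = None
    run = 0      # consecutive speech frames seen before the start is fixed
    silence = 0  # consecutive non-speech frames seen after the start
    for i, is_speech in enumerate(labels):
        if start is None:
            run = run + 1 if is_speech else 0
            if run >= min_speech:
                start = i - min_speech + 1
        else:
            silence = 0 if is_speech else silence + 1
            if silence >= hangover:
                return start, i - hangover + 1
    if start is None:
        return None
    # Input ended before the hangover expired: trim the trailing silence.
    return start, len(labels) - silence


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.9, 0.8, 0.1, 0.9, 0.9, 0.9,
              0.9, 0.2, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
    labels = classify_frames(scores, threshold=0.5)
    print(detect_endpoints(labels))
```

In this sketch, the frame-level threshold and the utterance-level counters are set independently of each other, which mirrors the sequential tuning that the abstract identifies as a limitation.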