Stream fusion for multi-stream automatic speech recognition
Authors: Hesam Sagha, Feipeng Li, Ehsan Variani, José del R Millán, Ricardo Chavarriaga, Björn Schuller
Affiliation: 1. Chair of Complex & Intelligent Systems, University of Passau, Passau, Germany; 2. Center of Language and Speech Processing, Johns Hopkins University, Baltimore, USA; 3. Apple Inc, San Francisco Bay Area, USA; 4. Google, San Francisco Bay Area, USA; 5. Defitech Chair in Brain-Machine Interface, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; 6. Department of Computing, Imperial College, London, UK
Abstract: Multi-stream automatic speech recognition (MS-ASR) has been shown to improve recognition performance in noisy conditions. In such a system, the generation and the fusion of the streams are the essential parts, and they need to be designed to reduce the effect of noise on the final decision. This paper shows how to improve the performance of MS-ASR by addressing two questions: (1) how many streams should be combined, and (2) how should they be combined. First, we propose a novel approach based on stream reliability to select the number of streams to be fused. Second, we introduce a fusion method based on Parallel Hidden Markov Models. Applying the method to two datasets (TIMIT and RATS) with different noises, we show improved MS-ASR performance.
This article is indexed in SpringerLink and other databases.