k-TSS language models in speech recognition systems
Affiliation: 1. University of Lyon, INSA Lyon, DEEP Laboratory (Déchets Eaux Environnement Pollutions), EA 7429, F-69621 Villeurbanne Cedex, France; 2. Research Group Cienca e Ingenieria del Agua y el Ambiente, Faculty of Engineering, Pontificia Universidad Javeriana, Carrera 7 No. 40-62, Bogota, Colombia; 3. Graz University of Technology, Institute of Urban Water Management and Landscape Water Engineering, Stremayrgasse 10/I, A-8010 Graz, Austria; 4. Kompetenzzentrum Wasser Berlin gGmbH, Cicerostrasse 24, D-10709 Berlin, Germany; 1. Vanderbilt University, Owen Graduate School of Management, Nashville, TN 37203, USA; 2. University of California, Haas School of Business, Berkeley, CA 94720-1900, USA
Abstract: This work shows that stochastic regular grammars can generate accurate language models that can be well integrated, allocated and handled in a continuous speech recognition system. For this purpose, a syntactic version of the well-known n-gram model, called k-testable language in the strict sense (k-TSS), is used. The paper provides the complete definition of a k-TSS stochastic finite state automaton. One difficulty in representing a language model as a stochastic finite state network is that the recursive schema involved in the smoothing procedure must be adapted to the finite state formalism so that the backing-off mechanism can be implemented efficiently. Applying the syntactic back-off smoothing technique to k-TSS language modelling yields a self-contained smoothed model that integrates several k-TSS automata into a single model, which is also fully defined in the paper. The proposed formulation leads to a very compact representation of the model parameters learned at training time, namely the probability distribution and the model structure. Expanding the structure dynamically at decoding time allows efficient integration into a continuous speech recognition system using a one-step decoding procedure. The proposed formulation was evaluated experimentally on two Spanish corpora. These experiments showed that regular grammars generate accurate language models (k-TSS) that can be efficiently represented and managed in real speech recognition systems, even for high values of k, leading to very good system performance.
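To make the idea concrete, the following is a minimal Python sketch of a k-TSS-style model seen as an automaton whose states are the last k-1 words, with a recursive back-off to shorter histories. It is an illustration under stated assumptions, not the formulation defined in the paper: the class name `KTSSBackoffModel`, the boundary symbols `<s>` and `</s>`, the absolute-discount back-off scheme, the add-one smoothing at the unigram level, and the closed-vocabulary restriction are all choices made here for the example.

```python
import math
from collections import defaultdict

BOS, EOS = "<s>", "</s>"  # hypothetical sentence-boundary symbols


class KTSSBackoffModel:
    """Illustrative k-TSS-style model: a state is the last k-1 words."""

    def __init__(self, k, discount=0.5):
        self.k = k
        self.d = discount  # assumed absolute discount, 0 < d < 1
        # counts[history][word] for every history length 0..k-1
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = {EOS}

    def train(self, sentences):
        for sent in sentences:
            words = list(sent) + [EOS]
            self.vocab.update(words)
            padded = [BOS] * (self.k - 1) + words
            for i in range(self.k - 1, len(padded)):
                for n in range(self.k):
                    self.counts[tuple(padded[i - n:i])][padded[i]] += 1

    def initial_state(self):
        return (BOS,) * (self.k - 1)

    def next_state(self, state, word):
        # States are expanded on demand instead of being stored as a graph.
        if self.k == 1:
            return ()
        return (tuple(state) + (word,))[-(self.k - 1):]

    def prob(self, word, hist):
        hist = tuple(hist)
        if word not in self.vocab:
            return 0.0  # closed vocabulary assumed in this sketch
        if not hist:
            # Unigram level: add-one smoothing over the vocabulary.
            seen = self.counts.get((), {})
            total = sum(seen.values())
            return (seen.get(word, 0) + 1) / (total + len(self.vocab))
        seen = self.counts.get(hist)
        if not seen:
            return self.prob(word, hist[1:])  # unseen state: back off
        total = sum(seen.values())
        # No mass needs to be reserved if every word was seen after hist.
        d = self.d if len(seen) < len(self.vocab) else 0.0
        if word in seen:
            return (seen[word] - d) / total
        # Give the discounted mass to unseen words, in proportion to the
        # lower-order distribution renormalised over those unseen words.
        backoff_mass = d * len(seen) / total
        lower_unseen = 1.0 - sum(self.prob(v, hist[1:]) for v in seen)
        return backoff_mass * self.prob(word, hist[1:]) / lower_unseen

    def log_score(self, sentence):
        # Walk the automaton word by word, as a decoder would.
        state, logp = self.initial_state(), 0.0
        for w in list(sentence) + [EOS]:
            logp += math.log(self.prob(w, state))
            state = self.next_state(state, w)
        return logp
```

For example, after `model = KTSSBackoffModel(k=3)` and `model.train([["a", "b", "c"], ["a", "b", "d"]])`, calling `model.prob("c", ("a", "b"))` uses the trigram counts directly, while `model.prob("a", ("a", "b"))` falls back through the bigram and unigram levels. Only the counts are stored; the successor state is computed by `next_state` when needed, which loosely mirrors the idea of expanding the structure at decoding time.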
This article is indexed in ScienceDirect and other databases.