Speech understanding and speech translation by maximum a-posteriori semantic decoding
Affiliation:1. Siemens AG, Department of ICN TR S R2, Hofmannstrasse 51, D-81359 Munich, Germany;2. Rohde & Schwarz GmbH & Co. KG, Muehldorfstrasse 15, D-81671 Munich, Germany
Abstract:This paper describes a domain-limited system for speech understanding as well as for speech translation. An integrated semantic decoder converts the preprocessed speech signal directly into its semantic representation by a maximum a-posteriori classification. By combining probabilistic knowledge at the acoustic, phonetic, syntactic, and semantic levels, the semantic decoder extracts the most probable meaning of the utterance. No separate speech recognition stage is needed, because the Viterbi algorithm (calculating acoustic probabilities with hidden Markov models) is integrated with a probabilistic chart parser (calculating semantic and syntactic probabilities with dedicated models). The semantic structure is introduced as a representation of an utterance's meaning. It can serve as an intermediate level for a succeeding intention decoder (within a speech understanding system that controls a running application by spoken input) as well as an interlingua level for a succeeding language production unit (within an automatic speech translation system that creates spoken output in another language). Following the above principles and using the respective algorithms, speech understanding and speech translation front-ends for the domains ‘graphic editor’, ‘service robot’, ‘medical image visualisation’ and ‘scheduling dialogues’ were successfully realised.
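As a rough illustration of the maximum a-posteriori decision rule outlined in the abstract, the Python sketch below scores each candidate semantic structure by adding an acoustic log-likelihood to a semantic/syntactic log-probability and keeps the best-scoring one. All function names, toy scores, and the candidate list are hypothetical assumptions made for illustration only; in the described system the acoustic term comes from the Viterbi algorithm over hidden Markov models and the semantic/syntactic term from a probabilistic chart parser.

import math

# Hypothetical sketch of the MAP rule: choose the semantic structure S that
# maximises P(S | X) ∝ P(X | W) * P(W, S), where X is the preprocessed speech
# signal and W a word sequence. The scoring functions below are stand-ins,
# not the authors' implementation.

def acoustic_log_likelihood(signal, word_sequence):
    # Stand-in for the Viterbi pass over HMMs: log P(X | W).
    return -0.5 * abs(len(signal) - 10 * len(word_sequence))

def semantic_syntactic_log_prob(word_sequence, semantic_structure):
    # Stand-in for the probabilistic chart parser: log P(W, S).
    return -abs(len(word_sequence) - len(semantic_structure))

def map_semantic_decode(signal, candidates):
    # Return the candidate semantic structure with the highest combined score.
    best_structure, best_score = None, -math.inf
    for word_sequence, semantic_structure in candidates:
        score = (acoustic_log_likelihood(signal, word_sequence)
                 + semantic_syntactic_log_prob(word_sequence, semantic_structure))
        if score > best_score:
            best_structure, best_score = semantic_structure, score
    return best_structure, best_score

# Toy usage: two candidate interpretations of a fake 30-frame signal.
signal = list(range(30))
candidates = [
    (["move", "the", "circle"], {"action": "move", "object": "circle", "dest": None}),
    (["delete", "circle"], {"action": "delete", "object": "circle"}),
]
print(map_semantic_decode(signal, candidates))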