A pilot study on augmented speech communication based on Electro-Magnetic Articulography |
| |
Authors: | Panikos Heracleous, Pierre Badin, Gérard Bailly, Norihiro Hagita |
| |
Affiliation: | a ATR, Intelligent Robotics and Communication Laboratories, 2-2-2 Hikaridai Seika-cho, Soraku-gun, Kyoto-fu 619-0288, Japan; b GIPSA-lab, Speech and Cognition Department, UMR 5216, CNRS-Grenoble University, 961 rue de la Houille Blanche, Domaine universitaire, F-38402 Saint Martin d’Hères cedex, France |
| |
Abstract: | Speech is the most natural form of communication for human beings. However, in situations where audio speech is not available because of disability or adverse environmental conditions, people may resort to alternative methods such as augmented speech, that is, audio speech supplemented or replaced by other modalities, such as audiovisual speech or Cued Speech. This article introduces augmented speech communication based on Electro-Magnetic Articulography (EMA). Movements of the tongue, lips, and jaw are tracked by EMA and used as features to create hidden Markov models (HMMs). In addition, automatic phoneme recognition experiments are conducted to examine the possibility of recognizing speech from articulation alone, that is, without any audio information. The results are promising and confirm that phonetic features characterizing articulation are as discriminating as those characterizing acoustics (except for voicing). This article also describes experiments conducted in noisy environments using fused audio and EMA parameters. When EMA parameters are fused with noisy audio speech, the recognition rate increases significantly compared with using noisy audio speech alone. |
| |
Keywords: | Augmented speech; Electro-Magnetic Articulography (EMA); Automatic speech recognition; Hidden Markov models (HMMs); Fusion; Noise robustness |
This article is indexed in ScienceDirect and other databases. |
|