Extracting Phonetic Knowledge from Learning Systems: Perceptrons, Support Vector Machines and Linear Discriminants
Authors: Robert I. Damper, Steve R. Gunn, Mathew O. Gore
Affiliation: Image, Speech and Intelligent Systems (ISIS) Research Group, Department of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK
Abstract: Speech perception relies on the human ability to decode continuous, analogue sound pressure waves into discrete, symbolic labels (‘phonemes’) with linguistic meaning. Aspects of this signal-to-symbol transformation have been intensively studied over many decades using psychophysical procedures. The perception of (synthetic) syllable-initial stop consonants has been especially well studied, since these sounds display a marked categorization effect: they are typically dichotomised into ‘voiced’ and ‘unvoiced’ classes according to their voice onset time (VOT). In this case, the category boundary is found to have a systematic relation to the (simulated) place of articulation, but there is no currently accepted explanation of this phenomenon. Categorization effects have now been demonstrated in a variety of animal species as well as humans, indicating that their origins lie in general auditory and/or learning mechanisms, rather than in some ‘phonetic module’ specialized to human speech processing.

In recent work, we have demonstrated that appropriately trained computational learning systems (‘neural networks’) also display the same systematic behaviour as human and animal listeners. Networks are trained on simulated patterns of auditory-nerve firings in response to synthetic ‘continua’ of stop-consonant/vowel syllables varying in place of articulation and VOT. Unlike real listeners, such a software model is amenable to analysis aimed at extracting the phonetic knowledge acquired in training, so providing a putative explanation of the categorization phenomenon. Here, we study three learning systems: single-layer perceptrons, support vector machines and Fisher linear discriminants. We highlight similarities and differences between these approaches. We find that support vector machines, a modern inductive-inference technique designed for small sample sizes, give the most convincing results. Knowledge extracted from the trained machine indicates that the phonetic percept of voicing is easily and directly recoverable from auditory (but not acoustic) representations.
Keywords: speech perception; auditory processing; perceptrons; support vector machines; linear discriminant analysis
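The sketch below is not the authors' code; it is a minimal illustration of the kind of comparison the abstract describes, using scikit-learn stand-ins for the three learners (a single-layer perceptron, a linear-kernel support vector machine and Fisher's linear discriminant). The data are a hypothetical one-dimensional VOT feature with an arbitrary 25 ms voiced/unvoiced split, not the simulated auditory-nerve firing patterns used in the paper; the point is only to show how a category boundary can be read back out of each trained linear model.

# Toy comparison of the three linear learners named in the abstract.
# Assumption: a single VOT feature (ms) replaces the auditory-nerve input,
# and the 25 ms split used to label the toy data is arbitrary.
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Hypothetical stimuli: VOT values in ms, labelled 0 ('voiced') below 25 ms
# and 1 ('unvoiced') above it.
vot = rng.uniform(0.0, 60.0, size=200)
labels = (vot > 25.0).astype(int)
X = vot.reshape(-1, 1)

models = {
    "perceptron": Perceptron(max_iter=1000, tol=1e-3, random_state=0),
    "linear SVM": SVC(kernel="linear", C=1.0),
    "Fisher LDA": LinearDiscriminantAnalysis(),
}

for name, model in models.items():
    model.fit(X, labels)
    # Each model is linear, w*x + b = 0, so the 'extracted knowledge' is
    # simply the boundary location along the VOT axis, x = -b / w.
    w = model.coef_.ravel()[0]
    b = model.intercept_.ravel()[0]
    print(f"{name}: category boundary at approx. {-b / w:.1f} ms VOT")

Each classifier places its voiced/unvoiced boundary near the 25 ms split built into the toy labels; in the paper, the corresponding step is recovering the VOT boundary, and its dependence on place of articulation, from machines trained on auditory representations.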
This article is indexed by SpringerLink and other databases.