Similar literature
 20 similar documents found (search time: 15 ms)
1.
Speaker normalization for Chinese vowel recognition in cochlear implants   Total citations: 1, self-citations: 0, citations by others: 1
Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker that produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this "optimal" output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third formant frequency values between the specific speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.
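The frequency-range adjustment described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the logarithmic band spacing, the 200 to 7000 Hz reference range, the example F3 values, the function names, and the direction of scaling (band edges multiplied by the speaker-to-reference F3 ratio) are all hypothetical.

```python
def band_edges(f_low, f_high, n_channels):
    """Return n_channels + 1 logarithmically spaced band edges in Hz (assumed spacing)."""
    ratio = (f_high / f_low) ** (1.0 / n_channels)
    return [f_low * ratio ** i for i in range(n_channels + 1)]


def normalized_range(ref_low, ref_high, f3_speaker, f3_reference):
    """Scale the reference analysis range by the ratio of mean F3 values."""
    scale = f3_speaker / f3_reference
    return ref_low * scale, ref_high * scale


# Hypothetical reference range and mean F3 values, chosen only for illustration.
REF_LOW, REF_HIGH = 200.0, 7000.0
low, high = normalized_range(REF_LOW, REF_HIGH, f3_speaker=3000.0, f3_reference=2500.0)
edges = band_edges(low, high, n_channels=4)  # 4 channels, as in the simulation
```

With these assumed values the F3 ratio is 1.2, so the whole analysis range is shifted upward by 20% before it is split into the four channels.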

2.
According to speech perception-rate data, continuous interleaved sampling (CIS) and spectral maxima sound processor (SMSP) techniques are, and probably will remain, the best speech processing strategies for multichannel electrode cochlear implant devices. From packaging and power-consumption viewpoints, today's speech processing systems are very large; they are therefore worn on the body and consume considerable electric power. Next-generation cochlear implant devices would be more compact, low-power products worn behind or in the ear. Mixed-signal, high-density, and low-power design techniques are clearly required to achieve the compactness and low power consumption needed to realize intelligent speech sensation for implantees. Power-supply lifetime and efficiency, an especially critical design consideration, might be improved by using promising new technologies such as microelectromechanical systems (MEMS).

3.
An advanced multiple channel cochlear implant   Total citations: 4, self-citations: 0, citations by others: 4
An advanced multiple channel cochlear implant hearing prosthesis is described. Stimulation is presented through an array of 20 electrodes located in the scala tympani. Any two electrodes can be configured as a bipolar pair to conduct a symmetrical, biphasic, constant-current pulsatile stimulus. Up to three stimuli can be presented in rapid succession or effectively simultaneously. For simultaneous stimulation, a novel time-division current multiplexing technique has been developed to obviate electrode interactions that may compromise safety. The stimuli are independently controllable in current amplitude, duration, and onset time. Groups of three stimuli can be generated at a rate of typically 500 Hz. Stimulus control data and power are conveyed to the implant through a single transcutaneous inductive link. The device also incorporates a telemetry system that enables electrode voltage waveforms to be monitored externally in real time. The electronics of the implant are contained almost entirely on a custom designed integrated circuit. Preliminary results obtained with the first patient to receive the advanced implant are included.

4.
Minimizing morphological variances of the vocal tract across speakers is a challenge for articulatory analysis and modeling. In order to reduce morphological differences in speech organs among speakers and retain speakers’ speech dynamics, our study proposes a method of normalizing the vocal-tract shapes of Mandarin and Japanese speakers by using a Thin-Plate Spline (TPS) method. We apply the properties of TPS in a two-dimensional space in order to normalize vocal-tract shapes. Furthermore, we also use DNN (Deep Neural Network) based speech recognition for our evaluations. We obtained our template for normalization by measuring three speakers’ palates and tongue shapes. Our results show a reduction in variances among subjects. The similar vowel structure of pre/post-normalization data indicates that our framework retains speaker-specific characteristics. Our results for the articulatory recognition of isolated phonemes show an improvement of 25%. Moreover, our phone error rate for continuous speech was reduced by 5.84%.

5.
Several alternative linear prediction parametric representations are experimentally compared as to their vowel recognition performance. The speech data used for this purpose consist of 900 utterances of 10 different vowels spoken by 3 speakers in a /b/-vowel-/b/ context. The cepstral coefficient representation is found to be the best linear prediction parametric representation.

6.
This paper deals with the model-based design of an autonomous biomechatronic device for sensing and analog processing of acoustic signals. The aim is to develop a biomechatronic artificial cochlear implant for people with hearing loss due to damage or disease of their cochlea. The artificial electronic cochlear implant is based on an array of microelectromechanical piezoelectric membranes. Oscillations of the membranes detect and filter acoustic signals at individual acoustic frequencies. The proposed biomechatronic device consists of an active filter array, signal processing electronics, nerve-stimulation electrodes, and an energy harvesting system for autonomous powering of the device. This solution differs from current cochlear implant solutions, which are bulky electronic systems limited by their high power consumption. The multidisciplinary models of the artificial cochlear implant concept are presented. The model-based mechatronic approach seems to be very useful for the development of a fully implantable cochlear implant designed for sensing and processing acoustic signals without an external energy source.

7.
One of the fundamental facets of the cochlear implant that must be understood to predict accurately the effect of an electrical stimulus on the auditory nerve is the nerve-electrode interface. One aspect of this interface is the degree to which current delivered by an electrode spreads to neurons distant from it. This paper reports a direct mapping of this current spread using recordings from single units from the cat auditory nerve. Large variations were seen in the degree to which the different units are selective in responding to electrodes at different positions within the scala tympani. Three types of units could be identified based on the selectiveness of their response to the different electrodes in a linear array. The first type of unit exhibited a gradual increase in threshold as the stimulating site was moved from more apical to more basal locations within the scala tympani. The second type of unit exhibited a sharp local minimum, with rapid increases in threshold in excess of 6 dB/mm in the vicinity of the minimum. At electrode sites distant from the local minimum, the rate of change of the threshold approached that of the first type of unit. The final type of unit also demonstrated a gradual change in threshold with changing electrode position; however, two local minima, one apical and one basal, could be identified. These three types are hypothesized to correspond to units which originate apical to the electrode array, along the electrode array, and basal to the electrode array.

8.
Three-dimensional (3-D) localization of individual cochlear implant electrodes within the inner ear is of importance for modeling the electrical field of the cochlea, designing the electrode array, and programming the associated speech processor. A 3-D reconstruction method for cochlear implant electrodes is proposed to localize individual electrodes from two X-ray views in combination with the spiral computed tomography technique. By adapting epipolar geometry to the configuration of an X-ray imaging system, we estimate individual electrode locations in the least-squares sense without using a patient attachment required by an existing stereophotogrammetry technique. Furthermore, our method does not require any knowledge of the intrinsic and extrinsic parameters of the imaging system. The performance of our method is studied in numerical simulation and with patient data and is found to be sufficiently accurate for clinical use. The maximum root-mean-square errors measured are 0.0445 and 0.214 mm for numerical simulation and patient data, respectively.

9.
Encoding frequency modulation to improve cochlear implant performance in noise   Total citations: 10, self-citations: 0, citations by others: 10
In contrast to traditional Fourier analysis, a signal can be decomposed into amplitude and frequency modulation components. The speech processing strategy in most modern cochlear implants only extracts and encodes amplitude modulation in a limited number of frequency bands. While amplitude modulation encoding has allowed cochlear implant users to achieve good speech recognition in quiet, their performance in noise is severely compromised. Here, we propose a novel speech processing strategy that encodes both amplitude and frequency modulations in order to improve cochlear implant performance in noise. By removing the center frequency from the subband signals and additionally limiting the frequency modulation's range and rate, the present strategy transforms the fast-varying temporal fine structure into a slowly varying frequency modulation signal. As a first step, we evaluated the potential contribution of additional frequency modulation to speech recognition in noise via acoustic simulations of the cochlear implant. We found that while amplitude modulation from a limited number of spectral bands is sufficient to support speech recognition in quiet, frequency modulation is needed to support speech recognition in noise. In particular, improvement by as much as 71 percentage points was observed for sentence recognition in the presence of a competing voice. The present result strongly suggests that frequency modulation be extracted and encoded to improve cochlear implant performance in realistic listening situations. We have proposed several implementation methods to stimulate further investigation. Index Terms: amplitude modulation, cochlear implant, fine structure, frequency modulation, signal processing, speech recognition, temporal envelope.
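The amplitude/frequency modulation split mentioned in this abstract can be illustrated with a small sketch. It is not the strategy proposed in the paper: it assumes a single subband, obtains the analytic signal with a naive DFT, and only removes an assumed center frequency from the instantaneous frequency; the range and rate limiting described in the abstract is left out. The function names and test tone are hypothetical.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a naive O(N^2) DFT (adequate for short frames)."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue  # keep DC (and Nyquist for even n) unchanged
        # Double positive frequencies, zero negative ones.
        spectrum[k] = 2 * spectrum[k] if k < (n + 1) // 2 else 0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def am_fm(x, fs, center_hz):
    """Return (AM envelope, FM deviation from center_hz in Hz) for one subband."""
    z = analytic_signal(x)
    am = [abs(v) for v in z]
    fm = []
    for prev, cur in zip(z, z[1:]):
        dphi = cmath.phase(cur * prev.conjugate())  # phase step between samples
        fm.append(dphi * fs / (2 * math.pi) - center_hz)
    return am, fm


# Hypothetical test tone: 200 Hz cosine in a band assumed to be centered at 150 Hz.
FS, N = 1600, 64
tone = [math.cos(2 * math.pi * 200 * t / FS) for t in range(N)]
am, fm = am_fm(tone, FS, center_hz=150.0)
```

For this steady tone the envelope is constant and the frequency deviation is a constant offset from the assumed center; a speech subband would instead yield a time-varying pair of signals.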

10.
11.
The performance of cochlear implants deteriorates in noisy environments compared to quiet conditions. This paper presents an adaptive cochlear implant system, which is capable of classifying the background noise environment in real time for the purpose of adjusting or tuning its noise suppression algorithm to that environment. The tuning is done automatically with no user intervention. Five objective quality measures are used to show the superiority of this adaptive system compared to a conventional fixed noise-suppression system. Steps taken to achieve the real-time implementation of the entire system, incorporating both the cochlear implant speech processing and the background noise suppression, on a portable PDA research platform are presented along with the timing results.

12.
Cochlear implant (CI) recipients report severe degradation of speech understanding under noisy conditions. Most CI recipients typically require a signal-to-noise ratio about 10-25 dB higher than normal hearing (NH) listeners in order to achieve similar speech understanding performance. In recent years, significant emphasis has been put on binaural algorithms, which not only make use of the head shadow effect, but also have two or more microphone signals at their disposal to generate binaural inputs. Most CI recipients today are unilaterally implanted, but they can still benefit from binaural processing that uses a contralateral microphone. The phase error filtering (PEF) algorithm tries to minimize the phase error variance by using a time-frequency mask for noise reduction. The potential improvement in speech intelligibility offered by the algorithm is evaluated with four different kinds of mask functions. The study reveals that the PEF algorithm, which uses a contralateral microphone but unilateral presentation, provides considerable improvement in intelligibility for both NH and CI subjects. Further, a preference rating test suggests that CI subjects can tolerate higher levels of distortion than NH subjects, and therefore more aggressive noise reduction is possible for CI recipients.

13.
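To address problems encountered when implementing the continuous interleaved sampling (CIS) algorithm for cochlear implants, such as the high stimulation rate required and the need for flexible parameter adjustment, an FFT-based implementation of the CIS algorithm is proposed, and a hardware system for an external cochlear implant processor based on the TMS320VC5502 is designed. The system was verified with single-frequency signals and real speech. The results show that the system implements the CIS algorithm efficiently and reaches a stimulation rate of more than 8,000 pulses, which meets the requirements of cochlear implants. The designed system therefore offers high flexibility, real-time operation, and low power consumption, and provides useful guidance for cochlear implant research and engineering implementation.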
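A rough sketch of what an FFT-based CIS front end might look like is given below. The abstract does not describe the actual algorithm, so everything here is an assumption made for illustration: the naive DFT standing in for an FFT, the grouping of bins into four channels, the square-root-of-power envelope, and all function names. It is written in Python for readability and says nothing about the TMS320VC5502 implementation.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the first N/2 DFT bins (naive DFT; a real system would use an FFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]


def channel_envelopes(frame, bin_groups):
    """Estimate one envelope value per channel from the summed power of its bins."""
    mags = dft_magnitudes(frame)
    return [math.sqrt(sum(mags[k] ** 2 for k in group)) for group in bin_groups]


def interleaved_pulses(envelopes):
    """CIS ordering: channels are stimulated one after another, never simultaneously."""
    return [(channel, amplitude) for channel, amplitude in enumerate(envelopes)]


# Hypothetical 32-sample frame holding a single-frequency test signal at bin 5.
N = 32
frame = [math.cos(2 * math.pi * 5 * t / N) for t in range(N)]
groups = [range(1, 4), range(4, 8), range(8, 12), range(12, 16)]  # assumed channel map
envelopes = channel_envelopes(frame, groups)
pulses = interleaved_pulses(envelopes)
```

In this toy case only the second channel, whose bin group contains the test frequency, receives a non-negligible pulse amplitude, which mirrors the single-frequency verification mentioned in the abstract.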

14.
Highly invasive surgical procedures, such as the implantation of a prosthetic device, require correct force delivery to achieve desirable outcomes and minimize trauma induced during the operation. Improvement in surgeon technique can reduce the chances of excessive force application and lead to optimal placement of the electrode array. The fundamental factors that affect the degree of success for cochlear implant recipients are identified through empirical methods. Insertion studies are performed to assess force administration and electrode trajectories during implantations of the Nucleus 24 Contour and Nucleus 24 Contour Advance electrodes into a synthetic model of the human scala tympani, using associated methods. Results confirm that the Advance Off-Stylet insertion of the soft-tipped Contour Advance electrode gives an overall reduction in insertion force. Analysis of force delivery and electrode positioning during cochlear implantation can help identify and control key factors for improving the insertion method. Based on the findings, suggestions are made to enhance surgeon technique.

15.
In practical applications of importance sampling (IS) simulation, two basic problems are encountered: determining the estimation variance and evaluating the proper IS parameters needed in the simulations. The authors derive new upper and lower bounds on the estimation variance which are applicable to IS techniques. The upper bound is simple to evaluate and may be minimized by the proper selection of the IS parameter. Thus, lower and upper bounds on the improvement ratio of various IS techniques relative to the direct Monte Carlo simulation are also available. These bounds are shown to be useful and computationally simple to obtain. Based on the proposed technique, one can readily find practical suboptimum IS parameters. Numerical results indicate that these bounding techniques are useful for IS simulations of linear and nonlinear communication systems with intersymbol interference in which bit error rate and IS estimation variances cannot be obtained readily using prior techniques.

16.
The classical characterisation of vowel sounds used by phoneticians is the vowel quadrilateral, which originally was assumed to be the locus of the tongue body between its four extreme vowel positions. A method of real-time computer analysis and display which represents the vowel quadrilateral is described. This has application in computer-based speech training aids, but may have wider use in speech research generally.

17.
Computer-synthesized vowels were used to examine methods for controlling and measuring the perceptions elicited during electrical stimulation of the human cochlea. In the first experiment, we measured the importance of the second formant (F2) in the identification of vowels, matched for duration, in a single subject with a multichannel cochlear implant. The subject never confused vowels having a "low" frequency F2 with those having a "high" frequency F2. In the second experiment, identification functions were generated for a series of vowels varying only in F2. When the pattern of F2 stimulation at the basilar membrane was manipulated, vowel identification functions were altered. For the categorization of vowels, the data indicate that the relative cochlear position of F2 stimulation was more important than fine-grain temporal waveform cues. The data are supportive of cochlear implant coding strategies that make use of cochlear place information. In the later experiments, we manipulated filter passbands and channel gains to explore their effect on these classifications. These preliminary studies indicate that it is possible to "fine-tune" such classifications.

18.
A vowel discrimination test using a tactile vocoder was administered and the results were compared to those of an eight-channel cochlear implant. Both the tactile vocoder and the cochlear implant divided the speech signals into 16 frequency components using band-pass filters and lateral inhibition circuits. In the tactile vocoder, these 16 components were converted into a 200 Hz vibration and applied to a 3 x 16 element vibrator array using bimorph piezoelectric elements. The vibratory patterns were sensed on the fingertip. In the cochlear implant, the 16 components were reduced to eight current stimulation signals, consisting of 200 Hz biphasic pulses, which were applied to an eight-channel electrode array implanted in the scala tympani. The electrode array passed through the round window into the scala tympani to a depth of 23 mm. These psychophysical experiments investigate the ability of human subjects to discriminate synthetic vowels as a function of the number of channels employed. The results suggested that an eight-channel and a 16-channel tactile vocoder provided essentially the same discrimination scores. However, the ability to discriminate synthetic vowels decreased rapidly when fewer than eight channels were employed. The performance of an eight-channel tactile vocoder is expected to be better than that of the eight-channel cochlear implant, because vowel discrimination is thought to be degraded by a phenomenon known as "current spreading" in the case of cochlear stimulation. However, the comparison between the two devices was not done on the cochlear implant subject.

19.
Cochlear implants (CIs) restore partial hearing to people with severe to profound sensorineural deafness, but there is still a marked performance gap in speech recognition between those who have received a cochlear implant and people with normal hearing. One of the factors that may lead to this performance gap is the inadequate signal processing method used in CIs. This paper investigates the application of an improved signal-processing method called the bionic wavelet transform (BWT). This method is based upon the auditory model and allows for signal processing. Neural network simulations on the same experimental materials processed by the wavelet transform (WT) and the BWT show that applying the BWT to speech signal processing in CIs has a number of advantages, including: improvement in recognition rates for both consonants and vowels, reduction of the number of required channels, reduction of the average stimulation duration for words, and high noise tolerance. Consonant recognition results in 15 normal hearing subjects show that the BWT produces significantly better performance than the WT (t = -4.36276, p = 0.00065). The BWT has great potential to reduce the performance gap between CI listeners and people with normal hearing in the future.

20.
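Whispered speech is a mode of speaking in which a person talks softly with the vocal folds vibrating only slightly or not at all. This paper carries out a series of studies based on a previously collected speech corpus. On this basis, the formant positions and bandwidths of normal and whispered speech are computed and compared to obtain the corresponding ratios of change, which finally yields the basic formant characteristics of whispered speech.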

