Similar literature
20 similar articles found
1.
Most of the existing systems and methods for laryngeal pathology detection are characterized by a classification error. One of the basic problems is the approximation and estimation of the probability density functions of the given classes. In order to increase the accuracy of laryngeal pathology detection and to eliminate the most dangerous error, the classification of a patient with laryngeal disease as a normal speaker, an approach is proposed that models the probability density functions (pdfs) of the input vectors of the normal and pathological speakers by means of two prototype distribution maps (PDMs), respectively. The pdf of the input vectors of an unknown normal or pathological speaker is also modeled by such a prototype distribution neural map (PDM(X)), and the pathology detection is done by means of a ratio of specific similarities rather than by a direct comparison of some type of distance/similarity with a threshold. The experiments show an increased classification accuracy and indicate that the proposed method can be used for screening for laryngeal diseases. The method is applied in a consulting system for clinical practice.

2.
The methodology of electroglottography is briefly outlined. Major emphasis is given to validating key features of the electroglottographic (EGG) waveform using ultrahigh-speed laryngeal films. We show how the instants of glottal closure and opening may be identified from the EGG waveform. This information may be used to improve speech analysis techniques such as the pitch synchronous, closed phase, covariance analysis method. Other applications include pitch detection, the determination of intervals of voicing, unvoicing, mixed voicing and silence, improving speech synthesis, and assisting the automation of inverse filtering.

3.
An evaluation of reconstruction quality in image vector quantisation (VQ) is introduced. Several existing objective measures were examined, and several measures were developed that take into account specific distortions accompanying the VQ process. Analysis of the combination of the existing and proposed measures has led to the design of a simple objective measure that predicts subjective image quality more accurately than the PSNR criterion.
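For reference, the PSNR criterion that this abstract uses as its baseline can be sketched as follows. This is only the standard textbook definition, PSNR = 10·log10(peak² / MSE), and not the objective measure proposed in the abstract. The function name `psnr`, the nested-list image representation, and the default 8-bit peak of 255 are assumptions made for illustration.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are nested lists of pixel rows (an assumption for this sketch).
    """
    flat_a = [p for row in original for p in row]
    flat_b = [p for row in reconstructed for p in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

PSNR depends only on the mean squared error, so it cannot distinguish between distortions of equal energy that look very different, such as the blocking that VQ tends to produce. That limitation is the usual motivation for measures like the one the abstract describes.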

4.
Real-time multichannel computerized electrogastrograph (total citations: 3; self-citations: 0; citations by others: 3)
The purpose of this study was to develop a real-time multichannel computerized electrogastrograph (EGG) to measure and analyze electrical signals from the human abdominal surface. A soft-contact matrix composed of 25 cutaneous electrodes is embedded evenly in a latex mat. The mat can be firmly attached to the abdominal surface by drawing a vacuum between the matrix and the stomach. Twenty-five high-amplification filter/amplifiers provide a high signal-to-noise ratio and flat amplitude response for a signal between 0.02 and 0.12 Hz (1.2-7.2 cpm). The computer program provides waveform and frequency analysis for any chosen channel and mapping analyses for all 25 channels. A two-dimensional propagation exploration program was also developed. Using four different mapping analysis program subroutines, the optimal points for analyzing the EGG signals can be reliably found and variability of these locations can be observed easily. Results show differences in the EGG mappings of normal and abnormal subjects.

5.
赵力  邹采荣  吴镇扬 《电子学报》2002,30(7):967-969
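This paper proposes a new speech recognition method that combines the advantages of VQ, HMM, and an unsupervised speaker adaptation algorithm. An FVQ/HMM is built by replacing the output probability of the conventional HMM with the vector quantization error in each state, and an unsupervised adaptation algorithm based on fuzzy vector quantization is used to modify the codewords of each state of the FVQ/HMM, thereby adapting the codebook to an unknown speaker. In speaker-independent Mandarin digit (isolated and connected digit) speech recognition experiments, the new combined method is compared with the CHMM-based adaptation and recognition method. The experimental results show that its adaptation and recognition performance is better than that of the CHMM-based method.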

6.
The authors examined the applicability of the exponential distribution for the time-frequency representation of the electrogastrogram (EGG). The EGG is a noninvasive measurement of the electrical activity of the stomach obtained by placing electrodes on the abdominal skin. Quantitative analysis of the EGG has relied on spectral methods. The normal frequency of the EGG in humans is 3 cycles/min. Electrical dysrhythmia observed in the EGG is associated with motor disorders of the stomach. The exponential distribution was applied here for the estimation of EGG frequency and for the detection of dysrhythmia in the EGG. A series of computer simulations was conducted, demonstrating the reliability of the exponential distribution in the analysis of nonstationary electrical signals of the stomach. Applications of the exponential distribution in the spectral analysis of typical EGGs are presented. The results show that there is a great potential for the use of the exponential distribution in EGG analysis.

7.
Sequence detection is studied for communication channels with intersymbol interference and non-Gaussian noise using a novel adaptive receiver structure. The receiver adapts itself to the noise environment using an algorithm which employs a Gaussian mixture distribution model and the expectation maximization algorithm. Two alternate procedures are studied for sequence detection. These are a procedure based on the Viterbi algorithm and a symbol-by-symbol detection procedure. The Viterbi algorithm minimizes the probability that the sequence is in error, whereas the symbol-by-symbol detector minimizes the symbol error rate; these are different criteria.

8.
Noise reduction of VQ encoded images through anti-gray coding (total citations: 2; self-citations: 0; citations by others: 2)
Noise reduction of VQ encoded images is achieved through the proposed anti-gray coding (AGC) and noise detection and correction scheme. In AGC, binary indices are assigned to the codevectors in such a way that the 1-bit neighbors of a codevector are as far apart as possible. To detect the channel errors, we first classify an image into uniform and edge regions. Then we propose a mask to detect the channel errors based on the image classification (uniform or edge region) and the characteristics of AGC. We also mathematically derive a criterion for error detection based on the image classification. Once error indices are detected, the recovered indices can be easily chosen from a “candidate set” by minimizing the gray-level transition across the block boundaries in a VQ encoded image. Simulation results show that the proposed technique provides detection results with smaller than 0.1% probability of error and more than 86.3% probability of detection at a random bit error rate of 0.1%, while the undetected errors are invisible. In addition, the proposed detection and correction techniques improve the image quality (compared with that encoded by AGC) by 3.9 dB.

9.
A complexity reduction technique for image vector quantization (total citations: 2; self-citations: 0; citations by others: 2)
A technique for reducing the complexity of spatial-domain image vector quantization (VQ) is proposed. The conventional spatial domain distortion measure is replaced by a transform domain subspace distortion measure. Due to the energy compaction properties of image transforms, the dimensionality of the subspace distortion measure can be reduced drastically without significantly affecting the performance of the new quantizer. A modified LBG algorithm incorporating the new distortion measure is proposed. Unlike conventional transform domain VQ, the codevector dimension is not reduced and a better image quality is guaranteed. The performance and design considerations of a real-time image encoder using the techniques are investigated. Compared with spatial domain VQ, a speedup in both codebook design time and search time is obtained for mean residual VQ, and the size of fast RAM is reduced by a factor of four. Degradation of image quality is less than 0.4 dB in PSNR.

10.
Radioscintigraphy is currently the gold standard test for gastric emptying, but it involves radiation exposure and is considerably expensive. We present a feature-based detection approach using neural networks for the noninvasive diagnosis of delayed gastric emptying from the cutaneous electrogastrogram (EGG). Simultaneous recordings of the EGG and scintigraphic gastric emptying test were made in 152 patients with symptoms suggestive of delayed gastric emptying. Spectral analyses were performed to derive EGG parameters which were used as the input of the neural network. The result of scintigraphic gastric emptying was used as the gold standard for the training and testing of the neural network. A correct classification of 85% (a specificity of 89% and a sensitivity of 82%) was achieved using the proposed method.

11.
A statistical analysis of brain morphology using wild bootstrapping (total citations: 1; self-citations: 0; citations by others: 1)
Methods for the analysis of brain morphology, including voxel-based morphology and surface-based morphometries, have been used to detect associations between brain structure and covariates of interest, such as diagnosis, severity of disease, age, IQ, and genotype. The statistical analysis of morphometric measures usually involves two statistical procedures: 1) invoking a statistical model at each voxel (or point) on the surface of the brain or brain subregion, followed by mapping test statistics (e.g., t test) or their associated p values at each of those voxels; 2) correction for the multiple statistical tests conducted across all voxels on the surface of the brain region under investigation. We propose the use of new statistical methods for each of these procedures. We first use a heteroscedastic linear model to test the associations between the morphological measures at each voxel on the surface of the specified subregion (e.g., cortical or subcortical surfaces) and the covariates of interest. Moreover, we develop a robust test procedure that is based on a resampling method, called wild bootstrapping. This procedure assesses the statistical significance of the associations between a measure of given brain structure and the covariates of interest. The value of this robust test procedure lies in its computational simplicity and in its applicability to a wide range of imaging data, including data from both anatomical and functional magnetic resonance imaging (fMRI). Simulation studies demonstrate that this robust test procedure can accurately control the family-wise error rate. We demonstrate the application of this robust test procedure to the detection of statistically significant differences in the morphology of the hippocampus over time across gender groups in a large sample of healthy subjects.
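The resampling idea named in this abstract can be illustrated with a heavily simplified sketch: a wild bootstrap test of a single regression slope at one location, using Rademacher (±1) multipliers on the residuals of the null model. The abstract does not give the authors' test statistic or their correction across voxels, so this sketch makes no attempt to reproduce either. The function names, the unstudentized slope statistic, and the intercept-only null model are all assumptions made for illustration.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x (single covariate plus intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx


def wild_bootstrap_pvalue(x, y, n_boot=999, seed=0):
    """Two-sided p-value for 'slope == 0' via a wild bootstrap under the null.

    Each null residual keeps its own magnitude and only its sign is flipped at
    random, which is what lets the resampling tolerate unequal noise variances.
    """
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    mean_y = sum(y) / len(y)                 # fitted values of the intercept-only model
    residuals = [yi - mean_y for yi in y]    # residuals under the null hypothesis
    exceed = 0
    for _ in range(n_boot):
        y_star = [mean_y + r * rng.choice((-1.0, 1.0)) for r in residuals]
        if abs(slope(x, y_star)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)
```

A family-wise correction of the kind the abstract mentions would typically compare each location's statistic against the bootstrap distribution of the maximum statistic over all locations. That step is omitted here.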

12.
We present here a new method to identify the position of the optic disc (OD) in retinal fundus images. The method is based on the preliminary detection of the main retinal vessels. All retinal vessels originate from the OD and their path follows a similar directional pattern (parabolic course) in all images. To describe the general direction of retinal vessels at any given position in the image, a geometrical parametric model was proposed, where two of the model parameters are the coordinates of the OD center. Using as experimental data samples of vessel centerline points and corresponding vessel directions, provided by any vessel identification procedure, model parameters were identified by means of a simulated annealing optimization technique. These estimated values provide the coordinates of the center of OD. A Matlab prototype implementing this method was developed. An evaluation of the proposed procedure was performed using the set of 81 images from the STARE project, containing images from both normal and pathological subjects. The OD position was correctly identified in 79 out of 81 images (98%), even in rather difficult pathological situations.

13.
We present an assessment of the practical value of existing traditional and non-standard measures for discriminating healthy people from people with Parkinson's disease (PD) by detecting dysphonia. We introduce a new measure of dysphonia, Pitch Period Entropy (PPE), which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency. We collected sustained phonations from 31 people, 23 with PD. We then selected 10 highly uncorrelated measures, and an exhaustive search of all possible combinations of these measures finds four that in combination lead to overall correct classification performance of 91.4%, using a kernel support vector machine. In conclusion, we find that non-standard methods in combination with traditional harmonics-to-noise ratios are best able to separate healthy from PD subjects. The selected non-standard methods are robust to many uncontrollable variations in acoustic environment and individual subjects, and are thus well-suited to telemonitoring applications.

14.
An architecture suitable for real-time image coding using adaptive vector quantization (VQ) is presented. This architecture is based on the concept of content-addressable memory (CAM), where the data is accessed simultaneously and in parallel on the basis of its content. VQ essentially involves, for each input vector, a search operation to obtain the best match codeword. A speedup results if a CAM-based implementation is used. This speedup, coupled with the gains in execution time for the basic distortion operation, implies that even codebook generation is possible in real time (<32 ms). In using the CAM, the conventional mean square error measure is replaced by the absolute difference measure. This measure results in little degradation and in fact limits large errors. The regular and iterable architecture is particularly well suited for VLSI implementation.

15.
The authors introduce a novel coding technique which significantly improves the performance of the traditional vector quantisation (VQ) schemes at low bit rates. High interblock correlation in natural images results in a high probability that neighbouring image blocks are mapped to small subsets of the VQ codebook, which contains highly correlated codevectors. If, instead of the whole VQ codebook, a small subset is considered for the purpose of encoding neighbouring blocks, it is possible to improve the performance of traditional VQ schemes significantly. The performance improvement obtained with the new method is about 3 dB on average when compared with traditional VQ schemes at low bit rates. The method provides better performance than the JPEG coding standard at low bit rates, and gives comparable results with much less complexity than address VQ.

16.
The authors give exponential-type bounds on the probabilities of detection error under certain conditions. The bounds tend to zero rapidly as the sample size increases. These procedures are referred to as general information criterion (GIC) procedures since each is consistent and, under certain conditions, the rate of convergence of the estimate of the number of signals to the true value is rapid. The authors give bounds on the probability of wrong detection of the GIC procedure when the noise variance is unknown. Upper bounds on the probabilities of detection error are also obtained.

17.
Traditional speech processing methods for laryngeal pathology assessment assume linear speech production with measures derived from an estimated glottal flow waveform. They normally require the speaker to achieve complete glottal closure, which for many vocal fold pathologies cannot be accomplished. To address this issue, a nonlinear signal processing approach is proposed which does not require direct glottal flow waveform estimation. This technique is motivated by earlier studies of airflow characterization for human speech production. The proposed nonlinear approach employs a differential Teager energy operator and the energy separation algorithm to obtain formant AM and FM modulations from filtered speech recordings. A new speech measure is proposed based on parameterization of the autocorrelation envelope of the AM response. This approach is shown to achieve impressive detection performance for a set of muscular tension dysphonias. Unlike flow characterization using numerical solutions of Navier-Stokes equations, this method is computationally inexpensive, requiring only a small time window of speech samples. The new noninvasive method shows that a fast, effective digital speech processing technique can be developed for vocal fold pathology assessment without the need for direct glottal flow estimation or complete glottal closure by the speaker. The proposed method also confirms that alternative nonlinear methods can begin to address the limitations of previous linear approaches for speech pathology assessment.
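The Teager energy operator mentioned in this abstract has a well-known discrete form, ψ[x(n)] = x(n)² − x(n−1)·x(n+1), which can be sketched in a few lines. This covers only the basic operator. The energy separation algorithm, the formant filtering, and the autocorrelation-envelope measure from the abstract are not reproduced. The function name `teager_energy` and the plain-list signal representation are assumptions made for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, one for each interior sample.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure tone A*cos(omega*n + phi) the operator is constant and equals
# A^2 * sin(omega)^2, so it tracks amplitude and frequency jointly.
tone = [2.0 * math.cos(0.3 * n + 0.7) for n in range(50)]
energy = teager_energy(tone)
```

Because each output value depends on only three neighbouring samples, the operator responds almost instantaneously to changes in amplitude or frequency. This is consistent with the abstract's remark that only a small time window of speech is needed.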

18.
Endoscopic high-speed laryngoscopy in combination with image analysis strategies is the most promising approach to investigate the interrelation between vocal fold vibrations and voice disorders. So far, due to the lack of an objective and standardized analysis procedure, a unique characterization of vocal fold vibrations has not been achieved. We present a visualization and analysis strategy which transforms the segmented edges of vibrating vocal folds into a single 2-D image, denoted Phonovibrogram (PVG). Within a PVG the individual type of vocal fold vibration becomes uniquely characterized by specific geometric patterns. The PVG geometries give intuitive access to the type and degree of the laryngeal asymmetry and can be quantified using an image segmentation approach. The PVG analysis was applied to 14 representative recordings derived from a high-speed database comprising normal and pathological voices. We demonstrate that PVGs are capable of differentiating and quantifying different types of normal and pathological vocal fold vibrations. The objective and precise quantification of the PVG geometry may have the potential to realize a novel classification of vocal fold vibrations.

19.
The authors evaluate continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition, emphasising the performance of each model structure across incremental amounts of training data. Text-independent (TI) experiments are performed with VQ and CDHMMs, and text-dependent (TD) experiments are performed with DTW, VQ and CDHMMs. For TI speaker recognition, VQ performs better than an equivalent CDHMM with one training version, but is outperformed by CDHMM when trained with ten training versions. For TD experiments, DTW outperforms VQ and CDHMMs for sparse amounts of training data, but with more data the performance of each model is indistinguishable. The performance of the TD procedures is consistently superior to TI, which is attributed to subdividing the speaker recognition problem into smaller speaker-word problems. It is also shown that there is a large variation in performance across the different digits, and it is concluded that digit zero is the best digit for speaker discrimination.
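Of the three techniques compared in this abstract, dynamic time warping is the easiest to sketch compactly. The version below is the classic dynamic-programming recurrence over two scalar sequences with an absolute-difference local cost. A speaker recognition system would align sequences of feature vectors instead, and the abstract does not state which local cost or path constraints the authors used, so both are assumptions here, as is the name `dtw_distance`.

```python
def dtw_distance(a, b):
    """Minimum cumulative alignment cost between sequences a and b.

    Allowed steps are match, insertion and deletion; local cost is |a[i] - b[j]|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats against b
                                     cost[i][j - 1],      # b[j-1] repeats against a
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Since the warping path can stretch either sequence, two utterances of the same word spoken at different speeds can still align at low cost. This makes template matching workable with very little training data, in line with the text-dependent results reported above.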

20.
A bound for a Minkowski metric based on Lp distortion measure is proposed and evaluated as a means to reduce the computation in vector quantisation. This bound provides a better criterion than the absolute error inequality (AEI) elimination rule on the Euclidean distortion measure. For the Minkowski metric of order n, this bound contributes the elimination criterion from the L1 metric to Ln metric. This bound can also be an extended quadratic metric which can be a hidden Markov model (HMM) with a Gaussian mixture probability density function (PDF). In speech recognition, the HMM with the Gaussian mixture VQ codebook PDF has been shown to be a promising method.
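To make the idea of an elimination rule concrete, here is a minimal sketch of a nearest-codeword search with a component-wise test in the spirit of the absolute error inequality rule that the abstract uses as its baseline. It relies only on the fact that the Euclidean distance is at least as large as any single absolute component difference. It is not the Minkowski bound proposed in the abstract, whose exact form is not given here. The function name and the returned `skipped` counter are assumptions for illustration.

```python
import math


def nearest_codeword(x, codebook):
    """Return (index, distance, skipped) for the Euclidean-nearest codeword.

    A codeword is skipped without a full distance computation as soon as one
    component differs from x by at least the best distance found so far.
    """
    best_index, best_dist = -1, math.inf
    skipped = 0
    for index, codeword in enumerate(codebook):
        # Cheap test: ||x - c||_2 >= |x_j - c_j| for every component j.
        if any(abs(xi - ci) >= best_dist for xi, ci in zip(x, codeword)):
            skipped += 1
            continue
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, codeword)))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist, skipped
```

The search returns the same codeword as an exhaustive search because a skipped codeword can never be strictly closer than the current best. A tighter bound, such as the one the abstract proposes, would reject more candidates before the full distance is computed.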
