Similar literature (20 results)
1.
The paper presents a hybrid of a hidden Markov model and a Markov chain model for speech recognition. In this hybrid, the hidden Markov model is concerned with the time-varying property of spectral features, while the Markov chain accounts for the interdependence of spectral features. The log-likelihood scores of the two models, with respect to a given utterance, are combined by a postprocessor to yield a combined log-likelihood score for word classification. Experiments on speaker-independent and multispeaker isolated English alphabet recognition show that the hybrid outperformed both the hidden Markov model and the Markov chain model in terms of recognition
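The score combination described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the postprocessor combines the two scores, so the weighted sum in `combined_score` is an assumption, as are the discrete observation symbols and all function names.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def hmm_log_likelihood(obs, log_pi, log_A, log_B):
    """Forward algorithm in the log domain for a discrete-observation HMM."""
    n = len(log_pi)
    alpha = [log_pi[i] + log_B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + log_A[i][j] for i in range(n)]) + log_B[j][o]
            for j in range(n)
        ]
    return logsumexp(alpha)

def markov_chain_log_likelihood(obs, log_init, log_trans):
    """Log-likelihood of the symbol sequence under a first-order Markov chain."""
    score = log_init[obs[0]]
    for prev, cur in zip(obs, obs[1:]):
        score += log_trans[prev][cur]
    return score

def combined_score(hmm_score, chain_score, weight=0.5):
    """Hypothetical postprocessor: weighted sum of the two log-likelihoods."""
    return weight * hmm_score + (1.0 - weight) * chain_score
```

For word classification, one such pair of models would be trained per word, and the word with the highest combined score would be chosen.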

2.
The techniques used to develop an acoustic-phonetic hidden Markov model, the problems associated with representing the whole acoustic-phonetic structure, the characteristics of the model, and how it performs as a phonetic decoder for recognition of fluent speech are discussed. The continuous variable duration model was trained using 450 sentences of fluent speech, each of which was spoken by a single speaker, and segmented and labeled using a fixed number of phonemes, each of which has a direct correspondence to the states of the matrix. The inherent variability of each phoneme is modeled as the observable random process of the Markov chain, while the phonotactic model of the unobservable phonetic sequence is represented by the state transition matrix of the hidden Markov model. The model assumes that the observed spectral data were generated by a Gaussian source. However, an analysis of the data shows that the spectra for most of the phonemes are not normally distributed and that an alternative representation would be beneficial

3.
In applying hidden Markov modeling for recognition of speech signals, the matching of the energy contour of the signal to the energy contour of the model for that signal is normally achieved by appropriate normalization of each vector of the signal prior to both training and recognition. This approach, however, is not applicable when only noisy signals are available for recognition. A unified approach is developed for gain adaptation in recognition of clean and noisy signals. In this approach, hidden Markov models (HMMs) for gain-normalized clean signals are designed using maximum-likelihood (ML) estimates of the gain contours of the clean training sequences. The models are combined with ML estimates of the gain contours of the clean test signals, obtained from the given clean or noisy signals, in performing recognition using the maximum a posteriori decision rule. The gain-adapted training and recognition algorithms are developed for HMMs with Gaussian subsources using the expectation-maximization (EM) approach

4.
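The development direction of synchronous speech recognition systems is continuous human-computer interaction. Traditional systems are susceptible to sudden bursts of noise, which degrades recognition performance, so a continuous synchronous speech recognition system based on the hidden Markov model is proposed. Following the principles of speech recognition, the overall hardware structure of the system is designed. A low-pass filter built on the OPA604, a JFET-input high-fidelity operational amplifier, ensures the validity of the signal processing results. An OMAP5912ZZG chip stores the processed signal, vector maps are used to buffer the audio, and the relevant speech recognition sequences are ported through an Ethernet interface, thereby achieving continuous synchronous speech recognition. Comparative experiments show that the peak recognition performance of the system is 48% higher than that of the traditional system, which advances research on speech recognition technology.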

5.
A new deformed shape recognition method that relies on hidden Markov models to evaluate the sequentiality of the relevant points of the shape is proposed. These points are extracted from its adaptively calculated curvature function to give stability against noise, transformations and deformations. The proposed method is very fast. Comparative tests for different shapes have been successful.

6.
Kim, H.R.; Lee, H.S. Electronics Letters, 1991, 27(18): 1633-1635
A modified corrective training method using state segment information in the hidden Markov model is presented. The proposed algorithm is shown to result in a higher recognition rate than the conventional corrective training method and requires less computation.

7.
Motion trajectories provide rich spatiotemporal information about an object's activity. This paper presents novel classification algorithms for recognizing object activity using object motion trajectory. In the proposed classification system, trajectories are segmented at points of change in curvature, and the subtrajectories are represented by their principal component analysis (PCA) coefficients. We first present a framework to robustly estimate the multivariate probability density function based on PCA coefficients of the subtrajectories using Gaussian mixture models (GMMs). We show that GMM-based modeling alone cannot capture the temporal relations and ordering between underlying entities. To address this issue, we use hidden Markov models (HMMs) with a data-driven design in terms of number of states and topology (e.g., left-right versus ergodic). Experiments using a database of over 5700 complex trajectories (obtained from UCI-KDD data archives and Columbia University Multimedia Group) subdivided into 85 different classes demonstrate the superiority of our proposed HMM-based scheme using PCA coefficients of subtrajectories in comparison with other techniques in the literature.

8.
Vaseghi, S.V. Electronics Letters, 1991, 27(8): 625-626
A new method is proposed for incorporation of duration knowledge in the form of duration-dependent state transition probabilities in a left-right hidden Markov model. Duration-dependent transition probabilities are derived from integration of histograms of the state durations. The model re-estimation process becomes one of obtaining a new segmentation from which a new set of state and observation probabilities are derived.
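One plausible reading of "integration of histograms" is a running (cumulative) sum over the duration histogram, which yields the probability of leaving a state given how long it has already been occupied. The sketch below follows that reading; it is an assumption, not the paper's stated procedure, and the function name is hypothetical.

```python
def duration_dependent_exit_probs(histogram):
    """Exit probabilities from a state-duration histogram.

    histogram[d - 1] is the number of training segments in which the state
    lasted exactly d frames. The returned list holds, for each d,
    P(leave the state after d frames | the state has lasted at least d frames).
    The matching self-transition probability is 1 minus that value.
    """
    remaining = sum(histogram)  # segments that lasted at least 1 frame
    exit_probs = []
    for count in histogram:
        # If no segment survived this long, force an exit.
        exit_probs.append(count / remaining if remaining > 0 else 1.0)
        remaining -= count
    return exit_probs
```

In a left-right model, the exit probability would be assigned to the transition into the next state, so that the transition probabilities change with the time already spent in the current state.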

9.
It is well known that a strong relationship exists between human voices and the movement of articulatory facial muscles. In this paper, we utilize this knowledge to implement an automatic speech recognition scheme which uses solely surface electromyogram (EMG) signals. The sequence of EMG signals for each word is modelled by a hidden Markov model (HMM) framework. The main objective of the work involves building a model for state observation density when multichannel observation sequences are given. The proposed model reflects the dependencies between each of the EMG signals, which are described by introducing a global control variable. We also develop an efficient model training method, based on a maximum likelihood criterion. In a preliminary study, 60 isolated words were used as recognition variables. EMG signals were acquired from three articulatory facial muscles. The findings indicate that such a system may have the capacity to recognize speech signals with an accuracy of up to 87.07%, which is superior to the independent probabilistic model.

10.
11.
The authors evaluate continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition, emphasising the performance of each model structure across incremental amounts of training data. Text-independent (TI) experiments are performed with VQ and CDHMMs, and text-dependent (TD) experiments are performed with DTW, VQ and CDHMMs. For TI speaker recognition, VQ performs better than an equivalent CDHMM with one training version, but is outperformed by CDHMM when trained with ten training versions. For TD experiments, DTW outperforms VQ and CDHMMs for sparse amounts of training data, but with more data the performance of each model is indistinguishable. The performance of the TD procedures is consistently superior to TI, which is attributed to subdividing the speaker recognition problem into smaller speaker-word problems. It is also shown that there is a large variation in performance across the different digits, and it is concluded that digit zero is the best digit for speaker discrimination
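As a reminder of how the DTW baseline mentioned in this abstract compares two utterances, here is a minimal sketch of the textbook DTW recurrence. The Euclidean local distance and the unconstrained step pattern are assumptions; the abstract does not give the paper's configuration.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    a and b are non-empty sequences of equal-dimension numeric tuples.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

In a text-dependent setting, a test utterance would be compared against each speaker's stored template for the same word, and the speaker with the lowest cost would be selected.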

12.
A new deformed shape recognition method based on hidden Markov models (HMMs), which is very resistant against transformations and non-rigid deformations, is presented. Since shape features are not referred to an absolute point, the method is also resistant to severe shape distortions. The method has been successfully tested using different databases

13.
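Li Nan; Ji Guangrong. Modern Electronics Technique (现代电子技术), 2012, 35(8): 54-56, 60
To study the application of hidden Markov models in image recognition in more detail, this paper takes fingerprint recognition as an example and surveys several fingerprint image recognition algorithms based on hidden Markov models, including the one-dimensional hidden Markov model, the pseudo two-dimensional hidden Markov model, the two-dimensional model, and the group of one-dimensional models. The advantages and disadvantages of these four hidden Markov models in image recognition are summarized in terms of time complexity, recognition accuracy, and other aspects, leading to conclusions about which recognition model suits which kind of image.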

14.
Due to the enormous quantity of radar images acquired by satellites and through shuttle missions, there is an evident need for efficient automatic analysis tools. This paper describes unsupervised classification of radar images in the framework of hidden Markov models and generalized mixture estimation. Hidden Markov chain models, applied to a Hilbert-Peano scan of the image, constitute a fast and robust alternative to hidden Markov random field models for spatial regularization of image analysis problems, even though the latter provide a finer and more intuitive modeling of spatial relationships. We here compare the two approaches and show that they can be combined in a way that conserves their respective advantages. We also describe how the distribution families and parameters of classes with constant or textured radar reflectivity can be determined through generalized mixture estimation. Sample results obtained on real and simulated radar images are presented.
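The Hilbert-Peano scan mentioned in this abstract turns a 2-D image into a 1-D sequence in which consecutive samples are spatial neighbours, which is what lets a hidden Markov chain stand in for a random field. The sketch below uses the standard Hilbert curve index-to-coordinate conversion; whether the paper uses exactly this variant is not stated, and the restriction to square images with a power-of-two side is an assumption of this sketch.

```python
def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side-by-side grid."""
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant so sub-curves join up
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_scan(image):
    """Flatten a square image (list of rows) along the Hilbert curve."""
    side = len(image)
    if side == 0 or side & (side - 1) or any(len(row) != side for row in image):
        raise ValueError("image must be square with a power-of-two side")
    coords = (hilbert_d2xy(side, d) for d in range(side * side))
    return [image[y][x] for x, y in coords]
```

The resulting list can be fed to any chain-based model as an ordinary observation sequence, and the class labels can be written back into the image through the same index mapping.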

15.
The authors propose a channel compensation method for the hidden Markov model (HMM) parameters in automatic speech recognition. The proposed approach is to adapt the existing reference models to a new channel environment by using a small amount of adaptation data. The concept of HMM parameter adaptation by incorporating the corresponding phone-dependent channel compensation (PDCC) vectors is applied to improve the performance of speech recognition. Two extended PDCC techniques are presented. One is based on the refinement of PDCC using vector quantisation. The other is based on the interpolation of compensation vectors. Both techniques are evaluated in experiments on telephone speech recognition and speaker adaptation. The experimental results show that the performance can be significantly improved

16.
A fused hidden Markov model with application to bimodal speech processing   Total citations: 2 (self-citations: 0; citations by others: 2)
This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.

17.
Speaker-dependent phoneme recognition experiments were conducted using variants of the semicontinuous hidden Markov model (SCHMM) with explicit state duration modeling. Results clearly demonstrated that the SCHMM with state duration offers significantly improved phoneme classification accuracy compared to both the discrete HMM and the continuous HMM; the error rate was reduced by more than 30% and 20%, respectively. The use of a limited number of mixture densities significantly reduced the amount of computation. Explicit state duration modeling further reduced the error rate

18.
A novel framework of an online unsupervised learning algorithm is presented to flexibly adapt the existing speaker-independent hidden Markov models (HMMs) to nonstationary environments induced by varying speakers, transmission channels, ambient noises, etc. The quasi-Bayes (QB) estimate is applied to incrementally obtain word sequence and adaptation parameters for adjusting HMMs when a block of unlabelled data is enrolled. The underlying statistics of a nonstationary environment can be successively traced according to the newest enrolment data. To improve the QB estimate, the adaptive initial hyperparameters are employed in the beginning session of online learning. These hyperparameters are estimated from a cluster of training speakers closest to the test environment. Additionally, a selection process is developed to select reliable parameters from a list of candidates for unsupervised learning. A set of reliability assessment criteria is explored for selection. In a series of speaker adaptation experiments, the effectiveness of the proposed method is confirmed and it is found that using the adaptive initial hyperparameters in online learning and the multiple assessments in parameter selection can improve the recognition performance

19.
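Automatic license plate recognition technology based on hidden Markov models   Total citations: 2 (self-citations: 0; citations by others: 2)
This paper proposes a new method for license plate character recognition: a two-dimensional hidden Markov model is used to recognize the Chinese characters on the plate, and a pseudo two-dimensional hidden Markov model (P2D-HMM) is used to recognize the English letters and Arabic numerals. The algorithm handles varying character sizes, slanted characters, and stained or damaged characters, and is highly robust to noise. The character recognition accuracy exceeds 94%, which meets the requirements of practical use.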

20.
This paper describes a complete system for the recognition of unconstrained handwritten words using a continuous density variable duration hidden Markov model (CD-VDHMM). First, a new segmentation algorithm based on mathematical morphology is developed to translate the 2-D image into a 1-D sequence of subcharacter symbols. This sequence of symbols is modeled by the CD-VDHMM. Thirty-five features are selected to represent the character symbols in the feature space. Generally, there are two information sources associated with written text: the shape information and the linguistic knowledge. While the shape information of each character symbol is modeled as a mixture Gaussian distribution, the linguistic knowledge, i.e., constraint, is modeled as a Markov chain. The variable duration state is used to take care of the segmentation ambiguity among the consecutive characters. A modified Viterbi algorithm, which provides l globally best paths, is adapted to VDHMM by incorporating the duration probabilities for the variable duration state sequence. The general string editing method is used at the postprocessing stage. The detailed experiments are carried out for two postal applications, and successful recognition results are reported.
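For orientation, the sketch below shows the standard single-best Viterbi recursion for a discrete-observation HMM. It is not the modified algorithm from this abstract: it returns one path rather than l globally best paths and includes no duration probabilities. All names and the discrete observation model are assumptions.

```python
def viterbi(obs, log_pi, log_A, log_B):
    """Return (best_state_path, best_log_score) for a discrete HMM.

    log_pi[i]   : log initial probability of state i
    log_A[i][j] : log transition probability from state i to state j
    log_B[i][o] : log probability of emitting symbol o in state i
    """
    n = len(log_pi)
    delta = [log_pi[i] + log_B[i][obs[0]] for i in range(n)]
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n):
            # Best predecessor of state j at this time step.
            best_i = max(range(n), key=lambda i: delta[i] + log_A[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_A[best_i][j] + log_B[j][o])
        delta = new_delta
        backptrs.append(ptr)
    state = max(range(n), key=lambda i: delta[i])
    best_score = delta[state]
    path = [state]
    for ptr in reversed(backptrs):  # trace back from the final state
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score
```

Extending this to several best paths requires keeping a ranked list of partial paths per state instead of a single back-pointer, and a variable duration model additionally scores how long each state is occupied.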
