Similar articles
 Found 20 similar articles (search time: 12 ms)
1.
For a speech recognition system based on continuous-density hidden Markov models (CDHMMs), speaker adaptation of the CDHMM parameters is formulated as a Bayesian learning procedure. A speaker adaptation procedure that is easily integrated into the segmental k-means training procedure is presented for obtaining adaptive estimates of the CDHMM parameters. Results are reported for adapting both the mean and the diagonal covariance matrix of the Gaussian state observation densities of a CDHMM. Tests on a 39-word English alpha-digit vocabulary in isolated-word mode indicate that the speaker adaptation procedure matches the performance of a speaker-independent system when one training token per word is used for adaptation, and achieves much better performance when two or more training tokens are used. Compared with a speaker-dependent system, speaker adaptation always performs as well as or better than speaker-dependent training with the same amount of training data.
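The abstract does not give its update equations, so the following is only a minimal sketch of the textbook conjugate-prior (MAP) update for a Gaussian mean, the kind of Bayesian estimate such an adaptation procedure relies on. The function name `map_adapt_mean` and the prior weight `tau` are assumptions introduced for illustration; the frames are assumed to be already assigned to one state, as a segmental k-means alignment would provide.

```python
def map_adapt_mean(prior_mean, tau, frames):
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    prior_mean: speaker-independent mean vector (list of floats)
    tau:        assumed prior weight; larger values trust the prior more
    frames:     adaptation feature vectors assigned to this state
    """
    n = len(frames)
    dim = len(prior_mean)
    # Per-dimension sum of the adaptation frames.
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    # Interpolate between the prior mean and the sample statistics.
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


adapted = map_adapt_mean([0.0, 0.0], 2.0, [[1.0, 2.0], [3.0, 4.0]])
```

With no adaptation frames the estimate falls back to the prior mean, and as the number of frames grows it approaches the sample mean, which is consistent with the reported behaviour of approaching speaker-dependent performance as more tokens become available.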

2.
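For text-independent speaker verification on short utterances from mobile phones and telephones, this paper designs a speaker verification system based on a classified GMM-UBM (CGMM-UBM). The k-means algorithm partitions the set of speech parameters used to train the background model into several subspaces, and the target speaker's speech data are assigned to subspaces accordingly. A subsystem with the GMM-UBM structure is then built for each subspace, and the linearly weighted sum of the subsystem output scores is taken as the system output score. After classification, the models can use a lower number of mixture components, and the linear weighting strengthens the role of the subspaces that contribute more to verification performance. Experiments on 100 male speakers from the NIST'03 speech corpus show that, under short-utterance conditions, the classified system performs significantly better than the unclassified system and is also considerably more computationally efficient.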
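The linear weighting of subsystem scores described in this abstract can be sketched as follows. This is an illustrative sketch only: the function names, the example weights, and the decision threshold are all assumptions, since the abstract does not say how the weights or the threshold are chosen.

```python
def fuse_scores(scores, weights):
    """Linearly weighted sum of per-subspace verification scores."""
    if len(scores) != len(weights):
        raise ValueError("one weight is needed per subsystem score")
    return sum(w * s for w, s in zip(weights, scores))


def verify(scores, weights, threshold):
    """Accept the claimed identity when the fused score reaches the threshold."""
    return fuse_scores(scores, weights) >= threshold


# Three hypothetical subsystems; the first subspace is weighted most heavily.
fused = fuse_scores([1.0, 2.0, -1.0], [0.5, 0.25, 0.25])
```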

3.
The techniques used to develop an acoustic-phonetic hidden Markov model, the problems associated with representing the whole acoustic-phonetic structure, the characteristics of the model, and its performance as a phonetic decoder for recognition of fluent speech are discussed. The continuous variable duration model was trained using 450 sentences of fluent speech, each of which was spoken by a single speaker, and segmented and labeled using a fixed number of phonemes, each of which has a direct correspondence to the states of the matrix. The inherent variability of each phoneme is modeled as the observable random process of the Markov chain, while the phonotactic model of the unobservable phonetic sequence is represented by the state transition matrix of the hidden Markov model. The model assumes that the observed spectral data were generated by a Gaussian source. However, an analysis of the data shows that the spectra for most of the phonemes are not normally distributed and that an alternative representation would be beneficial.

4.
Kim H.R., Lee H.S. Electronics Letters, 1991, 27(18): 1633-1635
A modified corrective training method using state segment information in the hidden Markov model is presented. The proposed algorithm is shown to result in a higher recognition rate than the conventional corrective training method and requires less computation.

5.
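Synchronous speech recognition systems are moving toward continuous human-computer interaction, but conventional systems are susceptible to sudden bursts of noise, which degrades recognition. A continuous synchronous speech recognition system based on the hidden Markov model is therefore proposed. The overall hardware structure of the system is designed according to the principles of speech recognition. A low-pass filter built around the OPA604, a JFET-input high-fidelity operational amplifier, ensures the validity of the signal processing results. An OMAP5912ZZG chip stores the processed signal, a vector graph is used to buffer the audio, and the relevant speech recognition sequences are ported through an Ethernet interface, thereby achieving continuous synchronous speech recognition. Comparative experiments show that the peak recognition performance of the proposed system is 48% higher than that of the conventional system, which helps advance research on speech recognition technology.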

6.
A fused hidden Markov model with application to bimodal speech processing   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.

7.
Shrinkage of the mean vectors and the variances in an HMM due to additive white noise is an important issue for the speech recogniser. Given an assumed relation between the adaptation factors for the mean vectors and the variances, an optimal adaptation factor can be found using the maximum likelihood method.

8.
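Huang Gang (黄岗). 《电子设计工程》 (Electronic Design Engineering), 2013, 21(17): 60-62
Building on an in-depth analysis of the Markov model, this paper discusses the hidden Markov model in detail and introduces applications of Markov models in speech recognition, disease analysis, and other areas. It also examines the application of the hidden Markov model to its classic problems: the evaluation problem, the decoding problem, and the learning problem. Finally, the shortcomings of the Markov model and the hidden Markov model are discussed, and suggestions for improvement are proposed.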
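Of the three classic problems named in this abstract, the evaluation problem (computing the probability of an observation sequence under a given model) is the simplest to illustrate. The sketch below uses the standard forward algorithm for a discrete HMM; it is not taken from the paper, and the parameter names `pi`, `A`, `B` and the toy numbers are assumptions chosen for illustration.

```python
def forward_likelihood(pi, A, B, obs):
    """Probability of an observation sequence under a discrete HMM.

    pi:  initial state probabilities, pi[i]
    A:   state transition probabilities, A[i][j] = P(next=j | current=i)
    B:   emission probabilities, B[i][k] = P(symbol=k | state=i)
    obs: non-empty list of observed symbol indices
    """
    n = len(pi)
    # Initialisation: joint probability of the first symbol and each state.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: propagate through the transitions, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
            for j in range(n)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)


# Toy two-state, two-symbol model (illustrative values only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
likelihood = forward_likelihood(pi, A, B, [0, 1])
```

The recursion costs on the order of N²T operations for N states and T observations, rather than the N^T cost of enumerating every state path. For long sequences a practical implementation would scale `alpha` at each step or work in the log domain to avoid underflow.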

9.
In applying hidden Markov modeling for recognition of speech signals, the matching of the energy contour of the signal to the energy contour of the model for that signal is normally achieved by appropriate normalization of each vector of the signal prior to both training and recognition. This approach, however, is not applicable when only noisy signals are available for recognition. A unified approach is developed for gain adaptation in recognition of clean and noisy signals. In this approach, hidden Markov models (HMMs) for gain-normalized clean signals are designed using maximum-likelihood (ML) estimates of the gain contours of the clean training sequences. The models are combined with ML estimates of the gain contours of the clean test signals, obtained from the given clean or noisy signals, in performing recognition using the maximum a posteriori decision rule. The gain-adapted training and recognition algorithms are developed for HMMs with Gaussian subsources using the expectation-maximization (EM) approach.

10.
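Qu Dan (屈丹), Zhang Wenlin (张文林). 《通信学报》 (Journal on Communications), 2015, 36(9): 47-54
The eigenphone speaker adaptation method suffers from severe overfitting when adaptation data are insufficient. An eigenphone speaker adaptation algorithm based on a sparse group LASSO constraint is proposed. First, the basic principle of eigenphone speaker adaptation under the hidden Markov model-Gaussian mixture model (HMM-GMM) framework is given. Then sparse group LASSO regularization is introduced into eigenphone speaker adaptation; model complexity is controlled by adjusting the weighting factors, and the problem is solved with an accelerated proximal gradient optimization algorithm. Finally, the sparse group LASSO constrained adaptation algorithm is compared with several current regularization-constrained adaptation methods. Speaker adaptation experiments on Mandarin continuous speech recognition show that introducing the sparse group LASSO constraint markedly improves the performance of eigenphone speaker adaptation, and that the sparse group LASSO constraint outperforms l1, l2, and elastic net regularization.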

11.
The paper presents a hybrid of a hidden Markov model and a Markov chain model for speech recognition. In this hybrid, the hidden Markov model is concerned with the time-varying property of spectral features, while the Markov chain accounts for the interdependence of spectral features. The log-likelihood scores of the two models, with respect to a given utterance, are combined by a postprocessor to yield a combined log-likelihood score for word classification. Experiments on speaker-independent and multispeaker isolated English alphabet recognition show that the hybrid outperformed both the hidden Markov model and the Markov chain model in terms of recognition

12.
Linear predictive hidden Markov models have proved to be efficient for statistically modeling speech signals. The possible application of such models to statistical characterization of the speaker himself is described and evaluated. The results show that even with a short sequence of only four isolated digits, a speaker can be verified with an average equal-error rate of less than 3%. These results are slightly better than the results obtained using speaker-dependent vector quantizers, with comparable numbers of spectral vectors. The small improvement over the vector quantization approach indicates the weakness of the Markovian transition probabilities for characterizing speaker-dependent transitional information.

13.
Lee L.-M., Wang H.-C. Electronics Letters, 1995, 31(8): 616-617
The state parameters of the hidden Markov model are represented by the autocorrelation coefficients of a context window that can be adaptively transformed to cepstral and delta cepstral coefficients according to the environmental noise. Experimental results show that it can significantly improve the speech recognition rate under noisy environments.

14.
Hidden Markov models (HMMs) with bounded state durations (HMM/BSD) are proposed to explicitly model the state durations of HMMs and to account more accurately for the temporal structures existing in speech signals in a simple, direct, but effective way. A series of experiments has been conducted for speaker-dependent applications using 408 highly confusing first-tone Mandarin syllables as the example vocabulary. It was found that in the discrete case the recognition rate of HMM/BSD (78.5%) is 9.0%, 6.3%, and 1.9% higher than the conventional HMMs and HMMs with Poisson and gamma distribution state durations, respectively. In the continuous case (partitioned Gaussian mixture modeling), the recognition rates of HMM/BSD (88.3% with 1 mixture, 88.8% with 3 mixtures, and 89.4% with 5 mixtures) are 6.3%, 5.0%, and 5.5% higher than those of the conventional HMMs, and 5.9% (with 1 mixture), 3.9% (with 3 mixtures) and 3.1% (with 1 mixture), 1.8% (with 3 mixtures) higher than HMMs with Poisson and gamma distributed state durations, respectively.

15.
Dual-tree complex wavelet hidden Markov tree model for image denoising   Cited by: 2 (self-citations: 0, citations by others: 2)
Electronics Letters, 2007, 43(18): 973-975
A new non-training complex wavelet hidden Markov tree (HMT) model, which is based on the dual-tree complex wavelet transform and a fast parameter estimation technique, is proposed for image denoising. This new model can mitigate the two problems (high computational cost and shift-variance) of the conventional wavelet HMT model simultaneously. Experiments show that the denoising approach with this new model achieves better performance than other related HMT-based image denoising algorithms.

16.
This paper describes a complete system for the recognition of unconstrained handwritten words using a continuous density variable duration hidden Markov model (CD-VDHMM). First, a new segmentation algorithm based on mathematical morphology is developed to translate the 2-D image into a 1-D sequence of subcharacter symbols. This sequence of symbols is modeled by the CD-VDHMM. Thirty-five features are selected to represent the character symbols in the feature space. Generally, there are two information sources associated with written text: the shape information and the linguistic knowledge. While the shape information of each character symbol is modeled as a mixture Gaussian distribution, the linguistic knowledge, i.e., constraint, is modeled as a Markov chain. The variable duration state is used to take care of the segmentation ambiguity among the consecutive characters. A modified Viterbi algorithm, which provides l globally best paths, is adapted to VDHMM by incorporating the duration probabilities for the variable duration state sequence. The general string editing method is used at the postprocessing stage. Detailed experiments are carried out for two postal applications, and successful recognition results are reported.
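For orientation, here is a minimal sketch of the standard Viterbi algorithm for a discrete HMM. It is deliberately not the modified algorithm of the abstract: it returns only the single best path and ignores state duration probabilities. The names `pi`, `A`, `B` and the toy model are assumptions chosen for illustration.

```python
def viterbi(pi, A, B, obs):
    """Most likely state path, and its probability, for a discrete HMM."""
    n = len(pi)
    # delta[j]: probability of the best path so far that ends in state j.
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for symbol in obs[1:]:
        prev = delta
        # Best predecessor state for each current state j.
        ptr = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[ptr[j]] * A[ptr[j]][j] * B[j][symbol] for j in range(n)]
        backpointers.append(ptr)
    # Pick the best final state, then trace the stored pointers backwards.
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_prob


# Toy two-state, two-symbol model (illustrative values only).
best_path, best_prob = viterbi(
    [0.6, 0.4],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.5], [0.1, 0.9]],
    [0, 1],
)
```

Extending this to several globally best paths, or to explicit duration modeling, requires keeping more than one hypothesis per state and adding a duration term to each transition, which is where the abstract's modification departs from this sketch.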

17.
We consider quantization from the perspective of minimizing filtering error when quantized instead of continuous measurements are used as inputs to a nonlinear filter, specializing to discrete-time two-state hidden Markov models (HMMs) with continuous-range output. An explicit expression for the filtering error when continuous measurements are used is presented. We also propose a quantization scheme based on maximizing the mutual information between quantized observations and the hidden states of the HMM.
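The quantity being maximized, the mutual information between the hidden state and the quantized observation, can be computed directly from their joint distribution. The sketch below assumes the joint table is already available (for example, by integrating each state's output density over each quantizer cell); the function name and the example tables are illustrative assumptions, and the search over quantizer thresholds is not shown.

```python
import math


def mutual_information(joint):
    """Mutual information in bits from a joint probability table.

    joint[i][k] = P(state = i, quantized observation = k); entries sum to 1.
    """
    p_state = [sum(row) for row in joint]
    p_symbol = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for k, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_state[i] * p_symbol[k]))
    return mi


# A quantizer whose output reveals the state exactly carries 1 bit ...
informative = mutual_information([[0.5, 0.0], [0.0, 0.5]])
# ... while one whose output is independent of the state carries none.
useless = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```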

18.
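Li Hongsheng (李宏升), Xu Hongzhang (徐洪章). 《激光与红外》 (Laser & Infrared), 2013, 43(10): 1184-1187
To address the shortcomings of the hidden Markov model in image denoising, a parameter-solving method is adopted. First, constraints are determined for the model's parameter triple; state probabilities are computed recursively, and the expectation is maximized through maximum likelihood estimation, where expectation maximization consists of an expectation step and a maximization step. When extracting the observed signal during image denoising, the Kullback-Leibler distance is used to set the threshold, and the parameter solution is finally obtained. Simulation experiments show that the proposed algorithm preserves the useful information in the image and runs quickly.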

19.
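Chinese named entity recognition based on a cascaded hidden Markov model   Cited by: 29 (self-citations: 0, citations by others: 29)
An integrated method for Chinese named entity recognition based on a cascaded hidden Markov model is proposed, aiming to unify the recognition of person names, place names, organization names, and other named entities within a relatively uniform theoretical model. First, a low-level hidden Markov model is applied to the results of coarse word segmentation to recognize ordinary, non-nested person, place, and organization names; higher-level hidden Markov models are then applied in turn to recognize complex place and organization names that nest person or place names. In a closed test on a large-scale real-world corpus, the F-1 scores for person, place, and organization name recognition reached 92.55%, 94.53%, and 86.51%, respectively. ICTCLAS, a system using this method, ranked among the top entries in the first Chinese word segmentation bakeoff held by SIGHAN in May 2003.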

20.
Binary hypotheses testing using empirically observed statistics is studied in the Neyman-Pearson formulation for the hidden Markov model (HMM). An asymptotically optimal decision rule is proposed and compared to the generalized likelihood ratio test (GLRT), which has been shown in earlier studies to be asymptotically optimal for simpler parametric families. The proof of the main theorem is provided. The result can be applied to several types of HMMs commonly used in speech recognition and communication applications. Several applications are demonstrated.
