Similar literature
1.
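郭庆, 吴文虎, 方棣棠. Journal of Software (《软件学报》), 1999, 10(6): 631-635
The traditional hidden Markov model (THMM) has an obvious drawback when used to model real speech: it cannot adequately represent the temporal structure of the speech signal. Temporal correlation is considered very useful for recognition, because the feature vectors of adjacent frames are strongly correlated. The paper proposes a new method for incorporating temporal correlation into a speech recognition system based on the traditional hidden Markov model. First, inter-frame correlation is handled in the form of conditional probabilities; then a nonlinear probability approximation formula is used to represent the correlation between adjacent frames. The method does not increase the space complexity of the original THMM at all, and it also hardly increases...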

2.
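A new human action recognition method based on conditional random fields and hidden Markov models (HMMs), called HMCRF, is proposed. Existing action recognition methods all use hidden Markov models and their variants, whose most prominent shortcoming is the requirement that observations be mutually independent. Conditional models represent contextual dependencies easily, support efficient and exact inference through dynamic programming, and have parameters that can be trained by convex optimization. Conditional graphical models are applied to action recognition, and extensive experiments show that the proposed method clearly outperforms ordinary linear-chain CRFs and HMMs in recognition accuracy.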

3.
Classical recognition models do not perform well enough when expression image sequences contain noise or lost frames. To overcome this disadvantage, this paper presents a novel model called the fuzzy buried Markov model (FBMM). FBMM relaxes the conditional independence assumptions of the classical hidden Markov model (HMM) by adding specific cross-observation dependencies between observation elements. Compared with the buried Markov model (BMM), FBMM utilizes cloud distribution to replace probab...

4.
In this paper, we propose a novel optimization algorithm called constrained line search (CLS) for discriminative training (DT) of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The CLS method is formulated under a general framework for optimizing any discriminative objective function, including maximum mutual information (MMI), minimum classification error (MCE), minimum phone error (MPE)/minimum word error (MWE), etc. In this method, discriminative training of the HMM is first cast as a constrained optimization problem, where the Kullback-Leibler divergence (KLD) between models is explicitly imposed as a constraint during optimization. Based upon the idea of line search, we show that a simple formula for the HMM parameters can be found by constraining the KLD between the HMMs of two successive iterations in a quadratic form. The proposed CLS method can be applied to optimize all model parameters in Gaussian mixture CDHMMs, including means, covariances, and mixture weights. We have investigated the proposed CLS approach on several benchmark speech recognition databases, including TIDIGITS, Resource Management (RM), and Switchboard. Experimental results show that the new CLS optimization method consistently outperforms the conventional EBW method in both recognition performance and convergence behavior.
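The constrained problem described in this abstract can be written compactly as follows. This is only a sketch of the general form: the symbols are assumptions introduced here, with $F$ the discriminative objective, $\lambda$ the HMM parameters, $\lambda^{(n)}$ the parameters from iteration $n$, $D$ the KL divergence, and $\rho$ a hypothetical bound on the step size. The order of the arguments of $D$ is also an assumption, since the abstract does not state it.

```latex
\lambda^{(n+1)} = \arg\max_{\lambda} F(\lambda)
\quad \text{subject to} \quad
D\bigl(\lambda \,\|\, \lambda^{(n)}\bigr) \le \rho^{2}
```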

5.
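A constrained line search optimization method for discriminative training   Total citations: 1, self-citations: 0, citations by others: 1
An optimization method called "constrained line search" is proposed and applied to discriminative training of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The method can be used to optimize discriminative training objective functions based on the maximum mutual information (MMI) criterion. Discriminative training of the hidden Markov model (HMM) is first treated as a constrained optimization problem, with the KL divergence between models used as a constraint during optimization. Then, based on the idea of line search, it is shown that constraining the KL divergence between the models before and after an update allows the HMM parameters to be expressed in a simple quadratic form. The method can be used to optimize any parameter of a Gaussian mixture CDHMM, including means, covariance matrices, and Gaussian weights. It is applied to two standard speech recognition tasks, one English and one Chinese: the English TIDIGITS database and the Chinese 863 database. Experimental results show that the method consistently improves on the conventional extended Baum-Welch method in both recognition performance and convergence behavior.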

6.
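An online action recognition method based on discriminative random field models is proposed. It combines the characteristics of the traditional random field model and the hidden conditional random field model to build a frame-hidden conditional random field model for motion sequence frame data, and applies this model to data-driven action modeling. The traditional conditional random field model is used to model the motion characteristics between actions; by introducing hidden feature functions and designing effective feature templates that represent the relations among the sub-poses of an action, the intrinsic motion characteristics of the action are modeled. Comparative experiments show that the model has stronger recognition ability when processing action sequences online.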

7.
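Application of an improved hidden Markov model to speech recognition   Total citations: 1, self-citations: 0, citations by others: 1
A new Markov model, the asynchronous hidden Markov model, is proposed. The model addresses frames lost during speech recognition in noisy environments: by adding a new hidden time-index variable Ck, it estimates the state sequence corresponding to the actual observations and so models irregularly or incompletely sampled data. The forward-backward algorithm adapted to the asynchronous HMM and the EM algorithm used for training are described in detail, and the computation of the transition matrix is optimized. Finally, in simulation experiments, the classical HMM and the asynchronous HMM are used to recognize the same speech data with randomly removed frames; the results show that, with the same frames removed, the asynchronous HMM has a lower recognition error rate than the classical HMM.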
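As a reference point for the forward-backward algorithm mentioned in this abstract, the forward pass of a classical discrete HMM can be sketched as below. This is the standard baseline, not the asynchronous variant from the paper; the function name `forward_likelihood` and the toy parameters are assumptions made for illustration.

```python
from typing import List, Sequence


def forward_likelihood(pi: Sequence[float],
                       A: Sequence[Sequence[float]],
                       B: Sequence[Sequence[float]],
                       obs: Sequence[int]) -> float:
    """Classical forward pass: P(obs | model) for a discrete HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    obs     : non-empty sequence of observed symbol indices
    """
    n = len(pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha: List[float] = [pi[i] * B[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
                 for j in range(n)]
    return sum(alpha)


# Toy two-state model (hypothetical numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
likelihood = forward_likelihood(pi, A, B, [0, 1])
```

The classical recursion assumes one observation per time step, which is the assumption that lost frames violate.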

8.
In the present paper, a trajectory model, derived from a hidden Markov model (HMM) by imposing explicit relationships between static and dynamic feature vector sequences, is developed and evaluated. The derived model, named a trajectory HMM, can alleviate two limitations of the standard HMM, which are (i) piece-wise constant statistics within a state and (ii) conditional independence assumption of state output probabilities, without increasing the number of model parameters. In the present paper, a Viterbi-type training algorithm based on the maximum likelihood criterion is also derived. The performance of the trajectory HMM was evaluated both in speech recognition and synthesis. In a speaker-dependent continuous speech recognition experiment, the trajectory HMM achieved an error reduction over the corresponding standard HMM. Subjective listening test results showed that the introduction of the trajectory HMM improved the naturalness of synthetic speech.
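The "explicit relationship" between static and dynamic features can be illustrated with a small sketch in which each dynamic (delta) feature is a fixed linear function of neighbouring static features. The central-difference window, the edge handling, and the name `append_deltas` are assumptions for illustration; the abstract does not specify the window actually used.

```python
from typing import List, Sequence, Tuple


def append_deltas(static: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair each static feature with a delta computed from its neighbours.

    Assumed window: delta[t] = 0.5 * (static[t + 1] - static[t - 1]),
    with the first and last frames replicated at the sequence edges.
    """
    last = len(static) - 1
    observations = []
    for t, value in enumerate(static):
        previous_value = static[max(t - 1, 0)]
        next_value = static[min(t + 1, last)]
        observations.append((value, 0.5 * (next_value - previous_value)))
    return observations


observations = append_deltas([1.0, 3.0, 7.0])
```

Because the deltas are determined by the statics, the two streams cannot vary independently, which is the constraint the trajectory HMM makes explicit.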

9.
This paper discusses the use of an integrated HMM/NN classifier for speech recognition. The proposed classifier combines the time normalization property of the HMM classifier with the superior discriminative ability of the neural net (NN) classifier. Speech signals display a strong time varying characteristic. Although the neural net has been successful in many classification problems, it has been less successful than the HMM in the field of speech recognition. The main reason is the lack of time normalization characteristics of most neural net structures (time-delay neural net is one notable exception but its structure is very complex). In the proposed integrated hybrid HMM/NN classifier, a left-to-right HMM module is used first to segment the observation sequence of every exemplar into a fixed number of states. Subsequently, all the frames belonging to the same state are replaced by one average frame. Thus, every exemplar, irrespective of its time scale variation, is transformed into a fixed number of frames, i.e., a static pattern. The multilayer perceptron (MLP) neural net is then used as the classifier for these time normalized exemplars. Some experimental results using telephone speech databases are presented to demonstrate the potential of this hybrid integrated classifier.
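The time normalization step described above can be sketched as follows, assuming the frame-to-state alignment has already been produced by the left-to-right HMM. The function name `time_normalize` and the toy data are hypothetical, and leaving empty states as zero vectors is an assumption of this sketch.

```python
from typing import List, Sequence


def time_normalize(frames: Sequence[Sequence[float]],
                   state_of_frame: Sequence[int],
                   num_states: int) -> List[List[float]]:
    """Replace all frames aligned to the same state by their average frame."""
    if len(frames) != len(state_of_frame):
        raise ValueError("need exactly one state label per frame")
    dim = len(frames[0])
    sums = [[0.0] * dim for _ in range(num_states)]
    counts = [0] * num_states
    for frame, state in zip(frames, state_of_frame):
        counts[state] += 1
        for d, value in enumerate(frame):
            sums[state][d] += value
    # A state with no aligned frames stays an all-zero vector (assumption).
    return [[s / counts[k] if counts[k] else 0.0 for s in sums[k]]
            for k in range(num_states)]


# Five 2-dimensional frames aligned to three states.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
alignment = [0, 0, 1, 2, 2]
pattern = time_normalize(frames, alignment, num_states=3)
```

Whatever the length of the input, the result always has `num_states` rows, which gives a fixed-size static pattern that an MLP can take as input.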

10.
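Hidden Markov models for offline handwritten digit recognition   Total citations: 9, self-citations: 0, citations by others: 9
How to build the model is a question worth studying when hidden Markov models (HMMs) are applied to offline handwritten digit recognition. Taking into account the characteristics of handwritten digits and feature extraction, the training method and the choice of model parameters for the HMM are studied in order to raise the recognition rate of the system. In a bank check OCR application, combining the method with a neural-network-based method reduced the rejection rate for whole checks by 3% and clearly improved the performance of the bank check OCR system.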

11.
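A face recognition method using Gabor wavelets and hidden Markov models (HMMs) is explored. First, a multi-resolution Gabor wavelet transform is applied to the face image. Then a grid of nodes is placed on the image, each node is described by the multi-scale Gabor magnitude features at that node, and independent component analysis is used to decorrelate each node and reduce its dimensionality. Finally, feature nodes are formed, each feature node is used as an observation vector to train the hidden Markov model, and the optimized model parameters are used for face recognition. Experimental results on the ORL face database show that the method achieves a high recognition rate and is easy to apply in engineering practice.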

12.
Conditional models for contextual human motion recognition   Total citations: 1, self-citations: 0, citations by others: 1
We describe algorithms for recognizing human motion in monocular video sequences, based on discriminative conditional random fields (CRFs) and maximum entropy Markov models (MEMMs). Existing approaches to this problem typically use generative structures like the hidden Markov model (HMM). Therefore, they have to make simplifying, often unrealistic assumptions on the conditional independence of observations given the motion class labels and cannot accommodate rich overlapping features of the observation or long-term contextual dependencies among observations at multiple timesteps. This makes them prone to myopic failures in recognizing many human motions, because even the transition between simple human activities naturally has temporal segments of ambiguity and overlap. The correct interpretation of these sequences requires more holistic, contextual decisions, where the estimate of an activity at a particular timestep could be constrained by longer windows of observations, prior and even posterior to that timestep. This would not be computationally feasible with an HMM, which requires the enumeration of a number of observation sequences exponential in the size of the context window. In this work we follow a different philosophy: instead of restrictively modeling the complex image generation process – the observation, we work with models that can unrestrictedly take it as an input, hence condition on it. Conditional models like the proposed CRFs seamlessly represent contextual dependencies and have computationally attractive properties: they support efficient, exact recognition using dynamic programming, and their parameters can be learned using convex optimization.
We introduce conditional graphical models as complementary tools for human motion recognition and present an extensive set of experiments that show not only how these can successfully classify diverse human activities like walking, jumping, running, picking or dancing, but also how they can discriminate among subtle motion styles like normal walks and wander walks.

13.
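A hybrid phoneme recognition method combining an improved CP network with HMM   Total citations: 2, self-citations: 0, citations by others: 2
A hybrid phoneme recognition method is proposed that combines an improved counter-propagation (CP) neural network with a hidden Markov model (HMM). Its distinguishing feature is that an improved CP network, with supervised learning vector quantization (LVQ) and dynamic node allocation, generates the codebook of a discrete-HMM phoneme recognition system. The codebook of a hybrid phoneme recognition system built in this way is therefore in effect a high-performance classifier with strong discriminative ability, trained by the supervised LVQ algorithm. This means that, before the HMM models the speech signal, the observation sequence produced by the codebook...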

14.
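A study of HMM-based post-processing for Chinese text recognition   Total citations: 9, self-citations: 1, citations by others: 8
This paper uses a hidden Markov model (HMM) to describe the post-processing of Chinese text recognition, combining two probabilistic models, the Chinese language model and the single-character recognition model, so as to make full use of the information provided by the single-character recognizer. The parameters of the language model are obtained from corpus statistics. The parameters of the single-character recognition model are conditional probabilities; theoretical analysis shows that they can be computed via posterior probabilities. Based on an analysis of single-character recognition results on the training sample set, a statistical method is proposed for estimating the posterior probabilities of candidate characters. Experiments with the HMM on offline handwritten Chinese text recognition show that post-processing performance depends not only on the language model but also on accurate estimation of the posterior probabilities.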
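One way to picture this kind of post-processing is a Viterbi search over the recognizer's candidate lists, scored with a bigram language model. The sketch below is an illustration under assumptions, not the paper's method: the function name `best_sentence`, the probability floor for unseen bigrams, and the toy Latin-letter data are all hypothetical, and it expects at least one position.

```python
import math
from typing import Dict, List, Tuple


def best_sentence(candidates: List[Dict[str, float]],
                  bigram: Dict[Tuple[str, str], float],
                  floor: float = 1e-6) -> List[str]:
    """Pick one candidate per position, maximizing recognizer posterior
    times bigram language-model probability (Viterbi search in log space)."""
    # score[c] = best log-score of any path that ends in candidate c
    score = {c: math.log(p) for c, p in candidates[0].items()}
    back_pointers: List[Dict[str, str]] = []
    for position in candidates[1:]:
        new_score: Dict[str, float] = {}
        pointer: Dict[str, str] = {}
        for c, p in position.items():
            best_prev = max(
                score,
                key=lambda q: score[q] + math.log(bigram.get((q, c), floor)))
            new_score[c] = (score[best_prev]
                            + math.log(bigram.get((best_prev, c), floor))
                            + math.log(p))
            pointer[c] = best_prev
        score = new_score
        back_pointers.append(pointer)
    # Trace the best path backwards from the best final candidate.
    path = [max(score, key=score.get)]
    for pointer in reversed(back_pointers):
        path.append(pointer[path[-1]])
    return path[::-1]


# The recognizer alone prefers "e" in the first position; the language
# model pulls the decision towards "c".
candidates = [{"e": 0.6, "c": 0.4}, {"a": 0.5, "o": 0.5}, {"t": 0.7, "l": 0.3}]
bigram = {("c", "a"): 0.5, ("a", "t"): 0.5, ("c", "o"): 0.2, ("a", "l"): 0.2,
          ("e", "a"): 0.05, ("e", "o"): 0.05, ("o", "t"): 0.1, ("o", "l"): 0.1}
result = best_sentence(candidates, bigram)
```

Here `result` is `["c", "a", "t"]`. The example shows the dependence noted in the abstract: the outcome changes with both the language model and the candidate posteriors.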

15.
This paper presents a new approach for speech feature enhancement in the log-spectral domain for noisy speech recognition. A switching linear dynamic model (SLDM) is explored as a parametric model for the clean speech distribution. Each multivariate linear dynamic model (LDM) is associated with the hidden state of a hidden Markov model (HMM) as an attempt to describe the temporal correlations among adjacent frames of speech features. The state transition on the Markov chain is the process of activating a different LDM or activating some of them simultaneously with different probabilities generated by the HMM. Rather than holding a transition probability for the whole process, a connectionist model is employed to learn the time variant transition probabilities. With the resulting SLDM as the speech model and with a model for the noise, speech and noise are jointly tracked by means of switching Kalman filtering. Comprehensive experiments are carried out using the Aurora2 database to evaluate the new algorithm. The results show that the new SLDM approach can further improve the speech feature enhancement performance in terms of noise-robust recognition accuracy, since the transition probabilities among the LDMs can be described more precisely at each time point.

16.
17.
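A new hidden Markov model and its application to hand-drawn graphics recognition   Total citations: 2, self-citations: 0, citations by others: 2
A new hidden Markov model, the adaptive hidden Markov model (AHMM), is proposed. Unlike the traditional open-loop HMM, the AHMM is a closed-loop HMM with a feedback mechanism, used for recognition. The AHMM uses a feature compression algorithm with a compression-rate adjustment factor: the feature sequence to be recognized is first compressed at a relatively high compression rate, and the compressed sequence is then sent to the HMM recognizer. A judgment on whether the recognition result is satisfactory determines whether the compression-rate factor needs to be adjusted to obtain a longer feature sequence, which is then sent to the HMM recognizer again. The AHMM is applied to the recognition of online hand-drawn graphics, and experiments show that it significantly improves both recognition rate and recognition speed compared with the traditional HMM method.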
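The feedback loop described in this abstract can be sketched as follows. Everything specific here is an assumption made for illustration: compression is modelled as simple subsampling, the "satisfaction" judgment as a confidence threshold, the adjustment as halving the rate, and the recognizer as a hypothetical callable returning a label and a confidence.

```python
from typing import Callable, List, Sequence, Tuple

Recognizer = Callable[[List[float]], Tuple[str, float]]


def compress(features: Sequence[float], rate: int) -> List[float]:
    """Keep every `rate`-th feature (a stand-in for the real compression)."""
    return list(features[::rate])


def recognize_closed_loop(features: Sequence[float],
                          recognizer: Recognizer,
                          start_rate: int = 8,
                          threshold: float = 0.9) -> Tuple[str, int]:
    """Start with strong compression; relax it until the result is accepted."""
    rate = start_rate
    while True:
        label, confidence = recognizer(compress(features, rate))
        # Accept a satisfactory result, or stop once nothing is compressed.
        if confidence >= threshold or rate == 1:
            return label, rate
        rate = max(1, rate // 2)  # lower rate gives a longer feature sequence


def toy_recognizer(sequence: List[float]) -> Tuple[str, float]:
    """Hypothetical recognizer whose confidence grows with sequence length."""
    return "circle", min(1.0, len(sequence) / 10)


result = recognize_closed_loop([float(i) for i in range(32)], toy_recognizer)
```

With this toy recognizer the loop tries rates 8 and 4, then accepts at rate 2, so `result` is `("circle", 2)`. Easy inputs finish on short sequences, which is where a speed gain over always using the full sequence would come from.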

18.
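A sequential aircraft target recognition algorithm based on DSmT and HMM   Total citations: 1, self-citations: 1, citations by others: 0
To address the low recognition rate in automatic recognition of aircraft targets whose pose varies widely, a multiple features and sequential information fusion (MFSIF) recognition algorithm for aircraft is proposed, based on DSmT (Dezert-Smarandache theory) and the hidden Markov model (HMM). Its novelty lies in combining multi-feature fusion recognition on single images with fusion recognition over image sequences. First, the image is binarized in preprocessing, and the Hu moments and the local singular values of the contour of the target are extracted. Next, probabilistic neural networks (PNNs) are used to construct the basic belief assignment (BBA). DSmT is then used to fuse the different features of the image, which yields the observation sequence for the HMM. After that, hidden Markov models fuse the aircraft sequence information: the similarity between the observation sequence and each hidden Markov model is computed, achieving automatic recognition of aircraft targets with widely varying pose. Finally, simulation experiments verify that the algorithm still obtains a high correct recognition rate when the aircraft pose changes considerably, and that its real-time performance meets the requirements of aircraft target recognition. In addition, the correct recognition rate remains high when the aircraft sequence is occluded for τ ≤ 6 consecutive frames.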

19.
In this paper we present a new event analysis framework based on mixture hidden Markov model (HMM) for ice hockey videos. Hockey is a competitive sport and hockey videos are hard to analyze because of the homogeneity of its frame features. However, the temporal dynamics of hockey videos is highly structured. Using the mixture representation of local observations and Markov chain property of hockey event structure, we successfully model the hockey event as a mixture HMM. Based on the mixture HMM, the hockey event could be classified with high accuracy. Two types of mixture HMMs, Gaussian mixture and independent component analysis (ICA) mixture, are compared for the hockey video event classification. The results confirm our analysis that the mixture HMM is a suitable model to deal with videos with intensive activities. The new mixture HMM hockey event model could be a very useful tool for hockey game analysis.

20.
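To deal with the overlap between frames in speech signals and to improve adaptability to speech signals, this paper proposes an improved speech recognition system based on the hidden Markov model (HMM) and a genetic-algorithm neural network. The improved method uses a wavelet neural network to train on Mel-frequency cepstral coefficients (MFCCs), then uses the HMM to model the temporal structure of the speech signal and to compute scores from the HMM output probabilities for the speech; these scores serve as the input to the genetic neural network, which yields the classification result for the speech. Experimental results show that the improved system is more robust to noise than the HMM alone and improves the performance of the speech recognition system.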
