Similar Documents
20 similar documents found; search time: 156 ms
1.
A Speech Recognition System Based on an HMM and a Genetic Neural Network   (Cited: 1; self-citations: 0; by others: 1)
This paper proposes a hybrid speech recognition method based on a hidden Markov model (HMM) and a back-propagation network optimized by a genetic algorithm (GA-BP). The method first uses the HMM for temporal modeling of the speech signal and computes the output-probability score of the speech against each HMM; these probability scores are then fed as input to the optimized back-propagation network to obtain classification information, and a final recognition decision is made according to the hybrid model's recognition algorithm. Training and testing on existing sample data were carried out in Matlab. Simulation results show that, because the design fully exploits the HMM's strong temporal-modeling ability and the GA-BP network's strong classification ability, the hybrid model is more robust to noise than an HMM alone, overcomes the neural network's local-optimum problem, greatly speeds up recognition, and clearly improves the performance of the speech recognition system.
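The first stage described above — turning per-model HMM output-probability scores into the input vector of a neural classifier — can be sketched roughly as follows. This is a minimal discrete-observation illustration; the function names and toy parameters are ours, not the paper's:

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete HMM.
    pi: initial state probs (N,), A: transitions (N, N), B: emissions (N, M)."""
    alpha = pi * B[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict states, weight by emission prob
        s = alpha.sum()
        log_p += np.log(s)              # accumulate the scaling factors
        alpha = alpha / s
    return log_p

def hmm_score_vector(obs, models):
    """One HMM per word class; their scores form the classifier's input."""
    return np.array([forward_log_likelihood(obs, *m) for m in models])
```

In the hybrid scheme, the output of something like `hmm_score_vector` would be fed to the GA-optimized BP network in place of raw acoustics.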

2.
Training HMMs with GEP and the Baum-Welch Algorithm   (Cited: 1; self-citations: 0; by others: 1)
With the traditional forward-backward (Baum-Welch) algorithm for training an HMM's transition probabilities a_ij and emission probabilities b_i(O_t), it is often very hard to drive the probability of the observation sequence O to its true maximum: training these two parts of the HMM is possible in theory, but only a local maximum can be guaranteed. A key feature of gene expression programming (GEP), which searches globally, is that it can find global optima quickly and efficiently. This paper introduces GEP into HMM training and proposes an improved training method, GBHA. Experimental results show that the algorithm is more efficient and more stable than the traditional one.

3.
The HMM/GMM framework commonly used in speech recognition has weak discriminative power owing to the limitations of its training criterion and algorithms; the alternative HMM/ANN framework has strong pattern-classification ability but lacks mature, effective optimization methods. This work applies the TANDEM approach, which combines the advantages of both, to a Mandarin pronunciation error detection system: a discriminatively trained neural network estimates phone-level posterior probabilities, and after a series of post-processing steps the original MFCC features are converted into TANDEM features, which serve as the input to the HMM-based pronunciation error detection system that completes the assessment. Experimental results show that the TANDEM method markedly improves the system's error-detection performance, and the gains are even clearer when combined with adaptation methods such as MLLR.

4.
The hidden Markov model (HMM) has proved to be a good tool for modeling the normal behavior of a system, but its Baum-Welch training algorithm is inefficient: training consumes substantial computing resources, which limits its practicality for intrusion detection. This paper proposes an efficient scheme for training an HMM with multiple observation sequences; our experimental results show that the method saves 60% of the training time compared with the traditional method.
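The multi-sequence Baum-Welch re-estimation being accelerated here has a standard textbook form: pool the expected counts over all observation sequences, then normalize once. A minimal discrete-HMM sketch (generic EM, not the paper's optimized scheme):

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass for one discrete observation sequence."""
    T, N = len(obs), len(pi)
    alpha, beta, c = np.zeros((T, N)), np.zeros((T, N)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t+1]] * beta[t+1])) / c[t+1]
    return alpha, beta, c

def baum_welch_step(seqs, pi, A, B):
    """One re-estimation step that pools expected counts over ALL sequences
    before normalizing -- the standard multi-sequence extension."""
    N, M = B.shape
    A_num = np.zeros((N, N)); g_sum = np.zeros(N)
    B_num = np.zeros((N, M)); pi_new = np.zeros(N)
    for obs in seqs:
        alpha, beta, c = forward_backward(obs, pi, A, B)
        gamma = alpha * beta                    # state posteriors per frame
        gamma /= gamma.sum(axis=1, keepdims=True)
        pi_new += gamma[0]
        for t in range(len(obs) - 1):
            xi = alpha[t][:, None] * A * B[:, obs[t+1]] * beta[t+1] / c[t+1]
            A_num += xi
            g_sum += gamma[t]
        for t, o in enumerate(obs):
            B_num[:, o] += gamma[t]
    return (pi_new / len(seqs),
            A_num / g_sum[:, None],
            B_num / B_num.sum(axis=1, keepdims=True))
```

Each EM step is guaranteed not to decrease the total log-likelihood of the training sequences, which is also what makes the local-maximum limitation mentioned in item 2 unavoidable for this family of methods.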

5.
To address the long training time of DBN-DNN networks in speech recognition, a fast training method for DBN-DNN networks is proposed. Aiming to reduce the computation of error back-propagation, the method accelerates training by alternately varying the number of layers updated when the network parameters are refreshed. Two implementation strategies are also designed: gradually reducing the global update frequency of the network, and gradually reducing the number of layers updated. The training method can be combined with a variety of DNN training-acceleration algorithms. Experimental results show that, without affecting recognition accuracy, the method achieves satisfactory speedups both on its own and in combination with acceleration algorithms such as Stochastic Data Sweeping (SDS) and ASGD.
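The core idea — saving back-propagation work by varying how many top layers are updated in each iteration — can be sketched on a toy tanh MLP. Everything below is an illustrative stand-in, not the paper's DBN-DNN code:

```python
import numpy as np

rng = np.random.default_rng(1)

def init_mlp(sizes):
    """List of (W, b) per layer; tanh hidden units, linear output."""
    return [(rng.standard_normal((m, n)) * 0.5, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    acts = [x]
    for i, (W, b) in enumerate(layers):
        z = acts[-1] @ W + b
        acts.append(z if i == len(layers) - 1 else np.tanh(z))
    return acts

def train_step(layers, x, y, lr, n_update):
    """Back-propagate only as deep as the top `n_update` layers; alternating
    n_update between 'all layers' and 'top layer only' across calls is the
    layer-alternation idea, and truncating the backward pass is where the
    computation is saved."""
    acts = forward(layers, x)
    delta = (acts[-1] - y) / len(x)        # dLoss/dz for mean squared error
    L = len(layers)
    for i in range(L - 1, L - 1 - n_update, -1):
        W, b = layers[i]
        gW, gb = acts[i].T @ delta, delta.sum(axis=0)
        if i > L - n_update:               # a lower layer still needs its delta
            delta = (delta @ W.T) * (1 - acts[i] ** 2)
        layers[i] = (W - lr * gW, b - lr * gb)
    return 0.5 * np.sum((acts[-1] - y) ** 2) / len(x)
```

A schedule then just alternates `n_update` (e.g. full update on even steps, top-layer-only on odd steps), which halves most of the backward passes.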

6.
A Speech Recognition Model Based on Recurrent Neural Networks   (Cited: 5; self-citations: 1; by others: 4)
朱小燕, 王昱, 徐伟 《计算机学报》(Chinese Journal of Computers), 2001, 24(2): 213-218
In recent years, speech recognition technology based on hidden Markov models (HMM) has developed rapidly. The HMM nevertheless has limitations, and how to overcome the problems caused by its first-order and independence assumptions has long been a topic of discussion; introducing neural networks into speech recognition is one route around these limitations. This paper applies recurrent neural networks to Mandarin speech recognition, modifies the original network model, and proposes a corresponding training method. Experimental results show that the model handles continuous signals well, performing on par with the traditional HMM, and that the new training strategy raises training speed while clearly improving the model's classification performance.

7.
The hidden Markov model (HMM) is widely used in signal processing and pattern recognition. When HMMs are applied to clustering, training is a crucial issue; in particular, on imbalanced data sets, traditional training methods tend to produce empty clusters or clusters with too few samples. For this specific problem, a frequency-sensitive clustering method, PIFS-HMM, is proposed, aiming to improve the effectiveness of model training and yield balanced clustering results. Experimental results confirm the effectiveness of the proposed method.

8.
Adaptive digital predistortion is one of the most promising techniques for overcoming the nonlinear distortion of high-power amplifiers. To improve the efficiency and effect of predistortion, evolutionary computation on a parallel computing platform is introduced, a method of pre-training the neural network with the PSO algorithm is proposed, and the basic software implementation flow of the algorithm is given. On this basis, a three-layer feed-forward network with tapped delay lines, two inputs, and two outputs is adopted; adaptation is achieved using the indirect learning architecture and the back-propagation algorithm, yielding a predistortion technique that compensates both the memory effects and the nonlinear distortion of the amplifier simultaneously. Simulation experiments show that the PSO-pretrained neural network training algorithm performs better than the same algorithm without PSO pretraining.
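A plain global-best PSO over the flattened network weights, of the kind used for pre-training here, might look like the following sketch (the toy network and parameter choices are ours; the paper's parallel implementation and predistortion model are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(2)

def mse(w, X, Y, hidden=6):
    """Toy 1-hidden-layer net; weights unpacked from the flat particle vector."""
    d = X.shape[1]
    W1 = w[:d * hidden].reshape(d, hidden)
    W2 = w[d * hidden:].reshape(hidden, 1)
    return float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))

def pso_pretrain(X, Y, dim, n_particles=20, iters=60,
                 inertia=0.7, c1=1.5, c2=1.5):
    """Global-best PSO over the flattened weights.  The returned best particle
    would seed ordinary BP training; the initial swarm's best fitness is also
    returned so the improvement is visible."""
    pos = rng.standard_normal((n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), np.array([mse(p, X, Y) for p in pos])
    init_f = pbest_f.min()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        g = pbest[pbest_f.argmin()]              # global best so far
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        f = np.array([mse(p, X, Y) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
    return pbest[pbest_f.argmin()], float(pbest_f.min()), float(init_f)
```

Because each particle's fitness evaluation is independent, the population maps naturally onto the parallel platform the abstract mentions.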

9.
赵虎, 杨宇 《计算机应用》(Journal of Computer Applications), 2016, 36(4): 923-926
Given the iterative nature of the error back-propagation (BP) algorithm, a method of implementing BP on an iterative MapReduce framework is presented. The iterative framework adds a transfer module to traditional MapReduce, avoiding the repeated job submissions the traditional framework requires for iterative programs. BP training samples were obtained by simulating the K/TGR146 air-ground radio switch control system, and the BP algorithm was trained in a Hadoop cloud environment on both the traditional and the iterative framework. Experimental results show that BP on the iterative MapReduce framework trains more than 10 times faster than on the traditional framework, with accuracy improved by 10%-13%, effectively solving the problems of overly long training time and repeated job submission in iterative computation.
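The per-iteration work that any MapReduce formulation of BP distributes — mappers computing partial gradients on data shards, a reducer summing them — can be illustrated with a single linear layer (a generic sketch; the paper's transfer module and Hadoop job structure are not modeled):

```python
import numpy as np

def bp_gradient(W, X, Y):
    """Gradient of 0.5 * ||XW - Y||^2 for one linear layer -- a stand-in for
    the per-layer BP gradients shipped between Map and Reduce tasks."""
    return X.T @ (X @ W - Y)

def map_phase(W, shards):
    # Each mapper sees one data shard and emits a partial gradient.
    return [bp_gradient(W, Xs, Ys) for Xs, Ys in shards]

def reduce_phase(partials):
    # The reducer sums partial gradients into the full-batch gradient.
    return np.sum(partials, axis=0)
```

Because the summed shard gradients equal the full-batch gradient exactly, the iterative framework changes only how often jobs are launched, not the numerical result of each update.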

10.
In human-computer interaction, understanding human emotion is one of the skills a computer needs in order to communicate with people, and facial expressions convey emotion best; facial expression recognition is indispensable when designing a human-machine interface for any real-world scenario. In this paper we propose a new real-time facial expression recognition framework for interactive computing environments. The work makes two main contributions to the field. First, it proposes a new network structure and an AdaBoost-based parameter learning algorithm for embedded HMMs. Second, it applies this optimized embedded HMM to real-time facial expression recognition. Here the embedded HMM takes 2-D discrete cosine transform coefficients as observation vectors, unlike previous embedded-HMM approaches that built observation vectors from pixel intensities. Because the algorithm optimizes the embedded HMM's network structure and parameters simultaneously, classification accuracy is greatly improved. The system reduces the complexity of both training and recognition, provides a more flexible framework, and can be used in real-time human-computer interaction applications. Experimental results show that the method is an effective approach to facial expression recognition.

11.
In this paper, we propose a novel optimization algorithm called constrained line search (CLS) for discriminative training (DT) of Gaussian mixture continuous density hidden Markov model (CDHMM) in speech recognition. The CLS method is formulated under a general framework for optimizing any discriminative objective functions including maximum mutual information (MMI), minimum classification error (MCE), minimum phone error (MPE)/minimum word error (MWE), etc. In this method, discriminative training of HMM is first cast as a constrained optimization problem, where Kullback-Leibler divergence (KLD) between models is explicitly imposed as a constraint during optimization. Based upon the idea of line search, we show that a simple formula of HMM parameters can be found by constraining the KLD between HMM of two successive iterations in a quadratic form. The proposed CLS method can be applied to optimize all model parameters in Gaussian mixture CDHMMs, including means, covariances, and mixture weights. We have investigated the proposed CLS approach on several benchmark speech recognition databases, including TIDIGITS, Resource Management (RM), and Switchboard. Experimental results show that the new CLS optimization method consistently outperforms the conventional EBW method in both recognition performance and convergence behavior.
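The "quadratic form" the constraint takes can be made concrete for the Gaussian mean parameters: when two Gaussians share a covariance, their KLD is exactly quadratic in the mean change, so the trust region between successive iterations is an ellipsoid (a standard identity shown here for illustration; the paper's derivation also covers covariances and mixture weights):

```latex
% KLD between Gaussians with a shared covariance is quadratic in the mean
% update, so the per-iteration constraint region is an ellipsoid:
\mathrm{KL}\!\left(\mathcal{N}(\mu', \Sigma)\,\middle\|\,\mathcal{N}(\mu, \Sigma)\right)
  = \tfrac{1}{2}\,(\mu' - \mu)^{\top} \Sigma^{-1} (\mu' - \mu) \;\le\; \epsilon
```

A line search along the discriminative gradient direction inside this ellipsoid then admits a closed-form parameter update.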

12.
A Constrained Line Search Optimization Method for Discriminative Training   (Cited: 1; self-citations: 0; by others: 1)
An optimization method called "constrained line search" is proposed and applied to discriminative training of Gaussian-mixture continuous-density hidden Markov models (CDHMM) in speech recognition. The method can optimize discriminative training objective functions based on the maximum mutual information (MMI) criterion. In this method, discriminative training of the HMM is first cast as a constrained optimization problem, with the Kullback-Leibler divergence (KLD) between models imposed as a constraint during optimization. Then, based on the idea of line search, it is shown that by constraining the KLD between the models of two successive iterations, the HMM parameters can be expressed in a simple quadratic form. The method can optimize any parameter of a Gaussian-mixture CDHMM, including means, covariance matrices, and Gaussian weights. It was applied to two standard speech recognition tasks, one English and one Chinese: the English TIDIGITS database and the Chinese 863 database. Experimental results show that the method achieves consistent gains over the conventional extended Baum-Welch method in both recognition performance and convergence behavior.

13.
Maximum confidence hidden Markov modeling for face recognition   (Cited: 1; self-citations: 0; by others: 1)
This paper presents a hybrid framework of feature extraction and hidden Markov modeling (HMM) for two-dimensional pattern recognition. Importantly, we explore a new discriminative training criterion to assure model compactness and discriminability. This criterion is derived from hypothesis test theory via maximizing the confidence of accepting the hypothesis that observations are from target HMM states rather than competing HMM states. Accordingly, we develop the maximum confidence hidden Markov modeling (MC-HMM) for face recognition. Under this framework, we merge a transformation matrix to extract discriminative facial features. The closed-form solutions to continuous-density HMM parameters are formulated. Attractively, the hybrid MC-HMM parameters are estimated under the same criterion and converged through the expectation-maximization procedure. From the experiments on FERET and GTFD facial databases, we find that the proposed method obtains robust segmentation in the presence of different facial expressions, orientations, etc. In comparison with maximum likelihood and minimum classification error HMMs, the proposed MC-HMM achieves higher recognition accuracies with lower feature dimensions.

14.
We present a novel confidence- and margin-based discriminative training approach for model adaptation of a hidden Markov model (HMM)-based handwriting recognition system to handle different handwriting styles and their variations. Most current approaches are maximum-likelihood (ML) trained HMM systems and try to adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the maximum mutual information (MMI) and minimum phone error (MPE) criteria are used to train writer-independent handwriting models. For model adaptation during decoding, an unsupervised confidence-based discriminative training on a word and frame level within a two-pass decoding process is proposed. The proposed methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IFN/ENIT Arabic handwriting database, where the word error rate is decreased by 33% relative compared to a ML trained baseline system. On the large-vocabulary line recognition task of the IAM English handwriting database, the word error rate is decreased by 25% relative.

15.
This paper proposes an effective segmentation-free approach using a hybrid neural network hidden Markov model (NN-HMM) for offline handwritten Chinese text recognition (HCTR). In the general Bayesian framework, the handwritten Chinese text line is sequentially modeled by HMMs with each representing one character class, while the NN-based classifier is adopted to calculate the posterior probability of all HMM states. The key issues in feature extraction, character modeling, and language modeling are comprehensively investigated to show the effectiveness of NN-HMM framework for offline HCTR. First, a conventional deep neural network (DNN) architecture is studied with a well-designed feature extractor. As for the training procedure, the label refinement using forced alignment and the sequence training can yield significant gains on top of the frame-level cross-entropy criterion. Second, a deep convolutional neural network (DCNN) with automatically learned discriminative features demonstrates its superiority to DNN in the HMM framework. Moreover, to solve the challenging problem of distinguishing quite confusing classes due to the large vocabulary of Chinese characters, NN-based classifier should output 19900 HMM states as the classification units via a high-resolution modeling within each character. On the ICDAR 2013 competition task of CASIA-HWDB database, DNN-HMM yields a promising character error rate (CER) of 5.24% by making a good trade-off between the computational complexity and recognition accuracy. To the best of our knowledge, DCNN-HMM can achieve a best published CER of 3.53%.
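The way an NN classifier plugs into an HMM decoder — converting state posteriors into the scaled likelihoods the decoder needs — is a standard hybrid-NN/HMM identity that can be sketched in a few lines (variable names are illustrative):

```python
import numpy as np

def scaled_log_likelihoods(logits, state_counts):
    """Hybrid NN-HMM trick: p(x|s) is proportional to p(s|x) / p(s), so the
    decoder can use log p(s|x) - log p(s) wherever it needs log p(x|s).
    The common log p(x) term shifts every state equally and cancels in
    Viterbi comparisons.  `logits` are the NN outputs per frame; priors
    p(s) come from state frequencies in the training alignment."""
    log_post = logits - np.logaddexp.reduce(logits, axis=-1, keepdims=True)
    log_prior = np.log(state_counts / state_counts.sum())
    return log_post - log_prior
```

With 19900 HMM states as in the abstract, `logits` would simply have 19900 columns; the identity itself does not change with vocabulary size.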

16.
Large margin hidden Markov models for speech recognition   (Cited: 1; self-citations: 0; by others: 1)
In this paper, motivated by large margin classifiers in machine learning, we propose a novel method to estimate continuous-density hidden Markov model (CDHMM) for speech recognition according to the principle of maximizing the minimum multiclass separation margin. The approach is named large margin HMM. First, we show that this type of large margin HMM estimation problem can be formulated as a constrained minimax optimization problem. Second, we propose to solve this constrained minimax optimization problem by using a penalized gradient descent algorithm, where the original objective function, i.e., minimum margin, is approximated by a differentiable function and the constraints are cast as penalty terms in the objective function. The new training method is evaluated in the speaker-independent isolated E-set recognition and the TIDIGITS connected digit string recognition tasks. Experimental results clearly show that the large margin HMMs consistently outperform the conventional HMM training methods. It has been consistently observed that the large margin training method yields significant recognition error rate reduction even on top of some popular discriminative training methods.
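The optimization recipe — approximate the minimum margin by a differentiable function and fold the constraints into the objective as penalty terms — can be sketched on a toy linear two-class scorer (illustrative only; the actual method operates on CDHMM parameters):

```python
import numpy as np

def softmin(v, beta=10.0):
    # Differentiable stand-in for min(v), smoothing the minimum margin.
    return -np.log(np.sum(np.exp(-beta * v))) / beta

def train_large_margin(X, y, steps=300, lr=0.1, lam=5.0, r=3.0):
    """Penalized gradient descent on a toy linear scorer: maximize the
    (smoothed) minimum margin, with the constraint ||w|| <= r folded into
    the objective as a quadratic penalty term.  Gradients are taken
    numerically to keep the sketch short."""
    w = np.zeros(X.shape[1])
    eps = 1e-4
    def objective(w):
        margins = y * (X @ w)                           # per-sample margins
        penalty = max(0.0, np.linalg.norm(w) - r) ** 2  # constraint as penalty
        return -softmin(margins) + lam * penalty
    for _ in range(steps):
        grad = np.array([(objective(w + eps * e) - objective(w - eps * e))
                         / (2 * eps) for e in np.eye(len(w))])
        w -= lr * grad
    return w
```

The penalty keeps the weight norm bounded, which plays the role of the constraints in the paper's minimax formulation: without it, scaling `w` up would inflate every margin for free.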

17.
Multiple-cluster schemes, such as cluster adaptive training (CAT) or eigenvoice systems, are a popular approach for rapid speaker and environment adaptation. Interpolation weights are used to transform a multiple-cluster, canonical, model to a standard hidden Markov model (HMM) set representative of an individual speaker or acoustic environment. Maximum likelihood training for CAT has previously been investigated. However, in state-of-the-art large vocabulary continuous speech recognition systems, discriminative training is commonly employed. This paper investigates applying discriminative training to multiple-cluster systems. In particular, minimum phone error (MPE) update formulae for CAT systems are derived. In order to use MPE in this case, modifications to the standard MPE smoothing function and the prior distribution associated with MPE training are required. A more complex adaptive training scheme combining both interpolation weights and linear transforms, a structured transform (ST), is also discussed within the MPE training framework. Discriminatively trained CAT and ST systems were evaluated on a state-of-the-art conversational telephone speech task. These multiple-cluster systems were found to outperform both standard and adaptively trained systems.

18.
In recent years, the use of Multi-Layer Perceptron (MLP) derived acoustic features has become increasingly popular in automatic speech recognition systems. These features are typically used in combination with standard short-term spectral-based features, and have been found to yield consistent performance improvements. However there are a number of design decisions and issues associated with the use of MLP features for state-of-the-art speech recognition systems. Two modifications to the standard training/adaptation procedures are described in this work. First, the paper examines how MLP features, and the associated acoustic models, can be trained efficiently on large training corpora using discriminative training techniques. An approach that combines multiple individual MLPs is proposed, and this reduces the time needed to train MLPs on large amounts of data. In addition, to further speed up discriminative training, a lattice re-use method is proposed. The paper also examines how systems with MLP features can be adapted to particular speakers or acoustic environments. In contrast to previous work (where standard HMM adaptation schemes are used), linear input network adaptation is investigated. System performance is investigated within a multi-pass adaptation/combination framework. This allows the performance gains of individual techniques to be evaluated at various stages, as well as the impact in combination with other sub-systems. All the approaches considered in this paper are evaluated on an Arabic large vocabulary speech recognition task which includes both Broadcast News and Broadcast Conversation test data.

19.
20.
We propose a new framework and the associated maximum-likelihood and discriminative training algorithms for the variable-parameter hidden Markov model (VPHMM) whose mean and variance parameters vary as functions of additional environment-dependent conditioning parameters. Our framework differs from the VPHMM proposed by Cui and Gong (2007) in that piecewise spline interpolation instead of global polynomial regression is used to represent the dependency of the HMM parameters on the conditioning parameters, and a more effective functional form is used to model the variances. Our framework unifies and extends the conventional discrete VPHMM. It no longer requires quantization in estimating the model parameters and can support both parameter sharing and instantaneous conditioning parameters naturally. We investigate the strengths and weaknesses of the model on the Aurora-3 corpus. We show that under the well-matched condition the proposed discriminatively trained VPHMM outperforms the conventional HMM trained in the same way, with relative word error rate (WER) reductions of 19% and 15%, respectively, when only the means are updated and when both means and variances are updated.
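The conditioning idea can be sketched in a few lines: each Gaussian mean becomes a function of an environment parameter (say, SNR) interpolated between knots. `np.interp` below is a piecewise-linear stand-in for the paper's piecewise splines; the names and knot values are illustrative:

```python
import numpy as np

def variable_mean(snr, knots, knot_means):
    """VPHMM-style conditioning: the Gaussian mean is a function of an
    environment parameter (here SNR), interpolated between knot values.
    np.interp is piecewise-LINEAR; the paper uses piecewise splines, but
    the point is the same -- no quantization of the conditioning parameter,
    and natural parameter sharing at the knots.
    knots: (K,) increasing; knot_means: (D, K), one row per mean dimension."""
    return np.array([np.interp(snr, knots, m) for m in knot_means])
```

Replacing `np.interp` with a cubic spline per dimension (e.g. `scipy.interpolate.CubicSpline`) would match the paper's functional form more closely while keeping the same interface.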
