Similar literature
 20 similar documents found (search time: 625 ms)
1.
In this paper, we propose a novel optimization algorithm called constrained line search (CLS) for discriminative training (DT) of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The CLS method is formulated under a general framework for optimizing any discriminative objective function, including maximum mutual information (MMI), minimum classification error (MCE), minimum phone error (MPE)/minimum word error (MWE), etc. In this method, discriminative training of HMMs is first cast as a constrained optimization problem, where the Kullback-Leibler divergence (KLD) between models is explicitly imposed as a constraint during optimization. Based upon the idea of line search, we show that a simple formula for the HMM parameters can be found by constraining the KLD between the HMMs of two successive iterations in a quadratic form. The proposed CLS method can be applied to optimize all model parameters in Gaussian mixture CDHMMs, including means, covariances, and mixture weights. We have investigated the proposed CLS approach on several benchmark speech recognition databases, including TIDIGITS, Resource Management (RM), and Switchboard. Experimental results show that the new CLS optimization method consistently outperforms the conventional EBW method in both recognition performance and convergence behavior.
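The role of the quadratic KLD constraint can be illustrated with a small sketch. It is not the paper's update formula: it assumes a single diagonal-covariance Gaussian whose variances stay fixed, so that the KLD between the old and new model reduces to a quadratic form in the mean shift, and it takes the search direction as given. The function names and the trust-region radius `rho` are hypothetical.

```python
import math

def kl_same_variance(mu_new, mu_old, var):
    """KL divergence between two diagonal Gaussians that share the variances `var`.

    With equal covariances the KLD is the quadratic form
    0.5 * sum((mu_new_i - mu_old_i)^2 / var_i).
    """
    return 0.5 * sum((a - b) ** 2 / v for a, b, v in zip(mu_new, mu_old, var))

def constrained_mean_step(mu_old, var, direction, rho):
    """Move the mean along `direction` until the KLD constraint KLD <= rho^2 is tight.

    Assumption for illustration: `direction` is an ascent direction of the
    discriminative objective, so the best point on the line lies on the
    boundary of the constraint region.
    """
    scale = sum(d * d / v for d, v in zip(direction, var))
    if scale == 0.0:
        return list(mu_old)  # zero direction: nothing to update
    step = rho * math.sqrt(2.0 / scale)  # solves 0.5 * step^2 * scale = rho^2
    return [m + step * d for m, d in zip(mu_old, direction)]
```

Because the constraint is quadratic in the step length, the largest admissible step has a closed form and no iterative search is needed in this simplified setting.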

2.
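A constrained line search optimization method for discriminative training   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper proposes an optimization method called "constrained line search" and applies it to the discriminative training of Gaussian mixture continuous density hidden Markov models (CDHMMs) in speech recognition. The method can be used to optimize discriminative training objective functions based on the maximum mutual information (MMI) criterion. In this method, discriminative training of the hidden Markov model (HMM) is first treated as a constrained optimization problem, with the KL divergence between models used as a constraint during optimization. Then, based on the idea of line search, it is shown that by constraining the KL divergence between the models before and after the update, the HMM parameters can be expressed in a simple quadratic form. The method can be used to optimize any parameter of a Gaussian mixture CDHMM, including means, covariance matrices, and Gaussian weights. The method is applied to two standard speech recognition tasks, one in English and one in Chinese: the English TIDIGITS database and the Chinese 863 database. Experimental results show that, compared with the conventional extended Baum-Welch method, the method yields consistent improvements in both recognition performance and convergence behavior.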

3.
Large-margin discriminative training of hidden Markov models has received significant attention recently. A natural and interesting question is whether the existing discriminative training algorithms can be extended directly to embed the concept of margin. In this paper, we give this question an affirmative answer by showing that the sigmoid bias in the conventional minimum classification error (MCE) training can be interpreted as a soft margin. We justify this claim from a theoretical classification risk minimization perspective where the loss function associated with a non-zero sigmoid bias is shown to include not only empirical error rates but also a margin-bound risk. Based on this perspective, we propose a practical optimization strategy that adjusts the margin (sigmoid bias) incrementally in the MCE training process so that a desirable balance between the empirical error rates on the training set and the margin can be achieved. We call this modified MCE training process large-margin minimum classification error (LM-MCE) training to differentiate it from the conventional MCE. Speech recognition experiments have been carried out on two tasks. First, in the TIDIGITS recognition task, LM-MCE outperforms the state-of-the-art MCE method with 17% relative digit-error reduction and 19% relative string-error reduction. Second, on the Microsoft internal large vocabulary telephony speech recognition task (with 2000 h of training data and 120 K words in the vocabulary), significant recognition accuracy improvement is achieved, demonstrating that our formulation of LM-MCE can be successfully scaled up and applied to large-scale speech recognition tasks.
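How a sigmoid bias can act as a soft margin is easy to see numerically. The sketch below is illustrative only: it assumes one common parameterization of the MCE loss, a sigmoid of the misclassification measure `d` (negative when the token is correctly classified), and writes the bias as a shift `margin` inside the sigmoid. The sign convention and the parameter names are assumptions, not taken from the paper.

```python
import math

def mce_loss(d, alpha=1.0, margin=0.0):
    """Sigmoid MCE loss of a misclassification measure d.

    d < 0 means the token is correctly classified. With margin = 0 this is
    the plain sigmoid loss; a positive margin shifts the sigmoid so that
    correctly classified tokens with d > -margin still incur a sizeable loss.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (d + margin)))

# A token that is correct by a small amount (d = -0.5):
plain = mce_loss(-0.5, alpha=4.0)                 # low loss without a margin
shifted = mce_loss(-0.5, alpha=4.0, margin=1.0)   # high loss: inside the margin
```

Under this parameterization, increasing `margin` step by step during training corresponds to the incremental margin adjustment the abstract describes, although the schedule itself is not specified there.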

4.
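An improved particle swarm algorithm for a class of nonlinear minimax problems   Cited by: 1 (self-citations: 0, citations by others: 1)
张建科, 李立峰, 周畅. 《计算机应用》 (Journal of Computer Applications), 2008, 28(5): 1194-1196
The non-smooth objective function of a class of nonlinear minimax problems makes them difficult to solve. To address this, a new and effective algorithm is given that combines an improved particle swarm algorithm with the maximum entropy function method. First, the maximum entropy function is used to convert unconstrained and constrained minimax problems into an unconstrained optimization problem with a smooth objective, and this smooth function serves as the fitness function of the particle swarm algorithm. Then a new particle position update formula is derived using the mathematical technique of extrapolation, and the improved particle swarm algorithm is applied to optimize the problem. Numerical results show that the algorithm converges quickly and is numerically stable, making it an effective method for solving nonlinear minimax problems.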
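The smoothing step can be sketched in a few lines. The maximum entropy function replaces the non-smooth max of m component values f_1, ..., f_m with F_p = (1/p) ln(sum_i exp(p f_i)), which is smooth and satisfies max_i f_i <= F_p <= max_i f_i + ln(m)/p. The function name and the choice of `p` below are illustrative, and the particle swarm part of the method is not reproduced.

```python
import math

def max_entropy_smooth(values, p):
    """Smooth approximation of max(values): (1/p) * ln(sum(exp(p * v))).

    The maximum is subtracted before exponentiating so that large p does
    not overflow; the result is mathematically unchanged.
    """
    m = max(values)
    return m + math.log(sum(math.exp(p * (v - m)) for v in values)) / p

# Component function values f_i(x) at some fixed point x (made-up numbers).
f_values = [1.0, 2.0, 3.0]
approx = max_entropy_smooth(f_values, p=50.0)  # slightly above max(f_values)
```

A larger `p` tightens the approximation at the cost of a more sharply curved surrogate, which is the usual trade-off when this function is used as a fitness function.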

5.
In this paper we examine a technique by which fault tolerance can be embedded into a feedforward network leading to a network tolerant to the loss of a node and its associated weights. The fault tolerance problem for a feedforward network is formulated as a constrained minimax optimization problem. Two different methods are used to solve it. In the first method, the constrained minimax optimization problem is converted to a sequence of unconstrained least-squares optimization problems, whose solutions converge to the solution of the original minimax problem. An efficient gradient-based minimization technique, specially tailored for nonlinear least-squares optimization, is then applied to perform the unconstrained minimization at each step of the sequence. Several modifications are made to the basic algorithm to improve its speed of convergence. In the second method a different approach is used to convert the problem to a single unconstrained minimization problem whose solution very nearly equals that of the original minimax problem. Networks synthesized using these methods, though not always fault tolerant, exhibit an acceptable degree of partial fault tolerance.

6.
Hidden Markov models (HMMs) are stochastic models capable of statistical learning and classification. They have been applied in speech recognition and handwriting recognition because of their great adaptability and versatility in handling sequential signals. On the other hand, as these models have a complex structure and also because the involved data sets usually contain uncertainty, it is difficult to analyze the multiple observation training problem without certain assumptions. For many years researchers have used the training equations of Levinson (1983) in speech and handwriting applications, simply assuming that all observations are independent of each other. This paper presents a formal treatment of HMM multiple observation training without imposing the above assumption. In this treatment, the multiple observation probability is expressed as a combination of individual observation probabilities without losing generality. This combinatorial method gives one more freedom in making different dependence-independence assumptions. By generalizing Baum's auxiliary function into this framework and building up an associated objective function using the Lagrange multiplier method, it is proven that the derived training equations guarantee the maximization of the objective function. Furthermore, we show that Levinson's training equations can be easily derived as a special case in this treatment.

7.
Feature extraction is an important component of pattern classification and speech recognition. Extracted features should discriminate classes from each other while being robust to environmental conditions such as noise. For this purpose, several feature transformations are proposed which can be divided into two main categories: data-dependent transformation and classifier-dependent transformation. The drawback of data-dependent transformation is that its optimization criteria are different from the measure of classification error which can potentially degrade the classifier’s performance. In this paper, we propose a framework to optimize data-dependent feature transformations such as PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis) and HLDA (Heteroscedastic LDA) using minimum classification error (MCE) as the main objective. The classifier itself is based on Hidden Markov Model (HMM). In our proposed HMM minimum classification error technique, the transformation matrices are modified to minimize the classification error for the mapped features, and the dimension of the feature vector is not changed. To evaluate the proposed methods, we conducted several experiments on the TIMIT phone recognition and the Aurora2 isolated word recognition tasks. The experimental results show that the proposed methods improve performance of PCA, LDA and HLDA transformation for mapping Mel-frequency cepstral coefficients (MFCC).

8.
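Conventional training algorithms for hidden Markov models tend to converge to local optima. To address this, a particle swarm algorithm with extremum disturbance and adaptively adjusted inertia weight and acceleration coefficients is proposed. The improved particle swarm optimization algorithm is introduced into hidden Markov model training to optimize both the number of states and the parameters of the model. Experiments on handwritten digit recognition show that, compared with the conventional Baum-Welch training algorithm, the proposed training algorithm based on improved particle swarm optimization can effectively escape local optima, so that the trained hidden Markov model achieves higher recognition ability.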

9.
Inspired by the great success of margin-based classifiers, there is a trend to incorporate the margin concept into hidden Markov modeling for speech recognition. Several attempts based on margin maximization were proposed recently. In this paper, a new discriminative learning framework, called soft margin estimation (SME), is proposed for estimating the parameters of continuous-density hidden Markov models. The proposed method makes direct use of the successful ideas of soft margin in support vector machines to improve generalization capability and decision feedback learning in minimum classification error training to enhance model separation in classifier design. SME is illustrated from a perspective of statistical learning theory. By including a margin in formulating the SME objective function, SME is capable of directly minimizing an approximate test risk bound. Frame selection, utterance selection, and discriminative separation are unified into a single objective function that can be optimized using the generalized probabilistic descent algorithm. Tested on the TIDIGITS connected digit recognition task, the proposed SME approach achieves a string accuracy of 99.43%. On the 5 k-word Wall Street Journal task, SME obtains relative word error rate reductions of about 10% over our best baseline results in different experimental configurations. We believe this is the first attempt to show the effectiveness of margin-based acoustic modeling for large vocabulary continuous speech recognition in a hidden Markov model framework. Further improvements are expected because the approximate test risk bound minimization principle offers a flexible and rigorous framework to facilitate incorporation of new margin-based optimization criteria into hidden Markov model training.

10.
In this paper, we propose to use a new optimization method, i.e., semidefinite programming (SDP), to solve the large-margin estimation (LME) problem of continuous-density hidden Markov models (CDHMMs) in speech recognition. First, we introduce a new constraint for LME to guarantee the boundedness of the margin of the CDHMM. Second, we show that the LME problem subject to this new constraint can be formulated as an SDP problem under some relaxation conditions. Therefore, it can be solved using many efficient optimization algorithms specially designed for SDP. The new LME/SDP method has been evaluated on a speaker-independent E-set speech recognition task using the ISOLET database and a connected digit string recognition task using the TIDIGITS database. Experimental results clearly demonstrate that large-margin estimation via semidefinite programming (LME/SDP) can significantly reduce word error rate (WER) over other existing CDHMM training methods, such as MLE and MCE. It has also been shown that the new SDP-based method largely outperforms the previously proposed LME optimization methods using gradient descent search.

11.
Optical character recognition for cursive handwriting   Cited by: 5 (self-citations: 0, citations by others: 5)
A new analytic scheme, which uses a sequence of image segmentation and recognition algorithms, is proposed for the off-line cursive handwriting recognition problem. First, some global parameters, such as slant angle, baselines, stroke width and height, are estimated. Second, a segmentation method finds character segmentation paths by combining gray-scale and binary information. Third, a hidden Markov model (HMM) is employed for shape recognition to label and rank the character candidates. For this purpose, a string of codes is extracted from each segment to represent the character candidates. The estimation of feature space parameters is embedded in the HMM training stage together with the estimation of the HMM model parameters. Finally, information from a lexicon and from the HMM ranks is combined in a graph optimization problem for word-level recognition. This method corrects most of the errors produced by the segmentation and HMM ranking stages by maximizing an information measure in an efficient graph search algorithm. The experiments indicate higher recognition rates compared to the available methods reported in the literature.

12.
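A maximum entropy social cognitive algorithm for a class of nonlinear minimax problems   Cited by: 2 (self-citations: 1, citations by others: 1)
The non-smooth objective function of a class of nonlinear minimax problems makes them difficult to solve. To address this, a new and effective algorithm is given that combines the social cognitive algorithm with the maximum entropy function method. First, the maximum entropy function is used to convert the original problem into a smooth unconstrained optimization problem, which is then solved with the social cognitive algorithm. The algorithm is based on social cognitive theory and optimizes the objective by using a set of learning agents to simulate human sociality and intelligence. Numerical results show that the algorithm converges quickly and is numerically stable, making it an effective method for solving nonlinear minimax problems.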

13.
In this paper, we propose a novel hybrid global optimization method to solve constrained optimization problems. An exact penalty function is first applied to approximate the original constrained optimization problem by a sequence of optimization problems with bound constraints. To solve each of these box constrained optimization problems, two hybrid methods are introduced, where two different strategies are used to combine limited memory BFGS (L-BFGS) with Greedy Diffusion Search (GDS). The convergence of the two hybrid methods is addressed. To evaluate the effectiveness of the proposed algorithm, 18 box constrained and 4 general constrained problems from the literature are tested. The numerical results show that the proposed hybrid algorithm obtains more accurate solutions than the methods it is compared with.

14.
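In human activity recognition based on acceleration signals, LDA is one of the more commonly used feature dimensionality reduction methods. However, LDA does not use training error directly as its objective function, so it cannot guarantee a projection space with minimum training error. To address this, GA-optimized LDA is used for feature selection. Features are extracted from the acceleration signals, PCA is used to resolve the "small sample size problem", and a GA adjusts the eigenvalue vector of the between-class scatter matrix in LDA so that the resulting projection space has minimum training error. An SVM is used to classify seven daily activities. Experimental results show that, compared with PCA alone and with PCA+LDA, the GA-optimized LDA algorithm effectively reduces the feature dimensionality and the classification error while maintaining a high recognition rate; the recognition rate on the final test samples reaches 95.96%.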

15.
The Prediction Error Method (PEM) is related to an optimization problem built on input/output data collected from the system to be identified. It is often hard to find the global solution of this optimization problem because the corresponding objective function presents local minima and/or the search space is constrained to a nonconvex set. The shape of the cost function, and hence the difficulty in solving the optimization problem, depends directly on the experimental conditions, more specifically on the spectrum of the input/output data collected from the system. Therefore, it seems plausible to improve the convergence to the global minimum by properly choosing the spectrum of the input; in this paper, we address this problem. We present a condition for convergence to the global minimum of the cost function and propose its inclusion in the input design. We present the application of the proposed approach to case studies where the algorithms tend to get trapped in nonglobal minima.

16.
Recent face recognition algorithms can achieve high accuracy when the tested face samples are frontal. However, when the face pose changes substantially, the performance of existing methods drops drastically. Efforts on pose-robust face recognition are highly desirable, especially when each face class has only one frontal training sample. In this study, we propose a 2D face fitting-assisted 3D face reconstruction algorithm that aims at recognizing faces of different poses when each face class has only one frontal training sample. For each frontal training sample, a 3D face is reconstructed by optimizing the parameters of a 3D morphable model (3DMM). By rotating the reconstructed 3D face to different views, virtual face images in different poses are generated to enlarge the training set for face recognition. Unlike conventional 3D face reconstruction methods, the proposed algorithm uses automatic 2D face fitting to assist 3D face reconstruction. We automatically locate 88 sparse points of the frontal face with a 2D face-fitting algorithm. This 2D face-fitting algorithm, called Random Forest Embedded Active Shape Model, embeds random forest learning into the framework of the Active Shape Model. Results of 2D face fitting are added to the 3D face reconstruction objective function as shape constraints. The optimization objective energy function takes into account not only image intensity but also the 2D fitting results. Shape and texture parameters of the 3DMM are thus estimated by fitting the 3DMM to the 2D frontal face sample, which is a non-linear optimization problem. We evaluate the proposed method on the publicly available CMUPIE database, which includes faces viewed from 11 different poses, and the results show that the proposed method is effective and that the face recognition results under pose variation are promising.

17.
We are addressing the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm, and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for the application of noisy speech recognition. We extend the concept to HMM training wherein some patterns may be noisy or distorted. Utilizing the concept of “virtual pattern” developed for joint evaluation, we propose selective iterative training of HMMs. Evaluating these algorithms for burst/transient noisy speech and isolated word recognition, significant improvement in recognition accuracy is obtained using the new algorithms over those which do not utilize the joint evaluation strategy.

18.
A practical penalty function approach to solving constrained minimax problems is applied here. In essence, this approach reformulates the constrained minimax problem as an unconstrained minimax problem. A recently proposed optimization algorithm called grazor search is used to solve the reformulated unconstrained minimax problem. The proposed approach can handle inequality constraints, parameter constraints in particular. A practical transmission-line filter example with parameter constraints illustrates the results.

19.
The aim of this paper is to design an efficient multigrid method for constrained convex optimization problems arising from discretization of some underlying infinite dimensional problems. Due to problem dependency of this approach, we only consider bound constraints with (possibly) a single equality constraint. As our aim is to target large-scale problems, we want to avoid computation of second derivatives of the objective function, thus excluding Newton-like methods. We propose a smoothing operator that only uses first-order information and study the computational efficiency of the resulting method.

20.
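The penalty function method is an important way of converting a constrained optimization problem into an unconstrained one. For general constrained optimization problems, an improved exact penalty function is given by introducing a new parameter, together with a proof of the exact penalty theorem for this penalty function, and an algorithm for solving it is proposed. Experiments show that the algorithm is effective.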
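The basic idea of an exact penalty can be sketched on a one-dimensional toy problem. The code below uses the classical L1 exact penalty, not the improved penalty function with the extra parameter that the abstract describes, whose form is not given here. The helper names, the toy problem (minimize x^2 subject to x >= 1) and the ternary-search solver are all illustrative assumptions.

```python
def l1_penalty(f, constraints, sigma):
    """Build the penalized objective f(x) + sigma * sum(max(0, g_i(x))).

    Each constraint g_i is written in the form g_i(x) <= 0.
    """
    def penalized(x):
        return f(x) + sigma * sum(max(0.0, g(x)) for g in constraints)
    return penalized

def ternary_search(func, lo, hi, iters=200):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if func(a) < func(b):
            hi = b  # the minimizer cannot lie in (b, hi]
        else:
            lo = a  # the minimizer cannot lie in [lo, a)
    return 0.5 * (lo + hi)

# Toy problem: minimize x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0.
# The constrained minimizer is x = 1 with multiplier 2, so any sigma > 2
# makes the unconstrained minimizer of the penalized function coincide with it.
penalized = l1_penalty(lambda x: x * x, [lambda x: 1.0 - x], sigma=3.0)
x_star = ternary_search(penalized, -5.0, 5.0)
```

The penalty is called exact because a finite `sigma` already recovers the constrained solution, in contrast to quadratic penalties, which only approach it as the penalty parameter grows without bound.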
