1.
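Speech Recognition Based on GDTW+SVM    Total citations: 3, self-citations: 0, other citations: 3
To handle speech feature parameters whose dimensionality varies after feature extraction, this paper proposes a speech recognition method based on an SVM with a GDTW kernel. The method first extracts features from the speech signal, maps the feature vectors into a high-dimensional feature space through the GDTW kernel, and then applies support vector machine classification in that space for recognition. Experiments show that, compared with the DTW algorithm and neural network methods, the approach is feasible and significantly improves the recognition rate for speech signals.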
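As a rough illustration of how a DTW-based kernel lets an SVM compare utterances of different lengths, here is a minimal sketch. It assumes the common Gaussian DTW form exp(-gamma * DTW(a, b)); the abstract does not give the exact kernel, and the names `dtw_distance`, `gdtw_kernel`, and `gamma` are hypothetical. Kernels of this form are not guaranteed to be positive semidefinite, so this is an illustration rather than the paper's implementation.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def gdtw_kernel(a, b, gamma=0.1):
    """Assumed Gaussian DTW kernel: similarity decays with the DTW distance."""
    return math.exp(-gamma * dtw_distance(a, b))

# Two sequences of different lengths that align perfectly under warping.
short = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(gdtw_kernel(short, stretched))
```

A Gram matrix built from `gdtw_kernel` over all training utterances could then be passed to any SVM implementation that accepts a precomputed kernel.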
2.
3.
4.
5.
6.
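A VoIP speaker verification algorithm based on a support vector machine with a generalized linear discriminant kernel is presented. Recognition parameters are extracted directly from the bitstreams of speech compressed with G.729, G.723.1 (6.3 kb/s), and G.723.1 (5.3 kb/s), and verification is performed with the generalized linear discriminant kernel SVM. Experimental results show that the method performs VoIP speaker verification effectively.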
7.
8.
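An Improved Fuzzy C-Means Clustering Algorithm Applied to Speaker Recognition    Total citations: 3, self-citations: 0, other citations: 3
A speaker recognition method combining an improved FCM clustering algorithm with vector quantization is proposed. The feature vector set is first extracted from the speech signal, a codebook is then designed by vector quantization, and finally the improved algorithm is used to identify the test speech. The algorithm addresses FCM's sensitivity to initial values and its tendency to become trapped in local optima. It uses relatively few feature parameters and is computationally simple, yet achieves a high recognition rate and good robustness.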
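For orientation, the following is a minimal sketch of the standard (unimproved) fuzzy C-means iteration that the abstract takes as its starting point. The paper's modifications are not described in enough detail to reproduce, so none are included; the function name `fuzzy_c_means` and its parameters are hypothetical. Initial centers are passed in explicitly because, as the abstract notes, plain FCM is sensitive to initialization.

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def fuzzy_c_means(points, init_centers, m=2.0, iters=50):
    """Standard FCM: alternate membership and center updates for `iters` rounds."""
    centers = [list(c) for c in init_centers]
    c, dim = len(centers), len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership update: u[i][k] = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))
        memberships = []
        for p in points:
            d = [max(euclidean(p, ctr), 1e-12) for ctr in centers]  # avoid division by zero
            memberships.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for k in range(c)
            ])
        # Center update: mean of the points weighted by u^m
        for k in range(c):
            w = [memberships[i][k] ** m for i in range(len(points))]
            total = sum(w)
            centers[k] = [
                sum(w[i] * points[i][t] for i in range(len(points))) / total
                for t in range(dim)
            ]
    return centers, memberships

pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, u = fuzzy_c_means(pts, init_centers=[(0.0, 0.0), (5.0, 5.0)])
print(centers)
```

In a VQ-based speaker recognizer the resulting centers would play the role of codebook entries, one codebook per speaker.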
9.
10.
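In VoIP speaker recognition, system performance degrades when speaker models trained on original speech (not processed by a codec) are used to recognize test speech that has passed through a speech codec. This paper presents a compensation algorithm for VoIP speaker features (12th-order LPCC coefficients) based on statistical matching and the EM (expectation-maximization) algorithm. The function parameters are estimated under the assumption that the distorted and undistorted features are related by either a nonlinear (quadratic) or a linear function, and the resulting compensation function is applied to the distorted features. Experimental results show that the algorithm substantially reduces the performance degradation caused by the G.729 8 kb/s, G.723.1 6.3 kb/s, and G.723.1 5.3 kb/s codecs widely used in VoIP, and that it also outperforms the CMS (cepstral mean subtraction) method.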
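The CMS baseline mentioned in this abstract is simple enough to sketch. The example below subtracts the per-utterance mean from every cepstral frame, which cancels any offset that is constant across the utterance. It does not reproduce the paper's EM-based compensation function; the name `cepstral_mean_subtraction` and the toy two-coefficient frames are assumptions for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the utterance-level mean from each cepstral coefficient."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[t] for f in frames) / n for t in range(dim)]
    return [[f[t] - mean[t] for t in range(dim)] for f in frames]

# Toy cepstral frames (two coefficients each) and a copy shifted by a
# constant offset, standing in for a fixed channel or codec bias.
clean = [[1.0, 2.0], [3.0, 6.0]]
shifted = [[c + 0.5 for c in frame] for frame in clean]

print(cepstral_mean_subtraction(clean))
print(cepstral_mean_subtraction(shifted))
```

Both calls yield the same normalized frames, which shows why CMS helps with constant offsets but cannot undo the nonlinear distortion that the abstract's quadratic compensation function targets.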
11.
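To address the drawback that the classical Mean-Shift tracking algorithm needs many iterations to converge, an efficient Mean-Shift tracking algorithm is proposed. In tracking systems that use color space as the target feature, the target can usually be characterized by color features that distinguish it from the background, and the distribution of these color features corresponds to the weights of the shift vector. By analyzing how different weights in the tracking algorithm affect convergence speed, the weighting coefficients are weighted a second time, so that the improved algorithm locates the target accurately in just two iterations: one coarse localization and one fine localization. Experimental results show that the algorithm preserves the accuracy of the classical algorithm while greatly accelerating convergence toward the target.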
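The classical iteration that this abstract sets out to accelerate can be sketched as follows: the window center moves repeatedly to the weighted centroid of the pixels inside the window. In color-histogram trackers the per-pixel weight is typically derived from the ratio of target to candidate histogram values; here the weights are simply supplied as a list. The paper's secondary weighting is not specified in the abstract and is not reproduced, and the names `mean_shift_step` and `track` are hypothetical.

```python
import math

def mean_shift_step(center, pixels, weights, radius):
    """One classical update: weighted centroid of the pixels within `radius`."""
    sx = sy = sw = 0.0
    for (x, y), w in zip(pixels, weights):
        if (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2:
            sx += w * x
            sy += w * y
            sw += w
    if sw == 0.0:
        return center  # no weighted support in the window; stay put
    return (sx / sw, sy / sw)

def track(center, pixels, weights, radius, tol=1e-3, max_iter=20):
    """Iterate until the shift falls below `tol`; return the center and iteration count."""
    for it in range(1, max_iter + 1):
        new_center = mean_shift_step(center, pixels, weights, radius)
        shift = math.hypot(new_center[0] - center[0], new_center[1] - center[1])
        center = new_center
        if shift < tol:
            return center, it
    return center, max_iter

# Toy 11x11 image: a 3x3 "target" centered at (7, 7) has weight 1, the rest 0.
pixels = [(x, y) for x in range(11) for y in range(11)]
weights = [1.0 if abs(x - 7) <= 1 and abs(y - 7) <= 1 else 0.0 for x, y in pixels]
final_center, n_iter = track((5.0, 5.0), pixels, weights, radius=3.0)
print(final_center, n_iter)
```

The number of passes this loop needs is the cost the abstract aims to reduce to one coarse and one fine step.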
12.
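In support vector clustering, when samples are unevenly distributed, a single-width Gaussian kernel limits the generalization performance of the support vector machine and degrades clustering. To address this, a support vector clustering algorithm based on a weighted multi-width Gaussian kernel function is proposed. The weighted multi-width Gaussian kernel has more tunable parameters than the single-width Gaussian kernel; tuning them can improve generalization and clustering quality. Simulation experiments show that, compared with the single-width Gaussian kernel, the weighted multi-width Gaussian kernel clusters effectively, which demonstrates the effectiveness of the algorithm.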
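The abstract does not state the kernel's formula. One plausible reading, sketched below purely as an assumption, is a non-negative weighted sum of Gaussian kernels with different widths, K(x, y) = sum_i w_i * exp(-||x - y||^2 / (2 * sigma_i^2)). A non-negative combination of valid kernels is itself a valid kernel, which makes this form a natural candidate. The names `multi_width_gaussian`, `weights`, and `sigmas` are hypothetical.

```python
import math

def multi_width_gaussian(x, y, weights, sigmas):
    """Assumed form: weighted sum of Gaussian kernels, one width per component."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(w * math.exp(-sq_dist / (2.0 * s ** 2))
               for w, s in zip(weights, sigmas))

weights = (0.5, 0.5)   # non-negative mixing weights (tunable)
sigmas = (0.5, 2.0)    # a narrow and a wide component (tunable)

x = (0.0, 0.0)
near = (0.3, 0.0)
far = (3.0, 0.0)
print(multi_width_gaussian(x, near, weights, sigmas))
print(multi_width_gaussian(x, far, weights, sigmas))
```

The narrow component resolves dense regions while the wide one keeps sparse regions connected, which is the intuition behind using several widths on unevenly distributed samples.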
13.
Zhang Y., Desilva C.J.S., Togneri A., Alder M., Attikiouzel Y. Vision, Image and Signal Processing, IEE Proceedings, 1994, 141(3): 197-202
A multi-HMM speaker-independent isolated word recognition system is described. In this system, three vector quantisation methods, the LBG algorithm, the EM algorithm, and a new MGC algorithm, are used for the classification of the speech space. These quantisations of the speech space are then used to produce three HMMs for each word in the vocabulary. In the recognition step, the Viterbi algorithm is used in the three subrecognisers. The log probabilities of the observation sequences matching the models are multiplied by the weights determined by the recognition accuracies of individual subrecognisers and summed to give the log probability that the utterance is of a particular word in the vocabulary. This multi-HMM system results in a reduction of about 50% in the error rate in comparison with the single model system.
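The score-combination step described in this abstract can be sketched as a weighted sum of per-word log probabilities across the three subrecognisers. The Viterbi scoring itself is omitted; the log probabilities and weights below are made-up numbers, and `fuse_scores` is a hypothetical name.

```python
def fuse_scores(log_probs, weights):
    """Combine per-word log probabilities from several subrecognisers.

    log_probs: one dict per subrecogniser, mapping word -> Viterbi log probability.
    weights:   one weight per subrecogniser (e.g. derived from its accuracy).
    """
    words = log_probs[0].keys()
    fused = {
        word: sum(w * lp[word] for w, lp in zip(weights, log_probs))
        for word in words
    }
    best = max(fused, key=fused.get)
    return best, fused

# Hypothetical scores from three subrecognisers for a two-word vocabulary.
scores = [
    {"yes": -10.0, "no": -12.0},
    {"yes": -11.0, "no": -9.0},
    {"yes": -8.0, "no": -9.5},
]
accuracy_weights = [0.9, 0.6, 0.8]  # assumed, not from the paper

best_word, fused = fuse_scores(scores, accuracy_weights)
print(best_word, fused)
```

Here the second subrecogniser prefers "no", but its lower weight lets the other two carry the decision.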
14.
A novel adaptive discriminative vector quantisation technique for speaker identification (ADVQSI) is introduced. In the training mode of ADVQSI, for each speaker, the speech feature vector space is divided into a number of subspaces. The feature space segmentation is based on the difference between the probability distribution of the speech feature vectors from each speaker and that from all speakers in the speaker identification (SI) group. Then, an optimal discriminative weight, which represents the subspace's role in SI, is calculated for each subspace of each speaker by employing adaptive techniques. The largest template differences between speakers in the SI group are achieved by using optimal discriminative weights. In the testing mode of ADVQSI, discriminative weighted average vector quantisation (VQ) distortions are used for SI decisions. The performance of ADVQSI is analysed and tested experimentally. The experimental results confirm the performance improvement employing the proposed technique in comparison with existing VQ techniques for SI and recently reported discriminative VQ techniques for SI (DVQSI).
15.
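A complex time series is a highly complex nonlinear dynamic system, and traditional support vector machine methods cannot predict individual point values precisely, so predicting the fluctuation interval of the series is more informative. On this basis, an algorithm for predicting the fluctuation range of time series based on a weighted support vector machine is proposed. Using stock indices as an example, the raw price data are first subjected to fuzzy information granulation; then, taking the characteristics of financial time series into account, an improved weighted support vector machine performs regression analysis on the granulated price data while the parameters are optimized. Finally, prediction experiments on three major stock indices show that the method effectively predicts the fluctuation range of complex time series, with higher accuracy than the standard support vector machine model.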
16.
We present a new approach to joint state and parameter estimation for a target-directed, nonlinear dynamic system model with switching states. The model, recently proposed for representing speech dynamics, is called the hidden dynamic model (HDM). The model parameters, subject to statistical estimation, consist of the target vector and the system matrix (also called "time-constants"), as well as parameters characterizing the nonlinear mapping from the hidden state to the observation. We implement these parameters as the weights of a three-layer feedforward multilayer perceptron (MLP) network. The new estimation approach is based on the extended Kalman filter (EKF), and its performance is compared with the traditional expectation-maximization (EM) based approach. Extensive simulation results are presented using both approaches and under typical HDM speech modeling conditions. The EKF-based algorithm demonstrates superior convergence performance compared with the EM algorithm, but the former suffers from excessive computational loads when adopted for training the MLP weights. In all cases, the simulated model output converges to the given observation sequence. However, only in the case where the MLP weights or the target vector are assumed known do the time-constant parameters converge to their true values. We also show that the MLP weights never converge to their true values, thus demonstrating the many-to-one mapping property of the feedforward MLP. We conclude that, for the system to be identifiable, restrictions on the parameter space are needed.
17.
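In support vector clustering, a support vector machine that uses a single kernel function has significant limitations. To obtain a kernel function with both strong learning ability and strong generalization ability, a new hybrid kernel function is adopted. The hybrid kernel is applied to support vector clustering, and the results are compared experimentally with those of support vector machines built on ordinary kernel functions. The results demonstrate the effectiveness of the method.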
18.
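To effectively improve the classification accuracy of hyperspectral image data and reduce the dependence on large datasets, an improved prototype-space feature extraction scheme based on the weighted fuzzy C-means algorithm is proposed, building on the prototype-space feature extraction method. The scheme uses the weighted fuzzy C-means algorithm to apply a different weight to each feature, ensuring that the extracted features carry a high amount of information. Experimental results show that, compared with the widely recognized prototype-space extraction algorithm, the scheme remains stable and achieves relatively high classification accuracy on relatively small datasets, which greatly reduces the dependence on the number of samples and also improves the efficiency of the prototype-space feature method.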
19.
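A Phase-Predictive Vector Quantization Method for WI Speech Coding    Total citations: 1, self-citations: 0, other citations: 1
Traditional low-bit-rate speech coding often discards phase information because the human ear is insensitive to it, which makes speech sound rough and harsh and can even change its pitch. To obtain a high-quality vocoder, the phase information of speech cannot be ignored. Building on the dispersed-phase vector quantization method, this paper further removes phase redundancy by applying predictive vector quantization to the phase-spectrum differences of the slowly evolving waveform (SEW) between adjacent frames in the waveform interpolation (WI) coding model. Experiments show that the method greatly improves the reconstructed speech, with noticeably better naturalness and intelligibility. Subjective A/B tests show that, compared with the fixed-phase method, phase quantization with 4 to 6 bits significantly improves the quality of the synthesized speech, and that, compared with the dispersed-phase vector quantization method, the synthesis quality for female voices improves.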
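To make the idea of predictive vector quantization concrete, here is a generic first-order sketch: each frame is predicted from the previous reconstructed frame, and only the prediction residual is quantized against a codebook. It is not the paper's SEW phase quantizer; phase wrapping is ignored, and the codebook, the predictor coefficient `alpha`, and the function names are assumptions for illustration.

```python
def nearest_index(codebook, vector):
    """Index of the codeword with the smallest squared error to `vector`."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], vector)),
    )

def pvq_encode(frames, codebook, alpha=1.0):
    """First-order predictive VQ: quantize the residual against the previous reconstruction."""
    prev = [0.0] * len(frames[0])
    indices, reconstructed = [], []
    for frame in frames:
        prediction = [alpha * p for p in prev]
        residual = [x - p for x, p in zip(frame, prediction)]
        k = nearest_index(codebook, residual)
        # The decoder can rebuild this value from k and its own previous output.
        recon = [p + c for p, c in zip(prediction, codebook[k])]
        indices.append(k)
        reconstructed.append(recon)
        prev = recon
    return indices, reconstructed

# Toy 2-dimensional "phase" frames and a 3-entry residual codebook (2 bits would suffice).
codebook = [[0.0, 0.0], [0.5, 0.5], [-0.5, -0.5]]
frames = [[0.5, 0.5], [1.0, 1.0], [0.5, 0.5]]
indices, reconstructed = pvq_encode(frames, codebook)
print(indices, reconstructed)
```

Because adjacent frames change slowly, the residuals cluster near zero, so a small codebook covers them well; this is the redundancy that predictive quantization exploits.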