Found 19 similar documents (search time: 125 ms)
1.
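A voiceprint recognition algorithm based on a genetically optimized RBF neural network is proposed. The algorithm uses a genetic algorithm to optimize the basis-function centers and widths of a conventional RBF neural network, overcoming the difficulty of determining these parameters in conventional RBF networks. The algorithm also draws on a psychoacoustic model and extracts Mel cepstral coefficients, which capture speaker-specific characteristics, as the features for speaker recognition, which improves the system's noise robustness. Simulation results show that, compared with a conventional RBF neural network, the method learns the network weights quickly and has strong global search ability, further improving the recognition rate.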
2.
《电子技术与软件工程》 (Electronic Technology & Software Engineering), 2015, (16)
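This paper introduces an RBF neural network algorithm based on orthogonal least squares into adaptive noise cancellation and proposes an adaptive noise cancellation algorithm based on the least squares algorithm and a radial basis network (adaptive filter based on least square algorithm and radial basis network, abbreviated OLSRBFAF). RBF networks have been applied successfully in many fields thanks to their good generalization ability, simple structure, and fast training. The key factor in an RBF neural network is the choice of basis-function centers; an RBF network built with poorly chosen centers generally performs unsatisfactorily. The orthogonal least squares (OLS) algorithm is used to select the RBF network centers, which solves this key problem in constructing radial basis function networks. Because the OLS algorithm uses the least-squares (LS) criterion, it can track time-varying channels quickly. Analysis of MATLAB simulation results shows that combining the two algorithms in an adaptive noise cancellation system gives the system smaller error and stronger noise cancellation.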
3.
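The eigenphone speaker adaptation algorithm achieves good adaptation performance when sufficient adaptation data are available but overfits severely when the data are insufficient. To overcome this problem, this paper proposes a speaker adaptation algorithm based on an eigenphone speaker subspace. First, the basic principle of eigenphone speaker adaptation in speech recognition systems based on the hidden Markov model-Gaussian mixture model (HMM-GMM) is presented. Next, a speaker subspace is introduced to model the correlation among the eigenphone matrices of different speakers. Then, a new eigenphone speaker subspace adaptation algorithm is obtained by estimating a speaker-dependent coordinate vector. Finally, the eigenphone speaker subspace adaptation algorithm is compared with the conventional speaker subspace adaptation algorithm. Mandarin continuous speech recognition experiments on a Microsoft corpus show that, compared with the eigenphone speaker adaptation algorithm, the proposed algorithm improves performance substantially when adaptation data are very scarce and largely overcomes overfitting. Compared with the eigenvoice adaptation algorithm, it achieves lower space complexity at a small cost in performance, making it more practical.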
4.
5.
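To improve the performance of speaker recognition systems, a deep learning speaker recognition algorithm based on an improved spectrogram is proposed. A spectrogram carries several kinds of information, including speech content, emotion, language, and speaker identity. Earlier speaker recognition algorithms often ignored speaker identity characteristics and fed the spectrogram extracted directly from speech into the network, whereas a speaker recognition system needs the identity-bearing information in the spectrogram; the original spectrogram therefore needs to be improved. Within the spectrogram, the fundamental frequency (pitch) and the formants best represent a speaker's identity. The paper therefore applies adaptive comb filtering according to the fundamental frequency of each frame of the speech signal to obtain an improved spectrogram, and then extracts speaker features with a convolutional neural network to raise recognition accuracy. The network model is MobileNetv2, which has few parameters, converges quickly, and recognizes quickly, all of which favor practical deployment. In the comparison experiments, the method improved accuracy over the original spectrogram by 2.3%, 5.2%, and 3%, respectively.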
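The abstract does not specify the comb filter design, so the following is only a minimal sketch of pitch-driven comb filtering for one frame. It assumes a simple feedforward comb, y[n] = x[n] + g·x[n − T], with the delay T set to one pitch period; the function name `comb_filter_frame`, the gain `g = 0.8`, and the 8 kHz sample rate are illustrative assumptions, not values from the paper.

```python
import math

def comb_filter_frame(frame, f0_hz, sample_rate, gain=0.8):
    """Feedforward comb filter y[n] = x[n] + gain * x[n - T].

    T is one pitch period in samples, so harmonics of f0_hz are reinforced
    and components halfway between harmonics are attenuated.
    (Hypothetical helper; the paper's actual filter is not given.)
    """
    if f0_hz <= 0:
        # Unvoiced frame with no pitch estimate: pass through unchanged.
        return list(frame)
    period = max(1, round(sample_rate / f0_hz))
    out = []
    for n, x in enumerate(frame):
        delayed = frame[n - period] if n >= period else 0.0
        out.append(x + gain * delayed)
    return out

# Illustrative frame: 20 ms at 8 kHz with an assumed pitch of 200 Hz.
sample_rate = 8000
f0 = 200.0
n_samples = 160
harmonic = [math.sin(2 * math.pi * f0 * n / sample_rate) for n in range(n_samples)]
off_harmonic = [math.sin(2 * math.pi * 1.5 * f0 * n / sample_rate) for n in range(n_samples)]

boosted = comb_filter_frame(harmonic, f0, sample_rate)      # tone at f0
damped = comb_filter_frame(off_harmonic, f0, sample_rate)   # tone at 1.5 * f0
```

Once the delay line has filled (n ≥ T), the tone at the pitch frequency is scaled by 1 + g and the tone halfway between harmonics by 1 − g, which is the selective emphasis of pitch structure the abstract describes. In a per-frame scheme, `f0_hz` would come from a pitch tracker that is not shown here.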
6.
7.
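Eigenphone speaker adaptation overfits severely when adaptation data are insufficient. To address this, an eigenphone speaker adaptation algorithm with a sparse group LASSO constraint is proposed. First, the basic principle of eigenphone speaker adaptation under the hidden Markov model-Gaussian mixture model is presented. Then sparse group LASSO regularization is introduced into eigenphone speaker adaptation; model complexity is controlled by adjusting the weighting factors, and the optimization is carried out with an accelerated proximal gradient algorithm. Finally, the sparse group LASSO constrained adaptation algorithm is compared with several current regularized adaptation methods. Speaker adaptation experiments on Mandarin continuous speech recognition show that the sparse group LASSO constraint clearly improves eigenphone speaker adaptation and outperforms l1, l2, and elastic net regularization.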
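A proximal gradient solver of the kind mentioned above alternates a gradient step on the data term with a proximal step on the penalty. The sketch below shows only that proximal step for a sparse group LASSO penalty of the form λ₁‖x‖₁ + λ₂ Σ_g ‖x_g‖₂. The grouping, the absence of per-group size weights, and the names `prox_sparse_group_lasso`, `lam_l1`, and `lam_group` are assumptions for illustration; the abstract does not say how the eigenphone matrix is grouped.

```python
import math

def soft_threshold(x, t):
    """Elementwise l1 proximal operator: shrink x toward zero by t."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def prox_sparse_group_lasso(v, groups, step, lam_l1, lam_group):
    """Proximal operator of step * (lam_l1 * ||x||_1 + lam_group * sum_g ||x_g||_2).

    `groups` is a list of index lists that partition v (an assumed layout).
    Within each group: soft-threshold every entry, then shrink the whole
    group toward zero by its Euclidean norm.
    """
    out = [0.0] * len(v)
    for idx in groups:
        u = [soft_threshold(v[i], step * lam_l1) for i in idx]
        norm = math.sqrt(sum(a * a for a in u))
        # A group whose norm falls below the threshold is zeroed entirely.
        scale = max(0.0, 1.0 - step * lam_group / norm) if norm > 0 else 0.0
        for i, a in zip(idx, u):
            out[i] = a * scale
    return out

# Illustrative input: one strong group and one weak group.
v = [3.0, -4.0, 0.5, 0.2]
groups = [[0, 1], [2, 3]]
shrunk = prox_sparse_group_lasso(v, groups, step=1.0, lam_l1=1.0, lam_group=1.0)
```

In this example the weak group is set exactly to zero while the strong group is only shrunk, which is how the penalty produces both group-level and element-level sparsity. The two weights play the role of the weighting factors that the abstract says control model complexity.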
8.
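For fast speaker adaptation in speech recognition, this paper improves the existing speaker support weight algorithm. A support vector machine (Support Vector Machines, SVM) takes part in selecting the support speakers, the maximum a posteriori (MAP) criterion replaces the maximum likelihood (ML) criterion for estimating the support speaker weights, and finally a linear combination is formed for the test speaker. Compared with existing related adaptation methods, the algorithm effectively improves performance when little adaptation data is available. Experimental results show that, with only one adaptation utterance, the system's Chinese character recognition accuracy rises from 45.67% for the original speaker-independent (SI) system to 58.05%, which is 4.67% higher than the original speaker support weight algorithm.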
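The reported figures can be read in two common ways: as an absolute gain in accuracy points or as a relative reduction in error rate. The short calculation below works out both for the SI baseline (45.67%) and the adapted system (58.05%). The helper name `accuracy_gain` is illustrative, and the abstract does not state which convention its own 4.67% figure follows.

```python
def accuracy_gain(baseline_acc, adapted_acc):
    """Return (absolute gain in points, relative error-rate reduction in %)."""
    absolute = adapted_acc - baseline_acc
    baseline_err = 100.0 - baseline_acc
    adapted_err = 100.0 - adapted_acc
    relative = 100.0 * (baseline_err - adapted_err) / baseline_err
    return absolute, relative

# Accuracies quoted in the abstract for the SI system and the adapted system.
absolute, relative = accuracy_gain(45.67, 58.05)
print(f"absolute gain: {absolute:.2f} points")
print(f"relative error-rate reduction: {relative:.2f}%")
```

The absolute gain is 12.38 points, and the error rate falls from 54.33% to 41.95%, a relative reduction of a little under 23%.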
9.
10.
11.
12.
13.
Comparing support vector machines with Gaussian kernels to radial basis function classifiers (total citations: 2, self-citations: 0, citations by others: 2)
Scholkopf B., Kah-Kay Sung, Burges C.J.C., Girosi F., Niyogi P., Poggio T., Vapnik V. 《Signal Processing, IEEE Transactions on》1997,45(11):2758-2765
The support vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights, and threshold that minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by k-means clustering, and the weights are computed using error backpropagation. We consider three machines, namely, a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the United States postal service database of handwritten digits, the SV machine achieves the highest recognition accuracy, followed by the hybrid system. The SV approach is thus not only theoretically well-founded but also superior in a practical application.
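The classical RBF pipeline described above (centers from clustering, then weights from gradient-based training) can be sketched as follows. This is a toy illustration under stated assumptions, not the study's setup: it uses six hand-made 2-D points instead of the postal digit data, a fixed Gaussian width, and plain gradient descent on a single linear output layer, which is what error backpropagation reduces to once the centers are held fixed. All function names are hypothetical.

```python
import math

def kmeans(points, centers, iterations=10):
    """Lloyd's algorithm; `centers` holds the initial guesses."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
    return centers

def rbf_features(x, centers, width):
    """Gaussian activations of the hidden layer."""
    return [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centers]

def predict(x, centers, width, weights, bias):
    phi = rbf_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi)) + bias

def train_output_weights(samples, labels, centers, width, lr=0.2, epochs=200):
    """Per-sample gradient descent on squared error for the output layer."""
    weights, bias = [0.0] * len(centers), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            phi = rbf_features(x, centers, width)
            err = sum(w * p for w, p in zip(weights, phi)) + bias - y
            weights = [w - lr * err * p for w, p in zip(weights, phi)]
            bias -= lr * err
    return weights, bias

# Toy data: two well-separated clusters with labels -1 and +1.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
width = 1.0
centers = kmeans(samples, centers=[(0.0, 0.0), (5.0, 5.0)])
weights, bias = train_output_weights(samples, labels, centers, width)
predictions = [predict(x, centers, width, weights, bias) for x in samples]
```

The SV machine in the abstract differs in that it selects the centers, weights, and threshold jointly from the training data rather than in two separate stages.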
14.
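Speaker recognition based on neural networks can imitate the function of the human brain to some extent and is one of the main techniques in speaker recognition, but it is usually hard to determine the number of hidden units, convergence is slow, and training tends to converge to local minima. This paper studies a wavelet neural network model for speaker recognition and gives the network structure and learning algorithm. Mel-frequency cepstral coefficients are used as the feature parameters for text-independent speaker recognition, and a speaker recognition experiment with 5 speakers using this model achieved a recognition rate of 99.5%. The results show that, compared with a conventional BP network, the wavelet network substantially improves both training speed and recognition rate, and that it has good application prospects and merits further study.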
15.
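Gaussian mixture models built with a fixed number of mixture components do not match the diversity of speakers' speech feature distributions, which leads to overfitting or underfitting and degrades recognition performance. An adaptive Gaussian mixture model with a variable number of components is proposed and applied to speaker recognition. During training, based on the clustering properties of the distribution of each speaker's speech feature parameters, an absorb-merge and split mechanism dynamically adjusts the number of components to obtain a more accurate fit and raise the recognition rate. Experimental results show that, with MFCC and BFCC (Bilinear Frequency Cepstrum Coefficients) features, the relative error rate fell by 41.41% and 22.21%, respectively.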
16.
17.
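Most existing image recognition systems are implemented in software and cannot exploit the parallel computing capability of neural networks. To address this, this paper proposes an FPGA-based hardware image recognition system built on an improved RBF neural network. Replacing multiplications with additions resolves the computational complexity that makes neural networks hard to implement in hardware, and a sorting circuit based on bit comparison solves the problem of sorting large amounts of data quickly; a multi-target image recognition application system was developed on this basis. The feature extraction part of the system is implemented on an FPGA and the image recognition part in an ASIC circuit. Experimental results show that the average recognition time of the proposed improved RBF neural network algorithm is 50% shorter than that of LeNet-5, AlexNet, and VGG16; the developed hardware system recognizes 10000 sample images in 165 μs, about 60% less than the 426.6 μs required by a DSP chip system.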
18.
Besacier L., Mayorga P., Bonastre J.-F., Fredouille C., Meignier S. 《Vision, Image and Signal Processing, IEE Proceedings -》2003,150(6):372-376
An overview is presented of compression and packet loss effects in speech biometrics. These new problems appear particularly in recent applications of biometrics over mobile or Internet networks. The influence of speech compression on speaker recognition performance in mobile networks is investigated. In a first experiment, it is found that the use of GSM coding degrades the performance. In a second experiment, the features for the speaker recognition system are calculated directly from the information available in the encoded bit stream. It is found that a low LPC order in GSM coding is responsible for most performance degradations. A speaker recognition system was obtained which is equivalent in performance to the original one which decodes and reanalyses speech before performing recognition. The joint packet loss and compression effects over IP networks are also studied. It is experimentally demonstrated that the adverse effects of packet loss alone are negligible, while the encoding of speech, particularly at a low bit rate, coupled with packet loss, can reduce the verification accuracy considerably.
19.
Po-Rong Chang, Wen-Hao Yang 《Vehicular Technology, IEEE Transactions on》1997,46(1):155-160
This paper investigates the application of a radial basis function (RBF) neural network to the prediction of field strength based on topographical and morphographical data. The RBF neural network is a two-layer localized receptive field network whose output nodes form a combination of radial activation functions computed by the hidden layer nodes. Appropriate centers and connection weights in the RBF network lead to a network that is capable of forming the best approximation to any continuous nonlinear mapping up to an arbitrary resolution. Such an approximation introduces best nonlinear approximation capability into the prediction model in order to accurately predict propagation loss over an arbitrary environment based on adaptive learning from measurement data. The adaptive learning employs hybrid competitive and recursive least squares algorithms. The unsupervised competitive algorithm adjusts the centers while the recursive least squares (RLS) algorithm estimates the connection weights. Because these two learning rules are both linear, rapid convergence is guaranteed. This hybrid algorithm significantly enhances the real-time or adaptive capability of the RBF-based prediction model. The applications to Okumura's (1968) data are included to demonstrate the effectiveness of the RBF neural network approach.
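The weight-estimation half of the hybrid scheme above can be illustrated with a textbook recursive least squares update for a linear output layer on top of Gaussian hidden units. This sketch makes several assumptions that are not from the paper: the centers are held fixed instead of being adapted by competitive learning, the input is one-dimensional, the targets are synthetic and noise-free, and the names `gaussian_features` and `rls_update` are hypothetical.

```python
import math

def gaussian_features(x, centers, width):
    """Radial activations of the hidden layer for a scalar input x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def rls_update(weights, P, phi, target, forgetting=1.0):
    """One recursive least squares step for the output weights.

    P approximates the inverse correlation matrix of the hidden activations
    and is assumed symmetric, so phi^T P equals (P phi)^T.
    """
    n = len(phi)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    error = target - sum(w * p for w, p in zip(weights, phi))
    new_weights = [w + g * error for w, g in zip(weights, gain)]
    new_P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return new_weights, new_P

# Synthetic check: recover known output weights from noise-free samples.
centers = [0.0, 1.0, 2.0]          # fixed here; the paper adapts them competitively
width = 0.5
true_weights = [2.0, -1.0, 0.5]    # illustrative values only
weights = [0.0] * len(centers)
P = [[1e6 if i == j else 0.0 for j in range(3)] for i in range(3)]  # large initial P

for step in range(30):
    x = 0.1 * step
    phi = gaussian_features(x, centers, width)
    target = sum(w * p for w, p in zip(true_weights, phi))
    weights, P = rls_update(weights, P, phi, target)
```

Each update costs a fixed number of operations per sample and needs no pass over earlier data, which is why RLS suits the real-time, adaptive use the abstract emphasizes. A `forgetting` value slightly below 1 would let the weights track slowly changing conditions.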