Similar Literature
19 similar documents found.
1.
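Speech Recognition Based on GDTW+SVM   Total citations: 3 (self-citations: 0, citations by others: 3)
To address the problem that the feature parameters extracted from speech signals differ in dimensionality, this paper proposes a speech recognition method based on an SVM with a GDTW kernel. The method first extracts features from the speech signal, maps the feature vectors into a high-dimensional feature space through the GDTW kernel, and then applies support vector machine classification in that space for recognition. Experiments show that, compared with the DTW algorithm and neural network methods, the method is feasible and significantly improves the recognition rate for speech signals.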

2.
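A speech recognition method is proposed that is based on dynamic time warping (DTW) and an improved learning vector quantization (LoPLVQ) neural network. The method first time-aligns the speech signal with the DTW algorithm and then classifies the speech with the improved learning vector quantization neural network. Experiments show that, for large-scale speech recognition, the new system not only shortens training time but also achieves a high recognition rate.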
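The time-alignment step named in this abstract can be illustrated with a minimal sketch of classic dynamic time warping. This is the textbook dynamic-programming recurrence, not the paper's implementation: the function name `dtw_distance`, the use of scalar frames, and the absolute-difference local cost are assumptions made for illustration (real systems compare multi-dimensional feature vectors).

```python
def dtw_distance(a, b):
    """Return the minimal accumulated alignment cost between sequences a and b.

    Assumption: frames are scalars and the local cost is |x - y|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Two utterances of the same pattern spoken at different speeds, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is why DTW is useful before a fixed-input classifier.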

3.
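Research on Modulation Recognition Algorithms in Cognitive Radio   Total citations: 1 (self-citations: 0, citations by others: 1)
Recognizing the modulation type of communication signals is of significant research importance for cognitive radio, an intelligent communication system. Using the cyclic spectral correlation features of modulated signals, five feature parameters are extracted, and curves of each parameter versus signal-to-noise ratio are given. The classifier is based on RBF neural networks with a "one network per class" structure; to improve recognition performance, a large, high-quality set of training samples is constructed, which widens the recognition range and improves recognition accuracy. The algorithm, based on spectral correlation feature parameters and a neural network classifier, can dynamically recognize the modulation scheme of a signal, and simulation results verify its effectiveness at low signal-to-noise ratios.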

4.
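In traditional power dispatching communication systems, the speech of both parties to a call is stored in a single recording file. Separating the two speakers' voices is important for applying speech recognition and voiceprint recognition in the power dispatching domain. The problem of separating voices when several people speak at once is known as the cocktail party problem. To address it, in particular single-channel two-speaker speech separation, a deep clustering network based on an attention mechanism is proposed. First, MFCC features of the speech signal are extracted; second, they are fed into a neural network to extract higher-dimensional features; third, an attention mechanism assigns a weight to each feature; finally, k-means clustering groups and outputs the speech belonging to the same speaker. On the wsj0 dataset, the proposed model outperforms the original clustering network. Test results show that on the closed dataset, the SDR gains of the new algorithm for male-male mixtures, female-female mixtures, male-female mixtures, and overall are 20.58%, 17.25%, 1.88%, and 22.78%, respectively, while the corresponding results on the open dataset are 3.56%, 20.87%, 1.04%, and 17.67%.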

5.
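Research on Text-Independent Speaker Recognition Based on Clustering Statistics   Total citations: 6 (self-citations: 2, citations by others: 4)
Starting from the spatial mapping of speech signal feature vectors, a split-and-merge clustering algorithm is proposed on the basis of the binary splitting algorithm and applied to text-independent speaker recognition. An open system based on clustering statistics is preliminarily established. The system builds reference templates from the distribution centers of each speaker's speech signal in feature space and uses cluster statistical centers in place of the feature vectors of the speech segment to be recognized for pattern matching; the larger the system, the more computation is saved. In a practical study of a small-scale speaker identification system, the effects on system performance of feature-vector weighting, speech segment duration, and the choice of the a factor are examined.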

6.
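A speech recognition system based on a self-organizing neural network is built. The speech signal is preprocessed, and its linear prediction coefficients, linear prediction cepstral coefficients, and Mel cepstral coefficients are extracted; a recognition decision model based on the self-organizing neural network is established. The classification and clustering capability of the self-organizing neural network is analyzed in depth and improved: by strengthening training and setting a threshold function, the membership of boundary neurons is effectively determined and reasonable output pattern classes are obtained. The results verify that the self-organizing neural network is suitable for isolated-word speech recognition and is fast and structurally simple. MATLAB simulation experiments show a speech recognition rate of 96%.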

7.
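In speech emotion recognition, the extraction and selection of emotional feature information and the choice of emotion recognition model are two important parts. Emotion recognition is performed by combining acoustic feature parameters and auditory feature parameters of the speech signal; the optimal feature set is selected according to the differences between each pair of emotions, and a neural-network-based cross-recognition scheme for emotions is designed. Combined with the auditory feature parameters, the classifier outputs the recognized emotion, achieving an average recognition rate of 92%.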

8.
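Research and Progress in Neural Network Speech Recognition   Total citations: 3 (self-citations: 0, citations by others: 3)
This paper discusses the advantages and disadvantages of the auditory neural network model, the BP network, the time-delay neural network, the self-organizing map, learning vector quantization, and the neural prediction network in speech recognition, as well as development trends in neural network speech recognition.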

9.
冯涛 《无线电工程》2006,36(6):24-26
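Classification and recognition of communication signals is a typical statistical pattern recognition problem. The principles and methods of feature selection, feature extraction, and classification for communication signals are discussed systematically. An artificial neural network classifier is designed, covering the choice of neural network model, the input and output representation of the classifier, the network topology, and the training algorithm, and a hierarchically structured neural network classifier is proposed.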

10.
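Application of an Improved Fuzzy C-Means Clustering Algorithm to Speaker Recognition   Total citations: 3 (self-citations: 0, citations by others: 3)
杨彦  赵力 《电声技术》2006,(1):40-43
A speaker recognition method combining an improved FCM clustering algorithm with vector quantization is proposed. The feature vector set to be recognized is first extracted from the speech signal, a codebook is then designed using vector quantization, and finally the improved algorithm is used to identify the speech to be recognized. The algorithm addresses the FCM algorithm's sensitivity to initial values and its tendency to become trapped in local optima. It uses relatively few feature parameters and is computationally simple, yet achieves a high recognition rate and good robustness.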
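For orientation, one iteration of the standard (unimproved) fuzzy C-means algorithm can be sketched as below. The abstract does not describe the authors' modification, so none is shown here; the function names, the one-dimensional data, and the fuzzifier default `m=2.0` are assumptions chosen for illustration.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update for 1-D points (illustrative only).

    Returns rows u[i][k]: degree to which point i belongs to center k.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: split full
            # membership among them to avoid dividing by zero.
            hits = dists.count(0.0)
            row = [1.0 / hits if d == 0.0 else 0.0 for d in dists]
        else:
            row = [
                1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                for d_k in dists
            ]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: membership-weighted mean of the points."""
    centers = []
    for k in range(len(memberships[0])):
        weights = [row[k] ** m for row in memberships]
        total = sum(w * x for w, x in zip(weights, points))
        centers.append(total / sum(weights))
    return centers
```

Alternating these two updates from a starting set of centers is the baseline whose sensitivity to initial values the abstract refers to.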

11.
陈婧  李海峰  马琳  陈肖  陈晓敏 《信号处理》2017,33(3):374-382
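To address the loss of fine prosodic detail and of feature evolution patterns caused by the use of global statistical features in traditional dimensional speech emotion recognition systems, this paper proposes a multi-granularity feature extraction method based on different time units, extracting short-term frame-level, mid-term segment-level, and long-term window-level features, and proposes a Cognition-Inspired Recurrent Neural Network (CIRNN) that can fuse multi-granularity features. The network simulates the step-by-step process by which the human brain handles speech signals: by fusing multi-granularity features, features from different time units all take part in network training, which both highlights the temporal nature of emotion and retains the contribution of global characteristics to emotion recognition, achieving multi-level information fusion. The network also simulates the way the brain compares input against patterns from past experience by introducing a memory layer that stores the emotional features of the preceding context, strengthening the influence of contextual information on recognition. The method is applied to dimensional emotion recognition on the VAM dimensional corpus and tested on the three dimensions Activation, Dominance, and Valence; the average correlation coefficient is 0.66, and the recognition results are clearly better than those of traditional ANN and SVR.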

12.
Living plant recognition based on images of leaf, flower and fruit is a very challenging task in the field of pattern recognition and computer vision. There has been little work reported on flower and fruit image processing and recognition. In recent years, several researchers have dedicated their work to leaf characterisation. As an inherent trait, the leaf vein contains important information for plant species recognition despite its complex modality. A new approach that combines a thresholding method and an artificial neural network (ANN) classifier is proposed to extract leaf veins. A preliminary segmentation based on the intensity histogram of leaf images is first carried out to coarsely determine vein regions. This is followed by a fine segmentation using a trained ANN classifier with ten features extracted from a window centred on the object pixel as its inputs. Compared with other methods, experimental results show that this combined approach is capable of extracting more accurate venation modality of the leaf for the subsequent vein pattern classification. The approach can also reduce the computing time compared with a direct neural network approach.

13.
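Nonlinear Voiced/Unvoiced Speech Decision Based on Phase Space Reconstruction   Total citations: 2 (self-citations: 0, citations by others: 2)
陈亮  张雄伟 《通信学报》2003,24(6):16-22
Based on phase space reconstruction theory, the phase space of the speech signal is reconstructed using Takens' theorem, and a feature parameter, the repetition degree of similar sequences (RPT), is extracted. Exploiting the difference in RPT between voiced and unvoiced speech, a nonlinear voiced/unvoiced decision method using a BP neural network is proposed and implemented, yielding results clearly better than those of traditional algorithms. The method offers a new approach for research on speech feature extraction and recognition.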
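The reconstruction step can be illustrated with a minimal time-delay embedding sketch, the construction that Takens' theorem justifies. The function name `delay_embed` and the parameters `dim` (embedding dimension) and `delay` (lag in samples) are assumptions; the abstract gives neither the values used nor the definition of the RPT feature, so RPT is not computed here.

```python
def delay_embed(signal, dim, delay):
    """Return delay vectors (x[t], x[t+delay], ..., x[t+(dim-1)*delay]).

    Each tuple is one point of the reconstructed phase-space trajectory.
    """
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive integers")
    span = (dim - 1) * delay
    if len(signal) <= span:
        raise ValueError("signal is too short for this dim and delay")
    return [
        tuple(signal[t + k * delay] for k in range(dim))
        for t in range(len(signal) - span)
    ]
```

For example, embedding the six samples `[0, 1, 2, 3, 4, 5]` with `dim=3` and `delay=2` gives the two points `(0, 2, 4)` and `(1, 3, 5)`.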

14.
Recently several speaker adaptation methods have been proposed for deep neural networks (DNNs) in many large vocabulary continuous speech recognition (LVCSR) tasks. However, only a few methods rely on tuning the connection weights in trained DNNs directly to optimize system performance, since this is very prone to over-fitting, especially when some class labels are missing in the adaptation data. In this paper, we propose a new speaker adaptation method for the hybrid NN/HMM speech recognition model based on singular value decomposition (SVD). We apply SVD on the weight matrices in trained DNNs and then tune rectangular diagonal matrices with the adaptation data. This alleviates the over-fitting problem via updating the weight matrices slightly by only modifying the singular values. We evaluate the proposed adaptation method in two standard speech recognition tasks, namely TIMIT phone recognition and large vocabulary speech recognition in the Switchboard task. Experimental results have shown that it is effective to adapt large DNN models using only a small amount of adaptation data. For example, recognition results in the Switchboard task have shown that the proposed SVD-based adaptation method may achieve up to 3-6% relative error reduction using only a few dozen adaptation utterances per speaker.

15.
This study proposes a parallel speech recognition algorithm based on a hybrid model that combines a hidden Markov model (HMM) and an artificial neural network (ANN). First, the algorithm uses the HMM for time-series modeling of speech signals and calculates the output probability score of the speech under each HMM. Second, with the probability score as input to the neural network, the algorithm gets information for classification and recognition and makes a decision based on the hybrid model. Finally, Matlab software is used to train and test sample data. Simulation results show that, by using the strong time-series modeling ability of the HMM and the classification features of the neural network, the proposed algorithm possesses stronger noise immunity than the traditional HMM. Moreover, the hybrid model compensates for the individual flaws of the HMM and the neural network and greatly improves the speed and performance of speech recognition.
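The "output probability score" in the first step is conventionally computed with the forward algorithm. The sketch below assumes a discrete-observation HMM with hypothetical parameter names (`start`, `trans`, `emit`); the abstract does not state the observation model, and practical speech systems usually use continuous emission densities and log-domain arithmetic.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM using the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n_states = len(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            emit[s][o] * sum(alpha[p] * trans[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)
```

In a hybrid of the kind described, one such score per word model would form the input vector of the neural network classifier; that wiring is not shown because the abstract gives no further detail.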

16.
宋波  张雪英 《电声技术》2009,33(8):68-70
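Taking the G.721 ADPCM speech coding algorithm as the object of study, a neural network model is introduced into the prediction stage of speech coding to overcome the shortcomings of traditional linear filtering, and the structure of an ADPCM speech coding system based on an RBF neural network is investigated. The ADPCM speech coding algorithm is improved by determining the centers and widths of the RBF neural network with the k-means clustering algorithm and the RBF network weights with the least squares method. Experiments show that the average signal-to-noise ratio is 1-2 dB higher than that of the original ADPCM coding algorithm.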
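How k-means can supply RBF centers and widths is sketched below for one-dimensional samples. The function names, the fixed iteration count, and the width rule (root-mean-square distance of a cluster's members to its center) are assumptions; the abstract does not say how widths are derived from the clusters, and the least-squares weight fit is omitted.

```python
def kmeans_1d(samples, initial_centers, iterations=20):
    """Plain k-means (Lloyd's algorithm) on scalars.

    Returns the final centers and the clusters from the last assignment.
    """
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for x in samples:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # An empty cluster keeps its previous center.
        centers = [
            sum(members) / len(members) if members else centers[k]
            for k, members in enumerate(clusters)
        ]
    return centers, clusters


def rbf_widths(centers, clusters):
    """Assumed width rule: RMS distance of cluster members to the center."""
    widths = []
    for c, members in zip(centers, clusters):
        if members:
            mean_sq = sum((x - c) ** 2 for x in members) / len(members)
            widths.append(mean_sq ** 0.5)
        else:
            widths.append(0.0)
    return widths
```

With samples `[0.0, 2.0, 10.0, 12.0]` and starting centers `[0.0, 10.0]`, the centers settle at 1.0 and 11.0 with widths of 1.0 each. A single-member cluster gets width 0 under this rule, so a practical system would need a lower bound.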

17.
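A Nonlinear Search Algorithm for Mutual Information Estimation of Speech Signals and Its Application to Recognition   Total citations: 6 (self-citations: 0, citations by others: 6)
Speech recognition methods based on mutual information theory take into account both the time-varying distribution characteristics and the statistical distribution characteristics of speech signals. They effectively increase the cohesion of patterns of the same class and reduce coupling between patterns of different classes, show good recognition accuracy and high run-time efficiency in speech recognition experiments and practical applications, and are better suited than other methods to speech recognition in embedded systems. This paper proposes a nonlinear search algorithm for mutual information estimation that effectively handles nonlinear fluctuations in the time-varying distribution characteristics of speech signals and further improves the accuracy of mutual information matching of speech patterns.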

18.
An adaptive speech streaming method to improve the perceived speech quality of a software‐based multipoint control unit (SW‐based MCU) over IP networks is proposed. First, the proposed method predicts whether the speech packet to be transmitted is lost. To this end, the proposed method learns the pattern of packet losses in the IP network, and then predicts the loss of the packet to be transmitted over that IP network. The proposed method classifies the speech signal into different classes of silence, unvoiced, speech onset, or voiced frame. Based on the results of packet loss prediction and speech classification, the proposed method determines the proper amount and bitrate of redundant speech data (RSD) that are sent with primary speech data (PSD) in order to assist the speech decoder to restore the speech signals of lost packets. Specifically, when a packet is predicted to be lost, the amount and bitrate of the RSD must be increased through a reduction in the bitrate of the PSD. The effectiveness of the proposed method for learning the packet loss pattern and assigning a different speech coding rate is then demonstrated using a support vector machine and adaptive multirate‐narrowband, respectively. The results show that as compared with conventional methods that restore lost speech signals, the proposed method remarkably improves the perceived speech quality of an SW‐based MCU under various packet loss conditions in an IP network.

19.
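For speech emotion recognition, to address the insufficient generalization ability of single-layer long short-term memory (LSTM) networks on complex problems, a stacked LSTM model with an embedded self-attention mechanism is proposed, and a penalty term is introduced to improve network performance. For emotion recognition from video sequences, an attention mechanism is introduced that assigns each video frame a weight according to how much emotional information it contains before classification. Finally, a weighted decision fusion method fuses the facial expression and speech signals to produce the final emotion recognition result. Experimental results show that, compared with single-modality emotion recognition, the proposed method improves recognition accuracy on the selected dataset by about 4% and achieves good recognition results.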
