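Whispered speech recognition can serve certain special national-security needs. Endpoint detection of the speech samples is performed with the dual-threshold method, and the best values of its four parameters (the high and low thresholds for short-time energy and for short-time zero-crossing rate) are determined experimentally. The noise robustness of the feature parameters is analyzed in depth, and short-time energy together with first- and second-order differences is added to the MFCC parameters to strengthen their noise robustness. The study shows that, with a hidden Markov model, combining MFCC and LPCC yields recognition results far better than using either parameter set alone.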
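As a rough illustration of dual-threshold endpoint detection, the sketch below marks frames whose short-time energy exceeds a high threshold as speech, then widens the segment using a low energy threshold and a zero-crossing threshold. The function names, the non-overlapping framing, and the use of a single zero-crossing threshold `z_low` (the abstract tunes both a high and a low one) are simplifying assumptions, not the authors' implementation.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Number of sign changes between consecutive samples in one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def detect_endpoints(signal, frame_len, e_high, e_low, z_low):
    """Return (start_frame, end_frame) of the speech segment, or None.

    Hypothetical simplified scheme:
      1. frames with energy above e_high are taken as certain speech;
      2. the segment grows outward while energy stays above e_low;
      3. it grows further while the zero-crossing count stays above z_low,
         which helps catch weak, noise-like onsets.
    """
    # Non-overlapping frames keep the sketch short (an assumption).
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]

    core = [i for i, e in enumerate(energy) if e > e_high]
    if not core:
        return None
    start, end = core[0], core[-1]

    # Step 2: extend with the low energy threshold.
    while start > 0 and energy[start - 1] > e_low:
        start -= 1
    while end < len(frames) - 1 and energy[end + 1] > e_low:
        end += 1
    # Step 3: extend with the zero-crossing threshold.
    while start > 0 and zcr[start - 1] > z_low:
        start -= 1
    while end < len(frames) - 1 and zcr[end + 1] > z_low:
        end += 1
    return start, end
```

In practice the thresholds would be tuned on recorded data, as the abstract describes; the values are left to the caller here.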

胡宏梅  别玉霞 《电子器件》2023,46(6):1634-1639
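The basic principles of the particle swarm optimization (PSO) algorithm and the factors that influence it are analyzed, the speech recognition pipeline is reviewed, and PSO is applied to the speech recognition process. Based on the observed performance, a hybrid perturbation method combining Cauchy mutation with the swarm aggregation degree is proposed. A low aggregation degree is taken to indicate insufficient swarm diversity; in that case the current best value is mutated to perturb the swarm's convergence, which avoids premature convergence and improves global ergodicity. Experiments show that the improved PSO algorithm achieves better global convergence and improves speech recognition performance.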
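The idea of perturbing a stagnating swarm with Cauchy noise can be sketched as follows. The abstract does not define its aggregation measure, so this sketch assumes a simple one (mean distance of particles to their centroid) and mutates the global best whenever that value falls below `min_div`. All names, parameter values, and the rule of keeping a mutated point only if it improves the objective are assumptions for illustration.

```python
import math
import random


def cauchy(rng, scale=1.0):
    """Draw one sample from a zero-centred Cauchy distribution."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def diversity(positions):
    """Mean distance of particles to the swarm centroid (assumed measure)."""
    dim = len(positions[0])
    centroid = [sum(p[d] for p in positions) / len(positions)
                for d in range(dim)]
    return sum(math.dist(p, centroid) for p in positions) / len(positions)


def pso_cauchy(f, dim, bounds, n=20, iters=200, w=0.7, c1=1.5, c2=1.5,
               min_div=1e-3, seed=0):
    """Minimise f with PSO plus Cauchy mutation of the global best."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        # Swarm has collapsed: try a heavy-tailed jump away from gbest.
        if diversity(pos) < min_div:
            trial = [min(hi, max(lo, x + cauchy(rng))) for x in gbest]
            trial_val = f(trial)
            if trial_val < gbest_val:
                gbest, gbest_val = trial, trial_val
    return gbest, gbest_val
```

The heavy tails of the Cauchy distribution make occasional long jumps likely, which is the usual motivation for preferring it to Gaussian noise when escaping local optima.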

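This paper uses a group of artificial neural networks for handwritten digit recognition. The multi-class problem is converted into two-class problems, which reduces the complexity of the function approximation each network faces, and the networks are trained with an improved BP algorithm, which speeds up training and improves network performance. Experiments show that a recognition system built this way outperforms a recognition system based on a single BP network.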

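Multistage digital switching networks are widely used in large stored-program-controlled exchanges, where they strengthen switching connections. Faults in the delay adjustment circuit occur frequently in such networks and, if left unresolved, directly threaten the normal operation of the exchange. Starting from the basics of the delay adjustment circuit and its fault model, this paper studies fault diagnosis of the delay adjustment circuit from two angles: diagnosis by injecting test codes at the IN point, and diagnosis of the BPB selector.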

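This paper uses a feedforward multilayer neural network trained with the BP learning algorithm to recognize handwritten digits written by 40 people. Recognition proceeds in four steps: first, an HP scanner converts the digits written on paper into binary images; next, the images are preprocessed (segmented and normalized) into 32×32 dot matrices; then features are extracted, turning each dot-matrix image into a feature description; finally, training and recognition are carried out. At a rejection rate of 25%, a misrecognition rate of 0.4% is obtained. The paper also analyzes and discusses some problems encountered in the experiments.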

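Limited by low recognition rates and heavy computation, speech recognition has long been difficult to deploy widely. Based on the characteristics of building control systems, this paper proposes a real-time DSP-based speech recognition system for connected digits and, using the BACnet protocol, designs it as an embedded system within a BACnet device, thereby bringing speech recognition into building control systems.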

ANN-Based Chinese Digit Speech Recognition   Cited by: 1 (self-citations: 0, citations by others: 1)
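This paper introduces a new method of building a speech recognition system with artificial neural networks and analyzes how it differs from, and improves on, traditional recognition methods. A speaker-independent Chinese digit speech recognizer is built with a BP network. Computer simulation experiments show that its recognition performance is clearly better than that of an HMM recognizer under the same conditions, demonstrating that ANN-based speech recognition is an attractive and promising new method.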

全刚  肖熙 《电声技术》2010,34(6):45-47
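Digit speech recognition achieves very high recognition rates and has considerable practical value. To build a digit speech recognition system that maintains a high recognition rate in real noisy environments, a duration-distribution-based hidden Markov model (DDBHMM) is used in speaker-dependent and speaker-independent digit recognition experiments under both quiet and noisy conditions. The results show that DDBHMM-based digit speech recognition achieves high recognition rates for both speaker-dependent and speaker-independent speech recorded in real non-stationary noise.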

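This paper proposes a new method for recognizing isolated-word Chinese speech: linear prediction coefficients are extracted as speech features, the clustering property of vector quantization is used to compress the data, multi-section codebooks serve as speech templates, and recognition is performed by the minimum-distortion method.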
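Minimum-distortion recognition with vector-quantization codebooks can be sketched in a few lines. Each word is represented by a codebook of feature vectors, and an utterance is assigned to the word whose codebook quantizes its frames with the smallest average squared error. The sketch assumes a single codebook per word and squared Euclidean distortion; the multi-section codebooks and the LPC-specific distortion measure of the paper are not reproduced, and all names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def average_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    total = sum(min(squared_distance(f, c) for c in codebook) for f in frames)
    return total / len(frames)


def recognize(frames, codebooks):
    """Return the word whose codebook gives the lowest average distortion.

    codebooks maps a word label to a list of codewords (hypothetical layout).
    """
    return min(codebooks, key=lambda w: average_distortion(frames, codebooks[w]))
```

A call such as `recognize(frames, {"yi": cb_yi, "er": cb_er})` would return the label of the best-matching template, where `cb_yi` and `cb_er` stand for codebooks trained beforehand.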

To utilize the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modeling. The method uses linear transforms to project F0 (fundamental frequency) features of neighboring syllables as compensations, and adds them to the original F0 features of the current syllable. The transforms are discriminatively trained by using an objective function termed "minimum tone error", which is a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features achieve a 3.82% improvement in tone recognition rate over the baseline, which uses maximum likelihood trained HMMs on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.

Tone model (TM) integration is an important task for Mandarin speech recognition. It has been proved to be effective to use discriminatively trained scaling factors when integrating TM scores into multi...

Significance of group delay functions in spectrum estimation   Cited by: 1 (self-citations: 0, citations by others: 1)
A method of spectrum estimation using group delay functions is proposed. This method exploits the additive property of the Fourier transform (FT) phase to extract spectral information of the signal in the presence of noise. The phase is generally featureless due to random polarity and wrappings, but the group delay function can be processed to derive significant information such as peaks in the spectral envelope. In the resulting spectral estimates, the resolution properties of the periodogram estimate are preserved while the variance is reduced. Variance caused by the sidelobe leakage due to windows and additive noise is significantly reduced even in the spectral estimate obtained using a single realization of the observation peak. Resolution is primarily dictated by the size of the data window. The method works even for high noise levels. The results of this procedure are demonstrated through two illustrative examples: estimation of sinusoids in noise and estimation of the narrowband autoregressive process in noise.
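For readers unfamiliar with the quantity involved, the group delay is the negative derivative of the FT phase, and it can be computed without phase unwrapping through the standard identity τ(ω) = (X_R·Y_R + X_I·Y_I) / |X(ω)|², where X is the transform of x[n] and Y is the transform of n·x[n]. The sketch below evaluates this on the DFT grid; it shows only the plain group delay function, not the further processing the abstract refers to, and the function names and the `eps` guard against spectral zeros are assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) discrete Fourier transform, kept simple for clarity."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def group_delay(x, eps=1e-12):
    """Group delay in samples at each DFT bin, computed without unwrapping."""
    X = dft(x)
    Y = dft([t * v for t, v in enumerate(x)])  # transform of n * x[n]
    return [(a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, eps)
            for a, b in zip(X, Y)]
```

A unit impulse delayed by three samples, for example, gives a group delay of three samples at every bin, which is a convenient sanity check.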

Kim N.S., Un C.K. 《Electronics Letters》1993,29(9):735-736
A technique for smoothing hidden Markov model parameters based on the concepts of deleted estimation and probabilistic mapping is proposed. The proposed algorithm is closely related to deleted interpolation in its approach and is shown to yield a higher recognition rate than the distance-based smoothing and co-occurrence smoothing methods.

Power spectrum estimation of complex signals: group delay approach   Cited by: 1 (self-citations: 0, citations by others: 1)
A method for estimating the power spectrum of a complex signal (CS) realised by the group delay (GD) for a CS and the modified GD concept is proposed. This extends the performance advantages of the modified GD applicable to a real signal to a complex signal. A significant reduction in variance without any compromise in frequency resolution over that of the periodogram is found.

王维强 《电子设计工程》2012,20(12):186-189
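An embedded speech recognition system is designed. Its hardware platform is built around the ADSP-BF531, and discrete hidden Markov model (DHMM) detection and recognition algorithms are used to perform speaker-independent isolated-word speech recognition. Test results show that the overall recognition rate for speaker-independent short words exceeds 90%. The system is compact, fast, reliable, and easily extensible; it can be applied in many specialized settings and has good market prospects. The paper describes the interface design between the DSP and the system's CODEC, off-chip RAM, ROM, and CPLD, as well as the algorithms used for speech recognition, including vector quantization, Mel cepstral parameters, and Viterbi decoding, and their performance in practice.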
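Viterbi decoding for a discrete HMM, one of the algorithms named above, can be sketched as follows. Observations are assumed to be integer codeword indices (for instance the output of a vector quantizer), and the model is given as plain probability tables. The function name and table layout are assumptions; this is a generic textbook form, not the system's DSP implementation.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (log_score, state_path) of the best path through a discrete HMM.

    obs      : list of observation symbol indices
    start_p  : start_p[s] is the initial probability of state s
    trans_p  : trans_p[r][s] is the probability of moving from r to s
    emit_p   : emit_p[s][o] is the probability of emitting symbol o in state s
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(start_p)
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at time t + 1

    for o in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(pointers)

    # Trace the best final state back to the start.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path
```

For isolated-word recognition, one model per word would be scored this way and the word with the highest log score selected; working in the log domain avoids numerical underflow on long utterances.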

We examine alternative architectures for a client-server model of speech-enabled applications over the World Wide Web (WWW). We compare a server-only processing model where the client encodes and transmits the speech signal to the server, to a model where the recognition front end runs locally at the client and encodes and transmits the cepstral coefficients to the recognition server over the Internet. We follow a novel encoding paradigm, trying to maximize recognition performance instead of perceptual reproduction, and we find that by transmitting the cepstral coefficients we can achieve significantly higher recognition performance at a fraction of the bit rate required when encoding the speech signal directly. We find that the required bit rate to achieve the recognition performance of high-quality unquantized speech is just 2000 bits per second.

屈丹  张文林 《通信学报》2015,36(9):47-54
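The eigenphone speaker adaptation method overfits severely when adaptation data are insufficient. To address this, an eigenphone speaker adaptation algorithm constrained by sparse group LASSO is proposed. First, the basic principle of eigenphone speaker adaptation under the hidden Markov model-Gaussian mixture model (HMM-GMM) framework is presented. Then sparse group LASSO regularization is introduced into eigenphone speaker adaptation; model complexity is controlled by adjusting the weighting factors, and the optimization is carried out with an accelerated proximal gradient algorithm. Finally, the sparse-group-LASSO-constrained adaptation algorithm is compared with several current regularized adaptation methods. Speaker adaptation experiments on Mandarin continuous speech recognition show that the sparse group LASSO constraint clearly improves the performance of eigenphone speaker adaptation and outperforms l1, l2, and elastic net regularization.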
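The core step of a proximal gradient method for a sparse group LASSO penalty λ₁‖x‖₁ + λ₂·Σ_g‖x_g‖₂ is its proximal operator, which is known to factor into an element-wise soft threshold followed by a group-wise shrinkage. The sketch below shows that operator on a flat parameter list. How the paper groups the eigenphone parameters, and whether it weights groups by size, is not stated in the abstract, so the `groups` layout and the unweighted penalty are assumptions.

```python
import math


def soft_threshold(v, t):
    """Shrink v toward zero by t, clipping at zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def prox_sparse_group_lasso(x, groups, lam1, lam2, step):
    """Proximal operator of step * (lam1 * ||x||_1 + lam2 * sum_g ||x_g||_2).

    x      : list of parameters
    groups : list of index lists, assumed non-overlapping (hypothetical layout)
    """
    out = list(x)
    for idx in groups:
        # l1 part: element-wise soft threshold.
        shrunk = [soft_threshold(x[i], step * lam1) for i in idx]
        # Group part: scale the whole group toward zero, or zero it out.
        norm = math.sqrt(sum(v * v for v in shrunk))
        scale = max(0.0, 1.0 - step * lam2 / norm) if norm > 0 else 0.0
        for i, v in zip(idx, shrunk):
            out[i] = scale * v
    return out
```

Inside an accelerated proximal gradient loop this operator would be applied after each gradient step on the data-fit term, with `step` equal to the step size.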

In automatic speech recognition, the acoustic signal is the only tangible connection between the talker and the machine. While the signal conveys linguistic information, this information is often encoded in such a complex manner that the signal exhibits a great deal of variability. In addition, variations in environment and speaker can introduce further distortions that are linguistically irrelevant. This paper has three aims: 1) to discuss the nature of variabilities; 2) to describe the kinds of speech knowledge that may help us understand variabilities; and 3) to advocate and suggest specific procedures for the increased utilization of speech knowledge in automatic speech recognition.

In this article we have reviewed a wide variety of techniques based on the identification of missing spectral features that have proved effective in reducing the error rates of automatic speech recognition systems. These approaches have been conspicuously effective in ameliorating the effects of transient maskers such as impulsive noise or background music. We described two broad classes of missing feature algorithms: feature-vector imputation algorithms (which restore unreliable components of incoming feature vectors) and classifier-modification algorithms (which dynamically reconfigure the classifier itself to cope with the effects of unreliable feature components). We reviewed the mathematics of four major missing feature techniques: the feature-imputation techniques of cluster-based reconstruction and covariance-based reconstruction, and the classifier-modification methods of class-conditional imputation and marginalization. We also discussed the ways in which the common feature extraction procedures of cepstral analysis, temporal-difference features, and mean subtraction can be handled by speech recognition systems that make use of missing feature techniques. We concluded with a discussion of a small number of selected experimental results. These results confirm the effectiveness of all types of missing feature approaches discussed in ameliorating the effects of both stationary and transient noise, as well as the particular effectiveness of both soft masks and fragment decoding.
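Marginalization, one of the classifier-modification methods listed above, is easy to illustrate when each class is modeled by a diagonal-covariance Gaussian: integrating out an unreliable component simply drops its term from the log-likelihood. The sketch below assumes a hard binary reliability mask and one Gaussian per class. The names and the `models` layout are hypothetical, and bounded marginalization and soft masks are not covered.

```python
import math


def marginal_log_likelihood(x, reliable, mean, var):
    """Log-likelihood of x under a diagonal Gaussian, reliable components only."""
    total = 0.0
    for xi, ok, m, v in zip(x, reliable, mean, var):
        if ok:  # unreliable components are integrated out, i.e. skipped
            total += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
    return total


def classify(x, reliable, models):
    """Pick the class with the highest marginal log-likelihood.

    models maps a class label to a (mean, var) pair of lists
    (hypothetical layout).
    """
    return max(models,
               key=lambda c: marginal_log_likelihood(x, reliable, *models[c]))
```

Because masked components contribute nothing, a heavily corrupted spectral channel cannot dominate the score, which is the motivation for the approach.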
