Similar literature
18 similar articles found (search time: 62 ms)
1.
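Nonlinear statistical matching for subband robust speech recognition   Cited: 1 (self-citations: 0, by others: 1)
Because speech signals are highly variable, the performance of a recognition system degrades easily in noisy environments. Based on auditory experiments, this paper proposes a new nonlinear, independent-subband hidden Markov model (HMM) statistical matching algorithm based on maximum a posteriori (MAP) estimation. Exploiting the frequency selectivity of human auditory perception, the algorithm compensates for the noisy environment with statistical matching, MAP estimation, and HMM/MLP nonlinear mapping, chosen according to the noise characteristics of each subband. Experiments show that the algorithm clearly improves recognition performance in noisy environments.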

2.
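Subband speech denoising based on maximum likelihood estimation   Cited: 1 (self-citations: 0, by others: 1)
The basic spectral subtraction algorithm is improved. The speech signal is first processed with a filter bank that conforms to an auditory perception model. The characteristics of the background noise are then estimated by automatically tracking the envelope of the low-energy portion of the signal. Finally, an improved spectral subtraction technique filters and enhances the subband speech. Experiments show that the improved spectral subtraction technique performs well.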
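As a rough illustration of the baseline this abstract builds on, the sketch below shows plain power spectral subtraction with a spectral floor. It is not the paper's improved method: the noise estimate simply averages the quietest frames as a stand-in for envelope tracking, and the names `estimate_noise_power`, `spectral_subtract`, `alpha`, and `beta` are assumptions introduced here.

```python
def estimate_noise_power(frame_powers, fraction=0.2):
    """Average the lowest-energy frames, bin by bin, as a noise estimate.

    Stand-in for tracking the low-energy envelope; 'fraction' is a
    hypothetical tuning parameter, not taken from the paper.
    """
    n_keep = max(1, int(len(frame_powers) * fraction))
    # Rank frames by total energy and keep the quietest ones.
    quietest = sorted(frame_powers, key=sum)[:n_keep]
    n_bins = len(frame_powers[0])
    return [sum(f[b] for f in quietest) / n_keep for b in range(n_bins)]


def spectral_subtract(frame_power, noise_power, alpha=1.0, beta=0.01):
    """Subtract alpha * noise per bin, flooring at beta * original power."""
    return [max(p - alpha * n, beta * p)
            for p, n in zip(frame_power, noise_power)]


# Toy power spectra: 5 frames x 4 subbands (made-up numbers).
frames = [
    [1.0, 1.0, 1.0, 1.0],  # noise only
    [1.0, 9.0, 1.0, 1.0],  # extra energy in subband 1
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 1.0],  # extra energy in subband 2
    [1.0, 1.0, 1.0, 1.0],
]
noise = estimate_noise_power(frames)
enhanced = [spectral_subtract(f, noise) for f in frames]
```

The floor `beta * p` keeps bins from going negative after subtraction, which is the usual way to limit the "musical noise" artifacts of the basic method.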

3.
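Speech recognition based on an improved speech feature extraction method   Cited: 1 (self-citations: 1, by others: 0)
Building on an analysis of speech feature extraction methods, an improved combined algorithm is proposed, with an HMM acoustic model and the Viterbi algorithm used for pattern training and recognition. Experimental results show that the algorithm is fairly robust in noisy environments, effectively raises the accuracy of Chinese continuous speech recognition under noise, and improves overall recognition performance, so it has practical value for speech recognition systems operating in noisy environments.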

4.
From the linear prediction HMM to a new hybrid model for speech recognition   Cited: 1 (self-citations: 0, by others: 1)
欧智坚  王作英 《电子学报》2002,30(9):1313-1316
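Unlike the conventional HMM, the linear prediction HMM (LPHMM) does not assume that state outputs are independent and identically distributed, yet its recognition performance in practice is poor. By analyzing the respective strengths and weaknesses of the two HMMs, this paper proposes a new hybrid model for speech recognition that describes the static characteristics of speech (based on the conventional HMM) and its dynamic characteristics (based on the LPHMM) separately while combining them organically. The model captures real speech phenomena more accurately, while requiring only minor changes to the system implementation and a modest amount of computation. Experiments on speaker-independent, large-vocabulary continuous Mandarin speech recognition show that the hybrid model performs significantly better than both the LPHMM and the conventional HMM. On the theoretical side, the paper also gives a set of closed-form parameter re-estimation formulas for the LPHMM.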

5.
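To address the detection and tracking of letter hand gestures, the article proposes a gesture recognition algorithm based on the Hausdorff distance under a maximum likelihood criterion. The algorithm first binarizes the letter gesture image and extracts the key points of the gesture (finger roots and fingertips) from the edge information of the image. It then recognizes the gesture using the Hausdorff distance based on the maximum likelihood criterion, with a search strategy similar to the multi-resolution search proposed by Rucklidge, which significantly shortens search time without affecting the success rate or the target localization accuracy. Experimental results show that the method recognizes letter gestures well and also works well on partially deformed (rotated and scaled) gestures.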
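For readers unfamiliar with the underlying measure, the following is a minimal sketch of the classical (not maximum likelihood) Hausdorff distance between two 2-D point sets, such as a template's key points and those extracted from an image. The point coordinates and function names are made up for illustration, and the multi-resolution search is not shown.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest neighbour in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed values."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical key points (e.g. fingertips) of a template and an observation.
template = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
observed = [(0.0, 0.0), (1.0, 0.0), (0.0, 4.0)]

d_forward = directed_hausdorff(template, observed)  # template -> observed
d_total = hausdorff(template, observed)
```

Because the classical distance takes a maximum, a single outlying point (here `(0.0, 4.0)`) dominates the result, which is one motivation for more robust variants such as the likelihood-based one the abstract describes.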

6.
7.
吴斌  袁亚博  汪勃 《电子与信息学报》2016,38(10):2546-2552
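To address the low accuracy of modulation recognition for continuous phase modulation (CPM) signals, which are nonlinear with memory, this paper proposes a new maximum likelihood modulation recognition method for CPM signals based on a memory factor. The method defines mapped symbols with the time-homogeneous Markov property and constructs the memory factor by computing their posterior probabilities. Combining this with CPM decomposition and the EM algorithm, it derives a CPM signal likelihood function that is separable in time and allows the channel parameters to be estimated. The method needs few symbols, works over a wide range of signal-to-noise ratios, recognizes many types of CPM signals with high accuracy, and is robust to phase error. Simulation results show that with 200 symbols, an SNR of 0 dB, and arbitrary phase error, the method achieves a recognition rate above 95% on 8 types of CPM signals.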

8.
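Large-aperture flat mirrors are an important part of optical systems, and their surface figure accuracy strongly affects imaging quality. Subaperture stitching is a common way to test large-aperture flat optical mirrors, and the stitching algorithm is the core of the technique. This work studies stitching algorithms for flat subapertures and establishes a stitching algorithm and mathematical model based on maximum likelihood estimation and orthogonalized Zernike polynomial fitting, with which large-aperture flat mirrors can be tested effectively by stitching. A corresponding stitching program was written, and a 120 mm flat mirror was tested by stitching with a 100 mm interferometer. Comparing the stitched result with a full-aperture test shows that the RMS value of the stitched full-aperture phase distribution deviates from that of the full-aperture test by 0.002, which verifies the reliability and accuracy of the algorithm.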

9.
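An improved maximum likelihood algorithm for LFM signal parameter estimation   Cited: 1 (self-citations: 0, by others: 1)
To detect noisy LFM signals quickly and estimate their parameters accurately, an improved maximum likelihood estimation algorithm based on delayed-correlation dechirping is proposed. The signal is first dechirped in the time domain by delayed correlation, and a classical power spectrum estimate of the dechirped noisy signal gives a coarse estimate of the chirp rate. Using this coarse estimate as the initial value, maximum likelihood estimation then yields an accurate chirp-rate estimate. The original LFM signal is dechirped with this accurate estimate, and the same approach gives an accurate maximum likelihood estimate of the initial frequency. Simulation experiments demonstrate the effectiveness of the algorithm.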
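The coarse stage described above can be sketched as follows, assuming a noise-free complex LFM model x(t) = exp(j(2π f0 t + π k t²)). Multiplying a delayed copy by the conjugate of the signal leaves a single tone at frequency k·τ, so the peak of its power spectrum gives a coarse chirp rate. The function names, sample rate, and signal parameters are illustrative assumptions, a plain DFT stands in for the power spectrum estimator, and the maximum likelihood refinement is not shown.

```python
import cmath
import math


def lfm(n_samples, fs, f0, k):
    """Noise-free complex LFM signal exp(j(2*pi*f0*t + pi*k*t^2))."""
    return [
        cmath.exp(1j * (2 * math.pi * f0 * (n / fs)
                        + math.pi * k * (n / fs) ** 2))
        for n in range(n_samples)
    ]


def coarse_chirp_rate(x, fs, delay):
    """Coarse chirp-rate estimate from the delayed-correlation product."""
    tau = delay / fs
    # x[n + delay] * conj(x[n]) is a single tone at frequency k * tau.
    y = [x[n + delay] * x[n].conjugate() for n in range(len(x) - delay)]
    m = len(y)
    best_bin, best_power = 0, -1.0
    for b in range(m // 2):  # positive frequencies only (assumes k > 0)
        acc = sum(y[n] * cmath.exp(-2j * math.pi * b * n / m)
                  for n in range(m))
        power = abs(acc) ** 2
        if power > best_power:
            best_bin, best_power = b, power
    # Convert the peak bin to Hz, then divide by the delay to get Hz/s.
    return best_bin * fs / m / tau


fs = 1000.0                       # sample rate in Hz (assumed)
x = lfm(500, fs, f0=50.0, k=250.0)
k_hat = coarse_chirp_rate(x, fs, delay=100)
```

The resolution of this coarse estimate is one DFT bin divided by the delay (here 2.5 Hz / 0.1 s = 25 Hz/s), which is why a finer search started from it is needed for an accurate value.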

10.
孙暐  吴镇扬 《信号处理》2006,22(4):559-563
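Following the research of Fletcher et al., subband recognition methods based on the assumption of perceptual independence have been used for noise-robust speech recognition. This paper extends the subband approach and adopts a multi-band framework based on a noise-contamination assumption to reduce the effect of noise. The paper analyzes theoretically the potential advantage in recognition performance of the multi-band framework under the noise-contamination assumption, and also proposes a robust speech recognition algorithm for the multi-band setting. The study shows that the multi-band framework avoids the need for the independent-perception assumption and, compared with the subband approach, reduces the effect of noise more effectively and improves recognition performance.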

11.
A noise-adaptive multi-stream composite subband speech recognition method   Cited: 3 (self-citations: 0, by others: 3)
张军  韦岗 《电子与信息学报》2006,28(7):1183-1187
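First, to address the limitation on usable features of the marginalisation technique in existing missing-data speech recognition, a reliability estimation method for cepstral feature components is proposed, extending marginalisation to commonly used cepstral speech recognition systems. Then, exploiting the complementary performance in different noises of marginalisation recognizers based on full-band and subband cepstral features, a noise-adaptive multi-stream composite subband speech recognition method is proposed. Experimental results show that the proposed method adaptively selects whichever of the full-band and subband data streams is less affected by noise and uses it as the main basis for recognition, effectively improving the robustness of the recognition system in varying noise environments.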

12.
For the acoustic models of embedded speech recognition systems, hidden Markov models (HMMs) are usually quantized and the original full space distributions are represented by combinations of a few quantized distribution prototypes. We propose a maximum likelihood objective function to train the quantized distribution prototypes. The experimental results show that the new training algorithm and the link structure adaptation scheme for the quantized HMMs reduce the word recognition error rate by 20.0%.

13.
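Speech endpoint detection based on cumulative changes in subband energy   Cited: 1 (self-citations: 0, by others: 1)
Speech endpoint detection in noisy environments plays a very important role in robust speech recognition. Based on how the cumulative distributions of subband energy differ between noise and speech, a new endpoint detection algorithm for speech signals is proposed. It computes the degree of subband energy change in each frame and sets a threshold on this measure to detect speech endpoints. Experiments show that, compared with some conventional endpoint detection algorithms, the algorithm is faster and more resistant to noise, and is suitable for endpoint detection at low signal-to-noise ratios.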

14.
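Sparse coding (SRC) is a method used for face recognition. It represents the test image as a sparse linear combination of a set of training samples and measures the accuracy of the representation by the L2 or L1 norm of the residual. This model assumes that the coding residual follows a Gaussian or Laplacian distribution, which in practice does not describe the coding error accurately. This paper proposes a new sparse coding method formulated as a constrained regression problem. Maximum likelihood sparse coding (MSC) seeks the maximum likelihood estimate of the parameters of this model and is highly robust to outliers. Experimental results on the Yale and ORL face databases demonstrate the effectiveness and robustness of the method under face blur and under changes in illumination and expression.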

15.
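To improve the accuracy of voice activity detection (VAD) at low signal-to-noise ratios, a VAD algorithm based on subband long-term signal variability features is proposed. The speech signal is transformed to the frequency domain and split into several non-overlapping subbands, and long-term signal variability features are extracted from each subband signal. GMMs are then used to build speech and non-speech models online, and the VAD decision is made from the likelihood ratio of the two models. Experimental results show that the algorithm significantly improves VAD accuracy at low signal-to-noise ratios and is fairly robust across a variety of noise environments and SNR conditions. Experiments in a speech recognition system show that the algorithm effectively improves the recognition rate in noisy environments.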

16.
The maximum likelihood linear spectral transformation (ML‐LST) using a numerical iteration method has been previously proposed for robust speech recognition. The numerical iteration method is not appropriate for real‐time applications due to its computational complexity. In order to reduce the computational cost, the objective function of the ML‐LST is approximated and a closed‐form solution is proposed in this paper. It is shown experimentally that the proposed closed‐form solution for the ML‐LST can provide rapid speaker and environment adaptation for robust speech recognition.

17.
The principal cause of speech recognition errors is a mismatch between trained acoustic/language models and input speech due to the limited amount of training data in comparison with the vast variation of speech. It is crucial to establish methods that are robust against voice variation due to individuality, the physical and psychological condition of the speaker, telephone sets, microphones, network characteristics, additive background noise, speaking styles, and other aspects. This paper overviews robust architecture and modeling techniques for speech recognition and understanding. The topics include acoustic and language modeling for spontaneous speech recognition, unsupervised adaptation of acoustic and language models, robust architecture for spoken dialogue systems, multi-modal speech recognition, and speech summarization. This paper also discusses the most important research problems to be solved in order to achieve ultimate robust speech recognition and understanding systems. Dr. Sadaoki Furui is currently a Professor at Tokyo Institute of Technology, Department of Computer Science. He is engaged in a wide range of research on speech analysis, speech recognition, speaker recognition, speech synthesis, and multimodal human-computer interaction and has authored or coauthored over 450 published articles. From 1978 to 1979, he served on the staff of the Acoustics Research Department of Bell Laboratories, Murray Hill, New Jersey, as a visiting researcher working on speaker verification. He is a Fellow of the IEEE, the Acoustical Society of America and the Institute of Electronics, Information and Communication Engineers of Japan (IEICE). He was President of the Acoustical Society of Japan (ASJ) from 2001 to 2003 and the Permanent Council for International Conferences on Spoken Language Processing (PC-ICSLP) from 2000 to 2004. He is currently President of the International Speech Communication Association (ISCA). 
He was a member of the Board of Governors of the IEEE Signal Processing Society from 2001 to 2003. He has served on the IEEE Technical Committees on Speech and MMSP and on numerous IEEE conference organizing committees. He has served as Editor-in-Chief of both Journal of Speech Communication and the Transactions of the IEICE. He is an Editorial Board member of Speech Communication, the Journal of Computer Speech and Language, and the Journal of Digital Signal Processing. He has received the Yonezawa Prize and the Paper Awards from the IEICE (1975, 88, 93, 2003), and the Sato Paper Award from the ASJ (1985, 87). He has received the Senior Award from the IEEE ASSP Society (1989) and the Achievement Award from the Minister of Science and Technology, Japan (1989). He has received the Technical Achievement Award and the Book Award from the IEICE (2003, 1990). He has also received the Mira Paul Memorial Award from the AFECT, India (2001). In 1993 he served as an IEEE SPS Distinguished Lecturer. He is the author of “Digital Speech Processing, Synthesis, and Recognition” (Marcel Dekker, 1989, revised, 2000) in English, “Digital Speech Processing” (Tokai University Press, 1985) in Japanese, “Acoustics and Speech Processing” (Kindai-Kagaku-Sha, 1992) in Japanese, and “Speech Information Processing” (Morikita, 1998) in Japanese. He edited “Advances in Speech Signal Processing” (Marcel Dekker, 1992) jointly with Dr. M.M. Sondhi. He has translated into Japanese “Fundamentals of Speech Recognition,” authored by Drs. L.R. Rabiner and B.-H. Juang (NTT Advanced Technology, 1995) and “Vector Quantization and Signal Compression,” authored by Drs. A. Gersho and R. M. Gray (Corona-sha, 1998).

18.
Robust speech recognition technology: current status and prospects   Cited: 12 (self-citations: 0, by others: 12)
姚文冰  姚天任  韩涛 《信号处理》2001,17(6):484-493
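After briefly describing the background from which robust speech recognition emerged, this paper focuses on the main techniques, current state of research, and future directions of robust speech recognition in China and abroad. It first outlines the interference sources that degrade speech quality and reduce the robustness of speech recognition systems, along with their effects. It then reviews the technical approaches and current status of speech enhancement, robust speech feature extraction, feature-based and model-based compensation, microphone arrays, auditory processing modeled on the human ear, and audio-visual bimodal speech recognition. Finally, it discusses the future directions of robust speech recognition technology.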
