1.
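A Robust Visemic LDA-Based Dynamic Lip-Shape Feature and Audio-Visual Speech Recognition  Total citations: 4, self-citations: 0, citations by others: 4
Visual feature extraction is an active research topic in audio-visual speech recognition. This paper introduces a robust dynamic lip-shape feature based on Visemic LDA, which takes full account of both the changes in lip contour during articulation and the partition of visual units into visemes. The paper also proposes a method that uses speech recognition results to label the LDA training data automatically, which eliminates laborious manual labeling and avoids labeling errors. Experiments show that introducing the Visemic LDA visual feature into audio-visual speech recognition greatly improves the recognition rate under noisy conditions; when this visual feature is combined with a multi-stream HMM, the recognition rate still exceeds 80% under strong noise at a signal-to-noise ratio of 10 dB.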
2.
This paper discusses robust speech section detection using audio and video modalities. Most of today's speech recognition systems require speech section detection prior to any further analysis, and the accuracy of the detected speech sections is said to affect the speech recognition accuracy. Because audio modalities are intrinsically disturbed by audio noise, we have been researching video modality speech section detection by detecting deformations in speech organ images. Video modalities are robust to audio noise, but their detected sections are longer than audio speech sections because deformations in the related organs start before the speech to prepare for the articulation of the first phoneme, and also because the settling-down motion lasts longer than the speech. We have verified that the inaccurate sections caused by this excess length degrade the speech recognition rate by introducing insertion errors. To reduce insertion errors and enhance the robustness of speech detection, we propose a method that takes advantage of both modalities. In our experiment, the proposed method is confirmed to reduce the insertion error rate as well as increase the recognition rate in noisy environments.
3.
4.
Patrice Quinton 《Annales des Télécommunications》1977,32(9-10):323-336
The author describes the syntactic analyzer used in the KEAL system for continuous speech recognition. After a lexical analyzer detects the words in an utterance, the syntactic analyzer builds all possible syntactic structures according to a context-free grammar previously defined by means of a compiled metalanguage. In some cases the analyzer can correct errors such as the omission or insertion of phonemes by the phonemic analyzer, or the non-detection of short words by the lexical analyzer. The program currently recognizes 65% of utterances in simple dialogs. A few seconds are enough to recognize a sentence.
5.
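A Weighted Wavelet Denoising Method for Speaker Recognition  Total citations: 1, self-citations: 1, citations by others: 0
An improved wavelet denoising method is used for front-end processing of noisy speech and, in view of the characteristics of speaker recognition, each wavelet coefficient is weighted before wavelet reconstruction; recognition uses a GMM-based algorithm. Experimental results show that, compared with speaker recognition that uses MFCC alone as the recognition feature, the proposed method has a clear advantage for noisy speaker recognition. The method offers useful guidance for real-time speaker recognition.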
6.
The degree of speech impairment and the utility of computer recognition of impaired speech are evaluated separately and independently. Particular attention is paid to whether there is a relationship between naive listeners' subjective judgments of impaired speech and the performance of a laboratory version of a speech recognition system. Relating a speech impairment rating to speech recognition accuracy is a difficult task. Towards this end, a statistical causal model is proposed. The structure of this model is well suited to supporting inference, so it can be applied to various assessments, such as the success of automatic recognition of dysarthric speech. The application of the model is illustrated with a case study of a dysarthric speaker compared against a normal speaker serving as a control.
7.
We investigate the performance of an isolated word speech recognition (IWSR) system for degraded speech. We propose a recognition scheme which adapts itself to mild degradations in speech and improves the reliability of recognition significantly. The scheme does not use a priori information regarding the nature and extent of noise. We suggest techniques which adaptively discriminate between noisy and noise-free parameters by using a selective weighting procedure in the final distance calculation. A new measure of performance is adopted to compare several recognition schemes using small data sets. Our scheme lends itself to greater flexibility in handling degradations in speech input than do the existing recognition schemes.
8.
9.
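This paper studies fault-tolerance techniques in a large-vocabulary, speaker-independent continuous Chinese speech recognition and understanding system. First, the acoustic recognizer produces the N-best syllable candidates together with their acoustic-level probabilities, and the N-best candidates are assembled into a syllable lattice. A fault-tolerant language analyzer then searches the syllable lattice to find the optimal Chinese character string. Because additional candidate syllables are considered, the syllables of some characters in the optimal string may not be in the original lattice. In this way some acoustic-level errors are corrected and the robustness of the language analyzer is improved. Experiments show that the fault-tolerant analyzer raises character understanding accuracy from 91.83% to 94.15%. Compared with a conventional trigram-based analyzer without fault tolerance, the error rate falls by 28.4%.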
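The 28.4% figure in this abstract appears to be a relative error-rate reduction derived from the two accuracies (91.83% and 94.15%). The following small check works through that arithmetic; the function name is hypothetical, and reading 28.4% as a relative reduction is an assumption that the numbers happen to support.

```python
def relative_error_reduction(acc_before: float, acc_after: float) -> float:
    """Relative reduction in error rate (percent), given accuracies in percent.

    Assumption: error rate = 100 - accuracy, and the reduction is measured
    relative to the error rate before the change.
    """
    err_before = 100.0 - acc_before  # 8.17 for the figures in the abstract
    err_after = 100.0 - acc_after    # 5.85 for the figures in the abstract
    return (err_before - err_after) / err_before * 100.0


reduction = relative_error_reduction(91.83, 94.15)
print(f"{reduction:.1f}%")  # prints 28.4%
```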
10.
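Three methods for speaker-dependent isolated-word speech recognition under background noise are studied: front-end acoustic processing of the speech, spectral-transformation compensation based on canonical correlation analysis, and a same-modulus pole addition method. Experimental results show that all three methods effectively improve the speech recognition rate in noisy environments. The better of these methods reaches a recognition rate above 80% in strong noise (signal-to-noise ratio of 0 dB), which is promising for automatic speech recognition in low-SNR environments.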
11.
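One cause of misrecognition in speech recognition systems is inaccurate endpoint detection. At high signal-to-noise ratios, locating speech endpoints correctly is not difficult. Most practical speech recognition systems, however, must operate at low signal-to-noise ratios, where conventional endpoint detection methods, such as energy-based detection, do not work effectively. This paper uses cepstral features to detect speech endpoints and proposes two algorithms for endpoint detection in noisy speech: the first replaces short-time energy with cepstral distance as the quantity compared against the decision threshold, and the second improves hidden Markov model (HMM)-based speech detection so that it adapts to changes in the noise. Experimental results show that the method achieves highly accurate endpoint detection for noisy speech.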
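The idea of thresholding a cepstral distance instead of short-time energy can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the frame sizes, the number of leading noise-only frames, the threshold rule (mean plus a multiple of the standard deviation), and the particular distance formula (a commonly used weighted Euclidean form, written here without its dB scale factor) are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def real_cepstrum(frames, n_coeffs):
    """Low-order real cepstrum of each Hamming-windowed frame."""
    window = np.hamming(frames.shape[1])
    spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
    log_spectrum = np.log(spectrum + 1e-10)  # small floor avoids log(0)
    return np.fft.irfft(log_spectrum, axis=1)[:, :n_coeffs]


def cepstral_distance_vad(x, frame_len=256, hop=128, n_coeffs=13,
                          noise_frames=10, factor=3.0):
    """Flag frames whose cepstral distance to a noise estimate is large.

    Assumption: the first `noise_frames` frames contain noise only.
    Returns (flags, distances), one entry per frame.
    """
    ceps = real_cepstrum(frame_signal(x, frame_len, hop), n_coeffs)
    noise_ceps = ceps[:noise_frames].mean(axis=0)
    diff = ceps - noise_ceps
    # Weighted Euclidean cepstral distance (assumed form, unscaled).
    dist = np.sqrt(diff[:, 0] ** 2 + 2.0 * np.sum(diff[:, 1:] ** 2, axis=1))
    noise_dist = dist[:noise_frames]
    threshold = noise_dist.mean() + factor * noise_dist.std()
    return dist > threshold, dist
```

The first and last flagged frames give rough start and end points. A practical detector would also smooth the decisions and update the noise estimate over time, which this sketch omits.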
12.
13.
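Based on the characteristics of vehicular ad hoc networks, an anonymous secure authentication scheme based on elliptic-curve zero-knowledge proofs is proposed. A mutual anonymous authentication algorithm removes the need for the sender and receiver of a message to exchange signed certificates, which prevents node identity information from leaking over insecure channels. A message aggregation algorithm based on message authentication codes lets roadside units assist in authenticating messages in batches, which speeds up message authentication and prevents large numbers of messages from being lost under high traffic density because they cannot be authenticated in time. Analysis and simulation experiments show that the scheme provides privacy protection and traceability for vehicle nodes and ensures message integrity. Compared with existing anonymous secure authentication algorithms for vehicular networks, the scheme has lower message delay and message loss rate, as well as lower communication overhead.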
14.
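This paper proposes a new multi-stage search algorithm for on-chip speech recognition. The algorithm uses the continuous density hidden Markov model (Continuous Density HMM, CDHMM) as its basic recognition framework. While keeping the recognition rate essentially unchanged, it greatly reduces on-chip memory usage and shortens the recognition search time. For selecting candidate entries in the second recognition stage, a confidence-based selection method is proposed, which further improves recognition speed and robustness. On a recognition task with 200 voice commands, the system's recognition rate is 98.83%, and the algorithm still performs well when the vocabulary grows to 600 entries.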
15.
16.
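Spoken digit recognition achieves high recognition rates and has considerable practical value. To build a spoken digit recognition system that reaches a high recognition rate in real noisy environments, a duration-distribution-based hidden Markov model (DDBHMM) was used in speaker-dependent and speaker-independent digit recognition experiments in both quiet and noisy environments. The results show that DDBHMM-based digit recognition achieves a high recognition rate for both speaker-dependent and speaker-independent speech recorded in real, non-stationary noise.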
17.
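A Differential Composite Sub-band Speech Recognition Method for Noisy Conditions  Total citations: 4, self-citations: 0, citations by others: 4
Based on the fact that sub-band features reflect the local characteristics of the speech signal while full-band features reflect its global characteristics, this paper proposes a new differential composite sub-band speech recognition method. Spectral differencing is first applied to reduce noise interference; the recognition probabilities from multiple sub-band features are then combined with the full-band recognition probability to reach a joint decision that gives the final recognition result. The method is applied to recognition experiments on the ten English digits 0-9 from the TIMIT data set and on the E-Set, under white noise and F16 fighter noise from NoiseX92. The results show that the new method performs much better than conventional methods.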
18.
《IEEE transactions on information theory / Professional Technical Group on Information Theory》1975,21(4):423-430
A model of a linguistic information source is proposed as a grammar that generates a language over some finite alphabet. It is pointed out that grammatical sentences generated by the source grammar contain intrinsic "redundancy" that can be exploited for error correction. Symbols occurring in the sentences are composed according to syntactic rules determined by the source grammar, and hence are different in nature from the lexicographical source symbols assumed in information theory and algebraic coding theory. Almost all programming languages and some simple natural languages can be described by the linguistic source model proposed in this paper. In order to combat excessive errors on very noisy channels, a conventional encoding-decoding scheme that does not utilize the source structure is introduced into the communication system. Decoded strings coming out of the lexicographical decoder may not be grammatical, which indicates that some uncorrected errors still remain in the individual sentences; these strings are reprocessed by a syntactic decoder that converts ungrammatical strings into legal sentences of the source language by the maximum-likelihood criterion. Thus the syntactic decoder, using syntactic analysis, can correct more errors in the strings coming out of the noisy channel than the lexicographical decoder is capable of correcting or even detecting. To design the syntactic decoder we use parsing techniques from the study of compilers and formal languages.
19.
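A Robust Two-Stage Speech Onset Detection Method Based on Phonetic Knowledge  Total citations: 4, self-citations: 2, citations by others: 2
For a speech recognition system to be practical it must be highly robust to noise, and endpoint detection in noisy environments plays a key role in the performance of the whole system. A two-stage onset detection method based on phonetic knowledge is proposed. The first stage uses the short-time energy-to-zero-crossing ratio and the short-time spectral magnitude as preliminary detection features, with an adaptive threshold; the second stage determines the onset from the energy change at the speech onset and the duration of speech-like activity. Experimental results show that the method is robust in common noise environments and suitable for real-time use.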
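A first-stage detector of the kind described here can be sketched with one of the two named features, the short-time energy-to-zero-crossing ratio. This is only an illustration under stated assumptions: the ratio is taken as frame energy divided by (zero-crossing count + 1), the adaptive threshold is estimated from leading frames assumed to be noise only, and the function names and parameter values are hypothetical. The spectral-magnitude feature and the second-stage checks on energy change and duration are not shown.

```python
import numpy as np


def energy_zero_ratio(x, frame_len=256, hop=128):
    """Per-frame ratio of short-time energy to zero-crossing count.

    Assumed definition: energy / (zero_crossings + 1); the +1 avoids
    division by zero. Voiced speech tends to have high energy and few
    zero crossings, so the ratio rises sharply at a voiced onset.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    ratios = np.empty(n_frames)
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len]
        energy = np.sum(frame ** 2)
        signs = np.signbit(frame)
        zero_crossings = np.count_nonzero(signs[1:] != signs[:-1])
        ratios[i] = energy / (zero_crossings + 1)
    return ratios


def first_stage_onset(ratios, noise_frames=10, factor=3.0):
    """Index of the first frame above an adaptive threshold, or None.

    Assumption: the first `noise_frames` frames are noise only; the
    threshold is their mean plus `factor` standard deviations.
    """
    noise = ratios[:noise_frames]
    threshold = noise.mean() + factor * noise.std()
    above = np.flatnonzero(ratios > threshold)
    return int(above[0]) if above.size else None
```

The returned frame index is only a candidate onset; in a two-stage design it would then be confirmed or rejected by the second-stage criteria.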
20.
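Speaker-Dependent Medium-Vocabulary Continuous Speech Recognition Based on HMM/VQ  Total citations: 2, self-citations: 2, citations by others: 0
This paper discusses a continuous speech recognition method based on hidden Markov models (HMM) and vector quantization (VQ). In this method an HMM is built for each word, and the state-transition network formed by combining multiple models is searched for the best state-transition path, so that continuous speech is recognized without prior word segmentation. Using finite-state grammar constraints and several other measures to improve recognition performance, the demonstration system can recognize 18 English sentence patterns and 150 words for a specific speaker. In a test on 312 sentences (2,710 words in total), the recognition delay was 62% of the utterance duration, the average speaking rate was 2.32 words per second, and the word accuracy was 97.3%.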