20 similar documents found.
1.
2.
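One task of a Chinese speech understanding system is to convert the Chinese monosyllables produced by a speech recognition system into the correct Chinese characters, words, and even phrases and sentences, so that, together with the speech recognition system, it forms a complete speech-to-text conversion system. This paper uses a closed-loop feedback scheme for Chinese speech recognition and understanding and, building on the recognition and understanding of Chinese words, goes on to implement recognition and understanding of Chinese structural phrases, obtaining the expected results. Finally, the paper discusses the experimental results and the feedback-based speech recognition and understanding scheme.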
3.
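Mandarin Chinese is a tonal language, and tone information plays a very important role in continuous Mandarin speech recognition. Conventional tone recognition algorithms for continuous speech generally study only the tonal features of the four tones yinping, yangping, shangsheng, and qusheng, and seldom discuss the features of the zeroth tone (the neutral tone). Using the normalized autocorrelation function method, this work studies the characteristics of the fundamental frequency contours of neutral-tone syllables and gives some basic tonal features that can be used to recognize neutral-tone syllables.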
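The abstract names the normalized autocorrelation function method but gives no details, so the following is only a minimal sketch of how a per-frame fundamental frequency estimate of that kind is commonly computed. The function names, the 60 to 400 Hz search range, the voicing threshold, and the peak-picking rule are all assumptions made for illustration, not the authors' settings.

```python
import math

def normalized_autocorrelation(frame, lag):
    """Normalized cross-correlation between the frame and itself shifted by `lag`."""
    n = len(frame) - lag
    num = sum(frame[i] * frame[i + lag] for i in range(n))
    energy_a = sum(frame[i] ** 2 for i in range(n))
    energy_b = sum(frame[i + lag] ** 2 for i in range(n))
    denom = math.sqrt(energy_a * energy_b)
    return num / denom if denom > 0.0 else 0.0

def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0,
                voicing_threshold=0.5, peak_ratio=0.9):
    """Return an F0 estimate in Hz for one frame, or None if it looks unvoiced.

    All default values are illustrative assumptions.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    if lag_min < 1 or lag_max - lag_min < 2:
        return None
    scores = [normalized_autocorrelation(frame, lag)
              for lag in range(lag_min, lag_max + 1)]
    peak = max(scores)
    if peak < voicing_threshold:
        return None
    # Take the first local maximum close to the global peak; preferring the
    # shortest such lag avoids locking onto a multiple of the true period.
    for k in range(1, len(scores) - 1):
        s = scores[k]
        if s >= peak_ratio * peak and s >= scores[k - 1] and s >= scores[k + 1]:
            return sample_rate / (lag_min + k)
    return None

# Synthetic check: a 200 Hz sine sampled at 8 kHz, one 50 ms frame.
frame = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
f0 = estimate_f0(frame, 8000)
```

Applying `estimate_f0` to successive frames of a syllable yields a fundamental frequency contour, which is the kind of trajectory the abstract analyzes for neutral-tone syllables.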
4.
5.
6.
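This paper applies digital speech signal processing methods to study the distinctive features of Mandarin Chinese phonemes. The results further refine the distinctive feature matrix of Mandarin phonemes and provide a useful reference for phoneme-based computer analysis, synthesis, and recognition of Mandarin speech.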
7.
8.
9.
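Computer Speech Signal Processing and a Speech Recognition System (total citations: 5, self-citations: 0, citations by others: 5)
This paper discusses computer speech processing and the implementation of isolated digit recognition. Based on the characteristics of Chinese speech, monosyllabic Chinese characters are taken as the recognition targets, and recognition of the 10 digits is studied experimentally. The time-domain characteristics of the speech signal (mainly short-time frame energy, short-time zero-crossing rate, and frame energy difference) are observed, analyzed, and applied to speech endpoint detection, laying the groundwork for building the system. Differences in the power spectrum of the speech signal are chosen as features for template construction and recognition experiments. Test results show that the system performs fairly stably, with a single-digit recognition rate of up to 98.6% and a speaker recognition rate ...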
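The endpoint detection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system from the paper: the frame length, hop size, thresholds, and the decision rule are hypothetical, and the frame energy difference feature mentioned in the abstract is left out for brevity.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_endpoints(samples, frame_len=200, hop=100,
                     energy_ratio=0.1, zcr_threshold=0.3, zcr_energy_ratio=0.01):
    """Return (start_sample, end_sample) of the detected speech, or None.

    A frame counts as speech if its energy is high relative to the loudest
    frame, or if it has a high zero-crossing rate together with a little
    energy (a crude stand-in for unvoiced consonants). Thresholds are assumed.
    """
    frames = frame_signal(samples, frame_len, hop)
    if not frames:
        return None
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    peak = max(energies)
    if peak == 0:
        return None
    active = [k for k, (e, z) in enumerate(zip(energies, zcrs))
              if e >= energy_ratio * peak
              or (z >= zcr_threshold and e >= zcr_energy_ratio * peak)]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + frame_len

# Synthetic check: silence, then a tone, then silence.
samples = ([0.0] * 1000
           + [math.sin(2 * math.pi * n / 40) for n in range(1000)]
           + [0.0] * 1000)
endpoints = detect_endpoints(samples)
```

Because the decision is made per frame, the returned boundaries are only accurate to within one frame of the true onset and offset; a shorter hop tightens them at the cost of more computation.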
10.
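A Quasi-Continuous Hidden Markov Model Based on State Codebooks (total citations: 1, self-citations: 0, citations by others: 1)
To address the classical HMM's need for large amounts of training data and its algorithmic complexity, this paper proposes an improved model, the state-codebook-based quasi-continuous HMM (SCBHMM), which describes the acoustic characteristics of speech signals more effectively when training data is limited. By tying the state transition probabilities to the amount of dynamic spectral change, SCBHMM effectively combines the static and dynamic features of the speech signal. Extensive experiments on the standard speech database USTC94 demonstrate the effectiveness of SCBHMM for Chinese syllable recognition: it eases the model's training data requirements and greatly reduces the computation needed for training and recognition, while still achieving a fairly high recognition rate.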
11.
The author presents a study of large-vocabulary continuous Mandarin speech recognition based on a segmental probability model (SPM) approach. The SPM was found to be very suitable for recognition of isolated Mandarin syllables, especially considering the monosyllabic structure of the Chinese language. To extend the model to continuous Mandarin speech recognition, a concatenated syllable matching (CSM) algorithm is first introduced in place of the conventional Viterbi search algorithm. To use the available training material efficiently, a training procedure is also proposed that re-estimates the SPM parameters using the maximum a posteriori (MAP) algorithm. A few special techniques integrating acoustic and linguistic knowledge are then developed to improve performance step by step. Preliminary experimental results show that the final achievable rate is as high as 91.62%, which represents an 18.48% error rate reduction and is more than three times faster than the well-studied subsyllable-based CHMM
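The abstract does not say how the 18.48% error rate reduction is defined. A common convention is a relative reduction in error rate, and the sketch below shows only that arithmetic. It assumes the 91.62% figure is an accuracy-type rate, so that the error rate is 100 minus that value; both function names are hypothetical.

```python
def relative_error_reduction(baseline_rate, new_rate):
    """Relative reduction in error rate when the rate rises from baseline to new.

    Both arguments are recognition rates in percent; error = 100 - rate.
    """
    baseline_error = 100.0 - baseline_rate
    new_error = 100.0 - new_rate
    return (baseline_error - new_error) / baseline_error

def implied_baseline_rate(new_rate, reduction):
    """Baseline rate in percent implied by a new rate and a relative reduction."""
    new_error = 100.0 - new_rate
    baseline_error = new_error / (1.0 - reduction)
    return 100.0 - baseline_error

# Under the assumptions above, the baseline implied by the quoted figures.
baseline = implied_baseline_rate(91.62, 0.1848)
```

If the reduction was instead meant in absolute percentage points, this calculation does not apply.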
12.
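To address the slow processing, computational complexity, and operational difficulty of conventional helium speech processing techniques, a machine-learning-based helium speech recognition method is proposed. Deep networks learn high-dimensional information and extract multiple features, which not only addresses overfitting but also yields a low word error rate (WER) and fast convergence. First, isolated-word and continuous helium speech databases are built in-house and the helium speech data is preprocessed; the extracted speech features are mainly formant features, pitch period features, and FBank (filter bank) features. The features are then fed into an acoustic model composed of a deep convolutional neural network (DCNN) and connectionist temporal classification (CTC) to model the mapping from speech to pinyin, and finally a Transformer language model produces the Chinese character output. Compared with a model that uses only FBank features, the isolated-word helium speech recognition model that uses formant, pitch period, and FBank features reduces WER by 7.91%, and the continuous helium speech recognition model reduces WER by 14.95%. The best WER is 1.53% for the isolated-word model and 36.89% for the continuous model. The results show that the proposed method can recognize helium speech effectively.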
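WER is the evaluation metric throughout the abstract above. As a minimal sketch, it is conventionally computed as the edit distance between the reference and hypothesis token sequences divided by the reference length. The function name and the pinyin-like tokens are illustrative assumptions; the abstract does not specify the tokenization or scoring tool the authors used.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / len(reference).

    `reference` and `hypothesis` are sequences of tokens (words, characters,
    or pinyin syllables, depending on how the task is scored).
    """
    ref = list(reference)
    hyp = list(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")
    # Levenshtein distance, keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

# One substituted token out of four reference tokens.
wer = word_error_rate(["ni", "hao", "shi", "jie"], ["ni", "hao", "si", "jie"])
```

Since insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.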
13.
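With the development of large-vocabulary continuous speech recognition, more and more researchers choose initials and finals as recognition units. In initial/final-based continuous Mandarin speech recognition, accurate segmentation of the initial and final units is a very important step. Drawing on the acoustic properties of Chinese pronunciation, a strategy is proposed that combines an initial-based segmentation method with an inter-segment distance method. Experimental results show that the method achieves accurate segmentation.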
14.
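Dialect speech recognition is a core part of dialect preservation. Conventional dialect speech recognition models do not account for the importance of dialect-specific phonemes and lack extraction and fusion of multiple speech features, which leads to poor recognition performance. The end-to-end dialect speech recognition model proposed in this paper exploits the respective strengths of a residual CNN (convolutional neural network) and a Bi-LSTM (bidirectional long short-term memory) in extracting intra-frame and inter-frame speech features, and uses a multi-head self-attention mechanism to extract dialect-specific phoneme information from different dialects, forming low-level pronunciation features that are then used for dialect speech recognition. Experiments on two benchmark dialect speech corpora, Gan and Hakka, show that the proposed model significantly outperforms existing baseline models; visualizing the attention mechanism further explains the source of the performance gain.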
15.
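Tone recognition plays a key role as an important component of Chinese speech recognition. A new context-dependent model recognition method is proposed to improve the recognition rate in continuous Chinese speech. The paper first describes the extraction and processing of the pitch contour used for tone recognition, then proposes six features to describe the trend of the pitch contour and gives explicit formulas for computing them. Using these features, and taking into account the changes to the pitch contour caused by the correlation between neighboring syllables in continuous speech, refined tone models are built...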
16.
Philips Journal of Research, 1995, 49(4): 353-366
Large-vocabulary continuous-speech recognition (CSR) technology is at work. As an application of the technology, we describe a dictation system (DS). Input to the system is unrestricted spontaneous speech. No adaptation and no special skills are required to use the system. The DS transforms continuous speech into written text. It is essential in this application that users are free to speak as they usually do and to use their own wording and formulation. This implies speech recognition for large and open vocabularies, free syntax, and continuous speech. The paper attempts to determine what is feasible with today's technology and what will be feasible in the near future. The problems addressed are: what are the limits of today's technology, and what is needed to make the next step, i.e. moving towards real industrialization of CSR technology.
17.
18.
19.
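The acoustic features of whispered speech are an important part of research on whispered speech recognition and speaker recognition. This paper introduces the characteristics of whispered speech and discusses its acoustic features. Because whispered speech has no fundamental frequency, formants and duration can serve as important acoustic parameters for recognition. Six Chinese whispered vowels are analyzed, showing that formant frequency and duration can be used as feature parameters for whispered speech recognition.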
20.
This paper describes an automatic caption-superimposing system with a new continuous speech recognizer for efficient production of TV programs. The system recognizes continuous speech announced in a Japanese 'sumo' wrestling hall and automatically superimposes the recognized wrestlers' names and winning tricks as captions on a TV display. The announcements consist of sentences stating which wrestler has won a match and with what kind of winning trick. They are formed from a small vocabulary in a specific speaking style and are spoken, by a few specific speakers only, in units close to the Japanese 'bunsetsu', which resembles a phrase. We designed the system with the following features: (a) recognition of continuous speech in a specific speaking style; (b) easy change of the vocabulary to be recognized; (c) no pre-registration of any particular utterances; (d) implementation on multiple microprocessors with high computing speed. The proposed recognizer uses a general intra-'bunsetsu' grammar that is applicable to various recognition tasks, whereas conventional Japanese continuous speech recognizers use intra-'bunsetsu' grammars that depend on the recognition task. In a recognition experiment on 40 sentences of 'sumo' announcements by two speakers, the system attained a 'bunsetsu' accuracy of 91.0% with quasi-real-time processing.