1.
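Mandarin Chinese is a tonal language, and tone information plays a very important role in continuous Mandarin speech recognition. Conventional tone recognition algorithms for continuous speech generally study only the tonal features of the four lexical tones (yinping, yangping, shangsheng, and qusheng) and seldom discuss those of the zeroth tone, i.e. the neutral tone. Using the normalized autocorrelation function method, this work studies the characteristics of the fundamental frequency (F0) contours of neutral-tone syllables and presents some basic tonal features that can be used to recognize them.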
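The normalized autocorrelation approach mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `estimate_f0_nacf`, the 60 to 400 Hz search range, and the single-frame interface are all assumptions made for illustration.

```python
import numpy as np

def estimate_f0_nacf(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate the F0 (Hz) of one frame from the peak of its normalized autocorrelation.

    Returns (f0, peak_value); (0.0, 0.0) is returned for an all-zero frame.
    The search range f0_min..f0_max is an illustrative assumption.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()              # remove DC offset
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return 0.0, 0.0
    # Autocorrelation at non-negative lags, normalized so that nacf[0] == 1.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    nacf = acf / energy
    # Restrict the peak search to lags that correspond to plausible pitch periods.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag = lag_min + int(np.argmax(nacf[lag_min:lag_max + 1]))
    return sample_rate / best_lag, float(nacf[best_lag])
```

Applying such a function frame by frame would give an F0 contour; the peak value can serve as a rough voicing confidence, which may matter for short, weak neutral-tone syllables.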
2.
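Pitch (fundamental) frequency and formant frequency extraction is widely used in speech coding, speech synthesis, and speech recognition. Through an in-depth analysis of the time-domain and frequency-domain properties of speech signals, an effective F0 and formant extraction algorithm is designed around the characteristics of the speech magnitude spectrum. Parameter extraction tests on real speech signals show that the algorithm accurately extracts F0 and formant frequencies across different speakers and recording conditions.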
3.
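A method for emotional speech recognition based on LS-SVM is proposed. Parameters such as fundamental frequency, energy, and speaking rate are first extracted from the experimental speech signals as emotional features; LS-SVM is then used to build models of the corresponding emotional speech signals and perform recognition. Experimental results show that LS-SVM achieves a relatively high recognition rate for basic emotions.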
4.
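Separating and extracting the fundamental frequencies of mixed (overlapping) speech is an important part of auditory scene analysis systems. Previous sub-band autocorrelation methods for this task all assume that each frequency band is dominated by only one of the mixed signals, whereas in practice a band is often influenced by both signals at once. This paper therefore proposes a new algorithm for separating and extracting the F0s of mixed speech. When searching for candidate band groups, the algorithm uses a closed-loop adaptive band selection module and determines two potential F0s from the F0 and periodicity of each band group, which makes the search for potential F0s more robust. It then uses the two potential F0s to re-assign each band to a source, separating the signals and extracting their F0s, which improves extraction accuracy. Experimental results show that the new algorithm achieves high accuracy in extracting valid F0s.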
5.
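The absence of quasi-periodic vocal fold vibration makes Chinese whispered speech a special mode of phonation and means that whispered tones cannot be characterized by the pitch period. The conventional speech features currently used for speech recognition and speaker (voiceprint) recognition carry little tone information, so they rarely perform well in tone recognition experiments. This paper proposes a new feature parameter that emulates the F0 tone contour of normal speech. Starting from the characteristics of human hearing, it studies the Bark bands to which tone perception is sensitive and finds that a fitted curve of the normalized energy ratio of a partially spread Bark spectrum exhibits a contour similar to the F0 contour of normal speech, which suggests that this contour carries, to some extent, the tone information of whispered speech. In whispered tone recognition experiments using this contour and the short-time energy curve as features and a neural network as the model, a relatively high recognition accuracy was obtained: the overall accuracy for the four Mandarin tones reached 78%, which also provides solid support for further processing of whispered speech.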
6.
7.
8.
9.
Detecting speech fundamental frequency with an improved SIFT method  Total citations: 2, self-citations: 1, citations by others: 1
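An improved SIFT (simplified inverse filter tracking) method is proposed. During LPC analysis and inverse filtering, the speech data are downsampled at different ratios, which both preserves the accuracy of the autocorrelation computed after inverse filtering and saves computation time. When searching for the pitch peak on the autocorrelation curve, a continuous adaptive threshold is used, which extends the frequency range and improves the accuracy of F0 detection. The F0 contour is processed with a four-point nonlinear smoothing algorithm, which effectively removes outliers ("wild points") from peak detection. After time normalization of the F0 contour, the final F0 data are obtained. Further experiments show that the method has good F0 detection performance.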
10.
11.
12.
Design of a MATLAB-based speech enhancement system  Total citations: 1, self-citations: 0, citations by others: 1
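Speech enhancement is an important part of signal processing. In many speech processing applications, such as mobile communications, speech recognition, and hearing aids, speech signals have to be processed in noisy environments. Over the past few decades, many methods have been proposed to remove noise and reduce speech distortion, including spectral subtraction, wavelet-based methods, hidden Markov model methods, and signal subspace methods. Because wavelet analysis can examine a signal in the time and frequency domains simultaneously, it can denoise signals effectively. This paper presents a design for a speech enhancement system that combines the least mean square (LMS) algorithm with the wavelet transform to denoise noisy speech, and builds a model of the system in the MATLAB Simulink environment. Simulation of the model shows that the method denoises effectively, laying a theoretical foundation for a hardware implementation of the system.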
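As a rough illustration of the LMS half of this design only (the wavelet stage and the Simulink model are not reproduced), the following sketch shows a textbook LMS adaptive filter. The function name `lms_filter`, the tap count, and the step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def lms_filter(reference, desired, num_taps=8, step_size=0.01):
    """Textbook LMS adaptive FIR filter (hypothetical helper, not the paper's model).

    reference: input to the adaptive filter (e.g. a noise reference).
    desired:   signal to be approximated (e.g. noisy speech).
    Returns (error, weights); in a noise-cancelling setup the error
    signal is the enhanced output.
    """
    reference = np.asarray(reference, dtype=float)
    desired = np.asarray(desired, dtype=float)
    weights = np.zeros(num_taps)
    error = np.zeros(len(desired))
    for n in range(num_taps - 1, len(desired)):
        # Most recent num_taps reference samples, newest first.
        x = reference[n - num_taps + 1:n + 1][::-1]
        estimate = weights @ x
        error[n] = desired[n] - estimate
        # Stochastic gradient step on the squared error.
        weights += 2.0 * step_size * error[n] * x
    return error, weights
```

The step size trades convergence speed against stability; it has to stay small relative to the reference signal power, so a normalized variant (NLMS) is often preferred when the input level varies.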
13.
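Electrolaryngeal (EL) speech suffers from several defects, including a monotonous fundamental frequency, mechanical-sounding voicing, and strong radiated noise, which severely impair its intelligibility and naturalness; the problem is especially serious for tonal languages such as Mandarin Chinese. Recognition of Mandarin EL speech suffers from consonant confusion, and the recognition output carries no tones. Building on the recognition output, this paper therefore designs a pinyin spelling corrector and a tone labeling tool and combines them with a Tacotron-2-based TTS system to convert EL speech into normal speech. Objective evaluation shows that the pinyin spelling corrector improves pinyin accuracy and that tone labeling is highly accurate when semantic context is available. Subjective listening tests show that the proposed method improves the intelligibility and naturalness of Mandarin EL speech at different linguistic levels. The results indicate that the method can convert toneless EL speech into normal speech and outperforms conventional voice conversion methods.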
14.
Speech enhancement algorithms play an important role in speech signal processing, and many have been studied over the past several decades. A speech enhancement algorithm uses a noise removal method and a statistical model filter to analyze the speech signal in the frequency domain; spectral subtraction and Wiener filtering are representative examples. These algorithms enhance speech well, but their performance deteriorates under specific noise types or in low signal-to-noise ratio (SNR) environments. In addition, when the noise estimate is erroneous, noise remains in the voice signal, so the spectrum corresponding to the voice signal is distorted or a frame corresponding to the voice signal cannot be recovered, and voice recognition performance deteriorates. This deterioration arises from the mismatch between the speech being recognized and the training model. We use a silence-feature normalization model to improve the recognition rate under this mismatch in noisy environments. Conventional silence-feature normalization has the problem that the energy of the silent part increases, which blurs the boundaries used to categorize the voice and degrades recognition performance. In this study, we use the cepstrum feature of the noise signal in the silence-feature normalization model to set a reference value for voiced/unvoiced classification, improving silence-feature normalization for signals with a low SNR. The resulting recognition rates are higher than those of the other methods compared.
15.
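To improve the recognition rate of speech signals, an improved LPCC parameter extraction method is proposed. The speech signal is first pre-emphasized, framed, and windowed, and then decomposed by wavelet transform; LPCC parameters are extracted on that basis to form a new vector that serves as the feature parameters of each frame. Finally, a Gaussian mixture model (GMM) is used for speaker recognition; experiments show that the new feature parameters achieve a good recognition rate.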
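The front-end steps named in this abstract (pre-emphasis, framing, windowing) can be sketched as follows. The wavelet decomposition and LPCC stages are not shown, and the function name `preemphasize_and_frame`, the pre-emphasis coefficient 0.97, the frame sizes, and the Hamming window are common defaults assumed here rather than values taken from the paper.

```python
import numpy as np

def preemphasize_and_frame(signal, frame_length=256, hop_length=128, alpha=0.97):
    """Pre-emphasize a signal, split it into overlapping frames, and window each frame.

    All parameter defaults are illustrative assumptions.
    Returns an array of shape (num_frames, frame_length).
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_length:
        raise ValueError("signal is shorter than one frame")
    # Pre-emphasis: y[n] = x[n] - alpha * x[n - 1], with y[0] = x[0].
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Build an index matrix with one row of sample indices per frame.
    num_frames = 1 + (len(emphasized) - frame_length) // hop_length
    starts = hop_length * np.arange(num_frames)
    index = starts[:, None] + np.arange(frame_length)[None, :]
    # Apply a Hamming window to every frame.
    return emphasized[index] * np.hamming(frame_length)
```

Samples past the last full frame are dropped in this sketch; padding the tail instead would be an equally reasonable choice.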
16.
Borgatti M. Felici M. Ferrari A. Guerrieri R. 《Solid-State Circuits, IEEE Journal of》1998,33(7):1082-1089
In this paper, a low-power, low-voltage speech processing system is presented. The system is intended to be used in remote speech recognition applications where feature extraction is performed on the terminal and high-complexity recognition tasks are moved to a remote server accessed through a radio link. The proposed system is based on a CMOS feature extraction chip for speech recognition that computes 15 cepstrum parameters every 8 ms and dissipates 30 μW at a 0.9-V supply. Single-cell battery operation is achieved. Processing relies on a novel feature extraction algorithm using 1-bit A/D conversion of the input speech signal. The chip has been implemented as a gate array in a standard 0.5-μm, three-metal CMOS technology. The average energy required to process a single word of the TI46 speech corpus is 10 μJ. It achieves recognition rates over 98% in isolated-word speech recognition tasks.
17.
Signal modeling techniques in speech recognition  Total citations: 13, self-citations: 0, citations by others: 13
Picone J.W. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1993,81(9):1215-1247
A tutorial on signal processing in state-of-the-art speech recognition systems is presented, reviewing those techniques most commonly used. The four basic operations of signal modeling, i.e. spectral shaping, spectral analysis, parametric transformation, and statistical modeling, are discussed. Three important trends that have developed in the last five years in speech recognition are examined. First, heterogeneous parameter sets that mix absolute spectral information with dynamic, or time-derivative, spectral information have become common. Second, similarity transform techniques, often used to normalize and decorrelate parameters in some computationally inexpensive way, have become popular. Third, the signal parameter estimation problem has merged with the speech recognition process so that more sophisticated statistical models of the signal's spectrum can be estimated in a closed-loop manner. The signal processing components of these algorithms are reviewed.
18.
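The acoustic features of whispered speech are an important part of research on whispered speech recognition and speaker recognition. This paper introduces the characteristics of whispered speech and discusses its acoustic features. Because whispered speech has no fundamental frequency, formant and duration characteristics can serve as important acoustic parameters for recognition. An analysis of six whispered Chinese vowels shows that formant frequency and duration can be used as feature parameters for whispered speech recognition.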
19.
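Voiced/unvoiced decision based on recurrence quantification analysis  Total citations: 2, self-citations: 0, citations by others: 2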
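In speech signal processing, the accuracy of the voiced/unvoiced decision directly affects the quality of subsequent processing. By analyzing how the dynamical physical models of different speech phonemes appear in their recurrence plots, this paper computes statistics of two recurrence quantification analysis features, determinism and the normalized longest diagonal line, and finds significant differences between voiced and unvoiced sounds. Flexible, well-chosen thresholds are then set to classify speech as voiced or unvoiced, with good experimental results. Compared with other conventional decision methods, the misclassification rate is markedly lower, offering a new approach for speech feature extraction and recognition research.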
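To make the determinism feature mentioned in this abstract concrete, here is a minimal sketch of a recurrence plot and its determinism measure. It follows the standard definitions of recurrence quantification analysis rather than the paper's exact procedure; the embedding dimension, delay, radius, and minimum line length are illustrative assumptions, and the normalized longest diagonal line and the decision thresholds are not shown.

```python
import numpy as np

def recurrence_matrix(x, dim=3, delay=2, radius=0.2):
    """Boolean recurrence plot of a time-delay embedding of x.

    radius is a fraction of the largest pairwise distance (an assumption).
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    # Each row is one embedded state vector (x[i], x[i+delay], ...).
    emb = np.stack([x[i * delay:i * delay + n] for i in range(dim)], axis=1)
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    return dist <= radius * dist.max()

def determinism(rec, min_line=2):
    """Fraction of off-diagonal recurrence points lying on diagonal lines
    of length >= min_line (upper triangle only; the plot is symmetric)."""
    points_in_lines = 0
    total_points = 0
    for k in range(1, rec.shape[0]):
        diag = rec.diagonal(k).astype(int)
        total_points += diag.sum()
        # Locate runs of consecutive ones along this diagonal.
        changes = np.diff(np.concatenate(([0], diag, [0])))
        lengths = np.flatnonzero(changes == -1) - np.flatnonzero(changes == 1)
        points_in_lines += lengths[lengths >= min_line].sum()
    return points_in_lines / total_points if total_points else 0.0
```

A periodic, voiced-like frame tends to produce long diagonal lines and therefore a determinism close to 1, while a noise-like frame scatters its recurrence points and scores lower; a threshold on this value is one way a voiced/unvoiced decision could be made.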