18 similar documents found (search time: 78 ms)
1.
2.
《电子技术与软件工程》2019,(20)
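Tibetan speech signal processing is one of the key technologies for bringing artificial intelligence to Tibetan speech. A speaker's pronunciation and intuitive judgments differ to some degree from the actual rules of pronunciation. Quantitatively analyzing the feature parameters of continuous Tibetan speech reflects its pronunciation patterns more objectively and precisely. The paper introduces the Praat speech analysis software and its applications in speech processing research and pronunciation teaching. Using Praat as a platform, it simulates and analyzes acoustic parameters such as intensity, intonation, spectral characteristics, and pitch contour in recorded sentences of continuous Tibetan speech, providing a reference for continuous Tibetan speech signal processing and for teaching Tibetan pronunciation and listening.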
3.
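This paper presents an improved LPC speech coding algorithm for implementing a low-bit-rate vocoder. Compared with the conventional LPC vocoder algorithm, it introduces improvements in parameter extraction and synthesis that greatly improve the quality of the synthesized speech. After the introduction, the paper outlines the considerations behind the improvements and then gives the encoder and decoder algorithms, focusing on the proposed new algorithm that uses dynamic programming for pitch extraction and smoothing, and on the mixed-excitation algorithm at the synthesis end. The algorithm has been implemented as a single-chip codec on the TMS320C25.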
4.
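Speech recognition means using a computer to identify the content expressed by a speech signal; its goal is to accurately understand the meaning the speech carries. This paper focuses on feature extraction in the speech recognition process. Among the various feature extraction methods, LPC cepstral coefficients are chosen as the feature parameters: they largely remove the excitation information of the speech production process and mainly reflect the vocal tract model, and only a dozen or so cepstral coefficients suffice to describe the formant characteristics of speech well. The speech signal undergoes pre-emphasis, framing, windowing, and autocorrelation analysis, after which the LPC cepstral coefficients are extracted. A VC program written according to this procedure analyzes and processes the speech signal, removing redundant information irrelevant to recognition and thereby obtaining the information that matters for speech recognition.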
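The processing chain named in this abstract (pre-emphasis, framing, windowing, autocorrelation analysis, LPC cepstrum) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the paper's VC program: the function names, the 0.97 pre-emphasis factor, the Hamming window, and the frame and hop lengths are all assumptions chosen for the example. The predictor convention assumed is s[n] ≈ Σ a[k]·s[n-k].

```python
import numpy as np

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation r[0..order] -> a[1..order]."""
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)              # a[0] is unused padding
    err = r[0]                           # prediction error energy (must be > 0)
    for i in range(1, order + 1):
        # Reflection coefficient for stage i
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        a[i] = k
        a[1:i] = prev[1:i] - k * prev[i - 1:0:-1]
        err *= 1.0 - k * k
    return a[1:]

def lpc_to_cepstrum(a, n_ceps):
    """Recursion from predictor coefficients a[1..p] to cepstrum c[1..n_ceps]."""
    p = len(a)
    ap = np.concatenate(([0.0], np.asarray(a, dtype=float)))
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = ap[n] if n <= p else 0.0
        for m in range(max(1, n - p), n):
            acc += (m / n) * c[m] * ap[n - m]
        c[n] = acc
    return c[1:]

def extract_lpcc(signal, frame_len=240, hop=80, order=10, n_ceps=12, preemph=0.97):
    """Return an array of shape (n_frames, n_ceps) of LPC cepstral coefficients."""
    x = np.asarray(signal, dtype=float)
    # Pre-emphasis: y[n] = x[n] - preemph * x[n-1]
    x = np.append(x[0], x[1:] - preemph * x[:-1])
    window = np.hamming(frame_len)
    feats = []
    for start in range(0, len(x) - frame_len + 1, hop):   # framing
        frame = x[start:start + frame_len] * window       # windowing
        # Autocorrelation at lags 0..order
        r = [np.dot(frame[:frame_len - k], frame[k:]) for k in range(order + 1)]
        feats.append(lpc_to_cepstrum(levinson(r, order), n_ceps))
    return np.array(feats)
```

A silent (all-zero) frame would make the error energy zero, so a real front end would need an energy check before calling `levinson`.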
5.
6.
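This paper studies methods for extracting speech parameters for distributed speech recognition (DSR) and analyzes the performance of those parameters. Most speech parameters used previously were LPC cepstral parameters, but their noise robustness is poor. The paper mainly discusses MEL cepstral parameters and compares the performance of the two in a mobile communication environment.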
7.
8.
9.
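Extracting the speech signal features of intelligent robots is critical to ensuring that the robots operate normally. To shorten feature parameter extraction time and improve extraction accuracy, a VMD-based method for extracting speech signal feature parameters of intelligent robots is proposed. VMD is used to build a pulse digital model of the speech signal, analyze the frequency of the real-time system, and determine the signal transfer function. Using VMD's distinctive molecular programming mode, the intelligent robot is treated as a large biomolecule for feature parameter analysis of the speech signal; a molecular visualization program is built, and the unstable speech signal is broken apart for analysis and processing. Autocorrelation coefficients are generated and then converted into LPC coefficients by means of the Fourier transform; linear prediction cepstral coefficients are produced through data analysis and processing, and extraction via VMD molecular visualization yields the speech signal feature parameters of the intelligent robot. Experimental results show that the VMD-based method can resolve the problem of external shocks through windowing, improve detection accuracy, shorten feature parameter extraction time, and ensure that the robot works normally.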
10.
11.
12.
For the purpose of guiding a pole quantization scheme, a psychophysical experiment was performed to measure just-noticeable differences (JND) in the frequency and radius of the poles. The frequency JNDs, measured up to a formant frequency of 4 kHz, are quantified as distributions with means that are increasing functions of formant frequency and bandwidth. An example of a pole quantization scheme, based on the JND data, is presented and found to be significantly superior to common scalar quantization methods of the LPC-PARCOR coefficients. The pole quantization scheme is found to be almost comparable, both in quality and bit consumption, to vector quantization.
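To make the pole parameters in this abstract concrete, the sketch below converts LPC predictor coefficients into pole frequency, radius, and bandwidth. It shows only the parametrization that a pole quantizer would operate on, not the paper's JND-based quantization scheme. The function name `lpc_poles`, the predictor convention A(z) = 1 - Σ a[k]·z^-k, and the bandwidth approximation B = -ln(r)·fs/π are assumptions made for illustration.

```python
import numpy as np

def lpc_poles(a, fs):
    """Return (frequency_hz, radius, bandwidth_hz) for the upper-half-plane
    poles of 1/A(z), where A(z) = 1 - sum_k a[k] z^-k (assumed convention)."""
    a = np.asarray(a, dtype=float)
    # Roots of z^p - a1 z^(p-1) - ... - ap are the poles of the synthesis filter
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]          # keep one pole of each conjugate pair
    radius = np.abs(roots)
    freq = np.angle(roots) * fs / (2.0 * np.pi)
    bandwidth = -np.log(radius) * fs / np.pi   # common 3 dB bandwidth approximation
    order = np.argsort(freq)
    return freq[order], radius[order], bandwidth[order]

# Usage: a single resonance at 1000 Hz with radius 0.95, sampled at 8 kHz
theta = 2.0 * np.pi * 1000.0 / 8000.0
a_example = [2.0 * 0.95 * np.cos(theta), -0.95 ** 2]
freq, radius, bandwidth = lpc_poles(a_example, fs=8000.0)
```

Real poles (zero imaginary part) are dropped here because they do not correspond to a formant-like resonance; a complete quantizer would have to handle them separately.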
13.
The authors describe several adaptive block transform speech coding systems based on vector quantization of linear predictive coding (LPC) parameters. Specifically, the authors vector quantize the LPC parameters (LPCVQ) associated with each speech block and transmit the index of the code vector as overhead information. This code vector will determine the short-term spectrum of the block and, in turn, can be used for optimal bit allocation among the transform coefficients. In order to get a better estimate of the speech spectrum, the authors also consider the possibility of incorporating pitch information in the coder. In addition, entropy-coded zero-memory quantization of the transform coefficients is considered as an alternative to Lloyd-Max quantization. An adaptive BTC scheme based on LPCVQ and using entropy-coded quantizers is developed. Extensive simulations are used to evaluate the performance of this scheme.
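The idea that an LPC code vector fixes the short-term spectrum, which in turn drives bit allocation, can be illustrated with the classical high-resolution allocation rule b_k = b_avg + ½·log2(σ_k² / geometric mean of σ²). The sketch below is an assumption-laden illustration rather than the authors' coder: it uses the all-pole power spectrum 1/|A(e^jω)|² sampled at bin centers as a stand-in for the transform coefficient variances, and the function names are hypothetical.

```python
import numpy as np

def lpc_power_spectrum(a, n_bins):
    """All-pole spectrum 1/|A(e^jw)|^2 at n_bins bin centers in (0, pi),
    with A(z) = 1 - sum_k a[k] z^-k (assumed convention, unit gain)."""
    a = np.asarray(a, dtype=float)
    w = np.pi * (np.arange(n_bins) + 0.5) / n_bins
    k = np.arange(1, len(a) + 1)
    A = 1.0 - np.exp(-1j * np.outer(w, k)) @ a
    return 1.0 / np.abs(A) ** 2

def allocate_bits(variances, avg_bits):
    """Real-valued allocation: avg_bits + 0.5 * log2(var / geometric mean)."""
    log_var = np.log2(np.asarray(variances, dtype=float))
    return avg_bits + 0.5 * (log_var - log_var.mean())

# Usage: a first-order low-pass predictor gives more bits to low-frequency bins
bits = allocate_bits(lpc_power_spectrum([0.9], n_bins=8), avg_bits=2.0)
```

The rule can return fractional or negative values; a practical coder would clip negative allocations to zero and redistribute the remaining bits, which this sketch does not do.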
14.
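This paper describes the principles and performance of several excitation schemes for the all-pole prediction filter that models vocal tract characteristics in linear predictive coding of speech, namely residual-signal excitation, multi-pulse excitation, and pitch excitation. Residual excitation by baseband extraction with high-frequency regeneration is also introduced.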
15.
Waibel A. Geutner P. Tomokiyo L.M. Schultz T. Woszczyna M. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》2000,88(8):1297-1313
Building modern speech and language systems currently requires large data resources such as texts, voice recordings, pronunciation lexicons, morphological decomposition information and parsing grammars. Based on a study of the most important differences between language groups, we introduce approaches to efficiently deal with the enormous task of covering even a small percentage of the world's languages. For speech recognition, we have reduced the resource requirements by applying acoustic model combination, bootstrapping and adaptation techniques. Similar algorithms have been applied to improve the recognition of foreign accents. Segmenting language into appropriate units reduces the amount of data required to robustly estimate statistical models. The underlying morphological principles are also used to automatically adapt the coverage of our speech recognition dictionaries with the Hypothesis-Driven Lexical Adaptation (HDLA) algorithm. This reduces the out-of-vocabulary problems encountered in agglutinative languages. Speech recognition results are reported for the read GlobalPhone database and some broadcast news data. For speech translation, using a task-oriented Interlingua allows a system with N languages to be built with linear, rather than quadratic, effort. We have introduced a modular grammar design to maximize reusability and portability. End-to-end translation results are reported on a travel-domain task in the framework of C-STAR.
16.
Markovic M. Milosavljevic M. Kovacevic B. Veinovic M. 《Vision, Image and Signal Processing, IEE Proceedings -》1998,145(1):19-22
In order to decrease LPC spectral degradation in the USA FED STD 1016 4.8 kbit/s CELP speech coder, application of a robust LPC parameter estimation is proposed. Robust LPC methods, based on Huber's M-estimation theory and a heuristic sample-selective two-stage robust procedure, are considered. Comparative experimental analysis is carried out based on the cepstral distance, as an objective spectral measure. Presented experimental analyses justify the use of the robust LPC methods in the standard CELP 4800 bit/s speech coder, showing that the best results are obtained by using the combined sample-selective robust LPC procedure.
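The abstract does not state which form of cepstral distance it uses. A commonly used truncated form, in dB, is CD = (10 / ln 10)·sqrt(2·Σ (c_n − c'_n)²) over cepstral coefficients n = 1..N, and the sketch below computes that form. The function name and the choice to exclude the zeroth (gain) coefficient are assumptions made for this example.

```python
import numpy as np

def cepstral_distance_db(c_ref, c_test):
    """Truncated cepstral distance in dB between two cepstral vectors c[1..N].

    Assumes both vectors have the same length and exclude c[0] (gain term).
    """
    c_ref = np.asarray(c_ref, dtype=float)
    c_test = np.asarray(c_test, dtype=float)
    if c_ref.shape != c_test.shape:
        raise ValueError("cepstral vectors must have the same shape")
    return (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum((c_ref - c_test) ** 2))
```

In a comparison like the one described, `c_ref` would come from an LPC analysis of clean or reference speech and `c_test` from the estimator under evaluation, averaged over frames.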
17.