1.

黄琦志熊哲华李轲杨霞《装备指挥技术学院学报》2007,18(6)

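Under current network conditions, an efficient speech compression codec saves transmission bandwidth and relieves network congestion. Based on an analysis of speech characteristics and the conjugate-structure algebraic-code-excited linear prediction (CS-ACELP) algorithm, the authors propose a low-complexity yet effective voice activity detection (VAD) algorithm and design a speech compression codec built on the TMS320VC5409 digital signal processor. Experiments show that the codec reduces the average bit rate to about 4 kb/s and satisfies the requirements of full-duplex real-time voice communication in VoIP.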

2.

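Research and Implementation of a DSP-Based VoIP Speech Compression Codec

黄琦志熊哲华李轲杨霞《装备指挥技术学院学报》2007,18(6):69-72

Under current network conditions, an efficient speech compression codec saves transmission bandwidth and relieves network congestion. Based on an analysis of speech characteristics and the conjugate-structure algebraic-code-excited linear prediction (CS-ACELP) algorithm, the authors propose a low-complexity yet effective voice activity detection (VAD) algorithm and design a speech compression codec built on the TMS320VC5409 digital signal processor. Experiments show that the codec reduces the average bit rate to about 4 kb/s and satisfies the requirements of full-duplex real-time voice communication in VoIP.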

3.

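Research and Implementation of a Speech Compression Codec for VoIP

张拥军《安徽电子信息职业技术学院学报》2007,6(4):102-103

Under current network conditions, an efficient speech compression codec saves transmission bandwidth and relieves network congestion. Based on an analysis of speech characteristics and the conjugate-structure algebraic-code-excited linear prediction algorithm, this paper proposes a low-complexity yet effective voice activity detection algorithm that further reduces the average bit rate to about 4 kb/s. Experimental analysis shows that the design and implementation satisfy the requirements of full-duplex real-time voice communication in VoIP.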

4.

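A New Wavelet-Transform-Based Algorithm for Segmenting Silence and Speech (total citations: 6; self-citations: 1; citations by others: 6)

梅晓丹孙圣和《哈尔滨工业大学学报》2002,34(3):408-411

Segmenting silence from speech in a noisy speech signal, i.e. endpoint detection, is a crucial step in speech recognition. To make segmentation more robust to the environment, a new algorithm is proposed that uses the wavelet transform to separate silence and speech in noisy speech signals. The algorithm first applies a wavelet transform to the speech signal and denoises it using the wavelet coefficients, then tracks the energy variation of the signal in selected wavelet subbands to separate speech from silence. Simulations show that the algorithm segments speech effectively even at low signal-to-noise ratios.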
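The three steps named in this abstract (wavelet transform, coefficient denoising, subband energy tracking) can be illustrated with a minimal sketch. It is not the authors' algorithm: the one-level Haar wavelet, soft thresholding, the choice of the approximation subband, and every threshold and frame length below are assumptions made only for illustration.

```python
import math

def haar_dwt(x):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    n = len(x) - len(x) % 2  # ignore a trailing odd sample
    approx = [(x[i] + x[i + 1]) / s for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, n, 2)]
    return approx, detail

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed denoising rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def segment_speech(signal, frame_len=64, noise_thresh=0.05, energy_thresh=0.01):
    """Label each frame True (speech) or False (silence)."""
    labels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        approx, _detail = haar_dwt(frame)
        # Denoise, then track energy in the low-frequency subband only.
        approx = soft_threshold(approx, noise_thresh)
        energy = sum(c * c for c in approx) / len(approx)
        labels.append(energy > energy_thresh)
    return labels
```

A practical implementation would likely use a deeper decomposition and thresholds adapted to the estimated noise level; the fixed values here only keep the sketch short.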

5.

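A Compressed-Sensing-Based Voice Communication System for Underground Coal Mines

马丽娜曹新德《淮南工业学院学报》2011,(3):72-74

The paper reviews the state of voice communication systems in underground coal mines and points out that current systems suffer from complex coding algorithms. It introduces compressed sensing theory into underground voice communication and proposes a hybrid scheme that combines compressed-sensing coding with conventional coding. Speech reconstruction based on compressed sensing is then simulated. The MOS scores of the reconstructed speech indicate that the algorithm reduces the encoding complexity of underground terminals, saves terminal energy and extends terminal lifetime, while achieving compression quality roughly comparable to MP3. This shows that the algorithm is well suited to underground coal mine voice communication systems.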

6.

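A Network Speech Quality Assessment Model That Considers Packet Content Characteristics

李维杨付正《西安电子科技大学学报(自然科学版)》2011,38(2):23-28,98

Packet loss is the main cause of degraded network speech quality, and its impact depends not only on the number of lost packets but also on their content. Taking the content characteristics of lost packets into account, a no-reference network speech quality assessment model is proposed. First, voice activity detection and the signal level of the frames that were not lost are used to determine the content characteristics of the lost packets. The loss rate of speech packets is then computed, and a no-reference model is built from it to predict the quality of network-distorted speech. Experiments show that, compared with the speech quality assessment model in the international standard G.1070, the proposed model improves the correlation with subjective quality scores by 8.4% on average.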

7.

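Simulation of Telephone Speech Based on Signal Processing

左国玉刘文举阮晓钢《北京工业大学学报》2003,29(2):182-187

To address the shortage of telephone speech corpora, an algorithm is proposed that converts clean speech to telephone-quality speech entirely in software. The algorithm uses filter design techniques to simulate the frequency responses of the analog transmission equipment in a telephone connection, and it simulates telephone speech phenomena such as the various noise behaviors of the telephone channel. Spectral distortion analysis and recognition experiments show that, with suitably chosen and tuned model parameters, the algorithm effectively approximates telephone-quality speech from clean speech, and that simulated speech generated from clean data yields recognition performance equivalent to that of real telephone speech.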

8.

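Research on an Enhancement Algorithm for Noisy Tibetan Speech

冯炎安宝坤《重庆石油高等专科学校学报》2013,(6):136-139

Tibetan speech enhancement can improve the performance of speech processing equipment in noisy environments and can be used across different noise environments without loss of performance. A Tibetan speech enhancement algorithm is proposed based on the characteristics of Tibetan speech. Experiments show that the algorithm achieves a good segmental SNR gain.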

9.

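A Packet-Layer Speech Quality Assessment Model Incorporating Content Characteristic Analysis (total citations: 1; self-citations: 0; citations by others: 1)

江亮亮李雪敏杨付正杨旭《四川大学学报(工程科学版)》2013,45(3):103-107

To enable real-time monitoring of network speech quality, a packet-layer speech quality assessment model is proposed. The model does not need to access the packet payload and evaluates speech quality using only packet header information. It first analyzes the header information to distinguish speech segments from silent segments and obtains the coding parameters and packet loss parameters of the speech segments. It then predicts coding distortion from the coding parameters and, on that basis, uses the packet loss parameters to evaluate the distortion caused by packet loss, yielding the overall quality of the speech stream. Experiments show that, compared with the E-model in the international standard G.107, the proposed model raises the Pearson correlation coefficient with PESQ scores by 0.0412 on average and lowers the root-mean-square error by 0.0451 on average.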
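The two figures of merit quoted in this abstract, the Pearson correlation coefficient and the root-mean-square error between predicted and reference scores, can be computed as in the sketch below. The function names and the example scores are hypothetical and are not data from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference scores."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

# Hypothetical scores: model predictions vs. PESQ reference values.
model_scores = [3.1, 2.4, 3.8, 1.9]
pesq_scores = [3.0, 2.6, 3.9, 2.0]
print(pearson(model_scores, pesq_scores), rmse(model_scores, pesq_scores))
```

A higher correlation together with a lower RMSE is the sense in which the abstract reports an improvement over the E-model.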

10.

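Endpoint Detection and Syllable Segmentation of Chinese Speech in Strong Background Noise (total citations: 3; self-citations: 0; citations by others: 3)

杨崇林李雪耀《哈尔滨工程大学学报》1997,18(5):28-32

Based on the characteristics of Chinese speech, a new algorithm is proposed for endpoint detection and syllable segmentation of Chinese speech in strong background noise. The correctness of endpoint detection and the stability of syllable segmentation were examined experimentally in an 85 dB noise environment. The results show that the algorithm performs very well in both respects and is speaker-independent.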

11.

Improving Deep Attractor Network by BGRU and GMM for Speech Separation

Rawad Melhem Assef Jafar Riad Hamadeh 《哈尔滨工业大学学报(英文版)》2021,28(3):90-96

Deep Attractor Network (DANet) is the state-of-the-art technique in the speech separation field. It uses Bidirectional Long Short-Term Memory (BLSTM), but the complexity of the DANet model is very high. In this paper, a simplified yet powerful DANet model is proposed that uses a Bidirectional Gated Recurrent Unit (BGRU) network instead of BLSTM. A Gaussian Mixture Model (GMM), rather than k-means, is applied in DANet as the clustering algorithm to reduce complexity and increase learning speed and accuracy. The metrics used in this paper are Signal to Distortion Ratio (SDR), Signal to Interference Ratio (SIR), Signal to Artifact Ratio (SAR), and the Perceptual Evaluation of Speech Quality (PESQ) score. Two-speaker mixture datasets from the TIMIT corpus were prepared to evaluate the proposed model, and the system achieved 12.3 dB and 2.94 for the SDR and PESQ scores respectively, which were better than those of the original DANet model. Other improvements were 20.7% and 17.9% in the number of parameters and training time respectively. The model was also applied to mixed Arabic speech signals, and the results were better than those for English.

12.

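A Low-Bit-Rate Waveform Interpolation Speech Coding Algorithm Based on Lifting Wavelet Decomposition

李如玮鲍长春《北京工业大学学报》2011,37(12):1779-1785

A low-bit-rate characteristic waveform interpolation speech coding method based on the bi-orthogonal lifting wavelet transform (BLWT) is proposed. Its characteristic waveform decomposition requires neither complex characteristic waveform alignment nor filter convolution. Its inherent in-place computation reduces the memory needed by conventional wavelet decomposition of characteristic waveforms, and replacing samples of adjacent frames with boundary points of the current frame effectively reduces the delay of the conventional approach. The method also reconstructs each decomposed component separately and selects quantization parameters according to the perceptual characteristics of the human ear. On this basis, BLWT-CWI (characteristic waveform interpolation) speech coders were built at two rates, 1.84 kb/s and 2.32 kb/s. Subjective mean opinion score (MOS) results show that the 2.32 kb/s BLWT-CWI coder is comparable in quality to the 2.4 kb/s MELP vocoder, while the 1.84 kb/s coder is slightly inferior to the 2.4 kb/s MELP vocoder. Subjective A/B listening tests show that the 1.84 kb/s BLWT-CWI coder outperforms the 2 kb/s LIWI (low-complex improved waveform interpolation) vocoder.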

13.

A speech enhancement algorithm to reduce noise and compensate for partial masking effect

JEON Yu-yong LEE Sang-min 《中南工业大学学报(英文版)》2011,18(4):1121-1127

To enhance speech quality degraded by environmental noise, an algorithm was proposed to reduce the noise and reinforce the speech. The minima controlled recursive averaging (MCRA) algorithm was used to estimate the noise spectrum, and the partial masking effect, which is one of the psychoacoustic properties, was introduced to reinforce speech. The performance evaluation was performed by comparing the PESQ (perceptual evaluation of speech quality) and segSNR (segmental signal to noise ratio) by the propos...

14.

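Implementation Method of the G.729 Speech Codec Algorithm and Its DSP Implementation (total citations: 2; self-citations: 0; citations by others: 2)

吴海涛梁迎春梁欣涛谢金宝闫健《哈尔滨理工大学学报》2005,10(6):5-8

To reduce the computational complexity of encoding and decoding, a DSP-based implementation of the G.729 speech codec algorithm is presented, with emphasis on DSP code optimization. Simulation results show that computational complexity is greatly reduced and that the G.729 codec algorithm can be fully implemented on a single TMS320VC5410 chip. The reconstructed speech meets the coding quality required by the standard.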

15.

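A Subband MCRASC-MGSC Speech Enhancement Algorithm for Miniature Microphone Arrays

曾庆宁欧阳缮《西安电子科技大学学报(自然科学版)》2010,37(6):1011-1016

Speech enhancement with miniature microphone arrays is of considerable practical value for voice communication and speech recognition on small devices. By introducing multi-channel crosstalk-resistant adaptive signal cancellation (MCRASC), a more effective signal blocking scheme is provided for the subband modified generalized sidelobe canceller (MGSC), yielding a more practical algorithm for miniature-array speech enhancement. Theoretical analysis and experiments in real environments verify the effectiveness and superiority of the proposed algorithm. In the experiments, it improved the SNR of the enhanced speech by 12.5 dB over the subband MGSC algorithm.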

16.

Research and Improvement of Noise Estimation Algorithms in Intelligent Speech Recognition Systems

吴楠冯祖勇韦高梧《广东工业大学学报》2018,35(3):43-46

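Intelligent speech recognition has been studied for a long time, but because speech signals are inherently variable, transient, continuous and dynamic, machines still have difficulty recognizing speech in varying environments, especially noisy ones. To improve recognition accuracy for noisy speech, this paper studies a commonly used noise estimation algorithm, the time-recursive averaging algorithm based on the a posteriori SNR, and proposes an improved smoothing factor for it. A voice activity detection algorithm and these two algorithms are compared in simulations at different input SNRs. The results show that the improved algorithm raises the output segmental SNR by up to 2.1 dB relative to the voice activity detection algorithm and by up to 0.5 dB relative to the original time-recursive averaging algorithm, indicating that at low input SNR the improved algorithm effectively improves speech quality and intelligibility.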
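As a rough illustration of time-recursive averaging driven by the a posteriori SNR, the sketch below updates a per-bin noise power estimate with a smoothing factor that depends on the ratio of the current frame power to the previous noise estimate. The sigmoid shape and the constants `beta` and `offset` are assumptions for illustration only; the abstract does not give the original smoothing factor or the paper's improved one.

```python
import math

def smoothing_factor(post_snr, beta=0.6, offset=1.5):
    """Assumed sigmoid mapping from a posteriori SNR to a factor in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-beta * (post_snr - offset)))

def update_noise(noise_psd, frame_psd, beta=0.6):
    """One recursive-averaging step over all frequency bins.

    noise_psd: previous noise power estimate per bin
    frame_psd: noisy-speech power of the current frame per bin
    """
    updated = []
    for n_prev, y in zip(noise_psd, frame_psd):
        post_snr = y / max(n_prev, 1e-12)  # a posteriori SNR estimate
        alpha = smoothing_factor(post_snr, beta)
        # High SNR (speech likely): alpha near 1, estimate is held.
        # Low SNR (noise only): alpha is small, estimate follows the frame.
        updated.append(alpha * n_prev + (1.0 - alpha) * y)
    return updated
```

Unlike a hard voice activity decision, the estimate is updated in every frame, with the update weight varying smoothly with the SNR.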

17.

Steganography in low bit-rate speech streams based on quantization index modulation controlled by keys

YongFeng Huang HuaiZhou Tao Bo Xiao ChinChen Chang 《中国科学:技术科学(英文版)》2017,60(10):1585-1596

Since low bit-rate speech codecs used for voice over internet protocol (VoIP), such as iLBC (internet low bit-rate codec), G.723.1 and G.729A, have less redundancy due to high compression, it is more challenging to embed information in low bit-rate speech streams of VoIP. In this study, a new method is proposed for steganography in low bit-rate speech streams of VoIP. The core idea of this method is to set up a graph model for the codebook space of the quantizer. Based on the graph model, the method realises a quantization index modulation (QIM)-controlled algorithm for partitioning the codebook space. It can be proved that this method minimizes signal distortion while steganography takes place. Taking into account codeword partition balance and partition diversity, the proposed steganographic algorithm is based on QIM controlled by secret keys, i.e., it maps the ways of dividing the codebook into secret keys, thereby significantly improving the undetectability and robustness of VoIP steganography. Performance measurements and steganalysis experiments showed that the proposed QIM-controlled steganographic algorithm was more secure and robust than the QIM algorithm, the conventional RANDOM algorithm and the original codebook algorithm.
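The general idea of key-controlled QIM can be sketched as follows: a secret key selects a two-way partition of the quantizer codebook, and each hidden bit restricts the quantizer to one part. This is only a simplified stand-in. The paper derives its partition from a graph model of the codebook space, whereas the sketch uses a plain key-seeded shuffle, and all function names and the toy codebook are hypothetical.

```python
import random

def partition_codebook(size, secret):
    """Split codeword indices 0..size-1 into two halves chosen by the key."""
    indices = list(range(size))
    random.Random(secret).shuffle(indices)  # key-seeded, reproducible
    half = size // 2
    return set(indices[:half]), set(indices[half:])

def embed_bit(vector, codebook, bit, secret):
    """Quantize `vector` to the nearest codeword in the part that encodes `bit`."""
    allowed = partition_codebook(len(codebook), secret)[bit]

    def distortion(i):
        return sum((v - c) ** 2 for v, c in zip(vector, codebook[i]))

    return min(allowed, key=distortion)

def extract_bit(index, codebook_size, secret):
    """Recover the hidden bit from a received codeword index."""
    part0, _part1 = partition_codebook(codebook_size, secret)
    return 0 if index in part0 else 1

# Toy 2-D codebook with 8 codewords (hypothetical values).
codebook = [(float(i), float(i % 3)) for i in range(8)]
bits = [0, 1, 1, 0]
sent = [embed_bit((2.2, 1.1), codebook, b, secret=1234) for b in bits]
print([extract_bit(i, len(codebook), secret=1234) for i in sent])
```

A receiver without the key does not know which partition was used, which is the property the key-controlled scheme relies on. A random split, unlike the graph-based one described in the abstract, does nothing to limit the extra quantization distortion.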

18.

Audiovisual bimodal mutual compensation of Chinese

周治杜利民徐彦居《中国科学E辑(英文版)》2001,44(1):19-26

The perception of human languages is inherently a multi-modal process, in which audio information can be compensated by visual information to improve recognition performance. This phenomenon has been studied in English, German, Spanish and other languages, but it has not yet been reported for Chinese. In our experiment, 14 syllables (/ba, bi, bian, biao, bin, de, di, dian, duo, dong, gai, gan, gen, gu/), extracted from the Chinese audiovisual bimodal speech database CAVSR-1.0, were pronounced by 10 subjects. The audio-only stimuli, audiovisual stimuli, and visual-only stimuli were recognized by 20 observers. The audio-only and audiovisual stimuli were both presented under 5 conditions: no noise, SNR 0 dB, -8 dB, -12 dB, and -16 dB. The experimental results are analyzed and the following conclusions for Chinese speech are reached. Human beings can recognize visual-only stimuli rather well. The place of articulation determines the visual distinction. In noisy environment, audio information can remarkably

19.

Study of speech enhancement in the background of ship-radiated noise

LI Dawei YANG Rijie HAN Jianhui 《西安电子科技大学学报(自然科学版)》2016,43(5):133-138

Spectral subtraction is widely used in speech enhancement, but it is not always applicable in the ship working environment because it is hard to distinguish speech segments from ship noise. In this paper, we therefore propose a new approach, which first computes the frequency spectrum similarity and then roughly classifies the signal segments as speech or noise. A frame shifting method is then used to achieve high discrimination precision. Finally, the improved algorithm is benchmarked on a large set of measured data, and experimental results show that the proposed method can be used in various ship-radiated noise environments. The discrimination accuracy is 98% for ship-radiated noise and 96% for speech.

20.

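Chinese Speech Synthesis Based on Prosodic Matching Cost and Prosodic Concatenation Cost

张鹏王琳刘胜《哈尔滨工业大学学报》2006,38(11):2006-2008

To further improve the naturalness of synthesized Chinese speech, Chinese speech synthesis techniques were analyzed and compared, and the Chinese syllable was chosen as the concatenation unit. Units are selected optimally using a prosodic matching cost and a prosodic concatenation cost, which realizes prosody modeling and prosody control for Chinese speech synthesis. Direct concatenation, transitional concatenation and quasi-concatenation are used to join units with smooth transitions. Experimental results demonstrate the effectiveness of the synthesis and prosody control methods.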
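Selecting units by combining a matching cost with a concatenation cost is commonly solved with dynamic programming. The sketch below shows that general technique under simplifying assumptions: each candidate unit is represented by a single pitch value, and both cost functions are plain absolute differences. The abstract does not specify the paper's actual cost definitions, so these are hypothetical.

```python
def select_units(targets, candidates, match_cost, join_cost):
    """Choose one candidate per target, minimising total matching + joining cost."""
    # best[i][j] = (cost of the cheapest path ending at candidates[i][j], backpointer)
    best = [[(match_cost(targets[0], c), None) for c in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for c in candidates[i]:
            prev_cost, prev_j = min(
                (best[i - 1][j][0] + join_cost(p, c), j)
                for j, p in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + match_cost(targets[i], c), prev_j))
        best.append(row)
    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1]

# Hypothetical example: target pitch per syllable and candidate unit pitches (Hz).
targets = [100, 120, 110]
candidates = [[90, 105], [118, 150], [100, 112]]
chosen = select_units(
    targets,
    candidates,
    match_cost=lambda t, u: abs(t - u),   # distance from the target prosody
    join_cost=lambda a, b: abs(a - b),    # discontinuity at the joint
)
print(chosen)  # [105, 118, 112]
```

In a real synthesizer the costs would typically combine several prosodic features such as pitch, duration and energy, with weights tuned on data.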