Similar Articles
18 similar articles found.
1.
A New Speech Enhancement Algorithm Based on the Dual-Tree Complex Wavelet Packet Transform   Cited by: 7 (self-citations: 0, others: 7)
The real wavelet packet transform is one of the more effective speech enhancement algorithms: the wavelet packet coefficients are compressed by thresholding and the speech signal is then reconstructed. This paper analyzes the shift sensitivity of the real wavelet packet transform and its resulting shortcomings for speech enhancement, and proposes using the dual-tree complex wavelet packet transform instead. When the wavelet bases corresponding to the low-pass and high-pass filters approximately form a Hilbert transform pair, this transform greatly reduces the shift sensitivity of the real wavelet packet transform. Taking the correlations among wavelet packet coefficients into account, an overlapping-block complex thresholding algorithm is also proposed. Results show that the algorithm outperforms the traditional real wavelet packet transform and point-thresholding algorithms; the advantage of the dual-tree complex wavelet packet transform is especially pronounced for speech contaminated by periodic noise.
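The coefficient-shrinkage step that these wavelet packet enhancement methods share can be sketched as follows. This is the generic point-wise soft-thresholding rule, not the paper's overlapping-block complex threshold; the coefficient values are made up for illustration:

```python
import numpy as np

def soft_threshold(coeffs, thr):
    # Shrink every coefficient toward zero by thr; anything smaller is zeroed,
    # on the assumption that small coefficients are mostly noise.
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)

c = np.array([-3.0, -0.5, 0.2, 1.5, 4.0])   # hypothetical wavelet packet coefficients
denoised = soft_threshold(c, 1.0)
print(denoised)  # the two small coefficients are zeroed
```

The enhanced signal is then obtained by running the inverse transform on the shrunken coefficients.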

2.
To address the severe speech distortion produced by traditional wavelet packet speech enhancement algorithms, this paper proposes a wavelet packet speech enhancement algorithm based on an adaptive threshold and a new threshold function. In the wavelet packet domain, the noisy speech is windowed and split into frames; the speech presence probability of each frame is computed from the cross-correlation of the FFT power spectra of adjacent frames, and the conventional universal wavelet packet threshold is then scaled by this probability, so that the threshold is larger in non-speech frames and smaller in speech frames. This adaptive adjustment removes as much noise as possible while preserving speech and reducing distortion. A new threshold function is also designed that overcomes the discontinuity of the traditional hard threshold function and the constant bias introduced by the soft threshold function, further reducing speech distortion. Extensive simulations with speech and noise from the TIMIT and NOISEX-92 databases show, in both subjective and objective evaluations, that the proposed algorithm outperforms two existing algorithms, with less speech distortion and better perceptual quality.
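The two classical threshold functions this abstract criticizes, and one common style of compromise between them, can be written down directly. The compromise shown is a simple linear blend chosen for illustration; it is not necessarily the exact new function proposed in the paper:

```python
import numpy as np

def hard_threshold(x, t):
    # Discontinuous at |x| = t: surviving coefficients keep their full value.
    return np.where(np.abs(x) > t, x, 0.0)

def soft_threshold(x, t):
    # Continuous, but biases every surviving coefficient toward zero by t.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def compromise_threshold(x, t, a=0.5):
    # One common middle ground: shrink by only a*t, halving the soft-threshold
    # bias while keeping the hard threshold's selection rule.
    return np.sign(x) * np.maximum(np.abs(x) - a * t, 0.0) * (np.abs(x) > t)

x = np.array([-2.0, -0.8, 0.5, 1.5, 3.0])
```

Plotting the three functions makes the discontinuity of the hard rule and the constant offset of the soft rule immediately visible.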

3.
A Fine-Granularity Video Coding Algorithm Based on Wavelet and MP Transforms   Cited by: 1 (self-citations: 3, others: 1)
To address the drawbacks of current fine-granularity algorithms, namely high computational complexity (e.g., Matching Pursuit coding, MP) or various quality artifacts (e.g., discrete wavelet transform coding, DWT), a fine-granularity video coding algorithm combining the wavelet transform and the MP transform is proposed. A one-dimensional wavelet transform is applied across groups of 8 consecutive frames; the resulting first (low-frequency) frame is coded with a two-dimensional wavelet transform, while the other 7 high-frequency frames are coded with the MP transform, using an energy-based atom search and an allocation strategy based on human visual characteristics. Experiments show that this fine-granularity algorithm achieves high reconstruction quality for video with little inter-frame motion.

4.
For detecting fast-moving small infrared targets against complex backgrounds, a method combining temporal characteristic analysis, the wavelet packet transform, and multi-frame pipeline filtering is proposed. The method first analyzes the temporal characteristics of the image sequence to identify the frames containing the target, then applies a multiscale wavelet packet decomposition to those frames to extract candidate targets, and finally uses pipeline filtering to locate the small target among the candidates, completing the detection. Simulation experiments on measured data show that the method can effectively detect fast-moving small infrared targets.

5.
To address the drawbacks of existing fine-granularity video coding algorithms, namely high computational complexity or various artifacts in the reconstructed video, a fine-granularity coding algorithm based on joint wavelet and MP transforms is proposed. On top of motion estimation and compensation, the algorithm uses a wavelet transform to remove inter-frame redundancy, and then applies either a two-dimensional wavelet transform or an MP transform to the result, depending on the data characteristics of each frame. The algorithm also introduces new motion estimation and pixel adjustment strategies, an MP atom allocation strategy based on human visual characteristics, and an energy-lookup-based atom search mechanism. Experiments show that the algorithm balances reconstruction quality, computational complexity, and granularity control.

6.
A study of the MFCC algorithm shows that its FFT uses a fixed analysis window over the whole time-frequency plane, which does not match the characteristics of speech, whereas the wavelet transform has a multi-resolution property better matched to human auditory characteristics. A speech recognition method combining dynamic and static feature parameters is proposed: the wavelet packet transform is introduced into feature extraction, replacing the Fourier transform and the Mel filter bank in the MFCC extraction procedure to obtain a new static feature parameter, DWPTMFCC, which is then concatenated with its first-order difference into a single vector serving as the feature of one speech frame. Simulations show that the recognition rate with the new features is much higher than with the original MFCC, especially at low SNR.
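The "first-order difference" (delta) half of the feature vector is standard across MFCC-style front ends and can be sketched independently of how the static DWPTMFCC coefficients are computed. This is the usual regression-based delta formula; the window width of 2 is a conventional default, not a value taken from the paper:

```python
import numpy as np

def delta_features(static, width=2):
    # Regression-based first-order delta over +/- width frames,
    # with edge frames repeated so the output has the same shape.
    denom = 2 * sum(i * i for i in range(1, width + 1))
    padded = np.pad(static, ((width, width), (0, 0)), mode="edge")
    delta = np.zeros_like(static)
    for t in range(static.shape[0]):
        for i in range(1, width + 1):
            delta[t] += i * (padded[t + width + i] - padded[t + width - i])
    return delta / denom

ramp = np.arange(6.0).reshape(6, 1)  # a linearly increasing coefficient track
deltas = delta_features(ramp)        # interior deltas recover the slope, 1.0
```

The static and delta matrices are then concatenated column-wise to form the per-frame feature vector described in the abstract.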

7.
This paper presents a framework for video sequence compression coding based on the wavelet transform, covering adaptive selection between intra-frame and inter-frame coding, an intra-frame coding scheme, and an inter-frame coding scheme. To code the motion-compensated residual error images efficiently, the EZW algorithm is extended to a special wavelet packet decomposition better suited to the characteristics of residual images. Experimental results show that the scheme achieves high compression ratios while maintaining a high peak signal-to-noise ratio.

8.
To address the over-thresholding problem of traditional wavelet speech enhancement algorithms, an improved wavelet packet denoising algorithm with a time-adaptive threshold is proposed. The method decomposes the noisy speech with a perceptual wavelet packet to obtain the coefficients at the perceptual nodes, and adjusts the denoising threshold frame by frame based on an estimate of the speech presence probability. Because the improved threshold better avoids over-thresholding of the speech wavelet packet coefficients, more of the original speech is preserved while the noise is suppressed, further improving denoising performance. Experimental results show that the algorithm yields clearer enhanced speech than the conventional wavelet adaptive threshold algorithm.

9.
For copyright protection of digital video, a video watermarking algorithm combining the lifting wavelet transform and the DCT is proposed. The watermark is first chaotically encrypted and Arnold-scrambled; r color video frames are selected with a key, each component of each frame is partitioned into non-overlapping 8×8 blocks, a one-level lifting wavelet transform is applied to the selected blocks, and the DCT is applied to the low-frequency sub-band. The watermark is embedded by dither-modulation quantization with a step size adapted to the texture and motion characteristics of the frame; extraction does not require the original video. Experiments show that the algorithm is simple to implement, offers good transparency and robustness, and outperforms other algorithms.

10.
An Audio-Feature-Based Watermarking Algorithm in the Multiwavelet Domain   Cited by: 3 (self-citations: 0, others: 3)
Based on an analysis of audio features, a watermarking algorithm in the multiwavelet domain is proposed. Exploiting the time-frequency masking properties of the human auditory system, the algorithm analyzes the zero-crossing rate and time-domain energy of each audio frame to select the frames used for embedding. Using the sub-sampling property of audio and the advantages of the multiwavelet transform in signal processing, each selected frame is sub-sampled into two sub-frames, which are separately transformed into the multiwavelet domain. The embedding capacity is estimated from the energies of the two sub-frames in the multiwavelet domain, and the watermark is embedded according to their energy relationship. Watermark extraction is cast as a binary classification problem handled by a support vector machine. Experimental results verify that the algorithm can locate audio frames suitable for embedding according to the characteristics of the audio itself and can adjust the embedding strength dynamically, improving watermark robustness while preserving auditory quality.
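The two frame-selection statistics named in the abstract, zero-crossing rate and time-domain energy, have standard definitions that can be sketched directly. The sampling rate and tone frequency below are illustrative values, not parameters from the paper:

```python
import numpy as np

def zero_crossing_rate(frame):
    # Fraction of consecutive sample pairs whose signs differ.
    signs = np.sign(frame)
    signs[signs == 0] = 1
    return np.mean(signs[1:] != signs[:-1])

def frame_energy(frame):
    # Plain time-domain energy of the frame.
    return np.sum(np.asarray(frame, dtype=float) ** 2)

# A pure tone crosses zero far less often than broadband noise of equal length:
t = np.arange(0, 1, 1 / 8000.0)
tone = np.sin(2 * np.pi * 100 * t)
zcr = zero_crossing_rate(tone)  # roughly 2 * 100 / 8000 = 0.025
```

A frame-selection rule would then keep frames whose energy is high enough to mask the watermark and whose ZCR indicates suitable content.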

11.
A speech compression coding method based on the best wavelet packet transform and SPIHT coding is proposed. The speech signal is first wavelet-packet transformed and the best wavelet tree is found; dynamic bit allocation is performed, and the transformed wavelet coefficients are compressed with a modified SPIHT algorithm. Entropy coding is then applied to further raise the compression ratio. Experiments show that the method achieves good reconstruction quality at high compression ratios, with low computational complexity and small delay.

12.
Speech enhancement aims to improve the intelligibility and quality of noise-corrupted speech; its main applications relate to improving mobile communication quality in noisy environments. Traditional speech enhancement methods include spectral subtraction, Wiener filtering, and wavelet coefficient methods. To address the poor quality and musical noise left by traditional algorithms in complex noise environments, a speech enhancement algorithm combining the wavelet packet transform and adaptive Wiener filtering is proposed. Exploiting the multi-resolution frequency partition of the wavelet packet, the noisy signal is decomposed at multiple scales, the wavelet packet coefficients at each scale are adaptively Wiener-filtered, and the filtered coefficients are used to reconstruct the enhanced speech. Simulation results show that, compared with traditional enhancement algorithms, the algorithm not only raises the SNR of noisy speech more effectively in low-SNR non-stationary noise environments but also better preserves the spectral features of speech, improving overall speech quality.
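The per-band Wiener filtering step can be illustrated with the classic power-domain gain rule. The coefficient values and the noise-power estimate below are invented for the example; a real system would estimate the noise power per sub-band, e.g. from non-speech frames:

```python
import numpy as np

def wiener_gain(noisy_power, noise_power):
    # Classic Wiener rule: G = max(P_noisy - P_noise, 0) / P_noisy.
    clean_est = np.maximum(noisy_power - noise_power, 0.0)
    return clean_est / np.maximum(noisy_power, 1e-12)

coeffs = np.array([5.0, 0.3, -4.0, 0.1])  # hypothetical coefficients of one band
noise_power = 0.09                        # assumed noise-power estimate for this band
gains = wiener_gain(coeffs ** 2, noise_power)
filtered = gains * coeffs                 # strong coefficients pass, weak ones vanish
```

Running the inverse wavelet packet transform on `filtered` coefficients from every band yields the enhanced signal.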

13.
Voice activity detection is an important component of speech signal processing, and traditional detection methods fail at low SNR. To improve the performance and robustness of voice activity detection against noise backgrounds composed mainly of white noise, a VAD method based on the wavelet packet transform with an adaptive threshold is proposed: the speech signal is decomposed by the wavelet packet transform into sub-band signals, each sub-band signal is passed through the Teager energy operator (TEO) to strengthen the voiced parts and attenuate the silent parts, and an adaptive threshold decision is then made. Experimental results show that the algorithm correctly discriminates speech segments from noise segments at low SNR.
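The Teager energy operator the method relies on has a simple three-sample discrete form, shown here on a synthetic sinusoid (the frequency is arbitrary, chosen only to demonstrate the operator's exact behavior on a pure tone):

```python
import numpy as np

def teager_energy(x):
    # Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For x[n] = A * sin(W*n) the operator returns exactly A^2 * sin(W)^2,
# so it responds to amplitude and frequency at once, which is why it
# amplifies voiced segments relative to low-level background noise.
n = np.arange(200)
psi = teager_energy(np.sin(0.2 * np.pi * n))
```

In the VAD, `psi` is computed per sub-band and compared against the adaptive threshold.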

14.
Speech and speaker recognition is an important task for computer systems. In this paper, an expert speaker recognition system based on optimum wavelet packet entropy is proposed, using real speech/voice signals. The study combines a new feature extraction method with a classification approach based on optimum wavelet packet entropy parameter values, which are obtained from measured real English speech/voice signal waveforms using a speech experimental set. A genetic-wavelet packet-neural network (GWPNN) model is developed. GWPNN comprises three layers: a genetic algorithm, a wavelet packet, and a multi-layer perceptron. The genetic algorithm layer selects the feature extraction method and obtains the optimum wavelet entropy parameter values; one of four feature extraction methods is selected by the genetic algorithm. The alternative feature extraction methods are wavelet packet decomposition alone, and wavelet packet decomposition combined with the short-time Fourier transform, the Born–Jordan time–frequency representation, or the Choi–Williams time–frequency representation. The wavelet packet layer performs optimum feature extraction in the time–frequency domain and is composed of wavelet packet decomposition and wavelet packet entropies. The multi-layer perceptron of GWPNN, a feed-forward neural network, evaluates the fitness function of the genetic algorithm and classifies speakers. The performance of the developed system was evaluated using noisy English speech/voice signals. The test results showed that the system was effective in detecting real speech signals, with a correct speaker classification rate of about 85%.
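The "wavelet packet entropy" feature used here (and in item 17 below) is, in its simplest form, the Shannon entropy of the normalized sub-band energy distribution. A minimal sketch, with made-up energy vectors to show the two extremes:

```python
import numpy as np

def shannon_entropy(subband_energies):
    # Shannon entropy (in bits) of the normalized sub-band energy distribution.
    e = np.asarray(subband_energies, dtype=float)
    p = e / e.sum()
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return -np.sum(p * np.log2(p))

h_peaked = shannon_entropy([8.0, 0.0, 0.0, 0.0])  # energy in one node -> 0 bits
h_flat = shannon_entropy([2.0, 2.0, 2.0, 2.0])    # evenly spread -> 2 bits
```

Speakers differ in how their energy distributes across wavelet packet nodes, which is what makes this scalar (computed per node or per frame) usable as a feature.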

15.
This paper presents a robust voice activity detection (VAD) technique based on the wavelet packet. In this technique, sub-bands and their amplitudes are represented as vectors at each sample time in order to derive a new feature from frequency and amplitude changes. Thanks to the multi-resolution analysis property of the wavelet packet transform (WPT), the voiced, unvoiced, and transient components of speech can be distinctly discriminated. A new feature extraction method is then implemented based on observations of the angles between these vectors; it retains most unvoiced sounds in a voice-active frame. Experimental results show that the proposed WPT feature parameter can extract speech activity under poor SNR conditions and is insensitive to varying noise levels.

16.
Dysfluency and stuttering are breaks or interruptions of normal speech such as repetition, prolongation, interjection of syllables, sounds, words or phrases, and involuntary silent pauses or blocks in communication. Stuttering assessment through manual classification of speech dysfluencies is subjective, inconsistent, time-consuming, and error-prone. This paper proposes an objective evaluation of speech dysfluencies based on the wavelet packet transform with sample entropy features. Dysfluent speech signals are decomposed into six levels using the wavelet packet transform, and sample entropy (SampEn) features are extracted at every level of decomposition to characterize the speech dysfluencies (stuttered events). Three classifiers, k-nearest neighbor (kNN), a linear discriminant analysis (LDA) based classifier, and the support vector machine (SVM), are used to investigate the performance of the sample entropy features for classifying speech dysfluencies, with 10-fold cross-validation used to test the reliability of the classifier results. The effect of different wavelet families on classification performance is also examined. Experimental results demonstrate that the proposed features and classification algorithms give a very promising classification accuracy of 96.67% with a standard deviation of 0.37, and that the proposed method can help speech-language pathologists classify speech dysfluencies.
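Sample entropy, the feature extracted at each decomposition level, measures how unpredictable a signal is: a low value means patterns of length m that match tend to keep matching at length m+1. A minimal brute-force sketch (the m and r defaults are the conventional choices, not values stated in this abstract):

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    # SampEn(m, r): negative log of the conditional probability that sequences
    # matching for m points (Chebyshev distance <= r * std) also match for m+1.
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)

    def match_count(length):
        templates = np.array([x[i:i + length] for i in range(len(x) - length + 1)])
        count = 0
        for i in range(len(templates)):
            dist = np.max(np.abs(templates - templates[i]), axis=1)
            count += np.sum(dist <= tol) - 1  # exclude the self-match
        return count

    b, a = match_count(m), match_count(m + 1)
    return float("inf") if a == 0 or b == 0 else -np.log(a / b)

regular = np.tile([1.0, 2.0], 100)                  # perfectly predictable signal
noisy = np.random.default_rng(0).uniform(size=200)  # irregular signal
```

Dysfluent events such as prolongations change the regularity of the sub-band signals, which is what shifts the SampEn values the classifiers operate on.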

17.
In the present study, wavelet transform (WT) and neural network techniques were developed for speech-based text-independent speaker identification. A feature extraction method was developed using the first five formants in conjunction with the Shannon entropy of the level-four wavelet packet (WP) decomposition. Thirty-five features were fed to feed-forward backpropagation neural networks (FFPBNN) for classification. Feature extraction and classification are performed by the wavelet packet and formants neural networks (WPFNN) expert system. The results show that the proposed method can make an effective analysis, with average identification rates reaching 91.09%. Two published methods were investigated for comparison, and the best recognition rate was obtained by WPFNN. The discrete wavelet transform (DWT) was studied to improve the system's robustness against noise at −2 dB.

18.
Pitch detection in noisy environments plays an important role in speech signal processing. To extract the pitch period of speech effectively at low SNR, a detection method based on wavelet-packet-weighted linear prediction autocorrelation is proposed. The method first removes noise with adaptive wavelet packet thresholding, sums the approximation components of the multi-level wavelet packet transform to emphasize the pitch information, and computes the autocorrelation of the wavelet-packet-coefficient-weighted linear prediction error to sharpen the peak at the pitch period, improving detection accuracy. Experimental results show that, compared with the traditional autocorrelation method and the wavelet-weighted autocorrelation method, this method is robust, produces smooth pitch tracks, and is more accurate, achieving good results even at an SNR of −5 dB.
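The autocorrelation peak-picking at the core of all these pitch detectors can be sketched on a clean synthetic "voiced" frame. The sampling rate, tone frequency, and lag search range are illustrative choices (40-400 Hz is a typical pitch range), not parameters from the paper, and the denoising/weighting stages described above are omitted:

```python
import numpy as np

def pitch_period_autocorr(frame, min_lag, max_lag):
    # Estimate the pitch period (in samples) as the lag of the
    # autocorrelation peak inside [min_lag, max_lag].
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags >= 0
    return min_lag + int(np.argmax(ac[min_lag:max_lag + 1]))

fs = 8000
t = np.arange(0, 0.04, 1.0 / fs)              # one 40 ms frame
frame = np.sin(2 * np.pi * 200 * t)           # 200 Hz "voiced" tone
period = pitch_period_autocorr(frame, 20, 200)  # search roughly 40-400 Hz
print(period)  # 40 samples, i.e. 8000 / 200
```

The paper's contribution is to make the peak at the true period survive noise, by denoising first and autocorrelating a weighted linear prediction error instead of the raw frame.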


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号