Similar literature
Found 20 similar articles (search time: 221 ms)
1.
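Estimating the vocal tract area from the vocal tract filter of a speech signal is affected by interference from the glottal wave and by non-ideal boundary conditions. To minimize these effects, a weighted linear prediction method for vocal tract area estimation is proposed. First, the phase-slope dynamic programming algorithm DYPPSA (Dynamic Programming Projected Phase Slope Algorithm) locates the glottal opening and closure instants. Weighted linear prediction is then used to compute the reflection coefficients of the vocal tract model during the glottal closed phase. Finally, based on the equivalence between the vocal tract filter and an acoustic tube model, the functional relationship between the reflection coefficients and the vocal tract area is derived recursively and the area is solved iteratively. Experiments on the same speech segment, compared against the reference vocal tract area obtained by magnetic resonance imaging, give a mean squared area error of 0.03 for the proposed method and 0.15 for the closed-phase method that takes the glottal wave amplitude below half its peak, so the proposed method is the more accurate of the two.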
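The last step of the abstract above, going from reflection coefficients to tube areas, can be sketched as follows. This is a minimal illustration, not the paper's weighted procedure: it assumes the sign convention r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k) and an arbitrary reference area for the first section, and the function names are hypothetical.

```python
import numpy as np

def areas_from_reflection(refl, first_area=1.0):
    """Cross-sectional areas of a lossless tube model.

    Assumed convention: r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k),
    so A_{k+1} = A_k * (1 + r_k) / (1 - r_k).
    `first_area` is an arbitrary reference; only area ratios are determined.
    """
    refl = np.asarray(refl, dtype=float)
    if np.any(np.abs(refl) >= 1.0):
        raise ValueError("each reflection coefficient must satisfy |r| < 1")
    ratios = (1.0 + refl) / (1.0 - refl)
    # Cumulative product turns section-to-section ratios into absolute areas.
    return first_area * np.concatenate(([1.0], np.cumprod(ratios)))

def reflection_from_areas(areas):
    """Inverse mapping under the same assumed convention."""
    areas = np.asarray(areas, dtype=float)
    return (areas[1:] - areas[:-1]) / (areas[1:] + areas[:-1])
```

Other texts use the opposite sign for r_k, which mirrors the area function; the convention has to match the one used by the linear prediction step.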

2.
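To address the large direction-of-arrival (DOA) estimation errors caused by grid mismatch in sparse representation models, this paper proposes an off-grid DOA estimation method based on covariance matrix reconstruction (OGCMR). First, the offset between each DOA and its grid point is incorporated into the spatially discretized sparse representation model of the received data. Next, a sparse-representation convex optimization problem for DOA estimation is formulated from the reconstructed signal covariance matrix. A convex model of the sampling covariance matrix estimation error is then constructed, and this convex set is explicitly included in the sparse representation model to improve sparse signal reconstruction. Finally, the resulting joint optimization problem is solved by alternating iteration to obtain the grid offset parameters and the off-grid DOA estimates. Numerical simulations show that, compared with conventional multiple signal classification (MUSIC), L1-SVD, and the robust MVDR based on sparse and low-rank recovery (SLRD-RMVDR), the proposed algorithm achieves better angular resolution and higher DOA estimation accuracy.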
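One common way to include the grid offset in the model is a first-order Taylor expansion of the steering vector around the nearest grid point. The block below is a sketch of that standard formulation; the symbols are assumptions for illustration, and the abstract does not confirm that the paper uses exactly this form.

```latex
% Assumed notation: \tilde{\theta}_n is the n-th grid point, \beta_n the offset
% of a source from that grid point, \mathbf{a}(\cdot) the array steering vector.
\begin{align}
  \mathbf{a}(\theta_k) &\approx \mathbf{a}(\tilde{\theta}_{n_k})
      + \mathbf{b}(\tilde{\theta}_{n_k})\,(\theta_k - \tilde{\theta}_{n_k}),
  \qquad
  \mathbf{b}(\theta) = \frac{\partial \mathbf{a}(\theta)}{\partial \theta}, \\
  \mathbf{y}(t) &\approx \bigl[\mathbf{A} + \mathbf{B}\,\operatorname{diag}(\boldsymbol{\beta})\bigr]\,\mathbf{s}(t) + \mathbf{n}(t),
\end{align}
% where the columns of A and B are a(\tilde{\theta}_n) and b(\tilde{\theta}_n),
% and s(t) is sparse over the grid.
```

With this form, the sparse support gives the nearest grid points and the entries of the offset vector refine them, which is why the two sets of unknowns lend themselves to alternating updates.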

3.
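To improve the channel estimation accuracy of DNN models in wireless communications, an optimization method for channel-estimation DNN models based on 1D-Concatenate is proposed. The method converts the Concatenate operation to one-dimensional (1D) data and introduces it into the DNN model as a skip connection, which suppresses vanishing gradients. 1D-Concatenate restores data features lost during network training and thereby improves DNN channel estimation accuracy. To verify the method, representative DNN-based wireless channel estimation models are selected for comparative simulation. The results show that the proposed optimization improves the estimation gain of existing DNN models by up to 77.10%, and the channel gain improvement reaches up to 3 dB at high signal-to-noise ratio (SNR). The method effectively improves DNN channel estimation accuracy in wireless communications, with a particularly marked improvement at high SNR.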

5.
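The compound-Gaussian distribution with inverse Gaussian texture (IG-CG distribution) is a commonly used model for high-resolution sea clutter, and its parameter estimation plays a key role in adaptive target detection for high-resolution maritime radar. Because the data used for parameter estimation inevitably contain outlier samples from sea-surface targets and reefs, the outlier-robust bi-percentile estimator is one of the effective methods proposed in recent years. This paper proposes an outlier-robust tri-percentile (Tri-per) parameter estimation method for the IG-CG distribution, which improves on the bi-percentile estimator in two respects: optimizing the positions of the two percentiles improves the estimation accuracy of the inverse shape parameter, and introducing a third percentile with an optimized position improves the estimation accuracy of the scale parameter. Finally, simulated and measured data are used to verify the effectiveness and robustness of the proposed estimator.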

6.
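To jointly estimate synchronization impairments and the channel in multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems, a joint estimation algorithm based on grid search is proposed. A system model that reflects the effects of the synchronization impairments and the channel response is first constructed. The multidimensional optimization problem of estimating the impairment parameters is then reduced to a two-dimensional grid search and a one-dimensional grid search, which yields joint estimates of the carrier frequency offset, sampling frequency offset, and symbol timing error. Numerical simulations show that the proposed joint estimation algorithm outperforms non-joint estimation algorithms.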
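The grid-search idea can be illustrated on a single parameter. The sketch below estimates only a carrier frequency offset from a known pilot on a single antenna with no noise or channel; it is a simplified stand-in, not the paper's joint MIMO-OFDM estimator, and the function name and the offset unit (cycles per sample) are assumptions.

```python
import numpy as np

def estimate_cfo_grid(rx, pilot, grid):
    """One-dimensional grid search for a carrier frequency offset.

    rx    : received samples, modeled as pilot[n] * exp(2j*pi*cfo*n)
    pilot : known transmitted samples
    grid  : candidate offsets in cycles per sample (hypothetical unit choice)
    Returns the candidate that maximizes the correlation magnitude.
    """
    rx = np.asarray(rx)
    pilot = np.asarray(pilot)
    grid = np.asarray(grid, dtype=float)
    n = np.arange(len(pilot))
    # Each row removes one candidate rotation from the received samples.
    derotate = np.exp(-2j * np.pi * np.outer(grid, n))
    metric = np.abs(derotate @ (rx * np.conj(pilot)))
    return grid[np.argmax(metric)]
```

The resolution is limited by the grid spacing, so a coarse search followed by a finer search around the best candidate is a common refinement.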

7.
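Based on the layered structure of the vocal folds and the relevant acoustic theory, this paper studies a reasonable vocal fold vibration model and simulates it to model the phonation process. A transmission-line vocal tract model is constructed from the cascaded acoustic tubes of a one-dimensional digital waveguide model. Using the analogies among acoustics, electricity, and mechanics, an equivalent circuit of the glottis is given and simulated jointly with the vocal fold model to produce the output speech signal. Experiments show that setting different vocal fold and glottal parameters produces sounds of different frequency and pitch, which illustrates how changes in vocal tract shape affect the sound.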

8.
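To address the large estimation errors of conventional direction-of-arrival (DOA) estimation algorithms caused by a finite number of samples, this paper proposes a robust DOA estimation method based on sparse and low-rank decomposition (SLRD). First, using low-rank matrix decomposition, the covariance matrix of the received signal is modeled as the sum of a low-rank noise-free covariance matrix and a sparse noise covariance matrix. Next, based on low-rank recovery theory, a convex optimization problem in the signal and noise covariance matrices is constructed. A convex model of the sampling covariance matrix estimation error is then built, and this convex set is explicitly included in the convex optimization problem to improve the estimate of the signal covariance matrix and hence the accuracy and robustness of DOA estimation. Finally, the DOA is estimated with the minimum variance distortionless response (MVDR) method applied to the resulting optimal noise-free covariance matrix. In addition, based on the statistical property that the sampling covariance matrix estimation error is asymptotically normally distributed, a criterion for selecting the error parameter factor is derived so that the noise-free covariance matrix is reconstructed well. Numerical simulations show that, compared with conventional beamforming (CBF), MVDR, conventional multiple signal classification (MUSIC), and the sparse and low-rank decomposition algorithm based on the augmented Lagrange multiplier (SLD-ALM), the proposed algorithm achieves higher DOA estimation accuracy and better robustness under finite sampling.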
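The final MVDR step can be sketched as a plain spatial-spectrum scan. The example assumes a uniform linear array with half-wavelength spacing and a full-rank covariance matrix supplied by the caller; the covariance reconstruction described in the abstract is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths)."""
    k = np.arange(n_sensors)
    return np.exp(-2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def mvdr_spectrum(R, angles_deg, spacing=0.5):
    """MVDR spatial spectrum P(theta) = 1 / (a^H R^{-1} a) on a set of angles."""
    n_sensors = R.shape[0]
    R_inv = np.linalg.inv(R)  # assumes R is invertible (e.g. contains noise)
    power = np.empty(len(angles_deg))
    for i, theta in enumerate(angles_deg):
        a = steering_vector(theta, n_sensors, spacing)
        power[i] = 1.0 / np.real(a.conj() @ R_inv @ a)
    return power
```

DOA estimates are read off as the locations of the largest peaks of the returned spectrum. A rank-deficient covariance matrix would need regularization such as diagonal loading before inversion.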

9.
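Robust parameter estimation for linear regression models based on uniform design   Cited by: 2 (self-citations: 0, citations by others: 2)
For a linear regression model whose system matrix is subject to random perturbations, a robust parameter estimation algorithm based on uniform design is proposed. Simulations show that estimating the regression model parameters with the proposed algorithm suppresses the effect of random perturbations on prediction accuracy, improves the stability and disturbance rejection of the model, and offers a feasible way to reduce the effect of instrument measurement error on prediction accuracy.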

10.
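Based on the Doppler distributed clutter (DDC) model, this paper studies the eigenvalue energy distribution of the clutter covariance matrix and proposes a robust method for estimating airborne radar clutter Doppler parameters. Its accuracy in estimating the clutter Doppler center is comparable to that of common existing methods, and its accuracy in estimating the clutter spectral width is better. The method is particularly suited to practical settings with few data samples, such as airborne radar moving target detection (MTD). Simulation experiments confirm the effectiveness of the method.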

11.
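LI Yongwei, TAO Jianhua, LI Kai. 《信号处理》 (Journal of Signal Processing), 2023, 39(4): 632-638
Speech emotion recognition is an indispensable part of natural human-computer interaction and an important component of artificial intelligence. Control of the articulatory organs produces differences in the acoustic features of emotional speech, which are perceived as different emotions. Conventional speech emotion recognition classifies emotion only from acoustic or auditory features of the speech signal and ignores the important role that production features such as the glottal wave and the vocal tract play in emotion perception. Our earlier work analyzed theoretically the important influence of the glottal wave and vocal tract shape on perceived emotion but did not use glottal wave and vocal tract features for speech emotion recognition. This paper therefore re-examines, from the perspective of speech production, the potential of glottal wave and vocal tract features for speech emotion recognition and proposes a recognition method based on the source-filter model that uses these features. First, the Liljencrants-Fant and Auto-Regressive eXogenous (ARX-LF) model separates the glottal wave and vocal tract features from the emotional speech signal. The separated features are then fed to a bidirectional gated recurrent unit (BiGRU) for emotion classification. Experiments on the public emotion dataset IEMOCAP show that glottal wave and vocal tract features discriminate emotions effectively and outperform some conventional features. By studying speech emotion recognition through the production-related glottal wave and vocal tract, this paper offers a new approach to speech emotion recognition.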

12.
Glottal source estimation using a sum-of-exponentials model   Cited by: 1 (self-citations: 0, citations by others: 1)
An algorithm for estimating the glottal source waveform in voiced speech is described. The glottal source waveform is described using the LF model proposed by Fant et al. (1985). The vocal tract filter is modeled as a pole-zero system. The analysis of vowel sounds from several talkers shows that the analysis procedure leads to an accurate estimate of the glottal source.

13.
In this paper, a new method is proposed to extract the physiologically relevant parameters of the vocal fold mathematical model, including masses, spring constants and damper constants, from high-speed video (HSV) image series. This method uses a genetic algorithm to optimize the model parameters until the model and the real vocal folds have similar dynamic behavior. Numerical experiments theoretically test the validity of the proposed parameter estimation method. The validated method is then applied to extract the physiologically relevant parameters from the glottal area series measured by HSV in an excised larynx model. With the estimated parameters, the vocal fold model accurately describes the vibration of the observed vocal folds. Further studies show that the proposed parameter estimation method can successfully detect, from the glottal area signal, the increase in longitudinal tension due to vocal fold elongation. These results imply the potential clinical application of this method in inspecting the tissue properties of the vocal folds.

14.
This study is devoted to constructing a mathematical model of acoustic interaction between the glottal volume velocity, vocal tract, and subglottal region (trachea, bronchi, and lungs). The model is based on the approximation of the acoustic impedances by autoregressive moving-average (ARMA) models. The experimental results are in good agreement with the data of other studies on interaction of the vocal source and vocal tract.

15.
The quality of synthetic speech is affected by two factors: intelligibility and naturalness. At present, synthesized speech may be highly intelligible, but often sounds unnatural. Speech intelligibility depends on the synthesizer's ability to reproduce the formants, the formant bandwidths, and formant transitions, whereas speech naturalness is thought to depend on the excitation waveform characteristics for voiced and unvoiced sounds. Voiced sounds may be generated by a quasiperiodic train of glottal pulses of specified shape exciting the vocal tract filter. It is generally assumed that the glottal source and the vocal tract filter are linearly separable and do not interact. However, this assumption is often not valid, since it has been observed that appreciable source-tract interaction can occur in natural speech. Previous experiments in speech synthesis have demonstrated that the naturalness of synthetic speech does improve when source-tract interaction is simulated in the synthesis process. The purpose of this paper is two-fold: (1) to present an algorithm for automatically measuring source-tract interaction for voiced speech, and (2) to present a simple speech production model that incorporates source-tract interaction into the glottal source model. This glottal source model controls: (1) the skewness of the glottal pulse, and (2) the amount of the first formant ripple superimposed on the glottal pulse. A major application of the results of this paper is the modeling of vocal disorders.

16.
For voiced speech, LPC coefficients obtained from sequential adaptation techniques are maximally perturbed roughly at the instant of glottal closure. Coefficient extraction at or immediately beyond this instant results in a poor vocal tract model estimate. A method is described to enhance the accuracy of the estimate by selecting coefficient vectors based on their time-domain convergence.

17.
18.
Traditional speech processing methods for laryngeal pathology assessment assume linear speech production with measures derived from an estimated glottal flow waveform. They normally require the speaker to achieve complete glottal closure, which for many vocal fold pathologies cannot be accomplished. To address this issue, a nonlinear signal processing approach is proposed which does not require direct glottal flow waveform estimation. This technique is motivated by earlier studies of airflow characterization for human speech production. The proposed nonlinear approach employs a differential Teager energy operator and the energy separation algorithm to obtain formant AM and FM modulations from filtered speech recordings. A new speech measure is proposed based on parameterization of the autocorrelation envelope of the AM response. This approach is shown to achieve impressive detection performance for a set of muscular tension dysphonias. Unlike flow characterization using numerical solutions of Navier-Stokes equations, this method is extremely computationally attractive, requiring only a small time window of speech samples. The new noninvasive method shows that a fast, effective digital speech processing technique can be developed for vocal fold pathology assessment without the need for direct glottal flow estimation or complete glottal closure by the speaker. The proposed method also confirms that alternative nonlinear methods can begin to address the limitations of previous linear approaches for speech pathology assessment.
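The two building blocks named in the abstract above can be sketched in a few lines: the discrete Teager energy operator, Psi[x](n) = x(n)^2 - x(n-1) x(n+1), and an energy separation step. The sketch assumes the DESA-2 variant of energy separation and an already band-pass-filtered input; the paper's autocorrelation-envelope measure is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def teager(x):
    """Discrete Teager energy: x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2(x):
    """DESA-2 energy separation (assumed variant).

    Returns (amplitude envelope, instantaneous frequency in rad/sample)
    for samples n = 2 .. len(x) - 3 of the input.
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]            # symmetric difference x[n+1] - x[n-1]
    psi_x = teager(x)[1:-1]       # trimmed to align with teager(y)
    psi_y = teager(y)
    cos_arg = np.clip(1.0 - psi_y / (2.0 * psi_x), -1.0, 1.0)
    freq = 0.5 * np.arccos(cos_arg)
    amp = 2.0 * psi_x / np.sqrt(psi_y)
    return amp, freq
```

For a pure tone A cos(Omega n + phi) the operator output is the constant A^2 sin^2(Omega), so the separation is exact. On real speech the energies can approach zero or turn negative, and a practical version would need to guard the division and the square root.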

19.
Brookes, D.M.; Loke, H.P. Electronics Letters, 1998, 34(23): 2202-2204
The detection of glottal closure and opening instants is needed for pitch-synchronous analysis in several areas of speech processing. The authors examine the flow of energy in the lossless-tube model of the vocal tract and show how linear predictive analysis may be used to estimate the waveform of acoustic input power at the glottis. It is demonstrated that this signal may be used to identify the instants of glottal closure and opening during voiced speech.
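The linear predictive analysis that this abstract builds on is commonly computed with the Levinson-Durbin recursion, which also yields the reflection coefficients of the equivalent lossless-tube model. The sketch below shows only that generic recursion, under an assumed sign convention for the predictor polynomial A(z) = 1 + a_1 z^-1 + ... + a_p z^-p; the authors' estimate of acoustic input power at the glottis is not reproduced, and the function name is hypothetical.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    r     : autocorrelation values r[0], ..., r[order]
    order : prediction order p
    Returns (a, refl, err): predictor polynomial a (a[0] == 1),
    reflection coefficients for stages 1..p, and the final prediction error power.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)
    a[0] = 1.0
    refl = np.zeros(order)
    err = r[0]
    for m in range(1, order + 1):
        # Correlation between the current residual and the lag-m sample.
        acc = r[m] + np.dot(a[1:m], r[m - 1:0:-1])
        k = -acc / err
        refl[m - 1] = k
        prev = a.copy()
        # Order update: a_m[i] = a_{m-1}[i] + k * a_{m-1}[m - i].
        a[1:m] = prev[1:m] + k * prev[m - 1:0:-1]
        a[m] = k
        err *= 1.0 - k * k
    return a, refl, err
```

The sign of the reflection coefficients depends on the convention chosen for A(z), so it has to be checked before the coefficients are mapped to tube areas or energy flows.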

20.
In this paper, the role of vocal fold elongation in governing glottal movement dynamics was theoretically and experimentally investigated. A theoretical model was first proposed to incorporate vocal fold elongation into the two-mass model. This model predicted the direct and nondirect components of the glottal time series as a function of vocal fold elongation. Furthermore, high-speed digital imaging was applied in excised larynx experiments to visualize vocal fold vibrations with vocal fold elongation varied from -10% to 50% and subglottal pressures of 18 and 24 cm H2O. Comparison between theoretical model simulations and experimental observations showed good agreement. A relative maximum was seen in the nondirect component of glottal area, suggesting that an optimal elongation could maximize the vocal fold vibratory power. However, sufficiently large vocal fold elongations caused the nondirect component to approach zero and the direct component to approach a constant. These results showed that vocal fold elongation plays an important role in governing the dynamics of glottal area movement and validated the applicability of the proposed theoretical model and high-speed imaging to investigate laryngeal activity.
