1.

Li Yongwei (李永伟), Tao Jianhua (陶建华), Li Kai (李凯) 《信号处理》 (Journal of Signal Processing) 2023,39(4):632-638

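Speech emotion recognition is an indispensable part of natural human-computer interaction and an important component of artificial intelligence. The way the articulatory organs are controlled produces differences in the acoustic features of emotional speech, which listeners then perceive as different emotions. Conventional speech emotion recognition classifies emotions using only the acoustic or auditory features of the speech signal, overlooking the important role that production-related features such as the glottal wave and the vocal tract play in emotion perception. In our previous work, we analyzed theoretically the important influence of the glottal wave and vocal tract shape on perceived emotion, but did not apply glottal wave and vocal tract features to speech emotion recognition. This paper therefore revisits, from the perspective of speech production, the feasibility of using glottal wave and vocal tract features for speech emotion recognition, and proposes a recognition method based on glottal wave and vocal tract features derived from the source-filter model. First, the Liljencrants-Fant and Auto-Regressive eXogenous (ARX-LF) model is used to separate the glottal wave and vocal tract features of emotional speech from the speech signal; then, the separated features are fed into a bidirectional gated recurrent unit (BiGRU) for the emotion classification task. Emotion recognition was validated on the public emotion dataset IEMOCAP. The experimental results show that glottal wave and vocal tract features can effectively distinguish emotions and that their recognition performance is better than that of some conventional features. By studying speech emotion recognition through the production-related glottal wave and vocal tract, this paper offers a new approach to speech emotion recognition technology.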

2.

Measuring and modeling vocal source-tract interaction

Childers D.G. Chun-Fan Wong 《IEEE transactions on bio-medical engineering》1994,41(7):663-671

The quality of synthetic speech is affected by two factors: intelligibility and naturalness. At present, synthesized speech may be highly intelligible, but often sounds unnatural. Speech intelligibility depends on the synthesizer's ability to reproduce the formants, the formant bandwidths, and formant transitions, whereas speech naturalness is thought to depend on the excitation waveform characteristics for voiced and unvoiced sounds. Voiced sounds may be generated by a quasiperiodic train of glottal pulses of specified shape exciting the vocal tract filter. It is generally assumed that the glottal source and the vocal tract filter are linearly separable and do not interact. However, this assumption is often not valid, since it has been observed that appreciable source-tract interaction can occur in natural speech. Previous experiments in speech synthesis have demonstrated that the naturalness of synthetic speech does improve when source-tract interaction is simulated in the synthesis process. The purpose of this paper is two-fold: (1) to present an algorithm for automatically measuring source-tract interaction for voiced speech, and (2) to present a simple speech production model that incorporates source-tract interaction into the glottal source model. This glottal source model controls: (1) the skewness of the glottal pulse, and (2) the amount of the first formant ripple superimposed on the glottal pulse. A major application of the results of this paper is the modeling of vocal disorders.
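
The linear, non-interacting source-filter assumption that this abstract questions can be illustrated with a minimal sketch: an idealized pulse train (the source) drives a single two-pole resonator (one formant of the vocal tract filter). Everything here is an illustrative assumption rather than the authors' model: the function names, the unit-impulse pulse shape, the 8 kHz sampling rate, the 100 Hz pitch, and the 500 Hz formant with 80 Hz bandwidth.

```python
import math

def impulse_train(n_samples, period):
    """Idealized glottal source: one unit pulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(x, freq_hz, bandwidth_hz, fs_hz):
    """Two-pole all-pole resonator standing in for a single formant."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle from centre frequency
    a1, a2 = -2.0 * r * math.cos(theta), r * r      # denominator 1 + a1 z^-1 + a2 z^-2
    y = []
    for n, xn in enumerate(x):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(xn - a1 * y1 - a2 * y2)            # y[n] = x[n] - a1*y[n-1] - a2*y[n-2]
    return y

fs = 8000                                   # assumed sampling rate in Hz
source = impulse_train(800, period=80)      # 100 Hz pulse train, 0.1 s long
speech = resonator(source, 500.0, 80.0, fs) # filter output: a crude voiced sound
```

Because the filter coefficients do not depend on the source, the two stages are fully separable in this sketch; modelling source-tract interaction, as the paper proposes, would require the source waveform to depend on the tract state.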

3.

Energy flow in lossless tube model of vocal tract with applications to glottal closure and opening detection

Brookes D.M. Loke H.P. 《Electronics letters》1998,34(23):2202-2204

The detection of glottal closure and opening instants is needed for pitch-synchronous analysis in several areas of speech processing. The authors examine the flow of energy in the lossless-tube model of the vocal tract and show how linear predictive analysis may be used to estimate the waveform of acoustic input power at the glottis. It is demonstrated that this signal may be used to identify the instants of glottal closure and opening during voiced speech.

4.

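A glottal source model estimation method for speech based on joint source-filter model optimization (total citations: 1; self-citations: 0; citations by others: 1)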

Fu Qiang (付强), Peter Murphy, Yan Yonghong (颜永红) 《电子学报》 (Acta Electronica Sinica) 2007,35(5):982-986

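This paper presents a robust glottal source model estimation method based on joint source-filter separation. The method models the glottal flow derivative with the LF (Liljencrants-Fant) model, while the vocal tract is described as a time-varying ARX model. Because the joint estimation problem is a multivariate nonlinear optimization process, a two-pass implementation strategy is adopted. The first pass initializes the glottal source and vocal tract models and provides robust initial parameters for the subsequent joint optimization. The second pass, the joint estimation, ultimately determines the accuracy of the model estimate and is implemented with a trust-region descent optimization algorithm. Experiments on both synthetic and real speech show that the method estimates the glottal source model with reasonable accuracy and good robustness.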
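
For orientation, a time-varying ARX description of the vocal tract driven by a glottal flow derivative is commonly written in the generic form below. This is a textbook-style sketch, not necessarily the notation of the paper: the symbols $s(n)$ (speech), $u(n)$ (LF-modelled glottal flow derivative), $a_k(n)$ and $b_0$ (filter coefficients), $e(n)$ (residual), and the order $p$ are all assumptions introduced here.

```latex
s(n) + \sum_{k=1}^{p} a_k(n)\, s(n-k) = b_0\, u(n) + e(n)
```

Under this form, joint estimation means searching for the LF parameters that shape $u(n)$ and the coefficients $a_k(n)$, $b_0$ together so that the energy of $e(n)$ is small, which is why the problem becomes a multivariate nonlinear optimization.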

5.

Critical analysis of the impact of glottal features in the classification of clinical depression in speech

Moore E Clements MA Peifer JW Weisser L 《IEEE transactions on bio-medical engineering》2008,55(1):96-107

6.

Electroglottography for Laryngeal Function Assessment and Speech Analysis

Childers Donald G. Larar J. N. 《IEEE transactions on bio-medical engineering》1984,(12):807-817

The methodology of electroglottography is briefly outlined. Major emphasis is given to validating key features of the electroglottographic (EGG) waveform using ultrahigh-speed laryngeal films. We show how the instants of glottal closure and opening may be identified from the EGG waveform. This information may be used to improve speech analysis techniques such as the pitch synchronous, closed phase, covariance analysis method. Other applications include pitch detection, the determination of intervals of voicing, unvoicing, mixed voicing and silence, improving speech synthesis, and assisting the automation of inverse filtering.

7.

A nonlinear operator-based speech feature analysis method with application to vocal fold pathology assessment

Hansen J.H.L. Gavidia-Ceballos L. Kaiser J.F. 《IEEE transactions on bio-medical engineering》1998,45(3):300-313

Traditional speech processing methods for laryngeal pathology assessment assume linear speech production with measures derived from an estimated glottal flow waveform. They normally require the speaker to achieve complete glottal closure, which for many vocal fold pathologies cannot be accomplished. To address this issue, a nonlinear signal processing approach is proposed which does not require direct glottal flow waveform estimation. This technique is motivated by earlier studies of airflow characterization for human speech production. The proposed nonlinear approach employs a differential Teager energy operator and the energy separation algorithm to obtain formant AM and FM modulations from filtered speech recordings. A new speech measure is proposed based on parameterization of the autocorrelation envelope of the AM response. This approach is shown to achieve impressive detection performance for a set of muscular tension dysphonias. Unlike flow characterization using numerical solutions of Navier-Stokes equations, this method is extremely computationally attractive, requiring only a small time window of speech samples. The new noninvasive method shows that a fast, effective digital speech processing technique can be developed for vocal fold pathology assessment without the need for direct glottal flow estimation or complete glottal closure by the speaker. The proposed method also confirms that alternative nonlinear methods can begin to address the limitations of previous linear approaches for speech pathology assessment.
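
As background for the operator named in this abstract, the sketch below shows the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], which is an assumption standing in for the paper's "differential" variant and its full energy separation algorithm. For a pure tone A*cos(omega*n + phi) the operator returns the constant A^2 * sin(omega)^2, so it tracks amplitude and frequency jointly from only three neighbouring samples. The function name and test-tone parameters are illustrative.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Illustrative test tone: amplitude 2, digital frequency 0.3 rad/sample.
amp, omega = 2.0, 0.3
tone = [amp * math.cos(omega * n + 0.5) for n in range(50)]
psi = teager_energy(tone)   # every entry is close to amp**2 * sin(omega)**2
```

The three-sample support is what makes the approach cheap: no glottal flow estimate or long analysis window is needed to obtain an instantaneous energy measure.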

8.

Voice source synthesis using an improved LF model

Peng Bai (彭柏), Xu Gang (许刚) 《电声技术》 (Audio Engineering) 2006,(5):53-57

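The LF differential glottal wave model is an important reference model for analyzing the voice source. It corresponds to the actual glottal waveform through time-domain parameters, and the model parameters obtained from the analysis can be used to synthesize the glottal excitation signal. Voice source synthesis is an important foundation for voice conversion and speech synthesis. Building on the LF-4 differential glottal wave model, an improved algorithm for the LF-4 model is proposed for voice source synthesis. Experimental results and analysis show that the algorithm allows flexible control of the voice source parameters and gives the synthesized voice source a higher degree of blending.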

9.

Improved source corrected analysis of speech by glottal endpoint preconditioning

Thomson M.M. Guillemin B.J. 《Electronics letters》1989,25(14):881-882

An improved method of source corrected analysis of speech is presented, based on preconditioning of the initial glottal endpoint estimates. The authors show why the existing algorithm sometimes fails to converge and how this may be avoided by modifying the initial endpoint estimates so that they lie between their true positions.

10.

Computer-Aided Design of Digital Lightwave Systems (total citations: 4; self-citations: 0; citations by others: 4)

Duff D. 《Selected Areas in Communications, IEEE Journal on》1984,2(1):171-185

Models and analysis methods suitable for computer-aided design of digital lightwave systems are reviewed. An overview of a digital lightwave system is described along with properties of system components. Models for system degradations due to the source waveform, fiber dispersion, and noise are developed. Various analytical methods used to compute error rate, eye margin, and power penalty are described and compared.

11.

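Simulation analysis and optimization of interferometric fiber-optic hydrophones (total citations: 1; self-citations: 0; citations by others: 1)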

Wu Li (吴黎), Ye Zhicheng (叶志成), Lu Xiaodong (陆晓东), Quan Yujun (全宇军) 《激光与光电子学进展》 (Laser & Optoelectronics Progress) 2006,43(2):37-41

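The principle of phase-generated-carrier modulation and demodulation with a directly modulated light source for interferometric fiber-optic hydrophones is outlined. Problems that arise in practical applications are analyzed in detail, both theoretically and by simulation. Optimization schemes are proposed to improve the stability and dynamic range of the hydrophone and to reduce waveform distortion: optimizing the modulation signal amplitude, eliminating the effect of delay, and using an automatic gain control circuit to stabilize the photodetector output.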

12.

Theoretical model of the hybrid soliton pulse source

Ozyazici M.S. Morton P.A. Zhang L.M. Mizrahi V. 《Photonics Technology Letters, IEEE》1995,7(10):1142-1144

A complete model of the hybrid soliton pulse source based on a time-domain solution of the coupled-mode equations is described. The model predicts the novel wavelength self-tuning mechanism and large stability range seen with this source. Results show the output waveform, optical spectrum, and instantaneous frequency, providing accurate source characteristics for use in transmission simulations.

13.

Simulation and characteristic analysis of an all-solid-state fast-edge electromagnetic pulse source

Zhao Min (赵敏), Zhou Xing (周星), Wang Qingguo (王庆国), Yang Qingxi (杨清熙) 《微波学报》 (Journal of Microwaves) 2015,31(1):65-69

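To provide theoretical support for parameter selection of pulse sources and for electromagnetic pulse protection, the avalanche effect of the avalanche transistor and the operating principle of the all-solid-state fast-edge electromagnetic pulse source are first described. Then, by simplifying the turn-on process of the avalanche transistor, an equivalent model of the pulse source is established and its time-domain characteristics are analyzed. Using a custom curve-fitting method, a mathematical expression for the output waveform of the pulse source is obtained for the first time and its validity is verified, which finally establishes a concrete theoretical model of the pulse source. Based on the parameters of this expression, frequency-domain analysis of the pulse current shows that the current amplitude is concentrated mainly between 0 and 200 MHz, and that more than 90% of the energy lies within 810 MHz.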

14.

A 2.4 kbps waveform interpolation vocoder (total citations: 2; self-citations: 0; citations by others: 2)

Yang Huimin (杨慧敏), Chen Hongyi (陈弘毅), Sun Yihe (孙义和), Wu Lixi (吴历曦) 《电子学报》 (Acta Electronica Sinica) 1998,26(11):110-113,106

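This paper describes the basic principles and coding techniques of the waveform interpolation method for speech analysis and synthesis, and presents the practical structure of a 2.4 kbps waveform interpolation vocoder. By extracting characteristic waveforms and separating the voiced and unvoiced components, the vocoder effectively removes the redundancy of the pitch waveform in the residual signal and achieves high-quality synthesized speech at a data rate of 2.4 kbps.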

15.

Precise glottal closure instant detector for voiced speech (total citations: 1; self-citations: 0; citations by others: 1)

Hahn M. Dong-Guy Kang 《Electronics letters》1996,32(23):2117-2118

An algorithm which can extract precise glottal closure instant (GCI) information directly from speech is described. By utilising the average pitch and the area information, accurate GCI positions are obtained. Compared with GCI information classified manually from EGG signals, the result shows an average error of only 0.2 ms.

16.

Model of acoustic interaction between the vocal tract, subglottal region, and vocal source

K. S. Gorbunov I. S. Makarov 《Journal of Communications Technology and Electronics》2010,55(12):1456-1465

This study is devoted to constructing a mathematical model of acoustic interaction between the glottal volume velocity, vocal tract, and subglottal region (trachea, bronchi, and lungs). The model is based on the approximation of the acoustic impedances by autoregressive moving-average (ARMA) models. The experimental results are in good agreement with the data of other studies on interaction of the vocal source and vocal tract.

17.

Modeling and measurement of flow effects on tracheal sounds

Harper VP Pasterkamp H Kiyokawa H Wodicka GR 《IEEE transactions on bio-medical engineering》2003,50(1):1-10

The analysis of breathing sounds measured over the extrathoracic trachea offers a noninvasive technique to monitor obstructions of the respiratory tract. Essential to the development of this technique is a quantitative understanding of how such tracheal sounds are related to the underlying tract anatomy, airflow, and disease-induced obstructions. In this study, the first dynamic acoustic model of the respiratory tract was developed that takes into consideration such factors as turbulent sound sources and varying glottal aperture. Model predictions were compared to tracheal sounds measured on four healthy subjects at target flow rates of 0.5, 1.0, 1.5, and 2.0 L/s, and also during nontargeted breathing. Both the simulation and measurement spectra depicted increasing sound power with increasing flow, with smaller incremental increases at the higher flow rates. A sound power increase of approximately 30 dB between a flow rate of 0.5 and 2.0 L/s was observed in both the simulated and measured spectra. Variations of as much as 15 dB over the 300-600 Hz frequency band were noted in the sound power produced during targeted and nontargeted breathing maneuvers at the same flow rates. We propose that this variability was in part due to changes in glottal aperture area, which is known to vary during normal respiration and has been observed as a method of flow control. Model simulations incorporating a turbulent source at the glottis with respiratory cycle variations in glottal aperture from 0.64 cm² to 1.4 cm² explained approximately 10 dB of the measured variation. This study provides the first links between spatially distributed sound sources due to turbulent flow in the respiratory tract and noninvasive tracheal sound measurements.

18.

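CPLD implementation and optimization of a radar digitally coded waveform generator (total citations: 1; self-citations: 0; citations by others: 1)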

Liu Duren (刘笃仁), Qin Jinming (秦金明) 《现代雷达》 (Modern Radar) 2005,27(12):53-56

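An optimized design and implementation of a programmable radar digitally coded waveform generator based on a CPLD is presented. The design approach using Verilog HDL is described, the application and design flow of Verilog HDL on the in-system programmable (ISP) development platform ispLEVER are explained, and an implementation using the Lattice Semiconductor CPLD LC51024VG-5F484C is given. The overall system design is simple, clear, and efficient. The coded waveform generator makes three quantities arbitrarily variable: the symbol width of the waveform, the repetition period, and the number of symbols. The maximum number of symbol bits can be upgraded at any time as needed, and such a change only requires adjusting the number of shift registers and latches instantiated in the top-level module of the design source file. After simulation, synthesis, optimization, and fitting, the design can be downloaded again to the chosen CPLD without changing any other peripheral circuitry, which makes it very flexible and convenient.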

19.

A current source PWM inverter with actively commutated SCRs (total citations: 1; self-citations: 0; citations by others: 1)

Bendre A. Wallace I. Nord J. Venkataramanan G. 《Power Electronics, IEEE Transactions on》2002,17(4):461-468

Conventional SCR-based current source inverters suffer from poor waveform quality due to six-step switching. Pulse width modulated current source inverters typically require gate turn-off devices with reverse voltage blocking capability, which has limited their application. In this paper, a new pulse width modulated current source inverter topology using one gate turn-off switch and six SCRs is presented. The converter uses active commutation to realize pulse width modulation in a conventional SCR-based current source inverter. Modulation techniques for the proposed inverter, together with simulation and experimental results, are described in the paper. This topology is suitable for high-performance, high-power applications.

20.

Pulse and time-domain measurements

《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1986,74(1):77-81

A review of the state of the art and science of pulse parameter measurements is given including recent advances in the use of real-time oscilloscopes, waveform recorders, equivalent time sampling oscilloscopes, and counter timers in the measurement of repetitive and single transient signals. Recent advances in the use of artifact waveform standards and modern signal analysis techniques to compensate for measurement distortion are highlighted. The formation and progress of an IEEE committee which is developing a performance standard for waveform recorders is also described.