
1.

王文延曾庆宁李琴赵中华《声学技术》2007,26(3):435-441

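Detecting unvoiced sounds with the short-time zero-crossing rate and voiced sounds with the short-time energy, and combining the two, achieves endpoint detection when the signal-to-noise ratio (SNR) is high. At low SNR, however, both methods fail. To detect speech endpoints accurately in noisy environments, and based on a study of noisy speech in the time-frequency domain, a speech endpoint detection method built on the Matching pursuits time-frequency decomposition algorithm is proposed. The method decomposes the noisy signal with the Matching pursuits algorithm and then applies the Wigner transform to the result, which completely removes the Wigner cross-term interference, so that the speech and noise signals show clearly distinguishable Wigner energy distributions on the time-frequency plane. Endpoints are then detected from this property. Experimental results show that the method accurately detects speech endpoints at low SNR.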
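The traditional baseline mentioned in this abstract (short-time energy for voiced sounds, zero-crossing rate for unvoiced sounds) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the frame length, hop size, both thresholds, and the synthetic test signal are all assumptions chosen for the example.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_speech_frames(signal, frame_len=80, hop=40,
                         energy_thresh=1.0, zcr_thresh=0.3):
    """Flag a frame as speech when its energy is high (voiced-like) or its
    zero-crossing rate is high (unvoiced-like).
    The thresholds are illustrative assumptions, not values from the paper."""
    return [short_time_energy(f) > energy_thresh
            or zero_crossing_rate(f) > zcr_thresh
            for f in frames(signal, frame_len, hop)]

# Synthetic example: silence, a 200 Hz tone sampled at 8 kHz, then silence.
fs = 8000
silence = [0.0] * 400
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]
flags = detect_speech_frames(silence + tone + silence)
```

The first and last `True` entries in `flags` mark the estimated start and end frames. With additive noise, both features rise in the silent frames as well, which is the low-SNR failure the abstract describes.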

2.

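A fundamental frequency estimation algorithm based on modified cepstrum and dynamic programming  Total citations: 1, self-citations: 0, citations by others: 1

金学成解岭汪增福《声学技术》2008,27(1):79-86

Pitch frequency is an important parameter in speech signal processing. Frequency-doubling and frequency-halving errors, and the reliability of voiced/unvoiced decisions, have long been difficult problems in fundamental frequency (F0) estimation. Building on a suitable modification of the speech cepstrum, a high-accuracy F0 estimation algorithm is proposed. Based on how the cepstrum, short-time energy and short-time zero-crossing rate behave differently in unvoiced and voiced segments, the algorithm constructs a voiced/unvoiced decision function that greatly improves decision accuracy; it then tracks F0 using dynamic programming. The cost function takes F0 continuity fully into account, so the algorithm avoids doubling and halving errors while still reflecting natural doubling and halving of F0. Comparative experiments against several well-performing existing methods show that the algorithm offers high accuracy and smooth F0 contours, and the contours it produces need essentially no post-smoothing.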

3.

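An endpoint detection method based on the matching pursuit time-frequency decomposition algorithm  Total citations: 2, self-citations: 0, citations by others: 2

王文延曾庆宁李琴《声学技术》2007,26(1):117-120

To detect the endpoints of a speech signal accurately in a noise-free environment, the traditional approach uses the zero-crossing method to detect unvoiced sounds and the short-time energy method to detect voiced sounds, and combines the two. Based on a study of how speech signals are distributed on the time-frequency plane, an endpoint detection method built on the matching pursuit time-frequency atomic decomposition algorithm is proposed. The method decomposes the signal with matching pursuit so that it shows a clear Wigner energy distribution on the time-frequency plane; a threshold is then set on this distribution to detect the speech endpoints accurately. Experimental results show that, compared with the traditional method, real-time performance is worse because signal decomposition is involved, and the choice of threshold needs further study, but the method detects speech endpoints more accurately and offers a new way of thinking about endpoint detection.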
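The decomposition step this abstract relies on can be illustrated with a generic greedy matching pursuit. The sketch below is an assumption-laden toy, not the paper's implementation: the `gabor` helper, the nine-atom dictionary, and the test signal are hypothetical, and the Wigner energy distribution and thresholding stages are not shown.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit: repeatedly pick the unit-norm atom most
    correlated with the residual and subtract its projection."""
    residual = list(signal)
    picks = []  # (atom index, coefficient) pairs, in selection order
    for _ in range(n_iter):
        coeffs = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(coeffs[i]))
        picks.append((k, coeffs[k]))
        residual = [r - coeffs[k] * a for r, a in zip(residual, atoms[k])]
    return picks, residual

# Hypothetical toy dictionary: Gaussian-windowed cosines (Gabor-like atoms).
N = 64

def gabor(center, freq, width=6.0):
    return normalize([math.exp(-((n - center) / width) ** 2)
                      * math.cos(2 * math.pi * freq * (n - center) / N)
                      for n in range(N)])

atoms = [gabor(c, f) for c in (16, 32, 48) for f in (4, 8, 12)]

# Test signal built from two well-separated atoms (indices 1 and 7).
signal = [3.0 * a + 1.0 * b for a, b in zip(atoms[1], atoms[7])]
picks, residual = matching_pursuit(signal, atoms, n_iter=4)
```

Each selected atom has a known time center and frequency, so its energy can be placed on the time-frequency plane atom by atom. That is why summing per-atom Wigner distributions avoids the cross terms that appear when the transform is applied to the whole signal at once.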

4.

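An endpoint detection method based on the matching pursuit time-frequency decomposition algorithm

王文延曾庆宁李琴《声学技术》2007,26(1):117-120

To detect the endpoints of a speech signal accurately in a noise-free environment, the traditional approach uses the zero-crossing method to detect unvoiced sounds and the short-time energy method to detect voiced sounds, and combines the two. Based on a study of how speech signals are distributed on the time-frequency plane, an endpoint detection method built on the matching pursuit time-frequency atomic decomposition algorithm is proposed. The method decomposes the signal with matching pursuit so that it shows a clear Wigner energy distribution on the time-frequency plane; a threshold is then set on this distribution to detect the speech endpoints accurately. Experimental results show that, compared with the traditional method, real-time performance is worse because signal decomposition is involved, and the choice of threshold needs further study, but the method detects speech endpoints more accurately and offers a new way of thinking about endpoint detection.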

5.

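MATLAB-based implementation of pitch period detection for speech signals

段继鹏李春泉熊殷《中国科技博览》2008,(21)

This paper analyzes pitch period detection for speech signals and implements it in MATLAB. The implementation uses center clipping, which makes pitch period detection more reliable, and adopts three-level clipping to reduce the number of multiplications required by autocorrelation-based pitch period detection.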
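A rough sketch of three-level clipping followed by autocorrelation is given below. It is written in Python rather than the paper's MATLAB, and the clipping ratio of 0.6, the lag search range, and the synthetic frame are assumptions made for the example.

```python
import math

def three_level_clip(frame, ratio=0.6):
    """Map each sample to +1, 0 or -1 using a threshold set to a fraction
    of the frame's peak amplitude (the ratio is an assumed value)."""
    level = ratio * max(abs(x) for x in frame)
    return [1 if x > level else -1 if x < -level else 0 for x in frame]

def pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) that maximizes the autocorrelation
    of the three-level clipped frame."""
    c = three_level_clip(frame)

    def autocorr(lag):
        # Every product is -1, 0 or +1, so hardware or fixed-point code
        # can replace the multiplications with sign comparisons.
        return sum(a * b for a, b in zip(c, c[lag:]))

    return max(range(min_lag, max_lag + 1), key=autocorr)

# Synthetic voiced-like frame: 100 Hz fundamental plus its second harmonic,
# sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
period = pitch_period(frame, min_lag=20, max_lag=160)  # searches 50 Hz to 400 Hz
pitch_hz = fs / period
```

Clipping discards the low-amplitude structure shaped by the formants and keeps only the main peaks of each pitch cycle, which is why the autocorrelation peak at the pitch lag stands out more clearly.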

6.

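A high-accuracy pitch detection algorithm based on similarity

陈雪勤刘正赵鹤鸣《声学技术》2008,27(5):704-707

A pitch detection algorithm with high accuracy and strong noise robustness is proposed. The algorithm treats the linear prediction residual as an approximation of the speech source signal, analyzes its spectrum, and derives a coarse estimate of the pitch period from the residual magnitude spectrum. Returning to the time-domain signal, it designs a window of adjustable length from the coarse estimate, uses the window to take two consecutive segments of the speech, and computes their similarity; the accurate pitch period is obtained from the maximum similarity value. The method is accurate and also performs well in noisy environments.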

7.

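A pitch detection algorithm based on linear prediction and maximum likelihood estimation  Total citations: 1, self-citations: 0, citations by others: 1

张永亮鲁宇明张先庭杨焱《声学技术》2009,28(6):768-772

The LPC (Linear Predictive Coding) spectrum of the speech signal is obtained by linear prediction, and the short-time spectrum of the speech is then reconstructed by convolving candidate glottal excitations with the LPC spectrum. When the distortion between the reconstructed spectrum and the original speech spectrum is minimal, the spacing between glottal excitations is the pitch period. To improve computational efficiency, candidate pitch periods are found with a frequency-domain dynamic search. Numerical experiments show that the pitch detection algorithm combining linear prediction and maximum likelihood (ML) estimation preserves more pitch information, effectively reduces pitch detection errors, and is more robust than the traditional ML method.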

8.

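An improved speech pitch period detection algorithm combining EMD and DWT-ACF

张涛章小兵朱明星《噪声与振动控制》2018,38(2):173-178

The traditional wavelet-autocorrelation algorithm produces deviations and missed detections when estimating the pitch period of speech in noisy environments. To address this, an improved pitch period detection algorithm that applies wavelet-autocorrelation under empirical mode decomposition (EMD) is proposed. The algorithm first uses EMD to remove the trend term of the noisy speech and reduce noise, then uses an improved wavelet-autocorrelation method to accentuate the peak of each pitch period, which improves detection accuracy. Experimental results show that the improved method effectively reduces deviations and false detections in pitch extraction from noisy speech, avoids some frequency-doubling and frequency-halving errors, and improves both the speed and the accuracy of pitch period detection.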

9.

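An experimental study of speech signal endpoint detection

程启明《声学与电子工程》1997,(3):29-31

Endpoint detection of speech signals generally relies on two parameters, the short-time average zero-crossing rate and the short-time average energy; a single parameter alone usually cannot separate noise, unvoiced sounds and voiced sounds. Through theoretical analysis and experiments, this paper shows that the short-time zero-crossing rate alone can separate unvoiced from voiced sounds but cannot effectively separate unvoiced sounds from noise.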

10.

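A real-time pitch detection method based on the energy symmetry parameter

朱君波高瑞华王守觉《声学与电子工程》2003,(4):9-10

A pitch detection method based on the energy symmetry (ES) parameter is proposed. The pitch of the speech is first estimated roughly through peak detection and symmetry detection, and the best pitch estimate is then obtained from the ES parameter. Experiments show that the method is both real-time and highly accurate and has no delay problem, making it a speech signal processing method suitable for implementation on a single-chip microcontroller.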

11.

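Endpoint detection based on critical-band power spectrum variance  Total citations: 1, self-citations: 0, citations by others: 1

张春雷曾向阳王曙光《声学技术》2012,31(2):204-208

Endpoint detection is a key technology in speech signal processing, and its accuracy directly affects the computational complexity and recognition performance of speech recognition systems. Building on theoretical studies of human auditory characteristics, and exploiting the difference between speech segments and background-noise segments in the critical-band power spectrum, an endpoint detection method based on critical-band power spectrum variance is proposed. By selecting the threshold adaptively, the method tracks background noise well. Endpoint detection experiments were carried out under different SNR conditions. The results show that, compared with the traditional short-time energy and short-time average zero-crossing rate method and with the spectral entropy method, the proposed method effectively reduces the influence of background noise and is more robust and more accurate.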

12.

Measurement of the effects of temporal clipping on speech quality

Lijing Ding Radwan A. El-Hennawey M.S. Goubran R.A. 《IEEE transactions on instrumentation and measurement》2006,55(4):1197-1203

This paper investigates the effects of temporal clipping on perceived speech quality. Temporal clipping usually results from voice activity detection (VAD), or line echo canceller's nonlinear processor, and the clipped speech portions are replaced by comfort noise. A nonintrusive algorithm is proposed to predict speech quality based on the clipping statistics. Mean opinion score (MOS) is used as a metric for speech quality and is measured by perceptual evaluation of speech quality (PESQ). The impacts of speech frame size and noise spectrum on the algorithm are also investigated. The results show that the proposed algorithm can efficiently predict the speech quality. The correlation coefficient between the prediction and the measurement is about 0.975, and the root mean square error for the prediction is 0.20 MOS. The algorithm can be used as an integral part of a general speech quality assessment scheme in voice over Internet protocol (VoIP).

13.

Epoch-based analysis of speech signals

B YEGNANARAYANA SURYAKANTH V GANGASHETTY 《Sadhana》2011,36(5):651-697

Speech analysis is traditionally performed using short-time analysis to extract features in time and frequency domains. The window size for the analysis is fixed somewhat arbitrarily, mainly to account for the time varying vocal tract system during production. However, speech in its primary mode of excitation is produced due to impulse-like excitation in each glottal cycle. Anchoring the speech analysis around the glottal closure instants (epochs) yields significant benefits for speech analysis. Epoch-based analysis of speech helps not only to segment the speech signals based on speech production characteristics, but also helps in accurate analysis of speech. It enables extraction of important acoustic-phonetic features such as glottal vibrations, formants, instantaneous fundamental frequency, etc. Epoch sequence is useful to manipulate prosody in speech synthesis applications. Accurate estimation of epochs helps in characterizing voice quality features. Epoch extraction also helps in speech enhancement and multispeaker separation. In this tutorial article, the importance of epochs for speech analysis is discussed, and methods to extract the epoch information are reviewed. Applications of epoch extraction for some speech applications are demonstrated.

14.

Real-time and MPEG-1 layer III compression resistant steganography in speech

Shirali-Shahreza M.H. Shirali-Shahreza S. 《Information Security, IET》2010,4(1):1-7

Embedding a secret message into a cover medium without attracting any attention, known as steganography, is one of the methods used for hidden communication. One of the cover media that can be used for steganography is speech. In this study, the authors propose a new steganography method for speech signals. In this method, the silence intervals of speech are found and the length (number of samples) of these intervals is changed to hide information. The main feature of the method is robustness to MPEG-1 layer III (MP3) compression. The method can hide information in a speech stream with very low processing time, which makes it a real-time steganography method. Its hiding capacity is comparable with that of other MP3-resistant methods, and listening tests show that the degradation in speech quality is not annoying. Additionally, the effect of the method on chaotic features is negligible, so it is difficult to detect with chaotic-based steganalysis methods.

15.

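Application of wavelet packet energy spectrum and BP neural network to ultrasonic testing of corrugated duct grouting

梁凯韩庆邦《声学技术》2020,39(2):151-156

To address the limitations of wavelet analysis in signal processing, wavelet packet analysis is combined with a back propagation (BP) neural network, and an ultrasonic testing method for corrugated duct grouting based on the wavelet packet energy spectrum and a BP neural network is proposed. Echo signals from a corrugated duct model are acquired by ultrasonic testing, and the energy of each sub-band after wavelet packet decomposition serves as the detection feature. When detachment occurs inside the duct, the detection features change; the features are finally fed into the BP neural network for classification. Test results show that the method diagnoses internal defects of corrugated ducts well and can provide technical support for ultrasonic testing of corrugated ducts.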
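The feature-extraction step (energy of each sub-band after wavelet packet decomposition) can be sketched as follows. The abstract does not name the wavelet basis or the decomposition depth, so the Haar filter, the three levels, and the synthetic signal here are assumptions; the BP neural network classifier is not shown.

```python
import math

def haar_step(x):
    """One Haar analysis step: returns (approximation, detail) half-bands."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_energies(signal, levels=3):
    """Split every node at every level (full packet tree) and return the
    normalized energy of each of the 2**levels terminal sub-bands.
    Sub-bands are listed in tree order, not sorted by frequency."""
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_step(node)]
    energies = [sum(v * v for v in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies]

# Synthetic stand-in for an echo: a low-frequency tone whose length
# is divisible by 2**levels.
signal = [math.sin(2 * math.pi * 2 * n / 64) for n in range(64)]
features = wavelet_packet_energies(signal, levels=3)  # 8-element feature vector
```

Unlike the ordinary wavelet transform, which splits only the approximation branch, the packet tree also splits the detail branches, so the high-frequency range is resolved as finely as the low-frequency range. A vector such as `features` would then be the classifier input.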

16.

An Efficient Detection Approach of Content Aware Image Resizing

Ming Lu Shaozhang Niu Zhenguang Gao 《计算机、材料和连续体（英文）》2020,64(2):887-907

Content aware image resizing (CAIR) is an excellent technology used widely for image retargeting. It can also be used to tamper with images, creating a crisis of public trust in image content. Once an image is processed by CAIR, the correlation among local neighborhood pixels is destroyed. Although local binary patterns (LBP) can effectively describe local texture, they cannot describe the magnitude information of local neighborhood pixels and are also vulnerable to noise. Therefore, to detect CAIR, a novel forensic method based on the improved local ternary patterns (ILTP) feature and the gradient energy feature (GEF) is proposed in this paper. First, the adaptive threshold of the original local ternary patterns (LTP) operator is improved, and the ILTP operator is used to describe the change in correlation among local neighborhood pixels caused by CAIR. Second, the histogram features of ILTP and the gradient energy features are extracted from the candidate image for CAIR forgery detection. Then, the ILTP features and the gradient energy features are concatenated into combined features, which are used to train a classifier. Finally, a support vector machine (SVM) is used as the classifier, trained and tested on the above features to determine whether an image has been subjected to CAIR. The candidate images are drawn from the uncompressed color image database (UCID), from which the training and testing sets are created. Experimental results on many test images show that the proposed method detects CAIR tampering effectively and outperforms other methods, including state-of-the-art approaches.

17.

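A steel plate counting algorithm based on peak-valley features and combined voting

梁田龙永红汤汶龙刘芸萌《包装学报》2023,15(3):85-90

To address the low efficiency and high risk of traditional manual counting, a steel plate counting algorithm based on peak-valley features and combined voting is designed. The frame difference method is used to detect the region of interest, which then undergoes image preprocessing and edge detection. The edge image of each segmented block of the plate stack is projected vertically to obtain a gray-level projection curve, which is then denoised. First-order and second-order forward differences of the projection curve yield the peak and valley points of the plates, and the peak and valley points from multiple image blocks are combined by voting to obtain the plate count. Experimental results show that the algorithm's plate counting accuracy is above 95%, which meets practical requirements.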
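The peak-finding and voting steps can be sketched roughly as below. The abstract does not give the exact difference test or voting rule, so this example makes two assumptions: a peak is taken to be a point where the first forward difference turns from positive to non-positive, and the per-block counts are combined by simple majority vote. The projection curves are made-up data.

```python
from collections import Counter

def count_peaks(curve):
    """Count local maxima of a projection curve: points where the first
    forward difference changes from positive to non-positive."""
    d1 = [b - a for a, b in zip(curve, curve[1:])]  # first forward difference
    return sum(1 for p, q in zip(d1, d1[1:]) if p > 0 and q <= 0)

def vote(counts):
    """Combine per-block counts by majority vote (most common value)."""
    return Counter(counts).most_common(1)[0][0]

# Hypothetical projection curves from three image blocks of the same stack;
# each peak is taken to mark one plate edge.
blocks = [
    [0, 5, 1, 6, 0, 7, 1, 5, 0],  # 4 peaks
    [0, 4, 0, 5, 1, 6, 0, 4, 0],  # 4 peaks
    [0, 5, 1, 1, 0, 6, 1, 5, 0],  # 3 peaks (one edge missed)
]
plate_count = vote([count_peaks(b) for b in blocks])
```

Voting across blocks lets a missed or spurious edge in one block be outvoted by the others, which is presumably why the count is taken over several blocks rather than a single projection.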

18.

Glottal closure instant and voice source analysis using time-scale lines of maximum amplitude

CHRISTOPHE D’ALESSANDRO NICOLAS STURMEL 《Sadhana》2011,36(5):601-622

Time-scale representation of voiced speech is applied to voice quality analysis, by introducing the Line of Maximum Amplitude (LoMA) method. This representation takes advantage of the tree patterns observed for voiced speech periods in the time-scale domain. For each period, the optimal LoMA is computed by linking amplitude maxima at each scale of a wavelet transform, using a dynamic programming algorithm. A time-scale analysis of the linear acoustic model of speech production shows several interesting properties. The LoMA points to the glottal closure instants. The LoMA phase delay is linked to the voice open quotient. The cumulated amplitude along the LoMA is related to voicing amplitude. The LoMA spectral centre of gravity is an indication of voice spectral tilt. Following these theoretical considerations, experimental results are reported. Comparative evaluation demonstrates that the LoMA is an effective method for the detection of Glottal Closure Instants (GCI). The effectiveness of LoMA analysis for open quotient, amplitude and spectral tilt estimations is also discussed with the help of some examples.