Similar literature
 Found 20 similar documents; search took 31 ms
1.
Voice over IP (VoIP) is becoming one of the key technologies for telecommunications. Since IP networks generally do not guarantee transmission quality, it is extremely important to design and manage the quality of service (QoS) properly. To do this, it is desirable to develop an objective quality assessment method that estimates subjective quality based on the physical characteristics of the VoIP system. This paper first proposes a framework of objective models that can be applied not only to quality planning, which is an intended application of the existing standard methodology known as International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Recommendation G.107, “the E-model,” but also to quality benchmarking and management. Then, it proposes a model that complies with the proposed framework. Experimental results show that the proposed model has sufficient accuracy in the evaluation of practical VoIP systems. In addition, we attempt to integrate the opinion model with other objective quality measures, such as perceptual evaluation of speech quality (PESQ), standardized in ITU-T Recommendation P.862. Finally, we examine the task dependence of the performance of the proposed model.

2.
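A PESQ-based speech quality evaluation system for cabin crew was designed and developed. The PESQ algorithm of ITU-T Recommendation P.862 is described in detail and combined with sound pressure level and 1/3-octave band analysis to form a new objective criterion for speech quality; the system's hardware and software were designed and implemented. As a subsystem of the service training system of an integrated aircraft training cabin, it provides a useful tool for simulated cabin crew training and has high application value.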

3.
We describe a novel single-ended algorithm constructed from models of speech signals, including clean and degraded speech, and speech corrupted by multiplicative noise and temporal discontinuities. Machine learning methods are used to design the models, including Gaussian mixture models, support vector machines, and random forest classifiers. Estimates of the subjective mean opinion score (MOS) generated by the models are combined using hard or soft decisions generated by a classifier which has learned to match the input signal with the models. Test results show the algorithm outperforming ITU-T P.563, the current “state-of-the-art” standard single-ended algorithm. Employed in a distributed double-ended measurement configuration, the proposed algorithm is found to be more effective than P.563 in assessing the quality of noise reduction systems and can provide a functionality not available with P.862 PESQ, the current double-ended standard algorithm.

4.
Quality estimation of speech is essential for monitoring and maintaining the quality of service at different nodes of modern telecommunication networks. It is also required in the selection of codecs in speech communication systems. Non-intrusive speech quality evaluation does not require the original clean speech signal as a reference, which makes it valuable for evaluating the quality of speech at any node of the communication network. In this paper, non-intrusive speech quality assessment of narrowband speech is done by Gaussian Mixture Model (GMM) training using several combinations of auditory perception and speech production features, which include principal components of Lyon’s auditory model features, MFCC, LSF and their first and second differences. Results are obtained and compared for several combinations of auditory features for three sets of databases. The results are also compared with ITU-T Recommendation P.563 for non-intrusive speech quality assessment. It is found that many combinations of these feature sets outperform the ITU-T P.563 Recommendation under the test conditions.

5.
This paper presents an analysis of the relation between IP channel characteristics and final voice transmission quality. The NISTNet emulator is used for adjusting the IP channel network. The transmission quality criterion is an MOS parameter investigated using the ITU-T P.862 PESQ, future P.863 POLQA and P.563 3SQM algorithms. The influence of jitter and packet loss is investigated for the PCM codec and the Speex codec.

6.
This paper proposes a novel approach to quantifying the quality degradation of Voice over IP (VoIP) telephony in the presence of codec and network-related impairments. This approach differs from the basic ITU-T E-Model for VoIP quality estimation in that it addresses mixed narrowband/wideband scenarios. It makes novel use of instrumental models and symbolic regression via Genetic Programming (GP) to enable the evolution of degradation models from a modest set of initial parameters. Here, a two-step approach has been used. First, values of impairment factors are derived using WB-PESQ as a reference model. Second, a GP-based symbolic regression approach has been utilized to automatically evolve the functional form of equipment impairment factors from a set of variables. Very few a priori assumptions are made about the model structure. The effectiveness of the approach is demonstrated by a number of generated models which compare favorably with WB-PESQ and outperform the traditional E-Model in terms of prediction accuracy when compared using WB-PESQ. A significant advantage of the approach is that new models are easily generated to account for continuing evolution of the VoIP standards.

7.
Voice over Internet Protocol (VoIP) is one of the fastest growing technologies in the world. In VoIP, speech signals are transmitted over the same network used for data communications. The internet is not a robust network and is subject to delay, jitter, and packet loss. It is very important to measure and monitor the quality of service (QoS) the users experience in VoIP networks; this is not an easy task and usually requires subjective tests. In this paper we have analyzed three non-intrusive models to measure and monitor voice quality using Random Neural Networks (RNN). An RNN is an open queuing network with positive and negative signals. We have assessed the voice quality based on various parameters, i.e., delay, jitter, packet loss, and codec. In our approach we have used the Mean Opinion Score (MOS) calculated using a Perceptual Evaluation of Speech Quality (PESQ) algorithm to generate data for training the RNN model. We have studied two feed-forward models and a recurrent architecture. We have found that the simple feed-forward architecture has produced the most accurate results compared to the other two architectures.

8.
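Research on E-model-based VoIP speech quality assessment   Total citations: 1, self-citations: 0, citations by others: 1
To assess VoIP speech quality accurately, the E-model algorithm is studied in depth. Its components Id, Is, Ie, and A are analyzed, the effects of packet loss, delay, and jitter on VoIP quality are discussed, and the algorithm is applied to evaluate several speech codecs. Experiments show that this objective assessment algorithm correlates highly with subjective results and is adaptable, reliable, and practical, making it suitable for VoIP speech quality assessment.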
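
As a rough illustration of how the components named in this abstract fit together, the sketch below follows the general structure of ITU-T G.107: a rating factor R = R0 - Is - Id - Ie + A, mapped to an estimated MOS. The symbol R0 (basic signal-to-noise ratio term) does not appear in the abstract and is taken from G.107; the function names and the input values in the usage line are hypothetical, and this is not the authors' implementation.

```python
def rating_factor(r0: float, i_s: float, i_d: float, i_e: float, a: float) -> float:
    """Combine E-model components into the rating factor R.

    r0  : basic signal-to-noise ratio term (symbol from G.107, not from the abstract)
    i_s : simultaneous impairment factor (Is)
    i_d : delay impairment factor (Id)
    i_e : equipment impairment factor (Ie)
    a   : advantage factor (A)
    """
    return r0 - i_s - i_d - i_e + a


def r_to_mos(r: float) -> float:
    """Map a rating factor R to an estimated MOS using the G.107 mapping."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Hypothetical inputs, chosen only to exercise the functions.
example_r = rating_factor(r0=95.0, i_s=1.5, i_d=5.0, i_e=11.0, a=0.0)
example_mos = r_to_mos(example_r)
```

The mapping is clamped at both ends, so impairments that push R below 0 or above 100 do not produce MOS values outside the 1.0 to 4.5 range.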

9.
This paper proposes a multiresolution model of auditory excitation pattern and applies it to the problem of objective evaluation of subjective wideband speech quality. The model uses wavelet packet transform for time-frequency decomposition of the input signal. The selection of the wavelet packet tree is based on an optimality criterion formulated to minimize a cost function based on the critical band structure. The models of the different auditory phenomena are reformulated for the multiresolution framework. This includes the proposition of duration dependent outer and middle ear weighting, multiresolution spectral spreading, and multiresolution temporal smearing. As an application, the excitation pattern is used to define an objective measure of auditory distortion of a distorted speech signal compared to the undistorted one. The performance of this objective measure is evaluated with a database of various kinds of NOISEX-92 degraded wideband speech signals in predicting the subjective mean opinion score (MOS) and is compared with the fast Fourier transform (FFT)-based ITU-T PESQ P.862.2 algorithm. The proposed measure is found to achieve a correlation between subjective MOS and objective MOS comparable to that of PESQ P.862.2, with a trend suggesting better correlation for the nonstationary degradations compared to the stationary ones. Further refinement of the measure for distortion types other than additive noise is anticipated.

10.
A hybrid signal-and-link-parametric approach to speech quality measurement for voice-over-Internet protocol (VoIP) communications is described. Connection parameters are used to determine a base quality representative of the transmission link. Degradation factors, computed from perceptual features extracted from the decoded speech signal, are used to quantify distortions not captured by the connection parameters. The algorithm is tested on speech degraded by acoustic noise, temporal clippings, and noise suppression artifacts, thus simulating degradations present in wireless-VoIP tandem connections. Hybrid measurement is shown to overcome the limitations of pure link parametric and pure signal-based measurement methods, resulting in better measurement accuracy for modern VoIP communications. In addition, the proposed algorithm incurs modest computational overhead relative to pure link parametric measurement and attains up to 88% reduction in processing time relative to the ITU-T standard P.563 signal-based algorithm.

11.
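Wang Wei (王伟), Wang Zhensong (王贞松). 《计算机应用》 (Journal of Computer Applications), 2007, 27(12): 2969-2972
To address the problem of accurately computing the effective equipment impairment factor when the ITU G.107 E-model is used to evaluate VoIP call quality, a real-time evaluation algorithm based on Markov models is proposed. Three-state and two-state Markov models are built for the random packet loss probability and the burst ratio, respectively, from which a formula for estimating the effective equipment impairment factor and a corresponding statistical algorithm are derived. Commercial test results show that the algorithm evaluates VoIP call quality fairly accurately in real-time environments.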
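
For orientation, the effective equipment impairment factor this abstract refers to is defined in ITU-T G.107 as Ie-eff = Ie + (95 - Ie) * Ppl / (Ppl / BurstR + Bpl), where Ppl is the packet loss percentage, BurstR the burst ratio, and Bpl the codec's packet loss robustness factor. The sketch below pairs that formula with a generic two-state (Gilbert-style) estimate of Ppl and BurstR from a loss trace. It is not the three-state/two-state algorithm the authors derive; the function names and the Ie and Bpl values in the usage are hypothetical.

```python
from typing import Sequence, Tuple


def estimate_transitions(loss_trace: Sequence[bool]) -> Tuple[float, float]:
    """Estimate two-state Markov transition probabilities from a loss trace.

    loss_trace[i] is True when packet i was lost.
    Returns (p, q): p = P(lost | previous received), q = P(received | previous lost).
    """
    recv_count = recv_to_loss = loss_count = loss_to_recv = 0
    for prev, cur in zip(loss_trace, loss_trace[1:]):
        if prev:
            loss_count += 1
            if not cur:
                loss_to_recv += 1
        else:
            recv_count += 1
            if cur:
                recv_to_loss += 1
    p = recv_to_loss / recv_count if recv_count else 0.0
    q = loss_to_recv / loss_count if loss_count else 1.0  # no losses seen: assume isolated
    return p, q


def loss_statistics(p: float, q: float) -> Tuple[float, float]:
    """Return (Ppl in percent, BurstR) for a two-state Markov loss model."""
    if p + q <= 0.0:
        raise ValueError("p + q must be positive")
    return 100.0 * p / (p + q), 1.0 / (p + q)


def effective_ie(ie: float, bpl: float, ppl: float, burst_r: float) -> float:
    """Effective equipment impairment factor per the G.107 formula."""
    if ppl <= 0.0:
        return ie
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)


# Hypothetical codec parameters (ie, bpl) and a short hypothetical trace.
p, q = estimate_transitions([False, True, True, False, False, True, False, False])
ppl, burst_r = loss_statistics(p, q)
ie_eff = effective_ie(ie=10.0, bpl=20.0, ppl=ppl, burst_r=burst_r)
```

When losses are independent, p + q = 1 and BurstR = 1, so the formula reduces to its random-loss form; BurstR above 1 indicates burstier loss and a larger impairment for the same Ppl.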

12.
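Application of an improved SOM network model to VoIP QoS   Total citations: 1, self-citations: 0, citations by others: 1
VoIP quality of service (QoS) can be described by a set of measurable parameters: service availability, throughput, delay, jitter, packet loss rate, and so on. The existing perceptual evaluation of speech quality (PESQ) has difficulty classifying speech quality levels appropriately and in real time for network structures in different environments. To take several QoS-related factors into account together, an improved self-organizing map neural network model (ESOMNN) is presented. Exploiting ESOM's ability to classify high-dimensional input data effectively, end-to-end delay, packet loss rate, jitter, speech codec, and test system identifier are proposed as the ESOMNN input data. After training on sampled data, the model automatically performs speech quality evaluation and mapping, and can effectively evaluate a QoS level that reflects multiple related factors from the real-time variables obtained.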

13.
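For speech enhancement algorithms based on statistical models, different distribution models correspond to different gain functions. Because speech signals are uncertain, no single distribution function can accurately model the distributions of the speech and noise spectra, so any fixed statistical model carries some error. A speech enhancement algorithm based on gain dictionary lookup is therefore proposed. The algorithm trains gains on a speech and noise corpus under a log-spectral distortion criterion to obtain a gain dictionary whose inputs are the estimated a priori and a posteriori signal-to-noise ratios. The algorithm is tested using ITU-T P.862 PESQ, segmental SNR, overall SNR, and log-spectral distortion, and compared with algorithms based on Gaussian and Laplacian distribution models. Experimental results show that it achieves better enhancement than the other algorithms in both nonstationary and stationary noise, and that musical noise and residual background noise are also well suppressed.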

14.
Voice quality prediction models and their application in VoIP networks   Total citations: 4, self-citations: 0, citations by others: 4
The primary aim of this paper is to present new models for objective, nonintrusive prediction of voice quality for IP networks and to illustrate their application to voice quality monitoring and playout buffer control in VoIP networks. The contributions of the paper are threefold. First, we present a new methodology for developing perceptually accurate models for nonintrusive prediction of voice quality which avoids time-consuming subjective tests. The methodology is generic and as such it has wide applicability in multimedia applications. Second, based on the new methodology, we present efficient regression models for predicting conversational voice quality nonintrusively for four modern codecs (G.729, G.723.1, AMR and iLBC). Third, we illustrate the usefulness of the models in two main applications: voice quality prediction for real Internet VoIP traces and perceived quality-driven playout buffer optimization. For voice quality prediction, the results show that the models have accuracy close to the combined ITU PESQ/E-model method using real Internet traces (correlation coefficient over 0.98). For playout buffer optimization, the proposed buffer algorithm provides an optimum voice quality when compared to five other buffer algorithms for all the traces considered.

15.
The quality of text-to-speech systems can be effectively assessed only on the basis of reliable and valid listening tests to assess overall system performance. A mean opinion scale (MOS) has been the recommended measure of synthesized speech quality [ITU-T Recommendation P.85, 1994. Telephone transmission quality subjective opinion tests. A method for subjective performance assessment of the quality of speech voice output devices]. We assessed this MOS scale and developed and tested a modified measure of speech quality. This modified measure has new items specific to text-to-speech systems. Our research was motivated by the lack of clear evidence of the conceptual content of as well as the psychometric properties of the MOS scale. We present conceptual arguments and empirical evidence for the reliability and validity of a modified scale. Moreover, we employ state-of-the-art psychometric techniques such as confirmatory factor analysis to provide strong tests of psychometric properties. This modified scale is better suited to appraise synthesis systems since it includes items that are specific to the artifacts found in synthesized speech. We believe that the speech synthesis research communities will find this modified scale a better fit for listening tests to assess synthesized speech.

16.
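Huang Shilei (黄石磊), Liu Yi (刘轶), Cheng Gang (程刚). 《计算机工程》 (Computer Engineering), 2012, 38(18): 19-21
To improve the performance of objective speech quality assessment, an improved perceptual evaluation of speech quality (PESQ) algorithm is proposed. Using syllable stability detection and unvoiced/voiced/silence classification, the algorithm describes the impact on auditory perception of speech through the inter-frame stability of syllables and impairment parameters, which behave differently for different speech segments such as unvoiced speech, voiced speech, and silence. Experimental results show that the algorithm improves the correlation between PESQ scores and subjective mean opinion scores on narrowband speech.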

17.
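A VoIP speech quality evaluation model   Total citations: 1, self-citations: 1, citations by others: 1
In VoIP systems, transmission network performance (QoS) parameters have a fundamental influence on perceived speech quality (Quality of Experience, QoE), but QoS values do not directly reflect or represent the QoE level. Based on an analysis of VoIP transmission characteristics, the PESQ and E-Model algorithms are first used to analyze the effect of individual QoS parameters on QoE impairment. Building on these single-factor calculations, an extension of the E-Model algorithm is used to study how speech QoE changes under the combined effect of QoS parameters. Regression analysis is then used to build a simple mapping model between QoS parameters and speech QoE. Validation experiments show that the model correlates highly with objective speech QoE evaluation methods and meets the requirements for real-time monitoring of network operating conditions and VoIP QoE.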

18.
This paper proposes two mathematical models that can be used to estimate VoIP quality from Skype, which is one of the most popular VoIP applications. The first model is simple; it has been developed using data from informal interview tests called Conversation-like tests, with packet loss of 0 %, 5 %, 10 %, …, and 30 %. The tests have been conducted with Skype using a non-ITU-T codec called SILK via the Internet with over 180 native Thai participants, while packet loss effects were generated using a network emulation tool. The second model, called the Enhanced Simplified E-model, has been developed by adding the Thai Bias factor into a generic Simplified E-model; the factor is calculated by subtracting the subjective results from the results computed with the Simplified E-model formula. After the models were obtained, they were evaluated with a test set from 36 native Thai participants (different from the other group of participants) using the Mean Absolute Percentage Error (MAPE) technique. It has been found that the VoIP quality measurement performance of both models is classified as excellent and that both provide higher reliability and accuracy than the Simplified E-model. The error reduction of the Subjective MOS model and the Enhanced Simplified E-model compared to the simplified one was about 21.9 % and 21.2 % respectively, which is the major contribution of this work.
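
The evaluation metric named in this abstract, MAPE, can be sketched as below using its standard definition: the mean of |actual - predicted| / |actual|, expressed as a percentage. The function name and the MOS values in the usage are hypothetical and are not data from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error between reference and predicted scores.

    Raises ValueError for empty or mismatched inputs, or a zero reference value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            raise ValueError("reference values must be non-zero")
        total += abs((a - p) / a)
    return 100.0 * total / len(actual)


# Hypothetical subjective MOS values and model predictions.
subjective_mos = [4.0, 2.0]
predicted_mos = [3.0, 2.5]
error_percent = mape(subjective_mos, predicted_mos)
```

Because MOS values lie between 1 and 5, the zero-reference guard never triggers for valid scores; it is there only to keep the function safe on other inputs.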

19.
Although many discrete Fourier transform (DFT) domain-based speech enhancement methods rely on stochastic models, such as the Gaussian and Laplace distributions, to derive clean speech estimators, certain speech sounds clearly show a more deterministic character. In this paper, we study the use of a deterministic model in combination with the well-known stochastic models for speech enhancement. We derive a minimum mean-square error (MMSE) estimator under a combined stochastic-deterministic speech model with speech presence uncertainty and show that for different distributions of the DFT coefficients the combined stochastic-deterministic speech model leads to improved performance of approximately 0.8 dB segmental signal-to-noise ratio (SNR) over the use of a stochastic model alone. Evaluation with perceptual evaluation of speech quality (PESQ) shows performance improvements of approximately 0.15 on an MOS scale.
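
The segmental SNR figure quoted in this abstract is conventionally computed as the average of per-frame SNRs in dB between the clean signal and the processed one. The sketch below shows that common definition. The frame length, the clamping limits of -10 dB and 35 dB, and the skipping of silent frames are assumptions made here for illustration; the abstract does not state which settings the authors used.

```python
import math
from typing import Sequence


def segmental_snr(clean: Sequence[float], processed: Sequence[float],
                  frame_len: int = 256, lo: float = -10.0, hi: float = 35.0) -> float:
    """Average per-frame SNR in dB, with each frame clamped to [lo, hi].

    frame_len, lo, and hi are assumed defaults, not values from the paper.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, est))
        if signal_energy == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        if error_energy == 0.0:
            snr_db = hi  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(min(max(snr_db, lo), hi))
    if not frame_snrs:
        raise ValueError("no non-silent full frames to evaluate")
    return sum(frame_snrs) / len(frame_snrs)


# Hypothetical toy signals: a constant error of 0.1 on a unit-amplitude signal.
toy_snr = segmental_snr([1.0] * 8, [0.9] * 8, frame_len=4)
```

Clamping keeps a few very quiet or very clean frames from dominating the average, which is why it is commonly applied before averaging.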

20.
This paper investigates the behavior of PESQ (Perceptual Evaluation of Speech Quality; also known as ITU-T Recommendation P.862) under independent and dependent loss conditions from a speech activity parameter perspective. The results show that an increase in the amount of speech in the reference signal (expressed by the activity parameter) may increase the sensitivity of PESQ to changes in packet loss and improve its prediction accuracy. On the other hand, it seems that the human brain is slightly less sensitive than PESQ to the loss of some parts of words. The reasons for these findings are discussed.
