Similar literature
 Found 20 similar documents; search took 15 ms
1.
This paper presents a performance analysis of a PLC-1 private line voice concentrator which uses speech interpolation to increase the capacity of transmission facilities. The PLC-1 employs speech storage. As a result it can be applied to relatively small trunk groups where statistics of loading patterns are particularly unfavorable. Speech impairments can be categorized as delay, gap modulation, and clipping. A mathematical model based on queueing theory is presented for the evaluation of the statistics of these impairments. The model can be used to determine obtainable TASI advantage for various system configurations and loading conditions. It is shown that a TASI advantage of two is achievable with 24 transmission facilities. The veracity of the queueing model is established using simulation techniques.

2.
This paper deals with the measurement and calculation of various speech temporal parameters of interest in an environment where speech activity detection is employed. In particular it is shown that, based on either a measurement or model of the probability density function (pdf) for silence durations for the case of zero talkspurt "hangover" or "fill-in," the following temporal parameters can be computed for any value of hangover or fill-in: the mean (and pdf) for silence durations, the mean talkspurt duration, the mean talkspurt rate, and the speech activity. Directly measured values of these parameters and those computed from both measured and fitted versions of the pdf for silence durations are compared and are shown to be in reasonable agreement. The illustrated results are based on measurements of about two minutes of taped male monolog source speech. However, the approach to calculating the above parameters is general in the sense that it can be applied to any measured or modeled pdf for silence durations. The significance of this work lies in the important role that talkspurt hangover plays, for example, in minimizing speech detector induced back-end clipping of talkspurts, reducing exposure to the variable talkspurt delay impairment, and in determining signaling overhead and resource occupancy in various speech interpolation, packet voice, and integrated voice/data systems.
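The hangover calculation described in this abstract can be sketched as follows. This is a minimal illustration and not the paper's own procedure: it assumes an empirical sample of zero-hangover silence durations and a zero-hangover mean talkspurt duration as inputs, and it models hangover as bridging every silence no longer than the hangover while shortening longer silences by the hangover. The function name, arguments, and returned keys are hypothetical.

```python
def temporal_parameters(silences, mean_talkspurt0, hangover):
    """Estimate speech temporal parameters for a given hangover (all in seconds).

    silences        -- measured silence durations at zero hangover (assumed input)
    mean_talkspurt0 -- mean talkspurt duration at zero hangover (assumed input)
    hangover        -- talkspurt hangover to apply
    """
    mean_silence0 = sum(silences) / len(silences)
    # Talkspurt (equivalently, silence) rate at zero hangover, per second.
    rate0 = 1.0 / (mean_talkspurt0 + mean_silence0)

    # Silences longer than the hangover survive, shortened by the hangover;
    # shorter ones are bridged and merge the neighbouring talkspurts.
    surviving = [s - hangover for s in silences if s > hangover]
    if not surviving:
        raise ValueError("hangover bridges every silence in the sample")

    p_survive = len(surviving) / len(silences)
    mean_silence = sum(surviving) / len(surviving)
    rate = rate0 * p_survive                 # talkspurts per second
    activity = 1.0 - rate * mean_silence     # fraction of time marked as speech
    mean_talkspurt = activity / rate
    return {
        "mean_silence": mean_silence,
        "mean_talkspurt": mean_talkspurt,
        "talkspurt_rate": rate,
        "activity": activity,
    }


params = temporal_parameters([0.1, 0.3, 0.5, 1.1], mean_talkspurt0=1.0, hangover=0.2)
print(params)
```

With these toy numbers, raising the hangover from 0 to 0.2 s removes the shortest silence and increases the computed speech activity, which is the trade-off the abstract refers to.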

3.
Voice transmission in burst switching is characterized by the process of talkspurt clipping, while in packet switching, it is characterized by the process of packet delay. In most analyses, the talkspurt clipping has been measured by the clipping probability averaged over all bits, and the packet delay has been measured by the delay performance averaged over all packets. The resulting measures overlook the duration of clipping in a talkspurt and the significant difference of delay in packets arriving at different times. Because of the nature of voice, different effects of these may result in substantially different degrees of voice distortion. This paper studies the worst case performance of both processes. The voice traffic is modeled as a process alternating between overload and underload periods. Statistically, more clipping and delay will be incurred while in the overload period. By worst case we mean that, in burst switching, we measure the worst case of talkspurt clipping duration in an overload period, while in packet switching, we measure the worst case of packet delay in an overload period. Furthermore, a simple closed form equation is derived which gives a very good approximation of the worst case mean packet delay performance. This equation can be more generally applied when the packet service time is to be geometrically distributed or when voice and data are to be integrated. The voice performances in burst switching and packet switching are also compared.

4.
Voice over IP is already widespread in enterprise private networks and is growing in public switched voice networks as manufacturers withdraw support for earlier technologies. Packet transmission of voice can introduce new impairments, including packet loss, extra sources of delay, and the use of compressed speech coding, all of which may affect voice quality delivered to the user. Factors affecting the quality of a voice telephony connection are described, concentrating on those which are changed by the move to packet transmission, including the complex area of delay. We outline subjective testing based on users’ opinions of fragments of recorded audio material or of connections realised in a laboratory, and describe the abstraction of these results into transmission planning models to assist with design of networks and their QoS mechanisms. QoS requirements are stated for a packet technology to support a PSTN and ISDN service in the UK telecommunications environment.

5.
This paper examines the quality of transmission of voice over cellular, packet-switched networks. The medium access mechanism in the uplink is simulated under various statistical multiplexing scenarios in order to assess the effect of front-end clipping on voice quality. Moreover, the simulation is implemented in a real-time demonstration platform utilized to acquire subjective indicators of voice quality by performing Mean Opinion Score (MOS) tests. Results from the MOS tests are reported, and an analysis of the obtained speech samples is presented. Finally, the results are summarized and potential further directions for the simulation tool and the speech models are discussed.

6.
Quality models predict the perceptual quality of services as they calculate subjective ratings from measured parameters. In this article, we present a new quality model that evaluates Voice over IP (VoIP) telephone calls. In addition to packet loss rate, coding mode and delay, it takes into account the impairments due to changes in the transmission configuration (e.g. switching the coding mode or re-scheduling the playout time). Moreover, this model can be used at run time to control the transmission of such calls. It is also computationally efficient and open source. To demonstrate the potential of our model, we apply it to select the ideal coding and packet rate in bandwidth-limited environments. Furthermore, we decide, based on model predictions, whether to delay the playout of speech frames after delay spikes. Delay spikes often occur after congestion and cause packets to arrive too late. We show a considerable improvement in perceptual speech quality if our model is applied to control VoIP transmissions. Copyright © 2005 John Wiley & Sons, Ltd.

7.
This paper focuses on network delays as they apply to voice traffic. First the nature of the delay problem is discussed and this is followed by a review of enhanced circuit, packet, and hybrid switching techniques: these include fast circuit switching (FCS), virtual circuit switching (VCS), buffered speech interpolation (SI), packetized virtual circuit (PVC), cut-through switching (CTS), composite packets, and various frame management strategies for hybrid switching. In particular, the concept of introducing delay to resolve contention in SI is emphasized, and when applied to both voice talkspurts and data messages, forms a basis for a relatively new approach to network design called transparent message switching (TMS). This approach and its potential performance advantages are reviewed in terms of packet structure, multiplexing scheme, network topology, and network protocols. The paper then deals more specifically with the impact of variable delays on voice traffic. In this regard the importance of generating and preserving appropriate length speech talkspurts in order to mitigate the effects of variable network delay is emphasized. The results indicate that a desirable length of talkspurt "hangover" of about 200 ms will accomplish this without unduly affecting speech activity, and that, under these circumstances, the perceptible threshold of variable talkspurt delay can be as high as about 200 ms average. As such, the results provide a useful guideline for integrated services system designers. Finally, suggestions are made for further studies on performance analysis and subjective evaluation of advanced integrated services systems.

8.
Pure delay effects on speech quality in telecommunications   Total citations: 4, self-citations: 0, citations by others: 4
The effect of transmission delay on speech quality in telecommunications is described, with human factors such as conversational mode and the talker's knowledge of the cause of delay taken into account. Objective quality estimation methods for delay effects are proposed, and these methods are applied in an actual communications network. In connection with delay perception in a telephone conversation, the assumption was verified that a talker expects a particular response time from a partner, and that delay that is outside this expectation time window is noticed. Taking this information into account, a subjective conversational experiment is controlled by six kinds of tasks by varying the temporal characteristics. Thus, a subjective assessment of delay effects is obtained by laboratory tests in relation to the detectability threshold, opinion rating, and conversational efficiency. Objective quality measures for each test were defined as a linear combination of temporal parameters that correspond closely to subjective qualities.

9.
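贾龙涛 (Jia Longtao), 鲍长春 (Bao Changchun). 《通信学报》 (Journal on Communications), 2006, 27(6): 121-125
At present, almost all voice over IP (VoIP) telephony systems transmit at a fixed rate, which makes network packet loss, and consecutive packet loss in particular, unavoidable and leads to severe degradation of speech quality. To address this problem, a new packet-loss-resilient network voice communication system is presented, and its performance is analyzed with the network simulation software NS (network simulator). Simulation experiments show that the proposed system is clearly superior to traditional IP voice telephony systems in terms of network packet loss, average delay, and subjective listening quality.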

10.
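吴彭龙 (Wu Penglong), 邹霞 (Zou Xia), 孙蒙 (Sun Meng), 张星昱 (Zhang Xingyu). 《信号处理》 (Journal of Signal Processing), 2020, 36(3): 426-438
Clipping distortion degrades speech coding quality. At low coding rates in particular, clipped speech no longer conforms to the human speech production model, and the quality of the coded speech drops severely. To study the effect of clipping distortion on low-rate speech coding, the analysis covers two aspects: extraction of coding parameters from clipped speech, and the coding quality of clipped speech. A deviation measure is used to gauge the accuracy of low-rate coding parameter extraction; the parameters include LPC, pitch period, and voiced/unvoiced decisions. The error distribution, error types, and error causes of each parameter are analyzed at different degrees of clipping. The coding quality of clipped speech is evaluated with objective PESQ (perceptual evaluation of speech quality) scores. Because the performance of common declipping methods falls sharply when clipping is heavy, two improved declipping methods are proposed to restore the clipped speech. Experimental results show that the improved methods effectively raise low-rate speech coding quality when clipping is heavy.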

11.
This article proposes a new output-based method for non-intrusive assessment of speech quality of voice communication systems and evaluates its performance. The method requires access to the processed (degraded) speech only, and is based on measuring perception-motivated objective auditory distances between the voiced parts of the output speech to appropriately matching references extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of parametric speech vectors extracted from a database of clean speech records. The auditory distances are then mapped into objective Mean Opinion listening quality scores. An efficient data-mining tool known as the self-organizing map (SOM) achieves the required clustering and mapping/reference matching processes. In order to obtain a perception-based, speaker-independent parametric representation of the speech, three domain transformation techniques have been investigated. The first technique is based on a perceptual linear prediction (PLP) model, the second utilises a bark spectrum (BS) analysis and the third utilises mel-frequency cepstrum coefficients (MFCC). Reported evaluation results show that the proposed method provides high correlation with subjective listening quality scores, yielding accuracy similar to that of the ITU-T P.563 while maintaining a relatively low computational complexity. Results also demonstrate that the method outperforms the PESQ in a number of distortion conditions, such as those of speech degraded by channel impairments.

12.
Various strategies to provide low-delay high-quality digital speech communications in a high-capacity wireless network are examined. Various multiple access schemes based on time-division and packet reservation are compared in terms of their statistical multiplexing capabilities, sensitivity to speech packet dropping, delay, robustness to lossy packet environments, and overhead efficiency. In particular, a low-delay multiple access scheme, called shared time-division duplexing (STDD), is proposed. This scheme allows both the uplink and downlink traffic to share a common channel, thereby achieving high statistical multiplexing gain even with a low population of simultaneous conversations. The authors also propose a choice of low delay, high quality speech coding and digital modulation systems based on adaptive DPCM, with QDPSK or pseudo-analog transmission (skewed DPSK), for use in conjunction with the STDD multiple access protocol. The choice of the alternative systems depends on required end-to-end delay, recovered speech quality and bandwidth efficiency. Typically, with a total capacity of 1 MBaud, 2 ms frame and 8 kBaud speech coding rate, low delay STDD is able to support 48 pairs of users compared to 38, 35, and 16 for TDMA with speech activity detection, basic TDMA and PRMA respectively. This corresponds to respective gains of 26%, 37% and 200%.
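The quoted gains follow directly from the user-pair counts given in the abstract. The short check below is only an arithmetic sketch; the helper name and dictionary labels are illustrative and not taken from the paper.

```python
def relative_gain_percent(stdd_pairs, other_pairs):
    """Capacity gain of STDD over another scheme, in percent."""
    return 100.0 * (stdd_pairs / other_pairs - 1.0)


STDD_PAIRS = 48  # user pairs supported by low-delay STDD (from the abstract)
BASELINES = {    # user pairs supported by the compared schemes (from the abstract)
    "TDMA with speech activity detection": 38,
    "basic TDMA": 35,
    "PRMA": 16,
}

gains = {
    name: round(relative_gain_percent(STDD_PAIRS, pairs))
    for name, pairs in BASELINES.items()
}
for name, gain in gains.items():
    print(f"STDD vs {name}: +{gain}%")
```

Rounded to whole percentages, the three ratios reproduce the 26%, 37% and 200% figures stated above.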

13.
The predicted wordlength assignment system (PWA) is a digital speech interpolation method which avoids speech clipping and "freeze-out" distortion. Inactive sources are excluded by a speech detector. The active speech signals are coded with variable wordlengths (3-8 bits) at a sampling rate of 8 kHz. In an overload case, all active sources are still served, but at reduced wordlength. The required wordlength is calculated using only the signal history, which is also available at the receiver. Therefore, no auxiliary information about the individual wordlength is transmitted. A system with up to 128 telephone conversation speech sources has been studied using computer simulation. The signal-to-noise ratio (SNR) is employed to describe speech quality. With an input of 128 sources (40 percent activity) and a transmission rate per source of 21 kbits/s, an SNR of 34 dB can be achieved. Above a bit rate of 16 kbits/s, distortions are not audible. As a first step towards implementation, a specially designed fast microprocessor has been used to simulate the most important PWA system functions, such as speech detection, linear prediction, and coding algorithm.

14.
Discontinuous transmission based on speech/pause detection represents a valid solution to improve the spectral efficiency of new generation wireless communication systems. In this context, robust voice activity detection (VAD) algorithms are required, as traditional solutions present a high misclassification rate in the presence of the background noise typical of mobile environments. This paper presents a voice detection algorithm which is robust to noisy environments, thanks to a new methodology adopted for the matching process. More specifically, the VAD proposed is based on a pattern recognition approach in which the matching phase is performed by a set of six fuzzy rules, trained by means of a new hybrid learning tool. A series of objective tests performed on a large speech database, varying the signal-to-noise ratio (SNR), the types of background noise, and the input signal level, showed that, as compared with the VAD standardized by ITU-T in Recommendation G.729 annex B, the fuzzy VAD, on average, achieves an improvement in reduction both of the activity factor of about 25% and of the clipping introduced of about 43%. Informal listening tests also confirm an improvement in the perceived speech quality.

15.
Quantizing noise is present whenever analog information is encoded into digital form suitable as an input to any digital system such as a computer or digital transmission line. The subjective impairment caused by this noise is frequently measured by the ratio of signal power to frequency-weighted quantizing noise power, $S/N_Y$. An upper bound on $S/N_Y$ is found such that source encoding systems will always have values of $S/N_Y$ less than this bound. The bound has the form (in decibels) $S/N_Y \leq T_B + T_P + T_S$, where $T_B$ is a constant that depends on the bit rate of the signal, $T_P$ depends on the redundancy (or predictability) of the signal, and $T_S$ depends on subjective considerations (as embodied in a subjectively determined frequency-weighting function). The bound is applied to source encoding systems for speech and television signals. By using the frequency-weighting function, bounds on commonly used measures of subjective impairments are possible.

16.
To alleviate congestion at thin route multiplexers such as are used in VSAT systems, some transmitters are forced to drop speech packets. Receiver reconstruction of the lost speech invariably produces some degradation of quality. The authors exploit the knowledge of dropped packets at the transmitter to mitigate this degradation.

17.
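A CELP coding scheme with pitch prediction improved by fractional delay   Total citations: 2, self-citations: 0, citations by others: 2
In CELP coders, the quasi-periodicity of voiced speech is usually represented by a long-term predictor whose delay is an integer multiple of the sampling interval; at low bit rates, however, this restriction degrades coder performance. After introducing the principle of the CELP coder and the structure of the excitation codebook, this paper focuses on a new pitch prediction method, fractional-delay pitch prediction. Computer simulation results show that this method represents voiced speech more accurately and, for female speakers in particular, markedly improves speech quality.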

18.
We review the variable frame rate (VFR) transmission methodology that we developed, implemented, and tested during the period 1973-1978 for efficiently transmitting LPC vocoder parameters extracted from the input speech at a fixed frame rate. In the VFR method, parameters are transmitted only when their values have changed sufficiently over the interval since their preceding transmission. We explored two distinct approaches to automatic implementation of the VFR method. The first approach bases the transmission decisions on comparisons of the parameter values of the present frame and the last transmitted frame. The second approach, which is based on a functional perceptual model of speech, compares the parameter values of all the frames that lie in the interval between the present frame and the last transmitted frame against a linear model of parameter variation over that interval. The application of VFR transmission to the design of narrow-band LPC speech coders with average bit rates of 2000-2400 bits/s is also considered. The transmission decisions are made separately for the three sets of LPC parameters, pitch, gain, and spectral parameters, using separate VFR schemes. A formal subjective speech quality test of six selected LPC coders is described, and the results are presented and analyzed in detail. It is shown that a 2075 bit/s VFR coder produces speech quality equal to or better than that of a 5700 bit/s fixed frame rate coder.
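The first VFR approach in this abstract (compare the present frame with the last transmitted frame) can be illustrated with a small sketch. The Euclidean distance and the fixed threshold used here are assumptions made for illustration only; the abstract does not state which distance measure or threshold the authors used, and `select_frames` is a hypothetical name.

```python
import math


def select_frames(frames, threshold):
    """Return the indices of frames that a simple VFR rule would transmit.

    frames    -- list of parameter vectors, one per fixed-rate analysis frame
    threshold -- minimum change (Euclidean distance, an assumed measure)
                 from the last transmitted frame that triggers transmission
    """
    if not frames:
        return []
    sent = [0]            # the first frame is always transmitted
    last = frames[0]
    for index, frame in enumerate(frames[1:], start=1):
        if math.dist(frame, last) > threshold:
            sent.append(index)
            last = frame  # later frames are compared against this one
    return sent


frames = [[0.0], [0.05], [0.3], [0.32], [1.0]]
print(select_frames(frames, threshold=0.2))  # prints [0, 2, 4]
```

In this toy run, frames 1 and 3 differ too little from the most recently transmitted frame and are skipped, which is how the average bit rate drops below the fixed-frame-rate figure.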

19.
A theoretical method of evaluating degradations of variable rate coders in a multichannel digital speech interpolation (DSI) system is developed. Each of the coder outputs has a variable rate based on the algorithm. The DSI system multiplexes the outputs of these variable rate coders into a fixed bit rate channel. During periods of high activity all active users are served, but at a reduced rate depending on the demand. The degradation due to high activity is shared by all active users. This system avoids speech clipping and "freeze-out" distortion. Theoretical expressions of the system overload probability and the probability of degradation to a particular user in the DSI system are derived. Two types of variable rate coders, namely, a constant quality subband coder and a constant noise subband coder, are chosen and used as examples. Comparisons of the degradations are made between the theoretical results and computer simulated results for the two types of variable rate coders, and close agreement is observed. The theory is applicable to other variable rate coding algorithms as well. In this study, all of the simulations are made at 40 percent speech activity and the average rate of the variable rate coders is close to 16 kbits/s. Objective quality measures indicate that in a system with a trunk size larger than 40, the variable rate coder DSI system can achieve a 2:1 compression with a degradation of less than 1 dB compared to non-DSI variable rate coders. This corresponds to a total gain of 8:1 when compared to 64 kbit/s PCM.
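A rough feel for the overload probability can be had from a simple binomial model. This is an assumed simplification, not the paper's derivation: each of `n_users` talkers is taken to be independently active with probability 0.4 (the activity quoted in the abstract), and overload is taken to mean that more talkers are active than the channel can carry at the full coder rate. Both function names are hypothetical. The last helper only reproduces the 8:1 arithmetic from the stated rates.

```python
from math import comb


def overload_probability(n_users, capacity, activity=0.4):
    """P(more than `capacity` of `n_users` independent talkers are active)."""
    return sum(
        comb(n_users, k) * activity**k * (1.0 - activity) ** (n_users - k)
        for k in range(capacity + 1, n_users + 1)
    )


def total_gain(pcm_rate=64.0, coder_rate=16.0, dsi_compression=2.0):
    """Overall gain versus PCM: coder rate reduction times DSI compression."""
    return pcm_rate / coder_rate * dsi_compression


# 40 users sharing capacity for 20 full-rate talkers (2:1 compression).
print(overload_probability(40, 20))
print(total_gain())  # 64/16 * 2
```

Under this model the overload probability shrinks as the trunk group grows at a fixed 2:1 ratio, which is consistent with the abstract's remark that the 2:1 compression holds for trunk sizes larger than 40.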

20.
Discrete-time analysis of two schemes for multiplexing voice and data is presented. In each scheme voice and data are multiplexed using the movable boundary frame allocation scheme. In the first scheme, speech activity detectors (SAD's) are not used, and hence, the variations in the voice traffic are only due to the on/off characteristics of voice. In the second scheme, SAD's are employed so that talker silences can be utilized for transmission of additional voice and/or data. In this scheme, the multiplexer performs digital speech interpolation as well as movable boundary frame allocation. The performance measures considered are probability of loss for voice calls, probability of speech clipping, speech packet rejection ratio, and the expected data message delay. In the case of the multiplexer with SAD, a tradeoff exists between data message delay and speech interpolation advantage. Some numerical examples are presented which illustrate the performance of the two multiplexers.
