Similar literature
20 similar documents found; search took 62 ms.
1.
In this paper, speech bit rate reduction by not transmitting a percentage of samples (i.e., robbing the coder of some samples) has been studied. The technique has been applied to predictive coders, namely differential PCM (DPCM) and adaptive DPCM (ADPCM) coders. A robbed sample is replaced by its estimate so that the prediction process in the feedback loop of the coders continues in a normal manner. After one period delay, when the next sample is decoded, the robbed sample is reestimated using delayed interpolation. Only periodic sample robbing has been considered, such as every fourth, every third, etc. The technique is particularly useful where graceful degradation is required under heavy loading conditions. The technique is found to be useful when the desired bit rate is 24 kbits/s or lower. The technique was evaluated by computer simulation using real-time speech inputs. Improvements of up to 3 dB in the case of a DPCM coder and of up to 1.5 dB in the case of an ADPCM coder have been achieved.
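
The periodic robbing and delayed re-estimation described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the interpolation formula, so the sketch averages the two neighbouring samples, and it omits the DPCM/ADPCM prediction loop entirely. The function names `rob_samples`, `reconstruct`, and `reduced_rate` are hypothetical.

```python
def rob_samples(samples, period):
    """Drop every `period`-th sample; return the (index, value) pairs that are sent."""
    if period < 2:
        raise ValueError("period must be at least 2")
    return [(i, s) for i, s in enumerate(samples) if (i + 1) % period != 0]


def reconstruct(kept, length):
    """Rebuild the sequence, estimating each robbed sample from its neighbours.

    Assumption: the estimate is the mean of the previous and next samples,
    which is why it can only be formed one sample late. With period >= 2
    both neighbours of a robbed sample are always transmitted.
    """
    out = [None] * length
    for i, s in kept:
        out[i] = s
    for i in range(length):
        if out[i] is None:
            prev = out[i - 1] if i > 0 else 0.0
            # At the end of the sequence there is no next sample: repeat the previous one.
            nxt = out[i + 1] if i + 1 < length else prev
            out[i] = (prev + nxt) / 2
    return out


def reduced_rate(full_rate, period):
    """Bit rate left after robbing one sample in every `period`."""
    return full_rate * (period - 1) / period


ramp = [0, 1, 2, 3, 4, 5, 6, 7]
print(reconstruct(rob_samples(ramp, 4), len(ramp)))
print(reduced_rate(32, 4))  # hypothetical 32 kbit/s coder, every fourth sample robbed
```

On a linear ramp the interior robbed sample is recovered exactly, while the final one can only repeat its predecessor. The 32 kbit/s starting rate is a made-up input, not a figure from the paper.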

2.
Predictive Coding of Speech at Low Bit Rates (cited by: 1 in total, 0 self-citations, 1 by others)
Predictive coding is a promising approach for speech coding. In this paper, we review the recent work on adaptive predictive coding of speech signals, with particular emphasis on achieving high speech quality at low bit rates (less than 10 kbits/s). Efficient prediction of the redundant structure in speech signals is obviously important for proper functioning of a predictive coder. It is equally important to ensure that the distortion in the coded speech signal be perceptually small. The subjective loudness of quantization noise depends both on the short-time spectrum of the noise and its relation to the short-time spectrum of the speech signal. The noise in the formant regions is partially masked by the speech signal itself. This masking of quantization noise by the speech signal allows one to use low bit rates while maintaining high speech quality. This paper will present generalizations of predictive coding for minimizing subjective distortion in the reconstructed speech signal at the receiver. The quantizer in predictive coders quantizes its input on a sample-by-sample basis. Such sample-by-sample (instantaneous) quantization creates difficulty in realizing an arbitrary noise spectrum, particularly at low bit rates. We will describe a new class of speech coders in this paper which could be considered to be a generalization of the predictive coder. These new coders not only allow one to realize the precise optimum noise spectrum which is crucial to achieving very low bit rates, but also represent the important first step in bridging the gap between waveform coders and vocoders without suffering from their limitations.

3.
We review the variable frame rate (VFR) transmission methodology that we developed, implemented, and tested during the period 1973-1978 for efficiently transmitting LPC vocoder parameters extracted from the input speech at a fixed frame rate. In the VFR method, parameters are transmitted only when their values have changed sufficiently over the interval since their preceding transmission. We explored two distinct approaches to automatic implementation of the VFR method. The first approach bases the transmission decisions on comparisons of the parameter values of the present frame and the last transmitted frame. The second approach, which is based on a functional perceptual model of speech, compares the parameter values of all the frames that lie in the interval between the present frame and the last transmitted frame against a linear model of parameter variation over that interval. The application of VFR transmission to the design of narrow-band LPC speech coders with average bit rates of 2000-2400 bits/s is also considered. The transmission decisions are made separately for the three sets of LPC parameters (pitch, gain, and spectral parameters) using separate VFR schemes. A formal subjective speech quality test of six selected LPC coders is described, and the results are presented and analyzed in detail. It is shown that a 2075 bit/s VFR coder produces speech quality equal to or better than that of a 5700 bit/s fixed frame rate coder.
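
The first VFR approach, which compares the present frame with the last transmitted frame, can be sketched as below. This is a minimal illustration, not the authors' implementation: the distance measure (largest absolute parameter difference), the threshold value, and the name `vfr_select` are all assumptions, since the abstract does not specify them.

```python
def vfr_select(frames, threshold):
    """Return the indices of the frames that would be transmitted.

    Assumption: a frame is sent when any parameter differs from the last
    transmitted frame by more than `threshold`; the first frame is always sent.
    """
    if not frames:
        return []
    sent = [0]
    last = frames[0]
    for i, frame in enumerate(frames[1:], start=1):
        distance = max(abs(a - b) for a, b in zip(frame, last))
        if distance > threshold:
            sent.append(i)
            last = frame  # the reference becomes the newly transmitted frame
    return sent


# One hypothetical parameter per frame, extracted at a fixed frame rate.
frames = [[0.0], [0.1], [0.2], [0.6], [0.65], [1.2]]
print(vfr_select(frames, threshold=0.3))
```

Because the reference is the last transmitted frame rather than the previous frame, slow drift accumulates until it crosses the threshold. The abstract's second approach would instead test the intermediate frames against a linear model of parameter variation, which this sketch does not attempt.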

4.
The combination of speech coders and entropy coders is investigated for bit rate reduction. Three speech coders of the CELP (code excited linear prediction) type are considered, and the residual correlation in LSP (line spectrum pairs) coefficients and gains in a speech frame is exploited. The lossless entropy coders use Huffman, LZW (Lempel-Ziv-Welch) and gzip (LZ-Huffman) techniques. The greatest efficiency is provided by the adaptive Huffman approach, with a 15% gain in each type of compressed parameter and an overall average bit rate reduction of 7% for the FS1016 coder and 5% for the TETRA and LBC coders.

5.
N. Moreau, P. Dymarski. Annals of Telecommunications, 2000, 55(9-10): 493-506
A low delay coder for speech and music signals sampled at 32 kHz is described. Its algorithmic delay does not exceed 25 ms, which enables audioconferencing applications without echo cancellation. Its bit rate is scalable between 64 and 32 kbit/s in steps of 8 kbit/s. The transmitter issues the binary code at 64 kbit/s with lower bit rate codes embedded in it. The receiver may operate at lower bit rates with gradual loss of quality. The proposed coder is based on a mixed scheme: the adopted solution contains elements from the CELP speech coder and frequency domain music coders. The perceptual signal is obtained in the time domain, then transformed to the frequency domain where bit allocation is calculated and transform coefficients are quantized. A first solution based on the DFT is discussed, then a second solution based on an MDCT with small overlap is applied. The quantization of these coefficients is done in the following way. First, a prediction of the whole spectrum is applied. Then, a mean-removed gain-shape split VQ is used for amplitude spectrum quantization and a hierarchical 2-dimensional VQ is used for phase spectrum quantization with amplitude correction. At the phase quantization stage, each codeword describing the selected vector index is split into parts corresponding to different bit rates. Due to the hierarchical codebook structure, truncated indices may be used without much affecting the signal quality. Simulation results are presented and the robustness of the proposed coder is examined.

6.
The design and implementation of a real-time CELP coder for mobile communication applications are discussed. To realize a single-chip implementation, several tradeoffs were made without compromising speech quality. In addition, techniques that make the coder more robust under a variety of channel conditions are discussed. The real-time coder can be operated at different bit rates (8, 6.8, 4.6 kb/s) by simply changing the frame update rates. The speech quality was evaluated through a formal listening test, and it was found that this coder compares favorably with other (standardized) coders operating at similar or higher rates.

7.
This paper presents several strategies to improve the performance of very low bit rate speech coders and describes a speech codec that incorporates these strategies and operates at an average bit rate of 1.2 kb/s. The encoding algorithm is based on several improvements in a mixed multiband excitation (MMBE) linear predictive coding (LPC) structure. A switched-predictive vector quantiser technique that outperforms previously reported schemes is adopted to encode the LSF parameters. Spectral and sound specific low rate models are used in order to achieve high quality speech at low rates. An MMBE approach with three sub-bands is employed to encode voiced frames, while fricative and stop modelling and synthesis techniques are used for unvoiced frames. This strategy is shown to provide good quality synthesised speech at a bit rate of only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve decoded speech, a spectral envelope restoration combined with noise reduction (SERNR) postfilter is used. The contributions of the techniques described in this paper are separately assessed and then combined in the design of a low bit rate codec that is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder. The performance assessment is carried out in terms of the spectral distortion of LSF quantisation, mean opinion score (MOS), A/B comparison tests and the ITU-T P.862 perceptual evaluation of speech quality (PESQ) standard. Assessment results show that the improved methods for LSF quantisation, sound specific modelling and synthesis and the new postfiltering approach can significantly outperform previously reported techniques. Further results also indicate that a system combining the proposed improvements and operating at 1.2 kb/s is comparable to, and slightly outperforms, a MELP coder operating at 2.4 kb/s. For tandem connection situations, the proposed system is clearly superior to the MELP coder.

8.
A medium-band speech coder is proposed that uses a weighted vector quantization scheme in the transformed domain. The linear prediction residue is transformed and vector-quantized. In order to control the quantization errors in the transformed domain, adaptively weighted matching is used instead of conventional adaptive bit allocation. Therefore, the residual signal can be reconstructed by the decoder, even if the spectral envelope parameters are destroyed due to transmission errors. This coder is also capable of maintaining higher SNR (signal-to-noise ratio) performance than time-domain vector quantization coders for a wide range of computation complexities and bit rates. Coded speech is natural and unaffected by background noise. The mean opinion score for this coder at 7.2 kb/s is comparable to that of 5.5-bit log PCM coded speech sampled at 6.4 kHz.

9.
Low bit-rate speech coders for multimedia communication (cited by: 10 in total, 0 self-citations, 10 by others)
The International Telecommunications Union (ITU) has standardized three speech coders which are applicable to low-bit-rate multimedia communications. ITU Rec. G.729 8 kb/s CS-ACELP has a 15 ms algorithmic codec delay and provides network-quality speech. It was originally designed for wireless applications, but is applicable to multimedia communications as well. Annex A of Rec. G.729 is a reduced-complexity version of the CS-ACELP coder. It was designed explicitly for simultaneous voice and data applications that are prevalent in low-bit-rate multimedia communications. These two coders use the same bitstream format and can interoperate. The ITU Rec. G.723.1 6.3 and 5.3 kb/s speech coder for multimedia communications was designed originally for low-bit-rate videophones. Its frame size of 30 ms and one-way algorithmic codec delay of 37.5 ms allow for a further reduction in bit rate compared to the G.729 coder. In applications where low delay is important, the delay of G.723.1 may be too large. However, if the delay is acceptable, G.723.1 provides a lower-complexity alternative to G.729 at the expense of a slight degradation in quality. This article describes the attributes of speech coders such as bit rate, complexity, delay, and quality. Then it discusses the basic concepts of the three new ITU coders by comparing their specific attributes. The second part of this article describes the standardization process for each of these coders.
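
The G.723.1 figures quoted above imply a per-frame bit budget and a look-ahead that can be worked out directly. The sketch below uses only the nominal numbers from the abstract (6.3 and 5.3 kb/s, 30 ms frames, 37.5 ms algorithmic delay) and assumes that algorithmic delay is frame size plus look-ahead. The function names are hypothetical, and the exact standardized frame sizes may differ slightly from these nominal values.

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Nominal number of bits available for one frame (integer arithmetic)."""
    return bit_rate_bps * frame_ms // 1000


def lookahead_ms(algorithmic_delay_ms, frame_ms):
    """Look-ahead, assuming algorithmic delay = frame size + look-ahead."""
    return algorithmic_delay_ms - frame_ms


print(bits_per_frame(6300, 30))  # nominal budget at 6.3 kb/s
print(bits_per_frame(5300, 30))  # nominal budget at 5.3 kb/s
print(lookahead_ms(37.5, 30))    # look-ahead implied by the quoted delay
```

A longer frame spreads the side information over more samples, which is one way to read the abstract's remark that the 30 ms frame permits a lower bit rate than G.729 at the cost of delay.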

10.
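This paper introduces a design method for the communication and control interfaces of a low-bit-rate speech codec DSP system built around the TMS320C548. It focuses on the hardware structure and software design of the DSP system's internal communication and control interfaces.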

11.
The effects of digital transmission errors on a family of variable-rate embedded subband speech coders (SBC) are analyzed in detail. It is shown that there is a difference in error sensitivity of four orders of magnitude between the most and the least sensitive bits of the speech coder. As a result, a family of rate-compatible punctured convolutional codes with flexible unequal error protection capabilities has been matched to the speech coder. These codes are optimally decoded with the Viterbi algorithm. Among the results, analysis and informal listening tests show that with a 4-level unequal error protection scheme, transmission of 12 kb/s speech is possible with very little degradation in quality over a 16 kb/s channel with an average bit error rate (BER) of 2×10⁻² at a vehicle speed of 60 m.p.h. and with interleaving over two 16 ms speech frames.

12.
In low rate code-excited linear predictive (CELP) coders, the LPC spectral information is usually quantized and transmitted on a frame-by-frame basis about every 20 to 30 msec. The quality of speech reproduced by a CELP coder can be improved by making spectral transitions as smooth and continuous as possible. One way in which this can be accomplished without increasing the transmission bit rate is to interpolate the LPC spectral parameters between adjacent extraction frames. This, however, usually leads to a dramatic increase in the computations required for the codebook search. The paper presents a new LPC interpolation technique, based on interpolating the impulse response of the LPC synthesis filter. It demonstrates that this method offers a significant complexity reduction for the codebook search over other typical interpolation schemes. Furthermore, the experiments show that the coder using the impulse response for interpolation produces the same speech quality as the coder using the LSP parameters for interpolation, and both these parameter sets are superior to other LPC representations for interpolation.
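
The idea of interpolating the synthesis filter's impulse response, rather than the LPC parameters themselves, can be sketched as follows. The abstract gives no formulas, so this sketch makes several assumptions: the sign convention A(z) = 1 + Σ a_k z^(-k), a truncated impulse response, and plain linear interpolation between two frames. The names `impulse_response` and `interpolate` are hypothetical.

```python
def impulse_response(a, n):
    """First n samples of the impulse response of the synthesis filter 1/A(z).

    Assumed convention: A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...
    """
    h = []
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0  # unit impulse at time 0
        for k, ak in enumerate(a, start=1):
            if i - k >= 0:
                acc -= ak * h[i - k]
        h.append(acc)
    return h


def interpolate(h_prev, h_next, weight):
    """Linear interpolation between two impulse responses (0 <= weight <= 1)."""
    return [(1 - weight) * p + weight * q for p, q in zip(h_prev, h_next)]


h_a = impulse_response([-0.5], 4)  # hypothetical first-order filter, frame A
h_b = impulse_response([0.5], 4)   # hypothetical first-order filter, frame B
print(interpolate(h_a, h_b, 0.5))  # response used for a subframe midway between them
```

Since a CELP codebook search filters each candidate excitation through this response, interpolating the response directly avoids converting interpolated parameters back into a filter for every subframe, which is a plausible reading of the complexity saving the abstract reports.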

13.
Two very different subband coders are described. The first is a modified dynamic bit-allocation subband coder (D-SBC) designed for variable rate coding situations and easily adaptable to noisy channel environments. It can operate at rates as low as 12 kb/s and still give good quality speech. The second coder is a 16-kb/s waveform coder, based on a combination of subband coding and vector quantization (VQ-SBC). The key feature of this coder is its short coding delay, which makes it suitable for real-time communication networks. The speech quality of both coders has been enhanced by adaptive postfiltering. The coders have been implemented on a single AT&T DSP32 signal processor.

14.
A multidigit adaptive delta modulation (ADM) system has been proposed where the error signal, between the input and the approximated signal produced by the ADM coder, is coded in an auxiliary encoder. The error in the auxiliary coder is processed by another ADM and so on. The bit rate of each of these coders is $f_r/N$, where $f_r$ is the overall transmission rate and $N$ is the number of coders used. The bit streams are interleaved for transmission and at the receiver they are separated and decoded, and these signals are added and filtered. It is shown that for a given transmission rate, each coder operates at a basic sampling rate of $f_{rB}$ such that $N_{opt} = f_r/f_{rB}$ gives the optimum number of coders to be used for maximum signal-to-noise ratio (SNR). A bound is derived for the maximum SNR of such a system and is compared with the bounds derived for other predictive coders. The experimental results of a two-digit ADM are presented. An average SNR of 30 dB is obtained with a dynamic range of 32 dB at $f_r = 32$ kbits/s for band-limited noise signals. The SNR increases with the sampling rate at 15 dB/octave, as against 9 dB/octave for a single-digit ADM. The frequency response is good and the variation of SNR with the message frequency of the delta coding system has been improved. The effect of channel errors has also been studied and the performance of the system is found satisfactory.

15.
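Based on adaptive forward/backward LPC quantization, this paper proposes a variable-rate mixed excitation linear prediction (MELP) speech coding algorithm. In this algorithm, linear prediction is performed using either the current speech frame (forward LPC) or a previously synthesized speech frame (backward LPC). When backward LPC is used, only the time-sequence code needs to be transmitted, which reduces the average number of bits spent on the LPC coefficients. Computer simulations show that the algorithm produces synthesized speech of quality comparable to the standard MELP algorithm while significantly reducing the LPC transmission bandwidth, and thus markedly lowering the average MELP coding rate.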

16.
In wireless commercial and military communications systems, where bandwidth is at a premium, robust low-bit-rate speech coders are essential. They operate at fixed bit rates, and those bit rates cannot be altered without major modifications to the vocoder design. A novel approach to vocoders that reduces the bit rate required to transmit the speech signal is proposed. While traditional low-bit-rate vocoders code the original input speech, the proposed procedure operates on the time-scale modified signal. The proposed method offers any bit rate from 2400 b/s downwards without modifying the principal vocoder structure, which is the new NATO standard, STANAG 4591, the Mixed Excitation Linear Prediction (MELP) vocoder. We consider the application of transmitting MELP-encoded speech over noisy communication channels by applying different modulation techniques after time-scale compression is applied. Three different time-scale modification algorithms have been evaluated, and the waveform similarity overlap and add (WSOLA) algorithm has been selected for time-scale modification purposes. Computer simulation results, both source and channel, are presented in terms of objective speech quality metrics and informal subjective listening tests. Design parameters such as codec complexity and delay are also investigated. Simulation results lead to a possible wireless communications system whose performance might be enhanced by using the spared bits offered by the procedure.

17.
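A 600 bps Very Low Bit Rate Speech Coding Algorithm (cited by: 1 in total, 0 self-citations, 1 by others)
To meet the demand for low-rate speech coding in anti-jamming communications, this paper proposes a 600 bps very low bit rate speech coding algorithm. It uses a six-frame superframe structure consisting of 2 basic frames and 4 interpolated frames. The linear prediction (LPC) parameters of the interpolated frames are quantized with a four-stage residual matrix quantizer based on closed-loop optimal first-order linear prediction. At the decoder, a closed-loop method for estimating the excitation pulse amplitude is proposed, which improves the naturalness of the synthesized speech and the intelligibility of nasal syllables. The algorithm provides good synthesized speech quality, achieving a DRT score of 88.55.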

18.
A theoretical method of evaluating degradations of variable rate coders in a multichannel digital speech interpolation (DSI) system is developed. Each of the coder outputs has a variable rate based on the algorithm. The DSI system multiplexes the outputs of these variable rate coders into a fixed bit rate channel. During periods of high activity all active users are served, but at a reduced rate depending on the demand. The degradation due to high activity is shared by all active users. This system avoids speech clipping and "freeze-out" distortion. Theoretical expressions of the system overload probability and the probability of degradation to a particular user in the DSI system are derived. Two types of variable rate coders, namely, a constant quality subband coder and a constant noise subband coder, are chosen and used as examples. Comparisons of the degradations are made between the theoretical results and computer simulated results for the two types of variable rate coders, and close agreement is observed. The theory is applicable to other variable rate coding algorithms as well. In this study, all of the simulations are made at 40 percent speech activity and the average rate of the variable rate coders is close to 16 kbits/s. Objective quality measures indicate that in a system with a trunk size larger than 40, the variable rate coder DSI system can achieve a 2:1 compression with a degradation of less than 1 dB compared to non-DSI variable rate coders. This corresponds to a total gain of 8:1 when compared to 64 kbit/s PCM.
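
The 8:1 figure follows from combining the coder's rate reduction with the DSI compression, and the rate-sharing behaviour under load can be illustrated with a simple model. The sketch below is only an illustration: equal sharing of the channel among active users is an assumption (the abstract says only that the degradation is shared), and the 320 kbit/s channel and the function names are hypothetical.

```python
def total_gain(pcm_rate_kbps, coder_rate_kbps, dsi_compression):
    """Overall gain relative to PCM: coder rate reduction times DSI compression."""
    return (pcm_rate_kbps / coder_rate_kbps) * dsi_compression


def per_user_rate(channel_kbps, active_users, nominal_kbps):
    """Rate given to each active user when the fixed channel is shared equally.

    Assumption: every active talker is served, and all are scaled back
    together once the demand exceeds the channel capacity.
    """
    if active_users == 0:
        return 0.0
    return min(nominal_kbps, channel_kbps / active_users)


print(total_gain(64, 16, 2))       # 64 kbit/s PCM vs. a 16 kbit/s coder with 2:1 DSI
print(per_user_rate(320, 16, 16))  # light load: every talker gets the nominal rate
print(per_user_rate(320, 25, 16))  # heavy load: all talkers are reduced together
```

Under this model nobody is clipped or frozen out when more users talk than the channel was sized for. Each active user simply receives fewer bits, which matches the graceful degradation the abstract describes.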

19.
A new phase coding algorithm working in the pitch-cycle waveform domain is introduced. It provides accurate phase coding at low bit cost, thus being suitable for low bit rate sinusoidal coders. Its performance is analysed inside a multiband excitation (MBE) coder with improved onset representation. In this context, the introduction of original phase information by means of the proposed coding algorithm provides noticeable quality improvement without significantly increasing the complexity and total bit rate of the coder.

20.
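This paper proposes a high-quality variable-rate speech coding algorithm with bit rates of 0.75-5.4 kb/s. The algorithm improves the CELP excitation: speech is classified into four types according to its characteristics, and each type uses a different excitation codebook. In particular, for voiced speech, a pitch-synchronous embedded split excitation codebook is proposed. By exploiting the quasi-periodicity of voiced speech, this codebook allows the algorithm to recover voiced signals well at very low bit rates, overcoming the poor synthesized speech quality of CELP below 4 kb/s caused by the small codebook size. Informal listening tests indicate that its subjective quality exceeds that of the 1-8 kb/s variable-rate QCELP system, while its average rate is only about 2 kb/s, much lower than QCELP's 5 kb/s average rate, making it well suited to CDMA mobile communication systems.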
