Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
A multidigit adaptive delta modulation (ADM) system has been proposed where the error signal, between the input and the approximated signal produced by the ADM coder, is coded in an auxiliary encoder. The error in the auxiliary coder is processed by another ADM and so on. The bit rate of each of these coders is $f_r/N$, where $f_r$ is the overall transmission rate and $N$ is the number of coders used. The bit streams are interleaved for transmission and at the receiver they are separated and decoded, and these signals are added and filtered. It is shown that for a given transmission rate, each coder operates at a basic sampling rate of $f_{rB}$ such that $N_{opt} = f_r/f_{rB}$ gives the optimum number of coders to be used for maximum signal-to-noise ratio (SNR). A bound is derived for the maximum SNR of such a system and is compared with the bounds derived for other predictive coders. The experimental results of a two-digit ADM are presented. An average SNR of 30 dB is obtained with a dynamic range of 32 dB at $f_r = 32$ kbits/s for band-limited noise signals. The SNR increases with the sampling rate at 15 dB/octave, as against 9 dB/octave for a single-digit ADM. The frequency response is good and the variation of SNR with the message frequency of the delta coding system has been improved. The effect of channel errors has also been studied and the performance of the system is found satisfactory.

2.
In this paper, speech bit rate reduction by not transmitting a percentage of samples (i.e., robbing the coder of some samples) has been studied. The technique has been applied to predictive coders, namely differential PCM (DPCM) and adaptive DPCM (ADPCM) coders. A robbed sample is replaced by its estimate so that the prediction process in the feedback loop of the coders continues in a normal manner. After one period delay, when the next sample is decoded, the robbed sample is reestimated using delayed interpolation. Only periodic sample robbing has been considered, such as every fourth, every third, etc. The technique is particularly useful where graceful degradation is required under heavy loading conditions. The technique is found to be useful when the desired bit rate is 24 kbits/s or lower. The technique was evaluated by computer simulation using real-time speech inputs. Improvements of up to 3 dB in the case of a DPCM coder and of up to 1.5 dB in the case of an ADPCM coder have been achieved.
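The robbing and re-estimation steps in the abstract above can be illustrated with a minimal sketch. It assumes linear interpolation between the two neighbouring samples (the abstract says only "delayed interpolation"), leaves out the DPCM/ADPCM prediction loop entirely, and uses hypothetical function names `rob_samples` and `reestimate`.

```python
def rob_samples(samples, period=4):
    """Mark every `period`-th sample as not transmitted (None)."""
    return [None if (i + 1) % period == 0 else x
            for i, x in enumerate(samples)]


def reestimate(received):
    """Fill each robbed sample once its successor is available.

    Assumption: the estimate is the mean of the previous and next samples.
    A robbed sample at the very end has no successor, so the previous
    sample is repeated instead.
    """
    out = list(received)
    for i, x in enumerate(out):
        if x is None:
            prev = out[i - 1] if i > 0 else 0.0
            has_next = i + 1 < len(received) and received[i + 1] is not None
            nxt = received[i + 1] if has_next else prev
            out[i] = 0.5 * (prev + nxt)
    return out


if __name__ == "__main__":
    decoded = reestimate(rob_samples([0, 1, 2, 3, 4, 5, 6, 7], period=4))
    print(decoded)
```

With `period=4`, three of every four samples are sent, so the transmitted sample count drops to 75% of the original.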

3.
A speech coding algorithm with low complexity and a short processing delay is introduced. The proposed algorithm is ADPCM (adaptive differential pulse code modulation) with a multiquantizer (ADPCM-MQ). The input signal is processed in parallel by multiple ADPCM coders with different characteristics. Then the optimum ADPCM coder with minimum error power is dynamically selected for each frame. A 16-kb/s codec based on this algorithm has been implemented using two general-purpose digital signal processors (MB8764) with 8.3 ms of total processing delay. A segmental SNR of 19-21 dB was achieved at 16 kb/s; with postfiltering the segmental SNR was increased to 23-25 dB. Combined with the time domain compression scheme, the algorithm can be easily applied to 8-kb/s coding. It is also extensible to variable-rate coding.
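The per-frame selection rule described above (run several coders in parallel, keep the one with minimum error power) can be sketched as follows. This is only an illustration: the parallel coders here are toy first-order DPCM coders with different fixed step sizes, standing in for the ADPCM coders of the paper, and coder state is reset at every frame for simplicity. The names `dpcm_frame` and `select_coder`, the predictor coefficient and the step sizes are all assumptions.

```python
def dpcm_frame(frame, step, levels=4, a=0.9):
    """Code one frame with a toy first-order DPCM coder (stand-in coder)."""
    codes, recon = [], []
    pred = 0.0  # assumption: state is reset at the start of each frame
    for x in frame:
        prediction = a * pred
        # Uniform quantizer with codes limited to [-levels, levels - 1].
        code = max(-levels, min(levels - 1, round((x - prediction) / step)))
        pred = prediction + code * step
        codes.append(code)
        recon.append(pred)
    return codes, recon


def select_coder(frame, steps):
    """Return (error_power, coder_index, codes) for the best coder."""
    best = None
    for idx, step in enumerate(steps):
        codes, recon = dpcm_frame(frame, step)
        power = sum((x - y) ** 2 for x, y in zip(frame, recon)) / len(frame)
        if best is None or power < best[0]:
            best = (power, idx, codes)
    return best


if __name__ == "__main__":
    steps = [0.01, 0.5]  # hypothetical "fine" and "coarse" coders
    for frame in ([0.01, 0.02, 0.01, 0.0], [1.0, -1.0, 1.0, -1.0]):
        power, idx, _ = select_coder(frame, steps)
        print(f"selected coder {idx}, error power {power:.6f}")
```

In a real codec the index of the selected coder would be sent as side information with each frame so the decoder can apply the matching quantizer.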

4.
In this paper, implementation of a compact and efficient multirate speech digitizer with variable transmission rates of 2.4, 4.8, 9.6, and 14.96 kbits/s is presented. The multirate algorithm is based on the residual-excited linear prediction (RELP) vocoder with a transmission rate of 9.6 kbits/s. The residual encoder employed in the RELP vocoder uses hybrid companding delta modulation (HCDM). This HCDM is also used as a 14.96 kbit/s coder. If the residual in the RELP system is down-sampled before encoding, a 4.8 kbit/s coder can be realized. If the residual encoder is not used, a 2.4 kbit/s linear predictive coder (LPC) can be realized by incorporating a pitch extractor. In the 4.8 and 9.6 kbit/s coders the pitch-implanted residual excitation method has been used to generate the excitation signal to the synthesis filter. The multirate speech digitizer algorithm has been implemented using 2900 series bit-slice microprocessors. The external memory is composed of 2K RAM's and 2K ROM's. The system design is a two-bus structure with a 204 ns cycle time. With efficient hardware and software design, the multirate speech digitizer has almost the same hardware complexity as the conventional 2.4 kbit/s LPC vocoder.

5.
Predictive Coding of Speech at Low Bit Rates   (Total citations: 1; self-citations: 0; citations by others: 1)
Predictive coding is a promising approach for speech coding. In this paper, we review the recent work on adaptive predictive coding of speech signals, with particular emphasis on achieving high speech quality at low bit rates (less than 10 kbits/s). Efficient prediction of the redundant structure in speech signals is obviously important for proper functioning of a predictive coder. It is equally important to ensure that the distortion in the coded speech signal be perceptually small. The subjective loudness of quantization noise depends both on the short-time spectrum of the noise and its relation to the short-time spectrum of the speech signal. The noise in the formant regions is partially masked by the speech signal itself. This masking of quantization noise by the speech signal allows one to use low bit rates while maintaining high speech quality. This paper will present generalizations of predictive coding for minimizing subjective distortion in the reconstructed speech signal at the receiver. The quantizer in predictive coders quantizes its input on a sample-by-sample basis. Such sample-by-sample (instantaneous) quantization creates difficulty in realizing an arbitrary noise spectrum, particularly at low bit rates. We will describe a new class of speech coders in this paper which could be considered to be a generalization of the predictive coder. These new coders not only allow one to realize the precise optimum noise spectrum which is crucial to achieving very low bit rates, but also represent the important first step in bridging the gap between waveform coders and vocoders without suffering from their limitations.

6.
A theoretical method of evaluating degradations of variable rate coders in a multichannel digital speech interpolation (DSI) system is developed. Each of the coder outputs has a variable rate based on the algorithm. The DSI system multiplexes the outputs of these variable rate coders into a fixed bit rate channel. During periods of high activity all active users are served, but at a reduced rate depending on the demand. The degradation due to high activity is shared by all active users. This system avoids speech clipping and "freeze-out" distortion. Theoretical expressions of the system overload probability and the probability of degradation to a particular user in the DSI system are derived. Two types of variable rate coders, namely, a constant quality subband coder and a constant noise subband coder, are chosen and used as examples. Comparisons of the degradations are made between the theoretical results and computer simulated results for the two types of variable rate coders, and close agreement is observed. The theory is applicable to other variable rate coding algorithms as well. In this study, all of the simulations are made at 40 percent speech activity and the average rate of the variable rate coders is close to 16 kbits/s. Objective quality measures indicate that in a system with a trunk size larger than 40, the variable rate coder DSI system can achieve a 2:1 compression with a degradation of less than 1 dB compared to non-DSI variable rate coders. This corresponds to a total gain of 8:1 when compared to 64 kbit/s PCM.
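The 8:1 figure quoted above follows from multiplying the two gains given in the abstract. As a rough worked check, treating the average coder rate as exactly 16 kbit/s (the abstract says "close to"):

```latex
\[
\underbrace{\frac{64\ \text{kbit/s (PCM)}}{16\ \text{kbit/s (variable rate coder)}}}_{\text{coding gain}\ \approx\ 4:1}
\times
\underbrace{2}_{\text{DSI gain}\ 2:1}
\;\approx\; 8 ,
\]
```

that is, each talker occupies on average about 8 kbit/s of channel capacity instead of 64 kbit/s.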

7.
Subjective quality measurements on three digital speech coders, simulated with mobile radio channel transmission, were performed using the "mean opinion score (MOS)" method. The three speech coding methods tested were: continuously variable slope delta modulation (CVSD) coding, adaptive predictive coding (APC), and residually excited linear predictive (RELP) coding. Several versions of each coder, with transmission rates in the range of 7.3 to 16.1 kbits/s, were simulated. Five different channel conditions, including three derived from land mobile radio field experiments, were applied to the speech coders' encoded output to study the effects. The results show that of the three coders, the CVSD coder is the most robust to channel errors, but produces reconstructed output speech of unacceptable quality. The 14.4 kbit/s RELP coder produces relatively good output speech quality, exhibits a mild degree of robustness to mobile radio channel errors, and is slightly less complex than the APC coder. Of the three digital speech coders tested, the RELP coder appears the most suitable for use with land mobile radio. However, none of the three coders was able to produce speech of telephone toll quality in a mobile radio environment.

8.
Two very different subband coders are described. The first is a modified dynamic bit-allocation subband coder (D-SBC) designed for variable rate coding situations and easily adaptable to noisy channel environments. It can operate at rates as low as 12 kb/s and still give good quality speech. The second coder is a 16-kb/s waveform coder, based on a combination of subband coding and vector quantization (VQ-SBC). The key feature of this coder is its short coding delay, which makes it suitable for real-time communication networks. The speech quality of both coders has been enhanced by adaptive postfiltering. The coders have been implemented on a single AT&T DSP32 signal processor.

9.
In this paper, we establish a probabilistic framework for adaptive transform coding that leads to a generalized Lloyd type algorithm for transform coder design. Transform coders are often constructed by concatenating an ad hoc choice of transform with suboptimal bit allocation and quantizer design. Instead, we start from a probabilistic latent variable model in the form of a mixture of constrained Gaussian mixtures. From this model, we derive a transform coder design algorithm, which integrates optimization of all transform coder parameters. An essential part of this algorithm is our introduction of a new transform basis, the coding optimal transform, which, unlike commonly used transforms, minimizes compression distortion. Adaptive transform coders can be effective for compressing databases of related imagery since the high overhead associated with these coders can be amortized over the entire database. For this work, we performed compression experiments on a database of synthetic aperture radar images. Our results show that adaptive coders improve compressed signal-to-noise ratio (SNR) by approximately 0.5 dB compared with global coders. Coders that incorporated the coding optimal transform had the best SNRs on the images used to develop the coder. However, coders that incorporated the discrete cosine transform generalized better to new images.

10.
This paper shows the utility of using adaptive quantizers in the tree-encoding of speech waveforms based on the (M, L) algorithm [1]. Resulting adaptive differential PCM (ADPCM) and adaptive delta modulation (ADM) encoders, with time-invariant prediction networks, can provide useful speech outputs at bit rates on the order of 24 kbits/s; at 16 kbits/s, on the other hand, the encoders exhibit clearly perceptible amounts of quantization noise.

11.
The predicted wordlength assignment system (PWA) is a digital speech interpolation method which avoids speech clipping and "freeze-out" distortion. Inactive sources are excluded by a speech detector. The active speech signals are coded with variable wordlengths (3-8 bits) at a sampling rate of 8 kHz. In an overload case, all active sources are still served, but at reduced wordlength. The required wordlength is calculated using only the signal history, which is also available at the receiver. Therefore, no auxiliary information about the individual wordlength is transmitted. A system with up to 128 telephone conversation speech sources has been studied using computer simulation. The signal-to-noise ratio (SNR) is employed to describe speech quality. With an input of 128 sources (40 percent activity) and a transmission rate per source of 21 kbits/s, an SNR of 34 dB can be achieved. Above a bit rate of 16 kbits/s, distortions are not audible. As a first step towards implementation, a specially designed fast microprocessor has been used to simulate the most important PWA system functions, such as speech detection, linear prediction, and coding algorithm.

12.
A novel high-quality, low-complexity dual-rate 4.7 and 6.5 kbits/s algebraic code excited linear predictive codec is proposed for adaptive multi-mode speech communicators, which can drop their source rate and speech quality under network control in order to invoke a more error resilient modem amongst less favorable channel conditions. Source-matched binary Bose-Chaudhuri-Hocquenghem (BCH) codecs combined with unequal protection diversity- and pilot-assisted 16- and 64-level quadrature amplitude modulation (16-QAM, 64-QAM) are employed in order to accommodate both the 4.7 and the 6.5 kbits/s coded speech bits at a signaling rate of 3.1 kBd. Assuming an excess bandwidth of 100%, in a bandwidth of 200 kHz 32 time slots can be created, which allows us to support in excess of 50 users, when employing packet reservation multiple access (PRMA). Good communications quality speech is delivered in an equivalent bandwidth of 4 kHz, if the channel signal-to-noise ratio (SNR) and signal-to-interference ratio (SIR) of the benign indoors cordless channel are in excess of about 15 and 25 dB for the lower and higher speech quality 16-QAM and 64-QAM systems, respectively, and the PRMA time-slots are sufficiently uninterfered due to using time-slot classification algorithms and due to the attenuation of partitioning walls and ceilings.

13.
A characteristic of a mobile radio channel is the occurrence of correlated signal fading that results in burst errors. The use of adaptive delta modulation (ADM) based on explicit transmission of the quantizer step size was proposed earlier for speech communication over such a channel. Two other variable step-size delta modulation (VSDM) schemes are presented, and their performance in a mobile radio environment is discussed. One of them is the constant factor delta modulation that uses one-bit memory and produces fast and instantaneous step-size changes. The other is the digitally controlled delta modulation (DCDM) that incorporates a new step-size adaptation strategy using seven bits of memory. In some cases, bit scrambling has been used. This is equivalent to scrambling (spreading out in time) the clustered errors. Computer simulations providing values of coder parameters for satisfactory signal-to-noise ratios for band-limited speech signals and Gaussian noise are described. New hardware realizations are given that allow those parameters to vary smoothly for a wide range of sampling frequencies. Results of informal listening tests obtained with a mobile radio channel simulator are included. It is shown that for mobile radio, DCDM, as expected, is the better of the two coders. This is because it does not sacrifice its overload performance for the sake of error resistance.
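A constant factor delta modulator with one-bit memory, as mentioned above, can be sketched roughly as follows. The sketch assumes the common rule that the step size is multiplied by a factor P > 1 when the current bit equals the previous one and by Q < 1 otherwise. The values of P, Q and the step limits are illustrative only and are not taken from the paper, and the function names are hypothetical.

```python
def _adapt(step, bit, prev_bit, p, q, step_min, step_max):
    """One-bit-memory rule: grow the step on equal bits, shrink otherwise."""
    step *= p if bit == prev_bit else q
    return min(max(step, step_min), step_max)


def cfdm_encode(samples, p=1.5, q=0.66, step0=0.1, step_min=0.01, step_max=1.0):
    """Return (bits, estimates); the estimates are what a decoder rebuilds."""
    bits, estimates = [], []
    estimate, step, prev_bit = 0.0, step0, 1
    for x in samples:
        bit = 1 if x >= estimate else 0
        step = _adapt(step, bit, prev_bit, p, q, step_min, step_max)
        estimate += step if bit else -step
        bits.append(bit)
        estimates.append(estimate)
        prev_bit = bit
    return bits, estimates


def cfdm_decode(bits, p=1.5, q=0.66, step0=0.1, step_min=0.01, step_max=1.0):
    """Rebuild the staircase approximation from the bit stream alone."""
    estimates = []
    estimate, step, prev_bit = 0.0, step0, 1
    for bit in bits:
        step = _adapt(step, bit, prev_bit, p, q, step_min, step_max)
        estimate += step if bit else -step
        estimates.append(estimate)
        prev_bit = bit
    return estimates


if __name__ == "__main__":
    signal = [0.0, 0.3, 0.6, 0.4, 0.1, -0.2, -0.5, -0.3]
    bits, tracked = cfdm_encode(signal)
    print(bits)
    print(cfdm_decode(bits) == tracked)
```

Because the step size is derived from the transmitted bits only, the decoder needs no side information. The cost is that a single channel error perturbs the step size for subsequent samples, which is the kind of trade-off between overload performance and error resistance the abstract discusses.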

14.
An improved system for speech digitization using adaptive differential pulse-code modulation (ADPCM) is described. The system uses an adaptive predictor, an adaptive quantizer, and a variable length source coding scheme to achieve a 4-5 dB increase in signal-to-noise ratio over previous ADPCM. The increase can be used to improve speech quality at moderate data rates on the order of 16 kbits/s or to retain the same quality and reduce the data rate to 9.6 kbits/s. The latter alternative permits the use of narrow-band channels. The implementation complexity is on the same order as other ADPCM systems.

15.
Computer simulation was used to evaluate the performance of eleven coder/decoders (CODEC's) with phase shift keying (PSK) and differential PSK (DPSK) voiceband data signals. The CODEC's were PCM, differential PCM and delta modulation systems designed for speech and operating at bit rates from 16 to 64 kbits/s. The voiceband data signals processed by these CODEC's were demodulated to determine the phase error caused by the CODEC. The phase error introduced by the CODEC's is a function of the phase of the CODEC sampling clock relative to the data modem bit clock. Some of the statistics of the phase error are presented. Three performance metrics were used to evaluate the performance of these CODEC's: signal-to-quantizing-noise ratio, variance of the phase error, and maximum value of the phase error.

16.
In this paper we present a multisubscriber variable-rate sampling hybrid companding delta modulation (HCDM) system for simultaneous transmission of several speech signals. This system employs both the statistical multiplexing and variable-rate sampling schemes. It transmits speech signals synchronously at a fixed rate using a buffer. In this system the sampling rate of each subscriber is varied according to the speech activity and the status of buffer occupancy, and only the speech portion is coded for transmission. To optimize the system performance within the allowed maximum transmission delay (300 ms), an efficient dynamic buffer control algorithm is proposed. When the number of subscribers is six and the transmission rate for each subscriber is 16 kbits/s, the proposed system yields a performance improvement of about 10 dB over the conventional single-subscriber HCDM system. The buffer delay in this case is 150 ms, which gives a perceptually negligible effect.

17.
In this paper, we present a medium-rate speech coder, the controlled adaptive prediction delta modulation coder (CAPDM), which operates at 16 kb/s with good speech quality and low algorithm complexity. The coder is dedicated to personal communication network (PCN) applications and transmits speech samples on the basis of packets. It combines the features of a one-step look-ahead decision, syllabic companding, instantaneous companding, and adaptive prediction. In addition to the use of a short-term prediction filter, CAPDM also exploits the pitch property to predict the speech waveform explicitly. With the aid of a pitch prediction filter, the performance of a CAPDM codec improves by about 3 dB in segmental signal-to-noise ratio (SEGSNR). The average SEGSNR of CAPDM.FF is about 21 dB, which is 7 dB above that of traditional CVSD at 16 kb/s. We also utilize an adaptive postfilter (APF) to enhance the perceptual quality of the decoded speech. The mean opinion score (MOS) listening test of CAPDM.FF with APF shows that its average score reaches 4.19, which is as good as G.728 16-kb/s LD-CELP and is comparable with CCITT G.721 32-kb/s ADPCM. The complexity of CAPDM.FF is evaluated to be 8 MIPS, which is much lower than that of LD-CELP and could be further reduced by adopting a smaller correlation window for pitch detection. To solve the problem of packet loss, we developed a packet-based waveform substitution method by reinitializing the codec parameters at the beginning of each packet. The simulation results show that CAPDM.FF could tolerate 5% of packet loss and still keep a SEGSNR at 10 dB and an MOS at about 3.0.

18.
A constrained joint source/channel coder design   (Total citations: 3; self-citations: 0; citations by others: 3)
The design of joint source/channel coders in situations where there is residual redundancy at the output of the source coder is examined. It has previously been shown that this residual redundancy can be used to provide error protection without a channel coder. In this paper, this approach is extended to conventional source coder/convolutional coder combinations. A family of nonbinary encoders is developed which more efficiently use the residual redundancy in the source coder output. It is shown through simulation results that the proposed systems outperform conventional source-channel coder pairs with gains of greater than 9 dB in the reconstruction SNR at high probability of error.

19.
The problem of predictor mistracking for narrowband signals in backward adaptive ADPCM (adaptive differential pulse code modulation) speech coders is shown to arise as a result of feedback from the signal reconstruction filter to the predictor adaptation process. A class of residual-signal-driven lattice predictors (LR) is defined that guarantees tracking for all signals without regard to the order of prediction. The LR predictor enhances speech and DTMF (dual-tone multifrequency) signal transmission performance in the presence of transmission errors. Under error-free transmission conditions, a segmental SNR (signal-to-noise ratio) drop for speech of nearly 2 dB may be encountered for the LR predictor relative to the classical signal-driven lattice predictor. In most practical telecommunication applications, however, this degradation is outweighed by the improved robustness of the predictor.

20.
Simulation results are presented which compare the performance of all-pole, all-zero, and pole-zero predictors in ADPCM at data rates of 16 and 32 kbits/s over both ideal and noisy channels. Separate backward adaptive gradient algorithms are used to adapt the poles and the zeros independently. The performance indicators used are signal-to-quantization noise ratio (SNR), signal-to-prediction error ratio (SPER), segmental SNR (SNRSEG), and subjective listening tests. For speech sources, the all-zero and pole-zero predictors produce SNR and SNRSEG values that are approximately 1-3 dB higher than those generated by the all-pole predictor. Subjective listening tests reveal that an eighth-order all-zero predictor performs as well as or better than an all-pole predictor for all conditions studied.

