Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
In this paper, we present a new method for high-quality audio coding at low delay and low bit rate for telecommunications applications such as audioconferencing or videoconferencing. The developed coder is designed to code generic audio signals at a bit rate of 64 kbit/s with a delay close to 5 ms in the 20-15000 Hz bandwidth. The method is based on speech coding as well as audio coding concepts. The coder combines subband decomposition of the input signal with LD-CELP techniques. We introduce into this coding structure a psychoacoustic model that allocates an optimal bit rate to each subband according to the perceptual properties of human hearing. To satisfy the bit rate requirement of the psychoacoustic model and to reduce the complexity of such a coding algorithm, we propose a new method of vector quantization based on lattice quantization. This method quantizes the residual signal in the LD-CELP coder and avoids the complexity of a full search. Objective and subjective tests were carried out on a set of audio signals that is a critical subset of those used by ISO. Formal tests showed that the quality of the proposed coder is comparable to the best implementation of MPEG-1 Layer II, but our solution has the advantage of achieving a very low delay (5 ms).

2.
The paper presents a speech coding algorithm for operation at 11025 samples/s. The coder provides improved speech quality and compatibility with the MS-Windows multimedia environment. The coding algorithm was developed by adapting ITU G.729 and enhancing it with some recent developments in medium-band coding. The coder operates over a band of frequencies ranging from 20 to 5400 Hz at a bit rate of 8.9 kbit/s. Applications of this coder include intranet VoIP, voice chat, multimedia communications, and voice archiving. Copyright © 2001 John Wiley & Sons, Ltd.

3.
This paper presents a wavelet-based image coder that is optimized for transmission over the binary symmetric channel (BSC). The proposed coder uses a robust channel-optimized trellis-coded quantization (COTCQ) stage that is designed to optimize the image coding based on the channel characteristics. A phase scrambling stage is also used to further increase the coding performance and robustness to nonstationary signals and channels. The resilience to channel errors is obtained by optimizing the coder performance only at the level of the source encoder, with no explicit channel coding for error protection. For the considered TCQ trellis structure, a general expression is derived for the transition probability matrix in terms of the TCQ encoding rate and the channel bit error rate, and is used to design the COTCQ stage of the image coder. The robust nature of the coder also increases the security level of the encoded bit stream and provides a much more visually pleasing rendition of the decoded image. Examples are presented to illustrate the performance of the proposed robust image coder.

4.
Video coding is a key to successful visual communications. An interframe video coding algorithm using hybrid motion-compensated prediction and interpolation is considered for coding studio quality video at a bit rate of over 5 Mb/s. Interframe coding without a buffer control strategy usually results in variable bit rates. Although packet networks may be capable of handling variable bit rates, in some applications, a constant bit rate is more desirable either for a simpler network configuration or for channels with fixed bandwidth. A self-governing rate buffer control strategy that can automatically steer the coder to a pseudoconstant bit rate is considered. This self-governing rate buffer control strategy employs more progressive quantization parameters, and constrains quantizer adjustments so that a smoother quality transition can be attained. Simulation results illustrate the performance of the pseudoconstant bit rate coder with this buffer control strategy.
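The buffer control idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the linear mapping from buffer fullness to quantizer parameter, the range 1..31, the per-frame step limit of 2, and the rate model `bits = complexity / q` are all assumptions chosen only to show how constrained quantizer adjustments smooth the quality transition.

```python
def next_quantizer(q_prev, buffer_fullness, q_min=1, q_max=31, max_step=2):
    """Pick the next quantizer parameter from buffer fullness (0.0 to 1.0).

    The change per frame is clamped to +/- max_step (hypothetical limit),
    so quality changes gradually instead of jumping.
    """
    # Hypothetical linear mapping: a fuller buffer asks for coarser quantization.
    target = q_min + buffer_fullness * (q_max - q_min)
    delta = max(-max_step, min(max_step, round(target) - q_prev))
    return max(q_min, min(q_max, q_prev + delta))


def simulate(frame_complexities, channel_bits_per_frame, buffer_size, q_start=8):
    """Run the feedback loop over a sequence of frames.

    Assumed rate model: a frame of complexity c coded with parameter q
    produces c / q bits. Returns a list of (q, buffer_bits) per frame.
    """
    buffer_bits = buffer_size / 2
    q = q_start
    history = []
    for c in frame_complexities:
        produced = c / q
        # The buffer fills with coded bits and drains at the constant channel rate.
        buffer_bits = min(buffer_size,
                          max(0.0, buffer_bits + produced - channel_bits_per_frame))
        q = next_quantizer(q, buffer_bits / buffer_size)
        history.append((q, buffer_bits))
    return history


history = simulate([4000, 9000, 9000, 2000, 500], 600, 4000)
```

Each entry of `history` pairs the quantizer parameter chosen for the next frame with the buffer occupancy that led to it.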

5.
Low-rate and flexible image coding with redundant representations.   Cited by: 7 (self-citations: 0; cited by others: 7)
New breakthroughs in image coding possibly lie in signal decomposition through nonseparable basis functions that can efficiently capture the edge characteristics present in natural images. The work proposed in this paper provides an adaptive way of representing images as a sum of two-dimensional features. It presents a low bit-rate image coding method based on a matching pursuit (MP) expansion over a dictionary built on anisotropic refinement and rotation of contour-like atoms. This method is shown to provide, at low bit rates, results comparable to the state of the art in image compression, represented here by JPEG2000 and SPIHT, with generally better visual quality in the MP scheme. The coding artifacts are less annoying than the ringing introduced by wavelets at very low bit rates, due to the smoothing performed by the basis functions used in the MP algorithm. In addition to good compression performance at low bit rates, the new coder has the advantage of producing highly flexible streams. They can easily be decoded at any spatial resolution, different from that of the original image, and the bitstream can be truncated at any point to match diverse bandwidth requirements. The spatial adaptivity is shown to be more flexible and less complex than the transcoding operations generally applied to state-of-the-art codec bitstreams. Due to both its ability to capture the most important parts of multidimensional signals and its flexible stream structure, the image coder proposed in this paper represents an interesting solution for low to medium rate image coding in visual communication applications.
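As a rough illustration of the matching pursuit expansion this abstract builds on, the sketch below runs the generic greedy algorithm on a one-dimensional signal. The toy dictionary (four unit vectors plus one extra unit-norm atom) is an assumption made for brevity; it stands in for, and is much simpler than, the anisotropic contour-like atoms the paper describes.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy MP over unit-norm atoms.

    Returns the expansion as a list of (atom_index, coefficient) pairs
    and the final residual.
    """
    residual = list(signal)
    expansion = []
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        best = max(range(len(dictionary)),
                   key=lambda k: abs(dot(residual, dictionary[k])))
        coeff = dot(residual, dictionary[best])
        # Subtract the atom's contribution from the residual.
        residual = [r - coeff * d for r, d in zip(residual, dictionary[best])]
        expansion.append((best, coeff))
    return expansion, residual


# Toy redundant dictionary in R^4: the standard basis plus one diagonal atom.
s = 1 / math.sqrt(2)
atoms = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [s, s, 0, 0]]
signal = [3 * s, 3 * s, 1.0, 0.0]   # equals 3 * atoms[4] + 1 * atoms[2]

expansion, residual = matching_pursuit(signal, atoms, n_atoms=2)
```

Because the atoms are selected in order of decreasing contribution, keeping only a prefix of `expansion` gives a coarser approximation, which is the property that lets an MP bitstream be truncated at any point.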

6.
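Video object segmentation and two object-oriented video coders   Cited by: 9 (self-citations: 0; cited by others: 9)
翁南钐, 蔡德钧. 《电子学报》 (Acta Electronica Sinica), 2000, 28(10): 106-110
In object-based video coding, segmenting the video objects is an important task. This paper studies an automatic video object segmentation method that uses the higher-order statistics of the displaced frame difference (DFD) together with mathematical morphology operators. The method first uses the higher-order moments of a set of displaced frame differences to obtain a region (mask) that roughly covers the moving object, and then applies a morphological erosion operator inward from the edge of the mask until the edge of the object is reached. A simple and efficient head-and-shoulders separation algorithm is proposed, based on finding the points of maximum turning on the contour of the head-and-shoulders image. Building on this segmentation, an MPEG-4-based video coding system is implemented in software. A very low bit rate video coder using object-oriented bandwidth allocation (OOBA) is also proposed. Compared with a conventional frame-based video coder in a low bit rate environment, this coder gives a slightly lower PSNR but improved subjective visual quality.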
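The erosion step mentioned in this abstract can be sketched as a single pass of binary morphological erosion. The 3x3 square structuring element and the treatment of out-of-image pixels as background are assumptions; the abstract does not state which structuring element is used or how the paper decides that the object edge has been reached, so no stopping rule is shown.

```python
def erode(mask):
    """One pass of binary erosion with a 3x3 square structuring element.

    mask is a list of rows of 0/1 values. A pixel stays 1 only if every
    pixel in its 3x3 neighbourhood is 1; pixels outside the image count as 0.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            keep = True
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    if not inside or mask[yy][xx] == 0:
                        keep = False
            out[y][x] = 1 if keep else 0
    return out


# A coarse 3x3 mask inside a 5x5 frame shrinks to its centre pixel.
coarse_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
eroded = erode(coarse_mask)
```

Repeating `erode` peels one pixel layer off the mask boundary per pass, which is the inward shrinking the abstract describes.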

7.
A method for low bit-rate video coding based on wavelet vector quantisation is proposed. Motion estimation/compensation using overlapped block matching (OBM) is employed to eliminate the blocking effects in the prediction error introduced by conventional block matching. It is shown that OBM significantly increases the efficiency of the wavelet transform coder. The motion-compensated interframe prediction error is decomposed using a wavelet transform and a method is employed for the efficient coding of the wavelet coefficients. In this technique, the coefficients are coded with a zero-tree multistage lattice vector quantiser. Simulation results are provided to evaluate the coding performance of the described coding scheme for low bit-rate video coding. It provides a constant bit rate, obviating the need for a buffer, with only small fluctuations in PSNR. Moreover, comparison with the RM8 implementation of the standard H.261 video coder shows that the presented codec provides improvements in both peak signal-to-noise ratio and picture quality.

8.
Low resolution region discriminator for wavelet coding   Cited by: 1 (self-citations: 0; cited by others: 1)
Syed, Y.F.; Rao, K.R. Electronics Letters, 2001, 37(12): 748-749
A wavelet block chain (WBC) method is used in the initial coding of the low-low subband created by a wavelet transform to separate and label homogeneous regions of the image, which requires no additional overhead in the bitstream. This information is then used to enhance the coding performance in a modified wavelet-based coder. This method uses a two-stage ZTE/SPIHT entropy coder (called a homogeneous connected-region interested ordered transmission coder) to create a bitstream with properties of progressive transmission, scalability, and perceptual optimisation after a minimum bit rate is reached. Simulation results show good scalable low bit rate (0.04-0.4 bpp) compression, comparable to a SPIHT coder, but with better perceptual quality due to use of the region-based information acquired by the WBC method.

9.
In this paper a low bit rate subband coding scheme for image sequences is described. Typically, the scheme is based on temporal DPCM in combination with an intraframe subband coder. In contrast to previous work, however, the subbands are divided into blocks onto which conditional replenishment is applied, while a bit allocation algorithm divides the bits among the blocks assigned for replenishment. A solution is given for the ‘dirty window’ effect by setting blocks to zero that were assigned to be replenished but received no bits. The effect of motion compensation and the extension to color images are discussed as well. Finally, several image sequence coding results are given for a bit rate of 300 kbit/s.
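Conditional replenishment, as used in this abstract, can be sketched in its simplest form: a block is re-sent only when it has changed enough since the last reconstruction. The squared-difference measure, the fixed threshold, and the lossless copy of replenished blocks are assumptions for illustration; the paper's bit allocation among blocks and its handling of the 'dirty window' effect are not modelled.

```python
def block_change(cur, prev):
    """Squared difference between a block and its previous reconstruction."""
    return sum((c - p) ** 2 for c, p in zip(cur, prev))


def conditional_replenishment(cur_blocks, recon_blocks, threshold):
    """Return (new_reconstruction, indices_of_replenished_blocks).

    Blocks whose change exceeds the (hypothetical) threshold are replaced by
    the current block; all others keep the previous reconstruction.
    """
    new_recon, replenished = [], []
    for i, (cur, prev) in enumerate(zip(cur_blocks, recon_blocks)):
        if block_change(cur, prev) > threshold:
            new_recon.append(list(cur))   # replenish: send the new block
            replenished.append(i)
        else:
            new_recon.append(list(prev))  # keep: nothing is sent
    return new_recon, replenished


current = [[1, 1], [5, 5], [0, 0]]
previous = [[1, 1], [0, 0], [0, 1]]
new_recon, replenished = conditional_replenishment(current, previous, threshold=2)
```

Only the second block crosses the threshold here, so it is the only one that would consume bits in this frame.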

10.
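Research on a low-bit-rate variable-rate speech coding algorithm based on the local cosine transform   Cited by: 1 (self-citations: 0; cited by others: 1)
The local cosine transform (LCT) algorithm is applied to speech coding, and a low-bit-rate variable-rate speech coder with an average bit rate of about 1.6 kbit/s is designed. The variable-rate coder uses an SVM algorithm for voice activity detection (VAD). The speech modes of active speech frames follow the classification used in GSM half-rate coding, except that the strongly voiced and moderately voiced modes are merged into a single moderately-to-strongly voiced mode. The local cosine transform coefficients of each speech mode and of silent frames (background noise) are quantized with a split vector quantization algorithm, and the codebooks are designed with the LBG algorithm. Codebook search during encoding uses a fast tree-structured search algorithm. Informal subjective listening tests show that speech reconstructed by the variable-rate coder has a MOS of about 3.15, comparable to speech reconstructed by MELP, the US federal vocoder standard, at 2.4 kbit/s. The coder is robust and is suited to coding speech in the presence of various kinds of environmental noise.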

11.
A spatially scalable video coding scheme for low bit rates is proposed. The codec is especially well suited for communications applications because it is based on motion-compensated predictive coding which provides the necessary low-delay property. The frames to be coded are decomposed into a Gaussian pyramid. Motion estimation and compensation are performed between corresponding pyramid levels of successive frames. We show that, to fulfill specific needs of spatial scalability, the motion compensation on each level must result in compatible prediction errors (displaced frame differences, DFD). Compatibility of the prediction errors means that the pyramid formed by independently obtained DFD's (the DFD pyramid) is close to a Gaussian pyramid decomposition of the DFD of the highest resolution level. From the DFD pyramid, a least squares Laplacian pyramid is derived, which is quantized and coded. The DFD encoder outputs an embedded bit stream. Thus, the coder control may truncate the bit stream at any point, and can keep a fixed rate. The motion vector fields obtained at the different resolution levels are also encoded by employing a pyramid approach. Simulation results show that the proposed coder achieves a coding gain compared to simulcast coding

12.
This paper presents several strategies to improve the performance of very low bit rate speech coders and describes a speech codec that incorporates these strategies and operates at an average bit rate of 1.2 kb/s. The encoding algorithm is based on several improvements in a mixed multiband excitation (MMBE) linear predictive coding (LPC) structure. A switched-predictive vector quantiser technique that outperforms previously reported schemes is adopted to encode the LSF parameters. Spectral and sound-specific low rate models are used in order to achieve high quality speech at low rates. An MMBE approach with three sub-bands is employed to encode voiced frames, while modelling and synthesis techniques for fricatives and stops are used for unvoiced frames. This strategy is shown to provide good quality synthesised speech at a bit rate of only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve decoded speech, a spectral envelope restoration combined with noise reduction (SERNR) postfilter is used. The contributions of the techniques described in this paper are separately assessed and then combined in the design of a low bit rate codec that is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder. The performance assessment is carried out in terms of the spectral distortion of LSF quantisation, mean opinion score (MOS), A/B comparison tests and the ITU-T P.862 perceptual evaluation of speech quality (PESQ) standard. Assessment results show that the improved methods for LSF quantisation, sound-specific modelling and synthesis, and the new postfiltering approach can significantly outperform previously reported techniques. Further results also indicate that a system combining the proposed improvements and operating at 1.2 kb/s is comparable to (and slightly outperforms) a MELP coder operating at 2.4 kb/s. For tandem connection situations, the proposed system is clearly superior to the MELP coder.

13.
This paper presents a novel image coding scheme using M-channel linear phase perfect reconstruction filterbanks (LPPRFBs) in the embedded zerotree wavelet (EZW) framework introduced by Shapiro (1993). The innovation here is to replace the EZW's dyadic wavelet transform by M-channel uniform-band maximally decimated LPPRFBs, which offer finer frequency spectrum partitioning and higher energy compaction. The transform stage can now be implemented as a block transform which supports parallel processing and facilitates region-of-interest coding/decoding. For hardware implementation, the transform boasts efficient lattice structures, which employ a minimal number of delay elements and are robust under the quantization of lattice coefficients. The resulting compression algorithm also retains all the attractive properties of the EZW coder and its variations such as progressive image transmission, embedded quantization, exact bit rate control, and idempotency. Despite its simplicity, our new coder outperforms some of the best image coders published previously in the literature, for almost all test images (especially natural, hard-to-code ones) at almost all bit rates.

14.
This paper describes a new audio coding scheme based on adaptive wavelet analysis that provides transparent audio coding for CD-audio signals at low bit rates (≈1.4 bits/sample per channel). A new perceptual cost function is defined to obtain the best wavelet-packet base for each audio frame. The sharp variations in quantization noise that appear at the border of the frames are minimized by a novel approach that avoids overlapping. The proposed coder guarantees high perceptual quality using filters that generate wavelets of any compact support, because a bit-allocation algorithm that takes into account the equivalent filter frequency responses of the synthesis filter bank branches is used.

15.
This paper describes an object-based video coding system with new ideas in both the motion analysis and source encoding procedures. The moving objects in a video are extracted by means of a joint motion estimation and segmentation algorithm based on the Markov random field (MRF) model. The two important features of the presented technique are the temporal linking of the objects, and the guidance of the motion segmentation with spatial color information. This facilitates several aspects of an object-based coder. First, a new temporal updating scheme greatly reduces the bit rate to code the object boundaries without resorting to crude lossy approximations. Next, the uncovered regions can be extracted and encoded in an efficient manner by observing their revealed contents. The objects are classified adaptively as P objects or I objects and encoded accordingly. Subband/wavelet coding is applied in encoding the object interiors. Simulations at very low bit rates yielded comparable performance in terms of reconstructed PSNR to the H.263 coder. The object-based coder produced visually more pleasing video with less blurriness and devoid of block artifacts, thus confirming the advantages of object-based coding at very low bit-rates

16.
Two-layer coding of video signals for VBR networks   Cited by: 5 (self-citations: 0; cited by others: 5)
Two-layer conditional-replenishment coding of video signals over a variable-bit-rate (VBR) network is described. A slotted-ring network based on an Orwell protocol is assumed, where transmission of certain packets is guaranteed. The two-layer coder produces two output bit streams: the first bit stream contains all the important structural information in the image and is accommodated in the guaranteed capacity of the network, while the second adds the necessary quality finish. The performance of the coder is tested with CIF standard sequences and broadcast-quality pictures. The portion of the VBR channel allocated to the lower layer as guaranteed bandwidth is examined. Using broadcast-quality pictures, statistics were obtained on the performance of this system for different choices of bit rate in the lower layer. The effect of lost packets is shown on CIF standard picture sequences. It is shown that the coder performs well for a guaranteed channel rate as low as 10-20% of the total bit rate

17.
In an earlier paper, an extension of the pel recursive techniques of Netravali and Robbins [2] and Cafforio and Rocca [3] was introduced. Here a method is provided to realize the algorithm in hardware, with some approximations. The prediction error distribution allows the use of quantized variables to a lookup table of reasonable size. The algorithm is then incorporated into a simple multimode coder capable of 1.5 bits/pel on the sequence examined. The coder incorporates a spot filter, quantizer, block run length coding, and variable word length coding and subsampling. Simulation results are presented, including bit rate, buffer status, and mode control analysis.

18.
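Progressive UTCQ coding based on rate-distortion optimization   Cited by: 1 (self-citations: 0; cited by others: 1)
This paper proposes a progressive wavelet coding algorithm for still images based on the UTCQ quantizer. Uniform trellis-coded quantization (UTCQ) is used to quantize the wavelet coefficients and gives very good quantization results. The UTCQ superset index values form coefficient bit planes, and rate-distortion optimization selects the coefficient bits to be coded from these bit planes in order of decreasing rate-distortion slope. The bits coded first have the largest rate-distortion slope, so each coded bit gives the largest possible reduction in distortion. Computing the rate-distortion slope is only a table lookup that uses the probability state estimation table of the MQ adaptive arithmetic coder. The MQ arithmetic coder further compresses the coefficient bits selected by the rate-distortion optimization. A rate-distortion threshold method encodes faster than searching for the largest rate-distortion slope. The algorithm offers fast encoding and good compression performance.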
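The ordering rule in this abstract, coding first whatever reduces distortion most per bit spent, can be sketched as a sort by rate-distortion slope. In the paper the slope comes from a lookup in the MQ coder's probability state table; here the distortion reductions and rates are simply given as hypothetical numbers, and the bit budget loop is an added illustration rather than part of the described algorithm.

```python
def order_by_rd_slope(candidates):
    """Sort candidates by decreasing slope = distortion reduction / rate.

    Each candidate is a tuple (name, delta_distortion, delta_rate),
    with delta_rate > 0.
    """
    return sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)


def encode_until_budget(candidates, bit_budget):
    """Take candidates in slope order until the next one would exceed the budget."""
    chosen, spent = [], 0.0
    for name, _delta_d, delta_r in order_by_rd_slope(candidates):
        if spent + delta_r > bit_budget:
            break
        chosen.append(name)
        spent += delta_r
    return chosen, spent


# Hypothetical coefficient bits: (label, distortion reduction, bits needed).
bits = [("a", 4.0, 2.0), ("b", 9.0, 1.0), ("c", 1.0, 1.0)]
ordered = [name for name, _, _ in order_by_rd_slope(bits)]
chosen, spent = encode_until_budget(bits, bit_budget=3.0)
```

With these numbers the slopes are 2, 9 and 1, so `b` is coded first, then `a`, and `c` is dropped once the 3-bit budget is used up.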

19.
This paper presents a transform coding algorithm devoted to high quality audio coding at a bit rate of 64 kbps per monophonic channel. It enables the transmission of high quality stereo sound through the basic access (2B channels) of ISDN. Although a complete system including framing, synchronization and error correction has been developed, only the bit rate compression algorithm is described here. A detailed analysis of the signal processing techniques, such as the time/frequency transformation, the pre-echo reduction by adaptive filtering, and the fast algorithm computations, is provided. The use of psychoacoustical properties is also precisely reported. Finally, some subjective evaluation results and a real-time implementation of the coder using the AT&T DSP32C digital signal processor are presented.

20.
A wavelet electrocardiogram (ECG) data codec based on the set partitioning in hierarchical trees (SPIHT) compression algorithm is proposed in this paper. The SPIHT algorithm [1] has achieved notable success in still image coding. We modified the algorithm for the one-dimensional case and applied it to compression of ECG data. Experiments on selected records from the MIT-BIH arrhythmia database revealed that the proposed codec is significantly more efficient in compression and in computation than previously proposed ECG compression schemes. The coder also attains exact bit rate control and generates a bit stream progressive in quality or rate.
