Similar literature
1.
This paper presents a technique to incorporate psychoacoustic models into an adaptive wavelet packet scheme to achieve perceptually transparent compression of high-quality (44.1 kHz) audio signals at about 45 kb/s. The filter bank structure adapts according to psychoacoustic criteria and according to the computational complexity that is available at the decoder. This permits software implementations that can perform according to the computational power available in order to achieve real-time coding/decoding. The bit allocation scheme is an adapted zero-tree algorithm that also takes input from the psychoacoustic model. The measure of performance is a quantity called subband perceptual rate, which the filter bank structure adapts to approach the perceptual entropy (PE) as closely as possible. In addition, this method is also amenable to progressive transmission, that is, it can achieve the best quality of reconstruction possible considering the size of the bit stream available at the encoder. The result is a variable-rate compression scheme for high-quality audio that takes into account the allowed computational complexity, the available bit budget, and the psychoacoustic criteria for transparent coding. This paper thus provides a novel scheme to marry the results in wavelet packets and perceptual coding to construct an algorithm that is well suited to high-quality audio transfer for Internet and storage applications.

2.
A perceptual audio coder has been constructed in which each audio segment is adaptively analyzed using either a sinusoidal or an optimum wavelet basis, according to the time-varying characteristics of the audio signal. The basis optimization is achieved by a novel switched filter bank scheme, which switches between a uniform filter bank structure (discrete cosine transform) and a non-uniform filter bank structure (discrete wavelet transform). Pre-echo distortion, a major artifact of the ISO/Moving Pictures Experts Group (MPEG) audio coding standard (MPEG-I layers 1 and 2), which uses a uniform filter bank structure for audio signal analysis, is almost eliminated in the proposed coder. A perceptual masking model implemented using a high-resolution wavelet packet filter bank with 27 subbands, closely mimicking the critical bands of the human auditory system, is employed in this audio coder. The resulting scheme is a variable bit-rate audio coder, which provides compression ratios comparable to MPEG-I layers 1 and 2 with almost transparent quality.

3.
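A novel transparent-quality coding method for digital audio signals is proposed that combines adaptive optimal selection of the wavelet basis with a psychoacoustic model. At a fixed distortion level, it minimizes the number of bits dynamically allocated to the transform coefficients of each frame, and it uses a dynamic codebook to remove statistical redundancy in the audio signal and further reduce the bit rate. Monophonic CD music signals sampled at 44.1 kHz with 16-bit linear coding per sample can be compressed to about 64 kb/s.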
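
As a rough check on the figures quoted in this abstract (44.1 kHz sampling, 16 bits per sample, about 64 kb/s after coding), the implied compression ratio can be worked out as follows. This is only back-of-the-envelope arithmetic, assuming 1 kb = 1000 bits and treating "about 64 kb/s" as exactly 64 kb/s; the symbols $R_{\text{PCM}}$, $R_{\text{coded}}$ and $f_s$ are introduced here for illustration and do not come from the paper.

```latex
\begin{align*}
R_{\text{PCM}} &= 44\,100\ \text{samples/s} \times 16\ \text{bits/sample} = 705.6\ \text{kb/s} \\
\frac{R_{\text{PCM}}}{R_{\text{coded}}} &\approx \frac{705.6\ \text{kb/s}}{64\ \text{kb/s}} \approx 11.0 \\
\frac{R_{\text{coded}}}{f_s} &\approx \frac{64\,000\ \text{bits/s}}{44\,100\ \text{samples/s}} \approx 1.45\ \text{bits/sample}
\end{align*}
```

Under these assumptions, the coder would be spending roughly 1.45 bits per sample, about an 11:1 reduction relative to the uncompressed mono PCM stream.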

4.
Boland S., Deriche M. 《Electronics Letters》, 1997, 33(4): 262-263
A new audio coding system is proposed, using an M-band multiresolution filter bank technique. The filter bank consists of a cascade of 4-band and 8-band filter banks. Experiments with a complete audio coding system were carried out with the proposed filter bank, masking model, bit allocation algorithm, scalar quantisation and Huffman coding. For the broadband signals tested, the proposed system resulted in near-transparent quality at bit rates of 78-91 kbit/s with low computational load. It also achieved similar performance to the MPEG layer 2 coder at 128 kbit/s.

5.
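Based on the basic principles of perceptual audio coding and decoding, this paper studies and designs a high-quality audio coding algorithm that uses multiple description coding. The algorithm is robust to packet loss. Its overall approach is first to decompose the audio, at the analysis and synthesis level, into an auditory masking threshold and a residual signal, and then to apply multiple-description processing to each of the two at the quantization and coding level. The results show that, within the proposed framework for packet-loss-resilient multiple-description audio coding, multiple-description algorithms clearly outperform single-description coding under packet loss, and that the scalar-quantization multiple-description algorithm is more robust to packet loss than both the odd-even separation two-description algorithm and the dual-transform two-description algorithm.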

6.
In this paper, we present a new method for high-quality audio coding at low delay and low bit rate for telecommunications applications such as audioconferencing or videoconferencing. The developed coder is designed to code generic audio signals at a bit rate of 64 kbit/s with a delay close to 5 ms in the 20-15000 Hz bandwidth. The method is based on speech coding as well as audio coding concepts. The coder combines subband decomposition of the input signal with LD-CELP techniques. Into this coding structure we introduce a psychoacoustic model, which makes it possible to allocate an optimal bit rate to each subband according to the perceptual properties of human hearing. In order to satisfy the bit rate requirement of the psychoacoustic model and to reduce the complexity of such a coding algorithm, we propose a new method of vector quantization based on lattice quantization. This method makes it possible to quantize the residual signal in the LD-CELP coder while avoiding the complexity of a full search. Objective and subjective tests have been carried out on a set of audio signals that is a critical subset of those used by ISO. Formal tests showed that the quality of the proposed coder is comparable to the best implementation of MPEG-1 Layer II, but our solution has the advantage of reaching a very low delay (5 ms).

7.
We introduce new methods for increasing the performance of multiprogram digital audio broadcast systems, e.g., satellite digital audio broadcasting. Joint multiprogram encoding is an attractive possibility for parallel broadcasting of a large number of programs. Joint coding extended over multiple audio frames in time gives further improvements. This kind of statistical multiplexing yields improved audio quality and/or higher capacity in terms of the number of programs. We describe the new Joint Multiple Program Encoding Technique in the context of the perceptual audio coding (PAC) type of algorithms. We also describe methods for multiprogram transmission, including Equal Error Protection (EEP) as well as Unequal Error Protection (UEP), and improved error concealment for multiple program transmission. Some of the techniques described in this paper are currently being used in satellite digital audio broadcasting in the United States.

8.
This work is concerned with the development and optimization of a signal model for scalable perceptual audio coding at low bit rates. A complementary two-part signal model consisting of Sines plus Noise (SN) is described. The paper essentially presents a fundamental enhancement to the sinusoidal modeling component. The enhancement involves an audio signal scheme based on carrying out overlap-add sinusoidal modeling at three successive time scales: large, medium, and small. The sinusoidal modeling is done in an analysis-by-synthesis overlap-add manner across the three scales by using psychoacoustically weighted matching pursuit. The sinusoidal modeling residual at the first scale is passed to the smaller scales to allow for the modeling of various signal features at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model, and it improves the perceptual audio quality over our previous work on sinusoidal modeling while using the same number of sinusoids. The most obvious application for the SN model is in scalable, high-fidelity audio coding and signal modification.

9.
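This paper explains the principles of subband coding and transform coding of speech signals. To eliminate blocking effects, it introduces the lapped transform, a new approach to speech coding, together with the associated fast algorithms and filter design. Finally, it briefly describes a method for extracting the side information needed for optimal bit allocation.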

10.
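High-quality high-frequency reconstruction is an essential requirement of audio compression coding. Spectral band replication (SBR) can substantially improve the compression efficiency of perceptual audio coders and has already been adopted in the MPEG-4 version 3 audio extension standard. SBR needs to extract only a small number of parameters in order to reconstruct the high-frequency signal at the receiver, and therefore delivers higher audio quality at low bit rates. This paper discusses the principles and characteristics of SBR and evaluates its performance.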

11.
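Digital audio watermarking   Cited by: 4 (self-citations: 0, citations by others: 4)
周宏, 陈健. 《电声技术》, 2002, (7): 10-14
Perceptual audio coding is the mainstream technology for storing and transmitting audio signals. Digital audio watermarking combines with perceptual audio coding to embed watermark data in the audio signal, keeping the watermark robust while ensuring that the quality of the synthesized audio does not change perceptibly. The paper first gives an overview of digital watermarking systems, then discusses the characteristics and requirements of digital audio watermarking systems and compares commonly used techniques. It next focuses on an implementation scheme for a spread-spectrum digital audio watermarking system, and finally summarizes typical applications of digital audio watermarking.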

12.
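A time-domain method for detecting transient signals in audio coding   Cited by: 1 (self-citations: 0, citations by others: 1)
In low-bit-rate perceptual audio coding, pre-echo distortion is especially prominent, and dealing with it requires effective detection of transient signals. A transient strength is defined in the time domain from the peak-to-mean power ratio (PMR), and a new time-domain transient detection algorithm is proposed that uses it as the decision function. Because temporal masking is taken into account when setting the detection threshold and the spacing between valid transient points, the method is well suited to perceptual audio coding. Compared with the currently typical frequency-domain transient detection method based on perceptual entropy, it offers higher time resolution, better accuracy and a simpler algorithm.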
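
The general idea of a peak-to-mean power ratio (PMR) detector can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm of the paper: the block size, the threshold and the minimum spacing between flagged blocks (`block_size`, `threshold`, `min_gap_blocks`) are hypothetical values chosen for the example, whereas the paper derives its threshold and spacing from temporal masking.

```python
def peak_to_mean_ratio(block):
    """Ratio of peak instantaneous power to mean power within one block."""
    powers = [x * x for x in block]
    mean_power = sum(powers) / len(powers)
    if mean_power == 0.0:
        return 0.0  # a silent block cannot contain a transient
    return max(powers) / mean_power


def detect_transients(samples, block_size=64, threshold=8.0, min_gap_blocks=4):
    """Return the start index of every block flagged as a transient.

    threshold and min_gap_blocks are illustrative assumptions; a real coder
    would derive them from a temporal masking model.
    """
    onsets = []
    last_flagged = None
    for start in range(0, len(samples) - block_size + 1, block_size):
        block_index = start // block_size
        block = samples[start:start + block_size]
        if peak_to_mean_ratio(block) < threshold:
            continue  # power is spread evenly: treat as stationary
        if last_flagged is not None and block_index - last_flagged < min_gap_blocks:
            continue  # too close to the previous transient: assume it is masked
        onsets.append(start)
        last_flagged = block_index
    return onsets


# A quiet signal with one isolated spike and a second spike one block later.
signal = [0.01] * 256
signal[130] = 1.0
signal[200] = 1.0
print(detect_transients(signal))  # the second spike is suppressed by min_gap_blocks
```

Because the decision uses only a handful of multiplications per sample and no transform, a detector of this kind can work on blocks much shorter than a coding frame, which is where the time-resolution advantage claimed in the abstract would come from.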

13.
Advances in speech and audio compression   Cited by: 4 (self-citations: 0, citations by others: 4)
Speech and audio compression has advanced rapidly in recent years, spurred on by cost-effective digital technology and diverse commercial applications. Recent activity in speech compression is dominated by research and development of a family of techniques commonly described as code-excited linear prediction (CELP) coding. These algorithms exploit models of speech production and auditory perception and offer a quality versus bit rate tradeoff that significantly exceeds most prior compression techniques for rates in the range of 4 to 16 kb/s. Techniques have also been emerging in recent years that offer enhanced quality in the neighborhood of 2.4 kb/s over traditional vocoder methods. Wideband audio compression is generally aimed at a quality that is nearly indistinguishable from consumer compact-disc audio. Subband and transform coding methods combined with sophisticated perceptual coding techniques dominate in this arena, with nearly transparent quality achieved at bit rates in the neighborhood of 128 kb/s per channel.

14.
This tutorial paper describes various efficient implementations (published and new unpublished) of the forward and backward modified discrete cosine transform (MDCT) in the MPEG layer III (MP3) audio coding standard developed in the period 1990-2010, including the efficient implementation of polyphase filter banks for completeness. The efficient MDCT implementations are discussed in the context of (fast) complete analysis/synthesis MDCT filter banks in the MP3 encoder and decoder. In general, for each efficient forward/backward MDCT block transform implementation, the following are presented: complete formulas or sparse matrix factorizations of the algorithm, the corresponding signal flow graph for the short audio block, the total arithmetic complexity, and useful comments on improving the arithmetic complexity and on possible structural simplifications of the algorithm. Finally, all efficient forward/backward MDCT implementations are compared in terms of both arithmetic complexity and structural simplicity. It is important to note that almost all of the presented algorithms can also be used for 2n-length data blocks in other MPEG audio coding standards and proprietary audio compression algorithms.
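
For orientation, the forward and backward MDCT that such fast algorithms accelerate can be written directly from the textbook definition. The sketch below is the plain O(N²) form with a sine window and 50% overlap-add, not any of the efficient implementations surveyed in the paper; the function names and the block size used in the demo are illustrative assumptions.

```python
import math


def sine_window(block_len):
    """Sine window; satisfies w[n]^2 + w[n + N]^2 = 1 for block_len = 2N."""
    return [math.sin(math.pi * (n + 0.5) / block_len) for n in range(block_len)]


def mdct(block):
    """Forward MDCT: 2N windowed samples -> N coefficients (direct form)."""
    two_n = len(block)
    n_half = two_n // 2
    return [
        sum(block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(two_n))
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Backward MDCT: N coefficients -> 2N time-aliased samples.

    The 2/N scaling is chosen so that windowing at both analysis and
    synthesis, followed by overlap-add, reconstructs the input.
    """
    n_half = len(coeffs)
    return [
        (2.0 / n_half) * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analysis_synthesis(signal, n_half):
    """Windowed MDCT -> IMDCT with 50% overlap-add over the whole signal."""
    window = sine_window(2 * n_half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = [signal[start + n] * window[n] for n in range(2 * n_half)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_half):
            out[start + n] += rebuilt[n] * window[n]
    return out


# Samples covered by two overlapping blocks are reconstructed; the first and
# last n_half samples are covered by only one block and are not.
n_half = 8
x = [math.sin(0.3 * i) + 0.1 * i for i in range(4 * n_half)]
y = analysis_synthesis(x, n_half)
max_err = max(abs(a - b) for a, b in zip(x[n_half:-n_half], y[n_half:-n_half]))
print(max_err < 1e-9)
```

Each block of 2N samples costs about 2N² multiplications in this form, which is the baseline against which the arithmetic complexity of a fast factorization would be compared.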

15.
The audio quality, robustness and implementational complexity of a novel mobile digital audio broadcast scheme are addressed. The audio codec proposed is based on an efficient combination of subband coding (SBC) and multipulse excited linear prediction coding (MPLPC). The bit allocation is dynamically adapted according to both the signal power in different subbands and a perceptual hearing model. Typically a segmental signal to noise ratio (SEGSNR) in excess of 30 dB associated with high fidelity subjective quality was achieved for 2.67-b/sample transmissions at a bit rate of 86 kb/s. Perceptually unimpaired audio quality was achieved for a bit error rate (BER) of about 10⁻⁴, when injecting random errors, which was degraded for increased BERs. In order to provide robust error protection, the audio codec was also subjected to a rigorous bit sensitivity analysis. Four different forward error correction schemes were investigated in order to explore the complexity, bit rate, and robustness tradeoffs.

16.
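A hybrid WLP-MDCT audio coding algorithm   Cited by: 2 (self-citations: 0, citations by others: 2)
A hybrid audio coding algorithm combining warped linear prediction (WLP) and the modified discrete cosine transform (MDCT) is presented. WLP is used to construct a pre-filter and post-filter pair: the pre-filter reduces the pre-echo produced during MDCT coding, and the post-filter shapes the quantization noise, further improving the subjective listening quality of the reconstructed audio. Experimental results show that the algorithm is effective and feasible.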

17.
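周延献, 张涛, 王赞. 《电声技术》, 2012, 36(2): 64-66
Transient segment detection is one of the key preprocessing algorithms applied before digital audio coding, and its performance directly affects coding complexity and coding quality. A transient segment detection algorithm that combines variance with a flatness measure is proposed. Experimental results show that the algorithm effectively reduces redundant detections of low-energy transient segments and can operate on signal segments shorter than a frame, which raises the time resolution, brings the detection results closer to the actual signal, and improves coding quality. The algorithm also offers high detection accuracy and low complexity.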

18.
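Research and implementation of a fast subband analysis filter algorithm on the TI C5402 DSK   Cited by: 1 (self-citations: 1, citations by others: 0)
The MP3 audio compression algorithm is an efficient, high-fidelity compression coding algorithm specified by the ISO/IEC 11172-3 standard. The authors briefly analyze several key techniques used in MP3, such as subband filtering, the psychoacoustic model and dynamic noise allocation. A fast algorithm for the subband analysis scheme recommended by MPEG is implemented on TI's TMS320VC5402 digital signal processor.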

19.
In high-quality digital audio coding, a great deal of attention is focused on the auditory perception process, as the goal of audio compression is to attain perceptually transparent compression and reproduction. Consequently, models for perceptual masking are used extensively in audio coders, allowing quantisation noise to be allocated in the various frequency subbands according to a masking function. In this way, quantisation noise can be made almost inaudible at the receiver. In this paper, the psychoacoustic phenomenon of auditory masking is described. This is followed by a review of the MPEG-1 (Moving Pictures Experts Group) international standard for audio compression, including an outline of the psychoacoustic models used.

20.
The class of perceptual audio coding (PAC) algorithms yields efficient and high-quality stereo digital audio bitstreams at bit rates from 16 kb/sec to 128 kb/sec (and higher). To avoid "pops and clicks" in the decoded audio signals, channel error detection combined with source error concealment, or source error mitigation, techniques are preferred to pure channel error correction. One method of channel error detection is to use a high-rate block code, for example, a cyclic redundancy check (CRC) code. Several joint source-channel coding issues arise in this framework because PAC contains a fixed-to-variable source coding component in the form of Huffman codes, so that the output audio packets are of varying length. We explore two such issues. First, we develop methods for screening for undetected channel errors in the audio decoder by looking for inconsistencies between the number of bits decoded by the Huffman decoder and the number of bits in the packet as specified by control information in the bitstream. We evaluate this scheme by means of simulations of Bernoulli sources and real audio data encoded by PAC. Considerable reduction in undetected errors is obtained. Second, we consider several configurations for the channel error detection codes, in particular CRC codes. The preferred set of formats employs variable-block length, variable-rate outer codes matched to the individual audio packets, with one or more codewords used per audio packet. To maintain a constant bit rate into the channel, PAC and CRC encoding must be performed jointly, e.g., by incorporating the CRC into the bit allocation loop in the audio coder.
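
The first screening idea, comparing the number of bits the Huffman decoder actually consumes with the packet length declared in the control information, can be illustrated with a toy prefix code. This is a sketch under stated assumptions: `TOY_CODE` is a made-up four-symbol code rather than one of PAC's Huffman tables, and the decoder is assumed to know how many symbols the packet should contain (`symbol_count`), which the abstract does not specify.

```python
# Hypothetical toy prefix code, for illustration only (not a PAC codebook).
TOY_CODE = {"0": "a", "10": "b", "110": "c", "111": "d"}


def decode_and_check(bits, symbol_count, declared_bits):
    """Decode symbol_count symbols from a bit string and screen for errors.

    Returns (symbols, consistent). consistent is True only when the expected
    number of symbols was decoded and the bits consumed equal declared_bits,
    the packet length taken from (assumed) control information.
    """
    symbols = []
    pos = 0
    current = ""
    while len(symbols) < symbol_count and pos < len(bits):
        current += bits[pos]
        pos += 1
        if current in TOY_CODE:
            symbols.append(TOY_CODE[current])
            current = ""
    consistent = (
        len(symbols) == symbol_count
        and current == ""
        and pos == declared_bits
    )
    return symbols, consistent


clean = "0" + "10" + "110" + "111"          # encodes a, b, c, d in 9 bits
print(decode_and_check(clean, 4, 9))         # consistent packet
print(decode_and_check("000110111", 4, 9))   # bit 1 flipped: length mismatch
print(decode_and_check("010110110", 4, 9))   # last bit flipped: not caught
```

The third call shows why a check of this kind can only reduce, not eliminate, undetected errors: a bit flip that turns one codeword into another of the same length leaves the consumed bit count unchanged.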
