期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Embedded quantization of line spectral frequencies using a multistage tree-structured vector quantizer

Chu W.C. 《IEEE transactions on audio, speech, and language processing》2006,14(4):1205-1217

This paper presents a multistage tree-structured vector quantization (MTVQ) scheme for line spectral frequencies (LSF), where two advantages exist: It supports embedded quantization which is required for scalable coder designs, and the tree-structure at each stage can be relied on to accelerate the encoding process. The different codebook design strategies suitable for MTVQ are analyzed. Two speech coding standards are modified by replacing their original LSF quantizers with an MTVQ; it is shown that graceful quality degradation can be obtained for the synthetic speech when the number of bits available for LSF decoding is decremented one-by-one. Moreover, the search complexity is substantially reduced with slight performance degradation. 相似文献

2.

High-Rate Optimized Recursive Vector Quantization Structures Using Hidden Markov Models

《IEEE transactions on audio, speech, and language processing》2007,15(3):756-769

This paper examines the design of recursive vector quantization systems built around Gaussian mixture vector quantizers. The problem of designing such systems for minimum high-rate distortion, under input-weighted squared error, is discussed. It is shown that, in high dimensions, the design problem becomes equivalent to a weighted maximum likelihood problem. A variety of recursive coding schemes, based on hidden Markov models are presented. The proposed systems are applied to the problem of wideband speech line spectral frequency (LSF) quantization under the log spectral distortion (LSD) measure. By combining recursive quantization and random coding techniques, the systems are able to attain transparent quality at rates as low as 36 bits per frame 相似文献

3.

基于矢量量化的高效BTC图像编码算法

徐润生陈佩张卫东许晓鸣《测控技术》2001,20(5):31-34

作为一种有损图像编码技术,块截短编码算法（BTC）的计算量较少,速度快,有较好的信道容错力,重建图像质量较高。然而,标准BTC算法的主要缺点是其压缩比特率比其他基于块图像编码的算法（如变换编码和矢量量化）高。为了降低比特率,提出了几种有效的BTC算法,还提出了一种简单的查表算法对每块的BTC量化数据编码,另外还引入了矢量量化技术以减少对位平面编码的比特数。为了减少由改进算法引入的额外失真,在每种提出的算法中,采用最优阈值而不用平均值作为量化阈值。相似文献

4.

Spherical Logarithmic Quantization

《IEEE transactions on audio, speech, and language processing》2010,18(1):126-140

相似文献

5.

An architecture for hybrid coding of NTSC TV signals

A. Jalali K.R. Rao 《Computers & Electrical Engineering》1982,9(1):45-51

A hardware feasible architecture for DCT/DPCM hybrid coding of color television (TV) signals has been developed. The coding system is based on formatting four horizontal scan lines into blocks of the same subcarrier phase elements. The samples in each block are rearranged and transformed by a FDCT processor. Based on its average energy, a given transform block is compared with its adjacent blocks and the nearest block is selected as its estimate. The difference between the actual and the estimated values of the (DCT) coefficients are then quantized and encoded using nonuniform quantizers and a variable length coder. Furthermore, the maximum number of different wordlengths is assumed to be five. Therefore, five sets of 256-byte encoding ROMs are used to store the quantization tables. To reduce the redundancy in the code words, an adaptive coding scheme is used. The coding scheme is based on setting two threshold levels. 相似文献

6.

Linear transform for simultaneous diagonalization of covariance and perceptual metric matrix in image coding

I. EpifanioAuthor Vitae J. GutiérrezAuthor VitaeJ. MaloAuthor Vitae 《Pattern recognition》2003,36(8):1799-1811

Two types of redundancies are contained in images: statistical redundancy and psychovisual redundancy. Image representation techniques for image coding should remove both redundancies in order to obtain good results. In order to establish an appropriate representation, the standard approach to transform coding only considers the statistical redundancy, whereas the psychovisual factors are introduced after the selection of the representation as a simple scalar weighting in the transform domain.In this work, we take into account the psychovisual factors in the definition of the representation together with the statistical factors, by means of the perceptual metric and the covariance matrix, respectively. In general the ellipsoids described by these matrices are not aligned. Therefore, the optimal basis for image representation should simultaneously diagonalize both matrices. This approach to the basis selection problem has several advantages in the particular application of image coding. As the transform domain is Euclidean (by definition), the quantizer design is highly simplified and at the same time, the use of scalar quantizers is truly justified. The proposed representation is compared to covariance-based representations such as the DCT and the KLT or PCA using standard JPEG-like and Max-Lloyd quantizers. 相似文献

7.

Parallel 1D and 2D vector quantizers using a Kohonen neural network

A. S. Mohamed E. N. Attia 《Neural computing & applications》1996,4(2):64-71

The process of reconstructing an original image from a compressed one is a difficult problem, since a large number of original images lead to the same compressed image and solutions to the inverse problem cannot be uniquely determined. Vector quantization is a compression technique that maps an input set of k-dimensional vectors into an output set of k-dimensional vectors, such that the selected output vector is closest to the input vector according to a selected distortion measure. In this paper, we show that adaptive 2D vector quantization of a fast discrete cosine transform of images using Kohonen neural networks outperforms other Kohonen vector quantizers in terms of quality (i.e. less distortion). A parallel implementation of the quantizer on a network of SUN Sparcstations is also presented. 相似文献

8.

Multiple-Description Predictive-Vector Quantization With Applications to Low Bit-Rate Speech Coding Over Networks

Pradeepa Yahampath Paul Rondeau 《IEEE transactions on audio, speech, and language processing》2007,15(3):749-755

相似文献

9.

Image coding algorithm based on Hadamard transform and simple vector quantization

Simić Nikola Perić Zoran H. Savić Milan S. 《Multimedia Tools and Applications》2018,77(5):6033-6049

Transform coding is commonly used in image processing algorithms to provide high compression ratios, often at the expense of processing time and simplicity of the system. We have recently proposed a pixel value prediction scheme in order to exploit adjacent pixel correlation, providing a low-complexity model for image coding. However, the proposed model was unable to reach high compression ratios retaining high quality of reconstructed image at the same time. In this paper we propose a new segmentation algorithm which further utilizes adjacent pixel correlation, provides higher compression ratios and it is based on application of Hadamard transform coding. Additional compression is provided by using vector quantization for a low number of quantization levels and by simplifying generalized Lloyd’s algorithm where the special attention is paid to determination of optimal partitions for vector quantization, making a fixed quantizer. The proposed method is quite simple and experimental results show that it ensures better or similar rate-distortion ratio for very low bit-rates, comparing to the other similar methods that are based on wavelet or curvelet transform coding and support or core vector machine application. Furthermore, the proposed method requires very low processing time since the proposed quantizers are fixed, much less than the required time for the aforementioned methods that we compare with as well as much less than the time required for fractal image coding. In the end, the appropriate discussion is provided comparing the results with a scheme based on linear prediction and dual-mode quantization.

相似文献

10.

High-Resolution Spherical Quantization of Sinusoidal Parameters

Pim Korten Jesper Jensen Richard Heusdens 《IEEE transactions on audio, speech, and language processing》2007,15(3):966-981

Sinusoidal coding is an often employed technique in low bit-rate audio coding. Therefore, methods for efficient quantization of sinusoidal parameters are of great importance. In this paper, we use high-resolution assumptions to derive analytical expressions for the optimal entropy-constrained unrestricted spherical quantizers for the amplitude, phase, and frequency parameters of the sinusoidal model. This is done both for the case of a single sinusoid, and for the more practically relevant case of multiple sinusoids distributed across multiple segments. To account for psychoacoustical effects of the auditory system, a perceptual distortion measure is used. The optimal quantizers minimize a high-resolution approximation of the expected perceptual distortion, while the corresponding quantization indices satisfy an entropy constraint. The quantizers turn out to be flexible and of low complexity, in the sense that they can be determined easily for varying bit rate requirements, without any sort of retraining or iterative procedures. In an objective comparison it is shown that for the squared error distortion measure, the rate-distortion performance of the proposed method is very close to that of the theoretically optimal entropy-constrained vector quantization. Furthermore, for the perceptual distortion measure, the proposed scheme is shown to objectively outperform an existing sinusoidal quantization scheme, where frequency quantization is done independently. Finally, a subjective listening test, in which the proposed scheme is compared to an existing state-of-the-art sinusoidal quantization scheme with fixed quantizers for all input signals, indicates that the proposed scheme leads to an average bit rate reduction of 20%, at the same subjective quality level as the existing scheme 相似文献

11.

Efficient bit-rate scalability for weighted squared error optimization in audio coding

《IEEE transactions on audio, speech, and language processing》2006,14(4):1313-1327

We propose two quantization techniques for improving the bit-rate scalability of compression systems that optimize a weighted squared error (WSE) distortion metric. We show that quantization of the base-layer reconstruction error using entropy-coded scalar quantizers is suboptimal for the WSE metric. By considering the compandor representation of the quantizer, we demonstrate that asymptotic (high resolution) optimal scalability in the operational rate-distortion sense is achievable by quantizing the reconstruction error in the compandor's companded domain. We then fundamentally extend this work to the low-rate case by the use of enhancement-layer quantization which is conditional on the base-layer information. In the practically important case that the source is well modeled as a Laplacian process, we show that such conditional coding is implementable by only two distinct switchable quantizers. Conditional coding leads to substantial improvement over the companded scalable quantization scheme introduced in the first part, which itself significantly outperforms standard techniques. Simulation results are presented for synthetic memoryless Laplacian sources with /spl mu/-law companding, and for real-world audio signals in conjunction with MPEG AAC. Using the objective noise-mask ratio (NMR) metric, the proposed approaches were found to result in bit-rate savings of a factor of 2 to 3 when implemented within the scalable MPEG AAC. Moreover, the four-layer scalable coder consisting of 16-kb/s layers achieves performance close to that of the 64-kb/s nonscalable coder on the standard test database of 44.1-kHz audio. 相似文献

12.

Multiresolution,perceptual and vector quantization based video codec

Akbar Sheikh Akbari Pooneh Bagheri Zadeh Tom Buggy John Soraghan 《Multimedia Tools and Applications》2012,58(3):569-583

This paper presents a novel Multiresolution, Perceptual and Vector Quantization (MPVQ) based video coding scheme. In the intra-frame mode of operation, a wavelet transform is applied to the input frame and decorrelates it into its frequency subbands. The coefficients in each detail subband are pixel quantized using a uniform quantization factor divided by the perceptual weighting factor of that subband. The quantized coefficients are finally coded using a quadtree-coding algorithm. Perceptual weights are specifically calculated for the centre of each detail subband. In the inter-frame mode of operation, a Displaced Frame Difference (DFD) is first generated using an overlapped block motion estimation/compensation technique. A wavelet transform is then applied on the DFD and converts it into its frequency subbands. The detail subbands are finally vector quantized using an Adaptive Vector Quantization (AVQ) scheme. To evaluate the performance of the proposed codec, the proposed codec and the adaptive subband vector quantization coding scheme (ASVQ), which has been shown to outperform H.263 at all bitrates, were applied to six test sequences. Experimental results indicate that the proposed codec outperforms the ASVQ subjectively and objectively at all bit rates. 相似文献

13.

一种基于索引约束矢量量化的脆弱音频水印算法* 总被引：1，自引：1，他引：0

周赟王让定《计算机应用研究》2011,28(5):1940-1942

与传统矢量量化不同,索引约束矢量量化在量化过程中通过约束码字索引二进制形式中某一位的值来限定码字的搜索范围。本文利用其特殊的码字搜索方法提出了一种在音频信号中嵌入水印的方法。将原始音频信号分段,每段进行DCT变换并提取若干中频系数构成矢量。水印嵌入时根据水印比特信息和预先设定的索引约束位的值找到匹配码字修改各段DCT中频系数。水印提取时利用传统矢量量化方法得到各量化索引值后,提取出各索引值中与嵌入端相同位的比特值即为水印信息。该方法在量化过程中嵌入水印信息,有很好的实时性。实验结果表明,利用该方法嵌入的水印为一种脆弱水印,可用于认证。相似文献

14.

语音谱参数的增强双预测多级矢量量化的码本设计方法 总被引：1，自引：0，他引：1

高戈胡瑞敏李德仁《计算机工程与应用》2002,38(10):23-26

表征语音谱参数的线性预测编码(LPC)参数被广泛用于各种语音编码算法。甚低位率语音编码算法要求使用尽可能少的位率编码语音谱参数。文章提出了语音谱参数的增强双预测多级矢量量化算法(EDPMSVQ)的码本设计方法。这种改进的多级矢量量化方法充分利用语音谱参数的短时相关和长时相关特性,采用了有记忆的多级矢量量化算法(MSVQ),对语音谱参数的每一维分别使用不同的预测系数;并且通过利用相邻语音帧间语音谱参数的强相关和弱相关的不同特点,采用了分别对应于强相关和弱相关的两个预测值集合,进一步减小了语音谱参数编码位率。增强双预测多级矢量量化方法能够实现20位的语音谱参数近似“透明”量化,同时能够使语音谱参数量化时的计算复杂度略有减少,所需的存储空间大为减少。相似文献

15.

A blind audio watermarking algorithm by logarithmic quantization index modulation

Xinkai Wang Pengjun Wang Peng Zhang Shuzheng Xu Huazhong Yang 《Multimedia Tools and Applications》2014,71(3):1157-1177

In this paper we present a blind audio watermarking algorithm based on the vector norm and the logarithmic quantization index modulation (LQIM) in the wavelet domain, integrating the robustness of the vector norm with the imperceptibility of the logarithmic quantization index modulation based on μ-Law (or mu-Law) companding. Firstly μ-Law companding is adopted to transform the vector norm of the segmented wavelet approximation components of the original audio signal. And then a binary image scrambled by the chaotic sequence as watermark is embedded in the transformed domain with a uniform quantization scheme. Experimental results demonstrate that even if the capacity of the proposed algorithm is high, up to 102.4 bps, this algorithm can still maintain a high quality of the audio signal, and achieve a better performance, such as imperceptibility, robustness and complexity in comparison with the uniform quantization based algorithms against common attacks. What’s more, it can resist amplitude scaling attack effectively. 相似文献

16.

High performance data compression method with pattern matching for biomedical ECG and arterial pulse waveforms

Chen WS Hsieh L Yuan SY 《Computer methods and programs in biomedicine》2004,74(1):11-27

Biomedical waveforms, such as electrocardiogram (ECG) and arterial pulse, always possess a lot of important clinical information in medicine and are usually recorded in a long period of time in the application of telemedicine. Due to the huge amount of data, to compress the biomedical waveform data is vital. By recognizing the strong similarity and correlation between successive beat patterns in biomedical waveform sequences, an efficient data compression scheme mainly based on pattern matching is introduced in this paper. The waveform codec consists mainly of four units: beat segmentation, beat normalization, two-stage pattern matching and template updating and residual beat coding. Three different residual beat coding methods, such as Huffman/run-length coding, Huffman/run-length coding in discrete cosine transform domain, and vector quantization, are employed. The simulation results show that our compression algorithms achieve a very significant improvement in the performances of compression ratio and error measurement for both ECG and pulse, as compared with some other compression methods. 相似文献

17.

基于Contourlet变换和支持向量机的纹理识别方法

王佳奕葛玉荣《计算机应用》2013,33(3):677-679

针对变换域中图像纹理识别时如何选择最佳特征向量的问题,利用Contourlet变换的多方向、多尺度选择性和各向异性,将图像从空间域变换到频率域,全面地提取了Contourlet变换分解后低频子带、中频子带和高频子带的特征,输入支持向量机(SVM)分类器进行分类识别。利用Brodatz纹理库进行仿真实验,实验结果表明低频均值方差和高频能量作为组合特征时识别准确率可达98.75%,且特征向量维数少,是在Contourlet变换下表示图像纹理的最优特征。相似文献

18.

Low bit-rate multi stage vector quantization based on energy clustered training set

R. Krishnamoorthy R. Punidha 《Multimedia Tools and Applications》2014,70(3):2293-2308

In this paper, a new multi stage vector quantization with energy clustered training set is proposed for color image coding. The input image is applied with orthogonal polynomials based transformation and the energy clustered transformed training vectors are obtained with reduced dimension. The stage-by-stage codebook for vector quantization is constructed from the proposed transformed training vectors so as to reduce computational complexity. This method also generates a single codebook for all the three color components, utilizing the inter-correlation property of individual color planes and interactions among the color planes due to the proposed transformation. As a result, the color image encoding time is only slightly higher than that of gray scale image coding time and in contrast to the existing color image coding techniques, whose time is thrice greater than that of gray scale image coding. The experimental results reveal that only 35 % and 10 % of transform coefficients are sufficient for smaller and larger blocks respectively, for the reconstruction of images with good quality. The proposed multi stage vector quantization technique is faster when compared to existing techniques and yields better trade-off between image quality and block size for encoding. 相似文献

19.

基于分裂系数的多描述编码

蔡灿辉陈婧丁润涛《计算机应用》2005,25(5):1066-1068

提出了一种新的多描述图像编码方案---基于分裂系数的多描述编码。通过对每一个小波系数进行按位下采样,把一个系数分裂为奇数位系数和偶数位系数。奇数位系数信息和偶数位系数保护信息构成一个描述,偶数位系数信息和奇数位系数保护信息构成另一个描述。两个描述通过不同的信道进行传输。由于每个描述都包含了奇数位系数和偶数位系数的部分或全部信息,因此,接收到一个信道的信息就可以重建出一定质量的图像。如果所有的信道传输的信息都被接收,就会得到比任何一个单独信道更好的重建图像。实验结果表明,该算法的性能优于多相变换与选择量化算法。相似文献

20.

A fast vector quantization encoding algorithm based on projection pyramid with Hadamard transformation

Ahmed Swilem 《Image and vision computing》2010

Vector quantization (VQ) for image compression requires expensive time to find the closest codevector in the encoding process. In this paper, a fast search algorithm is proposed for projection pyramid vector quantization using a lighter modified distortion with Hadamard transform of the vector. The algorithm uses projection pyramids of the vectors and codevectors after applying Hadamard transform and one elimination criterion based on deviation characteristic values in the Hadamard transform domain to eliminate unlikely codevectors. Experimental results are presented on image block data. These results confirm the effectiveness of the proposed algorithm with the same quality of the image as the full search algorithm. 相似文献