Similar literature
 Found 20 similar documents; search took 15 ms
1.
In this paper, methods for improved parametric coding of transients are presented. We propose a signal model for coding of transients consisting of a sum of sinusoids, each amplitude-modulated by a different gamma envelope. These envelopes are characterized by an onset time, an attack parameter and a decay parameter. An efficient method for estimating these parameters is presented. Further, methods are proposed that combine this transient model with a constant-amplitude sinusoidal model in order to achieve efficient coding of both stationary and transient signal parts. By rate-distortion optimization using a perceptual distortion measure, we combine variable-rate bit allocation and segmentation in an optimal way. Formal and informal listening tests show that the combination of optimal segmentation and amplitude-modulated sinusoidal audio coding yields significant improvements over a state-of-the-art sinusoidal coder.
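
As a rough illustration of the signal model in item 1, the sketch below synthesizes a single sinusoid modulated by a gamma-shaped envelope. The envelope form `(t - onset)**attack * exp(-decay * (t - onset))`, the function names and the parameter values are assumptions made for illustration; the abstract does not give the paper's exact parameterization or its estimation method.

```python
import math

def gamma_envelope(t, onset, attack, decay):
    """Assumed gamma-shaped envelope: zero up to the onset, then a rise and a decay."""
    if t <= onset:
        return 0.0
    dt = t - onset
    return (dt ** attack) * math.exp(-decay * dt)

def am_sinusoid(num_samples, sample_rate, freq_hz, onset, attack, decay):
    """One sinusoid amplitude-modulated by the envelope above (hypothetical helper)."""
    samples = []
    for n in range(num_samples):
        t = n / sample_rate
        carrier = math.cos(2.0 * math.pi * freq_hz * t)
        samples.append(gamma_envelope(t, onset, attack, decay) * carrier)
    return samples
```

With this assumed form the envelope peaks `attack / decay` seconds after the onset, so the two parameters jointly set how sharp the transient is. A full transient model would sum several such components, each with its own envelope.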

2.
This paper examines the design of recursive vector quantization systems built around Gaussian mixture vector quantizers. The problem of designing such systems for minimum high-rate distortion, under input-weighted squared error, is discussed. It is shown that, in high dimensions, the design problem becomes equivalent to a weighted maximum likelihood problem. A variety of recursive coding schemes based on hidden Markov models are presented. The proposed systems are applied to the problem of wideband speech line spectral frequency (LSF) quantization under the log spectral distortion (LSD) measure. By combining recursive quantization and random coding techniques, the systems are able to attain transparent quality at rates as low as 36 bits per frame.

3.
Visual sensitivity guided bit allocation for video coding   Total citations: 1, self-citations: 0, citations by others: 1
A video bit allocation technique adopting a visual distortion sensitivity model for better rate-visual distortion coding control is proposed in this paper. Instead of applying complicated semantic understanding, the proposed automatic distortion sensitivity analysis process analyzes both the motion and the texture structures in the video sequences in order to achieve better bit allocation for rate-constrained video coding. The proposed technique evaluates the perceptual distortion sensitivity on a macroblock basis, and allocates fewer bits to regions that tolerate large perceptual distortions in order to reduce the rate. The proposed algorithm can be incorporated into existing video coding rate control schemes to achieve the same visual quality at a reduced bit rate. Experiments based on H.264 JM7.6 show that this technique achieves bit-rate savings of up to 40.61%, while the subjective viewing experiments conducted show no perceptual quality degradation. EDICS-1-CPRS, 3-QUAL.

4.
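This paper studies, under a fixed total bit rate, how to design improved quantizers and how to perform network-based distributed quantized estimation of a random scalar parameter, with a focus on the optimal allocation of bits among sensors. Unlike conventional approaches in which the quantization bit rate of each sensor is given, the paper considers the choice of estimator and the construction of different quantizers jointly in order to study allocation under a fixed total bit rate. The observation noise is assumed to be Gaussian, and for this model the optimal bit allocation is investigated with uniform quantization for both general and linear estimators. In the high-resolution regime, the upper bound on the mean square error of the former and the corresponding lower bound of the latter nearly coincide, and both show that the observation-noise error in the network is inversely proportional to the number of quantization levels. An alternating-sequence bit allocation algorithm is also used to ensure that the numerical solution is always non-negative. Finally, MATLAB simulations show that the proposed optimal-bit-allocation estimator outperforms conventional schemes.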

5.
In this paper, an optimal entropy-constrained non-uniform scalar quantizer is proposed for pixel-domain distributed video coding (DVC). The uniform quantizer is efficient for hybrid video coding because the residual signals conform to a single-variance Laplacian distribution. However, it is not optimal for pixel-domain DVC, because it does not adapt to the joint distribution of the source and the side information (SI), especially at low quantization levels. The signal distribution in pixel-domain DVC conforms to a multi-variance mixture model. When the non-uniform quantizer is designed according to the joint distribution, the error between the source and the SI can be decreased. As a result, bit rate is saved without sacrificing much video quality, and a better rate-distortion (R-D) trade-off is achieved. First, the number of quantization levels is fixed and the optimal R-D trade-off is obtained using a Lagrangian function J(Q). The rate and distortion components are designed based on P(Y|Q), the conditional probability density function of the SI Y given the quantization partition Q, which is approximated by a Gaussian mixture model at the encoder. Since the SI cannot be accessed at the encoder, an estimate of P(Y|Q) based on the distribution of the source is proposed. Next, J(Q) is optimized by an iterative Lloyd-Max algorithm with a novel quantization partition updating algorithm. To guarantee the convergence of J(Q), the monotonicity of the interval in which the endpoints of the quantizer lie must be satisfied. To this end, a partition updating algorithm that considers the extreme points of the source histogram is proposed. Consequently, the entropy-constrained optimal non-uniform quantization partitions are derived, and applying them achieves a better R-D trade-off.
Experimental results show that the proposed scheme improves performance by 0.5 dB on average compared with uniform scalar quantization.

6.
This paper deals with the application of adaptive signal models for parametric audio coding. A fully parametric audio coder, which decomposes the audio signal into sinusoids, transients and noise, is proposed. Adaptive signal models for sinusoidal, transient, and noise modeling are included in the parametric scheme in order to achieve high-quality, low bit-rate audio coding. A new sinusoidal modeling method based on a perceptual distortion measure is proposed. For transient modeling, a fast and effective method based on matching pursuit with a mixed dictionary is chosen. The residue of the previous models is analyzed as a noise-like signal. The proposed parametric audio coder allows high-quality coding of one-channel audio signals at 16 kbits/s (average bit rate). A bit-rate scalable version of the parametric audio coder is also proposed in this work. Bit-rate scalability is intended for audio streaming applications, which are currently in high demand. The performance of the proposed parametric audio coders (nonscalable and scalable) is assessed in comparison to widely used audio coders operating at similar bit rates.

7.
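Chen Yongna  Zhou Yu  Wang Xiaodong  Guo Lei 《Journal of Computer Applications》(计算机应用) 2017,37(10):2806-2812
To address the small embedding capacity and the noticeable bit-rate increase of information hiding algorithms based on modulating video intra prediction modes, an intra-frame video information hiding algorithm based on diamond encoding is proposed. Built on High Efficiency Video Coding (HEVC), the algorithm groups the prediction modes of two adjacent 4×4 blocks into a mode pair and uses an improved diamond encoding algorithm to guide mode modulation and information embedding. It further adopts a two-pass encoding approach: the second pass embeds the secret information while retaining the optimal coding partition of the original platform, which preserves embedding capacity while suppressing intra-frame distortion drift. Experimental results show that the peak signal-to-noise ratio (PSNR) drops by no more than 0.03 dB, the bit rate increases by less than 0.53%, the embedding capacity improves substantially, and the subjective and objective video quality is well preserved.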

8.
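Objective: Hash-coding-based retrieval is a classic approach in image retrieval. Its principle is that images that are similar in the original space, after projection by a hash function and quantization, yield nearby hash codes in Hamming space. Such methods generally consist of two stages: projection and quantization. The projection stage mostly uses principal component analysis to reduce the dimensionality of the original data, whereas the quantization stage differs considerably across methods. For data with unevenly distributed information, traditional image hashing methods quantize with a fixed, equal number of code bits, which leads to low coding efficiency and low quantization accuracy. This paper therefore proposes a product quantization method based on Huffman coding. Method: First, product quantization is applied to the dimension-reduced data so as to better preserve the distribution of the data in the original space. Then, subspace variance is adopted as the measure of information content and as the basis for allocating code bits. Finally, with the help of a Huffman tree, subspaces with larger variance are allocated more code bits. Results: Experiments on the public datasets MNIST, NUS-WIDE and 22K LabelMe show that, compared with the original product quantization method, the proposed method reduces quantization error by 49% on average and improves average precision by 19%. On MNIST, compared with transform coding (TC), a method of the same kind, over code lengths from 32 bit to 256 bit, the training time of the proposed method is shorter by 22.5 s on average. Conclusion: A hashing method based on multi-bit product quantization is proposed. It improves the efficiency and quantization accuracy of hash coding, outperforms comparable algorithms in average precision, recall and other measures, and can be applied effectively to image retrieval.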

9.
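As a lossy image coding technique, block truncation coding (BTC) has low computational cost, is fast, tolerates channel errors well, and yields good reconstructed image quality. However, the main drawback of the standard BTC algorithm is that its bit rate is higher than that of other block-based image coding algorithms such as transform coding and vector quantization. To reduce the bit rate, several efficient BTC algorithms are proposed, along with a simple look-up table algorithm for coding the BTC quantization data of each block; vector quantization is also introduced to reduce the number of bits needed to code the bit plane. To reduce the additional distortion introduced by these modifications, each proposed algorithm uses an optimal threshold rather than the block mean as the quantization threshold.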
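
For readers unfamiliar with BTC, the following is a minimal sketch of the standard moment-preserving variant with the block mean as threshold, which is the baseline that item 9 sets out to improve. It is not the look-up table, vector quantization or optimal-threshold scheme described in the abstract, and the function names are hypothetical.

```python
import math

def btc_encode(block):
    """Standard BTC for a flat list of pixel values: a bit plane plus two levels."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((p - mean) ** 2 for p in block) / m)
    bitplane = [1 if p >= mean else 0 for p in block]
    q = sum(bitplane)  # number of pixels at or above the mean
    if q == m:
        # Flat block: every pixel equals the mean, so one level is enough.
        return bitplane, mean, mean
    # Two levels chosen so the block mean and variance are preserved.
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return bitplane, low, high

def btc_decode(bitplane, low, high):
    """Rebuild the block by mapping each bit to its reconstruction level."""
    return [high if bit else low for bit in bitplane]
```

Each block costs one bit per pixel for the bit plane plus the two levels, which is why reducing the bit-plane cost is a natural target for lowering the bit rate.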

10.
Mikhael, W., and Krishnan, V., Energy-Based Split Vector Quantizer Employing Signal Representation in Multiple Transform Domains, Digital Signal Processing 11 (2001) 359–370
Vector quantization schemes are widely used for waveform coding of one- and multidimensional signals. In this contribution, a novel energy-based, split vector quantization technique is presented, which represents digital signals efficiently as measured by the number of bits per sample for a predetermined signal reconstruction quality. In this approach, each signal vector is projected into multiple transform domains. In the learning mode, for a given transform domain representation, the transformed vector is split into subvectors (subbands) of equal average energy estimated from the transformed training vector ensemble. An equal number of bits is assigned to each subvector. A codebook is then designed for each equal energy subband of each transform domain representation. In the running mode, the coder selects codes from the domain that best represents the signal vector. The proposed multiple transform, split vector quantizer is developed and its performance is evaluated for both single-stage and multistage implementations. Several single transform vector quantizers for waveform coding exist, some of which employ energy-based bit allocation. Sample results using one-dimensional speech signals confirm the superior performance of the proposed scheme over existing single transform vector quantizers for waveform coding.

11.

In this paper, a novel pyramid coding based rate control scheme is proposed for video streaming applications constrained by a constant channel bandwidth. To achieve the target bit rate with the best quality, the initial quantization parameter (QP) is determined by the average spatio-temporal complexity of the sequence, its resolution and the target bit rate. Simple linear estimation models are then used to predict the number of bits that would be necessary to encode a frame for a given complexity and QP. The experimental results demonstrate that the proposed rate control scheme significantly outperforms the existing rate control scheme in the Joint Model (JM) reference software in terms of Peak Signal to Noise Ratio (PSNR) and consistent perceptual visual quality while achieving the target bit rate. Finally, the proposed scheme is validated through experimental evaluation over a miniature test-bed.

12.
In this paper, a Human Visual System based adaptive quantization scheme is proposed. The proposed algorithm supports perceptually lossless as well as lossy compression. It uses a transform-based compression approach built on the wavelet transform, and incorporates vision models for the compression of both luminance and chrominance components. The major strengths of the coder are the vision model for the chrominance components and the optimal distribution of scales among the luminance and chrominance components to achieve higher compression ratios. The perceptual model developed for the color components makes it possible to compress them more heavily without causing color degradation. For each image, the visual thresholds are evaluated and an optimum bit allocation is performed such that the quantization error is always less than the visual distortion for the given rate. To validate the proposed algorithm, the perceptual quality of images reconstructed with the proposed coder is compared with that of images reconstructed with the JPEG2000 standard coder at the same compression. Perceptual quality is evaluated with recent metrics such as the Structural Similarity Index, Visual Information Fidelity and Visual Signal-to-Noise Ratio. The results show that the proposed structure clearly improves perceptual quality over the existing schemes, for both lossy and lossless compression. These advantages make the proposed algorithm a good candidate for replacing the quantizer stage of current image compression standards.

13.
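Objective: Digital video is usually transmitted after compression, and embedding secret information in conjunction with a video coding standard is the mainstream technique for video information hiding. However, existing information hiding techniques based on HEVC (High Efficiency Video Coding) suffer from rapid bit-rate growth and degraded video quality. To address these problems, a high-capacity HEVC information hiding method incorporating the just-noticeable coding distortion (JNCD) model is proposed. Method: The JNCD model is a visual perception model for HEVC video coding. It fully accounts for the blurring and blocking artifacts of the coding process, effectively removes perceptual redundancy, and achieves higher subjective perceptual quality at the same bit rate. Using the JNCD model, the optimal quantization parameter (QP) of the coding units (CUs) in I-frames is adjusted, and secret information is embedded with the direction-adjustment-based EMD algorithm to further increase hiding capacity. To improve security, the secret information is scrambled and encrypted with a key; at the decoder, only a user holding the key can decrypt it correctly and recover the secret information. Results: Experiments use the HEVC reference software HM16.0 on test sequences of different resolutions. The results show that, after embedding, the average PSNR of the test sequences is 41.16 dB; compared with existing information hiding methods, the method maintains good subjective and objective video quality while improving hiding capacity by about 2 times on average. Conclusion: While preserving the quality of the original video, the method effectively increases hiding capacity and limits bit-rate growth to some extent, meeting the imperceptibility, security and real-time requirements of information hiding.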

14.
This paper presents a novel Multiresolution, Perceptual and Vector Quantization (MPVQ) based video coding scheme. In the intra-frame mode of operation, a wavelet transform is applied to the input frame and decorrelates it into its frequency subbands. The coefficients in each detail subband are pixel quantized using a uniform quantization factor divided by the perceptual weighting factor of that subband. The quantized coefficients are finally coded using a quadtree-coding algorithm. Perceptual weights are specifically calculated for the centre of each detail subband. In the inter-frame mode of operation, a Displaced Frame Difference (DFD) is first generated using an overlapped block motion estimation/compensation technique. A wavelet transform is then applied on the DFD and converts it into its frequency subbands. The detail subbands are finally vector quantized using an Adaptive Vector Quantization (AVQ) scheme. To evaluate the performance of the proposed codec, the proposed codec and the adaptive subband vector quantization coding scheme (ASVQ), which has been shown to outperform H.263 at all bitrates, were applied to six test sequences. Experimental results indicate that the proposed codec outperforms the ASVQ subjectively and objectively at all bit rates.

15.
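In conventional video coding systems, distortion is mostly measured by mean square error (MSE), but MSE-based distortion often fails to capture the subjective differences between video streams, so the encoder needs to exploit the perceptual characteristics of the human visual system. To further improve coding efficiency, and based on the fact that the human eye is not equally sensitive to signals of different luminance, a luminance coefficient compression algorithm based on human perceptual characteristics is proposed. The algorithm uses forward quantization to discard redundant information that the human eye cannot perceive, improving encoder compression efficiency while ensuring that the discarded information is invisible to the viewer. Experimental results show that an AVS reference encoder using the algorithm reduces the output bit rate by 8% to 40%, while the subjective quality of the decoded images is comparable to that of the encoder without the algorithm.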

16.
We propose two quantization techniques for improving the bit-rate scalability of compression systems that optimize a weighted squared error (WSE) distortion metric. We show that quantization of the base-layer reconstruction error using entropy-coded scalar quantizers is suboptimal for the WSE metric. By considering the compandor representation of the quantizer, we demonstrate that asymptotic (high resolution) optimal scalability in the operational rate-distortion sense is achievable by quantizing the reconstruction error in the compandor's companded domain. We then fundamentally extend this work to the low-rate case by the use of enhancement-layer quantization which is conditional on the base-layer information. In the practically important case that the source is well modeled as a Laplacian process, we show that such conditional coding is implementable by only two distinct switchable quantizers. Conditional coding leads to substantial improvement over the companded scalable quantization scheme introduced in the first part, which itself significantly outperforms standard techniques. Simulation results are presented for synthetic memoryless Laplacian sources with μ-law companding, and for real-world audio signals in conjunction with MPEG AAC. Using the objective noise-mask ratio (NMR) metric, the proposed approaches were found to result in bit-rate savings of a factor of 2 to 3 when implemented within the scalable MPEG AAC. Moreover, the four-layer scalable coder consisting of 16-kb/s layers achieves performance close to that of the 64-kb/s nonscalable coder on the standard test database of 44.1-kHz audio.
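
The idea of refining a base layer in the companded domain, mentioned in item 16, can be sketched with the textbook μ-law compressor. This is only an illustration under assumptions: inputs are taken to lie in [-1, 1], `MU = 255` and both step sizes are arbitrary choices, and plain uniform rounding stands in for the entropy-coded and conditional quantizers of the paper.

```python
import math

MU = 255.0  # assumed companding constant, chosen for illustration

def compress(x, mu=MU):
    """mu-law compressor for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def scalable_quantize(x, base_step, enh_step, mu=MU):
    """Two-layer sketch: the enhancement layer quantizes the base-layer
    error in the companded domain, not in the signal domain."""
    y = compress(x, mu)
    y_base = base_step * round(y / base_step)
    y_enh = y_base + enh_step * round((y - y_base) / enh_step)
    return expand(y_base, mu), expand(y_enh, mu)
```

A decoder that receives only the base layer returns the first value; one that also receives the enhancement layer returns the second, whose companded-domain error is at most half of `enh_step`.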

17.
This paper describes a coding paradigm using coding tools based on the characteristics of the human hearing system so as to accommodate a wide range of narrow-band audio inputs without annoying artifacts at low rates (down to 8 kb/s). The narrow-band perceptual audio coder (NPAC) employs a variety of algorithms to account for the perceptually irrelevant parts of the input signal in addition to statistical redundancies. The new algorithms used in the NPAC coder include a perceptual error measure in training the codebooks and selecting the best codewords which takes into account the audible parts of the quantization noise, a perception-based bit-allocation algorithm and a new predictive scheme to vector quantize the scale factors. The NPAC coder delivers acceptable quality without annoying artifacts for most narrow-band audio signals at around 1 bit/sample. Informal subjective tests have shown that the NPAC coder outperforms a commercial low-rate music coder operating at 8 kb/s.

18.
Pak  Mesut  Bayazit  Ulug 《Multimedia Tools and Applications》2020,79(27-28):19239-19263

This paper proposes a regional rate allocation method for enhancing the perceived quality in image compression. Bit allocation to image regions should be performed by considering the viewer’s attention and distortion sensitivity maps in order to address subjective quality concerns. The paper first proposes an exponential model for the relation between the viewer’s fixation duration and perceived information. The human visual system is more sensitive to the distortion around edges than the distortion in complex textured regions. Therefore, a novel distortion sensitivity method is also proposed that distinguishes true edges from complex textures without using edge detectors or gradient magnitude thresholds. The estimates for the visual attention level and the distortion sensitivity level are jointly used to modify the distortion contribution of each codeblock for determining its quantization parameter. The experiments validate the improved perceptual quality of decoded images due to the integrated use of the visual distortion sensitivity and the visual attention level in bit allocation. Moreover, the proposed bit allocation method is experimentally shown to yield a substantially higher subjective evaluation score than the other well-known bit allocation methods based on post-compression rate-distortion optimization, saliency maps, foveation of fixations and foveated just-noticeable-difference maps.

19.
The objective of this paper is to propose an efficient model-based bit allocation process that optimizes the performance of a wavelet coder for semiregular meshes. More precisely, this process should compute the quantizers for the wavelet coefficient subbands that minimize the reconstructed mean square error for a specific target bitrate. In order to design a fast, low-complexity allocation process, we propose an approximation of the reconstructed mean square error relative to the coding of semiregular mesh geometry. This error is expressed directly from the quantization errors of each coefficient subband. For that purpose, we have to take into account the influence of the wavelet filters on the quantized coefficients. Furthermore, we propose a specific approximation for wavelet transforms based on lifting schemes. Experimentally, we show that, in comparison with a "naive" approximation (depending on the subband levels), using the proposed approximation as the distortion criterion during the model-based allocation process improves the performance of a wavelet-based coder for any model, any bitrate, and any lifting scheme.
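
To make the notion of allocating bits against a weighted sum of subband quantization errors concrete, here is a small greedy sketch. It is not the model-based process of item 19: the distortion model `weight * variance * 2**(-2 * bits)`, the per-subband weights standing in for the filter influence, and the assumption that all subbands hold the same number of coefficients are simplifications introduced for illustration.

```python
def greedy_bit_allocation(variances, weights, total_bits):
    """Give out bits one at a time to the subband whose weighted
    distortion drops the most (hypothetical high-rate distortion model)."""
    bits = [0] * len(variances)

    def distortion(i, b):
        return weights[i] * variances[i] * 2.0 ** (-2 * b)

    for _ in range(total_bits):
        gains = [distortion(i, bits[i]) - distortion(i, bits[i] + 1)
                 for i in range(len(bits))]
        # max() keeps the first index on ties, so results are deterministic.
        best = max(range(len(bits)), key=lambda i: gains[i])
        bits[best] += 1
    return bits
```

Subbands with larger weighted variance receive more bits, and because bits are only ever added, the allocation is a non-negative integer vector by construction.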

20.
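Objective: High Efficiency Video Coding (HEVC) uses rate-distortion optimization to select the best coding parameters, balancing the coding bit rate against image distortion. Distortion is usually measured by mean square error or the sum of absolute errors, neither of which accounts for the subjective perception of the human eye. To improve the subjective perceptual quality of video coding, a rate-distortion optimization algorithm incorporating visual perception characteristics is proposed and applied to inter-frame rate-distortion optimization. Method: First, a visual perception factor is defined that accounts for the human visual system's sensitivity to spatially active regions, textured regions, temporal motion regions and luminance. The Lagrange multiplier is then adjusted adaptively for each coding tree unit. Finally, the quantization parameter is further corrected according to the relationship between the Lagrange multiplier and the quantization parameter. Results: Compared with the HEVC reference software, the algorithm clearly improves rate-distortion performance. At the same structural similarity (SSIM) score, it saves 3.1% and 4.9% of the bit rate on average under the random access and low delay configurations, respectively, with a maximum saving of 9.0%. Compared with a representative algorithm from the literature, at the same SSIM it achieves additional average bit-rate savings of 0.7% and 2.2% under random access and low delay, respectively. Conclusion: The proposed strategy adaptively adjusts the Lagrange multiplier in rate-distortion optimization according to the visual characteristics of the image, saving bit rate while keeping coding quality essentially unchanged and improving HEVC coding performance.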
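
The last step of item 20, correcting the quantization parameter from the adjusted Lagrange multiplier, can be illustrated with the relation used in the H.264 and HEVC reference encoders, where the multiplier grows in proportion to 2^((QP - 12) / 3). The sketch below assumes the perceptual factor simply scales the multiplier; how the paper computes that factor and applies the correction is not stated in the abstract, and the function names are hypothetical.

```python
import math

def qp_offset_for_lambda_scale(scale):
    """If lambda is proportional to 2**((QP - 12) / 3), scaling lambda
    by `scale` corresponds to a QP change of 3 * log2(scale)."""
    return 3.0 * math.log2(scale)

def adjust_ctu(base_qp, base_lambda, perceptual_factor, qp_min=0, qp_max=51):
    """Scale lambda for one coding tree unit and shift its QP to match.
    perceptual_factor is a hypothetical positive per-CTU weight."""
    new_lambda = base_lambda * perceptual_factor
    new_qp = base_qp + round(qp_offset_for_lambda_scale(perceptual_factor))
    return new_lambda, max(qp_min, min(qp_max, new_qp))
```

Under this relation, doubling the multiplier raises QP by 3 and halving it lowers QP by 3; the clamp to 0..51 matches the QP range for 8-bit video.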
