Similar literature
20 similar documents found (search time: 15 ms)
1.
In order to further improve the efficiency of video compression, we introduce perceptual characteristics of the Human Visual System (HVS) into video coding and propose a novel rate control algorithm for H.264/AVC based on a human visual saliency model. First, we modify Itti's saliency model. Second, the target bits of each frame are allocated according to the correlation of the salient regions between the current and previous frames, and the complexity of each MB is modified through the saliency value and its...

2.
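Traditional graded quantization models based on the region of interest (ROI) of the human eye divide each video frame into ROI and non-ROI regions and quantize them with different quantization parameters (QP) to improve subjective video quality. However, such models ignore the internal characteristics of the ROI and therefore do not match the human visual system (HVS) well. For low-bit-rate scenarios in which a face is the main subject, such as desktop video and handheld terminals, a graded quantization method based on ROI and just-noticeable distortion (JND) is proposed. The JND model indicates that edge regions can hide more distortion than smooth regions. This property is used to detect the edge parts within the ROI (i.e., the face) that attract more visual attention (e.g., the eyes, nose, and mouth), and a graded quantization model based on ROI and JND is built accordingly to guide the quantization of each region. Experimental results show that, for low-bit-rate video applications, the proposed method clearly improves subjective video quality over traditional graded quantization at the same bit rate.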
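
The graded quantization idea in this abstract can be illustrated with a minimal sketch. The abstract gives neither the QP offsets nor the exact mapping from region type to QP, so the three-level scheme, the function name `graded_qp`, and all offset values below are assumptions chosen only for illustration; the only fixed fact used is the 0 to 51 QP range of 8-bit H.264/AVC.

```python
# Valid QP range for 8-bit H.264/AVC.
H264_QP_MIN, H264_QP_MAX = 0, 51


def graded_qp(base_qp, in_roi, is_feature_edge,
              feature_delta=-4, roi_delta=-2, background_delta=4):
    """Return a QP for one macroblock under a three-level graded scheme.

    All delta values are hypothetical; they are not taken from the paper.
    - ROI macroblocks on facial-feature edges get the finest quantization,
    - other ROI macroblocks get a smaller QP reduction,
    - non-ROI (background) macroblocks are quantized more coarsely.
    """
    if in_roi and is_feature_edge:
        delta = feature_delta
    elif in_roi:
        delta = roi_delta
    else:
        delta = background_delta
    # Clamp to the legal QP range.
    return max(H264_QP_MIN, min(H264_QP_MAX, base_qp + delta))
```

With these assumed offsets, a frame-level QP of 30 maps to 26 for feature-edge macroblocks, 28 for the rest of the face, and 34 for the background.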

3.
Transform coding has been widely used in video coding standards such as H.264 advanced video coding (H.264/AVC) and high efficiency video coding (HEVC). However, the coded video sequences suffer from annoying coding artifacts, such as blocking and ringing. In this paper, we propose the quadtree-based non-local Kuan’s (QNLK) filter to suppress the quantization noise optimally and to improve both the objective and subjective quality of the reconstructed frame. The proposed filter uses the non-local Kuan’s (NLK) filter to restore the quantized signal in the transform domain. The restored coefficients are then projected onto designed quantization constraint sets (QCS). A quadtree-based signaling strategy is used at the end of QNLK for adaptive on/off filtering control. Experimental results show that the proposed method achieves significant objective coding gain and visual quality improvement compared with both the H.264/AVC high profile and HEVC.

4.
The rate control algorithm adopted in the H.264/AVC reference software has several shortcomings that have been highlighted by different studies. For instance, in the baseline profile, the frame target bit-rate estimation assumes similar characteristics for all frames, and the quantization parameter determination uses the Mean Absolute Difference for complexity estimation. Consequently, bit allocation is inefficient, leading to significant quality variation in the decoded sequences. A saliency-based rate control is proposed in this paper to achieve bit-rate savings and improve perceived quality. The saliency map of each frame, which simulates human visual attention with a bottom-up approach, is used at the frame level to adjust the quantization parameter and at the macroblock (MB) level to guide the bit allocation process. Simulation results show that the proposed attentional model correlates well with human behavior. Compared with the JM15.0 reference software, exploiting the saliency map at the frame level achieves bit-rate savings of up to 26%. At the MB level and under the same quality constraint, the bit-rate improvement is up to 42% and buffer level variation is reduced by up to 71%.

5.
Visual quality is a critical factor in predictive video coding over packet-switched networks. However, traditional MSE-based error-resilient video coding does not correlate well with the perceptual characteristics of the human visual system (HVS). This paper proposes a structural similarity (SSIM) based error-resilient video coding scheme to improve the visual quality of compressed videos over packet-switched networks. In the proposed scheme, an SSIM-based end-to-end distortion model is developed to estimate the perceptual distortion due to quantization, error concealment, and error propagation. Based on this model, an adaptive mode selection strategy is presented to enhance the communication robustness of compressed videos. Experiments show that the proposed scheme significantly improves the visual quality of H.264/AVC video coding over packet-switched networks.
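
For readers unfamiliar with SSIM, the sketch below computes the standard SSIM index between two equally sized blocks of 8-bit samples, using the usual constants K1 = 0.01 and K2 = 0.03. It is a single-window form applied to a flat list of samples, not the end-to-end distortion model of the paper, whose details the abstract does not give; the function name `ssim` is only illustrative.

```python
def ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length sample sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased sample variances and covariance.
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / (n - 1)
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    # Stabilizing constants from the standard SSIM definition.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

A perceptual distortion term for mode selection is then commonly taken as `1 - ssim(original, reconstructed)`; whether the paper uses exactly this form is not stated in the abstract.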

6.
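For multi-view stereoscopic video compression coding, a bit allocation method based on stereoscopic visual saliency is proposed. A method is studied for extracting the region of interest (ROI) of the human eye by jointly using the motion, depth, and depth-edge information of the scene in multi-view stereoscopic video data; the region-level bit allocation is then optimized according to the ROI partition. Experimental results show that the proposed algorithm effectively improves the coding performance of the ROI while also improving the rate-distortion performance of the whole video to some extent.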

7.
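Han Gonghai, Wan Shuai, Gong Yanchao. 《电视技术》, 2011, 35(17): 27-29, 51
The hierarchical B-frame coding structure is an efficient coding method used in H.264/AVC and scalable video coding to achieve temporal scalability. At low bit rates, coding with the hierarchical B-frame quantization parameter (QP) allocation scheme of proposal JVT-P014 produces large fluctuations in video quality, which seriously degrade subjective quality. To address this problem, a new QP allocation method for hierarchical B-frames is proposed. The method takes the persistence of vision of the human eye into account and effectively reduces quality fluctuation between frames at low bit rates. Experimental results show that, under low-bit-rate coding conditions, the algorithm reduces video quality fluctuation by more than 10% compared with the JVT-P014 scheme while keeping the overall rate-distortion performance essentially unchanged.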

8.
In this paper, we present a new adaptive video coding control for real-time H.264/AVC encoding systems. The main techniques are: (1) an initial quantization parameter (QP) decision scheme based on Laplacian of Gaussian (LoG) operators; (2) an MB-level QP calculation based on spatio-temporal correlation, which requires less computation than the quadratic model used by H.264/AVC; (3) an adaptive GOP structure in which the I-frame is adaptively replaced by an enhancement P-frame to improve coding efficiency; (4) scene change detection using the complexity of adjacent inter-frames, with the appropriate QP recalculated for the scene-change frame. The proposed algorithm both reduces computational complexity and improves coding quality. Compared with the JM12.4 reference on various test sequences, the proposed algorithm decreases coding time by 64.5% and improves PSNR by 1.52 dB at the same bit rate.

9.
Rate distortion (RD) optimization for H.264 interframe coding with complete baseline decoding compatibility is investigated on a frame basis. Using soft decision quantization (SDQ) rather than the standard hard decision quantization, we first establish a general framework in which motion estimation, quantization, and entropy coding (in H.264) for the current frame can be jointly designed to minimize a true RD cost given previously coded reference frames. We then propose three RD optimization algorithms, each embedded in the next in the order listed: a graph-based algorithm for near optimal SDQ in H.264 baseline encoding given motion estimation and quantization step sizes; an algorithm for near optimal residual coding in H.264 baseline encoding given motion estimation; and an iterative overall algorithm to optimize H.264 baseline encoding for each individual frame given previously coded reference frames. The graph-based algorithm for near optimal SDQ is the core; given motion estimation and quantization step sizes, it is guaranteed to perform optimal SDQ if the weak adjacent block dependency utilized in the context adaptive variable length coding of H.264 is ignored for optimization. The proposed algorithms have been implemented based on the H.264 reference encoder JM82 with complete compatibility with the baseline profile. Experiments show that, for a set of typical video test sequences, the graph-based algorithm for near optimal SDQ, the algorithm for near optimal residual coding, and the overall algorithm achieve on average 6%, 8%, and 12% rate reduction, respectively, at the same PSNR (ranging from 30 to 38 dB) when compared with the RD optimization method implemented in the H.264 reference software.

10.
Most model-based rate control schemes use independent rate-distortion (R–D) models at the macroblock (MB) level to represent the relationship among bit rate, distortion, and encoding complexity. However, the correlations between frames (INTER-dependency) are not well considered in distortion estimation, bit allocation, and quantization parameter (QP) decision. In this paper, a novel INTER-dependent R–D model is proposed based on a theoretical analysis of the relationship between the prediction residual of one frame and the distortion of its reference frame. To achieve both bit rate accuracy and consistent video quality, a window-based rate control scheme with two sliding windows is introduced. One window groups certain previously encoded frames with the current frame to control the bit rate and buffer delay; the other groups certain frames still to be encoded to reduce the fluctuation of video quality. Furthermore, the optimization of the Lagrange multiplier under INTER-dependency is also discussed. Experimental results demonstrate that the proposed window-based rate control scheme with the INTER-dependent R–D model achieves accurate target bit rates and improves PSNR performance, while its PSNR variation is smaller than that of the three benchmark algorithms. This one-pass rate control scheme is highly practical for real-time video coding applications.

11.
Hierarchical B-frames contribute to improvement of coding performance when introduced into H.264/AVC. However, the existing rate control schemes for H.264/AVC, which are mainly applied to IPPP and IBBP coding structures, cannot work efficiently for the coding structures with hierarchical B-frames. In this paper, a frame layer rate control scheme for hierarchical B-frames is proposed. Firstly, an adaptive starting quantization parameter (QP) determination method is implemented to derive the QP for the first coding frame based on the available channel bit rate and the content of the current video sequence. Then, the target bit budget for a group of pictures (GOP) is calculated based on the target bit rate and the buffer status. Afterwards, a temporal level (TL) layer rate control phase is introduced, and the GOP layer target bit budget is allocated to each TL. In the frame layer rate control phase, a method based on a rate-distortion model and the coding properties of the previous coded key frames is derived to determine the QP for the current key frame. For hierarchical B-frames, we introduce a typical weighting factor in the determination of their target bit budgets to address the features of the hierarchical coding structures. This weighting factor is calculated according to the target bit budget of the GOP layer and the knowledge obtained from the previous coded B-frames in each TL. Subsequently, the QP for coding the current B-frame is computed by a quadratic model with different model parameters for different TLs, and the computed QP is further adaptively adjusted according to the usage of the target bit budgets. After coding the current frame, an update stage, in which a threshold-based method is integrated to avoid model degradation, is invoked to update the parameters for rate control. 
Experimental results demonstrate that when the proposed rate control scheme is applied to the coding structure with hierarchical B-frames in H.264/AVC, the actual coding bit rates can match the target bit rates very well, and the encoding performance is also improved.
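
The quadratic model mentioned in this abstract is commonly written as R = X1·MAD/Qstep + X2·MAD/Qstep², where R is the target bit budget and MAD a complexity measure. The sketch below solves that equation for Qstep and converts the result to a QP using the approximation Qstep ≈ 2^((QP − 4)/6). The function names and parameter values are illustrative assumptions; the paper's per-temporal-level parameters and its adaptive adjustment step are not given in the abstract and are not reproduced here.

```python
import math


def qstep_from_quadratic_model(target_bits, mad, x1, x2):
    """Solve target_bits = x1*mad/Q + x2*mad/Q**2 for the positive root Q."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    a = x1 * mad
    b = x2 * mad
    discriminant = a * a + 4.0 * target_bits * b
    if x2 == 0 or discriminant < 0:
        # Fall back to the linear model when the quadratic term is unusable.
        return a / target_bits
    return (a + math.sqrt(discriminant)) / (2.0 * target_bits)


def qp_from_qstep(qstep):
    """Map a quantization step size to the nearest H.264/AVC QP (0..51).

    Uses the approximation Qstep = 2 ** ((QP - 4) / 6).
    """
    if qstep <= 0:
        raise ValueError("qstep must be positive")
    qp = round(6.0 * math.log2(qstep) + 4.0)
    return max(0, min(51, qp))
```

In a scheme like the one described, one set of (x1, x2) would be kept and updated per temporal level, and the resulting QP would then be adjusted according to how much of the bit budget has been used.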

12.
Hierarchical B-frames can bring high coding performance when introduced into H.264/AVC. However, traditional rate control schemes cannot work efficiently in this new coding framework. This article presents a rate control algorithm for hierarchical B-frames in H.264/AVC. Taking the features of the dyadic hierarchical coding structure into consideration, the proposed algorithm includes group of pictures (GOP) layer, temporal layer, and frame layer bit allocation. After frame layer bit allocation is complete, a frame layer quantization parameter (QP) determination strategy calculates the final QP. Experimental results show that, compared with other rate control algorithms, the proposed one improves coding performance and reduces the mismatch between the target bit rate and the actual bit rate.
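
As background for the temporal layer allocation step, the following sketch shows how frames in a dyadic hierarchical GOP map to temporal levels, with key frames at level 0. It covers only the structure of the hierarchy in display order; the function name is illustrative and the bit allocation rules of the paper are not reproduced.

```python
def temporal_level(frame_index, gop_size):
    """Temporal level of a frame (display order) in a dyadic hierarchical GOP.

    Key frames (index a multiple of gop_size) are level 0; the level grows
    by one each time the distance between frames of that level halves.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    position = frame_index % gop_size
    if position == 0:
        return 0
    max_level = gop_size.bit_length() - 1          # log2(gop_size)
    trailing_zeros = (position & -position).bit_length() - 1
    return max_level - trailing_zeros
```

For a GOP size of 8, frames 0 and 8 are key frames, frame 4 is at level 1, frames 2 and 6 at level 2, and the odd frames at level 3.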

13.
Bitstream-layer models are designed to use the information extracted from both packet headers and payload for real-time, non-intrusive quality monitoring of networked video. This paper proposes a content-adaptive bitstream-layer (CABL) model for coding distortion assessment of H.264/AVC networked video. First, the fundamental relationship between perceived coding distortion and quantization parameter (QP) is established. Then, because the perceived coding distortion of a networked video relies significantly on both the spatial and temporal characteristics of the video content, spatial and temporal complexities are incorporated into the proposed model. Assuming that the residuals before the Discrete Cosine Transform (DCT) follow a Laplace distribution, the scale parameters of the Laplace distribution are first estimated from the QP and the quantized coefficients on the basis of the Parseval theorem. The spatial complexity is then evaluated using the QP and the scale parameters. Meanwhile, the temporal complexity is obtained from weighted motion vectors (MV), taking into account the different extents of temporal masking in high-motion and low-motion regions. Both characteristics of the video content are extracted from the compressed bitstream without resorting to complete decoding. Using this content-related information, the proposed model is able to adapt to different video contents. Experimental results show that the CABL model significantly outperforms the P.1202.1 model and other coding distortion assessment models in terms of widely used performance criteria, including the Pearson Correlation Coefficient (PCC), the Spearman Rank Order Correlation Coefficient (SROCC), the Root-Mean-Squared Error (RMSE), and the Outlier Ratio (OR).
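
Three of the performance criteria named above can be computed as in the following sketch, which uses their textbook definitions on paired lists of predicted and subjective scores. The outlier ratio is omitted because it depends on per-sequence confidence intervals that the abstract does not describe. Function names are illustrative, and any score mapping applied before computing RMSE in the paper is not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank order correlation (SROCC): PCC computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(predicted, target):
    """Root-mean-squared error between predictions and targets."""
    n = len(predicted)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)
```

PCC measures linear agreement, SROCC measures monotonic agreement, and RMSE measures absolute prediction error, which is why quality models are usually reported on all three.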

14.
Compared with other existing video coding standards, H.264/AVC achieves a significant improvement in compression performance. A robust criterion named rate distortion optimization (RDO) is employed to select the optimal coding modes and motion vectors for each macroblock (MB), which achieves a high compression ratio but greatly increases complexity and computational load. In this paper, a fast mode decision algorithm for H.264/AVC intra prediction based on integer transform and an adaptive threshold is proposed. Before intra prediction, integer transform operations are executed on the original image to find the directions of local textures. According to these directions, only a small subset of the possible intra prediction modes is tested in the RDO calculation at the first step. If the minimum mean absolute error (MMAE) of the reconstructed block corresponding to the best mode is smaller than an adaptive threshold that depends on the quantization parameter (QP), the RDO calculation is terminated. Otherwise, more possible modes need to be tested. The adaptive threshold aims to balance compression performance against computational load. Simulation results with various video sequences show that the proposed fast mode decision algorithm significantly accelerates encoding with only negligible PSNR loss or bit rate increase. This work is supported in part by the China National Natural Science Foundation (CNSF) under Project No. 60572045, the Ministry of Education of China Ph.D. Program Foundation under Project No. 20050698033, and by a Cooperation Project (2005.7–2007.7) with Microsoft Research Asia.

15.
The latest international video coding standard, H.264/AVC, achieves significantly better coding performance than prior video coding standards such as MPEG-2 and H.263, which have been widely used in today’s digital video applications. To provide interoperability between different coding standards, this paper proposes an efficient architecture for MPEG-2/H.263/H.264/AVC to H.264/AVC intra frame transcoding that uses original information such as discrete cosine transform (DCT) coefficients and coded mode type. Low-frequency components of the DCT coefficients and a novel rate distortion cost function are used to select a set of candidate modes for the rate distortion optimization (RDO) decision. For H.263 and H.264/AVC, a mode refinement scheme based on coded mode information is used to eliminate unlikely modes before the RDO mode decision. The experimental results, obtained on JM12.2 with fast C8MB mode decision, reveal that on average 58%, 59%, and 60% of computation (re-encoding) time can be saved for MPEG-2, H.263, and H.264/AVC to H.264/AVC intra frame transcoding, respectively, while preserving good coding performance compared with complex cascaded pixel domain transcoding (CCPDT); the average saving is 88% (a speed-up factor of 8) compared with CCPDT without fast C8MB. The proposed algorithm for H.264/AVC homogeneous transcoding is also compared with simple cascaded pixel domain transcoding (with original mode reuse). The results of this comparison indicate that the proposed algorithm significantly outperforms the mode reuse algorithm in coding performance, with only slightly higher computation.

16.
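A structural-similarity-based mechanism for improving the subjective rate-distortion performance of H.264   Total citations: 1, self-citations: 0, citations by others: 1
H.264 uses objective distortion as the distortion criterion for rate control (RC) and rate-distortion optimized (RDO) mode selection, and therefore cannot achieve optimal subjective quality. Building on the authors' previous work, this paper uses structural similarity (SSIM) based subjective distortion to guide RDO-based inter mode selection in H.264, and further proposes a macroblock (MB) level adaptive analytical Lagrange multiplier to better balance bit rate against SSIM distortion. Experimental results show that, at a given target bit rate, the proposed algorithm encodes image structure information more effectively than both an objective-quality-based coding algorithm and an SSIM-based RC algorithm (without SSIM-based RDO inter prediction), yielding better subjective rate-distortion performance and subjective image quality.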

17.
A feature fusion approach is presented to extract the region of interest (ROI) from stereoscopic video. Based on the human visual system (HVS), the depth feature, the color feature, and the motion feature are chosen as visual features. The algorithm proceeds as follows. First, color saliency is calculated at the superpixel scale. The color space distribution of each superpixel and the color difference between the superpixel and background pixels are used to describe color saliency, and the color salient region is detected. Then, the classic visual background extractor (ViBe) algorithm is improved in terms of the update interval and update region of the background model. The update interval is adjusted according to the image content. The update region is determined through non-obvious movement region and background point detection. The motion region of the stereoscopic video is thus extracted using the improved ViBe algorithm. The depth salient region is detected by selecting the region with the highest gray value. Finally, the three regions are fused into the final ROI. Experimental results show that the proposed method can extract the ROI from stereoscopic video effectively. To further verify the proposed method, a stereoscopic video coding application is also carried out on the joint model (JM) encoder with different bit allocation in the ROI and the background region.

18.
Rate control (RC) plays a crucial role in controlling compression bit rates and encoding quality for networked video applications. In this research, we propose a new total variation (TV) based frame layer rate control algorithm for H.264/AVC. One of its novelties is that a total variation measure, used in the image processing field, is proposed to describe encoding distortion in video compression. For intra frames, we present a TV distortion–quantization (DTVQstep) model to obtain an accurate quantization step size (Qstep). Using the TV measure to represent frame complexity, we also present an analytic model to calculate Qstep for the initial frame and develop an effective scene change detection method. In addition, an incomplete derivative proportional integral derivative (IDPID) buffer controller is proposed to reduce the deviation between the current buffer fullness and the target buffer fullness and to minimize buffer overflow and underflow. Extensive experimental results show that, compared with JVT-W042, the proposed algorithm achieves more accurate target bit rates, reduces frame skipping, decreases quality fluctuation, and improves the overall coding quality.
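
An incomplete-derivative PID controller differs from a plain PID in that the derivative term is passed through a first-order low-pass filter, which damps the response to sudden changes in the error. The sketch below shows the textbook discrete form applied to a buffer fullness error. The class name, the gains, and the filter coefficient `alpha` are assumptions; the abstract does not give the controller equations used in the paper, nor how the controller output is mapped to a bit budget.

```python
class IncompleteDerivativePID:
    """Discrete PID whose derivative term is low-pass filtered.

    alpha in [0, 1) is the filter coefficient; alpha = 0 gives a plain PID.
    """

    def __init__(self, kp, ki, kd, alpha):
        if not 0.0 <= alpha < 1.0:
            raise ValueError("alpha must be in [0, 1)")
        self.kp, self.ki, self.kd, self.alpha = kp, ki, kd, alpha
        self.integral = 0.0
        self.prev_error = 0.0
        self.d_term = 0.0

    def update(self, target_fullness, current_fullness):
        """Return the control output for one frame interval."""
        error = target_fullness - current_fullness
        self.integral += error
        raw_derivative = self.kd * (error - self.prev_error)
        # First-order low-pass filter on the derivative contribution.
        self.d_term = self.alpha * self.d_term + (1.0 - self.alpha) * raw_derivative
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.d_term
```

In a rate controller, the output would typically be used as a correction to the bit budget of the next frame, so that a buffer running above its target reduces the budget and one running below it increases the budget.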

19.
In some image/video applications, a variable bit rate image/video bitstream is transmitted over a constant bit rate transmission channel, for which a channel buffer is employed. In this study, a new rate control scheme for H.263 video transmission is proposed. Three techniques are proposed: preprocessing the INTRA coded macroblocks (MBs) using the just-noticeable-distortion (JND) concept, constructing a bit estimation model in the frequency domain (instead of the spatial-domain bit estimation model employed in Test Model Near-term version 11 (TMN11)), and adjusting the quantization parameter (QP) for each MB by a Lagrangian optimization strategy. In the proposed approach, the target number of bits for a video frame is first obtained by using a simple rate control procedure for the frame layer. The proposed JND preprocessing is applied to all INTRA coded MBs so that the number of coded bits for a scene-change frame decreases without any perceptual loss. Within the MB layer, instead of the spatial-domain bit estimation model employed in TMN11, a bit estimation model in the frequency domain, depending directly on discrete cosine transform coefficients, is proposed. The Lagrange multiplier is then used to determine the optimal QP for each MB. The resulting QP and number of coded bits of the current MB are fed back to update the parameters of the bit estimation model. Based on the simulation results obtained in this study, the proposed approach meets the target bit rate more accurately, keeps a lower channel buffer fullness (delay), and has a higher average frame rate than TMN11, whereas its peak signal-to-noise ratio and processing time are approximately as good as those of TMN11.

20.
In this study, a spatiotemporal saliency detection and salient region determination approach for H.264 videos is proposed. After Gaussian filtering in the Lab color space, the phase spectrum of the Fourier transform is used to generate the spatial saliency map of each video frame. In addition, the motion vector fields from each H.264 compressed video bitstream are backward accumulated. After normalization and global motion compensation, the phase spectrum of the Fourier transform for the moving parts is used to generate the temporal saliency map of each video frame. The spatial and temporal saliency maps of each video frame are then combined by adaptive fusion to obtain its spatiotemporal saliency map. Finally, a modified salient region determination scheme is used to determine the salient regions (SRs) of each video frame. Based on the experimental results obtained in this study, the proposed approach performs better than the two comparison approaches.
