Similar Documents
20 similar documents found (search time: 31 ms)
1.
We present a novel and practical way to integrate computer-vision techniques into low bit-rate coding systems for video teleconferencing. Our focus is to locate and track the faces and selected facial features of people in typical head-and-shoulders video sequences, and to exploit the location information in a 'classical' video coding/decoding system. The motivation is to enable the system to encode different image areas selectively and to produce perceptually pleasing coded images in which faces are sharper. We refer to this approach, a mix of classical waveform coding and model-based coding, as model-assisted coding. We propose two fully automatic algorithms that, respectively, detect a head outline and identify an 'eyes-nose-mouth' region, both from downsampled, binary-thresholded edge images. The algorithms operate accurately and robustly, even under significant head rotation or partial occlusion by moving objects. We show how information about face and facial-feature location can be exploited by low bit-rate waveform-based video coders. In particular, we describe a method of object-selective quantizer control in a standard coding system based on the motion-compensated discrete cosine transform (CCITT Recommendation H.261). The approach rests on two novel algorithms, buffer rate modulation and buffer size modulation. By forcing the rate-control algorithm to transfer a fraction of the total available bit-rate from the coding of the non-facial area to that of the facial area, the coder produces images with better-rendered facial features: coding artefacts in the facial area are less pronounced and eye contact is preserved. The improvement was found to be perceptually significant on video sequences coded at the ISDN rate of 64 kbps, with 48 kbps for the input (color) video signal in QCIF format.
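The buffer-modulation idea, shifting part of the bit budget from the background to the face area, can be sketched in a few lines. This is only an illustrative toy with a linear bits-per-area model; the function name, the `transfer` fraction and the model itself are assumptions, not the H.261 rate-control equations from the paper.

```python
def modulate_rate(total_bits, face_area_fraction, transfer=0.3):
    """Split a frame's bit budget between facial and non-facial regions.

    A fraction `transfer` of the bits nominally owned by the background
    is moved to the face area, which in a real coder would translate
    into a finer quantizer step for facial macroblocks. All names and
    the linear model are illustrative assumptions.
    """
    face_bits = total_bits * face_area_fraction
    bg_bits = total_bits - face_bits
    moved = transfer * bg_bits
    return face_bits + moved, bg_bits - moved
```

For a 48 kbps QCIF frame budget with a face covering 20% of the frame, the face region ends up with more than its proportional share while the total budget is preserved.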

2.
3.
Scalable low bit-rate video coding is vital for the transmission of video signals over wireless channels, and a scalable model-based video coding scheme is proposed in this paper to achieve it. The paper mainly addresses automatic scalable face-model design. First, a robust and adaptive face segmentation method is proposed, based on piecewise skin-colour distributions; 43 million skin pixels from 900 images are used to train the skin-colour model, which identifies skin-colour pixels reliably under different lighting conditions. Next, reliable algorithms are proposed for detecting the eyes, mouth and chin, which are used to verify face candidates. Then, based on the detected facial features and the muscular distribution of the human face, a heuristic scalable face model is designed to represent the rigid and non-rigid motion of the head and facial features. A novel motion-estimation algorithm estimates the object-model motion hierarchically. Experimental results illustrate the performance of the proposed facial-feature detection algorithms and the accuracy of the designed scalable face model in representing face motion.
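As a rough stand-in for the trained piecewise skin-colour model, a fixed-threshold Cb/Cr classifier shows the shape of the per-pixel decision. The threshold box (Cb in [77, 127], Cr in [133, 173]) is a common literature choice, not the distribution learned from the 43 million training pixels.

```python
import numpy as np

def skin_mask(rgb):
    """Classify pixels as skin by simple Cb/Cr box thresholds.

    rgb: (..., 3) array of R, G, B values in [0, 255]. Returns a
    boolean mask. A crude fixed-threshold stand-in for the paper's
    learned piecewise skin-colour distributions.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Standard RGB -> YCbCr chrominance conversion (ITU-R BT.601)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```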

4.
We propose a key-frame extraction algorithm for action videos that combines human pose estimation with the tracking of specific body parts. First, single-frame pose estimation based on an articulated, flexible-parts human body model is made more accurate by enforcing temporal continuity for uncertain body parts; dimensionality reduction then yields discriminative motion feature vectors with strong local topological expressiveness, and a set of candidate key frames is selected by extremum detection. The ISODATA dynamic clustering algorithm, refined with initial-cluster-centre optimisation and semantics-based key-frame-set enhancement, then determines the final key frames. Experiments show that the algorithm achieves high key-frame extraction precision and recall and supports semantics-based extraction; the extracted key frames can be used for motion-video compression and for annotation and review.

5.
A simultaneous face motion tracking and expression recognition algorithm (cited by: 1)
於俊, 汪增福, 李睿. 《电子学报》(Acta Electronica Sinica), 2015, 43(2): 371-376
To address facial expression recognition from a single video with a dynamically changing background, we propose an algorithm that tracks face motion and recognises expressions simultaneously, and build a real-time system on top of it. The system works as follows: first, 3-D face motion is tracked within a particle-filter framework that combines an online appearance model with a cylindrical geometric model; static expression information is then extracted using physiological knowledge, and dynamic expression information is extracted via manifold learning; finally, during tracking, the static and dynamic information are combined to recognise the expression. Experiments show that the system performs well overall under large pose variations and rich expressions.

6.
The face detection, tracking and landmark localisation system described here was built on the VC++ 6.0 platform with OpenCV as the development tool, which substantially shortened development time. First, faces are detected with the AdaBoost algorithm; a sensible choice of feature templates enables real-time detection. Second, the tracking module uses the CAMShift algorithm: face coordinates produced by the detection module are passed to the tracker for automatic real-time tracking, multiple CAMShift trackers are instantiated to follow several faces at once, and face occlusion is handled effectively. Finally, real-time facial landmark localisation is performed with the ASM (active shape model) algorithm. Experimental results show that the system's real-time detection, tracking and landmark localisation are effective, and that it can serve as a foundation for expression analysis, affective computing and video face recognition.
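The core of the CAMShift tracker used here is a mean-shift iteration that moves the search window onto the mode of a face probability map. A minimal fixed-window-size sketch follows; full CAMShift also adapts the window size and orientation from image moments, which this sketch omits.

```python
import numpy as np

def mean_shift(prob, window, n_iter=10):
    """Mean-shift a tracking window onto the mode of a probability map.

    prob: (H, W) array of per-pixel target probabilities.
    window: (x, y, w, h) with integer top-left corner and size.
    Size stays fixed; only the window position is updated.
    """
    x, y, w, h = window
    H, W = prob.shape
    for _ in range(n_iter):
        x0, x1 = max(0, x), min(W, x + w)
        y0, y1 = max(0, y), min(H, y + h)
        patch = prob[y0:y1, x0:x1]
        m00 = patch.sum()
        if m00 == 0:
            break                     # no mass inside the window
        ys, xs = np.mgrid[y0:y1, x0:x1]
        cx = (patch * xs).sum() / m00  # centroid of the mass
        cy = (patch * ys).sum() / m00
        nx, ny = int(round(cx - w / 2)), int(round(cy - h / 2))
        if (nx, ny) == (x, y):
            break                     # converged
        x, y = nx, ny
    return x, y, w, h
```

Starting a window slightly off a bright blob pulls it onto the blob within a few iterations, which is exactly how the tracker re-centres on the face each frame.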

7.
With video compression standards such as MPEG-4, a transmission error occurs on a video-packet basis rather than on a macroblock basis. In this context, we propose a semantic error-prioritization method that determines the size of a video packet based on the importance of its contents. Video packets are kept short for important areas such as the facial area, in order to reduce the possibility of error accumulation. To facilitate the semantic error prioritization, an efficient hardware algorithm for face tracking is proposed. The increase in hardware complexity is minimal because the motion-estimation engine is efficiently re-used for face tracking. Experimental results demonstrate that the facial area is well protected by the proposed scheme.

8.
A new algorithm for moving-object extraction and tracking in object-based video coding (cited by: 4)
Automatic, fast video object extraction and tracking is a key technology in object-based video coding. This paper proposes a new algorithm for extracting and tracking moving objects. First, a binary motion-mask image is obtained from multi-frame motion information using a higher-order-statistics detection method; an improved watershed algorithm is then proposed to segment the moving region and its surroundings. Projecting the two results onto each other yields the final moving object. Finally, a new tracking algorithm is proposed that follows the object effectively. Experimental results demonstrate the effectiveness of the proposed algorithms.

9.
A real-time algorithm for affine-structure-based video compression of facial images is presented. The face undergoing motion is segmented and triangulated to yield a set of control points. The control points generated by triangulation are tracked across a few frames using an intensity-based correlation technique. For accurate motion and structure estimation, a Kalman-filter-based algorithm is used to track features on the facial image. The structure information of the control points is transmitted only during the bootstrapping stage; after that, only motion information is transmitted to the decoder, which reduces the number of motion parameters associated with the control points in each frame. The local motion of the eyes and lips is captured using local 2-D affine transformations. For real-time implementation, a quad-tree-based search technique is adopted to solve the local correlation. Any remaining reconstruction error is handled by predictive encoding. Results on real image sequences demonstrate the applicability of the method.
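The local 2-D affine warps that capture eye and lip motion can be estimated from control-point correspondences by ordinary least squares. This is the generic normal-equation fit, not the authors' exact estimator.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine transform mapping src points to dst.

    src, dst: (N, 2) arrays of corresponding points, N >= 3 and not
    collinear. Returns a 2x3 affine matrix M with dst ~ src @ A.T + t.
    """
    n = src.shape[0]
    A = np.hstack([src, np.ones((n, 1))])        # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params.T                              # 2x3: [A | t]

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to (N, 2) points."""
    return pts @ M[:, :2].T + M[:, 2]
```

With four or more non-degenerate control points the fit recovers the underlying warp exactly; with noisy tracked points it returns the least-squares best warp.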

10.
This paper presents a video moving-object detection and tracking system centred on a Nios II processor. Video is captured by a CMOS image sensor, moving objects are detected with frame differencing, a centroid tracking algorithm follows the target, and the moving object is finally displayed on a VGA monitor. Experimental results show that the system achieves the desired detection and tracking of moving targets.
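In software, the system's two core steps, frame differencing and centroid tracking, reduce to a few array operations. The threshold value here is an arbitrary illustrative choice, not the hardware design's parameter.

```python
import numpy as np

def detect_centroid(prev, curr, thresh=25):
    """Frame differencing followed by centroid computation.

    Thresholds |curr - prev| into a binary motion mask, then returns
    the (x, y) centroid of the mask, or None if nothing moved.
    """
    diff = np.abs(curr.astype(np.int32) - prev.astype(np.int32))
    mask = diff > thresh
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    return float(xs.mean()), float(ys.mean())
```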

11.
The authors describe a deformable model of the human iris, which forms part of a system for accurate offline measurement of binocular eye movements, particularly cyclotorsion (torsion), from video image sequences. At least two existing systems measure torsion from infrared video images by pupil tracking followed by cross-correlation of bandpass-filtered iris sectors. Unfortunately, pupil expansion and contraction reduce the accuracy of this method; in addition, infrared iris images typically contain very little texture, so correlation can be unreliable. A five-parameter deformable model was therefore developed for iris images taken in visible light. The model can translate (horizontal and vertical eye motion), rotate (torsion) and scale both uniformly and radially (pupil changes). A series of software simulations and hardware tests suggests that torsion measurements obtained with the model are repeatable and accurate to within 0.1°. This performance is illustrated by analysing binocular torsion during fixation on a static target; the results match previously published data from other equipment.

12.
Existing methods for measuring heart rate from face video struggle to estimate it accurately under the motion encountered in real scenes. This paper therefore proposes a new non-contact heart-rate estimation method that suppresses motion interference. First, discriminative response map fitting and the KLT tracking algorithm remove the interference caused by rigid head motion. A two-step heart-rate estimate is then computed from motion-robust chrominance features, with a spatial-gradient factor introduced to weight the spatial and frequency domains and suppress non-rigid motion. Finally, the average heart rate and the signal waveform are obtained by fusing different facial regions, yielding an accurate estimate. Experiments show that the proposed method clearly outperforms other face-video heart-rate estimation methods: it improves the agreement between the recovered waveform and the true pulse waveform, and further raises the accuracy and robustness of heart-rate estimation.
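The motion-robust chrominance feature described above is, by its description, in the spirit of the well-known CHROM combination of normalised colour channels. A minimal sketch under that assumption; the mixing weights below are the generic CHROM recipe, not values taken from the paper.

```python
import numpy as np

def chrom_signal(rgb_trace):
    """Motion-robust chrominance combination for remote pulse estimation.

    rgb_trace: (T, 3) mean R, G, B of a facial ROI per frame. Computes
    CHROM-style signals X = 3R - 2G and Y = 1.5R + G - 1.5B on
    temporally normalised channels and mixes them to cancel motion.
    """
    c = rgb_trace / rgb_trace.mean(axis=0)    # normalise each channel
    x = 3 * c[:, 0] - 2 * c[:, 1]
    y = 1.5 * c[:, 0] + c[:, 1] - 1.5 * c[:, 2]
    sy = y.std()
    alpha = x.std() / sy if sy > 0 else 0.0   # balance the two signals
    return x - alpha * y
```

The resulting 1-D signal is what a frequency-domain step would then analyse for the dominant pulse frequency.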

13.
This paper proposes a video stabilisation algorithm based on selected feature trajectories. First, feature points are extracted with an improved Harris corner detector, and foreground feature points are removed by K-means clustering. Valid feature trajectories are then obtained by using the spatial motion consistency of inter-frame feature points to reduce false matches, and their temporal motion similarity to achieve long-term tracking. Finally, an objective function that accounts for both trajectory smoothness and video-quality degradation is used to compute a set of geometric transforms for the video sequence, smoothing the feature trajectories and stabilising the video. Blank regions produced by image warping are eroded under the guidance of optical flow between the defined region of the current frame and a reference frame, and any pixels still in the blank region are filled by image stitching. Simulations show that the blank area left by the proposed method is only about 33% of that of Matsushita's method; the method remains effective for dynamic, complex scenes and multiple large moving foregrounds, and produces videos with complete content, both improving visual quality and reducing the costly boundary-repair work.
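The trajectory-smoothing objective, a fidelity term (limiting quality-degrading warps) plus a smoothness penalty, has a simple closed form in the 1-D case. The second-difference penalty and the weight `lam` below are illustrative, not the paper's exact objective.

```python
import numpy as np

def smooth_trajectory(traj, lam=10.0):
    """Smooth a 1-D trajectory by minimising
    ||s - t||^2 + lam * ||D2 s||^2, where D2 is the second-difference
    operator. The fidelity term keeps the smoothed path close to the
    original (small warps); the penalty removes jitter.
    """
    n = len(traj)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]      # second difference row
    A = np.eye(n) + lam * D.T @ D             # normal equations
    return np.linalg.solve(A, traj)
```

Because the optimum cannot have a larger objective value than the unsmoothed path itself, the smoothed trajectory is guaranteed to have no more second-difference energy (jitter) than the input.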

14.
To detect moving objects when the imaging platform itself is moving, this paper proposes a detection algorithm that proceeds from sparse feature-point motion-field estimation to motion classification. First, a sparse image motion field is recovered by fast feature-point detection and tracking. Feature points are then grouped into motion patterns according to the motion consistency between them; the scene-image reconstruction error is computed for each resulting group, and the group with the smallest reconstruction error is discarded, leaving the foreground targets detected. Simulations verify the algorithm's effectiveness for detecting moving objects in complex scenes.

15.
A robust stabilisation algorithm for jittery video (cited by: 1)
This paper designs a robust, feature-matching-based stabilisation algorithm for unstable video captured by mobile cameras. The algorithm first incorporates a brightness-adaptive model into the traditional KLT method to achieve robust feature matching. The feature set is then validated using feature-error analysis and a motion-consistency principle to improve feature reliability, and a jitter-detection method based on the reciprocating character of shake motion is proposed to avoid stabilisation errors when the video is not actually jittery. Finally, images are stabilised with an affine motion model derived from the correspondences between matched feature regions. Experimental results show that the algorithm is robust to foreground object motion and to abrupt changes in ambient lighting.

16.
For video-based expression recognition, static features cannot effectively describe how facial regions change along the time axis. This paper proposes an expression recognition method that fuses dynamic texture information with motion information. Drawing on the LBP-TOP principle, a spatio-temporal Weber local descriptor (STWLD) with spatio-temporal descriptive power is proposed to extract dynamic texture, while block-based histograms of optical flow (BHOF) describe motion; an SVM then classifies expressions from the fused texture and motion information. Cross-validation experiments on the CK+ and MMI expression databases show that the method outperforms approaches based on a single feature, and comparisons with other related methods confirm its superiority.
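The LBP codes that LBP-TOP, and hence the STWLD descriptor, extends into the spatio-temporal planes are computed per pixel from a 3x3 neighbourhood. A basic single-plane sketch:

```python
import numpy as np

def lbp_3x3(img):
    """Basic 3x3 local binary pattern codes.

    Each interior pixel gets an 8-bit code: one bit per neighbour,
    set when the neighbour is >= the centre. Border pixels are skipped.
    """
    c = img[1:-1, 1:-1]
    # 8 neighbours, ordered clockwise from the top-left
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

LBP-TOP simply repeats this computation on the XT and YT planes of the video volume and concatenates the three histograms; STWLD, per the abstract, replaces the binary comparison with a Weber-style differential excitation.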

17.
Tracking people in video is complex, and traditional algorithms have particular difficulty identifying and tracking the upper- and lower-limb regions against cluttered backgrounds. Building on the traditional Kalman-filter tracker, this paper proposes a discrete Kalman-filter human-tracking algorithm with a variable measurement covariance: after the measurement covariance is initialised, a new estimate of it is computed recursively from newly acquired observations, and tracking is performed with the discrete Kalman filter. On real video the algorithm tracks well, and it performs well at distinguishing and tracking the upper limbs, lower limbs and whole body. Compared with the traditional Kalman filter, the algorithm does not lose the tracked target, and its tracking speed keeps pace with human walking speed, roughly 1.5 m/s, making it well suited to tracking and analysing human behaviour in video.
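The variable-measurement-covariance idea can be sketched with a 1-D discrete Kalman filter whose R is re-estimated recursively from the innovations. The constant-position model, the EWMA update rule and all constants below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def kalman_track(zs, q=1e-3, r0=1.0, alpha=0.05):
    """Discrete Kalman filter with recursively re-estimated R.

    zs: 1-D array of measurements. Uses a constant-position model;
    since E[innovation^2] = p + r, each innovation gives a sample
    estimate of r, blended in with an exponential moving average.
    """
    x, p, r = float(zs[0]), 1.0, r0
    out = [x]
    for z in zs[1:]:
        p = p + q                                  # predict
        innov = z - x
        r = (1 - alpha) * r + alpha * (innov ** 2 - p)  # update R
        r = max(r, 1e-6)                           # keep R positive
        k = p / (p + r)                            # Kalman gain
        x = x + k * innov                          # correct
        p = (1 - k) * p
        out.append(x)
    return np.array(out)
```

Because the gain shrinks as the estimated measurement noise grows, the filter automatically smooths more heavily when observations get unreliable.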

18.
夏爱明, 伍雪冬. 《红外技术》(Infrared Technology), 2021, 43(5): 429-436
Traditional kernelized-correlation-filter visual trackers suffer from low accuracy under fast motion, background clutter and motion blur, and cannot handle scale changes. To address this, a real-time target-tracking algorithm based on context awareness and scale adaptation is proposed. Building on the kernelized-correlation-filter framework, it introduces context-aware and scale-adaptive methods, adding background information and handling changes in target scale. First, fHOG (fusion his...

19.
An air-to-ground target tracking algorithm based on particle filtering (cited by: 4)
宋策, 张葆, 尹传历, 王超. 《光电子·激光》(Journal of Optoelectronics·Laser), 2013, (10): 2017-2023
Large, abrupt changes in target speed cause tracking failures in air-to-ground target tracking. Based on the two-step (TS) dynamic-model framework proposed by Kristan et al., this paper analyses and models the motion characteristics of air-to-ground targets and improves the conservative component of the TS model to accommodate acceleration, proposing a two-step-with-acceleration (TSA) dynamic model suited to describing large speed variations as the dynamic model of a particle-filter (PF) tracker. The model predicts particle states accurately, so the target can be tracked robustly with relatively few particles. On air-to-ground tracking test videos, the algorithm tracks targets undergoing large speed changes stably, with a correct-tracking rate of 92%, and processes about 29 frame/s for targets of roughly 25 pixel × 30 pixel. The algorithm thus offers good robustness and real-time performance.
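A constant-acceleration prediction step shows why an acceleration-aware dynamic model lets a particle filter follow targets that change speed sharply. The random-acceleration noise model and its parameters below are illustrative, not the paper's TSA model.

```python
import numpy as np

def predict_particles(particles, dt=1.0, accel_std=2.0, rng=None):
    """Particle-filter prediction under a random-acceleration model.

    particles: (N, 4) array of states [x, y, vx, vy]. Each particle
    receives an independent random acceleration, updating both its
    position (by 0.5*a*dt^2) and its velocity (by a*dt), so the cloud
    can spread along plausible speed changes.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    a = rng.normal(0.0, accel_std, size=(particles.shape[0], 2))
    out = particles.copy()
    out[:, 0:2] += particles[:, 2:4] * dt + 0.5 * a * dt ** 2
    out[:, 2:4] += a * dt
    return out
```

Under a constant-velocity model the velocity components never move, so an accelerating target escapes the particle cloud; here the cloud's velocity spread grows each step instead.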

20.
Intelligently tracking objects with varied shapes, colors, lighting conditions and backgrounds is extremely useful in many HCI applications, such as human body motion capture, hand-gesture recognition, and virtual-reality (VR) games. However, accurately tracking different objects in uncontrolled environments is a tough challenge owing to possibly dynamic object parts, varied lighting conditions and sophisticated backgrounds. In this work, we propose a novel semantically-aware object-tracking framework whose key is a weakly-supervised learning paradigm that optimally transfers video-level semantic tags into various regions. More specifically, given a set of training video clips, each associated with multiple video-level semantic tags, we first propose a weakly-supervised learning algorithm to transfer the semantic tags into the video regions. Its core is a MIL-based (Zhong et al., 2020) [1] manifold-embedding algorithm that maps the video regions into a semantic space in which the video-level semantic tags are well encoded. Afterwards, each video region is represented by its semantic feature combined with its appearance feature, and a multi-view learning algorithm is designed to fuse these two types of features optimally. Based on the fused feature, we learn a probabilistic Gaussian mixture model to predict the target probability of each candidate window, and the window with the maximal probability is output as the tracking result. Comprehensive comparisons on a challenging pedestrian-tracking task, as well as on human hand-gesture recognition, demonstrate the effectiveness of our method. Moreover, visualized tracking results show that non-rigid objects with moderate occlusions are well localized by our method.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号