1.
Video text is closely related to video content, and text information can facilitate content-based video analysis, indexing, and retrieval. Since video sequences are usually compressed before storage and transmission, a basic step for text-based applications is text detection and localization. In this paper, an overlaid text detection and localization method is proposed for H.264/AVC compressed videos using the integer discrete cosine transform (DCT) coefficients of intra frames. The main contributions are twofold: 1) coarse detection of text blocks using thresholds adaptive to block size and quantization parameter; 2) text line localization according to the characteristics of text in intra frames of the H.264/AVC compressed domain. Comparisons are made with a pixel-domain text detection method for H.264/AVC compressed video. Detection results on five H.264/AVC video sequences at various qualities show the effectiveness of the proposed method.
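The coarse detection step described above can be sketched as an AC-energy test with a quantization-parameter-adaptive threshold. This is an illustrative reading of the idea, not the paper's exact rule; `base_thresh` and `qp_gain` are made-up constants.

```python
# Illustrative sketch: flag an intra block as a candidate text block when the
# energy of its AC integer-DCT coefficients exceeds a threshold that grows
# with the quantization parameter (QP). Constants are assumptions.

def ac_energy(block_coeffs):
    """Sum of squared AC coefficients; block_coeffs[0] is the DC term."""
    return sum(c * c for c in block_coeffs[1:])

def is_text_candidate(block_coeffs, qp, base_thresh=500.0, qp_gain=25.0):
    """QP-adaptive decision: coarser quantization requires a higher threshold."""
    return ac_energy(block_coeffs) > base_thresh + qp_gain * qp

# Example: a flat background block vs. a high-contrast (text-like) block.
flat = [120, 1, 0, -1, 0, 0, 0, 0]
busy = [120, 40, -35, 28, -22, 15, -9, 6]
```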
2.
W. A. C. Fernando 《Multimedia Tools and Applications》2006,28(3):301-320
This paper addresses an important area in video processing, namely compressed-domain processing. For video indexing, scene transition detection is an essential step in segmenting a video. Current techniques for scene change detection suffer from a major limitation: most of them cannot identify scene transitions in the compressed domain. Since most video is expected to be stored in compressed form, scene transition detection in this domain is highly desirable. This paper proposes a video scene change detection algorithm that overcomes this limitation. The scheme uses properties of the B frames, which can measure the correlation between two adjacent reference frames. The results show that this scheme performs better than schemes based on P frames. The proposed scheme can be applied directly to compressed data with minimal decompression; it is therefore computationally efficient and makes real-time implementation possible. Results show that video scene transitions can be identified satisfactorily with the proposed scheme.
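The B-frame idea above can be sketched as follows: when a scene change falls between the two reference frames of a B frame, macroblocks stop being predicted from one of the references, so the balance of forward- vs. backward-predicted macroblocks is a cheap correlation measure. The 0.2 ratio threshold is an assumption for illustration, not a value from the paper.

```python
# Hedged sketch: detect a scene change between a B frame's two references
# from the imbalance of its macroblock prediction directions.

def scene_change_between_refs(mb_modes, min_ratio=0.2):
    """mb_modes: per-macroblock mode, 'F' (forward), 'B' (backward),
    'D' (bidirectional), or 'I' (intra). A strong imbalance between forward
    and backward prediction suggests the references belong to different scenes."""
    fwd = mb_modes.count('F')
    bwd = mb_modes.count('B')
    total = max(fwd + bwd, 1)
    return min(fwd, bwd) / total < min_ratio

# Toy macroblock maps: a normal B frame vs. one straddling a cut.
normal = ['F'] * 40 + ['B'] * 35 + ['D'] * 20 + ['I'] * 5
cut = ['F'] * 70 + ['B'] * 3 + ['I'] * 27
```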
3.
Gaobo Yang Weiwei Chen Qiya Zhou Zhaoyang Zhang 《Journal of Real-Time Image Processing》2009,4(4):303-316
This paper presents a compressed-domain moving object extraction algorithm based on optical flow approximation for MPEG-2 video streams. The discrete cosine transform (DCT) coefficients of P and B frames are estimated from their motion vectors and the DCT coefficients of I frames, which can be extracted directly from the MPEG-2 compressed domain, to reconstruct a DC + 2AC image. An initial optical flow is estimated with Black's optical flow estimation framework, in which the DC image is replaced by the DC + 2AC image to provide more intensity information. A high-confidence measure is exploited to generate a dense and accurate motion vector field by removing noisy and false motion vectors. Global motion estimation and iterative rejection are further used to separate foreground from background motion vectors. Region growing with automatic seed selection is performed to extract an accurate object boundary using a motion consistency model, and the boundary is further refined by partially decoding the boundary blocks. Experimental results on several test sequences demonstrate that the proposed approach achieves compressed-domain video object extraction for MPEG-2 streams in CIF format with real-time performance.
4.
Many advanced video applications require processing compressed video signals. For compression systems using the discrete cosine transform (DCT) with motion compensation, we propose an algorithm with adaptive low-pass filtering (ALPF) to reconstruct blocks with motion vectors in the compressed domain. Compared with previous work, this algorithm is faster and reduces the blocky artifacts caused by quantization.
5.
To further improve the robustness of video watermarking, an improved video zero-watermarking algorithm based on the pseudo-three-dimensional discrete cosine transform (3D-DCT) is proposed. The algorithm first selects key frames using the inter-frame Euclidean distance, then obtains the moving objects in the key frames by three-frame differencing. The moving-object image of each key frame is divided into blocks, and a pseudo-3D-DCT is applied to the block containing the geometric centroid of the moving object. The transformed AC coefficients are used to construct the video's feature sequence, and a watermark encrypted with a chaotic map is embedded into this sequence to construct the zero watermark, which is finally registered in a database. Experiments show that the algorithm achieves strong imperceptibility and robustness against attacks such as noise, filtering, frame cropping, rotation, and scaling.
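The three-frame differencing step used above to isolate the moving object can be sketched minimally as follows. Frames are grayscale images as nested lists; the difference threshold of 20 is an illustrative assumption.

```python
# Minimal three-frame differencing sketch: a pixel is marked as moving when it
# differs from BOTH the previous and the next frame (logical AND of two masks).

def three_frame_diff(prev, curr, nxt, thresh=20):
    """Return a binary motion mask for the current frame."""
    h, w = len(curr), len(curr[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d1 = abs(curr[y][x] - prev[y][x]) > thresh
            d2 = abs(curr[y][x] - nxt[y][x]) > thresh
            mask[y][x] = 1 if (d1 and d2) else 0
    return mask

# Toy example: a bright "object" moves one column to the right per frame.
f0 = [[0, 200, 0, 0]]
f1 = [[0, 0, 200, 0]]
f2 = [[0, 0, 0, 200]]
```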
6.
Effective wipe detection in MPEG compressed video using macro block type information
Soo-Chang Pei Yu-Zuong Chou 《Multimedia, IEEE Transactions on》2002,4(3):309-319
For video scene analysis, the wipe transition is considered the most complex and difficult to detect. In this paper, an effective wipe detection method is proposed using the macroblock (MB) information of MPEG compressed video. By analyzing the prediction directions of B frames, which are revealed in the MB types, the scene change region of each frame can be found. Once the accumulated scene change regions cover most of the frame area, the sequence is considered a motionless wipe transition. In addition, uncommon intra-coded MBs in B frames can serve as an indicator of a motion wipe transition. A very simple analysis based on a small amount of MB type information is sufficient to achieve wipe detection directly on MPEG compressed video. Easy extraction of MB type information, a low-complexity analysis algorithm, and robustness to arbitrary shapes and directions of wipe transitions are the great advantages of the proposed method.
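The accumulation idea can be sketched as a running union of per-frame scene-change regions: a wipe is declared once the union covers most of the frame. The 0.9 coverage threshold is an assumption for illustration.

```python
# Hedged sketch: accumulate per-frame scene-change macroblock regions and
# declare a wipe when their union covers most of the frame.

def detect_wipe(change_regions, frame_mbs, coverage=0.9):
    """change_regions: list of sets of macroblock indices (one set per frame).
    Returns True when the accumulated union reaches `coverage` of the frame."""
    seen = set()
    for region in change_regions:
        seen |= region
        if len(seen) >= coverage * frame_mbs:
            return True
    return False

# A wipe sweeping left to right across a frame of 10 macroblocks,
# versus localized motion jitter that never covers the frame.
wipe = [{0, 1}, {2, 3}, {4, 5}, {6, 7}, {8, 9}]
jitter = [{3}, {3, 4}, {4}]
```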
7.
This paper proposes a novel scene analysis algorithm based on the three-dimensional discrete wavelet transform (3D DWT). Based on the correlation among adjacent frames, video frames can be classified into four categories: abrupt scene transition, motion scene, gradual scene transition, and static scene, ranked from low to high by the strength of that correlation. By investigating the particular temporal and spatial distribution of each category, the correlation among adjacent frames can be described by statistical features of the 3D DWT coefficients: the energy of the high-frequency coefficient difference, the sum of high-frequency coefficient magnitudes, and the difference of low-frequency coefficient magnitudes. The energy of the high-frequency coefficient difference is first used to detect abrupt scene transitions, including cuts and flashlights. All three features are then fed to an SVM to analyze the remaining scenes and detect gradual scene transitions such as dissolves and fades. Experimental results show the method to be effective for both abrupt and gradual scene transition detection.
8.
This paper presents a new frame-skipping transcoding approach for video combiners in multipoint video conferencing. Transcoding is regarded as a process of converting a previously compressed video bitstream into a lower-bitrate bitstream. A high transcoding ratio may result in unacceptable picture quality when the incoming bitstream is transcoded at the full frame rate. Frame skipping is often used as an efficient scheme to allocate more bits to representative frames, so that an acceptable quality can be maintained for each frame. However, a skipped frame must still be decompressed completely, since it acts as the reference frame for reconstructing the following non-skipped frame. The newly quantized DCT coefficients of the prediction error must then be recomputed for the non-skipped frame with reference to the previous non-skipped frame; this creates undesirable complexity in real-time applications and introduces re-encoding error. A new frame-skipping transcoding architecture with improved picture quality and reduced complexity is proposed. The architecture operates mainly in the discrete cosine transform (DCT) domain to achieve a low-complexity transcoder. Re-encoding error is avoided at the frame-skipping transcoder when the strategy of direct summation of DCT coefficients is employed. By using the proposed frame-skipping transcoder and dynamically allocating more frames to the active participants in video combining, the peak signal-to-noise ratio (PSNR) of the subsequences becomes more uniform and the video quality of the active subsequences improves significantly.
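Direct summation of DCT coefficients works because the DCT is linear: DCT(r1 + r2) = DCT(r1) + DCT(r2), so residuals of skipped frames can be merged in the transform domain without decoding back to pixels. A minimal 1-D DCT-II in pure Python demonstrates the identity (an illustration of the principle, not the transcoder itself).

```python
import math

def dct2(x):
    """Unnormalized 1-D DCT-II of a list of samples."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
            for k in range(n)]

# Two residual signals, e.g. prediction errors of two consecutive frames.
r1 = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
r2 = [1.0, 2.0, -3.0, 4.0, 0.0, -2.0, 5.0, -1.0]

# Summing in the pixel domain then transforming...
summed_pixels = dct2([a + b for a, b in zip(r1, r2)])
# ...equals summing the DCT coefficients directly (linearity).
summed_coeffs = [a + b for a, b in zip(dct2(r1), dct2(r2))]
```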
9.
A detection algorithm for multiplicatively embedded image watermarks in the DCT domain
Most current watermarking algorithms detect the watermark by linear correlation, but this approach is problematic when the host signal does not follow a Gaussian distribution or when the watermark is not additively embedded in the protected media. The imperceptibility constraint on digital watermarks makes watermark detection a weak-signal detection problem. Exploiting this property, the statistical distribution of the AC DCT (discrete cosine transform) coefficients of an image is first modeled with a generalized Gaussian distribution. Watermark detection is then formulated as a binary hypothesis test, and an optimized detector for multiplicatively embedded watermarks is derived from the theory of weak-signal detection in non-Gaussian noise. Experiments show that the proposed detector performs well for blind detection of multiplicative watermarks with unknown embedding strength, making it practical for copyright protection of digital media.
10.
Wavelet domain-based video noise reduction using temporal discrete cosine transform and hierarchically adapted thresholding
A novel spatio-temporal filter for video denoising, which operates entirely in the wavelet domain, is proposed. For effective noise reduction, the spatial and temporal redundancies that exist in the wavelet-domain representation of a video signal are exploited. First, a 2D discrete wavelet transform is applied to the input noisy frames. This is followed by a discrete cosine transform (DCT), which is applied to the temporal subband coefficients to minimise the redundancy among consecutive frames. The DCT-transformed, noise-free coefficients in the different wavelet-domain subbands of the original image sequence are modelled with a generalised Gaussian prior. On the basis of this prior, the noisy wavelet coefficients in each subband are filtered using a new, low-complexity wavelet shrinkage method, which exploits the correlation between subsequent resolution levels. Experimental results show that the proposed scheme outperforms several state-of-the-art spatio-temporal filters in terms of both peak signal-to-noise ratio and visual quality.
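The shrinkage step can be sketched as classic soft-thresholding of the noisy subband coefficients. The paper's actual rule adapts the threshold across resolution levels; the per-level scaling below is only a crude stand-in for that, and the constants are assumptions.

```python
# Hedged sketch of wavelet shrinkage with a level-dependent threshold.

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero out anything below t."""
    return [max(abs(c) - t, 0.0) * (1 if c >= 0 else -1) for c in coeffs]

def shrink_by_level(subbands, base_t=2.0):
    """Finer levels (later in the list) get a smaller threshold, a crude
    stand-in for the hierarchically adapted thresholding in the paper."""
    return [soft_threshold(band, base_t / (lvl + 1))
            for lvl, band in enumerate(subbands)]
```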
11.
In this paper, a new algorithm is proposed for forgery detection in MPEG videos using spatial- and time-domain analysis of the quantization effect on the DCT coefficients of I frames and the residual errors of P frames. The algorithm consists of three modules: double compression detection, malicious tampering detection, and decision fusion. The double compression detection module uses spatial-domain analysis of the first-significant-digit distribution of DCT coefficients in I frames to distinguish single- from double-compressed videos with an SVM classifier. Double compression does not necessarily imply malicious tampering, so the malicious tampering detection module uses time-domain analysis of the quantization effect on the residual errors of P frames to identify inter-frame forgery such as frame insertion or deletion. Finally, the decision fusion module classifies input videos into three categories: single-compressed videos, double-compressed videos without malicious tampering, and double-compressed videos with malicious tampering. Experimental results and comparisons with other methods show the efficiency of the proposed algorithm.
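The first-significant-digit feature above can be sketched as a digit histogram of the nonzero DCT coefficients: for single-compressed video it tends to follow Benford's law, and requantization disturbs it. The SVM on top of this feature is omitted; the distance measure is one illustrative choice.

```python
import math

def first_digit(n):
    """First significant decimal digit of a nonzero integer."""
    n = abs(n)
    while n >= 10:
        n //= 10
    return n

def first_digit_histogram(coeffs):
    """Normalized histogram of first digits 1-9 over nonzero coefficients."""
    counts = [0] * 9
    nonzero = [c for c in coeffs if c != 0]
    for c in nonzero:
        counts[first_digit(c) - 1] += 1
    total = max(len(nonzero), 1)
    return [c / total for c in counts]

def benford_distance(hist):
    """L1 distance to the Benford distribution P(d) = log10(1 + 1/d)."""
    benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
    return sum(abs(h - b) for h, b in zip(hist, benford))
```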
12.
13.
Most video data is stored and transmitted in compressed form, and segmenting video objects directly in the compressed domain avoids complex computations such as motion estimation, making it fast. This paper proposes a gradient-model-based moving object segmentation algorithm for the MPEG compressed domain. First, the DCT coefficients (AC[1] and AC[8]) are used to obtain the edges of all objects; this edge map is then combined with edge motion information derived from accumulated motion vectors to obtain the edges of the moving objects of interest. Simulation results show that the algorithm achieves satisfactory segmentation quality.
14.
15.
16.
As the H.264/AVC compression standard becomes more widely used, shot segmentation in the H.264/AVC compressed domain has become a hot topic in video retrieval. Based on the observation that the correlation between the frames on either side of a shot boundary is low, frame-level statistics of macroblock prediction modes are collected to obtain a candidate set of shot boundaries, which is then filtered using local features to obtain the final shot boundaries. Experimental results show that the algorithm is fast and effective.
17.
To detect shot boundaries directly from the H.264 bitstream, a detection method is proposed that combines multiple H.264 compressed-domain features with a biased SVM (a support vector machine for imbalanced data). Frame types, macroblock types, motion vectors, and intra-prediction modes are analyzed to derive features of abrupt and gradual shot transitions. Because shot boundary frames are far fewer than total video frames, the biased SVM classifies frames into abrupt-transition frames, gradual-transition frames, and non-boundary frames. Experimental results on TRECVID video sets show that the algorithm outperforms other H.264 compressed-domain algorithms.
18.
Scene change detection in H.264/AVC QCIF video is a challenging research topic in video applications, and macroblock information alone rarely yields satisfactory results. This paper therefore proposes an H.264/AVC scene change detection algorithm based on a dynamic threshold and AC similarity. Taking information such as coding prediction modes and DCT coefficients into account, two decision criteria are introduced: a dynamic threshold based on the number of coded bits, and an AC-image similarity measure based on AC energy. Experimental results show improved detection rates on videos with fast motion and frequent scene changes, outperforming existing algorithms.
19.
Soo-Chang Pei Yu-Zuong Chou 《Multimedia, IEEE Transactions on》1999,1(4):321-333
Efficient indexing methods are required to handle the rapidly increasing amount of visual information within video databases. Video analysis that partitions the video into clips or extracts interesting frames is an important preprocessing step for video indexing. We develop a novel method for video analysis using the macroblock (MB) type information of MPEG compressed video bitstreams. The method exploits the comparison operations performed in the motion estimation procedure, which leave specific signatures in the MB type information when scene changes occur or special effects are applied. Only a simple analysis of frame MB types is needed to achieve very fast scene change, gradual transition, flashlight, and caption detection. The advantages of this approach are direct extraction from the MPEG bitstream after VLC decoding, very low-complexity analysis, frame-level detection accuracy, and high sensitivity.
20.
Video summarization has great potential to enable rapid browsing and efficient video indexing in many applications. In this study, we propose a novel compact yet rich key frame creation method for compressed-video summarization. First, we extract the DC coefficients of I frames directly from the compressed video stream, and DC-based mutual information is computed to segment the long video into shots. Then, we select shots with a static background and a moving object according to the intensity and range of the motion vectors in the stream. After detecting moving-object outliers in each selected shot, the optimal object set is selected by importance ranking and by solving an optimization problem. Finally, we apply an improved KNN matting approach to the optimal object outliers to automatically and seamlessly splice them into the final key frame as the video summary. Previous video summarization methods typically select one or more frames from the original video as the summary; such key-frame representations eliminate the time axis and lose the dynamic aspect of the video scene. The proposed summarization preserves compactness while conveying considerably richer information than previous summaries. Experimental results indicate that the proposed key frame representation not only includes abundant semantics but is also natural, satisfying user preferences.
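The shot-segmentation step above can be sketched as mutual information between the DC-coefficient (thumbnail) images of consecutive frames, computed from a joint histogram; a low MI suggests a shot boundary. The bin count and value range are illustrative assumptions.

```python
import math

def mutual_information(img_a, img_b, bins=8, max_val=256):
    """MI (in bits) between two equal-length lists of DC values in [0, max_val)."""
    n = len(img_a)
    scale = bins / max_val
    joint = {}
    pa = [0] * bins
    pb = [0] * bins
    for a, b in zip(img_a, img_b):
        i, j = int(a * scale), int(b * scale)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        pa[i] += 1
        pb[j] += 1
    mi = 0.0
    for (i, j), c in joint.items():
        pij = c / n
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), with counts folded in.
        mi += pij * math.log2(pij * n * n / (pa[i] * pb[j]))
    return mi

# Identical DC images share maximal information; an uncorrelated pattern shares none.
x = [0, 0, 0, 0, 255, 255, 255, 255]
y = [0, 255, 0, 255, 0, 255, 0, 255]
```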