1.
Video text is closely related to video content, and text information can facilitate content-based video analysis, indexing, and retrieval. Since video sequences are usually compressed before storage and transmission, a basic step for text-based applications is text detection and localization. In this paper, an overlaid text detection and localization method is proposed for H.264/AVC compressed videos using the integer discrete cosine transform (DCT) coefficients of intra frames. The main contributions are twofold: 1) coarse detection of text blocks using thresholds adaptive to block size and quantization parameter; 2) text line localization according to the characteristics of text in intra frames of the H.264/AVC compressed domain. Comparisons are made with a pixel-domain text detection method for H.264/AVC compressed video. Detection results on five H.264/AVC video sequences at various qualities show the effectiveness of the proposed method.
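The coarse detection step described above can be sketched as an AC-energy test with a quantization-parameter-adaptive threshold. This is an illustrative reading of the idea, not the paper's exact rule; `base_thresh` and `qp_gain` are made-up constants.

```python
# Illustrative sketch: flag an intra block as a candidate text block when the
# energy of its AC integer-DCT coefficients exceeds a threshold that grows
# with the quantization parameter (QP). Constants are assumptions.

def ac_energy(block_coeffs):
    """Sum of squared AC coefficients; block_coeffs[0] is the DC term."""
    return sum(c * c for c in block_coeffs[1:])

def is_text_candidate(block_coeffs, qp, base_thresh=500.0, qp_gain=25.0):
    """QP-adaptive decision: coarser quantization requires a higher threshold."""
    return ac_energy(block_coeffs) > base_thresh + qp_gain * qp

# Example: a flat background block vs. a high-contrast (text-like) block.
flat = [120, 1, 0, -1, 0, 0, 0, 0]
busy = [120, 40, -35, 28, -22, 15, -9, 6]
```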
2.
W. A. C. Fernando 《Multimedia Tools and Applications》2006,28(3):301-320
This paper addresses an important area in video processing, namely compressed-domain processing. For video indexing, scene transition detection is an essential step in segmenting a video. Current techniques for scene change detection suffer from a major limitation: most of them cannot identify scene transitions in the compressed domain. Since most video is expected to be stored in compressed form, scene transition detection in this domain is highly desirable. This paper proposes a video scene change detection algorithm that overcomes this limitation. The scheme uses properties of the B frames, which can measure the correlation between two adjacent reference frames. The results show that this scheme performs better than schemes based on P frames. The proposed scheme can be applied directly to compressed data with minimal decompression; it is therefore computationally efficient and makes real-time implementation possible. Results show that video scene transitions can be identified satisfactorily with the proposed scheme.
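The B-frame idea above can be sketched as follows: when a scene change falls between the two reference frames of a B frame, macroblocks stop being predicted from one of the references, so the balance of forward- vs. backward-predicted macroblocks is a cheap correlation measure. The 0.2 ratio threshold is an assumption for illustration, not a value from the paper.

```python
# Hedged sketch: detect a scene change between a B frame's two references
# from the imbalance of its macroblock prediction directions.

def scene_change_between_refs(mb_modes, min_ratio=0.2):
    """mb_modes: per-macroblock mode, 'F' (forward), 'B' (backward),
    'D' (bidirectional), or 'I' (intra). A strong imbalance between forward
    and backward prediction suggests the references belong to different scenes."""
    fwd = mb_modes.count('F')
    bwd = mb_modes.count('B')
    total = max(fwd + bwd, 1)
    return min(fwd, bwd) / total < min_ratio

# Toy macroblock maps: a normal B frame vs. one straddling a cut.
normal = ['F'] * 40 + ['B'] * 35 + ['D'] * 20 + ['I'] * 5
cut = ['F'] * 70 + ['B'] * 3 + ['I'] * 27
```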
3.
Gaobo Yang Weiwei Chen Qiya Zhou Zhaoyang Zhang 《Journal of Real-Time Image Processing》2009,4(4):303-316
This paper presents a compressed-domain moving object extraction algorithm based on optical flow approximation for MPEG-2 video streams. The discrete cosine transform (DCT) coefficients of P and B frames are estimated from their motion vectors and the DCT coefficients of I frames, which can be extracted directly from the MPEG-2 compressed domain, to reconstruct a DC + 2AC image. An initial optical flow is estimated with Black's optical flow estimation framework, in which the DC image is replaced by the DC + 2AC image to provide more intensity information. A high-confidence measure is exploited to generate a dense and accurate motion vector field by removing noisy and false motion vectors. Global motion estimation and iterative rejection are further used to separate foreground from background motion vectors. Region growing with automatic seed selection is performed to extract an accurate object boundary using a motion consistency model, and the boundary is further refined by partially decoding the boundary blocks. Experimental results on several test sequences demonstrate that the proposed approach achieves compressed-domain video object extraction for MPEG-2 streams in CIF format with real-time performance.
4.
Many advanced video applications require processing compressed video signals. For compression systems using the discrete cosine transform (DCT) with motion compensation, we propose an algorithm with adaptive low-pass filtering (ALPF) to reconstruct blocks with motion vectors in the compressed domain. Compared with previous work, this algorithm is faster and reduces the blocky artifacts caused by quantization.
5.
To further improve the robustness of video watermarking, an improved video zero-watermarking algorithm based on the pseudo-three-dimensional discrete cosine transform (3D-DCT) is proposed. The algorithm first selects key frames using the inter-frame Euclidean distance, then obtains the moving objects in the key frames by three-frame differencing. The moving-object image of each key frame is divided into blocks, and a pseudo-3D-DCT is applied to the block containing the geometric centroid of the moving object. The transformed AC coefficients are used to construct the video's feature sequence, and a watermark encrypted with a chaotic map is embedded into this sequence to construct the zero watermark, which is finally registered in a database. Experiments show that the algorithm achieves strong imperceptibility and robustness against attacks such as noise, filtering, frame cropping, rotation, and scaling.
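The three-frame differencing step used above to isolate the moving object can be sketched minimally as follows. Frames are grayscale images as nested lists; the difference threshold of 20 is an illustrative assumption.

```python
# Minimal three-frame differencing sketch: a pixel is marked as moving when it
# differs from BOTH the previous and the next frame (logical AND of two masks).

def three_frame_diff(prev, curr, nxt, thresh=20):
    """Return a binary motion mask for the current frame."""
    h, w = len(curr), len(curr[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d1 = abs(curr[y][x] - prev[y][x]) > thresh
            d2 = abs(curr[y][x] - nxt[y][x]) > thresh
            mask[y][x] = 1 if (d1 and d2) else 0
    return mask

# Toy example: a bright "object" moves one column to the right per frame.
f0 = [[0, 200, 0, 0]]
f1 = [[0, 0, 200, 0]]
f2 = [[0, 0, 0, 200]]
```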
6.
Effective wipe detection in MPEG compressed video using macro block type information
Soo-Chang Pei Yu-Zuong Chou 《Multimedia, IEEE Transactions on》2002,4(3):309-319
For video scene analysis, the wipe transition is considered the most complex and difficult to detect. In this paper, an effective wipe detection method is proposed using the macroblock (MB) information of MPEG compressed video. By analyzing the prediction directions of B frames, which are revealed in the MB types, the scene change region of each frame can be found. Once the accumulated scene change regions cover most of the frame area, the sequence is considered a motionless wipe transition. In addition, uncommon intra-coded MBs in B frames can serve as an indicator of a motion wipe transition. A very simple analysis based on a small amount of MB type information is sufficient to achieve wipe detection directly on MPEG compressed video. Easy extraction of MB type information, a low-complexity analysis algorithm, and robustness to arbitrary shapes and directions of wipe transitions are the great advantages of the proposed method.
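The accumulation idea can be sketched as a running union of per-frame scene-change regions: a wipe is declared once the union covers most of the frame. The 0.9 coverage threshold is an assumption for illustration.

```python
# Hedged sketch: accumulate per-frame scene-change macroblock regions and
# declare a wipe when their union covers most of the frame.

def detect_wipe(change_regions, frame_mbs, coverage=0.9):
    """change_regions: list of sets of macroblock indices (one set per frame).
    Returns True when the accumulated union reaches `coverage` of the frame."""
    seen = set()
    for region in change_regions:
        seen |= region
        if len(seen) >= coverage * frame_mbs:
            return True
    return False

# A wipe sweeping left to right across a frame of 10 macroblocks,
# versus localized motion jitter that never covers the frame.
wipe = [{0, 1}, {2, 3}, {4, 5}, {6, 7}, {8, 9}]
jitter = [{3}, {3, 4}, {4}]
```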
7.
This paper proposes a novel scene analysis algorithm based on the three-dimensional discrete wavelet transform (3D DWT). Based on the correlation among adjacent frames, video frames can be classified into four categories: abrupt scene transition, motion scene, gradual scene transition, and static scene, ranked from low to high by the strength of that correlation. By investigating the particular temporal and spatial distribution of each category, the correlation among adjacent frames can be described by statistical features of the 3D DWT coefficients: the energy of the high-frequency coefficient difference, the sum of high-frequency coefficient magnitudes, and the difference of low-frequency coefficient magnitudes. The energy of the high-frequency coefficient difference is first used to detect abrupt scene transitions, including cuts and flashlights. All three features are then fed to an SVM to analyze the remaining scenes and detect gradual scene transitions such as dissolves and fades. Experimental results show the method to be effective for both abrupt and gradual scene transition detection.
8.
This paper presents a new frame-skipping transcoding approach for video combiners in multipoint video conferencing. Transcoding is regarded as a process of converting a previously compressed video bitstream into a lower-bitrate bitstream. A high transcoding ratio may result in unacceptable picture quality when the incoming bitstream is transcoded at the full frame rate. Frame skipping is often used as an efficient scheme to allocate more bits to representative frames, so that an acceptable quality can be maintained for each frame. However, a skipped frame must still be decompressed completely, since it acts as the reference frame for reconstructing the following non-skipped frame. The newly quantized DCT coefficients of the prediction error must then be recomputed for the non-skipped frame with reference to the previous non-skipped frame; this creates undesirable complexity in real-time applications and introduces re-encoding error. A new frame-skipping transcoding architecture with improved picture quality and reduced complexity is proposed. The architecture operates mainly in the discrete cosine transform (DCT) domain to achieve a low-complexity transcoder. Re-encoding error is avoided at the frame-skipping transcoder when the strategy of direct summation of DCT coefficients is employed. By using the proposed frame-skipping transcoder and dynamically allocating more frames to the active participants in video combining, the peak signal-to-noise ratio (PSNR) of the subsequences becomes more uniform and the video quality of the active subsequences improves significantly.
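Direct summation of DCT coefficients works because the DCT is linear: DCT(r1 + r2) = DCT(r1) + DCT(r2), so residuals of skipped frames can be merged in the transform domain without decoding back to pixels. A minimal 1-D DCT-II in pure Python demonstrates the identity (an illustration of the principle, not the transcoder itself).

```python
import math

def dct2(x):
    """Unnormalized 1-D DCT-II of a list of samples."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
            for k in range(n)]

# Two residual signals, e.g. prediction errors of two consecutive frames.
r1 = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
r2 = [1.0, 2.0, -3.0, 4.0, 0.0, -2.0, 5.0, -1.0]

# Summing in the pixel domain then transforming...
summed_pixels = dct2([a + b for a, b in zip(r1, r2)])
# ...equals summing the DCT coefficients directly (linearity).
summed_coeffs = [a + b for a, b in zip(dct2(r1), dct2(r2))]
```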
9.
A detection algorithm for multiplicatively embedded image watermarks in the DCT domain
Most current watermarking algorithms detect the watermark by linear correlation, but this approach is problematic when the host signal does not follow a Gaussian distribution or when the watermark is not additively embedded in the protected media. The imperceptibility constraint on digital watermarks makes watermark detection a weak-signal detection problem. Exploiting this property, the statistical distribution of the AC DCT (discrete cosine transform) coefficients of an image is first modeled with a generalized Gaussian distribution. Watermark detection is then formulated as a binary hypothesis test, and an optimized detector for multiplicatively embedded watermarks is derived from the theory of weak-signal detection in non-Gaussian noise. Experiments show that the proposed detector performs well for blind detection of multiplicative watermarks with unknown embedding strength, making it practical for copyright protection of digital media.
10.
Wavelet domain-based video noise reduction using temporal discrete cosine transform and hierarchically adapted thresholding
A novel spatio-temporal filter for video denoising, which operates entirely in the wavelet domain, is proposed. For effective noise reduction, the spatial and temporal redundancies that exist in the wavelet-domain representation of a video signal are exploited. First, a 2D discrete wavelet transform is applied to the input noisy frames. This is followed by a discrete cosine transform (DCT), which is applied to the temporal subband coefficients to minimise the redundancy among consecutive frames. The DCT-transformed, noise-free coefficients in the different wavelet-domain subbands of the original image sequence are modelled with a generalised Gaussian prior. On the basis of this prior, the noisy wavelet coefficients in each subband are filtered using a new, low-complexity wavelet shrinkage method, which exploits the correlation between subsequent resolution levels. Experimental results show that the proposed scheme outperforms several state-of-the-art spatio-temporal filters in terms of both peak signal-to-noise ratio and visual quality.
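The shrinkage step can be sketched as classic soft-thresholding of the noisy subband coefficients. The paper's actual rule adapts the threshold across resolution levels; the per-level scaling below is only a crude stand-in for that, and the constants are assumptions.

```python
# Hedged sketch of wavelet shrinkage with a level-dependent threshold.

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero out anything below t."""
    return [max(abs(c) - t, 0.0) * (1 if c >= 0 else -1) for c in coeffs]

def shrink_by_level(subbands, base_t=2.0):
    """Finer levels (later in the list) get a smaller threshold, a crude
    stand-in for the hierarchically adapted thresholding in the paper."""
    return [soft_threshold(band, base_t / (lvl + 1))
            for lvl, band in enumerate(subbands)]
```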
11.
In this paper, a new algorithm is proposed for forgery detection in MPEG videos using spatial- and time-domain analysis of the quantization effect on the DCT coefficients of I frames and the residual errors of P frames. The algorithm consists of three modules: double compression detection, malicious tampering detection, and decision fusion. The double compression detection module uses spatial-domain analysis of the first-significant-digit distribution of DCT coefficients in I frames to distinguish single- from double-compressed videos with an SVM classifier. Double compression does not necessarily imply malicious tampering, so the malicious tampering detection module uses time-domain analysis of the quantization effect on the residual errors of P frames to identify inter-frame forgery such as frame insertion or deletion. Finally, the decision fusion module classifies input videos into three categories: single-compressed videos, double-compressed videos without malicious tampering, and double-compressed videos with malicious tampering. Experimental results and comparisons with other methods show the efficiency of the proposed algorithm.
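The first-significant-digit feature above can be sketched as a digit histogram of the nonzero DCT coefficients: for single-compressed video it tends to follow Benford's law, and requantization disturbs it. The SVM on top of this feature is omitted; the distance measure is one illustrative choice.

```python
import math

def first_digit(n):
    """First significant decimal digit of a nonzero integer."""
    n = abs(n)
    while n >= 10:
        n //= 10
    return n

def first_digit_histogram(coeffs):
    """Normalized histogram of first digits 1-9 over nonzero coefficients."""
    counts = [0] * 9
    nonzero = [c for c in coeffs if c != 0]
    for c in nonzero:
        counts[first_digit(c) - 1] += 1
    total = max(len(nonzero), 1)
    return [c / total for c in counts]

def benford_distance(hist):
    """L1 distance to the Benford distribution P(d) = log10(1 + 1/d)."""
    benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
    return sum(abs(h - b) for h, b in zip(hist, benford))
```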
12.
13.
Most video data is stored and transmitted in compressed form, and segmenting video objects directly in the compressed domain avoids complex computations such as motion estimation, making it fast. This paper proposes a gradient-model-based moving object segmentation algorithm for the MPEG compressed domain. First, the DCT coefficients (AC[1] and AC[8]) are used to obtain the edges of all objects; this edge map is then combined with edge motion information derived from accumulated motion vectors to obtain the edges of the moving objects of interest. Simulation results show that the algorithm achieves satisfactory segmentation quality.
14.
15.
16.
As the H.264/AVC compression standard becomes more widely used, shot segmentation in the H.264/AVC compressed domain has become a hot topic in video retrieval. Based on the observation that the correlation between the frames on either side of a shot boundary is low, frame-level statistics of macroblock prediction modes are collected to obtain a candidate set of shot boundaries, which is then filtered using local features to obtain the final shot boundaries. Experimental results show that the algorithm is fast and effective.
17.
To detect shot boundaries directly from the H.264 bitstream, a detection method is proposed that combines multiple H.264 compressed-domain features with a biased SVM (a support vector machine for imbalanced data). Frame types, macroblock types, motion vectors, and intra-prediction modes are analyzed to derive features of abrupt and gradual shot transitions. Because shot boundary frames are far fewer than total video frames, the biased SVM classifies frames into abrupt-transition frames, gradual-transition frames, and non-boundary frames. Experimental results on TRECVID video sets show that the algorithm outperforms other H.264 compressed-domain algorithms.
18.
Scene change detection in H.264/AVC QCIF video is a challenging research topic in video applications, and macroblock information alone rarely yields satisfactory results. This paper therefore proposes an H.264/AVC scene change detection algorithm based on a dynamic threshold and AC similarity. Taking information such as coding prediction modes and DCT coefficients into account, two decision criteria are introduced: a dynamic threshold based on the number of coded bits, and an AC-image similarity measure based on AC energy. Experimental results show improved detection rates on videos with fast motion and frequent scene changes, outperforming existing algorithms.
19.
Soo-Chang Pei Yu-Zuong Chou 《Multimedia, IEEE Transactions on》1999,1(4):321-333
Efficient indexing methods are required to handle the rapidly increasing amount of visual information within video databases. Video analysis that partitions the video into clips or extracts interesting frames is an important preprocessing step for video indexing. We develop a novel method for video analysis using the macroblock (MB) type information of MPEG compressed video bitstreams. The method exploits the comparison operations performed in the motion estimation procedure, which leave specific signatures in the MB type information when scene changes occur or special effects are applied. Only a simple analysis of frame MB types is needed to achieve very fast scene change, gradual transition, flashlight, and caption detection. The advantages of this approach are direct extraction from the MPEG bitstream after VLC decoding, very low-complexity analysis, frame-level detection accuracy, and high sensitivity.
20.
Video summarization has great potential to enable rapid browsing and efficient video indexing in many applications. In this study, we propose a novel compact yet rich key frame creation method for compressed-video summarization. First, we extract the DC coefficients of I frames directly from the compressed video stream, and DC-based mutual information is computed to segment the long video into shots. Then, we select shots with a static background and a moving object according to the intensity and range of the motion vectors in the stream. After detecting moving-object outliers in each selected shot, the optimal object set is selected by importance ranking and by solving an optimization problem. Finally, we apply an improved KNN matting approach to the optimal object outliers to automatically and seamlessly splice them into the final key frame as the video summary. Previous video summarization methods typically select one or more frames from the original video as the summary; such key-frame representations eliminate the time axis and lose the dynamic aspect of the video scene. The proposed summarization preserves compactness while conveying considerably richer information than previous summaries. Experimental results indicate that the proposed key frame representation not only includes abundant semantics but is also natural, satisfying user preferences.
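The shot-segmentation step above can be sketched as mutual information between the DC-coefficient (thumbnail) images of consecutive frames, computed from a joint histogram; a low MI suggests a shot boundary. The bin count and value range are illustrative assumptions.

```python
import math

def mutual_information(img_a, img_b, bins=8, max_val=256):
    """MI (in bits) between two equal-length lists of DC values in [0, max_val)."""
    n = len(img_a)
    scale = bins / max_val
    joint = {}
    pa = [0] * bins
    pb = [0] * bins
    for a, b in zip(img_a, img_b):
        i, j = int(a * scale), int(b * scale)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        pa[i] += 1
        pb[j] += 1
    mi = 0.0
    for (i, j), c in joint.items():
        pij = c / n
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), with counts folded in.
        mi += pij * math.log2(pij * n * n / (pa[i] * pb[j]))
    return mi

# Identical DC images share maximal information; an uncorrelated pattern shares none.
x = [0, 0, 0, 0, 255, 255, 255, 255]
y = [0, 255, 0, 255, 0, 255, 0, 255]
```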