Similar literature
 Found 19 similar articles; search took 187 ms
1.
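The size-adaptive video text extraction algorithm is based on discrete Fourier transform (DFT) features, multiresolution processing, and support vector machine (SVM) classification. At each resolution, the algorithm combines gradient information with text-boundary localization to extract candidate text regions, and then uses an SVM to further classify the candidate image blocks by their DFT features. Results show that the algorithm extracts text of different sizes from video frames, and that its recognition rate is higher than that of algorithms using wavelet, grayscale, or discrete cosine transform (DCT) coefficients as texture features.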

2.
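News caption detection based on corner detection and adaptive thresholding   Cited by: 3 (self-citations: 2, by others: 1)
张洋, 朱明. 《计算机工程》 (Computer Engineering), 2009, 35(13): 186-187
Existing methods for extracting captions from news video frames generally have low accuracy and low detection speed, and they perform especially poorly on headline text with low resolution and low contrast. To address these problems, a caption detection method based on corner detection and adaptive thresholding is proposed. The method uses corner detection to locate the text regions in headline frames and applies a grayscale transform to them, then binarizes the regions with an adaptive threshold to obtain text images that OCR can recognize. Experiments show that the method quickly and effectively extracts low-resolution, low-contrast headline captions from news video.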
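The adaptive-threshold binarization step described above can be illustrated with a minimal sketch. The abstract does not state the paper's actual threshold rule, so the local-mean rule below, the function name `adaptive_threshold`, and the `radius` and `offset` parameters are all assumptions chosen for illustration. The corner-detection stage is not covered.

```python
def adaptive_threshold(gray, radius=1, offset=0):
    """Binarize a grayscale image (list of rows) against its local mean.

    Assumption: a pixel is marked 1 when it is brighter than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood minus `offset`, else 0.
    This is a generic local-mean rule, not the rule from the cited paper.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the window at the image border.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] > local_mean - offset else 0
    return out
```

Because the threshold follows the local mean, a faint stroke on a dark background is still separated even when a single global threshold would miss it, which is the usual motivation for adaptive thresholding on low-contrast captions.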

3.
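王祖辉, 姜维. 《计算机工程》 (Computer Engineering), 2009, 35(13): 188-189,
Existing methods for extracting captions from news video frames generally have low accuracy and low detection speed, and they perform especially poorly on headline text with low resolution and low contrast. To address these problems, a caption detection method based on corner detection and adaptive thresholding is proposed. The method uses corner detection to locate the text regions in headline frames and applies a grayscale transform to them, then binarizes the regions with an adaptive threshold to obtain text images that OCR can recognize. Experiments show that the method quickly and effectively extracts low-resolution, low-contrast headline captions from news video.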

4.
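Four-dimensional matrix discrete cosine transform coding of color video   Cited by: 8 (self-citations: 1, by others: 8)
To achieve high compression of color video at a high signal-to-noise ratio, the theory of four-dimensional matrices and the four-dimensional matrix discrete cosine transform is proposed and applied to color video coding. The coding method places multiple frames of a color video in one unified mathematical model (a four-dimensional matrix), uses the four-dimensional matrix discrete cosine transform to remove the correlations among them, and compresses the transform coefficients by vector quantization. The method thereby exploits, at the same time, the correlations between adjacent pixels, among the Y, U, and V components of the color space, and between adjacent frames. Experimental results show that, for color video sequences in applications such as videophone and video conferencing, the method achieves a relatively high compression ratio at a high signal-to-noise ratio.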

5.
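For complex scenes, the selective attention mechanism of the human visual system can quickly locate salient objects in an image without training. Combining prior knowledge of flames, this paper uses a saliency-based quaternion discrete cosine transform algorithm to detect flames in video. First, two flame color feature formulas are improved according to the particular relationship among the three color components of flames in RGB space, yielding two flame color feature maps. Next, a flame texture feature map is obtained by computing distances between the LBP feature vectors of suspected flame regions. Then, an improved count of high-frequency zero crossings is computed from the dynamic texture inside the flame and its flicker frequency, yielding a flame dynamic feature map. Finally, the four feature maps are assembled into a quaternion, and the quaternion discrete cosine transform produces the final flame saliency map. Experiments on the Bilkent University flame video library show that the method is accurate and robust, and that it outperforms the other video flame detection algorithms it was compared with.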

6.
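This paper studies the compression of fringe projection images based on the discrete cosine transform. It first introduces the fringe projection images used in three-dimensional shape measurement and explains why they need to be compressed. It then discusses the discrete cosine transform as an image compression technique and derives the conclusion that, for a fringe image, keeping only the first column or the first row of the discrete cosine transform coefficients is enough to reconstruct the image completely. This conclusion is used to build a binary mask matrix for compression, and DCT-based compression is applied to the phase map, one kind of fringe projection image. Experimental results show that the method compresses phase maps effectively, and that the object shape information demodulated from the decompressed phase map is well preserved.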
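The first-row (or first-column) result can be checked numerically on an idealized case. The sketch below assumes a synthetic fringe image whose intensity depends only on the column index, and it uses a hand-written orthonormal DCT-II; the function names, the 8x8 size, and the fringe period are illustrative assumptions, not taken from the paper. For fringes that vary along the rows instead, the first column would be kept.

```python
import math

def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(values[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out

def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def dct_2d(image):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in image]
    return transpose([dct_1d(col) for col in transpose(rows)])

def idct_2d(coeffs):
    """Separable inverse 2-D DCT."""
    rows = [idct_1d(row) for row in coeffs]
    return transpose([idct_1d(col) for col in transpose(rows)])

def keep_first_row(coeffs):
    """Apply a binary mask that keeps only the first row of coefficients."""
    return [row[:] if u == 0 else [0.0] * len(row) for u, row in enumerate(coeffs)]

def vertical_fringes(size=8, period=8):
    """Synthetic fringe image: intensity depends only on the column index."""
    row = [128 + 100 * math.cos(2 * math.pi * x / period) for x in range(size)]
    return [row[:] for _ in range(size)]

def max_abs_error(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

fringe = vertical_fringes()
coeffs = dct_2d(fringe)
restored = idct_2d(keep_first_row(coeffs))
```

Since every row of the synthetic image is identical, the column-direction transform sees constant signals and leaves energy only at vertical frequency zero, so the masked inverse transform reproduces the image up to floating-point rounding. Real phase maps are not perfectly one-dimensional, so some loss should be expected in practice.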

7.
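Because video data is large and must be processed in real time, the efficiency of the encryption method is usually the key to video content security. A new video protection method is proposed within the MPEG-4 framework. The method replaces the conventional discrete cosine transform (DCT) in the video codec with a non-uniform discrete cosine transform (NDCT), scrambles and descrambles MPEG-4 video data in the frequency domain, and uses the parameters that control the non-uniformity of the transform as the key. Because there is no dedicated cryptographic module, the time and space overhead of the whole method is comparable to that of normal encoding and decoding, and its protection strength and security meet the requirements of a large number of applications.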

8.
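An algorithm for detecting and localizing video captions is proposed. Exploiting the temporal redundancy of video captions and taking the shot as the basic processing unit, it uses a monitor-and-track model and an extended QSDD (PQSDD) measure to locate the start frame and end frame of a caption, and uses these two frames to determine the starting caption-transition frame pair and the ending caption-transition frame pair. Captions are then localized in the difference image of each frame pair using edge features, and an adaptive threshold selection algorithm based on background complexity is proposed to binarize the edge images. Finally, a logical AND of the caption regions localized in the two difference images, followed by connected-region analysis, gives the final caption regions. Experimental results show that the algorithm achieves high detection speed and localization accuracy.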
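The last step above (a logical AND of two binary region masks followed by connected-region analysis) can be sketched as follows. The function names, the 4-connectivity choice, and the `min_area` filter are assumptions for illustration; the abstract does not specify them.

```python
from collections import deque

def logical_and(mask_a, mask_b):
    """Pixel-wise AND of two binary masks of equal size (lists of 0/1 rows)."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]

def connected_regions(mask, min_area=1):
    """Return bounding boxes (top, left, bottom, right) of 4-connected regions.

    Regions smaller than `min_area` pixels are discarded (assumed noise filter).
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right, area = y, x, y, x, 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes
```

The AND keeps only pixels flagged in both difference images, which suppresses regions that appear in just one of them; the bounding boxes of the surviving components then serve as candidate caption regions.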

9.
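A dual zero-watermarking algorithm for color images in the quaternion domain   Cited by: 2 (self-citations: 0, by others: 2)
A dual zero-watermarking algorithm is proposed based on the quaternion discrete Fourier transform (QDFT), the quaternion discrete cosine transform (QDCT), and quaternion singular value decomposition (QSVD). First, the color image is represented with a quaternion model and a block-wise quaternion discrete Fourier transform is applied; comparing the magnitudes of selected coefficients in each block produces a binary sequence. Then, the discrete cosine transform and singular value decomposition are applied to each original image block, and the magnitudes of the singular value matrix and of the left and right quaternion matrices are used to produce a second binary sequence. The binary sequences are combined with the copyright information to generate the zero watermarks, which are registered with an intellectual property rights (IPR) center. Experiments show that the algorithm is strongly robust to common attacks, to some combined attacks, and to some attack types that may appear in the future.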

10.
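Because video data is large and must be processed in real time, the efficiency of the encryption method is usually the key to video content security. A new video protection method is proposed within the MPEG4 framework. The method replaces the conventional discrete cosine transform (DCT) in the video codec with a non-uniform discrete cosine transform (NDCT), scrambles and descrambles MPEG4 video data in the frequency domain, and uses the parameters that control the non-uniformity of the transform as the key. Because there is no dedicated cryptographic module, the time and space overhead of the whole method is comparable to that of normal encoding and decoding, and its protection strength and security meet the requirements of a large number of applications.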

11.
Video text is closely related to video content, and video text information can facilitate content-based video analysis, indexing, and retrieval. Video sequences are usually compressed before storage and transmission. A basic step of text-based applications is text detection and localization. In this paper, an overlaid text detection and localization method is proposed for H.264/AVC compressed videos that uses the integer discrete cosine transform (DCT) coefficients of intra-frames. The main contributions of this paper are in the following two aspects: 1) coarse text block detection using thresholds that adapt to block size and quantization parameter; 2) text line localization according to the characteristics of text in intra-frames in the H.264/AVC compressed domain. Comparisons are made with a pixel-domain text detection method for H.264/AVC compressed video. Text detection results on five H.264/AVC video sequences at various quality levels show the effectiveness of the proposed method.

12.
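News caption segmentation algorithm based on gradient enhancement   Cited by: 2 (self-citations: 0, by others: 2)
Segmenting news captions is important for semantics-based news video retrieval systems, and a news caption segmentation algorithm based on gradient enhancement is proposed for this purpose. The algorithm replaces the standard deviation of the image with a weighted sum of the image gradients in multiple directions, and strengthens the edge information in selected directions by adjusting the per-direction weights, which improves the segmentation. Compared with some classical adaptive-threshold segmentation algorithms, the algorithm not only preserves most strokes but also effectively reduces broken strokes. Experiments based on optical character recognition confirm the effectiveness of the algorithm.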
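A weighted sum of multi-directional gradients can be sketched as below. The abstract does not give the directions, the gradient operator, or the weights, so the four forward-difference directions, the default weights, and the function name `weighted_gradient` are hypothetical choices. The sketch computes only the gradient response, not the full thresholding rule in which it replaces the standard deviation.

```python
def weighted_gradient(gray, weights=(1.0, 1.0, 0.5, 0.5)):
    """Weighted sum of absolute forward differences in four directions.

    `weights` applies, in order, to the horizontal, vertical, diagonal and
    anti-diagonal differences. Neighbours outside the image are skipped.
    All of these choices are illustrative assumptions.
    """
    h, w = len(gray), len(gray[0])
    offsets = ((0, 1), (1, 0), (1, 1), (1, -1))  # (dy, dx) per direction
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for weight, (dy, dx) in zip(weights, offsets):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    total += weight * abs(gray[ny][nx] - gray[y][x])
            out[y][x] = total
    return out
```

Raising one direction's weight emphasizes edges crossed in that direction. For example, `weights=(1, 0, 0, 0)` responds only to intensity changes between horizontally adjacent pixels, which highlights vertical strokes.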

13.
Text in videos contains rich semantic information, which is useful for content-based video understanding and retrieval. Although a great number of state-of-the-art methods have been proposed to detect text in images and videos, few works focus on spatiotemporal text localization in videos. In this paper, we present a spatiotemporal text localization method with improved detection efficiency and performance. Concretely, a unified framework is proposed which consists of the sampling-and-recovery model (SaRM) and the divide-and-conquer model (DaCM). SaRM aims at exploiting the temporal redundancy of text to increase the detection efficiency for videos. DaCM is designed to efficiently localize the text in the spatial and temporal domains simultaneously. In addition, we construct a challenging video overlaid text dataset named UCAS-STLData, which contains 57070 frames with spatiotemporal ground truths. In the experiments, we comprehensively evaluate the proposed method on the publicly available overlaid text datasets and UCAS-STLData. A slight performance improvement is achieved compared with the state-of-the-art methods for spatiotemporal text localization, with a significant efficiency improvement.

14.
This paper presents a new frame-skipping transcoding approach for video combiners in multipoint video conferencing. Transcoding is regarded as a process of converting a previously compressed video bitstream into a lower bitrate bitstream. A high transcoding ratio may result in an unacceptable picture quality when the incoming video bitstream is transcoded at the full frame rate. Frame skipping is often used as an efficient scheme to allocate more bits to representative frames, so that an acceptable quality for each frame can be maintained. However, the skipped frame must be decompressed completely, and should act as the reference frame to the nonskipped frame for reconstruction. The newly quantized DCT coefficients of the prediction error need to be recomputed for the nonskipped frame with reference to the previous nonskipped frame; this can create undesirable complexity in real-time applications as well as introduce re-encoding error. A new frame-skipping transcoding architecture for improved picture quality and reduced complexity is proposed. The proposed architecture operates mainly in the discrete cosine transform (DCT) domain to achieve a low-complexity transcoder. It is observed that the re-encoding error is avoided at the frame-skipping transcoder when the strategy of direct summation of DCT coefficients is employed. By using the proposed frame-skipping transcoder and dynamically allocating more frames to the active participants in video combining, we are able to achieve more uniform peak signal-to-noise ratio (PSNR) performance across the subsequences, and the video quality of the active subsequences can be improved significantly.

15.
Automatic caption localization in compressed video   Cited by: 26 (self-citations: 0, by others: 26)
We present a method to automatically localize captions in JPEG compressed images and the I-frames of MPEG compressed videos. Caption text regions are segmented from background images using their distinguishing texture characteristics. Unlike previously published methods which fully decompress the video sequence before extracting the text regions, this method locates candidate caption text regions directly in the DCT compressed domain using the intensity variation information encoded in the DCT domain. Therefore, only a very small amount of decoding is required. The proposed algorithm takes about 0.006 second to process a 240×350 image and achieves a recall rate of 99.17 percent while falsely accepting about 1.87 percent of nontext DCT blocks on a variety of MPEG compressed videos containing more than 2,300 I-frames.
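The idea of finding candidate text blocks from intensity variation encoded in DCT coefficients can be sketched as follows. This is not the paper's procedure: which coefficients are used, how they are combined, and the threshold are not given in the abstract, so the first-row and first-column AC sum, the function names, and the `threshold` parameter below are assumptions. The sketch also assumes the 8x8 coefficient blocks have already been parsed from the bitstream.

```python
def texture_energy(block):
    """Sum of |AC| over the first row and first column of a DCT block.

    block[0][0] is the DC term and is ignored. The first row carries
    horizontal intensity variation, the first column vertical variation.
    """
    horizontal = sum(abs(block[0][v]) for v in range(1, len(block[0])))
    vertical = sum(abs(block[u][0]) for u in range(1, len(block)))
    return horizontal + vertical

def candidate_text_blocks(blocks, threshold):
    """Return sorted block positions whose texture energy reaches `threshold`.

    `blocks` maps (block_row, block_col) to an 8x8 matrix of DCT coefficients.
    """
    return sorted(pos for pos, block in blocks.items()
                  if texture_energy(block) >= threshold)
```

A flat background block has almost all of its energy in the DC term and is rejected, while a block containing character strokes has large AC terms and is kept. No inverse transform is needed, which is why this kind of test is cheap in the compressed domain.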

16.
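For the MPEG-2 video compression standard, a robust video watermark embedding scheme that can be implemented quickly is proposed. By exploiting the mapping between the block DCT coefficients and the whole-frame DCT coefficients of a video frame, the scheme avoids fully decoding the video during watermark embedding and extraction, which reduces computation and improves the ability to embed and detect the watermark in real time. Experimental results show that the method resists downscaling attacks, white Gaussian noise attacks, and MPEG-2 re-encoding compression attacks, and that it can be implemented quickly.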

17.
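Most video data is stored and transmitted in compressed form, and segmenting video objects directly in the compressed domain avoids complex computations such as motion estimation and is therefore fast. This paper proposes a gradient-model-based moving object segmentation algorithm for the MPEG compressed domain. First, the DCT coefficients (AC[1] and AC[8]) are used to obtain the edges of all objects. These are then combined with edge motion information derived from accumulated motion vectors to obtain the edges of the moving objects of interest. Simulation results show that the algorithm achieves satisfactory segmentation quality.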

18.
Text displayed in a video is an essential part of the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds and make it processable by standard OCR (Optical Character Recognition) software. The operability and accuracy of the proposed text detection and binarization methods have been evaluated using publicly available test data sets.

19.
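A survey of text extraction methods for video and images   Cited by: 1 (self-citations: 0, by others: 1)
Text extraction from video and images has important practical value. In recent years, the era of big data has created an urgent need for retrieval over massive amounts of information, and many methods for extracting text from video and images have emerged. This survey reviews algorithms for text extraction from video and images and, following the text extraction pipeline, divides them into two major steps: text region detection and localization, and text segmentation. For each step, it analyzes and compares the scope of application and the relative strengths and weaknesses of existing algorithms. It also discusses public image databases, lists important recent applications of text extraction from images, points out problems in current research, and looks ahead to development trends in text extraction from video and scene images.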
