Similar Literature
19 similar documents found.
1.
To address the lack of spatiotemporal consistency modeling and the insufficient spatiotemporal representation learning in current referring video object segmentation methods, this paper proposes a referring video object segmentation method based on spatiotemporal hierarchical queries (STHQ). Referring video object segmentation is cast as a query-based sequence prediction problem, and a two-stage query mechanism is proposed for spatiotemporal consistency modeling and spatiotemporal feature learning. In the first stage, a frame-level spatial information extraction module uses the language features as queries to interact independently with each frame of the video sequence in the spatial dimension, producing instance embeddings that carry the target's spatial information. In the second stage, a spatiotemporal information aggregation module uses video-level learnable query embeddings and the instance embeddings from the first stage to interact in the spatiotemporal dimension, producing video-level instance embeddings with spatiotemporal representations. Finally, the video-level instance embeddings are linearly transformed into conditional convolution parameters and convolved with each frame of the video sequence to produce the target's mask prediction sequence. Experiments on three benchmark datasets in this area show that the proposed STHQ method outperforms existing methods and achieves state-of-the-art performance.
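Both query stages described above boil down to attention between a small set of query vectors and a larger set of feature vectors. A minimal single-head cross-attention sketch in NumPy (projection matrices and multi-head details omitted; this is not the STHQ implementation itself):

```python
import numpy as np

def cross_attention(queries, keys_values):
    """Single-head cross-attention: queries attend over a set of key/value
    vectors -- the basic interaction used in both STHQ stages (language
    query vs. per-frame features, then video-level query vs. instance
    embeddings). Learned projections are omitted for brevity."""
    d = queries.shape[-1]
    attn = queries @ keys_values.T / np.sqrt(d)      # scaled dot-product scores
    attn = np.exp(attn - attn.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)         # softmax over key positions
    return attn @ keys_values                         # weighted sum of values
```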

2.
Block-based video compressed sensing typically measures all image blocks with the same measurement matrix, ignoring the fact that different regions of a video change to different degrees. Building on inter-frame correlation, this paper proposes an adaptive sampling-rate allocation method: at the encoder, image blocks are classified by the strength of their inter-frame correlation and assigned different sampling rates; at the decoder, a total-variation algorithm is used to fully exploit the inter-frame correlation. To reduce the influence of the network environment, the algorithm does not distinguish reference frames from non-reference frames and treats every frame identically. Experimental results show that the method reconstructs video images of relatively high quality at low sampling rates while shortening computation time.
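The encoder-side idea, assigning each block a sampling rate according to how much it changed since the previous frame, can be sketched as follows; the block size, the two rates, and the change threshold are illustrative assumptions, not the paper's values:

```python
import numpy as np

def allocate_rates(prev_frame, curr_frame, block=8, low=0.1, high=0.5):
    """For each block, measure the inter-frame change and assign a
    sampling rate: nearly static blocks get the low rate, fast-changing
    blocks the high rate. Thresholds are hypothetical."""
    h, w = curr_frame.shape
    rates = np.empty((h // block, w // block))
    for i in range(0, h, block):
        for j in range(0, w, block):
            diff = np.mean(np.abs(curr_frame[i:i+block, j:j+block].astype(float)
                                  - prev_frame[i:i+block, j:j+block].astype(float)))
            rates[i // block, j // block] = high if diff > 5.0 else low
    return rates
```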

3.
Most current video object segmentation algorithms segment the target with matching-and-propagation strategies, usually exploiting the previous frame's information through masks or optical flow. This work explores a new way of propagating inter-frame features: a short-term matching module extracts information from the previous frame and propagates it to the current frame, yielding an object segmentation model for video sequence data. A long-term matching module and a short-term matching module perform pixel-level matching by correlating the current frame with the first frame and the previous frame, respectively; the resulting global and local similarity maps, together with the previous frame's mask and the current frame's feature map, are passed through two refinement networks and then a segmentation network to obtain the segmentation result. Experiments on public video object segmentation datasets show that the proposed method achieves mean region similarity and contour accuracy of 86.5% on single objects and 77.4% on multiple objects, at 21 frames per second. The proposed short-term matching module extracts the previous frame's information better than using the mask alone, and by combining the long-term and short-term matching modules the method performs efficient video object segmentation without online fine-tuning, making it suitable for visual perception on mobile robots.
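The correlation-based pixel-level matching that produces the global and local similarity maps can be sketched with cosine similarity between feature maps; this is a minimal stand-in, not the paper's network:

```python
import numpy as np

def similarity_map(query_feat, ref_feat):
    """Correlate every spatial position of the current frame's feature
    map (C,H,W) with every position of a reference frame's, then take
    the max over reference positions -- a minimal stand-in for the
    long/short-term matching modules described above."""
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, -1)                      # C x HW
    r = ref_feat.reshape(c, -1)                        # C x HW
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    r = r / (np.linalg.norm(r, axis=0, keepdims=True) + 1e-8)
    corr = q.T @ r                                     # HW x HW cosine similarities
    return corr.max(axis=1).reshape(h, w)              # best match per query pixel
```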

4.
Reversible video watermarking based on inter-frame prediction-error expansion
To increase the embedding capacity of reversible video watermarking and the quality of the watermarked video, an algorithm using motion estimation and prediction-error histogram modification is proposed. At the embedding end, motion estimation based on the content relation between adjacent frames yields the prediction errors of each frame's pixels, from which a prediction-error histogram is generated; the watermark is embedded by expanding the prediction errors located at the histogram's peak. Embedding each frame produces a small amount of header information needed to extract the watermark and restore the video, including motion vectors and boundary tables, which is embedded together with the watermark into that frame's reference frame. At the extraction end, the watermark and header information are first retrieved from the reference frame, which is then restored so that extraction has the correct context; the watermark in the current frame is then extracted using the retrieved header information and the restored reference frame. Experimental results show that the prediction-error histograms obtained by the algorithm are highly concentrated and the pixel-value modifications are very small, giving larger embedding capacity and better watermarked-video quality than other reversible watermarking algorithms.
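The peak-expansion step can be illustrated on a 1-D prediction-error sequence; this simplified single-peak sketch omits motion estimation and the header-information bookkeeping described above:

```python
def embed(errors, bits, peak):
    """Histogram shifting on a prediction-error sequence (simplified to
    one non-negative peak): errors equal to `peak` carry one bit each,
    larger errors are shifted up by one to make room."""
    out, k = [], 0
    for e in errors:
        if e == peak and k < len(bits):
            out.append(e + bits[k]); k += 1
        elif e > peak:
            out.append(e + 1)
        else:
            out.append(e)
    return out

def extract(marked, peak):
    """Invert embed(): recover the bits and the original errors.
    Assumes exactly as many peak-valued errors as embedded bits."""
    bits, orig = [], []
    for e in marked:
        if e == peak:
            bits.append(0); orig.append(peak)
        elif e == peak + 1:
            bits.append(1); orig.append(peak)
        elif e > peak + 1:
            orig.append(e - 1)
        else:
            orig.append(e)
    return bits, orig
```

Reversibility is exact: extraction returns both the payload and the untouched error sequence, which is what makes the scheme lossless.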

5.
Video retargeting has been a hot topic in digital image processing in recent years. Methods that retarget the whole video at once incur huge memory and computation costs, making them inefficient and impractical, while methods that retarget each frame independently struggle to maintain the video's spatiotemporal consistency. This paper therefore proposes a frame-by-frame optimized video retargeting method based on seam carving. The video is read in frame by frame; the current frame's energy map is computed from gradients and adjusted using a cache-replacement idea; a seam is then found from the energy map; finally, the seam is removed by linear interpolation, yielding a frame of the target size. Experimental results show that the method preserves the important content of every processed frame while maintaining the spatiotemporal consistency of the whole video and good visual quality.
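The per-frame core, gradient energy, a minimum-energy vertical seam found by dynamic programming, and seam removal, can be sketched as follows; the paper's cache-replacement energy adjustment and interpolation-based removal are omitted (the seam is simply deleted here):

```python
import numpy as np

def remove_vertical_seam(img):
    """One seam-carving step on a grayscale frame: gradient-magnitude
    energy, DP for the minimum-energy vertical seam, then deletion."""
    h, w = img.shape
    gy, gx = np.gradient(img.astype(float))
    energy = np.abs(gx) + np.abs(gy)
    cost = energy.copy()
    for i in range(1, h):                      # accumulate minimal path cost
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 2, w)
            cost[i, j] += cost[i - 1, lo:hi].min()
    seam = [int(np.argmin(cost[-1]))]          # backtrack the cheapest seam
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam.append(lo + int(np.argmin(cost[i, lo:hi])))
    seam.reverse()
    return np.array([np.delete(img[i], seam[i]) for i in range(h)])
```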

6.
A video rendering algorithm based on hybrid light-field and geometry rendering is proposed. The algorithm renders the starting frame with light-field rendering and updates the foreground starting from the previous frame of the new viewpoint; Gaussian-mixture background modeling is combined with scene-geometry computation to obtain the foreground region of the new viewpoint's current frame, avoiding repeated rendering of the slowly changing background that occupies most of the image and thereby improving rendering efficiency. To eliminate accumulated error, a "starting frame plus subsequent frames" cycle is adopted, and within each cycle the distribution of foreground points in the scene is collected to adaptively partition the scene hierarchy for the next cycle. Experimental results show that the algorithm is efficient and produces images of good quality.

7.
Video frame prediction is an important research area in computer vision with wide applications. Although commonly used frame-prediction models achieve reasonable results, they cannot model spatial and temporal information jointly and are therefore hard to apply in more complex real-world scenes. To address this, this paper proposes a deep spatiotemporal modeling neural network. The network predicts future optical flow and uses it to sample the previous frame to predict the future frame; in addition, a convolutional LSTM and a self-attention mechanism are introduced to model temporal and spatial information, respectively. Extensive experiments on the Caltech pedestrian dataset achieve good results.
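The "predict flow, then sample the previous frame" step can be sketched as a backward warp; nearest-neighbour sampling is used here for brevity, whereas a real implementation would typically use differentiable bilinear sampling:

```python
import numpy as np

def warp(prev, flow):
    """Backward-warp the previous frame with a per-pixel flow field
    (H,W,2): each output pixel samples the previous frame at its
    position minus the flow. ConvLSTM/self-attention parts omitted."""
    h, w = prev.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs - flow[..., 0]).astype(int), 0, w - 1)
    return prev[src_y, src_x]
```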

8.
To solve the problem that traditional tracking algorithms lose the target after occlusion, a tracking algorithm combining YOLO and Camshift is proposed. A detection model is built on the YOLO network; before the model is built, video frames are preprocessed with image enhancement, which preserves enough image information while improving image quality and reducing the time complexity of the YOLO step. YOLO locates the target to initialize tracking; Camshift then processes subsequent frames based on the target's position, updating the target in every frame so that the tracking window keeps adjusting its position to follow the target's motion. Experimental results show that the proposed method effectively overcomes tracking loss after occlusion and has good robustness.
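The Camshift update that re-centers the tracking window on each new frame is built on mean-shift iterations over a back-projection probability map; a minimal sketch of that core loop (the window re-sizing and rotation that Camshift adds are omitted):

```python
import numpy as np

def mean_shift(prob, cx, cy, half=5, iters=10):
    """Move a fixed window toward the centroid of the back-projection
    map until it converges on the target."""
    h, w = prob.shape
    for _ in range(iters):
        x0, x1 = max(cx - half, 0), min(cx + half + 1, w)
        y0, y1 = max(cy - half, 0), min(cy + half + 1, h)
        win = prob[y0:y1, x0:x1]
        m = win.sum()
        if m == 0:
            break
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ncx = int(round((xs * win).sum() / m))       # window centroid
        ncy = int(round((ys * win).sum() / m))
        if (ncx, ncy) == (cx, cy):                   # converged
            break
        cx, cy = ncx, ncy
    return cx, cy
```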

9.
To solve the pose-flipping problem in single-image human body reconstruction and improve model accuracy, an algorithm is proposed that reconstructs the pose and shape sequence simultaneously using adjacent-frame pose constraints and body silhouette matching. For each video frame, the 2D joints, facial landmarks, and silhouette of the person are first estimated; the 3D body expressed by the parametric model SMPL is then projected onto the 2D plane so that the projected 2D information matches that of the corresponding frame; finally, the SMPL pose and shape parameters are adjusted to minimize a matching energy function, reconstructing a 3D body model whose pose and shape are similar to the person in the frame. To make the results more realistic, the head pose in each frame is also detected and matched. Experiments on the MPI-INF-3DHP dataset, Youku videos, and self-shot video frames show that, compared with SMPLify and other algorithms, the method effectively avoids pose flipping in the reconstruction and recovers an accurate head pose and a similar body shape while preserving the overall pose similarity of the model.

10.
Traditional video summarization methods often ignore temporal information, and the video features they extract are over-complicated and prone to overfitting. To address this, a video summarization model based on an improved bidirectional long short-term memory (BiLSTM) network is proposed. First, deep features of video frames are extracted with a convolutional neural network (CNN); to make the generated summaries more diverse, a BiLSTM network converts the deep-feature recognition task into a temporal labeling task over video frames, giving the model more contextual information. Second, since a generated summary should be representative, max pooling is fused in to reduce feature dimensionality while highlighting key information and suppressing redundancy, letting the model learn representative features; the reduced dimensionality also shrinks the fully connected layers, avoiding overfitting. Finally, frame importance scores are predicted and converted into shot scores, from which key shots are selected to generate the summary. Experimental results on the standard TvSum and SumMe datasets show that the improved model produces more accurate summaries, with F1-scores 1.4 and 0.3 percentage points higher, respectively, than the LSTM-based summarization model DPPLSTM.
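The final step, converting per-frame importance scores into shot scores and selecting key shots, can be sketched as follows; mean pooling per shot and a greedy frame-budget selection are illustrative assumptions, not the paper's exact rule:

```python
import numpy as np

def shots_from_frames(frame_scores, shot_bounds, budget):
    """Turn per-frame importance scores into shot scores (mean over the
    shot) and greedily pick the highest-scoring shots until the frame
    budget is reached."""
    shots = [(np.mean(frame_scores[a:b]), a, b) for a, b in shot_bounds]
    chosen, used = [], 0
    for score, a, b in sorted(shots, reverse=True):
        if used + (b - a) <= budget:
            chosen.append((a, b))
            used += b - a
    return sorted(chosen)                # selected shots in temporal order
```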

11.
Video summarization has great potential to enable rapid browsing and efficient video indexing in many applications. In this study, we propose a novel compact yet rich key-frame creation method for compressed video summarization. First, we directly extract the DC coefficients of I-frames from a compressed video stream, and DC-based mutual information is computed to segment the long video into shots. Then, we select shots with a static background and a moving object according to the intensity and range of motion vectors in the video stream. After detecting moving-object outliers in each selected shot, the optimal object set is selected by importance ranking and by solving an optimization problem. Finally, we apply an improved KNN matting approach to the optimal object outliers to automatically and seamlessly splice them into the final key frame as the video summarization. Previous video summarization methods typically select one or more frames from the original video as the summary; such key-frame representations eliminate the time axis and lose the dynamic aspect of the video scene. The proposed summarization preserves compactness yet carries considerably richer information than previous video summaries. Experimental results indicate that the proposed key-frame representation not only includes abundant semantics but also looks natural, satisfying user preferences.
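The DC-based mutual information used for shot segmentation can be computed from the joint grey-level histogram of two DC images; a sharp drop between consecutive I-frames then suggests a shot boundary. A sketch (bin count is an illustrative choice):

```python
import numpy as np

def mutual_info(a, b, bins=16):
    """Mutual information between the grey-level distributions of two
    DC images, estimated from their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                                 range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0                                   # avoid log(0)
    return float((pxy[nz] *
                  np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])).sum())
```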

12.
《Real》2000,6(6):449-459
In this paper, we propose a new method for temporal summarization of digital video. First, we address the problem of extracting a fixed number of representative frames to summarize a given digital video. To solve it, we have devised an algorithm called content-based adaptive clustering (CBAC), in which shot boundary detection is not needed. Video frames are treated as points in a multi-dimensional feature space corresponding to low-level features such as color, motion, shape, and texture. Changes in their distances are compared globally to extract representative frames. Second, we address how to use the representative frames to compose representative sequences (R-Sequences) for temporal summarization of video. A video player based on the devised algorithm has been developed, offering content-based browsing and content-based video summary functions. Experiments are also presented in the paper.
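As a rough illustration of shot-free representative-frame selection in feature space (not the CBAC algorithm itself, just a selection in the same spirit: frames as points, distances compared globally), greedy farthest-point selection picks frames that cover the feature space:

```python
import numpy as np

def representative_frames(features, k):
    """Pick k representative frames by greedy farthest-point selection:
    start from frame 0, then repeatedly add the frame farthest from all
    frames chosen so far."""
    chosen = [0]
    d = np.linalg.norm(features - features[0], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(d))                  # farthest remaining frame
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(features - features[nxt], axis=1))
    return sorted(chosen)
```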

13.
14.
Video summarization via exploring the global and local importance
Video summarization generates a short video of the important or interesting parts of a long video, reducing the time required to analyze archived video by removing unnecessary data. This work proposes a novel method that generates dynamic video summaries by fusing global importance and local importance based on multiple features and image quality. First, videos are split into suitable clips. Second, video frames are extracted from each clip, along with the center part of each frame. Third, for each frame and its center part, the global importance and the local importance are calculated from a set of features and image quality. Finally, the two importance measures are fused to select an optimal subset that forms the summary. Extensive experiments demonstrate that the proposed method generates high-quality video summaries.
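The fusion step can be illustrated with a simple weighted sum; the mixing weight `alpha` is a hypothetical choice, since the exact fusion rule is not given in the abstract:

```python
import numpy as np

def fused_importance(global_scores, local_scores, alpha=0.5):
    """Fuse each frame's global importance (whole frame) with its local
    importance (center crop) by a weighted sum; alpha is hypothetical."""
    g = np.asarray(global_scores, float)
    loc = np.asarray(local_scores, float)
    return alpha * g + (1 - alpha) * loc
```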

15.
The rapid growth of video data brings a series of problems and challenges to video browsing, storage, and retrieval, and video summarization is an effective way to address them. Existing summarization algorithms construct objective functions from constraints and empirically chosen settings and score sets of frames, which is uncertain and computationally expensive. This paper proposes a video summarization method based on learning to rank. It casts summary extraction as ranking frames by their relevance to the video content: a ranking function is learned from a training set so that highly ranked frames are those most relevant to the video; frames are then scored with the learned function, and the highest-scoring frames are selected as key frames for the summary. Moreover, unlike existing methods, the method scores individual frames rather than frame sets, which markedly reduces computational complexity. Tests on the TVSum50 dataset confirm the method's effectiveness.
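Scoring frames individually with a learned ranking function and keeping the top-k is what keeps the complexity low; a sketch with a linear scorer standing in for the learned ranking function:

```python
import numpy as np

def summarize(frame_feats, w, k):
    """Score each frame independently with a linear ranking function w
    (a stand-in for the learned ranker) and keep the k highest-scoring
    frames in temporal order."""
    scores = frame_feats @ w
    top = np.argsort(scores)[-k:]
    return sorted(int(i) for i in top)
```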

16.
This paper presents a two-level queueing system for dynamic summarization and interactive searching of video content. Video frames enter the queueing system; insignificant and redundant frames are removed, and the remaining frames leave the system as top-level key frames. Using an energy-minimization method, the first queue removes the frames that constitute gradual shot transitions. The second queue measures the content similarity of frames and removes redundant ones. In the queueing system, all key frames are linked in a directed-graph index structure, allowing video content to be accessed at any level of detail. This graph-based index also enables interactive exploration of video content: the system can retrieve the key frames that complement the content a user has already viewed. Experimental results on four full-length videos show that our queueing system performs much better than two existing methods at key-frame selection across different compression ratios. An evaluation of video content search shows that our interactive system is more effective than other systems on eight searching tasks; compared with a regular media player, it cuts the average content-searching time in half.
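The second queue's redundancy removal can be sketched as a one-pass filter that keeps a frame only when it differs enough from the last kept frame; the distance measure and threshold are illustrative, not the paper's similarity measure:

```python
import numpy as np

def filter_redundant(frames, threshold):
    """Keep a frame only if its feature distance to the last kept frame
    exceeds the threshold, dropping near-duplicate frames."""
    kept = [0]
    for i in range(1, len(frames)):
        if np.linalg.norm(frames[i] - frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept
```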

17.
Video summarization is an important means of browsing massive video collections. Existing methods generally produce a short video or a multi-frame image sequence to summarize the original video, but both remain bound to the original timeline and convey information inefficiently. This paper therefore proposes a method for automatically generating video posters as a more condensed form of video summary. Its two core problems are extracting the key pictures from the video and laying out the poster automatically. Extending existing key-frame extraction methods, a key-frame ranking algorithm based on visual importance is proposed using an integrated visual attention model; building on existing layout rules, the influence of layout position on visual psychological perception is added, yielding a position-importance-driven automatic poster layout algorithm. Experimental results demonstrate the effectiveness of the proposed algorithms.

18.
Text mining techniques have recently been employed to classify and summarize user reviews on mobile application stores. However, due to the inherently diverse and unstructured nature of user-generated online text, text-based review mining techniques often produce excessively complicated models that are prone to overfitting. In this paper, we propose a novel approach, based on frame semantics, for app review mining. Semantic frames help to generalize from raw text (individual words) to more abstract scenarios (contexts). This lower-dimensional representation of text is expected to enhance the predictive capabilities of review mining techniques and reduce the chances of overfitting. Specifically, our analysis is two-fold. First, we investigate the performance of semantic frames in classifying informative user reviews into various categories of actionable software maintenance requests. Second, we propose and evaluate multiple summarization algorithms for generating concise and representative summaries of informative reviews. Three datasets of app store reviews, sampled from a broad range of application domains, are used in our experimental analysis. The results show that semantic frames enable an efficient and accurate review classification process; in review summarization tasks, however, text-based summarization generates more comprehensive summaries than frame-based summarization. Finally, we introduce MARC 2.0, a review classification and summarization suite that implements the algorithms investigated in our analysis.

19.
A video summary is a condensed representation of video content. To support better video browsing, this paper proposes building video summaries at two levels, shot level and scene level, according to the granularity of browsing or retrieval, and presents a generation method: key frames within each shot are first extracted by a method that adapts automatically to content changes; scenes are then obtained by grouping shots with an improved time-adaptive algorithm; finally, representative frames are extracted at the scene level with a minimum-spanning-tree method. Since the key frames and representative frames capture the main content of their shots and scenes, respectively, their sequences constitute the video summary. Experiments on several movie clips show that the method provides good content summaries at both coarse and fine granularity.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号