
1.

A dynamic saliency attention model based on local complexity

Longsheng Wei Nong Sang Yuehuan Wang Qingqing Zheng 《Digital Signal Processing》2012,22(5):760-767

A dynamic saliency attention model based on local complexity is proposed in this paper. Low-level visual features are extracted from the current frame and several previous frames. Every feature map is resized to several different sizes. For each feature and size, the feature maps from all the frames are used to calculate a local complexity map. All the local complexity maps are normalized and fused into a dynamic saliency map. At the same time, a static saliency map is computed from the current frame. The dynamic and static saliency maps are then fused into a final saliency map. Experimental results indicate that when there is noise or a change of illumination among the frames, our model is superior to Marat's model and Shi's model, and when the moving objects do not belong to the static salient regions, our model is better than Ban's model.
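A minimal sketch of the last two steps of this abstract (normalizing maps and fusing the dynamic and static maps), assuming min-max normalization and a weighted average as the fusion rule. The abstract names neither, so the function names and the `dynamic_weight` parameter are hypothetical.

```python
def normalize(saliency_map):
    """Min-max normalize a 2D map (list of rows) to the range [0, 1]."""
    values = [v for row in saliency_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:  # flat map: no location stands out
        return [[0.0 for _ in row] for row in saliency_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency_map]


def fuse(dynamic_map, static_map, dynamic_weight=0.5):
    """Fuse two same-sized maps by a weighted average.

    The weighted average is an assumed fusion rule, not the paper's.
    """
    d, s = normalize(dynamic_map), normalize(static_map)
    w = dynamic_weight
    return [[w * dv + (1.0 - w) * sv for dv, sv in zip(d_row, s_row)]
            for d_row, s_row in zip(d, s)]


final_map = fuse([[0, 10]], [[4, 2]])
```

With equal weights, a location that is salient in only one of the two maps keeps half of its normalized score in the fused map.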

2.

Modelling Spatio-Temporal Saliency to Predict Gaze Direction for Short Videos (total citations: 1, self-citations: 0, citations by others: 1)

Sophie Marat Tien Ho Phuoc Lionel Granjon Nathalie Guyader Denis Pellerin Anne Guérin-Dugué 《International Journal of Computer Vision》2009,82(3):231-243

This paper presents a spatio-temporal saliency model that predicts eye movement during video free viewing. This model is inspired by the biology of the first steps of the human visual system. The model extracts two signals from the video stream corresponding to the two main outputs of the retina: parvocellular and magnocellular. Then, both signals are split into elementary feature maps by cortical-like filters. These feature maps are used to form two saliency maps: a static and a dynamic one. These maps are then fused into a spatio-temporal saliency map. The model is evaluated by comparing the salient areas of each frame predicted by the spatio-temporal saliency map to the eye positions of different subjects during a free video viewing experiment with a large database (17000 frames). In parallel, the static and the dynamic pathways are analyzed to understand what is more or less salient and for what type of videos our model is a good or a poor predictor of eye movement.

3.

Visual saliency based redundancy allocation in HEVC compatible multiple description video coding

Majid Muhammad Owais Muhammad Anwar Syed Muhammad 《Multimedia Tools and Applications》2018,77(16):20955-20977


4.

Video attention prediction using gaze saliency

Chen Yanxiang Tao Gang Xie Qiangqiang Song Minglong 《Multimedia Tools and Applications》2019,78(19):26867-26884

In recent years, significant progress has been achieved in the field of visual saliency modeling. Our research focuses on video saliency, which differs substantially from image saliency and can be better detected by adding gaze information from eye movements while people are watching the video. In this paper we propose a novel gaze saliency method to predict video attention, inspired by the widespread use of mobile smart devices with cameras. It is a non-contact method for predicting visual attention, and it places no additional burden on the hardware. Our method first extracts bottom-up saliency maps from the video frames, and then constructs the mapping from eye images, captured by the camera in synchronization with the video frames, to the screen region. Finally, the top-down gaze information and the bottom-up saliency maps are combined by point-wise multiplication to predict video attention. The proposed approach is validated against four other common methods on two datasets, the public MIT dataset and a dataset we collected, and the experimental results show that our method achieves state-of-the-art performance.
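The point-wise combination step can be sketched as follows, assuming both maps are already the same size and hold non-negative values. The function name is hypothetical, and the final rescaling to a peak of 1 is an added convenience that the abstract does not mention.

```python
def combine_gaze_and_saliency(gaze_map, saliency_map):
    """Multiply two same-sized 2D maps element by element.

    The product is rescaled so its largest value is 1 (an assumption made
    here for readability, not a step stated in the abstract).
    """
    product = [[g * s for g, s in zip(g_row, s_row)]
               for g_row, s_row in zip(gaze_map, saliency_map)]
    peak = max(max(row) for row in product)
    if peak == 0:  # the two maps never overlap
        return product
    return [[v / peak for v in row] for row in product]


attention = combine_gaze_and_saliency(
    [[0.0, 1.0], [0.5, 0.0]],   # top-down gaze map
    [[0.9, 0.4], [0.8, 0.7]],   # bottom-up saliency map
)
```

A location survives only if both maps give it a non-zero value, which is why multiplication acts as a gate rather than as an average.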


5.

Stereoscopic video saliency detection fusing binocular multidimensional perceptual features

周洋 何永健 唐向宏 陆宇 蒋刚毅 《中国图象图形学报》2017,22(3):305-314

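Objective: Stereoscopic video is increasingly popular because it offers an immersive, realistic viewing experience, and visual saliency detection can automatically predict, locate, and extract important visual information, helping machines filter massive amounts of multimedia data effectively. To improve the detection of salient regions in stereoscopic video, a stereoscopic video saliency detection model that fuses binocular multidimensional perceptual features is proposed. Method: Saliency is computed along three dimensions of stereoscopic video: spatial, depth, and temporal. First, a Bayesian model is applied to the spatial features of the image to compute a 2D image saliency map. Next, a depth saliency map of the stereoscopic video frame is obtained from binocular perceptual features. Then, the Lucas-Kanade optical flow method is used to compute the motion features of local regions between frames, yielding a temporal saliency map. Finally, the three saliency maps are fused with a method based on the magnitude of the global-regional difference, producing the final distribution model of salient regions in the stereoscopic video. Result: Experiments on different types of stereoscopic video sequences show that the model achieves 80% precision and 72% recall while keeping computational complexity relatively low, outperforming existing saliency detection models. Conclusion: The proposed model effectively captures salient regions in stereoscopic video and can be applied to stereoscopic video/image coding, stereoscopic video/image quality assessment, and related fields.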

6.

Semi-automatic video object annotation combining detection and tracking

陈庆林 谷雨 宋忠浩 聂圣东 《计算机工程与应用》2021,57(14):223-230

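Because objects in consecutive video frames are redundant and manual annotation is time-consuming and labor-intensive, a semi-automatic video object annotation framework that combines detection and tracking algorithms is proposed. An improved YOLO v3 model is trained offline on manually annotated samples and then serves as the detector for online annotation. During online annotation, the object's position and label are set manually in the initial frame; in subsequent frames, the position is determined automatically from the IOU (Intersection-Over-Union) between the detection box and the tracking box, and the tracker's response output is used to decide when the object has disappeared so that annotation of the current object stops automatically. Key frames are selected with a key frame extraction algorithm based on object saliency. Detection performance of the improved YOLO v3 was compared on a self-built ship dataset, and the proposed semi-automatic annotation method was validated on ship video sequences. The experimental results show that the method significantly improves annotation efficiency, generates annotation data quickly, and is suitable for video object annotation in scenes such as ships at sea.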
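The IOU value used above to compare a detection box with a tracking box has a standard definition: the area of the intersection divided by the area of the union. A minimal sketch follows, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the box format and function name are assumptions, and how the paper turns the IOU into a position decision is not stated in the abstract.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


detection_box = (0, 0, 2, 2)
tracking_box = (1, 1, 3, 3)
overlap = iou(detection_box, tracking_box)  # 1 / 7
```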

7.

A Storage and Retrieval Technique for Scalable Delivery of MPEG-Encoded Video

《Journal of Parallel and Distributed Computing》1995,30(2):180-189

Concurrent retrieval of continuous media from a physical storage device can be achieved by interleaving data and providing a suitable scheduling algorithm. Scheduling approaches that exploit gains from statistical multiplexing are susceptible to a nonzero probability of frame loss due to the variable-bit-rate characteristic of compressed video. With interframe encoding schemes (such as specified by the MPEG standard), the losses propagate, resulting in a net loss of frames that exceeds the fraction of lost data. In this paper, we describe a mechanism for the storage and retrieval of MPEG-encoded video from a single disk storage system. The scheme balances the need for the reliable delivery of MPEG frames with the desire to support the largest number of sessions. Our approach reorganizes the MPEG-encoded video stream based on the relative importance of the frames and maps them to the storage device geometry. The reorganization reduces the impact of frames lost due to missed deadlines and distributes the frame losses over time and among sessions. Simulation results show that the new approach improves performance when compared to conventional storage and scheduling schemes.
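The reorganization idea can be illustrated with a minimal sketch, assuming the usual MPEG frame types (I, P, B) and assuming importance falls in that order because P and B frames depend on other frames for decoding. The ranking table and function name are hypothetical; the abstract does not describe the paper's actual layout on disk.

```python
# Assumed importance ranking: lower number means more important.
IMPORTANCE = {"I": 0, "P": 1, "B": 2}


def reorder_by_importance(frame_types):
    """Return (original_index, frame_type) pairs, most important first.

    sorted() is stable, so frames of the same type keep display order.
    """
    indexed = list(enumerate(frame_types))
    return sorted(indexed, key=lambda item: IMPORTANCE[item[1]])


layout = reorder_by_importance("IBBPBBP")
```

If a deadline is missed while reading such a layout, the frames still unread are the least important ones, so fewer other frames become undecodable.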

8.

Baseline-based extraction of Uyghur subtitle frames from video

张鲁建 哈力旦·阿布都热依木 黄浩 《传感器与微系统》2013,32(4)

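Exploiting the baseline characteristic unique to Uyghur script, a new method for extracting Uyghur subtitle frames from video is proposed. The Uyghur subtitle frames are first read; shot key frames are then preliminarily detected in each video segment using the inter-frame pixel difference between adjacent frames and regional pixel statistics. The detected shot key frames undergo region processing to check whether they exhibit the baseline characteristic, a threshold is set according to the baseline, and finally the main frames that represent the video's semantics are extracted. Experiments show that the method is simple and effective, with an average subtitle frame extraction rate above 85%.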
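The preliminary detection step, based on the pixel difference between adjacent frames, can be sketched as below. This assumes grayscale frames stored as lists of rows and a mean absolute difference compared against a fixed threshold; the function names, the difference measure, and the threshold value are all assumptions, and the regional pixel statistics and baseline check are not modeled.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute pixel difference between two same-sized frames."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(frame_a, frame_b)
                for a, b in zip(row_a, row_b))
    count = sum(len(row) for row in frame_a)
    return total / count


def candidate_key_frames(frames, threshold):
    """Indices of frames that differ strongly from the previous frame."""
    return [i for i in range(1, len(frames))
            if mean_abs_difference(frames[i - 1], frames[i]) > threshold]


frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 1]],          # almost unchanged
    [[200, 200], [200, 200]],  # abrupt change, e.g. a shot cut
]
candidates = candidate_key_frames(frames, threshold=10)
```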

9.

Fast human pose estimation based on optical flow

周文俊 郑新波 卿粼波 熊文诗 吴晓红 《计算机系统应用》2018,27(12):109-115

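To address the high computational complexity of current deep-learning human pose estimation algorithms, a fast human pose estimation algorithm based on optical flow is proposed. Building on the original algorithm, it first exploits the temporal correlation between video frames to split the original sequence into key frames and non-key frames that are processed separately (the frames between two adjacent key frames, together with the preceding key frame, form a frame group, and frames within a group are similar). The pose estimation algorithm runs only on key frames, and a lightweight optical flow field propagates the key frame results to the non-key frames. Second, to account for the dynamic nature of the motion field in video, an adaptive key frame detection algorithm based on the local optical flow field is proposed, which places key frames according to the local temporal characteristics of the video. Experiments on the OutdoorPose and HumanEvaI datasets show that, on video sequences with complex backgrounds, part occlusion, and similar problems, the proposed algorithm slightly improves detection performance over the original algorithm and raises detection speed by 89.6% on average.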

10.

Deformable object tracking combining image saliency and feature point matching

杨勇 闫钧华 井庆丰 《中国图象图形学报》2018,23(3):384-398

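Objective: To handle tracking failures caused by severe deformation of the target during tracking, especially severe scale change, an object tracking algorithm that combines image saliency and feature point matching is proposed. Method: First, an improved BRISK (binary robust invariant scalable keypoints) detector extracts feature points from the initial frame of the video sequence to establish the target template and its feature point set. Next, feature points are detected in the current frame and matched against the template feature point set using FLANN (fast approximate nearest neighbor search library), giving a subset of matched feature points. The matched feature points are then fused with optical flow feature points to obtain a reliable feature point set. After that, a homography matrix computed from the reliable feature point set and the template feature point set gives a coarse tracking box, and LC (local contrast) image saliency gives a refined tracking box. Finally, image saliency and the reliable feature points are fused to determine the tracking box adaptively. When the target deforms severely for three consecutive frames, the target template and its feature point set are updated. Result: To evaluate the algorithm, 8 video sequences with deformation, 2214 frames in total, were selected from the OTB2013 dataset as the experimental dataset. In the overlap experiment, the algorithm reaches an average overlap of 0.5671, better than current state-of-the-art trackers; in the overlap success rate experiment, it also tracks better than current state-of-the-art trackers. Finally, Vega Prime was used to simulate an aerial video sequence in which the target deforms severely as a UAV rapidly approaches; the maximum deformation of the target in the sequence exceeds 14 and the maximum inter-frame deformation reaches 1.72, and experiments show that the algorithm tracks better on this sequence. The algorithm also runs in real time, at an average of 48.6 frames/s. Conclusion: The algorithm tracks severely deforming targets accurately and in real time, especially targets undergoing severe scale change.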

11.

Exploiting visual saliency for assessing the impact of car commercials upon viewers

F. Fernández-Martínez A. Hernández-García M. A. Fernández-Torres I. González-Díaz Á. García-Faura F. Díaz de María 《Multimedia Tools and Applications》2018,77(15):18903-18933


12.

Face manipulation video detection with multi-key-frame feature interaction

祝恺蔓 徐文博 卢伟 赵险峰 《中国图象图形学报》2022,27(1):188-202

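Objective: Deepfake is an emerging technique that uses deep learning to tamper with images and videos, and manipulation of face videos in particular poses a serious threat to society and individuals. Detection methods that use temporal or multi-frame information are still at an early research stage, and existing work often overlooks how the way frames are extracted from a video affects detection and its efficiency. For face-swap videos, an efficient detection framework that performs per-frame feature extraction and inter-frame interaction over multiple key frames is proposed. Method: A certain number of key frames is extracted directly from the video stream, avoiding inter-frame decoding. A convolutional neural network maps the face image of each frame in a sample into a unified feature space. Multiple encoder layers based on self-attention, together with linear and nonlinear transformations, let each frame's feature aggregate information from the other frames for learning and updating, and extract the anomalous information of manipulated frames in the feature space. An additional indicator aggregates global information to make the final detection decision. Result: The framework achieves detection accuracy of at least 96.79% on all three face-swap datasets of FaceForensics++ and recognition accuracy of 99.61% on the Celeb-DF dataset. Comparative experiments on detection time also confirm that using key frames as samples improves detection efficiency and that the proposed framework is efficient. Conclusion: The proposed detection framework for face-swap videos reduces the computational cost and time of video-level detection by extracting key frames, and uses convolutional...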

13.

Painterly animation using motion maps

《Graphical Models》2008,70(1-2):1-15

Starting from an input video, we replicate the manual technique of paint-on-glass animation. Motion maps are used to represent the regions where changes occur between frames. Edges are the key to identifying frame-to-frame changes, and a strong motion map is constructed from the edges in each frame, displaced by the motion vector. A second, weak motion map records the other pixels where there is significant movement between frames. These maps are used to generate the brush strokes necessary to convert one ‘painted’ frame into the next. Local gradient interpolation, based robustly on the edges, is used to determine the orientation of the brush strokes, and we avoid holes in the image by making additional strokes with smaller brushes. We also employ MSE data in evaluating temporal coherence between frames.

14.

An innovative algorithm for key frame extraction in video summarization (total citations: 2, self-citations: 0, citations by others: 2)

Ciocca Gianluigi Schettini Raimondo 《Journal of Real-Time Image Processing》2006,1(1):69-88


15.

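Key frame extraction algorithm based on improved block color features and second-pass extraction (total citations: 1, self-citations: 0, citations by others: 1)

刘华咏 李涛 《计算机科学》2015,42(12):307-311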

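Key frame extraction is an important technique for video summarization, retrieval, browsing, and understanding. Current key frame extraction algorithms have several problems, such as complex feature selection, difficult threshold selection, and weak adaptivity. To extract video key frames more effectively, a key frame extraction algorithm based on improved block color features and second-pass extraction is proposed. First, each video frame is divided into rectangular rings of equal area. Second, quantized HSV color features are extracted from each ring, and the weight of each ring's feature decreases from the center of the frame outward to emphasize the main subject of the image. Then, key frames are preliminarily selected according to significant changes in the features between adjacent frames. Finally, a second extraction pass refines the key frames according to the spacing between the positions of the initially extracted key frames in the video. Experimental results show that the method adapts well, avoids extracting too many key frames when a shot contains sudden flashes or fast-moving objects, and yields key frames that represent the video content fairly completely and accurately.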
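One plausible reading of the second extraction pass is a spacing filter: candidates that sit too close to an already kept key frame are dropped. The sketch below assumes exactly that rule, with a hypothetical `min_gap` parameter measured in frames; the abstract does not give the paper's actual criterion.

```python
def second_pass(candidates, min_gap):
    """Keep a candidate only if it is at least min_gap frames after the
    previously kept key frame (an assumed spacing rule)."""
    kept = []
    for index in sorted(candidates):
        if not kept or index - kept[-1] >= min_gap:
            kept.append(index)
    return kept


# Bursts of candidates, e.g. caused by a flash or fast motion.
key_frames = second_pass([3, 4, 5, 40, 41, 90], min_gap=10)
```

Under this rule, a burst of adjacent candidates collapses to its first member, which matches the abstract's goal of avoiding too many key frames around flashes and fast motion.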

16.

T-STAM: an end-to-end action recognition model based on a two-stream spatio-temporal attention mechanism

石祥滨 李怡颖 刘芳 代钦 《计算机应用研究》2021,38(4):1235-1239,1276

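Two-stream methods for video action recognition ignore the relationships between feature channels, and their features carry a large amount of redundant spatio-temporal information. To address these problems, an end-to-end action recognition model based on a two-stream spatio-temporal attention mechanism, T-STAM, is proposed, which makes full use of the key spatio-temporal information in video. First, a channel attention mechanism is introduced into the two-stream backbone network; it models the dependencies between feature channels to recalibrate channel information and improve the expressiveness of the features. Second, a CNN-based temporal attention model is proposed that learns an attention score for each frame with few parameters and focuses on frames with pronounced motion. A multi-spatial attention model is also proposed that computes attention scores for each position in each frame from different perspectives, extracts multiple motion-salient regions, and fuses the spatio-temporal features to further strengthen the video's feature representation. Finally, the fused features are fed into the classification network, and the outputs of the two streams are combined with different weights to obtain the action recognition result. Experiments on the HMDB51 and UCF101 datasets show that T-STAM recognizes actions in video effectively.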
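How per-frame attention scores can weight frame features is sketched below. It assumes the scores are turned into weights with a softmax and the frame features are combined by a weighted sum, which is a common pattern but is not confirmed by the abstract. The scores here are plain inputs rather than the output of a learned CNN, and both function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(frame_features, scores):
    """Weighted sum of per-frame feature vectors (assumed pooling rule)."""
    weights = softmax(scores)
    dim = len(frame_features[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_features))
            for d in range(dim)]


# Two frames with equal scores contribute equally.
pooled = attend([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0])
```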

17.

Key frame extraction algorithm based on video clustering

刘华咏 郝会芬 李涛 《物联网技术》2014,(8):59-61

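Key frames effectively reduce the amount of data in a video index and are central to video analysis and retrieval. To address the sensitivity of traditional clustering algorithms to initial parameters during key frame extraction, an improved key frame extraction algorithm based on video clustering is proposed. First, features are extracted from the video frames, and the frames are clustered hierarchically according to inter-frame similarity to obtain an initial clustering. Next, the K-means algorithm refines the initial clustering. Finally, the cluster centers are extracted as the key frames of the video. Experimental results show that the method substantially improves the precision and recall of key frames and represents the main content of the video well.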
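The final step, taking cluster centers as key frames, can be sketched as follows. The sketch assumes each frame already has a feature vector and a cluster label from the earlier clustering stages, and it picks the real frame closest to each cluster's mean, since a mean vector is generally not itself a frame. The function names and the squared Euclidean distance are assumptions.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def key_frames_from_clusters(features, labels):
    """For each cluster, return the index of the frame nearest its mean."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    picks = []
    for label in sorted(clusters):
        members = clusters[label]
        center = centroid([features[i] for i in members])
        picks.append(min(members, key=lambda i: sq_dist(features[i], center)))
    return sorted(picks)


features = [[0.0], [1.0], [2.0], [10.0], [11.0]]
labels = [0, 0, 0, 1, 1]  # assumed output of the clustering stages
selected = key_frames_from_clusters(features, labels)
```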

18.

Error protection against key frame loss in distributed video coding

荣松 杨红 卿粼波 王正勇 《中国图象图形学报》2017,22(5):656-662

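Objective: Compared with traditional video coding, distributed video coding offers simple encoding and high error robustness, which suits emerging video services such as UAV aerial photography and wireless surveillance. In distributed video coding, video frames are divided alternately into key frames and Wyner-Ziv frames. Because of channel fading, interference, and other factors, key frames, which use traditional intra-frame coding, are far less error-robust than Wyner-Ziv frames, which are based on channel coding. Whether key frames are transmitted and decoded correctly is decisive for whether Wyner-Ziv frames can be decoded correctly, and therefore affects the compression efficiency and rate-distortion performance of the whole system. For robust transmission of key frames over heterogeneous networks, a wavelet-domain, quality-scalable protection and transmission scheme for key frames is proposed. Method: At the encoder, each key frame undergoes both traditional intra-frame video coding and wavelet-domain Wyner-Ziv coding. At the decoder, the erroneous key frame after error concealment serves as the base layer, and the parity bitstream produced by Wyner-Ziv coding serves as the enhancement layer. To improve the layering of the system so that its bit rate can adapt to different network conditions, the low-frequency and high-frequency bands of the different levels of the wavelet-decomposed image are further grouped into different enhancement layers, and Wyner-Ziv parity data for different layers is transmitted according to the channel environment. The virtual noise model of key frames under errors is also improved: the decoded and reconstructed bands of the first enhancement layer and their corresponding side information are used to obtain more realistic estimates of the virtual channel model for the corresponding bands of the second and third enhancement layers. Result: For different video sequences at key frame error rates of 1% to 20%, the proposed scheme improves the subjective quality of reconstructed images and the overall rate-distortion performance of the system compared with traditional intra-frame error concealment. For example, at a key frame error rate of 5%, transmitting the first enhancement layer raises the peak signal-to-noise ratio (PSNR) of different video sequences by up to about 25 dB; additionally transmitting the parity information of the second enhancement layer raises PSNR by a further 0.5 to 1.6 dB or so; and transmitting the parity information of all three enhancement layers essentially reaches the PSNR of error-free key frames. Conclusion: The proposed scheme handles the errors that key frames may suffer during transmission over real channels in distributed video coding systems, and its layered transmission adapts to the channel conditions of different networks.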
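The PSNR figures quoted above follow the standard definition, PSNR = 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and reconstructed frames. A minimal sketch, assuming 8-bit grayscale frames stored as lists of rows (so MAX = 255); the function name and frame layout are assumptions.

```python
import math


def psnr(reference, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two same-sized frames."""
    squared_error = sum((a - b) ** 2
                        for row_a, row_b in zip(reference, reconstructed)
                        for a, b in zip(row_a, row_b))
    count = sum(len(row) for row in reference)
    mse = squared_error / count
    if mse == 0:  # identical frames: PSNR is unbounded
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Worst case for 8-bit data: every pixel is off by the full range.
worst = psnr([[255, 255]], [[0, 0]])
```

Because the scale is logarithmic, a gain of 10 dB corresponds to a tenfold reduction in mean squared error.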

19.

Fast video super-resolution reconstruction method based on motion feature fusion

付利华 孙晓威 赵宇 李宗刚 黄笳倞 王路远 《模式识别与人工智能》2019,32(11):1022-1031

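Deep-learning video super-resolution reconstruction methods often suffer from low reconstruction accuracy or long reconstruction time, making it hard to obtain high-accuracy results in real time. To address this, a video super-resolution reconstruction method based on a deep residual network is proposed; it reconstructs video quickly and accurately and meets real-time requirements when reconstructing lower-resolution video. An adaptive key frame discrimination subnetwork adaptively identifies key frames among the video frames, and the key frames are reconstructed by a high-accuracy key frame reconstruction subnetwork. For a non-key frame, its features are fused layer by layer with the motion estimation features between it and the neighboring key frame and with the features of the neighboring key frame, directly yielding the non-key frame's features and thus a fast reconstruction. Experiments on public datasets show that the method reconstructs video quickly and accurately and is robust.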

20.

Video key frame extraction method based on dominant image colors

王松 韩永国 吴亚东 张赛楠 《计算机应用》2013,33(9):2631-2635

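To address the heavy computation, difficult threshold selection, and restricted video types of existing key frame extraction algorithms, a video key frame extraction method based on dominant image colors is proposed. The method extracts dominant color features with an octree-based color quantization algorithm, detects shot boundaries by computing the similarity of the color features, and finally clusters the extracted representative frame sequence with the K-means algorithm to accurately extract the specified number of key frames. Experimental results show that the proposed algorithm is computationally simple, uses little space, and is general and adaptable.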