Similar Documents
 20 similar documents found (search time: 593 ms)
1.
Objective Videos shot at sea contain large textureless regions, so traditional stabilization methods based on feature-point detection and tracking often perform poorly on them. To address this, a maritime video-stabilization algorithm based on steady optical-flow estimation is proposed. Method Built on hierarchical block matching, the algorithm introduces a smoothness constraint when computing hierarchical block-based optical flow, allowing the approximate optical-flow field of a maritime video to be computed quickly; an energy function based on the steady optical flow is then optimized to stabilize the video efficiently. Results Three experiments were conducted: a comparison of optical-flow estimation running time, a comparison of stabilization running time, and a user study. Compared with SteadyFlow, an existing algorithm capable of stabilizing maritime videos, the proposed optical-flow estimation is roughly 10 times faster than SteadyFlow's motion estimation, and the full stabilization pipeline is more than 70% faster. The algorithm stabilizes maritime videos effectively and produces stable output. Conclusion A maritime video-stabilization algorithm based on steady optical-flow estimation is proposed; compared with traditional methods, it is better suited to stabilizing maritime videos.
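A minimal sketch of block matching with a smoothness prior, in the spirit of the hierarchical block-based optical flow described above (single scale only; the block size, search radius, and penalty weight `lam` are illustrative assumptions, not the paper's values):

```python
import numpy as np

def block_motion(prev, curr, block=8, radius=2, lam=0.5):
    """Estimate a per-block motion field with a smoothness penalty.

    Each block is matched by exhaustive search within `radius`, scoring
    SAD plus a penalty for deviating from the average motion of the
    already-estimated left/upper neighbour blocks -- a crude stand-in
    for the smoothness constraint described above.
    """
    h, w = prev.shape
    by, bx = h // block, w // block
    flow = np.zeros((by, bx, 2))
    for i in range(by):
        for j in range(bx):
            neigh = []
            if j > 0:
                neigh.append(flow[i, j - 1])
            if i > 0:
                neigh.append(flow[i - 1, j])
            prior = np.mean(neigh, axis=0) if neigh else np.zeros(2)
            ref = prev[i*block:(i+1)*block, j*block:(j+1)*block].astype(float)
            best, best_cost = (0, 0), np.inf
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y0, x0 = i*block + dy, j*block + dx
                    if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                        continue
                    cand = curr[y0:y0+block, x0:x0+block].astype(float)
                    cost = np.abs(ref - cand).sum()
                    cost += lam * np.abs(np.array([dy, dx]) - prior).sum()
                    if cost < best_cost:
                        best_cost, best = cost, (dy, dx)
            flow[i, j] = best
    return flow

# A frame shifted down by one pixel should give dy = 1 for the blocks
# whose shifted match stays inside the frame.
prev = np.arange(32.0)[:, None] * 3 + np.arange(32.0)[None, :]
curr = np.roll(prev, 1, axis=0)
flow = block_motion(prev, curr)
```

The smoothness prior is what keeps the field coherent over the textureless water regions, where SAD alone is ambiguous.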

2.

In evaluating change detection algorithms, a new algorithm must demonstrate performance superior to the state of the art. The evaluation process consists of executing the new algorithm to segment a set of videos from a dataset and comparing the results against a ground truth. In this paper, we propose using additional information when evaluating change detection algorithms: the level of difficulty in classifying a pixel. First, for each video frame used in the evaluation, we create a difficulty-map structure that stores, for each pixel of that frame, a value representing how difficult that pixel is for an algorithm to classify. Second, we develop a metric that estimates the difficulty of each dataset video from our difficulty maps. Third, we apply the metric to select the most representative videos from the dataset based on their difficulty level. Finally, to demonstrate the method's contribution, we evaluate it on all videos of the CDNet 2014 dataset. The results show that a subset of videos selected by our method has the same potential as the original CDNet 2014 dataset; hence, a new change detection algorithm can be evaluated more quickly using our selected subset of videos.
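The difficulty-map idea can be sketched as follows, assuming difficulty is measured as the fraction of reference algorithms that misclassify a pixel (the paper's exact formulation may differ):

```python
import numpy as np

def difficulty_map(gt, predictions):
    """Per-pixel difficulty: the fraction of reference algorithms that
    misclassify the pixel against the ground truth `gt`.

    `predictions` is a list of binary masks, one per algorithm.
    """
    errors = [(p != gt).astype(float) for p in predictions]
    return np.mean(errors, axis=0)

def video_difficulty(gt_frames, pred_frames_per_algo):
    """Average difficulty over all frames of a video."""
    maps = [difficulty_map(g, [algo[i] for algo in pred_frames_per_algo])
            for i, g in enumerate(gt_frames)]
    return float(np.mean(maps))

gt = np.array([[1, 0], [0, 1]])
preds = [np.array([[1, 0], [0, 1]]),   # perfect
         np.array([[1, 1], [0, 1]]),   # one error
         np.array([[0, 1], [1, 0]])]   # all wrong
dmap = difficulty_map(gt, preds)
vd = video_difficulty([gt], [[p] for p in preds])
```

Videos would then be ranked by `video_difficulty` and a representative subset chosen across difficulty levels.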


3.
In recent years, with continual upgrades in computer hardware and advances in deep learning, newly emerging multimedia manipulation tools have made it much easier to tamper with faces in video. Face-manipulation videos produced with these tools are almost imperceptible to the naked eye, so effective means of detecting them are urgently needed. The prevailing face-manipulation techniques are Deepfake, based on autoencoders, and Face2Face, based on computer graphics. We observe that inter-frame differences in the face region of manipulated videos are markedly larger than those in untampered videos, so the difference between face images in adjacent frames can serve as an important cue for manipulation detection. In this paper, we propose a new inter-frame-difference-based framework for detecting face-manipulation videos. We first validate the framework with a detector built on traditional hand-crafted features, namely local binary pattern (LBP) and histogram of oriented gradient (HOG) features. We then incorporate a deep-learning-based detector built on a Siamese network to further strengthen the face-image feature representation and improve detection. On the FaceForensics++ dataset, the LBP/HOG-based detector achieves high accuracy, while the Siamese-network-based detector achieves even higher accuracy and stronger robustness. Here, robustness means that a detector maintains high accuracy under three conditions: representing the inter-frame face difference in two different ways, computing the difference from frame pairs at three different intervals, and using different compression rates for the training and test sets.
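The inter-frame-difference cue with an LBP descriptor can be sketched as below; `face_diff_feature`, the bin count, and applying LBP to the difference image are illustrative assumptions rather than the paper's exact pipeline:

```python
import numpy as np

def lbp_8(img):
    """Basic 8-neighbour local binary pattern for a grayscale image.
    Border pixels are skipped for brevity."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= ((neigh >= center).astype(np.uint8) << bit)
    return out

def face_diff_feature(face_a, face_b, bins=16):
    """Histogram of LBP codes of the absolute inter-frame face
    difference -- a crude stand-in for the LBP/HOG descriptor above."""
    diff = np.abs(face_a.astype(int) - face_b.astype(int)).astype(np.uint8)
    codes = lbp_8(diff)
    hist, _ = np.histogram(codes, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

a = np.random.RandomState(0).randint(0, 256, (32, 32)).astype(np.uint8)
b = np.roll(a, 1, axis=1)  # a slightly shifted "next frame"
feat = face_diff_feature(a, b)
```

A classifier (hand-crafted or a Siamese network, as in the paper) would then separate the feature distributions of real and manipulated face pairs.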

4.
尹丽华, 康亮, 朱文华. 《计算机应用》 (Journal of Computer Applications), 2022, 42(8): 2564-2570
To eliminate the interference of complex moving foregrounds with video-stabilization accuracy, and exploiting the particular strength of spatio-temporal saliency for moving-object detection, a high-precision video-stabilization algorithm incorporating spatio-temporal saliency is proposed. The algorithm identifies moving objects via spatio-temporal saliency detection and removes them, and performs motion compensation along multi-grid motion paths. Its stages comprise SURF feature-point extraction and matching, spatio-temporal salient-object detection, grid partitioning and motion-vector computation, motion-trajectory generation, multi-path smoothing, and motion compensation. Experimental results show that the proposed algorithm stands out on the stability metric compared with traditional stabilization algorithms. For videos disturbed by large-scale foreground motion, it improves stability by about 9.6% over RTVSM (Robust Traffic Video Stabilization Method assisted by foreground feature trajectories); for videos with multiple moving foregrounds, it improves stability by about 5.8% over the Bundled-paths algorithm, fully demonstrating its stabilization advantage in complex scenes.

5.
In this paper, we propose a novel motion-based video retrieval approach that finds desired videos in video databases through trajectory matching. The main component of our approach is extracting representative motion features from the video, which breaks down into three steps. First, we extract the motion vectors from each frame and use Harris corner points to compensate for camera motion. Second, we find interesting motion flows in frames using a sliding-window mechanism and a clustering algorithm. Third, we merge the generated motion flows and select representative ones to capture the video's motion features. Furthermore, we design a symbol-based trajectory matching method for effective video retrieval. The experimental results show that our algorithm extracts motion flows effectively and with high accuracy, and outperforms existing approaches for video retrieval.

6.
Human action recognition is an important foundational technology for intelligent surveillance, human-computer interaction, robotics, and related fields. Graph convolutional networks (GCNs) have achieved excellent performance in skeleton-based action recognition, but two problems remain: 1) skeleton joints are represented by coordinates alone, lacking detailed motion information; 2) in some videos the skeleton moves so little that key joints carry scarcely distinguishable information. To address these problems, a temporal-divergence model of skeleton joints is first proposed to describe their motion state, enlarging the inter-class variance between different human actions. An attention mechanism over the temporal-divergence features is further proposed to highlight key joints and widen the inter-class variance further. Finally, a two-stream fusion model is built that exploits the complementarity between the original skeleton's spatial features and the temporal-divergence features. The proposed algorithm reaches accuracies of 82.9% and 83.7% under the two evaluation protocols of the authoritative NTU-RGB+D action dataset, improvements of 1.3 and 0.5 percentage points over the adaptive graph convolutional network (AGCN); this gain demonstrates the algorithm's effectiveness.
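One plausible reading of the temporal-divergence feature with joint attention is sketched below (the divergence is taken here as the variance of per-joint displacement magnitudes over time -- an assumption, not the paper's exact definition):

```python
import numpy as np

def temporal_divergence(joints):
    """Per-joint temporal divergence: variance of frame-to-frame
    displacement magnitudes over the sequence.

    `joints` has shape (T, J, C): T frames, J joints, C coordinates.
    """
    disp = np.linalg.norm(np.diff(joints, axis=0), axis=-1)  # (T-1, J)
    return disp.var(axis=0)                                  # (J,)

def joint_attention(div):
    """Softmax attention over joints, emphasising high-divergence
    (i.e., actively moving) joints."""
    e = np.exp(div - div.max())
    return e / e.sum()

T, J = 10, 4
rng = np.random.RandomState(1)
seq = np.zeros((T, J, 2))
seq[:, 0, 0] = np.sin(np.linspace(0, 3, T)) * 5  # one very active joint
seq += rng.normal(scale=0.01, size=seq.shape)    # small sensor noise
att = joint_attention(temporal_divergence(seq))
```

In a two-stream model, these attention-weighted divergence features would complement the raw joint-coordinate stream.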

7.
In this paper, we present a novel video stabilization method with a pixel-wise motion model. To avoid the distortion introduced by traditional feature-point-based motion models, we focus on constructing a more accurate model that captures the motion in videos. By taking advantage of dense optical flow, we obtain the dense motion field between adjacent frames and set up a sufficiently accurate pixel-wise motion model. Our method first estimates the dense motion field between adjacent frames with a proposed PatchMatch-based algorithm, designed specifically for similar video frames rather than arbitrary images to reach higher speed and better performance. A simple and fast smoothing algorithm then stabilizes the jittery motion. After that, we warp the input frames using a weighted-average algorithm to construct the output frames. Some pixels in the output frames may still be empty after warping, so in the last step these empty pixels are filled with a patch-based image-completion algorithm. We test our method on many challenging videos and demonstrate the accuracy of our model and the effectiveness of our method.

8.
A video copy-detection algorithm based on trajectory behavior-pattern features
To exploit the temporal motion information of video and improve the precision and robustness of video copy detection, a copy-detection algorithm based on the behavior patterns of feature-point trajectories is proposed. Behavior-pattern features of feature-point trajectories are first extracted from consecutive frames; a visual-keyword dictionary is then used to construct the video's motion features; finally, copy detection is performed using the similarity of those motion features. The algorithm achieves high detection precision on the TRECVID benchmark. Experimental analysis shows that the trajectory-based motion features are highly discriminative and robust to common copy transformations.

9.
In this paper, a system is presented that can detect 48 human actions in realistic videos, ranging from simple actions such as 'walk' to complex actions such as 'exchange'. We propose a method that yields a major performance improvement, stemming from a different approach to three themes: sample selection, two-stage classification, and the combination of multiple features. First, we show that sampling can be improved by smart selection of the negatives. Second, we show that exploiting the posteriors of all 48 actions in a two-stage classification greatly improves detection. Third, we show how low-level motion and high-level object features should be combined. Together these yield a performance improvement by a factor of 2.37 for human action detection on the visint.org test set of 1,294 realistic videos. In addition, we demonstrate that selective sampling and the two-stage setup improve on standard bag-of-features methods on the UT-interaction dataset, and our method outperforms the state of the art on the IXMAS dataset.

10.
To address the detection difficulties caused by small pedestrian targets, pedestrian occlusion, and pedestrian overlap in videos of large-scale crowded scenes, this paper applies the pixel-wise-prediction detection framework FCOS (fully convolutional one-stage object detection) to pedestrian detection, proposes an improved backbone network for extracting pedestrian features, and adds scale regression to achieve multi…

11.
Objective Kinect captures motion data in real time at far lower cost than traditional motion-capture equipment and is therefore widely used for motion capture. However, the motion data it acquires are of low precision, and existing motion-data processing algorithms are hard to apply to them. Method For footplant detection, a key step in motion-data processing, a robust footplant-detection algorithm for Kinect motion data is proposed. An adaptive bilateral filter first reduces the noise in the Kinect motion data; several foot-motion features are then defined and used for classification, improving its quality; finally, a support vector machine (SVM) is trained to obtain the decision function used for footplant detection. Results Applied to various types of motion data, the algorithm effectively reduces the noise in Kinect motion data and detects footplants quickly and accurately: its accuracy is about 10% higher than the classical baseline method and about 8% higher than the K-nearest-neighbor method, and it detects the footplants in a frame about 7 times faster than the K-nearest-neighbor method. Conclusion Analysis of the experimental results shows the algorithm to be robust, fast, and accurate, and widely applicable to motion-data processing.
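A fixed-parameter 1-D bilateral filter on a joint trajectory illustrates the denoising step (the paper's filter is adaptive; `sigma_s`, `sigma_r`, and the radius here are assumed values):

```python
import numpy as np

def bilateral_1d(signal, radius=3, sigma_s=2.0, sigma_r=1.0):
    """1-D bilateral filter: smooths a noisy joint trajectory while
    preserving sharp transitions such as foot strikes."""
    n = len(signal)
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        idx = np.arange(lo, hi)
        # spatial weight (distance in time) x range weight (value gap)
        w = (np.exp(-((idx - i) ** 2) / (2 * sigma_s ** 2)) *
             np.exp(-((signal[idx] - signal[i]) ** 2) / (2 * sigma_r ** 2)))
        out[i] = np.sum(w * signal[idx]) / np.sum(w)
    return out

rng = np.random.RandomState(0)
clean = np.concatenate([np.zeros(20), np.full(20, 10.0)])  # a foot strike
noisy = clean + rng.normal(scale=0.2, size=40)
smooth = bilateral_1d(noisy)
```

The range weight is what keeps the step at frame 20 sharp: neighbours across the transition differ by ~10 units and are effectively ignored.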

12.
Objective Stereoscopic video is increasingly popular for the immersive, lifelike experience it provides, while visual saliency detection can automatically predict, locate, and mine important visual information, helping machines filter massive amounts of multimedia effectively. To improve salient-region detection in stereoscopic video, a stereoscopic video saliency model fusing multi-dimensional binocular perception cues is proposed. Method Saliency is computed along three dimensions: the spatial, depth, and temporal domains. First, a 2D image saliency map is computed from spatial image features with a Bayesian model; next, a depth saliency map is obtained from binocular perception cues; then, motion features of local regions between frames are computed with the Lucas-Kanade optical-flow method to obtain the temporal saliency map; finally, the three maps are merged with a fusion method based on the magnitude of global-region differences to produce the final distribution of salient regions in the stereoscopic video. Results Experiments on stereoscopic video sequences of different types show that the model attains 80% precision and 72% recall at relatively low computational complexity, outperforming existing saliency models. Conclusion The proposed model effectively captures the salient regions of stereoscopic video and can be applied to stereoscopic video/image coding, stereoscopic video/image quality assessment, and related fields.
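The fusion step might look like the sketch below, where each map's weight is a global-versus-salient-region contrast (the weighting formula is an illustrative assumption, not the paper's exact global-region difference):

```python
import numpy as np

def global_region_weight(smap):
    """Weight for a saliency map: the gap between the mean of its most
    salient region (top 5% of values) and its global mean -- an
    illustrative stand-in for the global-region difference."""
    thresh = np.percentile(smap, 95)
    return float(smap[smap >= thresh].mean() - smap.mean())

def fuse(maps):
    """Fuse the spatial, depth and temporal saliency maps with
    contrast-based weights, then normalise the result to [0, 1]."""
    w = np.array([global_region_weight(m) for m in maps])
    w = w / w.sum()
    fused = sum(wi * m for wi, m in zip(w, maps))
    return (fused - fused.min()) / (fused.max() - fused.min() + 1e-12)

rng = np.random.RandomState(0)
spatial = rng.rand(16, 16) * 0.1       # weak, noisy map
temporal = rng.rand(16, 16) * 0.1      # weak, noisy map
depth = np.zeros((16, 16))
depth[6:10, 6:10] = 1.0                # confident, concentrated map
fused = fuse([spatial, depth, temporal])
```

Maps with a clearly concentrated salient region dominate the fusion, while flat, uninformative maps are down-weighted.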

13.
Traditional action-recognition algorithms cannot fully exploit the spatio-temporal information of human actions in video, and their accuracy is low. A new 3D dense convolutional network method for action recognition is proposed. Taking a two-stream network as the basic framework, the spatial stream uses a 3D dense network with an attention mechanism to extract appearance features of the action in the video, while the temporal stream extracts motion features from the optical flow of consecutive frames; after fusing the spatio-temporal features and the classification layers, the final recognition result is obtained. To extract features more accurately and model the interaction between the two streams, cross-stream connections are added between them to fuse convolutional-layer features. Experiments on the UCF101 and HMDB51 datasets show recognition accuracies of 94.52% and 69.64% respectively, demonstrating that the model fully exploits the spatio-temporal information in video and extracts the key information of the motion.

14.
Objective Deficiencies in optical-flow estimation, noise, and the limitations of existing motion-attention models mean that computed motion attention often fails to reflect true motion saliency, restricting further use of motion-saliency maps. To compute motion attention more accurately, a motion-attention computation method based on spatio-temporal multi-scale analysis is proposed. Method A motion-attention model is built on the mechanism that visual motion attention arises from spatio-temporal motion contrast, and temporal-scale filtering removes noise. Since visual observation is scale-dependent, video frames are decomposed over multiple scales and motion attention is computed at each spatial scale; the results at the low, medium-low, and original scales are then fused according to the correlation coefficients of macroblock pixel values, yielding the final motion-saliency map. Results Tests on multiple video sequences show that the method reflects the motion saliency of video scenes more faithfully than comparable methods and greatly improves the accuracy of the motion-saliency map. Conclusion The proposed spatio-temporal multi-scale method markedly improves the accuracy of motion-attention computation across video scenes of varying complexity, laying a sound foundation for further applications of visual motion attention.
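The multi-scale computation and correlation-weighted fusion can be sketched as follows (correlation is computed here against the full-scale map -- an assumption, since the paper correlates macroblock pixel values):

```python
import numpy as np

def downsample(m, f):
    """Average-pool a map by factor f."""
    h, w = m.shape
    return m[:h // f * f, :w // f * f].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def upsample(m, f):
    """Nearest-neighbour upsample by factor f."""
    return np.repeat(np.repeat(m, f, axis=0), f, axis=1)

def multiscale_attention(motion, scales=(1, 2, 4)):
    """Compute motion attention at several spatial scales and fuse the
    rescaled maps, weighting each by its correlation with the
    full-scale map."""
    maps, weights = [], []
    for s in scales:
        m = upsample(downsample(motion, s), s) if s > 1 else motion
        c = np.corrcoef(motion.ravel(), m.ravel())[0, 1]
        maps.append(m)
        weights.append(max(c, 0.0))
    w = np.array(weights) / sum(weights)
    return sum(wi * m for wi, m in zip(w, maps))

motion = np.zeros((16, 16))
motion[4:8, 4:8] = 1.0       # a moving region aligned to all three grids
sal = multiscale_attention(motion)
```

Coarse scales suppress pixel-level flow noise, while the correlation weights keep scales that agree with the observed motion dominant.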

15.

Recently, deep learning methods have been applied in many real-world scenarios with the development of convolutional neural networks (CNNs). In this paper, we introduce a camera-based basketball scoring detection (BSD) method combining CNN-based object detection with frame-difference-based motion detection. The proposed BSD method takes videos of the basketball court as input. A real-time object detector, the you-only-look-once (YOLO) model, locates the position of the basketball hoop. Motion detection based on frame differencing then determines whether there is any object motion in the hoop area, and hence the scoring condition. The proposed BSD method runs in real time with satisfactory scoring-detection accuracy. Our experiments on basketball court videos collected from real scenarios demonstrate this accuracy. Furthermore, several intelligent basketball analysis systems based on the proposed method have been installed at multiple basketball courts in Beijing and perform well.
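The frame-difference motion test inside the detected hoop region can be sketched as below (thresholds are illustrative assumptions):

```python
import numpy as np

def motion_in_region(frame_prev, frame_curr, box, thresh=25, min_ratio=0.02):
    """Frame-difference motion test inside a region of interest.

    `box` is (y0, y1, x0, x1), e.g. the hoop area located by a detector.
    Returns True when enough pixels changed between the two frames.
    """
    y0, y1, x0, x1 = box
    a = frame_prev[y0:y1, x0:x1].astype(int)
    b = frame_curr[y0:y1, x0:x1].astype(int)
    changed = np.abs(a - b) > thresh
    return bool(changed.mean() > min_ratio)

prev = np.zeros((64, 64), dtype=np.uint8)
curr = prev.copy()
curr[10:20, 10:20] = 200     # a ball passing through the hoop area
hoop = (5, 25, 5, 25)        # (y0, y1, x0, x1) from the hoop detector
```

Restricting the difference to the detected hoop box is what keeps the test cheap and insensitive to motion elsewhere on the court.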


16.
Real-time highway traffic monitoring systems play a vital role in road traffic management and planning and in preventing frequent traffic jams, traffic-rule violations, and fatal road accidents. These systems rely entirely on online traffic-flow information estimated from time-dependent vehicle trajectories, which are extracted from vehicle detection and tracking data obtained by processing road-side camera images. General-purpose object detectors, including Yolo, SSD, and EfficientNet, have been utilized extensively for real-time object detection, but Yolo is generally preferred because it provides high frames-per-second (FPS) performance and robust object localization. However, its average vehicle-classification accuracy is below 57%, which is insufficient for traffic-flow monitoring. This study proposes improving the vehicle-classification accuracy of Yolo and developing a novel bounding-box (Bbox)-based vehicle tracking algorithm. For this purpose, a new vehicle dataset was prepared by annotating 7216 images with 123831 object patterns collected from highway videos. Nine machine-learning classifiers and a CNN-based classifier were trained on this dataset, and the most accurate of the ten was combined with Yolo, raising the classification accuracy of the Yolo-based vehicle detector from 57% to 95.45%. Vehicle detector 1 (Yolo) and vehicle detector 2 (Yolo + best classifier), together with Kalman-filter-based tracking (vehicle tracker 1) and Bbox-based tracking (vehicle tracker 2), were applied to categorical/total vehicle counting on 4 highway videos. The results show that the developed approach (vehicle detector 2 + vehicle tracker 2) improved vehicle-counting accuracy by 13.25% and performed better than the other 3 vehicle-counting systems implemented in this study.
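Bbox-based tracking by greedy IoU association can be sketched as below (the greedy rule and the `min_iou` threshold are assumptions; the study's exact association logic is not given here):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def match_tracks(tracks, detections, min_iou=0.3):
    """Greedy Bbox-overlap association of current detections to the
    existing tracks: each track takes the unassigned detection with the
    highest IoU above `min_iou`."""
    assigned, matches = set(), {}
    for tid, tbox in tracks.items():
        best, best_iou = None, min_iou
        for di, dbox in enumerate(detections):
            if di in assigned:
                continue
            v = iou(tbox, dbox)
            if v > best_iou:
                best, best_iou = di, v
        if best is not None:
            assigned.add(best)
            matches[tid] = best
    return matches

tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}   # last known boxes
dets = [(52, 51, 61, 60), (1, 0, 11, 10)]           # current detections
m = match_tracks(tracks, dets)
```

Unmatched detections would start new tracks, and tracks unmatched for several frames would be dropped; counting then follows track lifetimes.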

17.
Action recognition across a large number of categories of unconstrained web videos is a very challenging problem compared to datasets like KTH (6 actions), IXMAS (13 actions), and Weizmann (10 actions). Challenges such as camera motion, different viewpoints, large inter-class variation, cluttered backgrounds, occlusion, bad illumination, and the poor quality of web videos cause the majority of state-of-the-art action-recognition approaches to fail; an increased number of categories and the inclusion of easily confused actions add to the difficulty. In this paper, we propose using the scene-context information obtained from moving and stationary pixels in the key frames, in conjunction with motion features, to solve action recognition on a large (50-action) dataset of web videos. We perform a combination of early and late fusion over multiple features to handle the very large number of categories, and we demonstrate that scene context is a very important feature for action recognition on very large datasets. The proposed method requires no video stabilization, person detection, or tracking and pruning of features. Our approach performs well on a large number of action categories; it has been tested on the UCF50 dataset with 50 action categories, an extension of the UCF YouTube Action (UCF11) dataset containing 11 action categories. We also tested our approach on the KTH and HMDB51 datasets for comparison.

18.
With the spread of smartphones and 5G networks, short videos have become a principal way for people to acquire knowledge in fragments of spare time. To address the shortage of real-life short-video datasets and low classification accuracy, a two-pipeline short-video classification method fusing deep-learning techniques is proposed. In the main pipeline, an A-VGG-3D network model is built: a VGG network with an attention mechanism extracts features, and an optimized 3D convolutional neural network classifies the short videos, improving temporal continuity, balance, and robustness. In the auxiliary pipeline, frame differencing detects shot changes and extracts several frames from the short video, which undergo multi-scale face detection that fuses a sliding-window mechanism with cascade classifiers, further improving classification accuracy. Experimental results show that, on the UCF101 dataset and a self-built real-life short-video dataset, the method reaches precision and recall of up to 98.9% and 98.6% for non-drama and non-interview short videos, and improves classification accuracy on UCF101 by 9.7 percentage points over a C3D-network-based short-video classification method, showing stronger generality.

19.
To address the low detection accuracy caused by the limited sample size of existing helmet-wearing datasets, a sample-augmentation algorithm based on scene augmentation is proposed. The algorithm randomly scales the detection targets in images randomly sampled from the training set and pastes them at arbitrary positions in another randomly chosen scene image, constructing augmented scenes with new detection targets from existing ones; the helmet-wearing training set is thereby expanded via scene augmentation and its diversity increased. To verify the algorithm's effectiveness for helmet-wearing detection, the HelmetWear dataset was expanded with the scene-augmentation algorithm and used to train a YOLO v4-based helmet-wearing detection model, with the augmentation evaluated by detection accuracy. Detection accuracy on the HelmetWear dataset reaches 93.81%, an improvement of 6.39 percentage points. The experimental results show that the algorithm effectively improves helmet-wearing detection accuracy, most markedly on the small targets for which samples are scarcest; the scene-augmentation algorithm also offers a useful reference for the problem of insufficient training data in other object-detection domains.
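The paste-based scene augmentation can be sketched as follows (nearest-neighbour resizing, no blending, and the scale range are illustrative assumptions):

```python
import numpy as np

def scene_augment(target_img, obj_patch, obj_label, rng):
    """Paste a detection crop into a random location of another scene
    image at a random scale, returning the augmented image and the new
    bounding-box annotation (label, x0, y0, x1, y1)."""
    scale = rng.uniform(0.5, 1.5)
    ph = max(1, int(obj_patch.shape[0] * scale))
    pw = max(1, int(obj_patch.shape[1] * scale))
    # nearest-neighbour resize of the object crop
    ys = np.arange(ph) * obj_patch.shape[0] // ph
    xs = np.arange(pw) * obj_patch.shape[1] // pw
    patch = obj_patch[ys][:, xs]
    H, W = target_img.shape[:2]
    y0 = rng.randint(0, H - ph + 1)
    x0 = rng.randint(0, W - pw + 1)
    out = target_img.copy()
    out[y0:y0 + ph, x0:x0 + pw] = patch
    return out, (obj_label, x0, y0, x0 + pw, y0 + ph)

rng = np.random.RandomState(0)
scene = np.zeros((64, 64), dtype=np.uint8)          # a background scene
helmet = np.full((16, 16), 255, dtype=np.uint8)     # a helmet crop
aug, box = scene_augment(scene, helmet, "helmet", rng)
```

Each paste yields both a new training image and its ground-truth box, so the expanded set needs no manual re-annotation.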

20.
A decision-tree-based MPEG video shot-segmentation algorithm
Shot segmentation of compressed video is a difficult problem in video content analysis, and since shots play a key role in organizing and indexing video, a decision-tree-based shot-segmentation algorithm for MPEG video is proposed. The algorithm trains a decision tree, a machine-learning method, on sample videos and fuses motion information, color, edges, and other features to obtain the best thresholds for shot segmentation, handling well the difficult detection of abrupt and gradual shot transitions in compressed video; it can also detect flash events and camera motion. Experiments show that the algorithm achieves good results in compressed-video shot detection.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号