
1.

Video saliency detection incorporating temporal information in compressed domain

《Signal Processing: Image Communication》2015

Saliency detection is widely used to pick out relevant parts of a scene as visual attention regions for various image/video applications. Since video is increasingly captured, moved and stored in compressed form, there is a need to detect video saliency directly in the compressed domain. In this study, a compressed-domain video saliency detection algorithm is proposed based on discrete cosine transform (DCT) coefficients and motion information within a visual window. Firstly, DCT coefficients and motion information are extracted from the H.264 video bitstream without full decoding. With a high quantization parameter setting in the encoder, skip/intra is easily chosen as the best prediction mode, resulting in a large number of blocks with zero motion vectors and no residual in the video bitstream. To address this problem, the motion vectors of skip/intra coded blocks are calculated by interpolating those of their surrounding blocks. In addition, a visual window is constructed to enhance the contrast of features and to avoid being affected by the encoder. Secondly, after the spatial and temporal saliency maps are generated by the normalized entropy, a motion importance factor is imposed to refine the temporal saliency map. Finally, a variance-like fusion method is proposed to dynamically combine these maps to yield the final video saliency map. Experimental results show that the proposed approach significantly outperforms other state-of-the-art video saliency detection models.
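
The interpolation of motion vectors for skip/intra coded blocks can be illustrated with a minimal sketch. The abstract does not specify the interpolation scheme, so the version below simply averages the motion vectors of the 4-connected neighbors that are not themselves skip/intra coded. The function name `fill_skip_intra_mvs`, the block-grid layout and the averaging rule are all assumptions made for illustration, not the authors' method.

```python
def fill_skip_intra_mvs(mvs, needs_fill):
    """Return a copy of mvs where flagged blocks get the mean MV of valid neighbors.

    mvs        -- 2D list of (dx, dy) motion vectors, one per block
    needs_fill -- 2D list of bools, True for skip/intra coded blocks
    Averaging over 4-connected neighbors is an illustrative assumption.
    """
    rows, cols = len(mvs), len(mvs[0])
    out = [list(row) for row in mvs]
    for r in range(rows):
        for c in range(cols):
            if not needs_fill[r][c]:
                continue
            # Collect neighbors that lie inside the grid and carry a real MV.
            neighbors = [
                mvs[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols and not needs_fill[nr][nc]
            ]
            if neighbors:  # leave the block unchanged if no neighbor is usable
                out[r][c] = (
                    sum(v[0] for v in neighbors) / len(neighbors),
                    sum(v[1] for v in neighbors) / len(neighbors),
                )
    return out
```

For a single row of blocks with vectors `(2, 0)`, skip, `(4, 0)`, the skipped block would receive the mean of its two coded neighbors.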

2.

Spatiotemporal saliency detection and salient region determination for H.264 videos

Kang-Ting Hu Jin-Jang Leou Han-Hui Hsiao 《Journal of Visual Communication and Image Representation》2013,24(7):760-772

In this study, a spatiotemporal saliency detection and salient region determination approach for H.264 videos is proposed. After Gaussian filtering in Lab color space, the phase spectrum of the Fourier transform is used to generate the spatial saliency map of each video frame. In parallel, the motion vector fields from each H.264 compressed video bitstream are backward accumulated. After normalization and global motion compensation, the phase spectrum of the Fourier transform for the moving parts is used to generate the temporal saliency map of each video frame. Then, the spatial and temporal saliency maps of each video frame are combined using adaptive fusion to obtain its spatiotemporal saliency map. Finally, a modified salient region determination scheme is used to determine the salient regions (SRs) of each video frame. Based on the experimental results obtained in this study, the performance of the proposed approach is better than that of the two comparison approaches.
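
The phase-spectrum step can be sketched as follows. This is a generic phase-only reconstruction, assuming a single-channel input and NumPy; the Gaussian filtering, Lab color handling, motion vector accumulation and adaptive fusion described in the abstract are omitted, and the function name `phase_spectrum_saliency` is hypothetical.

```python
import numpy as np

def phase_spectrum_saliency(gray):
    """Saliency from the phase spectrum of the 2D Fourier transform.

    gray -- 2D array of intensities (one channel).
    Returns a map of the same shape, scaled so its maximum is 1.
    """
    spectrum = np.fft.fft2(np.asarray(gray, dtype=float))
    # Keep only the phase: every frequency gets unit magnitude.
    phase_only = np.exp(1j * np.angle(spectrum))
    recon = np.fft.ifft2(phase_only)
    saliency = np.abs(recon) ** 2
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency
```

In practice the resulting map is usually smoothed before use; that step is left out here to keep the sketch short.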

3.

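Visual saliency map generation algorithm based on spatio-temporal features

鲁雯 崔子冠 干宗良 刘峰 朱秀昌 《电视技术》2015,39(17):1-4

This paper proposes a new method for computing the spatio-temporal saliency map of an image. The algorithm first obtains absolute motion vectors with the pyramidal Lucas-Kanade algorithm and computes background motion vectors with an 8-parameter perspective model, then derives the temporal saliency map from the difference between the two. Next, the spatial saliency map is computed from color contrast and texture information. Finally, the spatial and temporal maps are fused and thresholded to obtain the overall image saliency map. Experimental results show that the new algorithm extracts the salient regions of video images more effectively than existing algorithms.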

4.

Visual saliency guided video compression algorithm

Rupesh Gupta Meera Thapar Khanna Santanu Chaudhury 《Signal Processing: Image Communication》2013,28(9):1006-1022

Recently, saliency maps computed from input images have been used to detect interesting regions in images/videos so that processing can focus on these salient regions. This paper introduces a novel macroblock-level visual saliency guided video compression algorithm. It is modelled as a two-step process, viz. salient region detection and frame foveation. Visual saliency is modelled as a combination of low-level features and high-level features, the latter becoming important at the higher-level visual cortex. A relevance vector machine is trained over three-dimensional feature vectors pertaining to global, local and rarity measures of conspicuity, to yield probabilistic values which form the saliency map. These saliency values are used for non-uniform bit allocation over video frames. To achieve these goals, we also propose a novel video compression architecture, incorporating saliency, to save a tremendous amount of computation. This architecture is based on thresholding the mutual information between successive frames to flag frames that require re-computation of saliency, and on using motion vectors to propagate saliency values.

5.

Saliency detection in the compressed domain for adaptive image retargeting (total citations: 2; self-citations: 0; citations by others: 2)

Fang Y Chen Z Lin W Lin CW 《IEEE transactions on image processing》2012,21(9):3888-3901

Saliency detection plays an important role in many image processing applications, such as region-of-interest extraction and image resizing. Existing saliency detection models are built in the uncompressed domain. Since most images on the Internet are stored in a compressed format such as Joint Photographic Experts Group (JPEG), we propose a novel saliency detection model in the compressed domain in this paper. The intensity, color, and texture features of the image are extracted from discrete cosine transform (DCT) coefficients in the JPEG bit-stream. The saliency value of each DCT block is obtained based on the Hausdorff distance calculation and feature map fusion. Based on the proposed saliency detection model, we further design an adaptive image retargeting algorithm in the compressed domain. The proposed image retargeting algorithm uses a multi-operator scheme comprising block-based seam carving and image scaling to resize images. A new definition of texture homogeneity is given to determine the number of block-based seams to remove. Thanks to the accurate saliency information derived directly from the compressed domain, the proposed image retargeting algorithm effectively preserves the visually important regions of images, efficiently removes the less crucial regions, and therefore significantly outperforms the relevant state-of-the-art algorithms, as demonstrated by in-depth analysis in extensive experiments.
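
For readers unfamiliar with the Hausdorff distance mentioned above, a minimal sketch of the standard definition follows. How the paper applies it to DCT block features is not described in the abstract, so the example works on plain point sets; the function names are illustrative.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

The symmetric form takes the larger of the two directed distances, so a single outlying point in either set is enough to make the distance large.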

6.

Key frame extraction based on visual attention model (total citations: 2; self-citations: 0; citations by others: 2)

Jie-Ling Lai Yang Yi 《Journal of Visual Communication and Image Representation》2012,23(1):114-125

Key frame extraction is an important technique in video summarization, browsing, searching and understanding. In this paper, we propose a novel approach to extract the most attractive key frames by using a saliency-based visual attention model that bridges the gap between semantic interpretation of the video and low-level features. First, dynamic and static conspicuity maps are constructed based on motion, color and texture features. Then, by introducing a suppression factor and motion priority schemes, the conspicuity maps are fused into a saliency map that includes only true attention regions, from which an attention curve is produced. Finally, after a time-constrained clustering algorithm groups frames with similar content, the frames with the maximum saliency value are selected as key frames. Experimental results demonstrate the effectiveness of our approach for video summarization by retrieving meaningful key frames.

7.

Spatiotemporal segmentation for compact video representation

《Signal Processing: Image Communication》2001,16(6):553-566

In this paper, a novel hierarchical object-oriented video segmentation and representation algorithm is proposed. The local variance contrast and the frame difference contrast are jointly exploited for structural spatiotemporal video segmentation, because these two visual features efficiently indicate the spatial homogeneity of the grey levels and the temporal coherence of the motion fields. A two-dimensional (2D) spatiotemporal entropic technique is further used to generate the 2D thresholding vectors adaptively according to the variations of the video components. After the region growing and edge simplification procedures, the accurate boundaries among the different video components are obtained by an intra-block edge extraction procedure. Moreover, the relationships of the video components among frames are exploited by a temporal tracking procedure. The proposed object-oriented spatiotemporal video segmentation algorithm may be useful for MPEG-4 systems in generating the video object plane (VOP) automatically.

8.

Saliency-based dense trajectories for action recognition using low-rank matrix decomposition

《Journal of Visual Communication and Image Representation》2016

Dense trajectory methods have recently proven successful in recognizing actions in realistic videos. However, their performance is still limited by uniform dense sampling, which does not discriminate between action-related areas and background. This paper proposes to improve dense trajectories for recognizing actions captured in realistic scenes, especially in the presence of camera motion. Firstly, based on the observation that the motion in action-related areas is usually much more irregular than the camera motion in the background, we recover the salient regions in a video by applying low-rank matrix decomposition to the motion information and use the saliency maps to indicate action-related areas. Considering that action-related regions change but remain continuous over time, we temporally split a video into subvideos and compute the salient regions subvideo by subvideo. In addition, to ensure spatial continuity, we spatially divide a subvideo into patches and arrange the vectorized optical flow of all the spatial patches to collect the motion information for salient region detection. Then, after the saliency maps of all subvideos in a video are obtained, we incorporate them into dense tracking to extract saliency-based dense trajectories to describe actions. To evaluate the performance of the proposed method, we conduct experiments on four benchmark datasets, namely Hollywood2, YouTube, HMDB51 and UCF101, and show that the performance of our method is competitive with the state of the art.

9.

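A spatio-temporal video object segmentation algorithm using morphological filtering

王煜坚 高建坡 吴镇扬 《电路与系统学报》2007,12(5):18-24

Extracting video objects from video images is a key technique in content-based video coding. This paper proposes a spatio-temporal video object segmentation algorithm based on inter-frame motion information and morphological filtering. The algorithm first uses a block-wise higher-order statistics algorithm and a thresholding algorithm based on maximum between-class variance to obtain a detection mask of the object's moving region. Then, a watershed algorithm based on alternating sequential filtering by reconstruction is used to obtain the precise edges of the foreground object. Finally, a region-based spatio-temporal fusion method combines the motion detection and morphological segmentation results to extract the video object. Experimental results show that the algorithm avoids region merging and effectively extracts video objects with precise edges, with good subjective and objective segmentation quality.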
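
Thresholding by maximum between-class variance is commonly known as Otsu's method. The sketch below shows that step in isolation on a flat list of 8-bit values; it assumes integer inputs in 0..255 and does not reproduce the block-wise higher-order statistics that the paper feeds into it. The function name `otsu_threshold` is illustrative.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximizes between-class variance.

    pixels -- iterable of integers in 0..255.
    Values <= t form one class, values > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0          # number of pixels in the lower class
    sum0 = 0.0      # intensity sum of the lower class
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

On a clearly bimodal input the returned threshold falls between the two modes, so comparing each value against it yields a binary mask.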

10.

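Visual attention driven motion detection method based on chaos analysis

马龙 王鲁平 李飚 沈振康 《信号处理》2010,26(12):1825-1832

A visual attention driven motion detection method based on chaos analysis (MDSA) is proposed. MDSA first extracts the salient regions of an image using a visual attention mechanism and then applies chaos analysis to the salient regions to detect moving targets. The algorithm proceeds as follows. First, several visually sensitive low-level image features are extracted from the scene image. These features are then fused according to feature integration theory into a saliency map that reflects the visual saliency of each location in the scene image. Next, chaos analysis is applied to the salient region containing the image location with the highest saliency level in order to detect motion. The next most salient region is then extracted according to the proximity-priority and inhibition-of-return principles and motion detection is performed on it, until all salient regions have been traversed. To reduce computation, the traditional salient region extraction method is improved: the neighborhood standard deviation replaces the center-surround operator for evaluating the local saliency of each image location, and salient point clustering replaces the scale saliency criterion for extracting salient regions. The chaos analysis first determines whether the joint histogram (JH) of each salient region exhibits chaotic characteristics, then classifies the scatter points in each chaotic JH by fractal dimension using a fixed threshold, and finally maps the classification results back to the salient region to achieve motion segmentation. MDSA achieves good motion segmentation and noise robustness; comparative experiments and an analysis of algorithm cost show that MDSA outperforms the mosaic-based motion detection method (MDM).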
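
Using the neighborhood standard deviation as a local saliency measure can be sketched as below. The window size, the border handling (windows are clipped at the image edge) and the function name `local_std_saliency` are assumptions; the abstract does not give these details.

```python
from statistics import pstdev

def local_std_saliency(img, radius=1):
    """Per-pixel population standard deviation over a (2*radius+1)^2 window.

    img -- 2D list of numbers. Windows are clipped at the image border.
    """
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                img[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = pstdev(window)
    return out
```

Flat areas score zero, while any window that contains an intensity change scores higher, which is the property that makes this a cheap stand-in for a center-surround operator.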

11.

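Ship detection in radar video based on visual saliency

周伟 何东亮 关键 李国强 《雷达科学与技术》2012,10(1):54-58

The concept of visual saliency from computer vision is introduced into ship target detection in conventional radar video sequences, and a target saliency representation model suited to radar video is proposed. Local contrast is used to characterize the difference in echo intensity between targets and clutter, and motion saliency is used to capture the difference between targets and both stationary ground clutter and fluctuating sea clutter. The two are linearly combined into a composite saliency map from which targets can be extracted quickly and accurately. The results extracted from multiple consecutive frames are accumulated, and targets are confirmed by analyzing their historical trajectories. Finally, experiments on measured data collected from a navigation radar demonstrate the effectiveness of the method.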

12.

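Visual saliency prediction fusing phase congruency and two-dimensional principal component analysis

徐威 唐振民 《电子与信息学报》2015,37(9):2089-2096

To predict more effectively the key regions of an image that attract visual attention, this paper proposes a saliency method that fuses phase congruency and two-dimensional principal component analysis (2DPCA). Unlike traditional approaches that use the phase spectrum, the method uses phase congruency (PC) to obtain the important feature points and edge information in the image; after optimization with quick shift superpixels, local and global color contrast are fused to generate a low-level feature saliency map. Next, 2DPCA is used to extract the principal components of image patches, and the local and global distinctiveness of the patches in the principal component space is computed to obtain a pattern saliency map. Finally, suitable weights are assigned through a spatial dispersion measure to fuse the two maps and extract the salient regions. Experimental comparisons with five classical algorithms on two eye-tracking databases show that the algorithm predicts human visual fixation points more accurately.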

13.

Salient object detection using local, global and high contrast graphs

Fatemeh Nouri Kamran Kazemi Habibollah Danyali 《Signal, Image and Video Processing》2018,12(4):659-667

In this paper, we propose a novel multi-graph-based method for salient object detection in natural images. Starting from image decomposition via a superpixel generation algorithm, we use color, spatial and background label information to calculate the edge weight matrices of the graphs. By considering superpixels as the nodes and region similarities as the edge weights, local, global and high contrast graphs are created. Then, an integration technique is applied to form the saliency maps using the degree vectors of the graphs. Extensive experiments on three challenging datasets show that the proposed unsupervised method outperforms several state-of-the-art unsupervised methods.

14.

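Local spatio-temporal optimization model for video saliency based on region covariance

田畅 姜青竹 吴泽民 刘涛 胡磊 《电子与信息学报》2016,38(7):1586-1593

Saliency detection is widely used in computer vision. Image-level saliency detection is relatively mature, but video saliency has received comparatively little study because it is highly challenging. Drawing on the ideas of image-level saliency algorithms, this paper proposes a general spatio-temporal feature extraction and optimization model for detecting video saliency. First, region covariance matrices are used to construct spatio-temporal feature descriptors for the video; then contrast is computed to obtain an initial saliency map; finally, the final saliency map is obtained through a local spatio-temporal optimization model that jointly uses the preceding and following frames. Experimental results on two public video saliency datasets show that the proposed algorithm outperforms current mainstream algorithms and also has good extensibility.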
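
A region covariance descriptor summarizes a region by the covariance matrix of its per-pixel feature vectors. The sketch below uses a purely spatial feature set (x, y, intensity, absolute horizontal and vertical gradients) on one grayscale region; this feature choice and the function name `region_covariance` are assumptions, and the temporal features that the paper would need for video are not reproduced.

```python
def region_covariance(img):
    """Covariance matrix of per-pixel features over a grayscale region.

    img -- 2D list of numbers with at least two pixels.
    Features per pixel: (x, y, I, |dI/dx|, |dI/dy|), giving a 5x5 matrix.
    """
    rows, cols = len(img), len(img[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            # Central differences with indices clamped at the border
            # (a rough approximation for border pixels).
            gx = (img[r][min(c + 1, cols - 1)] - img[r][max(c - 1, 0)]) / 2.0
            gy = (img[min(r + 1, rows - 1)][c] - img[max(r - 1, 0)][c]) / 2.0
            feats.append((c, r, img[r][c], abs(gx), abs(gy)))

    n, d = len(feats), len(feats[0])
    mean = [sum(f[k] for f in feats) / n for k in range(d)]
    # Unbiased sample covariance between every pair of features.
    return [
        [sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in feats) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]
```

The descriptor has a fixed size regardless of the region's shape or pixel count, which is what makes regions of different sizes directly comparable when contrast is computed between them.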

15.

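Video retrieval algorithm based on multi-feature fusion

侯严明 李菲菲 陈虬 《电子科技》2019,32(5):44-49

As video and other multimedia data grow exponentially, efficient and fast video retrieval algorithms are attracting increasing attention. Traditional image features such as color histograms and the scale-invariant feature transform do not perform well with respect to retrieval speed and detection accuracy in video copy detection, so this paper proposes a multi-feature fusion video retrieval method. The method uses the spatio-temporal features of two consecutive frames in a sliding-window-based temporal alignment algorithm to narrow the retrieval range and increase retrieval speed. The algorithm extracts gray-level sequence features, color correlogram features and SIFT local features from key frames, and then combines the advantages of global and local features to improve detection accuracy. Experimental results show that the method achieves good video retrieval accuracy.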

16.

A hybrid algorithm for automatic segmentation of slowly moving objects

Zhongjie Zhu Yuer Wang 《AEUE-International Journal of Electronics and Communications》2012,66(3):249-254

Segmentation of moving objects in video sequences is a basic task in many applications. However, it is still challenging due to the semantic gap between low-level visual features and the high-level human interpretation of video semantics. Compared with segmentation of fast moving objects, accurate and perceptually consistent segmentation of slowly moving objects is more difficult. In this paper, a novel hybrid algorithm is proposed for segmentation of slowly moving objects in video sequences, aiming to acquire perceptually consistent results. Firstly, the temporal information of the differences among multiple frames is employed to detect initial moving regions. Then, the Gaussian mixture model (GMM) is employed and an improved expectation maximization (EM) algorithm is introduced to segment a spatial image into homogeneous regions. Finally, the results of motion detection and spatial segmentation are fused to extract the final moving objects. Experiments are conducted and provide convincing results.

17.

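Sparse representation video anomaly detection algorithm with AP clustering of spatio-temporal deep features

胡正平 张乐 尹艳华 《信号处理》2019,35(3):386-395

To address abnormal behavior detection, a sparse-representation video anomaly detection method with AP clustering based on spatio-temporal deep features is proposed. Because video sequences contain a large amount of background information and the useful information is unevenly distributed, optical flow combined with non-uniform cell partitioning is first used to extract the moving targets in the video, yielding spatio-temporal interest blocks of different spatial sizes. Next, a three-dimensional convolutional neural network extracts spatio-temporal deep features from the different interest blocks, giving a three-dimensional description of the original video sequence. Then, during dictionary learning, the AP clustering method selects representative features of the training samples as the dictionary, which greatly reduces the dictionary dimension and the memory required by the sparse representation method. The test samples are also clustered with AP and only the representative cluster centers are tested, which shortens experiment time and reduces the sensitivity of detection performance to the threshold. Experimental results show that the proposed method is superior to existing detection methods.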

18.

Adaptive Fuzzy Filtering for Artifact Reduction in Compressed Images and Videos

《IEEE transactions on image processing》2009,18(6):1166-1178

A fuzzy filter adaptive to both the sample's activity and the relative position between samples is proposed to reduce the artifacts in compressed multidimensional signals. For JPEG images, the fuzzy spatial filter is based on the directional characteristics of ringing artifacts along strong edges. For compressed video sequences, the motion compensated spatiotemporal filter (MCSTF) is applied to intraframe and interframe pixels to deal with both spatial and temporal artifacts. A new metric which considers the tracking characteristic of human eyes is proposed to evaluate flickering artifacts. Simulations on compressed images and videos show that the proposed adaptive fuzzy filter improves artifact reduction over conventional spatial or temporal filtering approaches.

19.

Extraction technique of region of interest from stereoscopic video

Lü Chaohui Pan Jiaying 《中国邮电高校学报(英文版)》2017,24(5):68-76

A feature fusion approach is presented to extract the region of interest (ROI) from stereoscopic video. Based on the human visual system (HVS), the depth, color and motion features are chosen as visual features. The algorithm proceeds as follows. Firstly, color saliency is calculated at the superpixel scale. The color space distribution of each superpixel and the color difference between the superpixel and background pixels are used to describe color saliency, and the color salient region is detected. Then, the classic visual background extractor (ViBe) algorithm is improved in terms of the update interval and update region of the background model. The update interval is adjusted according to the image content. The update region is determined through non-obvious movement region and background point detection. The motion region of the stereoscopic video is then extracted using the improved ViBe algorithm. The depth salient region is detected by selecting the region with the highest gray value. Finally, the three regions are fused into the final ROI. Experimental results show that the proposed method can extract the ROI from stereoscopic video effectively. To further verify the proposed method, a stereoscopic video coding application is also carried out on the joint model (JM) encoder with different bit allocations for the ROI and the background region.

20.

Attention-guided image captioning with adaptive global and local feature fusion

《Journal of Visual Communication and Image Representation》2021
