Similar Documents
20 similar documents retrieved (search time: 46 ms)
1.
A compressed domain video saliency detection algorithm, which employs global and local spatiotemporal (GLST) features, is proposed in this work. We first conduct partial decoding of a compressed video bitstream to obtain motion vectors and DCT coefficients, from which GLST features are extracted. More specifically, we extract the spatial features of rarity, compactness, and center prior from DC coefficients by investigating the global color distribution in a frame. We also extract the spatial feature of texture contrast from AC coefficients to identify regions, whose local textures are distinct from those of neighboring regions. Moreover, we use the temporal features of motion intensity and motion contrast to detect visually important motions. Then, we generate spatial and temporal saliency maps, respectively, by linearly combining the spatial features and the temporal features. Finally, we fuse the two saliency maps into a spatiotemporal saliency map adaptively by comparing the robustness of the spatial features with that of the temporal features. Experimental results demonstrate that the proposed algorithm provides excellent saliency detection performance, while requiring low complexity and thus performing the detection in real-time.
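The adaptive fusion step above — weighting the spatial and temporal maps by comparing their robustness — can be sketched in numpy. The peakedness measure used as the robustness proxy here (max over mean) is an assumption for illustration, not the paper's formula:

```python
import numpy as np

def fuse_saliency(spatial, temporal, eps=1e-8):
    """Adaptively fuse two saliency maps: the map whose energy is more
    concentrated (peaked) is treated as more robust and weighted higher.
    Peakedness (max / mean) is a hypothetical robustness proxy."""
    def peakedness(m):
        return m.max() / (m.mean() + eps)
    ws = peakedness(spatial)
    wt = peakedness(temporal)
    w = ws / (ws + wt + eps)
    fused = w * spatial + (1.0 - w) * temporal
    return fused / (fused.max() + eps)  # normalize to [0, 1]
```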

2.
Saliency detection is widely used to pick out relevant parts of a scene as visual attention regions for various image/video applications. Since video is increasingly captured, moved, and stored in compressed form, there is a need to detect video saliency directly in the compressed domain. In this study, a compressed video saliency detection algorithm is proposed based on discrete cosine transform (DCT) coefficients and motion information within a visual window. Firstly, DCT coefficients and motion information are extracted from the H.264 video bitstream without full decoding. Because high quantization parameter settings in the encoder often make skip/intra the best prediction mode, a large number of blocks in the bitstream have zero motion vectors and no residual. To address this, the motion vectors of skip/intra coded blocks are calculated by interpolating from their neighbors. In addition, a visual window is constructed to enhance feature contrast and to avoid dependence on encoder settings. Secondly, after the spatial and temporal saliency maps are generated from the normalized entropy, a motion importance factor is imposed to refine the temporal saliency map. Finally, a variance-like fusion method is proposed to dynamically combine these maps into the final video saliency map. Experimental results show that the proposed approach significantly outperforms other state-of-the-art video saliency detection models.
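The motion-vector interpolation for skip/intra blocks can be sketched as a neighbor average; the 4-neighbor averaging kernel is an assumption, since the abstract does not specify the interpolation:

```python
import numpy as np

def fill_skip_mvs(mv, skip_mask):
    """Fill motion vectors of skip/intra coded blocks by averaging the
    MVs of their available 4-neighbors (a simple stand-in for the
    paper's unspecified interpolation kernel).
    mv: (H, W, 2) per-block motion vectors.
    skip_mask: (H, W) bool, True where the MV must be synthesized."""
    out = mv.astype(float).copy()
    H, W, _ = mv.shape
    for y in range(H):
        for x in range(W):
            if not skip_mask[y, x]:
                continue
            acc, n = np.zeros(2), 0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W and not skip_mask[ny, nx]:
                    acc += mv[ny, nx]
                    n += 1
            if n:
                out[y, x] = acc / n
    return out
```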

3.
Saliency maps computed from input images are increasingly used to detect interesting regions in images/videos and to focus processing on these salient regions. This paper introduces a novel macroblock-level, visual-saliency-guided video compression algorithm, modelled as a two-step process: salient region detection and frame foveation. Visual saliency is modelled as a combination of low-level features and high-level features that become important in the higher-level visual cortex. A relevance vector machine is trained over three-dimensional feature vectors, pertaining to global, local, and rarity measures of conspicuity, to yield probabilistic values that form the saliency map. These saliency values drive non-uniform bit allocation over video frames. To achieve these goals, we also propose a novel saliency-aware video compression architecture that saves a tremendous amount of computation. This architecture thresholds the mutual information between successive frames to flag frames requiring re-computation of saliency, and uses motion vectors to propagate saliency values.
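The mutual-information flag for deciding when to recompute saliency can be sketched from joint histograms of successive frames; the bin count and the threshold value are assumptions, not the paper's settings:

```python
import numpy as np

def mutual_information(f1, f2, bins=32):
    """Mutual information between two grayscale frames from their joint
    histogram; a sketch of the change detector described above."""
    joint, _, _ = np.histogram2d(f1.ravel(), f2.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def needs_recompute(prev, curr, threshold=1.0):
    """Hypothetical rule: recompute saliency when MI drops below threshold."""
    return mutual_information(prev, curr) < threshold
```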

4.
In order to further improve the efficiency of video compression, we introduce perceptual characteristics of the Human Visual System (HVS) into video coding and propose a novel rate control algorithm for H.264/AVC based on a human visual saliency model. Firstly, we modify Itti's saliency model. Secondly, target bits for each frame are allocated through the correlation of the saliency region between the current and previous frames, and the complexity of each MB is modified through its saliency value and its...

5.
Salient region detection is an important topic in computer vision and is also crucial to video quality assessment and the optimization of perceptual video coding algorithms. Most saliency detection algorithms cannot balance accuracy against complexity, which limits their use in video preprocessing and real-time processing. A fast video saliency detection algorithm based on the three-dimensional transform-domain spectral difference (3DTDSD) is proposed. Sliding windows of a fixed number of frames are built, centered on a key frame and on its preceding frame, yielding two 3D video volumes. Both volumes are transformed into the 3D frequency domain with the Fourier transform; the difference between the two sets of 3D data is inverse-transformed to obtain a saliency map, and the salient region is finally extracted through connected-component analysis and thresholding. Frequency-domain algorithms are computationally fast; experimental comparisons and complexity analysis demonstrate the effectiveness and speed of the algorithm.
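One plausible reading of the 3D spectral-difference step, assuming the difference is taken between amplitude spectra and recombined with the current volume's phase (the abstract leaves the exact spectral arithmetic unspecified):

```python
import numpy as np

def spectral_difference_saliency(vol_curr, vol_prev):
    """Sketch of a 3D transform-domain spectral difference: FFT both
    (T, H, W) video volumes, take the difference of their amplitude
    spectra, recombine with the current volume's phase, and invert.
    The paper's windowing, connected-component analysis, and
    thresholding are omitted."""
    Fc = np.fft.fftn(vol_curr)
    Fp = np.fft.fftn(vol_prev)
    amp_diff = np.abs(Fc) - np.abs(Fp)
    sal = np.abs(np.fft.ifftn(amp_diff * np.exp(1j * np.angle(Fc))))
    return sal / (sal.max() + 1e-8)
```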

6.
A feature fusion approach is presented to extract the region of interest (ROI) from stereoscopic video. Based on the human visual system (HVS), the depth feature, the color feature, and the motion feature are chosen as visual features. The algorithm proceeds as follows. Firstly, color saliency is calculated at the superpixel scale: the color-space distribution of each superpixel and the color difference between the superpixel and background pixels are used to describe color saliency, and the color-salient region is detected. Then, the classic visual background extractor (ViBe) algorithm is improved in the update interval and update region of its background model: the update interval is adjusted according to the image content, and the update region is determined through non-obvious-motion regions and background-point detection; the motion region of the stereoscopic video is extracted with this improved ViBe algorithm. The depth-salient region is detected by selecting the region with the highest gray values. Finally, the three regions are fused into the final ROI. Experimental results show that the proposed method extracts the ROI from stereoscopic video effectively. To further verify the method, a stereoscopic video coding application is also carried out on the joint model (JM) encoder with different bit allocations for the ROI and the background region.
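The depth-salient-region step — keeping the highest gray values of the depth map — can be sketched as a quantile threshold; the 10% cutoff here is a hypothetical parameter, not taken from the paper:

```python
import numpy as np

def depth_salient_region(depth, top_frac=0.1):
    """Pick the depth-salient region as the pixels with the highest
    gray values in the depth map (nearer objects), per the abstract;
    top_frac is an assumed cutoff."""
    thresh = np.quantile(depth, 1.0 - top_frac)
    return depth >= thresh
```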

7.
We propose a new statistical generative model for spatiotemporal video segmentation. The objective is to partition a video sequence into homogeneous segments that can be used as "building blocks" for semantic video segmentation. The baseline framework is a Gaussian mixture model (GMM)-based video modeling approach that involves a six-dimensional spatiotemporal feature space. Specifically, we introduce the concept of frame saliency to quantify the relevancy of a video frame to the GMM-based spatiotemporal video modeling. This helps us use a small set of salient frames to facilitate the model training by reducing data redundancy and irrelevance. A modified expectation maximization algorithm is developed for simultaneous GMM training and frame saliency estimation, and the frames with the highest saliency values are extracted to refine the GMM estimation for video segmentation. Moreover, it is interesting to find that frame saliency can imply some object behaviors. This makes the proposed method also applicable to other frame-related video analysis tasks, such as key-frame extraction, video skimming, etc. Experiments on real videos demonstrate the effectiveness and efficiency of the proposed method.

8.
Traditional visual-saliency-based video compression methods encode the salient region of the image with higher quality. However, saliency varies with the person, viewpoint, and distance. In this paper, we propose to apply human-centered perceptual computation to improve video coding in the region of human-centered perception. To detect the region of interest (ROI) of the human body, upper body, frontal face, and profile face, we construct a combination of detectors based on Haar and histogram-of-oriented-gradients features to analyze a video in the first frame (intra-frame). From the second frame (inter-frame) onward, the optical flow image is computed in the ROI area of the first frame. The optical flow in the human-centered ROI is then used for macroblock (MB) quantization adjustment in H.264/AVC. For each MB, the quantization parameter (QP) is optimized with the density value of the optical flow image. The QP optimization is based on an MB mapping model derived from the inverse tangent (arctangent) function. The Lagrange multiplier in the rate-distortion optimization is also adapted so that the MB distortion in the human-centered region is minimized. We apply our technique to the H.264 video encoder to improve coding visual quality. Evaluating our scheme against the H.264 reference software, our results show that the proposed algorithm improves the visual quality of the ROI by about 1.01 dB while preserving coding efficiency.
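An arctangent-based QP adjustment driven by optical-flow density might look like this; the offset range, scaling, and sign convention are assumptions, not the paper's exact mapping model:

```python
import math

def qp_offset(flow_density, max_offset=6.0, scale=1.0):
    """Hypothetical MB-level QP adjustment: denser motion in the
    human-centered ROI lowers the QP (finer quantization). The
    arctangent bounds the offset smoothly."""
    return -max_offset * (2.0 / math.pi) * math.atan(scale * flow_density)

def adjust_qp(base_qp, flow_density):
    qp = base_qp + qp_offset(flow_density)
    return int(round(min(51, max(0, qp))))  # clamp to the H.264 QP range
```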

9.
A new video watermarking algorithm based on 1D DFT and Radon transform   Cited by: 2 (self-citations: 0, other citations: 2)
Yan Liu, Jiying Zhao. Signal Processing, 2010, 90(2): 626-639
In this paper, we propose a new video watermarking algorithm based on the 1D DFT (one-dimensional discrete Fourier transform) and the Radon transform. The 1D DFT of a video sequence generates an ideal domain in which the spatial information is still kept and the temporal information is obtained. Guided by detailed analysis and calculation, we choose the frames with the highest temporal frequencies and embed the fence-shaped watermark pattern in the Radon transform domain of the selected frames. An adaptive embedding strength for different locations preserves the fidelity of the watermarked video. The performance of the proposed algorithm is evaluated under H.264 video compression at three different bit rates; under geometric attacks such as rotation, translation, and aspect-ratio changes; and under other attacks such as frame dropping, frame swapping, spatial filtering, noise addition, lighting changes, and histogram equalization. The main contributions of this paper are the introduction of the 1D DFT along the temporal direction for watermarking, which enables robustness against video compression, and the Radon transform-based watermark embedding and extraction, which produces robustness against geometric transformations. One of the most important advantages of this video watermarking algorithm is its simplicity and practicality.
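The 1D temporal DFT and the selection of high-temporal-frequency slices can be sketched as follows; ranking transform-domain slices by total spectral energy is an assumption standing in for the paper's "detailed analysis and calculation":

```python
import numpy as np

def high_temporal_frequency_frames(video, k=2):
    """Apply a 1D DFT along the temporal axis of a (T, H, W) video and
    rank the temporal-frequency slices by energy; returns the indices
    of the k non-DC slices with the most energy."""
    F = np.fft.fft(video, axis=0)              # 1D DFT along time
    energy = np.abs(F).reshape(F.shape[0], -1).sum(axis=1)
    energy[0] = -np.inf                        # ignore the DC (average) slice
    order = np.argsort(energy)[::-1]
    return sorted(order[:k].tolist())
```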

10.
Salient object detection is essential for applications such as image classification, object recognition, and image retrieval. In this paper, we design a new approach that detects salient objects in an image by describing what salient objects and backgrounds look like using image statistics. First, we introduce a saliency-driven clustering method to reveal distinct visual patterns by generating image clusters. A Gaussian Mixture Model (GMM) represents the statistics of each cluster and is used to compute the color spatial distribution. Second, three kinds of regional saliency measures, i.e., regional color contrast, regional boundary prior, and regional color spatial distribution, are computed and combined. Then, a region selection strategy integrating the color contrast prior, the boundary prior, and the visual-pattern information of the image is presented. The pixels of an image are adaptively divided into a potential salient region or the background region based on the combined regional saliency measures. Finally, a Bayesian framework computes the saliency value of each pixel, taking the regional saliency values as priors. Our approach has been extensively evaluated on two popular image databases. Experimental results show that our approach achieves considerable performance improvement in terms of commonly adopted performance measures in salient object detection.
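The final per-pixel Bayesian step, with the regional saliency value as the prior and color models as likelihoods, reduces to Bayes' rule per pixel. The map-level inputs below are placeholders for the paper's GMM- and histogram-based likelihoods:

```python
import numpy as np

def bayesian_saliency(prior, like_sal, like_bg, eps=1e-8):
    """Per-pixel Bayes rule used in region-prior saliency models:
    posterior = p(sal) p(color|sal) /
                (p(sal) p(color|sal) + p(bg) p(color|bg)).
    prior, like_sal, like_bg are per-pixel maps in [0, 1]."""
    num = prior * like_sal
    den = num + (1.0 - prior) * like_bg + eps
    return num / den
```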

11.
许强, 马登武, 郭小威. 红外 (Infrared), 2012, 33(6): 32-37
For the sparse image backgrounds typical of the search stage of infrared small-target detection, a new method is proposed that uses the image phase spectrum to compute a saliency map and locate targets. Compared with traditional target-search methods based on the sea-sky line and coastline, the algorithm greatly reduces computational complexity and compensates for their inability to locate the sea-sky line and coastline accurately under varying air temperature; compared with methods based on the Itti model, it overcomes their inability to separate the target from the background effectively. Two ways of computing the phase spectrum, via the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT), are described, along with their consistency. A mathematical model for computing the saliency map from the image phase spectrum is constructed, and the role and meaning of the model's parameter choices are clarified. Theory and examples demonstrate the feasibility and efficiency of the algorithm for locating small targets against sparse backgrounds.
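A phase-spectrum saliency map in the DFT variant described above can be sketched in a few lines (PFT-style): keep only the phase of the 2D DFT, invert, square, and smooth. The box smoothing here stands in for the usual Gaussian post-filter:

```python
import numpy as np

def phase_spectrum_saliency(img, k=3):
    """Phase-spectrum saliency: discard the amplitude spectrum, keep
    the phase, invert, square, then box-smooth (kernel size k is an
    assumed parameter)."""
    F = np.fft.fft2(img)
    phase_only = np.exp(1j * np.angle(F))
    sal = np.abs(np.fft.ifft2(phase_only)) ** 2
    pad = np.pad(sal, k // 2, mode='edge')
    out = np.zeros_like(sal)
    H, W = sal.shape
    for dy in range(k):          # simple k x k box smoothing
        for dx in range(k):
            out += pad[dy:dy + H, dx:dx + W]
    out /= k * k
    return out / (out.max() + 1e-8)
```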

12.
郭迎春, 于洋, 师硕, 于明. 光电子·激光 (Journal of Optoelectronics · Laser), 2016, 27(11): 1228-1237
A full-reference image quality assessment algorithm is proposed that fuses a saliency map (SM) and a fidelity map (FM) to evaluate the distortion of a degraded image. The FM of the degraded image relative to the reference image is extracted from luminance and chrominance similarity; the SM is obtained from the reference image through region partitioning, global saliency extraction, and texture-edge supplementation. The SM is fused with the degraded image's FM to produce a perception-based saliency-fidelity map (PSM), from which the objective quality score of the degraded image is computed. Experiments on standard databases show that the proposed method agrees well with subjective evaluation and performs well on all five distortion types in the LIVE image database.
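The per-channel fidelity map and its saliency-weighted pooling might be sketched with the standard similarity form; the stabilizing constant c and the pooling rule are assumptions, not the paper's exact definitions:

```python
import numpy as np

def fidelity_map(ref, dist, c=1e-4):
    """Per-pixel fidelity between a reference and a degraded channel
    using the common similarity form s = (2ab + c) / (a^2 + b^2 + c);
    such maps would be combined over luminance and chrominance."""
    return (2.0 * ref * dist + c) / (ref ** 2 + dist ** 2 + c)

def weighted_score(fm, sm):
    """Pool the fidelity map with a saliency map as per-pixel weights."""
    return float((fm * sm).sum() / (sm.sum() + 1e-8))
```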

13.
In this paper, a novel hierarchical object-oriented video segmentation and representation algorithm is proposed. The local variance contrast and the frame difference contrast are jointly exploited for structural spatiotemporal video segmentation, because these two visual features efficiently indicate the spatial homogeneity of the grey levels and the temporal coherence of the motion fields; the two-dimensional (2D) spatiotemporal entropic technique is further selected to generate the 2D thresholding vectors adaptively according to the variations of the video components. After the region growing and edge simplification procedures, the accurate boundaries among the different video components are further extracted by an intra-block edge extraction procedure. Moreover, the relationships of the video components across frames are exploited by a temporal tracking procedure. The proposed object-oriented spatiotemporal video segmentation algorithm may be useful for MPEG-4 systems that generate the video object plane (VOP) automatically.
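The two cues that are jointly exploited — local variance contrast and frame-difference contrast — can be computed per pixel as follows; the window size is an assumption:

```python
import numpy as np

def spatiotemporal_features(prev, curr, k=3):
    """Compute the two cues joined for segmentation: the local
    grey-level variance of the current frame (in a k x k window) and
    the absolute frame difference, both per pixel."""
    H, W = curr.shape
    pad = np.pad(curr.astype(float), k // 2, mode='edge')
    mean = np.zeros((H, W)); sq = np.zeros((H, W))
    for dy in range(k):
        for dx in range(k):
            win = pad[dy:dy + H, dx:dx + W]
            mean += win; sq += win ** 2
    n = k * k
    var = sq / n - (mean / n) ** 2          # local variance
    fdiff = np.abs(curr.astype(float) - prev.astype(float))
    return var, fdiff
```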

14.
Detection of salient objects in images and video is of great importance in many computer vision applications. Although the state of the art in saliency detection for still images has changed substantially over the last few years, there have been few improvements in video saliency detection. This paper proposes a novel deep non-local fully convolutional network architecture that captures global dependencies more efficiently, and investigates the use of recently introduced non-local neural networks in video salient object detection. The effect of non-local operations is studied separately on static and dynamic saliency detection in order to exploit both appearance and motion features. The architecture is tested on two well-known datasets, DAVIS and FBMS. The experimental results show that the proposed algorithm outperforms state-of-the-art video saliency detection methods.

15.
16.
梁永生, 柳伟, 周莺, 魏泽锋, 张基宏. 电子学报 (Acta Electronica Sinica), 2017, 45(7): 1567-1575
To effectively resolve the conflict among network bandwidth for streaming video, played-out video quality, and real-time user access, this paper proposes a progressive representation method for streaming video based on visual saliency computation. Building on video content analysis and understanding, scene classification and visually sensitive region extraction are performed first; the importance of each frame in the video sequence is then determined from coding information, and the importance of slice data within each frame is estimated; finally, based on the visual saliency computation, a progressive representation method for streaming video is proposed that adapts to network bandwidth and is quality-scalable. Using medium-grain scalability (MGS) coding, experiments are conducted on a simulated network test platform for video sequences with both concentrated and dispersed visually sensitive regions; the results verify the correctness and effectiveness of the proposed saliency-based progressive representation method.

17.
To address the incomplete salient regions produced by existing frequency-domain saliency detection methods, this paper proposes a multi-scale frequency-domain saliency detection method. A quaternion is first constructed from the feature channels of the input image; the amplitude spectrum in the quaternion domain is then decomposed at multiple scales by the wavelet transform, and visual saliency maps are computed at each scale; finally, an evaluation function selects the better-performing maps, which are combined into the final visual saliency map. Experimental results show that the method effectively suppresses background interference and finds complete salient objects quickly and accurately, with high detection precision.

18.
董波, 周燕, 王永雄. 电子科技 (Electronic Science and Technology), 2009, 34(1): 23-30
Current saliency detection algorithms struggle to segment complete salient regions with sharp edge detail in complex scenes. To address this problem, a novel feature fusion algorithm is proposed. A fully convolutional network extracts coarse initial features at multiple levels, which are deeply analyzed with a feature pyramid structure. A progressive receptive-field module is designed to map features into spaces of different scales for refinement, achieving progressive feature fusion and propagation and selectively enhancing salient regions. A global attention mechanism removes background noise and establishes long-range dependencies among salient pixels, improving the validity of the salient regions and highlighting salient objects; the saliency map is then obtained by learning to fuse the features of all levels. Comprehensive experiments show that, while the mean absolute error decreases, the F-measure far exceeds that of seven other mainstream methods. The proposed saliency model combines the advantages of fully convolutional networks and feature pyramid structures; together with the designed progressive receptive field and global attention mechanism, it brings the saliency map closer to the ground truth.

19.
Saliency detection methods often fail to suppress complex backgrounds effectively and thus to detect objects accurately. To address this, a multi-scale Bayesian saliency detection method with superpixel content-aware priors is proposed. First, the target image is segmented into multi-scale superpixel maps, and a content-aware contrast prior, a center-location prior, and a boundary-connectivity background prior are introduced at each scale to compute single-scale saliency values. Second, the content-aware prior saliency values across scales are fused into a coarse saliency map. The coarse saliency values then serve as prior probabilities, observation likelihoods are computed from color histograms and a convex-hull center prior, and a multi-scale Bayesian model yields the final salient objects. Finally, comparative experiments on three public datasets with five evaluation metrics against seven existing methods show that the proposed method performs better at salient object detection.

20.
This paper proposes a new method for computing spatiotemporal saliency maps of images. The algorithm first estimates absolute motion vectors with the pyramidal Lucas-Kanade algorithm and background motion vectors with an eight-parameter perspective model, and computes the temporal saliency map from the difference between the two. It then computes the spatial saliency map from color contrast and texture information. Finally, the spatial and temporal maps are fused and thresholded to obtain the overall saliency map. Experimental results show that the new algorithm extracts the salient regions of video frames more effectively than existing algorithms.
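The temporal-saliency step above — the difference between the absolute motion field and the estimated background motion — can be sketched directly on per-block motion vector fields:

```python
import numpy as np

def temporal_saliency(abs_mv, bg_mv):
    """Temporal saliency as the magnitude of the difference between
    the absolute motion field and the estimated background (global)
    motion field. abs_mv, bg_mv: (H, W, 2) motion vector fields."""
    diff = abs_mv - bg_mv
    mag = np.sqrt((diff ** 2).sum(axis=-1))
    return mag / (mag.max() + 1e-8)
```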


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号