20 similar documents found; search took 0 ms
1.
We propose a new statistical generative model for spatiotemporal video segmentation. The objective is to partition a video sequence into homogeneous segments that can be used as "building blocks" for semantic video segmentation. The baseline framework is a Gaussian mixture model (GMM)-based video modeling approach that involves a six-dimensional spatiotemporal feature space. Specifically, we introduce the concept of frame saliency to quantify the relevance of a video frame to the GMM-based spatiotemporal video modeling. This allows a small set of salient frames to facilitate model training by reducing data redundancy and irrelevance. A modified expectation-maximization algorithm is developed for simultaneous GMM training and frame saliency estimation, and the frames with the highest saliency values are extracted to refine the GMM estimation for video segmentation. Moreover, it is interesting to find that frame saliency can reflect some object behaviors. This makes the proposed method also applicable to other frame-related video analysis tasks, such as key-frame extraction and video skimming. Experiments on real videos demonstrate the effectiveness and efficiency of the proposed method.
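A minimal sketch of the idea above: fit a diagonal-covariance GMM to spatiotemporal feature vectors by plain EM, then score each frame by the mean log-likelihood of its features as a crude saliency proxy. This is an illustrative stand-in, not the paper's modified EM (the joint saliency estimation during training is omitted), and the init scheme and function names are assumptions.

```python
import numpy as np

def fit_gmm(X, k, iters=100):
    """Minimal diagonal-covariance GMM fitted by plain EM (a stand-in for
    the paper's modified EM; saliency-weighted training is omitted)."""
    n, d = X.shape
    # Deterministic init: spread the k means along the coordinate-sum ordering.
    idx = np.argsort(X.sum(axis=1))
    mu = X[idx[np.linspace(0, n - 1, k).astype(int)]].astype(float)
    var = np.tile(X.var(axis=0), (k, 1)) + 1e-6
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: per-sample, per-component log densities (constant terms
        # dropped, as only relative values matter) and responsibilities.
        log_p = (-0.5 * (((X[:, None] - mu) ** 2) / var).sum(-1)
                 - 0.5 * np.log(var).sum(-1) + np.log(pi))
        r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted weights, means, and variances.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var

def frame_saliency(X, frame_ids, pi, mu, var):
    """Score each frame by the mean log-likelihood of its feature vectors:
    a crude proxy for the paper's jointly estimated frame saliency."""
    log_p = (-0.5 * (((X[:, None] - mu) ** 2) / var).sum(-1)
             - 0.5 * np.log(var).sum(-1) + np.log(pi))
    m = log_p.max(axis=1)
    ll = np.log(np.exp(log_p - m[:, None]).sum(axis=1)) + m
    return np.array([ll[frame_ids == f].mean() for f in np.unique(frame_ids)])
```

Frames whose features are atypical under the fitted model score low; selecting the highest-scoring frames would play the role of the salient-frame subset used to refine the estimation.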
2.
Saliency model-based face segmentation and tracking in head-and-shoulder video sequences (Cited 1 time: 0 self-citations, 1 by others)
Hongliang Li King N. Ngan 《Journal of Visual Communication and Image Representation》2008,19(5):320-333
In this paper, a novel face segmentation algorithm is proposed based on a facial saliency map (FSM) for head-and-shoulder type video applications. The method consists of three stages. The first stage generates the saliency map of the input video image using our proposed facial attention model. In the second stage, a geometric model and an eye map built from chrominance components are employed to localize the face region according to the saliency map. The third stage involves adaptive boundary correction and final face contour extraction. Based on the segmented result, an effective boundary saliency map (BSM) is then constructed and applied to the tracking-based segmentation of successive frames. Experimental evaluation on test sequences shows that the proposed method segments the face area quite effectively.
3.
In this work, we present a segmentation algorithm for color images that uses the watershed algorithm to segment either the two-dimensional (2-D) or the three-dimensional (3-D) color histogram of an image. For compliance with the way humans perceive color, this segmentation has to take place in a perceptually uniform color space like the Luv space. To avoid oversegmentation, the watershed algorithm has to be applied to a smoothed histogram.
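The smoothing-before-watershed step described above can be sketched as follows: build a 2-D histogram over the (u, v) chrominance plane, Gaussian-smooth it, and locate its dominant modes as markers. This is a hedged sketch of the preprocessing only; the (u, v) range, bin count, and smoothing width are assumptions, and the watershed itself would then be run on the inverted histogram (e.g. with `skimage.segmentation.watershed`), which is not shown here.

```python
import numpy as np
from scipy import ndimage

def smoothed_uv_histogram(u, v, bins=64, sigma=2.0):
    """2-D histogram over the (u, v) chrominance plane of Luv,
    Gaussian-smoothed to suppress the spurious minima that would
    otherwise cause watershed oversegmentation."""
    hist, _, _ = np.histogram2d(u, v, bins=bins,
                                range=[[-100, 100], [-100, 100]])
    return ndimage.gaussian_filter(hist, sigma)

def histogram_peaks(hist, min_count):
    """Local maxima of the smoothed histogram: one marker per dominant
    color mode, usable as seeds for watershed on the inverted histogram."""
    local_max = ndimage.maximum_filter(hist, size=5) == hist
    return np.argwhere(local_max & (hist > min_count))
```

Each surviving peak corresponds to one dominant color in the image; pixels are then labeled by the watershed basin their (u, v) value falls into.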
4.
Qian Zhang King Ngi Ngan 《Journal of Visual Communication and Image Representation》2010,21(5-6):453-461
In this paper, we present an automatic algorithm to segment multiple objects from multi-view video. Initial Interested Objects (IIOs) are automatically extracted in the key view of the initial frame based on a saliency model. Multi-object segmentation is decomposed into several sub-segmentation problems, each solved by minimizing an energy function using binary-label graph cut. In the proposed energy function, color and depth cues are integrated into the data term, which is then modified by a background penalty with occlusion reasoning. In the smoothness term, foreground contrast enhancement strengthens moving-object boundaries while attenuating background contrast. To segment the multi-view video, coarse predictions for the other views and the successive frame are projected by pixel-based disparity and motion compensation, respectively, exploiting the inherent spatiotemporal consistency. An uncertainty band along the object boundary is shaped based on an activity measure and refined with graph cut, yielding a more accurate Interested Objects (IOs) layer across all views and frames. Experiments on several multi-view videos with real and complex scenes show excellent subjective results, demonstrating the robustness and efficiency of the proposed algorithm.
5.
Color local texture features for color face recognition (Cited 1 time: 0 self-citations, 1 by others)
This paper proposes new color local texture features, i.e., color local Gabor wavelets (CLGWs) and color local binary patterns (CLBP), for the purpose of face recognition (FR). The proposed color local texture features are able to exploit the discriminative information derived from spatiochromatic texture patterns of different spectral channels within a certain local face region. Furthermore, to maximize the complementary effect of using both color and texture information, opponent color texture features that capture the texture patterns of spatial interactions between spectral channels are also incorporated into the generation of CLGW and CLBP. In addition, for the final classification, multiple color local texture features (each corresponding to an associated color band) are combined within a feature-level fusion framework. Extensive comparative experiments have been conducted to evaluate our color local texture features for FR on five public face databases, i.e., CMU-PIE, Color FERET, XM2VTSDB, SCface, and FRGC 2.0. Experimental results show that FR approaches using color local texture features yield markedly better recognition rates than FR approaches using only color or texture information. In particular, compared with grayscale texture features, the proposed color local texture features provide excellent recognition rates for face images taken under severe variation in illumination, as well as for small (low-resolution) face images. In addition, the feasibility of our color local texture features has been successfully demonstrated by comparisons with other state-of-the-art color FR methods.
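The per-channel texture-coding idea behind CLBP can be illustrated with a basic 3x3 local binary pattern computed independently on each color channel and the resulting histograms concatenated. This is a simplified stand-in for the paper's CLBP (the opponent-channel interactions and fusion framework are omitted), and the function names are assumptions.

```python
import numpy as np

def lbp8(channel):
    """Basic 3x3 local binary pattern for one image channel (uint8 values):
    each inner pixel gets an 8-bit code from comparisons with its neighbors."""
    c = channel[1:-1, 1:-1]
    code = np.zeros_like(c, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = channel[1 + dy:channel.shape[0] - 1 + dy,
                     1 + dx:channel.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

def color_lbp_features(img):
    """Concatenate per-channel LBP histograms: a simplified sketch of
    combining color and local texture information."""
    feats = [np.bincount(lbp8(img[..., ch]).ravel(), minlength=256)
             for ch in range(img.shape[-1])]
    return np.concatenate(feats).astype(float)
```

For an RGB image this yields a 768-dimensional descriptor (256 bins per channel); the paper's feature-level fusion would then combine such descriptors across color bands before classification.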
6.
This correspondence presents a novel hybrid Color and Frequency Features (CFF) method for face recognition. The CFF method, which applies an Enhanced Fisher Model (EFM), extracts the complementary frequency features in a new hybrid color space for improving face recognition performance. The new color space, the RIQ color space, which combines the R component image of the RGB color space and the chromatic components I and Q of the YIQ color space, displays prominent capability for improving face recognition performance due to the complementary characteristics of its component images. The EFM then extracts the complementary features from the real part, the imaginary part, and the magnitude of the R image in the frequency domain. The complementary features are then fused by means of concatenation at the feature level to derive similarity scores for classification. The complementary feature extraction and feature level fusion procedure applies to the I and Q component images as well. Experiments on the Face Recognition Grand Challenge (FRGC) version 2 Experiment 4 show that i) the hybrid color space improves face recognition performance significantly, and ii) the complementary color and frequency features further improve face recognition performance.
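The RIQ hybrid color space described above is straightforward to construct: keep R from RGB and take I and Q from the standard NTSC YIQ transform. A minimal sketch (assuming RGB values are already in a common scale, e.g. [0, 1]):

```python
import numpy as np

def rgb_to_riq(img):
    """Hybrid RIQ space: the R channel straight from RGB, plus the I and Q
    chrominance components from the standard NTSC YIQ transform."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    return np.stack([r, i, q], axis=-1)
```

Note that for achromatic pixels (R = G = B) both I and Q vanish, so the two chrominance planes carry purely color information, complementary to the luminance-dominated R plane.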
7.
A compressed domain video saliency detection algorithm, which employs global and local spatiotemporal (GLST) features, is proposed in this work. We first conduct partial decoding of a compressed video bitstream to obtain motion vectors and DCT coefficients, from which GLST features are extracted. More specifically, we extract the spatial features of rarity, compactness, and center prior from DC coefficients by investigating the global color distribution in a frame. We also extract the spatial feature of texture contrast from AC coefficients to identify regions, whose local textures are distinct from those of neighboring regions. Moreover, we use the temporal features of motion intensity and motion contrast to detect visually important motions. Then, we generate spatial and temporal saliency maps, respectively, by linearly combining the spatial features and the temporal features. Finally, we fuse the two saliency maps into a spatiotemporal saliency map adaptively by comparing the robustness of the spatial features with that of the temporal features. Experimental results demonstrate that the proposed algorithm provides excellent saliency detection performance, while requiring low complexity and thus performing the detection in real-time.
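The final adaptive fusion step can be sketched as a robustness-weighted linear combination of the two maps. The abstract does not specify the robustness measure, so the peak-to-mean ratio used below is purely an illustrative assumption: a sharp, concentrated map is treated as more reliable than a flat one.

```python
import numpy as np

def fuse_saliency(spatial, temporal, eps=1e-8):
    """Adaptive fusion of spatial and temporal saliency maps, weighting
    each map by a crude 'robustness' score (peak-to-mean ratio); the
    sharper, more concentrated map dominates the result."""
    def robustness(m):
        return m.max() / (m.mean() + eps)
    ws, wt = robustness(spatial), robustness(temporal)
    fused = (ws * spatial + wt * temporal) / (ws + wt + eps)
    return fused / (fused.max() + eps)  # normalize to [0, 1]
```

With this weighting, a flat (uninformative) temporal map contributes little, and the fused map inherits the peak location of the more confident spatial map.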
8.
Image segmentation has long been a hot topic in image processing research, and the segmentation of color images is especially important. Although many algorithms have been proposed for color image segmentation, many of them still have shortcomings. To address the high computational complexity and noise sensitivity of the two-dimensional Otsu segmentation algorithm, this paper applies the Lab color space to the 2-D Otsu algorithm. The color image is first converted from RGB space to Lab space; coarse segmentation is then performed jointly using the image information of the L, a, and b channels; finally, 2-D Otsu fine segmentation is applied to the image information of one of the channels. Experiments show that this method achieves good segmentation results on color images.
9.
Karkanis S.A. Iakovidis D.K. Maroulis D.E. Karras D.A. Tzivras M. 《IEEE transactions on information technology in biomedicine》2003,7(3):141-152
We present an approach to the detection of tumors in colonoscopic video, based on a new color feature extraction scheme that represents the different regions in the frame sequence. The scheme is built on the wavelet decomposition. The features, named color wavelet covariance (CWC), are based on the covariances of second-order textural measures, and an optimum subset of them is selected by a feature-selection algorithm. The proposed approach is supported by a linear discriminant analysis (LDA) procedure for characterizing the image regions along the video frames. The whole methodology has been applied to real data sets of color colonoscopic videos. The performance in the detection of abnormal colonic regions corresponding to adenomatous polyps is high, reaching 97% specificity and 90% sensitivity.
10.
Yang Gaobo Zhang Zhaoyang 《Electronics letters》2003,39(15):1113-1114
A semiautomatic video object segmentation method is proposed. The initial object contour is obtained by modified intelligent scissors. Video decomposition is performed to avoid error accumulation during object tracking. Snake-based bidirectional tracking is utilised to interpolate the VOPs of successive frames. Experimental results show the effectiveness of the method.
11.
This paper first introduces the architecture of the IMS network defined by 3GPP, then describes the structure and functions of the MRF. It then focuses on the concrete design of audio/video conferencing in the MRF, presenting the functional-module designs of the MRFC and MRFP, and finally describes the implementation of the signaling flow for MRF audio/video conferencing.
12.
This paper describes the implementation of the recently introduced color set partitioning in hierarchical trees (CSPIHT)-based scheme for video coding. The intra- and interframe coding performance of a CSPIHT-based video coder (CVC) is compared against that of the H.263 at bit rates lower than 64 kbit/s. The CVC performs comparably or better than the H.263 at lower bit rates, whereas the H.263 performs better than the CVC at higher bit rates. We identify areas that hamper the performance of the CVC and propose an improved scheme that yields better performance in image and video coding in low bit-rate environments.
13.
Hongliang Li King N. Ngan 《Communications Magazine, IEEE》2007,45(1):27-33
Advanced multimedia applications have to provide content-related functionalities such as search and retrieval of meaningful objects, detection and analysis of events, and understanding of scenes, which allow the user to access and manipulate the multimedia content with greater flexibility. This greatly depends on automatic techniques for extracting such objects from multimedia data. In this article we provide a tutorial on the state of the art in video segmentation and tracking technology, with particular attention paid to recent developments in attention-based object extraction. Performance results are included to highlight this emerging technology.
14.
Performance measures for video object segmentation and tracking (Cited 2 times: 0 self-citations, 2 by others)
We propose measures to quantitatively evaluate the performance of video object segmentation and tracking methods without ground-truth (GT) segmentation maps. The proposed measures are based on spatial differences of color and motion along the boundary of the estimated video object plane and temporal differences between the color histogram of the current object plane and its predecessors. They can be used to localize (spatially and/or temporally) regions where segmentation results are good or bad, and/or they can be combined to yield a single numerical measure indicating the goodness of the boundary segmentation and tracking results over a sequence. The validity of the proposed GT-free performance measures has been demonstrated by canonical correlation analysis against another set of measures with GT on a set of sequences where GT information is available. Experimental results are presented to evaluate the segmentation maps obtained from various sequences using different segmentation approaches.
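The temporal part of the measures above compares the object's color histogram across consecutive frames. A minimal sketch, using histogram intersection as the (assumed) comparison function and treating the object plane as a set of RGB pixel vectors:

```python
import numpy as np

def color_histogram(pixels, bins=16):
    """Normalized 3-D color histogram of an object's pixels (n x 3, 0..255)."""
    h, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    h = h.ravel()
    return h / max(h.sum(), 1)

def temporal_consistency(prev_pixels, curr_pixels, bins=16):
    """GT-free temporal measure: histogram intersection between the object's
    color distributions in consecutive frames (1 = perfectly stable,
    0 = completely different; a sudden drop flags a tracking failure)."""
    hp = color_histogram(prev_pixels, bins)
    hc = color_histogram(curr_pixels, bins)
    return np.minimum(hp, hc).sum()
```

A per-frame trace of this score localizes temporal segmentation failures without any reference mask, which is the core appeal of GT-free evaluation.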
15.
A generic definition of video objects, as groups of pixels with temporal motion coherence, is considered. The generic video object (GVO) is a superset of the conventional video objects considered in the object segmentation literature. Because of its motion coherence, the GVO can be easily recognised by the human visual system. However, due to its arbitrary spatial distribution, the GVO cannot be easily detected by existing algorithms, which often assume spatial homogeneity of the video objects. The concept of extended optical flow is introduced and a dynamic programming framework for GVO detection and segmentation is developed, whose solution is given by the Viterbi algorithm. Using this dynamic programming formulation, the proposed object detection algorithm is able to discover the motion path of the GVO automatically and refine its spatial region of support progressively. In addition to object segmentation, the proposed algorithm can also be applied to video pre-processing, removing the so-called 'video mask' noise in digital videos. Experimental results show that this type of vision-assisted video pre-processing significantly improves the compression efficiency.
16.
王凤领 《智能计算机与应用》2017,7(5)
Compressed-domain video clip detection can skip the decompression step and extract features directly from the raw video stream, which speeds up detection. This paper first analyzes the characteristics of video data, video segmentation, and key-frame selection, and reviews typical existing methods. By analyzing the key techniques of video retrieval, it adopts a compressed-video-stream feature extraction approach based on video segmentation and key frames, and proposes a method for extracting key frames from MPEG-compressed video using DC coefficients and motion vectors. Experiments show that the proposed method reduces the computational burden and represents video content better.
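The DC-coefficient part of the idea above can be sketched as follows: treat the DC coefficients of each MPEG frame as a tiny "DC image" and select a new key frame whenever its gray-level histogram moves far from that of the last selected key frame. This is an illustrative stand-in (the motion-vector cue and the actual bitstream parsing are omitted), and the threshold and distance choice are assumptions.

```python
import numpy as np

def keyframes_from_dc(dc_images, threshold=0.3):
    """Select key frames where the DC-image histogram changes sharply
    relative to the last selected key frame; distances are measured with
    total variation between normalized histograms."""
    def hist(img):
        h = np.bincount(img.ravel(), minlength=256).astype(float)
        return h / h.sum()
    keys = [0]
    ref = hist(dc_images[0])
    for i in range(1, len(dc_images)):
        h = hist(dc_images[i])
        if 0.5 * np.abs(h - ref).sum() > threshold:  # total-variation distance
            keys.append(i)
            ref = h
    return keys
```

Because DC images are available from the bitstream with only partial decoding, this kind of test runs far faster than any pixel-domain shot or key-frame analysis.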
17.
A recursive algorithm is presented that is designed to construct quantisation tables and codebooks for the hierarchical vector quantisation of images. The algorithm is computationally inexpensive and yields high-quality codebooks.
18.
19.
20.
In order to complement subjective evaluation of the quality of segmentation masks, this paper introduces a procedure for automatically assessing this quality. Algorithmically computed figures of merit are proposed. Assuming the existence of a perfect reference mask (ground truth), generated manually or with a reliable procedure over a test set, these figures of merit take into account visually desirable properties of a segmentation mask in order to provide the user with metrics that best quantify the spatial and temporal accuracy of the segmentation masks. For the sake of easy interpretation, results are presented on a peak signal-to-noise ratio (PSNR)-like logarithmic scale.
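A minimal sketch of the PSNR-like presentation mentioned above: map the fraction of pixels where the estimated mask disagrees with the ground-truth mask onto a logarithmic dB scale. The exact figures of merit in the paper are richer (they weight visually important errors); this stand-in, including the 99 dB cap for perfect masks, is an assumption for illustration.

```python
import numpy as np

def mask_quality_db(est_mask, ref_mask):
    """PSNR-like figure of merit: the fraction of mislabeled pixels mapped
    to a logarithmic dB scale (higher is better; a perfect match is capped
    at 99 dB to avoid an infinite score)."""
    err = np.mean(est_mask.astype(bool) != ref_mask.astype(bool))
    if err == 0:
        return 99.0
    return float(-10.0 * np.log10(err))
```

On this scale, 1% mislabeled pixels gives 20 dB and 0.1% gives 30 dB, so each extra 10 dB corresponds to a tenfold reduction in spatial error, mirroring how PSNR is read for image fidelity.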