Similar Documents
20 similar documents retrieved (search time: 31 ms)
1.
In this paper we present a novel foreground segmentation system that combines information from color and depth sensors to perform a more complete Bayesian segmentation between foreground and background classes. The system combines spatial-color and spatial-depth region-based models for the foreground with pixel-wise color and depth models for the background, within a Logarithmic Opinion Pool decision framework that correctly combines the likelihoods of each model. A posterior enhancement step based on trimap analysis is also proposed to correct the precision errors introduced by the depth sensor. The results presented in this paper show that our system is robust to color and depth camouflage between the foreground object and the background, and also improves segmentation around object contours by reducing the false positive detections that arise from the limited precision of depth sensors.
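As a rough illustration of the Logarithmic Opinion Pool fusion described above, the following minimal Python sketch combines per-pixel foreground likelihoods from a color model and a depth model; the weights, likelihood arrays, and decision threshold are illustrative assumptions rather than the paper's actual settings.

```python
import numpy as np

def logarithmic_opinion_pool(likelihoods, weights):
    """Fuse per-pixel class likelihoods from several models.

    likelihoods: list of HxW arrays, one per model (e.g. color, depth),
                 each giving P(pixel | foreground) for that model.
    weights:     per-model reliability weights that sum to 1.
    Returns the fused likelihood, proportional to prod_i P_i ** w_i.
    """
    eps = 1e-12  # avoid log(0)
    log_pool = sum(w * np.log(np.clip(p, eps, 1.0))
                   for p, w in zip(likelihoods, weights))
    return np.exp(log_pool)

# Toy usage: fuse color- and depth-based foreground likelihoods.
color_fg = np.random.rand(4, 4)
depth_fg = np.random.rand(4, 4)
fused_fg = logarithmic_opinion_pool([color_fg, depth_fg], [0.6, 0.4])
fg_mask = fused_fg > 0.5  # assumed threshold; the paper compares against a background pool
```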

2.
Because salient objects usually occupy only a small portion of a scene, salient object detection (SOD) often suffers from class imbalance. To address this issue and obtain consistent salient objects, we propose an adversarial focal loss network built on improved generative adversarial networks for RGB-D SOD (called AFLNet), in which color and depth branches form the generator that produces the saliency map, while an adversarial branch with high-order potentials, instead of a pixel-wise loss function, refines the generator's output to capture contextual information about objects. We derive an adversarial focal loss function to address the foreground–background class imbalance. To fully fuse the high-level features of the color and depth cues, an inception model is adopted in the deep layers. We conduct extensive experiments with the proposed model and its variants and compare them with state-of-the-art methods. Quantitative and qualitative results show that our approach improves the accuracy of salient object detection and yields consistent objects.
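The class-imbalance remedy builds on the focal loss; a minimal numpy sketch of the standard binary focal loss is shown below (the paper's adversarial variant is not reproduced in the abstract, so only the textbook form appears here).

```python
import numpy as np

def focal_loss(pred, target, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss, which down-weights easy examples so the many
    background pixels do not dominate the few salient ones.

    pred:   HxW array of predicted foreground probabilities in (0, 1).
    target: HxW array of binary ground-truth labels {0, 1}.
    alpha, gamma: the commonly used default values, assumed here.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # p_t is the probability assigned to the true class of each pixel.
    p_t = np.where(target == 1, pred, 1.0 - pred)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
    return loss.mean()
```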

3.
Depth maps have proven useful as a supplement for salient object detection in recent years. However, most RGB-D salient object detection approaches ignore the fact that depth maps are often of low quality, which inevitably leads to unsatisfactory results. In this paper, we propose a depth cue enhancement and guidance network (DEGNet) for RGB-D salient object detection that explores depth quality enhancement and uses depth cue guidance to generate predictions with highlighted objects and suppressed backgrounds. Specifically, a depth cue enhancement module is designed to generate high-quality depth maps by enhancing the contrast between the foreground and the background. Then, considering the different characteristics of unimodal RGB and depth features, we apply different feature enhancement strategies to strengthen the representation capability of the side-output unimodal features. Moreover, we propose a depth-guided feature fusion module that exploits the depth cues provided by the depth stream to guide the fusion of multi-modal features, making full use of the properties of each modality to generate discriminative cross-modal features. Finally, we aggregate the cross-modal features at different levels with a pyramid feature shrinking structure to obtain the final prediction. Experimental results on six benchmark datasets demonstrate that DEGNet outperforms 17 state-of-the-art methods.

4.
Moving object detection based on color information is easily affected by illumination and shadows, while detection based on depth information suffers from heavy noise along object edges and cannot detect objects close to the background. To address these problems, this paper builds, for each pixel, separate classifiers over the color information acquired by a CCD camera and the depth information acquired by a TOF camera, and adaptively assigns different weights to the output of each classifier according to the pixel's depth characteristics and the detection result of the previous frame, thereby detecting moving objects. Experiments on several captured video sequences show that the method effectively resolves the problems that arise when color or depth information is used alone for moving object detection.
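A toy version of the per-pixel adaptive fusion might look like the sketch below; the specific weighting rule, which trusts the depth cue only where local depth contrast is high and keeps a floor weight on previously detected pixels, is an illustrative assumption, since the abstract does not specify the actual rule.

```python
import numpy as np

def fuse_color_depth(p_color, p_depth, depth, prev_mask,
                     near_bg_thresh=0.05):
    """Adaptively weight per-pixel color and depth classifier outputs.

    p_color, p_depth: HxW foreground probabilities from the two classifiers.
    depth:            HxW normalized depth map in [0, 1].
    prev_mask:        HxW binary detection result from the previous frame.
    The weighting rule here is an assumption: trust the color cue more
    where the object lies close to the background in depth.
    """
    # Gradient magnitude of depth as a crude "near background" indicator.
    gy, gx = np.gradient(depth)
    depth_contrast = np.hypot(gx, gy)
    w_depth = np.clip(depth_contrast / near_bg_thresh, 0.0, 1.0)
    # Pixels detected last frame keep a floor weight on the depth cue.
    w_depth = np.where(prev_mask > 0, np.maximum(w_depth, 0.5), w_depth)
    fused = w_depth * p_depth + (1.0 - w_depth) * p_color
    return fused > 0.5
```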

5.
Extracting accurate foreground objects from a scene is an essential step in many video applications. Traditional background subtraction algorithms can generate coarse estimates, but producing high-quality masks requires professional software with significant human intervention, e.g., providing trimaps or labeling key frames. We propose an automatic foreground extraction method for applications where a static but imperfect background is available, such as filming and surveillance, where the background can be captured before the objects enter the scene or after they leave it. Our proposed method is very robust and produces significantly better estimates than state-of-the-art background subtraction, video segmentation, and alpha matting methods. The key innovation is a novel information fusion technique: the fusion framework integrates the individual strengths of alpha matting, background subtraction, and image denoising to produce an overall better estimate. Such integration is particularly important when handling complex scenes with an imperfect background. We show how the framework is developed and how the individual components are built, and we conduct extensive experiments and ablation studies to evaluate the proposed method.
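For context, the "coarse estimate" a traditional subtraction step produces against a pre-captured static background can be sketched as follows; the per-channel threshold tau is an assumed value, and the paper's actual fusion of matting, subtraction, and denoising is considerably more elaborate.

```python
import numpy as np

def background_subtraction(frame, background, tau=30.0):
    """Coarse foreground estimate against a static (possibly imperfect)
    background image, the kind of initial mask a fusion framework
    would then refine.

    frame, background: HxWx3 uint8 color images.
    tau:               assumed per-channel difference threshold.
    """
    diff = np.abs(frame.astype(np.float32) - background.astype(np.float32))
    # A pixel is foreground if it deviates strongly in any channel.
    return (diff.max(axis=2) > tau).astype(np.uint8)
```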

6.
A Segmentation-Based Depth Map Extraction Method for Defocused Images
Targeting the large number of defocused images in film and television productions, this paper proposes a depth map extraction method for defocused images. The in-focus foreground and the defocused background are first separated. For the defocused background, a depth-model-matching method is proposed: depth map models are constructed and, guided by the human eye's keen judgment of scene depth, the background is matched against the corresponding depth map model to build the background depth map; a color-segmentation-based depth map refinement is then proposed to further improve the accuracy of the scene depth map. A single depth value is assigned to the foreground, which is fused with the background depth map to generate the final depth map. Experiments show that the depth maps extracted by this method perform well in both depth-discontinuity and depth-smooth regions.

7.
Extracting foreground content from document images with complex backgrounds is very difficult, because the background texture and color and the foreground font, size, color, and tilt are not known in advance. In this work, we propose an RGB color model for input complex color document images, together with an algorithm that detects text regions using Gabor filters and then extracts the text using the color feature luminance. The proposed approach consists of three stages. In stage 1, candidate image segments containing text are detected based on Gabor features. Because of the complex background, some high-frequency non-text objects in the background are also detected as text objects in this stage. In stage 2, a number of these false text objects are discarded through connected component analysis. In stage 3, the image segments containing textual information obtained from the previous stage are binarized to extract the foreground text. The color feature luminance is extracted from the input color document image, and the threshold value is derived automatically from this feature. The proposed approach handles both printed and handwritten color document images with foreground text in any color, font, size, and orientation. For experimental evaluation, we considered a variety of document images with non-uniform/uniform textured and multicolored backgrounds. Segmentation of the foreground text is evaluated with a commercially available OCR engine; the results show better recognition accuracy for foreground characters in the processed document images than in the unprocessed ones.
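Stage 1's Gabor-based detection can be illustrated with a small filter bank; every kernel parameter below is an assumed value, not the paper's tuned configuration.

```python
import cv2
import numpy as np

def gabor_text_response(gray, num_orientations=4):
    """Maximum response over a small Gabor filter bank, a common way to
    highlight text-like high-frequency strokes.

    gray: HxW uint8 grayscale document image.
    """
    responses = []
    for k in range(num_orientations):
        theta = k * np.pi / num_orientations
        kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
    # Text regions tend to respond strongly at some orientation.
    return np.max(np.stack(responses), axis=0)
```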

8.
Unlike 2D saliency detection, 3D saliency detection can take into account the effects of depth and binocular parallax. In this paper, we propose a 3D saliency detection approach based on background detection via depth information. By exploiting the synergy between a color image and the corresponding depth map, our approach detects the distant background and surfaces with gradual changes in depth. We then use the detected background to predict, through polynomial fitting, the potential characteristics of the background regions occluded by foreground objects; this step imitates the human imagination/envisioning process. Finally, a saliency map is obtained from the contrast between the foreground objects and the potential background. We compare our approach with 14 state-of-the-art saliency detection methods on three publicly available databases. The proposed model demonstrates good performance and succeeds in detecting and removing backgrounds and surfaces of gradually varying depth on all tested databases.
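The envisioning step can be approximated by fitting a low-order polynomial surface to the detected background depth and evaluating it under the foreground; the degree-2 surface below is an illustrative assumption.

```python
import numpy as np

def fit_background_depth(depth, bg_mask):
    """Fit a smooth quadratic surface to detected background depth and
    extrapolate it under foreground pixels, mimicking the 'envisioned'
    occluded background described above.

    depth:   HxW depth map.
    bg_mask: HxW boolean mask of detected background pixels.
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]

    def design(x, y):
        # 2D monomials up to degree 2, stacked along the last axis.
        return np.stack([np.ones_like(x), x, y, x * y, x**2, y**2], axis=-1)

    A = design(xs[bg_mask].astype(float), ys[bg_mask].astype(float))
    coeffs, *_ = np.linalg.lstsq(A, depth[bg_mask].astype(float), rcond=None)
    # Evaluate the fitted surface everywhere, including occluded regions.
    return design(xs.astype(float), ys.astype(float)) @ coeffs
```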

9.
To address the difficulty of fast and accurate face detection when an image contains a complex background or multiple faces, this paper proposes a face detection method based on skin-color segmentation and an improved AdaBoost algorithm. First, skin-color segmentation is used to preprocess the sample images, excluding the complex background and non-skin regions of the human body and thereby simplifying the subsequent face detection. Then, a dual-threshold decision scheme is applied to the weak classifiers of the AdaBoost algorithm to reduce their number and speed up training, and the weight-update rule is improved to prevent the over-weighting phenomenon during training. Finally, the improved AdaBoost algorithm performs precise face detection on the region images obtained from skin-color segmentation. Simulation experiments show that the combined algorithms train faster and achieve clear improvements in detection speed and detection rate.
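The skin-color preprocessing stage is commonly implemented with fixed Cr/Cb bounds in YCrCb space; the sketch below uses widely cited literature bounds, which are an assumption rather than this paper's thresholds.

```python
import cv2
import numpy as np

def skin_mask(bgr):
    """Classic skin-color segmentation in YCrCb space, the kind of
    preprocessing step described above.

    bgr: HxWx3 uint8 image in OpenCV's BGR order.
    """
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], dtype=np.uint8)    # Y, Cr, Cb (assumed bounds)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Morphological opening removes small non-skin speckles.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```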

10.
Natural and Seamless Image Composition With Color Control
While state-of-the-art image composition algorithms subtly handle the object boundary to achieve seamless image copy-and-paste, they are unable to preserve the color fidelity of the source object, often require a considerable amount of user interaction, and often fail to achieve realism when there is a salient discrepancy between the background textures of the source and destination images. These observations motivate our research toward color-controlled, natural, and seamless image composition with minimal user interaction. In particular, based on the Poisson image editing framework, we first propose a variational model that considers both the gradient constraint and color fidelity, allowing users to control the coloring effect caused by gradient-domain fusion. Second, to reduce user interaction, we propose a distance-enhanced random walks algorithm that avoids the need for accurate image segmentation while still highlighting the foreground object. Third, we propose a multiresolution framework that performs image composition in different subbands, separating the texture and color components so as to simultaneously achieve smooth texture transitions and the desired color control. The experimental results demonstrate that our framework achieves better and more realistic results for images with salient background color or texture differences, while providing results comparable to the state of the art for images that neither require preserving the object's color fidelity nor exhibit significant background texture discrepancy.
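The abstract does not reproduce the variational model, but a screened-Poisson-style energy that balances the Poisson gradient constraint against a color-fidelity term, which is an assumption consistent with the description, would take the form:

```latex
% f: composite over region \Omega;  v: gradient field of the source object;
% c: reference colors of the source;  \lambda: color-fidelity weight;
% f^*: destination image, fixed on the region boundary \partial\Omega.
\min_{f}\; \int_{\Omega} \left\lVert \nabla f - v \right\rVert^{2} \,\mathrm{d}p
\;+\; \lambda \int_{\Omega} \left\lVert f - c \right\rVert^{2} \,\mathrm{d}p ,
\qquad \text{s.t.}\; f\big|_{\partial\Omega} = f^{*}\big|_{\partial\Omega}
```

With \(\lambda = 0\) this reduces to classic Poisson image editing; increasing \(\lambda\) pulls the composite back toward the source colors, which matches the paper's stated goal of user-controllable coloring.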

11.
This paper presents a technique for semi-automatic 2D-to-3D stereo video conversion, in which a user assigns foreground/background depths for key frames and depth maps for non-key frames are then obtained via automatic depth propagation. Our algorithm treats foreground and background separately. For foregrounds, kernel pixels are identified and used as seeds for graph-cut segmentation in each non-key frame independently, so the results are not limited by the objects' motion activity. For backgrounds, all video frames, after foreground removal, are integrated into a common background sprite model (BSM) based on a relay-frame-based image registration algorithm. Users can then draw background depths for the BSM in an integrated manner, reducing human effort significantly. Experimental results show that our method retains more faithful foreground depth boundaries (by 1.6–2.7 dB) and smoother background depths than prior works. This advantage is helpful for 3D display and 3D perception.

12.
In this paper, we propose a novel framework to extract text regions from scene images with complex backgrounds and multiple text appearances. The framework consists of three main steps: boundary clustering (BC), stroke segmentation, and string fragment classification. In BC, we propose a new bigram-color-uniformity-based method to model both the text and its attachment surface, clustering edge pixels into boundary layers based on color pairs and spatial positions. Stroke segmentation is then performed at each boundary layer by color assignment to extract character candidates. We propose two algorithms that combine structural analysis of the text stroke with color assignment to filter out background interference. Further, we design a robust string fragment classifier based on Gabor-based text features obtained from feature maps of gradient, stroke distribution, and stroke width. The proposed text localization framework is evaluated on scene images, born-digital images, broadcast video images, and images of handheld objects captured by blind persons. Experimental results on the respective datasets demonstrate that the framework outperforms state-of-the-art localization algorithms.

13.
In this paper, we propose a fully automatic image segmentation and matting approach for RGB-Depth (RGB-D) data based on iterative transductive learning. The algorithm consists of two key elements: robust hard segmentation for trimap generation, and iterative transductive learning for image matting. The hard segmentation step is formulated as a Maximum A Posteriori (MAP) estimation problem, in which we iteratively perform depth refinement and bi-layer classification to achieve optimal results. For image matting, we propose a transductive learning algorithm that iteratively adjusts the weights between the objective function and the constraints, overcoming common issues such as over-smoothing in existing methods. In addition, we present a new way to form the Laplacian matrix in transductive learning by ranking the similarities of neighboring pixels, which is essential for efficient and accurate matting. Extensive experimental results demonstrate the state-of-the-art performance of our method both subjectively and quantitatively.

14.
Although current visual simultaneous localization and mapping (SLAM) algorithms provide highly accurate tracking and mapping, most are too heavy to run live on embedded devices, and the maps they produce are often unsuitable for path planning. To mitigate these issues, we propose a completely closed-loop online dense RGB-D SLAM algorithm targeting autonomous indoor mobile robot navigation tasks. The proposed algorithm runs live on an NVIDIA Jetson board embedded on a two-wheel differential-drive robot. It exhibits lightweight three-dimensional mapping, room-scale consistency, accurate pose tracking, and robustness to moving objects. Further, we introduce a navigation strategy based on the proposed algorithm. Experimental results demonstrate the robustness of the proposed SLAM algorithm, its computational efficiency, and its benefits for on-the-fly navigation while mapping.

15.
16.
Maritime signal processing has emerged as an important area of study with the increasing popularity of autonomous ships and automatic maritime surveillance systems. However, existing object detection and tracking techniques still cannot handle the maritime noise sources that cause many false positives in maritime visual surveillance. Such scenes are challenging because of severe dynamic backgrounds, wakes, and reflections arising from the complex, unconstrained, and diverse surface properties of water. Moreover, few studies have investigated maritime-specific noise filtering as a general, integrated processing step for image and video technologies in maritime visual surveillance. In this study, we propose a novel maritime noise prior (MNP) based on the dark channel prior and observations of the characteristics of the sea, and we develop a general maritime filtering technique to suppress water-induced noise in maritime images and videos. The proposed method is a non-iterative, non-linear, and simple maritime filter based on the MNP that requires no specialized knowledge of the scene's conditions or structure. We conducted image and video experiments on three publicly available databases. In the color image experiments, our method successfully filtered the related background noise and water effects, i.e., severe boat wakes and reflections, while preserving non-water objects. In the video experiments, the proposed filter improved the overall performance of state-of-the-art background subtraction (BS) algorithms by 36.60%–50.63%. By combining BS algorithms with filtering to enhance foreground detection in video sequences, the proposed method offers the universal applicability and flexibility required to remove noise from images and videos captured in challenging maritime environments. The results indicate that the proposed method is well suited to maritime surveillance applications involving image segmentation and foreground detection, and it can potentially increase the accuracy of maritime visual surveillance.
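Since the MNP builds on the dark channel prior, a minimal sketch of the underlying dark channel computation is shown below; the patch size is an assumed value, and the sea-specific observations that turn this prior into the MNP are not reproduced here.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel of a color image: per-pixel minimum over the color
    channels followed by a local minimum filter. Shown only as the
    textbook building block underlying the maritime noise prior.

    img:   HxWx3 float image in [0, 1].
    patch: local window size (assumed).
    """
    min_channels = img.min(axis=2)            # min over R, G, B
    return minimum_filter(min_channels, size=patch)
```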

17.
侯小刚  赵海英  马严 《电子学报》2019,47(10):2126-2133
To improve the efficiency of high-resolution image segmentation and to solve the problem of incomplete target segmentation caused by the low contrast between foreground and background near the target edges in complex patterns, this paper introduces superpixel HOG features and proposes a fast image segmentation algorithm based on superpixel multi-feature fusion (SMFF). First, a state-of-the-art superpixel algorithm pre-segments the input image into superpixels. Then, superpixel-level HOG features, Lab color features, and spatial position features are extracted, and a superpixel-based multi-feature similarity measure is designed. Finally, graph-cut theory is used to achieve fast image segmentation based on superpixel multi-feature fusion. Experimental results verify the effectiveness of the algorithm: its segmentation quality is close to that of the most established image segmentation algorithms, while its running time is clearly superior to the compared methods.
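The feature extraction stage can be sketched in a few lines; the following Python fragment uses SLIC superpixels and computes per-superpixel Lab color and spatial-position features (the superpixel HOG feature is omitted for brevity, and the parameter values are illustrative assumptions).

```python
import numpy as np
from skimage.segmentation import slic
from skimage.color import rgb2lab

def superpixel_features(rgb, n_segments=400):
    """Superpixel pre-segmentation plus per-superpixel Lab color and
    spatial-position features, two of the three cues fused by SMFF.

    rgb: HxWx3 float image in [0, 1].
    """
    labels = slic(rgb, n_segments=n_segments, compactness=10)
    lab = rgb2lab(rgb)
    ys, xs = np.mgrid[0:rgb.shape[0], 0:rgb.shape[1]]
    feats = []
    for s in np.unique(labels):
        m = labels == s
        color = lab[m].mean(axis=0)                   # mean L, a, b
        pos = np.array([ys[m].mean(), xs[m].mean()])  # centroid
        feats.append(np.concatenate([color, pos]))
    return labels, np.array(feats)
```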

18.
At present, most salient object detection methods focus on 2D images, but the saliency detection required for RGB-D images differs greatly from that for pure 2D images, calling for new methods suited to RGB-D data. Building on a classic RGB saliency detection technique, the application of the extreme learning machine, this paper proposes a new RGB-D saliency detection method that integrates feature extraction, foreground enhancement, and depth-level detection. The method proceeds as follows. First, 4096-dimensional features are extracted from the RGB image at four superpixel scales. Second, according to the number of superpixels at each of the four scales, RGB, LAB, and LBP features are extracted from the RGB image and LBE features from the depth map. Third, a coarse saliency map is computed from the LBE and dark channel features, and the foreground is progressively strengthened and the background weakened while traversing the four scales. Fourth, foreground and background seeds are selected from the coarse saliency map and fed into an extreme learning machine for classification, yielding the first-stage saliency map. Fifth, depth-level detection, graph cuts, and other methods further refine the first-stage saliency map into the second-stage, i.e., final, saliency map.
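The classification stage uses an extreme learning machine, which admits a very compact implementation: a fixed random hidden layer followed by closed-form least-squares output weights. The sketch below is a generic ELM, with the hidden-layer size and activation as assumptions.

```python
import numpy as np

def elm_train(X, y, n_hidden=200, seed=0):
    """Minimal extreme learning machine, matching the classifier role
    it plays in the pipeline above.

    X: (N, d) feature matrix; y: (N,) binary labels {0, 1}.
    Returns (W, b, beta) for use with elm_predict.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)           # fixed random feature map
    beta = np.linalg.pinv(H) @ y     # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta # saliency score per sample
```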

19.
Global motion estimation (GME) is a vital part of many video compression and computer vision applications. However, the large moving foreground objects present in many video scenes make GME more challenging. In this paper, we propose an automatic, efficient, and robust GME approach that addresses the issue of large foreground objects. The proposed algorithm is based on two key ideas: a new clustering technique that automates the initial segmentation of background and foreground blocks, and a modified Lorentzian estimator that reduces the impact of any remaining foreground blocks on the estimation process. We also apply an up-sampling technique to the estimated motion parameters to remove errors caused by under-sampling during the warping process. Combined into a common framework, these ideas yield a significant improvement in performance. Simulation results and analyses demonstrate the improved performance of our algorithm over other state-of-the-art methods.
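The standard Lorentzian robust estimator and its IRLS weight are easy to state; the paper uses a modified variant whose exact form is not given in the abstract, so the sketch below shows only the textbook version.

```python
import numpy as np

def lorentzian_rho(residual, sigma=1.0):
    """Lorentzian robust loss, rho(r) = log(1 + (r/sigma)^2 / 2), which
    grows slowly for large residuals so foreground blocks that disobey
    the global motion model are down-weighted."""
    return np.log1p(0.5 * (residual / sigma) ** 2)

def lorentzian_weight(residual, sigma=1.0):
    """IRLS weight w(r) = rho'(r) / r derived from the loss above."""
    return 1.0 / (sigma ** 2 + 0.5 * residual ** 2)
```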

20.
Object segmentation of unknown objects with arbitrary shape in cluttered scenes is an ambitious goal in computer vision, and it received a strong impulse from the introduction of cheap and powerful RGB-D sensors. We introduce a framework for segmenting RGB-D images in which data is processed hierarchically. After pre-clustering at the pixel level, parametric surface patches are estimated. Different relations between patch pairs, derived from perceptual grouping principles, are calculated, and support vector machine classification is employed to learn perceptual grouping. Finally, we show that object hypothesis generation with graph cut finds a globally optimal solution and prevents wrong grouping. Our framework is able to segment objects even when they are stacked or jumbled in cluttered scenes, and it also tackles the problem of segmenting partially occluded objects. The work is evaluated on publicly available object segmentation databases and compared with state-of-the-art object segmentation work.
