Similar Documents
20 similar documents found.
1.
Image segmentation, the extraction of meaningful regions from an image, is a key technique in image processing and computer vision. Because automatic methods handle images with complex foregrounds poorly, this paper proposes an interactive foreground-extraction algorithm based on region centers. Since a complex foreground is hard to describe as a single homogeneous region, multiple region centers are used to characterize the target region. To improve the stability of segmentation, a similarity measure based on superpixel color, spatial position, and texture is given; to guarantee the connectivity and accuracy of the segmented regions, a superpixel-based geodesic distance is defined. The local density of superpixels under this geodesic distance identifies several candidate region centers, and user interaction selects the region centers belonging to the foreground, from which the foreground is obtained. Simulations on a large set of color images show that a small amount of user interaction effectively improves the stability and accuracy of segmentation.
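
A minimal sketch of the geodesic local-density step described above, assuming the dissimilarities between adjacent superpixels are already given (the paper combines color, position, and texture; the weighting and cutoff here are assumptions):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def region_centers(adj_weights, n_centers=3):
    """Rank superpixels by local density under geodesic distance.

    adj_weights: (n, n) matrix of dissimilarities between adjacent
    superpixels; zeros mean "not adjacent". The geodesic distance is
    taken as the shortest path over this adjacency graph.
    """
    geo = dijkstra(csr_matrix(adj_weights), directed=False)
    # Density cutoff: a low percentile of the finite distances (an assumption).
    dc = np.percentile(geo[np.isfinite(geo)], 10)
    # Local density = number of superpixels within the cutoff distance.
    density = (geo < dc).sum(axis=1)
    # Candidate region centers = the densest superpixels.
    return np.argsort(density)[::-1][:n_centers]
```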

2.
Instance segmentation on RGB images can fail in regions where objects have similar texture but different classes. To address this, depth information is introduced and, exploiting the 3D geometric structure of RGB-D images, an RGB-D instance segmentation method built on a dual-pyramid feature-fusion network is proposed. Two pyramid deep convolutional networks of different complexity extract RGB features and depth features at matching pyramid resolutions; the two feature maps of each resolution are summed and fed to the region proposal network, changing the shared features that the proposal layer receives. The shared features then pass through the classification, regression, and mask branches to output localization and classification results, completing RGB-D instance segmentation. Experiments show that the dual-pyramid feature-fusion model accomplishes RGB-D instance segmentation and effectively learns the complementary information between depth and color images; compared with Mask R-CNN without depth information, average precision improves by 7.4%.
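
A sketch of the sum-fusion idea under stated assumptions: torchvision FPN backbones of different complexity (torchvision ≥ 0.13 assumed) stand in for the paper's two pyramids, and the depth map is replicated to three channels. This is not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

class DualPyramidFusion(nn.Module):
    """Two FPN backbones of different complexity; pyramid levels of
    matching resolution are summed to form the shared RPN features."""
    def __init__(self):
        super().__init__()
        self.rgb_fpn = resnet_fpn_backbone(backbone_name="resnet50", weights=None)
        self.depth_fpn = resnet_fpn_backbone(backbone_name="resnet18", weights=None)

    def forward(self, rgb, depth3):
        # depth3: depth map replicated to 3 channels to fit the backbone.
        f_rgb = self.rgb_fpn(rgb)
        f_d = self.depth_fpn(depth3)
        # Element-wise sum at each pyramid level (both FPNs emit 256 channels).
        return {k: f_rgb[k] + f_d[k] for k in f_rgb}
```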

3.

Augmented Reality applications are set to revolutionize the smartphone industry due to the integration of RGB-D sensors into mobile devices. Given the large number of smartphone users, efficient storage and transmission of RGB-D data is of paramount interest to the research community. While there exist video coding standards such as HEVC and H.264/AVC for compression of the RGB/texture component, the coding of depth data is still an area of active research. This paper presents a method for coding depth videos, captured from mobile RGB-D sensors, by planar segmentation. The segmentation algorithm is based on Markov Random Field assumptions on depth data and solved using graph cuts. While all prior works based on this approach remain restricted to images only and to noise-free conditions, this paper presents an efficient solution to planar segmentation in noisy depth videos. Also presented is a unique method to encode depth based on its segmented planar representation. Experiments on depth captured from a noisy sensor (Microsoft Kinect) show superior rate-distortion performance over the 3D extension of the HEVC codec.
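
A sketch of the core planar-representation step under stated assumptions: a least-squares plane fit to a back-projected depth patch with placeholder pinhole intrinsics. The paper's MRF/graph-cut labeling over such plane hypotheses is not reproduced here.

```python
import numpy as np

def fit_plane(depth, fx, fy, cx, cy):
    """Fit z = a*x + b*y + c to the valid pixels of a depth patch,
    after back-projecting with intrinsics (fx, fy, cx, cy)."""
    v, u = np.nonzero(depth > 0)                 # valid pixel coordinates
    z = depth[v, u].astype(np.float64)
    x = (u - cx) * z / fx                        # back-project to camera space
    y = (v - cy) * z / fy
    A = np.column_stack([x, y, np.ones_like(z)])
    (a, b, c), *_ = np.linalg.lstsq(A, z, rcond=None)
    return a, b, c                               # plane parameters
```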


4.
Pan Baiyu, Zhang Liming, Yin Hanxiong, Lan Jun, Cao Feilong. Multimedia Tools and Applications, 2021, 80(13): 19179-19201

3D movies/videos have become increasingly popular in the market; however, they are usually produced by professionals. This paper presents a new technique for the automatic conversion of 2D to 3D video based on RGB-D sensors, which can be easily conducted by ordinary users. To generate a 3D image, one approach is to combine the original 2D color image and its corresponding depth map together to perform depth image-based rendering (DIBR). An RGB-D sensor is one of the inexpensive ways to capture an image and its corresponding depth map. The quality of the depth map and the DIBR algorithm are crucial to this process. Our approach is twofold. First, the depth maps captured directly by RGB-D sensors are generally of poor quality because there are many regions missing depth information, especially near the edges of objects. This paper proposes a new RGB-D sensor based depth map inpainting method that divides the regions with missing depths into interior holes and border holes. Different schemes are used to inpaint the different types of holes. Second, an improved hole filling approach for DIBR is proposed to synthesize the 3D images by using the corresponding color images and the inpainted depth maps. Extensive experiments were conducted on different evaluation datasets. The results show the effectiveness of our method.
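
A toy sketch of the DIBR step the abstract describes: shift pixels by disparity = focal × baseline / Z to synthesize a virtual view, painting far pixels first so nearer ones win. Intrinsics and baseline are placeholder values; the paper's depth-inpainting and hole-filling schemes are not reproduced.

```python
import numpy as np

def dibr_right_view(color, depth, baseline=0.05, focal=525.0):
    """Return a naively warped virtual view and its hole mask."""
    h, w, _ = color.shape
    out = np.zeros_like(color)
    holes = np.ones((h, w), bool)
    disp = np.where(depth > 0, focal * baseline / np.maximum(depth, 1e-6), 0.0)
    for v in range(h):
        order = np.argsort(-depth[v])            # far-to-near painting order
        tgt = np.clip(np.round(order - disp[v, order]).astype(int), 0, w - 1)
        out[v, tgt] = color[v, order]
        holes[v, tgt] = False
    return out, holes                            # the holes need inpainting
```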


5.
Objective: Much prior work on salient object detection targets 2D images and does not transfer to saliency detection on RGB-D images. This paper extracts color features and depth features jointly and proposes an RGB-D saliency detection method based on feature fusion and S-D probability correction, so that the two kinds of features complement each other. Method: First, taking the four borders of the RGB image as background query nodes, feature-fused Manifold Ranking outputs a saliency map of the RGB image. Second, an S-D correction probability is computed from the RGB saliency map and the depth features. Third, a saliency map of the depth image is computed and corrected with the S-D correction probability. Finally, foreground query nodes are extracted from the corrected saliency map and feature-fused Manifold Ranking is applied once more to refine it, yielding the final saliency map. Results: The method was evaluated on 1,000 images of the RGBD dataset against six other methods; its results are closer to the manual ground truth. Precision-recall (PR) curves show higher precision than five of the methods at equal recall, and processing a single image takes 2.150 s, which is also competitive. Conclusion: The method detects saliency in RGB-D images accurately.
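
The Manifold Ranking building block the abstract relies on has a standard closed form; a minimal sketch follows, using the unnormalized variant common in graph-based saliency. The construction of the feature-fused affinity matrix W is an assumption left to the caller.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    """Rank all graph nodes against the query indicator vector y.

    W: (n, n) symmetric affinity matrix between superpixels/nodes.
    y: (n,) indicator of query nodes (1 for queries, 0 otherwise).
    Solves f* = (D - alpha * W)^{-1} y, D = diagonal degree matrix.
    """
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(D - alpha * W, y)
```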

6.
Interactive image segmentation is an important branch of image segmentation with wide applications in everyday life and in medicine. Building on the heat method for computing geodesic distance, this paper introduces a heat-diffusion coefficient and proposes an interactive segmentation algorithm based on non-uniform heat diffusion. The algorithm constructs a triangle mesh from the image's color information as the diffusion medium: the heat equation first finds the direction in which distance increases, and a Poisson equation then recovers the geodesic distance. Increasing the diffusion speed over the user-marked foreground regions shrinks the geodesic distance between different parts of the foreground and removes internal boundaries; with a constraint on the outer boundary, the complete foreground is segmented. Only two sparse linear systems need to be solved, so the algorithm is robust, accurate, and easy to operate. Moreover, the precomputed Laplacian and gradient operators can be reused many times, reducing memory use and time cost. Extensive interactive segmentation experiments show that complex foregrounds in real images can be segmented quickly and accurately without excessive user interaction.
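
For reference, the two sparse linear solves the abstract mentions correspond to the steps of the heat method (Crane et al.), sketched here in its standard form; the paper's non-uniform diffusion coefficient would modulate the first solve.

```latex
% Heat method for geodesic distance from a source set \gamma:
% 1) integrate heat flow for a short time t; 2) normalize the gradient;
% 3) recover the distance \phi via a Poisson solve.
\begin{aligned}
(M - t\,L_c)\,u &= \delta_\gamma, \\
X &= -\,\nabla u \,/\, \lVert \nabla u \rVert, \\
L_c\,\phi &= \nabla \cdot X .
\end{aligned}
```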

7.
An Attention-Aware and Semantics-Aware Semantic Segmentation Algorithm for RGB-D Indoor Images
In recent years, fully convolutional networks have substantially improved the accuracy of semantic segmentation. However, owing to the complexity of indoor environments, indoor scene semantic segmentation remains a challenging problem. With the advent of depth sensors, researchers have sought to use depth information to improve segmentation. Most previous studies simply fuse RGB features and depth features with equal-weight concatenation or summation, failing to fully exploit the complementary information between them. This paper proposes...
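
The abstract is truncated before detailing the paper's module, so as a generic illustration of the alternative to equal-weight fusion it criticizes, here is one common squeeze-and-excitation-style attention fusion; this is an assumption, not this paper's design.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Channel attention that reweights RGB and depth features
    before summing, instead of equal-weight concat/sum."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, 2 * channels, 1), nn.Sigmoid())

    def forward(self, f_rgb, f_depth):
        w = self.gate(torch.cat([f_rgb, f_depth], dim=1))
        w_rgb, w_d = torch.chunk(w, 2, dim=1)     # per-channel weights
        return w_rgb * f_rgb + w_d * f_depth
```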

8.
To address imprecise semantic segmentation and coarse prediction maps for indoor scene images, a network architecture based on optimized multimodal feature extraction and dual-path guided decoding (feature regulator and dual-path guidance, FG-Net) is proposed. Specifically, the feature regulator successively applies noise filtering, reweighted representation, complementary difference exchange, and interactive fusion to the multimodal features at each stage, strengthening the aggregation of RGB and depth features and optimizing the multimodal representation during feature extraction. In the decoding stage, the rich cross-modal cues obtained after interactive fusion are introduced to further exploit the advantages of multimodal features. Combined with the dual-path cooperative guidance structure, multi-scale, multi-level feature information is fused during decoding to output a more detailed prediction map. Experiments on the public NYUD-v2 and SUN RGB-D datasets reach 48.5% on the main metric mIoU, surpassing other state-of-the-art algorithms. The results show that the algorithm achieves finer semantic segmentation of indoor scene images with good generalization and robustness.

9.

Achieving polite service with a public service robot requires it to proactively ascertain who will interact with it in human-populated environments. Inspired by the interactive inference of intentions among humans, we investigate a novel and practical method for predicting people's interactive intentions using bimodal information analysis on a public service robot. Unlike traditional research, in which only visual cues are used to analyze the user's attention, this method combines RGB-D camera and laser information to perceive the user, realizing 360-degree perception and compensating for the limited field of view of the RGB-D camera. In addition, seven kinds of interactive-intent features are extracted, and a random forest regression model is trained to score the interaction intentions of the people in the field of view. Considering the inference order of the two different sensors, a priority rule for intention inference is also designed. The algorithm is implemented in the Robot Operating System (ROS) and evaluated on our public service robot. Extensive experimental results illustrate that the proposed method enables public service robots to achieve a higher level of politeness than the traditional, passive approach in which robots wait for commands from users.
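
A minimal sketch of the scoring step: random forest regression over a 7-dimensional intent-feature vector. The feature definitions and training data here are placeholders, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_train = rng.random((200, 7))   # 7 interactive-intent features per person
y_train = rng.random(200)        # annotated intention scores (placeholder)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# One intention score per person currently perceived in the field of view.
scores = model.predict(rng.random((3, 7)))
print(scores)
```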

10.
Extracting foreground objects from videos captured by a handheld camera has emerged as a new challenge. While existing approaches exploit clues such as depth and motion to extract the foreground layer, they have limitations in handling partial movement and cast shadows. In this paper, we bring a novel perspective to these two issues by utilizing occlusion maps induced by object and camera motion and taking advantage of interactive image segmentation methods. For partial movement, we treat each video frame as an image and synthesize "seeding" user interactions (i.e., user-marked foreground and background) from both forward and backward occlusion maps to leverage advances in high-quality interactive image segmentation. For cast shadow, we utilize a paired-region-based shadow detection method to further refine initial segmentation results by removing detected shadow regions. Qualitative and quantitative evaluations on the Hopkins dataset demonstrate both the effectiveness and the efficiency of our proposed approach.
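
A sketch of the "synthesized seeding" idea under stated assumptions: occlusion-derived foreground/background seeds are fed to an off-the-shelf interactive segmenter, here OpenCV GrabCut; the paper does not specify this particular segmenter.

```python
import cv2
import numpy as np

def segment_with_seeds(frame, fg_seeds, bg_seeds, iters=5):
    """frame: 8-bit BGR image; fg_seeds/bg_seeds: binary seed masks
    synthesized from forward/backward occlusion maps (assumed given)."""
    mask = np.full(frame.shape[:2], cv2.GC_PR_BGD, np.uint8)
    mask[bg_seeds > 0] = cv2.GC_BGD          # definite background seeds
    mask[fg_seeds > 0] = cv2.GC_FGD          # definite foreground seeds
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(frame, mask, None, bgd, fgd, iters, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```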

11.
Semantic segmentation based on the complementary information from RGB and depth images has recently gained great popularity, but owing to the difference between RGB and depth maps, how to use RGB-D information effectively remains a problem. In this paper, we propose a novel RGB-D semantic segmentation network named RAFNet, which selectively gathers features from the RGB and depth information. Specifically, we construct an architecture with three parallel branches and propose several complementary attention modules. This structure enables a fusion branch, to which we add the Bi-directional Multi-step Propagation (BMP) strategy; it not only retains the feature streams of the original RGB and depth branches but also fully utilizes the feature flow of the fusion branch. We construct three kinds of complementary attention modules: the RGB-D fusion module effectively extracts important features from the RGB and depth branch streams, the refinement module reduces the loss of semantic information, and the context aggregation module helps propagate and integrate information. We train and evaluate our model on the NYUDv2 and SUN-RGBD datasets and show that it achieves state-of-the-art performance.

12.
In this paper, we present a new algorithm that utilizes low-quality red, green, blue and depth (RGB-D) data from the Kinect sensor for face recognition under challenging conditions. The algorithm extracts multiple features and fuses them at the feature level. A Finer Feature Fusion technique is developed that removes redundant information and retains only the meaningful features for maximum class separability. We also introduce a new 3D face database acquired with the Kinect sensor, which has been released to the research community. This database contains over 5,000 facial images (RGB-D) of 52 individuals under varying pose, expression, illumination and occlusion. Under the first three variations and using only the noisy depth data, the proposed algorithm achieves a 72.5% recognition rate, significantly higher than the 41.9% achieved by the baseline LDA method. Combined with texture information, a 91.3% recognition rate is achieved under illumination, pose and expression variations. These results suggest the feasibility of low-cost 3D sensors for real-time face recognition.

13.
Objective: In indoor scene semantic segmentation, depth information improves accuracy to some extent, but how to use it effectively remains an open problem. Most current methods inject all of the depth information, yet combining all depth information with visual features may interfere with the model: objects that the network can already distinguish from visual features alone may be misjudged once depth is introduced. In addition, the fixed geometric structure of convolution kernels limits the modeling capacity of convolutional neural networks; deformable convolution (DC) alleviates this problem to some extent, but the visual feature space from which the position offsets are produced is relatively short of depth information, limiting further improvement. This paper therefore proposes a depth guided feature extraction (DFE) module. Method: The module consists of a depth guided feature selection (DFS) module and a depth embedded deformable convolution (DDC) module. DFS filters out the key depth information and adaptively adjusts the proportion in which depth is introduced into the visual features, embedding depth information when the network needs it...
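
A sketch of the depth-embedded deformable convolution idea under stated assumptions: the sampling offsets are predicted from visual features concatenated with depth, then applied with torchvision's DeformConv2d. Layer sizes are assumptions, not the paper's DDC module.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DepthOffsetDeformConv(nn.Module):
    """Predict deformable-conv offsets from (features + depth)."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        # 2 * k * k offset channels: an (x, y) shift per kernel tap.
        self.offset = nn.Conv2d(c_in + 1, 2 * k * k, k, padding=k // 2)
        self.dconv = DeformConv2d(c_in, c_out, k, padding=k // 2)

    def forward(self, feat, depth):
        # feat: (N, C, H, W); depth: (N, 1, H, W), same resolution.
        off = self.offset(torch.cat([feat, depth], dim=1))
        return self.dconv(feat, off)
```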

14.
This paper analyzes, with a new perspective, the recent state-of-the-art in gesture recognition approaches that exploit both RGB and depth data (RGB-D images). The most relevant papers are analyzed to point out which features and classifiers work best with depth data, whether these fundamentals are specifically designed to process RGB-D images and, above all, how depth information can improve gesture recognition beyond the limit of standard approaches based solely on color images. Papers are reviewed in depth to find the relation between gesture complexity and the suitability of features and methodologies. Different types of gestures are discussed, focusing on the kinds of datasets (public or private) used to compare results, in order to understand whether they provide a good representation of actual challenging problems, such as gesture segmentation, idle-gesture recognition, and gesture-length invariance. Finally, the paper discusses current open problems and highlights future directions of research in the processing of RGB-D data for gesture recognition.

15.
RGB-D images add depth information to the RGB information of a scene and can effectively describe both its color and its 3D geometry. Exploiting the characteristics of RGB and depth images, a reverse-fusion instance segmentation algorithm is proposed that fuses high-level semantic features back into low-level edge-detail features. The method extracts RGB and depth features with feature pyramid networks (FPN) of different depths, upsamples the high-level features to the size of the lowest-level features, and fuses the high levels back into the low levels; a mask refinement structure is also introduced in the mask branch, realizing reverse-fusion instance segmentation for RGB-D images. Experiments show that the reverse-fusion feature model achieves better results in RGB-D instance segmentation and effectively fuses the two kinds of features from depth and color images: with ResNet-101 as the backbone, average precision is 10.6% higher than Mask R-CNN without depth information and 4.5% higher than directly forward-fusing the two features.

16.
This paper proposes an algorithm for converting 2D badminton match video to 3D. In such videos the foreground attracts the most attention, and accurately extracting foreground objects from the background is the key to obtaining the depth map. An improved graph-cut algorithm extracts the foreground; a background depth model is built from the scene structure to obtain the background depth map, on top of which depth values are assigned to foreground objects according to their distance from the camera, giving the foreground depth map. The background and foreground depth maps are then merged into a complete depth map. Finally, depth-image-based rendering (DIBR) synthesizes the stereoscopic image pairs for 3D display. Experimental results show that the generated stereoscopic pairs deliver a good 3D effect.

17.
This paper presents an innovative background modeling technique that accurately segments foreground regions in RGB-D imagery (RGB plus depth). The technique is based on a Bayesian framework that efficiently fuses different sources of information to segment the foreground. In particular, the final segmentation is obtained by considering a prediction of the foreground regions, carried out by a novel Bayesian network with a depth-based dynamic model, together with two independent depth- and color-based mixture-of-Gaussians background models. The efficient Bayesian combination of all these data reduces the noise and uncertainty introduced by the color and depth features and the corresponding models. As a result, more compact segmentations and refined foreground object silhouettes are obtained. Experimental results on different databases suggest that the proposed technique outperforms existing state-of-the-art algorithms.
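
A minimal sketch of the two-model idea under stated assumptions: one mixture-of-Gaussians background model on color and one on depth (OpenCV's MOG2 here), combined with a placeholder OR rule. The paper's Bayesian-network combination is more elaborate than this.

```python
import cv2

mog_color = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
mog_depth = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

def foreground(frame_bgr, depth_u8):
    """frame_bgr: 8-bit color frame; depth_u8: 8-bit depth frame."""
    m_c = mog_color.apply(frame_bgr)     # color-based foreground mask
    m_d = mog_depth.apply(depth_u8)      # depth-based foreground mask
    return cv2.bitwise_or(m_c, m_d)      # placeholder fusion rule
```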

18.
Jointly learning RGB image features and 3D geometric information from the RGB-D domain benefits indoor scene semantic segmentation, but traditional segmentation methods usually require an accurate depth map as input, which severely limits their applicability. A new indoor scene understanding framework is proposed: a joint learning model built on semantic-feature and depth-feature extraction networks extracts depth-aware features, and a geometry-guided depth-feature propagation module together with a pyramid feature-fusion module combines the learned depth features, multi-scale spatial information, and semantic features into a more expressive representation, achieving more accurate indoor scene semantic segmentation. Experimental results show that the joint learning model reaches 69.5% and 68.4% mean segmentation accuracy on the NYU-Dv2 and SUN RGBD datasets respectively, giving better indoor semantic segmentation performance and wider applicability than traditional methods.

19.
艾青林, 王威, 刘刚江. 《机器人》, 2022, 44(4): 431-442
To address the low localization accuracy and poor mapping of existing RGB-D SLAM (simultaneous localization and mapping) systems in dynamic indoor environments, an RGB-D SLAM algorithm based on grid segmentation and coupled dual maps is proposed. Grid-wise motion segmentation is performed using homography motion compensation and bidirectionally compensated optical flow, together with geometric connectivity and depth-image clustering, while keeping the algorithm fast. The camera pose is estimated by minimizing the reprojection error of feature points within static regions. Combining the camera poses, the RGB-D images, and the grid motion-segmentation images, a sparse point-cloud map and a static octree map of the scene are built simultaneously and coupled; on keyframes, static map points are selected by grid segmentation and octree ray traversal, and the sparse point-cloud map is updated to preserve localization accuracy. Experiments on public datasets and in real dynamic scenes show that the algorithm effectively improves camera pose estimation in dynamic indoor scenes and builds and updates the static octree map of the scene in real time. Moreover, it runs in real time on a standard CPU platform, with no need for a GPU or other extra computing resources.

20.
Objective: Depth images are a common representation of 3D scene information and are widely used in stereo vision. The Kinect depth camera captures scene depth in real time, but because of internal hardware limits and external interference, the captured depth images suffer from low resolution and inaccurate edges, falling short of practical needs. A Kinect depth-image super-resolution algorithm guided by color-image edges is therefore proposed. Method: The depth image is first upsampled to the target resolution and the edges of the initialized depth image are extracted; then, exploiting the similarity between the high-resolution color image and the depth image, a structured-learning edge detector extracts the correct depth edges; finally, the unreliable regions between the erroneous edges of the initialized depth map and the correct depth edges are found and filled by interpolation with an edge-alignment strategy. Results: Experiments on the NYU2 dataset compare the method with eight recent depth super-resolution algorithms, validated on both the reconstructed depth images and the 3D point clouds built from them. The results show that while raising depth resolution, the algorithm effectively corrects the edges of the upsampled depth image, aligning depth edges with texture edges and suppressing the edge blur introduced by upsampling; the point clouds show that the algorithm separates foreground from background accurately and yields better results than the other algorithms in applications such as 3D reconstruction. Conclusion: The algorithm applies generally to Kinect depth-image super-resolution; by exploiting the similarity between the color and depth images of the same scene and using texture edges to guide the reconstruction, it obtains good results.
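
As a point of comparison (not this paper's edge-alignment method), color-guided depth upsampling is often illustrated with joint bilateral upsampling, where the high-resolution color image steers the interpolation of the low-resolution depth map so that depth edges follow color edges. The sketch below assumes opencv-contrib-python is installed.

```python
import cv2
import numpy as np

def joint_bilateral_upsample(depth_lr, color_hr, d=9, s_color=25.0, s_space=7.0):
    """depth_lr: low-resolution depth map; color_hr: 8-bit BGR guide image."""
    h, w = color_hr.shape[:2]
    # Naive upsampling first; the joint filter then sharpens edges.
    depth_up = cv2.resize(depth_lr, (w, h), interpolation=cv2.INTER_LINEAR)
    guide = cv2.cvtColor(color_hr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    return cv2.ximgproc.jointBilateralFilter(
        guide, depth_up.astype(np.float32), d, s_color, s_space)
```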
