期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Semantic part segmentation method based 3D object pose estimation with RGB-D images for bin-picking

《Robotics and Computer》2021

3D object pose estimation for grasping and manipulation is a crucial task in robotic and industrial applications. Robustness and efficiency for robotic manipulation are desirable properties that are still very challenging in complex and cluttered scenes, because 3D objects have different appearances, illumination and occlusion when seen from different viewpoints. This article proposes a Semantic Point Pair Feature (PPF) method for 3D object pose estimation, which combines the semantic image segmentation using deep learning with the voting-based 3D object pose estimation. The Part Mask RCNN ispresented to obtain the semantic object-part segmentation related to the point cloud of object, which is combined with the PPF method for 3D object pose estimation. In order to reduce the cost of collecting datasets in cluttered scenes, a physically-simulated environment is constructed to generate labeled synthetic semantic datasets. Finally, two robotic bin-picking experiments are demonstrated and the Part Mask RCNN for scene segmentation is evaluated through the constructed 3D object datasets. The experimental results show that the proposed Semantic PPF methodimproves the robustness and efficiency of 3D object pose estimation in cluttered scenes with partial occlusions. 相似文献

2.

融合注意力特征的遮挡物体6D姿态估计

马康哲皮家甜熊周兵吕佳《计算机应用》2022,42(12):3715-3722

在机械臂视觉抓取过程中,现有的算法在复杂背景、光照不足、遮挡等条件下,难以对目标物体进行实时、准确、鲁棒的姿态估计。针对以上问题,提出一种基于关键点方法的融合注意力特征的物体6D姿态网络。首先,在跳跃连接（Skip Connection）阶段引入能够聚焦通道空间信息的卷积注意力模块（CBAM）,使编码阶段的浅层特征与解码阶段的深层特征进行有效融合,增强特征图的空间域信息和精确位置通道信息;其次,采用归一化损失函数以弱监督的方式回归每个关键点的注意力图,将注意力图作为对应像素位置上关键点偏移量的权重分数;最后,累加求和得到关键点坐标。实验结果证明,所提网络在LINEMOD数据集和Occlusion LINEMOD数据集上ADD(-S)指标分别达到了91.3%和46.3%。与基于关键点的逐像素投票网络（PVNet）相比ADD(-S)指标分别提升了5.0个百分点和5.5个百分点,验证了所提网络在遮挡场景下有更好的鲁棒性。相似文献

3.

基于点云采样权重估计的未知物体抓取位姿生成方法

蔡子豪杨亮黄之峰《控制与决策》2023,38(10):2859-2866

针对机械臂在非结构环境中对未知物体抓取位姿生成困难及抓取稳定性差的问题,提出一种基于点云采样权重估计的抓取位姿生成方法.首先通过移动深度相机的方式拼接得到较完整的物体点云信息,并对物体的几何特性进行分析,有效避开物体不宜抓取的位置进行抓取位姿样本生成;然后结合几何约束条件实现抓取位姿搜索,并利用力封闭条件对样本稳定性进行评估;最后为了对实际的抓取位姿进行评价,根据其稳定性、夹取深度、夹取角度等设定抓取可行性指标,据此在工作空间输出最佳抓取位姿并完成指定的抓取任务.实验结果表明,采用所提方法能够高效生成大量且稳定的抓取位姿,并在仿真环境中有效实现机械臂对单个或多个随机摆放的未知物体的抓取任务. 相似文献

4.

基于关键点的类别级三维可形变目标姿态估计

曾一芳《计算机应用研究》2022,39(2):587-592

为了解决类别级三维可形变目标姿态估计问题,基于目标的关键点,提出了一种面向类别的三维可形变目标姿态估计方法。该方法设计了一种基于关键点的端到端深度学习框架,框架以PointNet++为后端网络,通过特征提取、部位分割、关键点提取和基于关键点的姿态估计部分实现可形变目标的姿态估计,具有计算精度高、鲁棒性强等优势。同时,基于ANCSH方法设计了适用于K-AOPE网络的关键点标准化分层表示方法,该方法仅需提取目标少量的关键点即可表示类别物体。为了验证方法的有效性,在公共数据集shape2motion上进行测试。实验结果显示,提出的姿态估计方法(以眼镜类别为例)在旋转角上的误差分别为2.3°、3.1°、3.7°,平移误差分别为0.034、0.030、0.046,连接状态误差为2.4°、2.5°,连接参数误差为1.2°、0.9°,0.008、0.010。与ANCSH方法相比,所提方法具有较高的准确性和鲁棒性。相似文献

5.

Segmentation based 6D pose estimation using integrated shape pattern and RGB information

Gu Chaochen Feng Qi Lu Changsheng Zhao Shuxin Xu Rui 《Pattern Analysis & Applications》2022,25(4):1055-1073

Pattern Analysis and Applications - Point cloud is currently the most typical representation in describing the 3D world. However, recognizing objects as well as the poses from point clouds is still... 相似文献

6.

基于深度学习的三维点云头部姿态估计

肖仕华桑楠王旭鹏《计算机应用》2020,40(4):996-1001

快速、可靠的头部姿态估计算法是高级人脸分析任务的基础。为了解决现有算法存在的光照变化、遮挡、姿态尺度较大等问题,提出一种新的深度学习框架HPENet。该网络以点云数据为输入,首先通过最远点采样算法提取点云结构中的特征点,以特征点为球心,将不同半径的球体内的点构成多个分组,用于后续的特征描述;然后采用多层感知器和最大池化层实现点云的特征提取,提取的特征通过全连接层输出预测的头部姿态。为了验证HPENet的有效性,在公共数据集Biwi Kinect Head Pose上进行测试。实验结果显示,HPENet在俯仰角、侧倾角和偏航角上的误差分别为2.3°、1.5°、2.4°,平均每帧的时间消耗为8 ms。与其他优秀算法相比,所提方法在准确度和计算的复杂度方面都具有更好的性能。相似文献

7.

融合时序特征约束与联合优化的点云3维人体姿态序列估计

下载免费PDF全文

廖联军钟重阳张智恒胡磊张子豪夏时洪《中国图象图形学报》2022,27(12):3608-3621

目的 3维人体姿态估计传统方法通常采用单帧点云作为输入,可能会忽略人体运动平滑度的固有先验知识,导致产生抖动伪影。目前,获取2维人体姿态标注的真实图像数据集相对容易,而采集大规模的具有高质量3维人体姿态标注的真实图像数据集进行完全监督训练有一定难度。对此,本文提出了一种新的点云序列3维人体姿态估计方法。方法首先从深度图像序列估计姿态相关点云,然后利用时序信息构建神经网络,对姿态相关点云序列的时空特征进行编码。选用弱监督深度学习,以利用大量的更容易获得的带2维人体姿态标注的数据集。最后采用多任务网络对人体姿态估计和人体运动预测进行联合训练,提高优化效果。结果在两个数据集上对本文算法进行评估。在ITOP(invariant-top view dataset)数据集上,本文方法的平均精度均值(mean average precision,mAP)比对比方法分别高0.99%、13.18%和17.96%。在NTU-RGBD数据集上,本文方法的mAP值比最先进的WSM(weakly supervised adversarial learning methods)方法高7.03%。同时,在ITOP数据集上对模型进行消融实验,验证了算法各个不同组成部分的有效性。与单任务模型训练相比,多任务网络联合进行人体姿态估计和运动预测的mAP可以提高2%以上。结论本文提出的点云序列3维人体姿态估计方法能充分利用人体运动连续性的先验知识,获得更平滑的人体姿态估计结果,在ITOP和NTU-RGBD数据集上都能获得很好的效果。采用多任务网络联合优化策略,人体姿态估计和运动预测两个任务联合优化求解,有互相促进的作用。相似文献

8.

结合图像分割和点云分割的障碍物检测算法

莫建文芦爱余张彤《计算机工程与设计》2015,(7)

针对单目图像检测障碍物的低可靠性和当前双目视觉检测障碍物的局限性的问题,提出一种结合图像分割和点云分割技术的双目视觉障碍物检测方法。通过设定检测深度范围,分割障碍物点云与道路点云;采用将分割出的障碍物点云对应的视差图与图像分割得到的子图进行比较的策略,有效解决对不同深度、倾斜面和不规则障碍物检测效果差的问题。通过实验验证了在获得稀疏三维点云的情况下,该方法对障碍物的检测具有较好的鲁棒性。相似文献

9.

三维点云法向量估计综述 总被引：6，自引：1，他引：5

下载免费PDF全文

李宝程志全党岗金士尧《计算机工程与应用》2010,46(23):1-7

由于获取方便、表示简单、灵活等优势,点云逐渐成为常用的三维模型表示方法之一。法向量作为点云必不可少的属性之一,其估计方法在点云处理中具有重要的位置。另一方面,由于点云获取过程中不可避免的噪声、误差和遮挡,点云中通常含有噪声、外点和空洞,并且部分采样模型如CAD模型,也会存在尖锐特征,这些都给法向量估计提出了挑战。对当前已有的点云法向量估计算法进行综述,分析其原理及关键技术,着重分析它们在处理噪声、外点和尖锐特征等方面的能力并给出比较,最后为未来研究提供了一些建议。相似文献

10.

3D pose estimation for articulated vehicles using Kalman-filter based tracking

C. Fuchs F. Neuhaus D. Paulus 《Pattern Recognition and Image Analysis》2016,26(1):109-113

Knowledge about relative poses within a tractor/trailer combination is a vital prerequisite for kinematic modelling and trajectory estimation. In case of autonomous vehicles or driver assistance systems, for example, the monitoring of an attached passive trailer is crucial for operational safety. We propose a camerabased 3D pose estimation system based on a Kalman-filter. It is evaluated against previously published methods for the same problem. 相似文献

11.

Object class segmentation of massive 3D point clouds of urban areas using point cloud topology

Rico Richter Markus Behrens Jürgen Döllner 《International journal of remote sensing》2013,34(23):8408-8424

A large number of remote-sensing techniques and image-based photogrammetric approaches allow an efficient generation of massive 3D point clouds of our physical environment. The efficient processing, analysis, exploration, and visualization of massive 3D point clouds constitute challenging tasks for applications, systems, and workflows in disciplines such as urban planning, environmental monitoring, disaster management, and homeland security. We present an approach to segment massive 3D point clouds according to object classes of virtual urban environments including terrain, building, vegetation, water, and infrastructure. The classification relies on analysing the point cloud topology; it does not require per-point attributes or representative training data. The approach is based on an iterative multi-pass processing scheme, where each pass focuses on different topological features and considers already detected object classes from previous passes. To cope with the massive amount of data, out-of-core spatial data structures and graphics processing unit (GPU)-accelerated algorithms are utilized. Classification results are discussed based on a massive 3D point cloud with almost 5 billion points of a city. The results indicate that object-class-enriched 3D point clouds can substantially improve analysis algorithms and applications as well as enhance visualization techniques. 相似文献

12.

Accurate constrained pose estimation for small objects

V. Eruhimov 《Pattern Recognition and Image Analysis》2013,23(3):389-392

The paper proposes a novel method for accurate pose estimation of small objects. A range sensor is used to find object orientation that constraints a PnP solver. The method finds the global minimum of a cost function (as opposed to traditional PnP algorithms) in closed form. This approach was used to enable robust robotic operation with sub millimeter precision. 相似文献

13.

局部特征表征的6D位姿估计算法

王晨露陈立家《计算机应用研究》2022,39(12)

为解决有纹理模型在遮挡条件下6D位姿估计精确度不高的问题,提出了一种局部特征表征的端到端6D位姿估计算法。首先为了得到准确的定位信息,提出了一个空间—坐标注意力机制（spatial and coordinate attention）,通过在YOLOv5网络中加入空间—坐标注意力机制和加权双向特征金字塔网络（bidirectional feature pyramid network）,YOLOv5-CBE算法的精确度（precision）、召回率（recall）、平均精度均值（mAP@0.5）分别提升了3.6%、2.8%、2.5%,局部特征中心点坐标误差最高提升了25%;然后用 YOLOv5-CBE算法检测局部特征关键点,结合3D Harris关键点通过奇异值分解法（singular value decomposition）计算模型的6D位姿,最高遮挡70%的情况下仍然可以保证二维重投影精度（2D reprojection accuracy）和ADD度量精度（ADD accuracy）在95%以上,具有较强的鲁棒性。相似文献

14.

低质量渲染图像的目标物体6D姿态估计

左国玉张成威刘洪星龚道雄《控制与决策》2022,37(1):135-141

从图像中获取目标物体的6D位姿信息在机器人操作和虚拟现实等领域有着广泛的应用,然而,基于深度学习的位姿估计方法在训练模型时通常需要大量的训练数据集来提高模型的泛化能力,一般的数据采集方法存在收集成本高同时缺乏3D空间位置信息等问题.鉴于此,提出一种低质量渲染图像的目标物体6D姿态估计网络框架.该网络中,特征提取部分以单张RGB图像作为输入,用残差网络提取输入图像特征;位姿估计部分的目标物体分类流用于预测目标物体的类别,姿态回归流在3D空间中回归目标物体的旋转角度和平移矢量.另外,采用域随机化方法以低收集成本方式构建大规模低质量渲染、带有物体3D空间位置信息的图像数据集Pose6DDR.在所建立的Pose6DDR数据集和LineMod公共数据集上的测试结果表明了所提出位姿估计方法的优越性以及大规模数据集域随机化生成数据方法的有效性. 相似文献

15.

结合掩码定位和漏斗网络的6D姿态估计

下载免费PDF全文

李冬冬郑河荣刘复昌潘翔《中国图象图形学报》2022,27(2):642-652

目的 6D姿态估计是3D目标识别及重建中的一个重要问题。由于很多物体表面光滑、无纹理,特征难以提取,导致检测难度大。很多算法依赖后处理过程提高姿态估计精度,导致算法速度降低。针对以上问题,本文提出一种基于热力图的6D物体姿态估计算法。方法首先,采用分割掩码避免遮挡造成的热力图污染导致的特征点预测准确率下降问题。其次,基于漏斗网络架构,无需后处理过程,保证算法具有高效性能。在物体检测阶段,采用一个分割网络结构,使用速度较快的YOLOv3（you only look once v3）作为网络骨架,目的在于预测目标物体掩码分割图,从而减少其他不相关物体通过遮挡带来的影响。为了提高掩码的准确度,增加反卷积层提高特征层的分辨率并对它们进行融合。然后,针对关键点采用漏斗网络进行特征点预测,避免残差网络模块由于局部特征丢失导致的关键点检测准确率下降问题。最后,对检测得到的关键点进行位姿计算,通过PnP （perspective-n-point）算法恢复物体的6D姿态。结果在有挑战的Linemod数据集上进行实验。实验结果表明,本文算法的3D误差准确性为82.7%,与热力图方法相比提高了10%;2D投影准确性为98.9%,比主流算法提高了4%;同时达到了15帧/s的检测速度。结论本文提出的基于掩码和关键点检测算法不仅有效提高了6D姿态估计准确性,而且可以维持高效的检测速度。相似文献

16.

MPF6D: masked pyramid fusion 6D pose estimation

Pereira Nuno Alexandre Luís A. 《Pattern Analysis & Applications》2023,26(3):1363-1373

Pattern Analysis and Applications - Object pose estimation has multiple important applications, such as robotic grasping and augmented reality. We present a new method to estimate the 6D pose of... 相似文献

17.

Precise and efficient pose estimation of stacked objects for mobile manipulation in industrial robotics challenges

Gi Hyun Lim Nuno Lau Eurico Pedrosa Filipe Amaral Artur Pereira José Luís Azevedo 《Advanced Robotics》2019,33(13):636-646

Object manipulation tasks such as picking up, carrying and placing should be executed based on the information of objects which are provided by the perception system. A precise and efficient pose estimation system has been developed to address the requirements and to achieve the objectives for autonomous packaging, specifically picking up of stacked non-rigid objects. For fine pose estimation, a drawing pin shaped kernel and pinhole filtering methods are used on the roughly estimated pose of objects. The system has been applied in a realistic industrial environment as a challenging scenario for the Challenge 2 – Shop Floor Logistics and Manipulation on a mobile manipulator in the context of the European Robotics Challenges (EuRoC) project. 相似文献

18.

Online urban object recognition in point clouds using consecutive point information for urban robotic missions

《Robotics and Autonomous Systems》2014,62(8):1130-1152

Urban object recognition is the ability to categorize ambient objects into several classes and it plays an important role in various urban robotic missions, such as surveillance, rescue, and SLAM. However, there were several difficulties when previous studies on urban object recognition in point clouds were adopted for robotic missions: offline-batch processing, deterministic results in classification, and necessity of many training examples. The aim of this paper is to propose an urban object recognition algorithm for urban robotic missions with useful properties: online processing, classification results with probabilistic outputs, and training with a few examples based on a generative model. To achieve this, the proposed algorithm utilizes the consecutive point information (CPI) of a 2D LIDAR sensor. This additional information was useful for designing an online algorithm consisting of segmentation and classification. Experimental results show that the proposed algorithm using CPI enhances the applicability of urban object recognition for various urban robotic missions. 相似文献

19.

Automatic segmentation of point clouds from multi-view reconstruction using graph-cut

Rongjiang Pan Gabriel Taubin 《The Visual computer》2016,32(5):601-609

In multi-view reconstruction systems, the recovered point cloud often consists of numerous unwanted background points. We propose a graph-cut based method for automatically segmenting point clouds from multi-view reconstruction. Based on the observation that the object of interest is likely to be central to the intended multi-view images, our method requires no user interaction except two roughly estimated parameters of objects covering in the central area of images. The proposed segmentation process is carried out in two steps: first, we build a weighted graph whose nodes represent points and edges that connect each point to its k-nearest neighbors. The potentials of each point being object and background are estimated according to distances between its projections in images and the corresponding image centers. The pairwise potentials between each point and its neighbors are computed according to their positions, colors and normals. Graph-cut optimization is then used to find the initial binary segmentation of object and background points. Second, to refine the initial segmentation, Gaussian mixture models (GMMs) are created from the color and density features of points in object and background classes, respectively. The potentials of each point being object and background are re-calculated based on the learned GMMs. The graph is updated and the segmentation of point clouds is improved by graph-cut optimization. The second step is iterated until convergence. Our method requires no manual labeling points and employs available information of point clouds from multi-view systems. We test the approach on real-world data generated by multi-view reconstruction systems. 相似文献

20.

Haptic interaction with objects in a picture based on pose estimation

Seung-Chan Kim Dong-Soo Kwon 《Multimedia Tools and Applications》2014,72(2):2041-2062

In pictures, every object is displayed in 2D space. Seeing the 2D image, people can perceptually reconstruct and understand information regarding the scene. To enable users to haptically interact with an object that appears in the image, the present study proposes a geometry-based haptic rendering method. More specifically, our approach is intended to estimate haptic information from the object’s structure contained in an image while preserving the two-dimensional visual information. Of the many types of objects that can be seen in everyday pictures, this paper mainly deals with polyhedron figures or objects composed of rectangular faces, some of which might be shown in a slanted configuration in the picture. To obtain the geometric layout of the object being viewed from the image plane, we first estimate homographic information that describes a mapping from the object coordinate to the target image coordinate. Then, we transform the surface normals of the object face using the extrinsic part of homography that locates the face of the object we are viewing. Because the transformed normals are utilized for calculating the force in the image space, we call this process normal vector perturbation in the 2D image space. To physically represent the estimated normal vector without distorting the visual information, we employed a lateral haptic rendering scheme in that it fits with our interaction styles on 2D images. The active force value at a given position on the slanted faces is calculated during the interaction phase. To evaluate our approach, we conducted an experiment with different stimulus conditions, in which it was found that participants could reliably estimate the geometric layout that appears in the picture. We conclude with explorations of applications and a discussion of future work. 相似文献