Model-based recognition of 3D objects from single images   Cited by: 1 (self-citations: 0; citations by others: 1)
In this work, we treat major problems of object recognition which have received relatively little attention lately. Among them are the loss of depth information in the projection from a 3D object to a single 2D image, and the complexity of finding feature correspondences between images. We use geometric invariants to reduce the complexity of these problems. There are no geometric invariants of a projection from 3D to 2D. However, given certain modeling assumptions about the 3D object, such invariants can be found. The modeling assumptions can be either a particular model or a generic assumption about a class of models. Here, we use such assumptions for single-view recognition. We find algebraic relations between the invariants of a 3D model and those of its 2D image under general projective projection. These relations can be described geometrically as invariant models in a 3D invariant space, illuminated by invariant “light rays,” and projected onto an invariant version of the given image. We apply the method to real images.


Due to severe articulation, self-occlusion, various scales, and high dexterity of the hand, hand pose estimation is more challenging than body pose estimation. Recently developed body pose estimation algorithms are not suitable for addressing the unique challenges of hand pose estimation because they are trained without explicitly modeling structural relationships between keypoints. In this paper, we propose a novel cascaded hierarchical CNN (CH-HandNet) for 2D hand pose estimation from a single color image. The CH-HandNet includes three modules: hand mask segmentation, preliminary 2D hand pose estimation, and hierarchical estimation. The first module obtains a hand mask using a hand mask segmentation network. The second module connects the hand mask and the intermediate image features to estimate the 2D hand heatmaps. The last module connects the hand heatmaps with the intermediate image features and hand mask to estimate finger and palm heatmaps hierarchically. Finally, the extracted Finger (pinky, ring, middle, index) and Palm (thumb and palm) feature information is fused to estimate the 2D hand pose. Experimental results on three datasets (OneHand 10k, Panoptic, and Eric.Lee) consistently show that our proposed CH-HandNet outperforms previous state-of-the-art hand pose estimation methods.


We propose an end-to-end deep learning architecture for simultaneously detecting objects and recovering 6D poses in an RGB image. Concretely, we extend the 2D detection pipeline with a pose estimation module to indirectly regress the image coordinates of the object's 3D vertices based on 2D detection results. Then the object's 6D pose can be estimated using a Perspective-n-Point algorithm without any post-refinement. Moreover, we carefully design a backbone structure to maintain the spatial resolution of low-level features for the pose estimation task. Compared with state-of-the-art RGB-based pose estimation methods, our approach achieves competitive or superior performance on two benchmark datasets at an inference speed of 25 fps on a GTX 1080Ti GPU, which is sufficient for real-time processing.

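To address category-level pose estimation of 3D deformable objects, a category-oriented pose estimation method for 3D deformable objects is proposed based on object keypoints. The method designs a keypoint-based end-to-end deep learning framework that uses PointNet++ as its backbone network and estimates the pose of deformable objects through feature extraction, part segmentation, keypoint extraction, and keypoint-based pose estimation, offering high accuracy and strong robustness. In addition, building on the ANCSH method, a keypoint normalized hierarchical representation suited to the K-AOPE network is designed, which can represent the objects of a category using only a small number of extracted keypoints. To verify the method's effectiveness, it is tested on the public dataset shape2motion. Experimental results show that the proposed pose estimation method (taking the eyeglasses category as an example) yields rotation errors of 2.3°, 3.1°, and 3.7°, translation errors of 0.034, 0.030, and 0.046, joint state errors of 2.4° and 2.5°, and joint parameter errors of 1.2° and 0.9° and of 0.008 and 0.010. Compared with the ANCSH method, the proposed method achieves higher accuracy and robustness.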

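For computing the 3D pose from a single portrait image, a method is proposed that combines facial measurement with projective geometry. First, the eye corners, mouth corners, and nasal alar points are selected within the planar region of the face to build a face model. Then, the rotation direction of the face plane is obtained from the positions of the vanishing points onto which two mutually perpendicular feature lines on the face plane project in the photograph. The feature points are easy to mark, and the method requires no auxiliary equipment or prior knowledge, which makes it reasonably practical.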
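The vanishing-point step in the abstract above can be illustrated with a standard projective-geometry relation. This is a minimal sketch, not the authors' published computation: it assumes square pixels and a principal point at a known location (for example the image center), and the function name `plane_normal_from_vanishing_points` is hypothetical. Under those assumptions, two vanishing points of perpendicular directions determine the focal length, and the cross product of the two back-projected directions gives the normal of the face plane.

```python
import math


def plane_normal_from_vanishing_points(vp1, vp2, principal_point):
    """Return (focal_length, unit_normal) of a plane from two vanishing points.

    vp1, vp2: image positions (x, y) of the vanishing points of two
    perpendicular lines lying on the plane.
    principal_point: assumed principal point (x, y), e.g. the image center.
    """
    # Express both vanishing points relative to the principal point.
    u1, v1 = vp1[0] - principal_point[0], vp1[1] - principal_point[1]
    u2, v2 = vp2[0] - principal_point[0], vp2[1] - principal_point[1]

    # Orthogonality of the 3D directions (u1, v1, f) and (u2, v2, f)
    # gives u1*u2 + v1*v2 + f^2 = 0.
    f_squared = -(u1 * u2 + v1 * v2)
    if f_squared <= 0:
        raise ValueError("vanishing points are inconsistent with perpendicular lines")
    f = math.sqrt(f_squared)

    # Back-projected 3D directions of the two feature lines.
    d1 = (u1, v1, f)
    d2 = (u2, v2, f)

    # The plane normal is perpendicular to both directions.
    n = (
        d1[1] * d2[2] - d1[2] * d2[1],
        d1[2] * d2[0] - d1[0] * d2[2],
        d1[0] * d2[1] - d1[1] * d2[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    return f, tuple(c / length for c in n)
```

The sign of the returned normal is arbitrary; a caller would typically flip it so that it points toward the camera.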

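Neural networks are used to estimate the parameters of 3D rigid-body motion under two different constraints. The first is based on 3D point matching: the predicted motion parameters are applied to the pre-motion coordinates, and the result is compared with the post-motion coordinates. The second is based on the 2D motion field: the 2D motion field computed from the predicted motion parameters is compared with the 2D motion field computed from the image sequence. Both networks update their weights using the Newton-Raphson method to minimize the target error. Experiments validate the neural network approach.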
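The first constraint above (3D point matching) amounts to an error between transformed pre-motion points and observed post-motion points. The sketch below shows only that error term. The Z-Y-X Euler-angle parameterization, the mean squared distance, and all function names are assumptions made for illustration; the abstract does not specify them, and the Newton-Raphson weight update is not reproduced here.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Rotation R = Rz(rz) * Ry(ry) * Rx(rx), angles in radians (assumed convention)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_motion(params, point):
    """Apply motion parameters (rx, ry, rz, tx, ty, tz) to one 3D point."""
    rx, ry, rz, tx, ty, tz = params
    r = rotation_matrix(rx, ry, rz)
    t = (tx, ty, tz)
    return tuple(
        sum(r[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


def point_matching_error(params, before, after):
    """Mean squared distance between predicted and observed post-motion points."""
    total = 0.0
    for p, q in zip(before, after):
        predicted = apply_motion(params, p)
        total += sum((a - b) ** 2 for a, b in zip(predicted, q))
    return total / len(before)
```

A network trained under this constraint would adjust its predicted `params` to drive `point_matching_error` toward zero.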

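Ma Kangzhe, Pi Jiatian, Xiong Zhoubing, Lü Jia. Journal of Computer Applications (《计算机应用》) 2022, 42(12): 3715-3722
During vision-based grasping with a robotic arm, existing algorithms struggle to estimate the pose of the target object in a real-time, accurate, and robust manner under complex backgrounds, insufficient lighting, and occlusion. To address these problems, a 6D object pose network with fused attention features based on the keypoint approach is proposed. First, the Convolutional Block Attention Module (CBAM), which can focus on channel and spatial information, is introduced at the skip connection stage so that shallow features from the encoding stage are effectively fused with deep features from the decoding stage, enhancing the spatial-domain information and the precise-position channel information of the feature maps. Second, a normalized loss function is used to regress an attention map for each keypoint in a weakly supervised manner, and the attention map serves as the weight score for the keypoint offset at each corresponding pixel position. Finally, the keypoint coordinates are obtained by accumulating the weighted sum. Experimental results show that the proposed network reaches 91.3% and 46.3% on the ADD(-S) metric on the LINEMOD and Occlusion LINEMOD datasets, respectively. Compared with the keypoint-based Pixel-wise Voting Network (PVNet), the ADD(-S) metric improves by 5.0 and 5.5 percentage points, respectively, verifying that the proposed network is more robust in occluded scenes.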
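The last two steps of the abstract above (attention scores weighting per-pixel keypoint offsets, then an accumulated sum) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: it assumes each pixel predicts an offset vector pointing from itself to the keypoint, and that raw attention scores are normalized with a softmax. The function name `keypoint_from_offsets` is hypothetical.

```python
import math


def keypoint_from_offsets(pixels, offsets, scores):
    """Estimate one keypoint as an attention-weighted sum of per-pixel votes.

    pixels:  list of (x, y) pixel positions.
    offsets: list of (dx, dy) predicted offsets from each pixel to the keypoint.
    scores:  list of raw attention scores, one per pixel.
    """
    # Softmax normalization (assumed) so the weights are positive and sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Each pixel votes for the position pixel + offset; accumulate the votes.
    x = sum(w * (p[0] + o[0]) for w, p, o in zip(weights, pixels, offsets))
    y = sum(w * (p[1] + o[1]) for w, p, o in zip(weights, pixels, offsets))
    return x, y
```

Pixels with low attention, such as those on an occluder, contribute little to the sum, which is one plausible reading of why the weighting helps under occlusion.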

《Artificial Intelligence》1985,26(2):145-169
Given a single image of a scene containing a perspective view of three-dimensional objects, we would like a computer vision system to be able to perceive the objects in it. This paper describes an efficient way of labeling a line-drawing image and shows how to utilize the line-labeling and junction-labeling information to group together faces of the same object volume and predict the simplest arrangement and types of the hidden vertices caused by the overlapping of objects. This scheme is valid for both planar and curved-surface objects. In addition, it can handle some instances of multi-type vertices.

This paper describes a neural network (NN) based system for recognition and pose estimation of an unoccluded three-dimensional (3-D) object from any single two-dimensional (2-D) perspective view. The approach is invariant to translation, orientation, and scale. First, the binary silhouette of the object is obtained and normalized for translation and scale. Then, the object is represented by a set of rotation-invariant features derived from the complex orthogonal pseudo-Zernike moments of the image. The recognition scheme combines the decisions of a bank of multilayer perceptron NN classifiers operating in parallel on the same data. These classifiers have different topologies and internal parameters, but are trained on the same set of exemplar perspective views of the objects. Next, two pose parameters, elevation and aspect angles, are obtained by a novel two-stage NN system consisting of a quadrant classifier followed by NN angle estimators. Performance is tested on clean and noisy databases of military ground vehicles. Comparative studies with three other classifiers (a single NN, the weighted nearest-neighbor classifier, and a binary decision tree) are carried out.

Guo  Xinru  Xu  Song  Lin  Xiangbo  Sun  Yi  Ma  Xiaohong 《Pattern Analysis & Applications》2022,25(1):157-167
Pattern Analysis and Applications - Based on the disentanglement representation learning theory and the cross-modal variational autoencoder (VAE) model, we derive a “Single Input Multiple...

A method for finding analytical solutions to the problem of determining the attitude of a 3D object in space from a single perspective image is presented. Its principle is based on the interpretation of a triplet of any image lines as the perspective projection of a triplet of linear ridges of the object model, and on the search for the model attitude consistent with these projections. The geometrical transformations to be applied to the model to bring it into the corresponding location are obtained by solving an eighth-degree equation in the general case. Using simple logical rules, it is shown on examples related to polyhedra that this approach leads to results useful for both location and recognition of 3D objects because few admissible hypotheses are retained from the interpretation of the three line segments. Line matching by the prediction-verification procedure is thus less complex.

Multimedia Tools and Applications - Multiple human 3D pose estimation is a challenging task. It is mainly because of large variations in the scale and pose of humans, fast motions, multiple persons...

Reliable manipulation of everyday household objects is essential to the success of service robots. In order to accurately manipulate these objects, robots need to know the objects’ full 6-DOF pose, which is challenging due to sensor noise, clutter, and occlusions. In this paper, we present a new approach for effectively guessing the object pose given an observation of just a small patch of the object, by leveraging the fact that many household objects can only remain stable on a planar surface under a small set of poses. In particular, for each stable pose of an object, we slice the object with horizontal planes and extract multiple cross-section 2D contours. The pose estimation is then reduced to finding a stable pose whose contour best matches that of the sensor data, and this can be solved efficiently by cross-correlation. Experiments on the manipulation tasks in the DARPA Robotics Challenge validate our approach. In addition, we also investigate our method’s performance on object recognition tasks arising in the challenge.
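The matching step described above (choose the stable pose whose cross-section contour best matches the sensor data by cross-correlation) can be sketched in a simplified form. The abstract does not say how contours are encoded, so this sketch assumes each contour is summarized as a radial-distance signature sampled at equally spaced angles, and it assumes a full contour is observed rather than the small patch the authors handle. The names `circular_cross_correlation` and `best_stable_pose` are hypothetical.

```python
import math


def circular_cross_correlation(observed, template):
    """Return (best_shift, best_score) of the normalized circular correlation.

    A score of 1.0 at shift s means observed[i] matches template[(i - s) % n]
    up to a positive scale and a constant offset.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_t = sum(template) / n
    d_o = [v - mean_o for v in observed]
    d_t = [v - mean_t for v in template]
    norm = math.sqrt(sum(v * v for v in d_o) * sum(v * v for v in d_t))
    if norm == 0.0:
        return 0, 0.0  # a flat signature carries no orientation information
    best_shift, best_score = 0, -math.inf
    for shift in range(n):
        score = sum(d_o[i] * d_t[(i - shift) % n] for i in range(n)) / norm
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


def best_stable_pose(observed, pose_signatures):
    """Pick the stable pose (by name) whose signature correlates best.

    pose_signatures: dict mapping a pose name to its contour signature.
    Returns (pose_name, shift, score); shift encodes the in-plane rotation
    in units of the angular sampling step.
    """
    results = {
        name: circular_cross_correlation(observed, signature)
        for name, signature in pose_signatures.items()
    }
    name = max(results, key=lambda key: results[key][1])
    shift, score = results[name]
    return name, shift, score
```

The direct loop costs O(n^2) per template; an FFT-based correlation would be the usual choice for long signatures.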

Determining pose of 3D objects with curved surfaces   Cited by: 1 (self-citations: 0; citations by others: 1)
A method is presented for computing the pose of rigid 3D objects with arbitrary curved surfaces. Given an input image and a candidate object model and aspect, the method will verify whether or not the object is present and, if so, report pose parameters. The curvature method of Basri and Ullman is used to model points on the object rim, while stereo matching is used for internal edge points. The model allows an object edge-map to be predicted from pose parameters. Pose is computed via an iterative search for the best pose parameters. Heuristics are used so that matching can succeed in the presence of occlusion and artifacts and without resorting to the use of corresponding salient feature points. Bench tests and simulations show that the method almost always converges to ground truth pose parameters for a variety of objects and for a broad set of starting parameters in the same aspect.

In this paper, a real-time 3D pose estimation algorithm using range data is described. The system relies on a novel 3D sensor that generates a dense range image of the scene. By not relying on brightness information, the proposed system guarantees robustness under a variety of illumination conditions and scene contents. Efficient face detection using global features and exploitation of prior knowledge, along with novel feature localization and tracking techniques, are described. Experimental results demonstrate accurate estimation of the six degrees of freedom of the head and robustness under occlusions, facial expressions, and head shape variability.

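Rigid-object pose estimation, one of the key research directions in computer vision, aims to determine multiple degrees of freedom of a 3D object in a scene, such as its translation and rotation, and is increasingly applied in industrial robotic arm control, on-orbit space servicing, autonomous driving, and augmented reality. This paper gives an overall survey of the process, method taxonomy, and open problems of rigid-object pose estimation from a single image. It summarizes, classifies, and compares the various methods that estimate multi-degree-of-freedom pose from a single image of a rigid object, focusing on the general pose estimation process, the evolution and categorization of estimation methods, commonly used datasets and evaluation criteria, and the state of research and its outlook. At present, multi-degree-of-freedom rigid-object pose estimation methods mainly perform well in a single specific application scenario; no method yet generalizes to composite scenes, and the accuracy and efficiency of existing methods drop markedly under varied lighting conditions, in cluttered and occluded scenes, and for rotationally symmetric objects and objects with inter-class similarity. In light of these problems and the impetus from current deep learning techniques, trends in the field are forecast from six aspects: scene-level multi-object reasoning, self-supervised learning methods, front-end detection networks, lightweight and efficient network design, multi-information-fusion pose estimation frameworks, and representation spaces for image data.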
