Similar Literature
Found 20 similar documents (search time: 46 ms)
1.
We advance new active computer vision algorithms based on the Feature space Trajectory (FST) representations of objects and a neural network processor for computation of distances in global feature space. Our algorithms classify rigid objects and estimate their pose from intensity images. They also indicate how to automatically reposition the sensor if the class or pose of an object is ambiguous from a given viewpoint, and they incorporate data from multiple object views in the final object classification. An FST in a global eigenfeature space is used to represent 3D distorted views of an object. Assuming that an observed feature vector consists of Gaussian noise added to a point on the FST, we derive a probability density function for the observation conditioned on the class and pose of the object. Bayesian estimation and hypothesis testing theory are then used to derive approximations to the maximum a posteriori probability pose estimate and the minimum probability of error classifier. Confidence measures for the class and pose estimates, derived using Bayes theory, determine when additional observations are required, as well as where the sensor should be positioned to provide the most useful information.
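A minimal sketch of the classification step described above, assuming each class's FST is stored as a polyline of feature vectors and the noise is isotropic Gaussian; the names (`fsts`, `priors`, `sigma`) are illustrative, and plain NumPy stands in for the paper's neural network distance processor:

```python
# Sketch only: FST-as-polyline MAP classification under isotropic Gaussian
# noise. `fsts`, `priors`, and `sigma` are illustrative assumptions; the
# paper computes feature-space distances with a neural network processor.
import numpy as np

def point_to_segment(x, a, b):
    """Distance from x to segment ab and the interpolation parameter t."""
    ab = b - a
    t = np.clip(np.dot(x - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(x - (a + t * ab)), t

def classify_map(x, fsts, priors, sigma=1.0):
    """MAP class and pose index for observation x (a feature vector)."""
    best_cls, best_pose, best_logp = None, None, -np.inf
    for cls, traj in fsts.items():            # traj: (n_views, d) polyline
        for i in range(len(traj) - 1):
            d, t = point_to_segment(x, traj[i], traj[i + 1])
            logp = np.log(priors[cls]) - d**2 / (2 * sigma**2)
            if logp > best_logp:
                best_cls, best_pose, best_logp = cls, i + t, logp
    return best_cls, best_pose                # pose = position along the FST
```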

2.
Classifying objects in complex unknown environments is a challenging problem in robotics and is fundamental in many applications. Modern sensors and sophisticated perception algorithms extract rich 3D textured information, but are limited to the data that are collected from a given location or path. We are interested in closing the loop around perception and planning, in particular to plan paths for better perceptual data, and focus on the problem of planning scanning sequences to improve object classification from range data. We formulate a novel time-constrained active classification problem and propose solution algorithms that employ a variation of Monte Carlo tree search to plan non-myopically. Our algorithms use a particle filter combined with Gaussian process regression to estimate joint distributions of object class and pose. This estimator is used in planning to generate a probabilistic belief about the state of objects in a scene, and also to generate beliefs for predicted sensor observations from future viewpoints. These predictions consider occlusions arising from predicted object positions and shapes. We evaluate our algorithms in simulation, in comparison to passive and greedy strategies. We also describe similar experiments where the algorithms are implemented online, using a mobile ground robot in a farm environment. Results indicate that our non-myopic approach outperforms both passive and myopic strategies, and clearly show the benefit of active perception for outdoor object classification.
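A hedged sketch of the belief-update core only: a particle filter over joint (class, pose) hypotheses with importance reweighting and resampling. The observation likelihood `obs_lik` is a stand-in for the paper's Gaussian-process-based model, and the MCTS planning layer is not shown:

```python
# Sketch only: the belief-update core as a particle filter over joint
# (class, pose) hypotheses. `obs_lik(observation, particle)` stands in for
# the paper's Gaussian-process-based observation model; no planning shown.
import numpy as np

rng = np.random.default_rng(0)

def update_belief(particles, weights, observation, obs_lik):
    """One Bayes update: reweight, then resample if the ESS collapses."""
    weights = weights * np.array([obs_lik(observation, p) for p in particles])
    weights = weights / weights.sum()
    if 1.0 / np.sum(weights ** 2) < len(particles) / 2:   # effective sample size
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = [particles[i] for i in idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```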

3.
This paper introduces a uniform statistical framework for both 3-D and 2-D object recognition using intensity images as input data. The theoretical part provides a mathematical tool for stochastic modeling. The algorithmic part introduces methods for automatic model generation, localization, and recognition of objects. 2-D images are used for learning the statistical appearance of 3-D objects; both the depth information and the matching between image and model features are missing for model generation. The implied incomplete data estimation problem is solved by the Expectation Maximization algorithm. This leads to a novel class of algorithms for automatic model generation from projections. The estimation of pose parameters corresponds to a non-linear maximum likelihood estimation problem which is solved by a global optimization procedure. Classification is done by the Bayesian decision rule. This work includes the experimental evaluation of the various facets of the presented approach. An empirical evaluation of learning algorithms and the comparison of different pose estimation algorithms show the feasibility of the proposed probabilistic framework.
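As a rough illustration of the decision stage only, the sketch below applies the Bayes rule over classes, with pose handled by a per-class maximum-likelihood fit; `pose_likelihood` (assumed to return a log-likelihood) is a placeholder for the paper's statistical appearance model, and a local optimizer stands in for the global optimization procedure:

```python
# Sketch only: Bayes decision rule over classes with per-class ML pose
# estimation. `pose_likelihood(features, pose)` is assumed to return a
# log-likelihood; a local optimizer replaces the paper's global procedure.
import numpy as np
from scipy.optimize import minimize

def classify(features, models, priors, pose0):
    """models: {class: pose_likelihood}; returns the MAP class label."""
    scores = {}
    for cls, pose_likelihood in models.items():
        res = minimize(lambda p: -pose_likelihood(features, p), pose0)
        scores[cls] = np.log(priors[cls]) - res.fun   # log prior + max log-lik
    return max(scores, key=scores.get)
```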

4.
In this paper, we introduce a method to estimate an object's pose from multiple cameras. We focus on direct estimation of the 3D object pose from 2D image sequences. The Scale-Invariant Feature Transform (SIFT) is used to extract corresponding feature points from adjacent images in the video sequence. We first demonstrate that centralized pose estimation from the collection of corresponding feature points in the 2D images from all cameras can be obtained as the solution to a generalized Sylvester's equation. We subsequently derive a distributed solution to pose estimation from multiple cameras and show that it is equivalent to the solution of the centralized pose estimation based on Sylvester's equation. Specifically, we rely on collaboration among the multiple cameras to iteratively refine the independent pose estimate obtained at each camera from Sylvester's equation. The proposed approach relies on all of the information available from all cameras to obtain an estimate at each camera, even when the image features are not visible to some of the cameras. The resulting pose estimation technique is therefore robust to occlusion and sensor errors from specific camera views. Moreover, the proposed approach requires neither matching feature points among images from different camera views nor reconstruction of 3D points. Furthermore, the computational complexity of the proposed solution grows linearly with the number of cameras. Finally, computer simulation experiments demonstrate the accuracy and speed of our approach to pose estimation from multiple cameras.
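The centralized estimate is cast as a generalized Sylvester's equation; the sketch below only shows how such an equation AX + XB = Q is solved numerically with SciPy, using random placeholder matrices rather than the camera and feature matrices the paper derives from SIFT correspondences:

```python
# Sketch only: solving a Sylvester equation A X + X B = Q numerically.
# The matrices below are random placeholders, not the camera/feature
# matrices derived in the paper.
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(1)
A, B, Q = (rng.standard_normal((3, 3)) for _ in range(3))

X = solve_sylvester(A, B, Q)                 # solves A @ X + X @ B = Q
assert np.allclose(A @ X + X @ B, Q)
```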

5.
6.
Multi-view object class recognition can be achieved using existing approaches for single-view object class recognition, by treating different views as entirely independent classes. This strategy requires a large amount of training data for many viewpoints, which can be costly to obtain. We describe a method for constructing a weak three-dimensional model from as few as two views of an object of the target class, and using that model to transform images of objects from one view to several other views, effectively multiplying their value for class recognition. Our approach can be coupled with any 2D image-based recognition system. We show that automatically transformed images dramatically decrease the data requirements for multi-view object class recognition.

7.
In this paper, we present a new framework for three-dimensional (3D) reconstruction of multiple rigid objects from dynamic scenes. Conventional 3D reconstruction from multiple views is applicable to static scenes, in which the configuration of objects is fixed while the images are taken. In our framework, we aim to reconstruct the 3D models of multiple objects in a more general setting where the configuration of the objects varies among views. We solve this problem by object-centered decomposition of the dynamic scenes using an unsupervised co-recognition approach. Unlike conventional motion segmentation algorithms, which require a small-motion assumption between consecutive views, the co-recognition method provides reliable, accurate correspondences of the same object across unordered, wide-baseline views. To segment each object region, we exploit the sparse 3D points obtained from structure-from-motion; these points are reliable and serve as automatic seed points for a seeded-segmentation algorithm. Experiments on various challenging real image sequences demonstrate the effectiveness of our approach, especially in the presence of abrupt independent motions of objects.

8.
This paper addresses the automatic construction of complex spline object models from a few photographs. Our approach combines silhouettes from registered images to construct a G1-continuous triangular spline approximation of an object with unknown topology. We apply a similar optimization procedure to estimate the pose of a modeled object from a single image. Experimental examples of model construction and pose estimation are presented for several complex objects.

9.
Grasping is a fundamental skill for robots that perform manipulation tasks. Grasping unknown objects remains a major challenge, and precision grasping of unknown objects is even harder. Owing to imperfect sensor measurements and the lack of prior knowledge of objects, robots have to handle uncertainty effectively. In previous work (Chen and Wichert 2015), we used a probabilistic framework to tackle precision grasping of model-based objects. In this paper, we extend the probabilistic framework to tackle precision grasping of unknown objects. We first propose an object model called the probabilistic signed distance function (p-SDF) to represent unknown object surfaces. p-SDF models measurement uncertainty explicitly and allows measurements from multiple sensors to be fused in real time. Based on this surface representation, we propose a model to evaluate the likelihood of grasp success for antipodal grasps. This model uses four heuristics to model the force-closure condition and perceptual uncertainty. A two-step simulated annealing approach is further proposed to search for and optimize a precision grasp. We use the object representation as a bridge to unify grasp synthesis and grasp execution. Grasp execution is performed in a closed loop, so that robots can actively reduce uncertainty and react to external perturbations during grasping. We perform extensive grasping experiments using challenging real-world objects and demonstrate that our method achieves high robustness and accuracy in grasping unknown objects.
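A generic simulated-annealing skeleton, as a sketch of the grasp search step; `grasp_quality` and `neighbor` are placeholders for the paper's four-heuristic likelihood model and grasp perturbation, and the cooling schedule is an illustrative assumption:

```python
# Sketch only: a generic simulated-annealing loop for grasp search.
# `grasp_quality` stands in for the paper's four-heuristic likelihood model
# and `neighbor` for grasp perturbation; the schedule is illustrative.
import math
import random

def anneal(init, grasp_quality, neighbor, t0=1.0, cooling=0.95, steps=500):
    g, q = init, grasp_quality(init)
    best, best_q, t = g, q, t0
    for _ in range(steps):
        cand = neighbor(g)
        dq = grasp_quality(cand) - q
        # Accept improvements always, worse moves with Boltzmann probability.
        if dq > 0 or random.random() < math.exp(dq / t):
            g, q = cand, q + dq
            if q > best_q:
                best, best_q = g, q
        t *= cooling
    return best, best_q
```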

10.
The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, simpler techniques are applicable under restricted conditions. Our approach exploits image transformations that are specific to the relevant object class and learnable from example views of other “prototypical” objects of the same class. In this paper, we introduce such a technique by extending the notion of linear classes proposed by the authors (1992). For linear object classes, it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively “rotate” high-resolution face images from a single 2D view.
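A minimal sketch of the linear-class idea under simple assumptions: given prototype pairs rendered under two poses, a single linear map is learned by least squares and applied to a novel object's view; all names are illustrative, and images are assumed vectorized:

```python
# Sketch only: learn one linear map from prototype pairs (view under pose A,
# view under pose B) by least squares, then apply it to a novel view.
import numpy as np

def learn_view_transform(views_a, views_b):
    """views_a, views_b: (n_prototypes, d) matrices of vectorized views."""
    # Solve views_a @ L.T ≈ views_b in the least-squares sense.
    L_T, *_ = np.linalg.lstsq(views_a, views_b, rcond=None)
    return L_T.T

# Usage: novel_b = learn_view_transform(A, B) @ novel_a
```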

11.
Since sensory feedback is an important part of robot control and of the acquisition, manipulation, and recognition of objects, incorporating a sense of touch into a robotic system can greatly enhance that system's performance. This article describes the evaluation of a recently developed low-resolution tactile array sensor pad system for use in robotic applications. Computer algorithms are developed that acquire data from the sensor pad and display the data on a CRT screen. Vision algorithms are implemented to extract from the tactile data the information that aids in the acquisition, manipulation, and recognition of objects. An object's pose is estimated by calculating its center of gravity (position) and principal axis (orientation). Recognizing an object and distinguishing between different objects is accomplished by algorithms that estimate an object's perimeter (shape) and area (size). This work demonstrates that a low-resolution tactile array sensor is capable of providing the information required for many robotic applications in which objects must be acquired, manipulated, and recognized. Such a system provides a low-cost alternative to more conventional vision-based systems.
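The pose statistics described above reduce to standard image moments on a binary contact image; a small NumPy sketch, assuming the taxel grid has already been thresholded to a 0/1 array:

```python
# Sketch only: centroid (position), principal axis (orientation), and area
# (size) from a binary contact image, via standard image moments.
import numpy as np

def tactile_pose(img):
    """img: 2D boolean/0-1 array of active taxels."""
    ys, xs = np.nonzero(img)
    cx, cy = xs.mean(), ys.mean()              # center of gravity
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)   # principal axis
    return (cx, cy), theta, len(xs)            # position, orientation, area
```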

12.
This paper describes a genetic algorithm that tackles the pose-estimation problem in computer vision. Our genetic algorithm can accurately recover the rotation and translation of an object when its three-dimensional structure is given. In our implementation, each chromosome encodes both the pose and the indexes of the selected point features of the object. Instead of searching only for the pose, as in existing work, our algorithm simultaneously searches for a set containing the most reliable feature points. This mismatch-filtering strategy makes the algorithm more robust in the presence of point mismatches and outliers in the images. Our algorithm has been tested with both synthetic and real data, with good results. The accuracy of the recovered pose is compared with that of existing algorithms: our approach outperformed Lowe's method and two other genetic algorithms in the presence of point mismatches and outliers. In addition, it has been used to estimate the pose of a real object, showing that the proposed method is applicable to augmented reality applications.
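A compact genetic-algorithm skeleton matching the encoding described above, where each chromosome carries a pose vector plus the indices of the feature points it trusts; `fitness` (e.g., negative reprojection error over the selected features) and all rates are illustrative assumptions:

```python
# Sketch only: a GA whose chromosome is (pose vector, trusted feature
# indices). `fitness(pose, feat_idx)` and the mutation rates are assumptions.
import numpy as np

rng = np.random.default_rng(2)

def evolve(fitness, n_pose=6, n_feat=20, k_feat=10, pop=50, gens=100):
    poses = rng.standard_normal((pop, n_pose))
    feats = np.stack([rng.choice(n_feat, k_feat, replace=False) for _ in range(pop)])
    for _ in range(gens):
        scores = np.array([fitness(p, s) for p, s in zip(poses, feats)])
        elite = np.argsort(-scores)[: pop // 2]           # keep the best half
        child_p = poses[elite] + 0.1 * rng.standard_normal((pop // 2, n_pose))
        child_f = feats[elite].copy()
        for i in range(pop // 2):                         # mismatch filtering:
            if rng.random() < 0.3:                        # re-draw feature set
                child_f[i] = rng.choice(n_feat, k_feat, replace=False)
        poses = np.concatenate([poses[elite], child_p])
        feats = np.concatenate([feats[elite], child_f])
    scores = [fitness(p, s) for p, s in zip(poses, feats)]
    best = int(np.argmax(scores))
    return poses[best], feats[best]
```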

13.
This article describes a probabilistic approach for improving the accuracy of general object pose estimation algorithms. We propose a histogram filter variant that uses the exploration capabilities of robots, and supports active perception through a next-best-view proposal algorithm. For the histogram-based fusion method we focus on the orientation of the 6 degrees of freedom (DoF) pose, since the position can be processed with common filtering techniques. The detected orientations of the object, estimated with a pose estimator, are used to update the hypothesis of its actual orientation. We discuss the design of experiments to estimate the error model of a detection method, and describe a suitable representation of the orientation histograms. This allows us to consider priors about likely object poses or symmetries, and use information gain measures for view selection. The method is validated and compared to alternatives, based on the outputs of different 6 DoF pose estimators, using real-world depth images acquired using different sensors, and on a large synthetic dataset.
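A hedged sketch of the orientation histogram filter: detections reweight discretized orientation bins through a confusion-style error model, and belief entropy serves as an information-gain score for ranking candidate views; the bin layout and error model here are assumptions:

```python
# Sketch only: a discrete Bayes (histogram) filter over orientation bins.
# `error_model[i, j] = p(detect bin j | true bin i)` plays the role of the
# detector error model estimated in the paper.
import numpy as np

def histogram_update(belief, detected_bin, error_model):
    """belief: (n_bins,) prior over true orientation; returns the posterior."""
    posterior = belief * error_model[:, detected_bin]
    return posterior / posterior.sum()

def entropy(belief):
    """Lower posterior entropy after a view = higher information gain."""
    p = belief[belief > 0]
    return -np.sum(p * np.log(p))
```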

14.
15.
Visual learning and recognition of 3-D objects from appearance (cited 33 times: 9 self-citations, 24 by others)
The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image to eigenspace. The object is recognized based on the manifold it lies on. The exact position of the projection on the manifold determines the object's pose in the image. A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. Sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with fewer than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper concludes with a discussion of various issues related to the proposed learning and recognition methodology.
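A minimal sketch of the eigenspace pipeline under simple assumptions (vectorized images, plain PCA via SVD, each object's manifold approximated by its projected training samples); names are illustrative:

```python
# Sketch only: PCA-based eigenspace recognition. Each object's manifold is
# approximated by its projected training views; recognition and pose lookup
# are nearest-neighbour in the subspace.
import numpy as np

def build_eigenspace(images, k=20):
    """images: (n, d) rows of vectorized views varied in pose/illumination."""
    mean = images.mean(axis=0)
    _, _, Vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, Vt[:k]                        # top-k eigenvectors

def recognize(x, mean, basis, manifolds):
    """manifolds: {object: (m, k) projected training views}."""
    z = basis @ (x - mean)
    obj = min(manifolds,
              key=lambda o: np.linalg.norm(manifolds[o] - z, axis=1).min())
    pose_idx = int(np.linalg.norm(manifolds[obj] - z, axis=1).argmin())
    return obj, pose_idx                       # nearest sample indexes the pose
```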

16.
A nearest neighbor (NN) query, which returns the object most similar to a user-specified query object, plays an important role in a wide range of applications and has hence received considerable attention. In many such applications, e.g., sensor data collection and location-based services, objects are inherently uncertain. Furthermore, due to the ever-increasing generation of massive datasets, the importance of distributed databases that deal with such data objects has been growing. One emerging challenge is to efficiently process probabilistic NN queries over distributed uncertain databases. The straightforward approach, in which each local site forwards its own database to the central server, is communication-expensive, so communication cost must be minimized for NN object retrieval. In this paper, we focus on two important query types, namely top-k probable NN queries and probabilistic star queries, and propose efficient algorithms to process them over distributed uncertain databases. Extensive experiments on both real and synthetic data have demonstrated that our algorithms significantly reduce communication cost.
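As a rough illustration only, a brute-force Monte Carlo version of a top-k probable NN query, with each uncertain object given by samples of its possible locations; the paper's actual contribution, minimizing communication across distributed sites, is not modeled here:

```python
# Sketch only: Monte Carlo estimate of top-k probable NN over uncertain
# objects represented as sample sets. Distributed, communication-efficient
# processing (the paper's focus) is not modeled.
import numpy as np

def topk_probable_nn(q, objects, k=2, trials=2000, seed=0):
    """objects: list of (m_i, d) arrays, one sample set per uncertain object."""
    rng = np.random.default_rng(seed)
    wins = np.zeros(len(objects))
    for _ in range(trials):
        world = [o[rng.integers(len(o))] for o in objects]   # one instantiation
        wins[int(np.argmin([np.linalg.norm(x - q) for x in world]))] += 1
    probs = wins / trials
    return np.argsort(-probs)[:k], probs
```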

17.
This article incorporates fuzzy set theory into the task of image segmentation. The basic concept is to allow the fuzzy membership function to model the uncertainty and vagueness in the definition of objects in digital images. We define a fuzzy segmentation as a fuzzy c-partition of an image and incorporate this definition and fuzzy criteria into several image segmentation techniques, including segmentation by clustering, region growing, and relaxation labelling. The algorithms are tested on digital forward-looking infrared (FLIR) images and digital subtraction angiographic images. These techniques are shown to perform at least as well as their crisp or probabilistic counterparts when converted to a crisp partition. However, the real advantage of a fuzzy methodology is that the degree of membership provides a model of uncertainty that can subsequently be used by feature extraction and object recognition algorithms to increase the amount of information available in decision processes.
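The fuzzy c-partition idea is most familiar as fuzzy c-means clustering; a compact sketch with the standard membership and centroid updates (the fuzzifier m, cluster count c, and initialization are illustrative choices):

```python
# Sketch only: fuzzy c-means with the standard membership/centroid updates.
import numpy as np

def fcm(X, c=3, m=2.0, iters=100, seed=0):
    """X: (n, d) feature vectors; returns memberships U (n, c) and centers."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))       # rows sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)     # standard FCM update
    return U, centers
```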

18.
3D object recognition is a difficult yet important problem in computer vision. A 3D object recognition system has two major components: an object modeller, and a system that matches stored representations to those derived from the sensed image. The performance of systems in which object models are constructed by training from one or more images of the objects has not been very satisfactory. Although objects used in a robotic workcell or in assembly processes have been designed using a CAD system, the vision systems used for recognizing these objects are independent of the CAD database. This paper proposes a scheme for interfacing the CAD database of objects with the computer vision processes used for recognizing them. CAD models of objects are processed to generate vision-oriented features that appear in the different views of the object, and the same features are extracted from images of the object to identify the object and its pose.

19.
Automatic 3D object model construction is important in applications ranging from manufacturing to entertainment, since CAD models of existing objects may be either unavailable or unusable. We describe a prototype system for automatically registering and integrating multiple views of objects from range data. The results can then be used to construct geometric models of the objects. New techniques for handling key problems such as robust estimation of transformations relating multiple views and seamless integration of registered data to form an unbroken surface have been proposed and implemented in the system. Experimental results on real surface data acquired using a digital interferometric sensor as well as a laser range scanner demonstrate the good performance of our system.
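The core transformation-estimation step in multi-view registration can be sketched as the standard least-squares rigid alignment (Kabsch/Procrustes) below; the paper's robust variant adds outlier handling around this step:

```python
# Sketch only: least-squares rigid alignment (Kabsch/Procrustes) of paired
# point sets, the core of transformation estimation between views.
import numpy as np

def rigid_align(P, Q):
    """P, Q: (n, 3) paired points; returns R, t with R @ p + t ≈ q."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp
```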

20.
Humans excel in manipulation tasks, a basic skill for our survival and a key feature in our man-made world of artefacts and devices. In this work, we study how humans manipulate simple daily objects, and construct a probabilistic representation model of the tasks and objects that is useful for autonomous grasping and manipulation by robotic hands. Human demonstrations of predefined object manipulation tasks are recorded from both the human-hand and object points of view. The multimodal data acquisition system records human gaze, hand and finger 6D poses, finger flexure, tactile forces distributed on the inside of the hand, colour images and stereo depth maps, as well as object 6D pose and object tactile forces using instrumented objects. From the acquired data, relevant features are detected concerning motion patterns, tactile forces, and hand-object states. This enables modelling a class of tasks from sets of repeated demonstrations of the same task, so that a generalised probabilistic representation is derived for task planning in artificial systems. An object-centred probabilistic volumetric model is proposed to fuse the multimodal data and map contact regions, gaze, and tactile forces during stable grasps. This model is refined by segmenting the volume into components approximated by superquadrics, and overlaying the contact points used, taking the task context into account. Results show that the extracted features are sufficient to distinguish key patterns that characterise each stage of the manipulation tasks, ranging from simple object displacement, where the same grasp is employed during manipulation (homogeneous manipulation), to more complex interactions such as object reorientation, fine positioning, and sequential in-hand rotation (dexterous manipulation). The framework presented retains the relevant data from human demonstrations, concerning both the manipulation and the object characteristics, for use by future grasp planning in artificial systems performing autonomous grasping.
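The superquadric approximation mentioned above rests on the standard inside-outside function; a minimal sketch for a component in canonical (axis-aligned) pose, with sizes `a` and shape exponents `e1`, `e2` as the usual superquadric parameters:

```python
# Sketch only: the superquadric inside-outside function in canonical pose.
import numpy as np

def superquadric_F(p, a, e1, e2):
    """F < 1 inside, F = 1 on the surface, F > 1 outside."""
    x, y, z = np.abs(np.asarray(p, dtype=float)) / np.asarray(a, dtype=float)
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)
```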
