Similar literature
Found 20 similar articles; search time: 140 ms
1.
We present a novel approach to the robust classification of arbitrary object classes in complex, natural scenes. Starting from a re-appraisal of Marr's ‘primal sketch’, we develop an algorithm that (1) employs local orientations as the fundamental picture primitives, rather than the more usual edge locations, (2) retains and exploits the local spatial arrangement of features of different complexity in an image and (3) is hierarchically arranged so that the level of feature abstraction increases at each processing stage. The resulting simple technique is based on the accumulation of evidence in binary channels, followed by a weighted, non-linear sum of the evidence accumulators. The steps involved in designing a template for recognizing a simple object are explained. The practical application of the algorithm is illustrated with examples taken from a broad range of object classification problems. We discuss the performance of the algorithm and describe a hardware implementation. We present first successful attempts to train the algorithm automatically. Finally, we compare our algorithm with other object classification algorithms described in the literature.

2.
Robotic grasping is very sensitive to the accuracy of the pose estimate of the object to be grasped. Even a small error in the estimated pose may cause the planned grasp to fail. Several methods for robust grasp planning exploit the object geometry or tactile sensor feedback. However, object pose range estimation introduces specific uncertainties that can also be exploited to choose more robust grasps. We present a grasp planning method that explicitly considers the uncertainties in the visually estimated object pose. We assume a known shape (e.g. a primitive shape or a triangle mesh), observed as a (possibly sparse) point cloud. The measured points are usually not uniformly distributed over the surface, because the object is seen from a particular viewpoint; this non-uniformity can also result from heterogeneous textures over the object surface when using stereo-vision algorithms based on robust feature-point matching. Consequently, the pose estimate may be more accurate in some directions and contain unavoidable ambiguities. The proposed grasp planner is based on a particle filter that estimates the object probability distribution as a discrete set. We show that, for grasping, some ambiguities are less unfavorable than others, so the distribution can be used to select robust grasps. Experiments are presented with the humanoid robot iCub and its stereo cameras.

3.
In this paper, we present a method called MODEEP (Motion-based Object DEtection and Estimation of Pose) to detect independently moving objects (IMOs) in forward-looking infrared (FLIR) image sequences taken from an airborne, moving platform. Ego-motion effects are removed through a robust multi-scale affine image registration process. Thereafter, areas with residual motion indicate potential object activity. These areas are detected, refined and selected using a Bayesian classifier. The resulting regions are clustered into pairs such that each pair represents one object's front and rear end. Using motion and scene knowledge, we estimate object pose and establish a region of interest (ROI) for each pair. Edge elements within each ROI are used to segment the convex cover containing the IMO. We show detailed results on real, complex, cluttered and noisy sequences. Moreover, we outline the integration of our fast and robust system into a comprehensive automatic target recognition (ATR) and action classification system.

4.
Conventional tracking methods encounter difficulties as the number of objects, clutter, and sensors increase, because of the requirement for data association. Statistical tracking, based on the concept of network tomography, is an alternative that avoids data association. It estimates the number of trips made from one region to another in a scene based on interregion boundary traffic counts accumulated over time. It is not necessary to track an object through a scene to determine when an object crosses a boundary. This paper describes statistical tracking and presents an evaluation based on the estimation of pedestrian and vehicular traffic intensities at an intersection over a period of 1 month. We compare the results with those from a multiple-hypothesis tracker and manually counted ground-truth estimates. Received: 30 August 2001 / Accepted: 28 May 2002. Correspondence to: J.E. Boyd

5.
A bin picking system based on depth from defocus (total citations: 3; self-citations: 0; citations by others: 3)
It is generally accepted that to develop versatile bin-picking systems capable of grasping and manipulation operations, accurate 3-D information is required. To accomplish this goal, we have developed a fast and precise range sensor based on active depth from defocus (DFD). This sensor is used in conjunction with a three-component vision system, which is able to recognize and evaluate the attitude of 3-D objects. The first component performs scene segmentation using an edge-based approach. Since edges are used to detect the object boundaries, a key issue consists of improving the quality of edge detection. The second component attempts to recognize the object placed on the top of the object pile using a model-driven approach in which the segmented surfaces are compared with those stored in the model database. Finally, the attitude of the recognized object is evaluated using an eigenimage approach augmented with range data analysis. The full bin-picking system will be outlined, and a number of experimental results will be examined. Received: 2 December 2000 / Accepted: 9 September 2001. Correspondence to: O. Ghita

6.
This article describes a probabilistic approach for improving the accuracy of general object pose estimation algorithms. We propose a histogram filter variant that uses the exploration capabilities of robots, and supports active perception through a next-best-view proposal algorithm. For the histogram-based fusion method we focus on the orientation of the 6 degrees of freedom (DoF) pose, since the position can be processed with common filtering techniques. The detected orientations of the object, estimated with a pose estimator, are used to update the hypothesis of its actual orientation. We discuss the design of experiments to estimate the error model of a detection method, and describe a suitable representation of the orientation histograms. This allows us to consider priors about likely object poses or symmetries, and use information gain measures for view selection. The method is validated and compared to alternatives, based on the outputs of different 6 DoF pose estimators, using real-world depth images acquired using different sensors, and on a large synthetic dataset.
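The histogram-based update described in this abstract can be illustrated with a much-simplified sketch. It assumes a single rotation angle discretised into bins rather than the full 3-D orientation, and it uses a hypothetical error model (a circular Gaussian in bin units plus a uniform outlier floor); the abstract does not specify either choice, and all function and parameter names below are illustrative only.

```python
import math

def uniform_histogram(n_bins):
    """Prior with no preference for any orientation bin."""
    return [1.0 / n_bins] * n_bins

def circular_bin_distance(a, b, n_bins):
    """Shortest distance between two bins on a circle of n_bins bins."""
    d = abs(a - b) % n_bins
    return min(d, n_bins - d)

def likelihood(measured_bin, true_bin, n_bins, sigma_bins, outlier_rate):
    """Hypothetical error model: Gaussian around the true bin plus a
    uniform floor that accounts for gross detection errors."""
    d = circular_bin_distance(measured_bin, true_bin, n_bins)
    hit = math.exp(-0.5 * (d / sigma_bins) ** 2)
    return (1.0 - outlier_rate) * hit + outlier_rate / n_bins

def update(hist, measured_bin, sigma_bins=1.0, outlier_rate=0.1):
    """Discrete Bayes update: posterior is prior times likelihood, renormalised."""
    n = len(hist)
    post = [p * likelihood(measured_bin, b, n, sigma_bins, outlier_rate)
            for b, p in enumerate(hist)]
    total = sum(post)
    return [p / total for p in post]

def entropy(hist):
    """Shannon entropy in nats; a drop in entropy is one possible
    information gain measure for comparing candidate views."""
    return -sum(p * math.log(p) for p in hist if p > 0.0)
```

As a usage example, starting from `uniform_histogram(36)` and applying `update` for the detections 9, 9, 27, 9 in turn leaves most of the probability mass on bin 9: the single detection at bin 27 is absorbed by the outlier floor instead of overturning the hypothesis.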

7.
The notion of viewpoints as a means of eliciting and formulating requirements is now well known. However, there is little practical evidence that viewpoint-based requirements methods scale up to address real problems. This paper presents a detailed case study based on a medium-sized system, and illustrates how a viewpoint-based requirements method can be used to structure and specify system requirements. The case study is intended to serve two purposes: first, to demonstrate the scalability of viewpoint-based requirements methods; and second, to act as a shared example for other researchers in the field to test their techniques and methods. The case study is based on an electronic document delivery and interchange system (EDDIS). The requirements are presented as they appeared in the original user requirements document. The paper concludes by outlining the lessons learnt in applying VORD to EDDIS, and proposes a set of 10 comparators that other researchers can use to compare their approaches and techniques.

8.
9.
Edges are useful features for structural image analysis, but the output of standard edge detectors must be thresholded to remove the many spurious edges. This paper describes experiments with both new and old techniques for (1) determining edge saliency (as alternatives to gradient magnitude) and (2) automatically determining appropriate edge threshold values. Some examples of edge saliency measures are lifetime, wiggliness, spatial width, and phase congruency. Examples of thresholding techniques use the Rayleigh distribution to model the edge gradient magnitude histogram, relaxation labelling, and an edge curve “length” versus “average gradient magnitude” feature space.
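One of the thresholding ideas named in this abstract, modelling gradient magnitudes with a Rayleigh distribution, can be sketched as follows. This is a minimal illustration and not the paper's procedure: it assumes the scale parameter is fitted by maximum likelihood over all magnitudes (a real image would more likely fit only the noise-dominated part of the histogram), and it assumes the threshold is chosen so that a Rayleigh-distributed noise response exceeds it with a given probability. The function names and the `false_alarm_rate` parameter are hypothetical.

```python
import math

def rayleigh_sigma_mle(magnitudes):
    """Maximum-likelihood Rayleigh scale: sigma^2 = sum(m^2) / (2 n)."""
    n = len(magnitudes)
    if n == 0:
        raise ValueError("need at least one gradient magnitude")
    return math.sqrt(sum(m * m for m in magnitudes) / (2.0 * n))

def rayleigh_threshold(sigma, false_alarm_rate=0.01):
    """Magnitude t with P(M > t) = false_alarm_rate under the fitted model.

    The Rayleigh survival function is exp(-t^2 / (2 sigma^2)), so solving
    for t gives t = sigma * sqrt(-2 ln(false_alarm_rate)).
    """
    if not 0.0 < false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must lie strictly between 0 and 1")
    return sigma * math.sqrt(-2.0 * math.log(false_alarm_rate))
```

Edges whose gradient magnitude falls below `rayleigh_threshold(sigma)` would then be discarded as likely noise responses.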

10.
This work investigates map-to-image registration for planar scenes in the context of robust parameter estimation. Registration is posed as the problem of estimating a projective transformation which optimally aligns transformed model line segments from a map with data line segments extracted from an image. Matching and parameter estimation are solved simultaneously by optimizing an objective function which is based on M-estimators, and depends on overlap and the weighted orthogonal distance between transformed model segments and data segments. An extensive series of registration experiments was conducted to test the performance of the proposed parameter estimation algorithm. More than 200 000 registration experiments were run with different objective functions for 12 aerial images and randomly corrupted maps distorted by randomly selected projective transformations. Received: 10 August 2000 / Accepted: 29 January 2001

11.
A system to navigate a robot into a ship structure (total citations: 1; self-citations: 0; citations by others: 1)
A prototype system has been built to navigate a walking robot into a ship structure. The 8-legged robot is equipped with an active stereo head. From the CAD model of the ship, good viewpoints are selected such that the head can look at locations with sufficient edge features, which are extracted automatically for each view. The pose of the robot is estimated from the features detected by two vision approaches. One approach searches in stereo images for junctions and measures the 3-D position. The other method uses a monocular image and tracks 2-D edge features. Robust tracking is achieved with a method of edge projected integration of cues (EPIC). Two inclinometres are used to stabilise the head while the robot moves. The results of the final demonstration to navigate the robot within centimetre accuracy are given.

12.
We present an easy interaction technique for accessing location-based contextual data shown on a head-worn wearable computer display. Our technique, called Context Compass, is based on a regular compass metaphor. Each object belonging to the user’s current context is visualised on a linear compass shown on the screen. The object directly in front of the user is shown in the middle of the compass and can be activated. Whenever the user turns his or her head, the objects on the screen move accordingly. Therefore, an object can be selected by simply turning one’s head towards it. Context Compass consumes a minimal amount of screen space, making it ideal for usage with see-through head-worn displays. An initial pilot study, applying a newly developed usability method customised especially for Context Compass, revealed that Context Compass can be learned virtually immediately. Further, the method itself proved to be successful in evaluating techniques such as Context Compass.
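The linear-compass mapping described in this abstract can be sketched in a few lines. The sketch assumes that each object has a known compass bearing in degrees and that the head heading is available from a sensor; the visible span of the compass (`span_deg`), the screen width (`width_px`) and the selection tolerance (`tolerance_deg`) are hypothetical values, since the abstract gives none.

```python
def relative_bearing(object_bearing_deg, head_heading_deg):
    """Signed angle from the head direction to the object, in [-180, 180)."""
    return (object_bearing_deg - head_heading_deg + 180.0) % 360.0 - 180.0

def compass_x(object_bearing_deg, head_heading_deg, width_px=640, span_deg=120.0):
    """Horizontal pixel position on the linear compass, or None if the
    object lies outside the visible span. Straight ahead maps to the middle."""
    rel = relative_bearing(object_bearing_deg, head_heading_deg)
    if abs(rel) > span_deg / 2.0:
        return None
    return width_px / 2.0 + rel / span_deg * width_px

def selected_object(objects, head_heading_deg, tolerance_deg=5.0):
    """Name of the object closest to straight ahead, if any is within
    the tolerance. `objects` maps a name to a bearing in degrees."""
    best_name, best_offset = None, None
    for name, bearing in objects.items():
        offset = abs(relative_bearing(bearing, head_heading_deg))
        if offset <= tolerance_deg and (best_offset is None or offset < best_offset):
            best_name, best_offset = name, offset
    return best_name
```

Recomputing `compass_x` whenever the heading changes makes the objects slide across the compass as the head turns, and `selected_object` returns the one that can currently be activated.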

13.
In this paper, we systematically assess the performance of an automatic calibration chart detector. Through simulation we establish the optimal set of control parameters and the rate of successful detection as a function of pose. We validate the simulation results on real images taken from a camera mounted on a robot arm. The results confirm the utility of such simulation studies. The feedback obtained suggested a number of modifications for the chart detection system, which led to a significant improvement in performance. In particular, the chart design was changed to accommodate wider range and better stability in detection. Received: 16 December 1999 / Accepted: 15 October 2000

14.
We propose a new approach for automatic road extraction from aerial imagery with a model and a strategy mainly based on the multi-scale detection of roads in combination with geometry-constrained edge extraction using snakes. A main advantage of our approach is that it allows, for the first time, a bridging of shadows and partially occluded areas using the heavily disturbed evidence in the image. Additionally, it has only a few parameters to be adjusted. The road network is constructed after extracting crossings with varying shape and topology. We show the feasibility of the approach not only by presenting reasonable results but also by evaluating them quantitatively based on ground truth. Received: 22 July 1999 / Accepted: 20 March 2000

15.
Head tracking using stereo (total citations: 2; self-citations: 0; citations by others: 2)
Head tracking is an important primitive for smart environments and perceptual user interfaces where the poses and movements of body parts need to be determined. Most previous solutions to this problem are based on intensity images and, as a result, suffer from a host of problems including sensitivity to background clutter and lighting variations. Our approach avoids these pitfalls by using stereo depth data together with a simple human-torso model to create a head-tracking system that is both fast and robust. We use stereo data to derive a depth model of the background that is then employed to provide accurate foreground segmentation. We then use directed local edge detectors on the foreground to find occluding edges that are used as features to fit to a torso model. Once we have the model parameters, the location and orientation of the head can be easily estimated. A useful side effect from using stereo data is the ability to track head movement through a room in three dimensions. Experimental results on real image sequences are given. Accepted: 13 August 2001. (Note on the stereo data: Commercial equipment and materials are identified in order to adequately specify certain procedures. In no case does such identification imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.)

16.
The existing skew estimation techniques usually assume that the input image is of high resolution and that the detectable angle range is limited. We present a more generic solution for this task that overcomes these restrictions. Our method is based on determination of the first eigenvector of the data covariance matrix. The solution comprises image resolution reduction, connected component analysis, component classification using a fuzzy approach, and skew estimation. Experiments on a large set of various document images and performance comparison with two Hough transform-based methods show a good accuracy and robustness for our method. Received October 10, 1998 / Revised version September 9, 1999
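The core step named in this abstract, taking the first eigenvector of the data covariance matrix, has a closed form in two dimensions, which the sketch below uses. It covers only that step: the input `points` is assumed to be a list of (x, y) coordinates that some earlier stage has already selected (for example centroids of text components), and the resolution reduction, connected component analysis and fuzzy classification mentioned in the abstract are not shown. The function name is hypothetical.

```python
import math

def skew_angle_deg(points):
    """Angle of the principal axis of a 2-D point set, in degrees.

    For the covariance matrix [[cxx, cxy], [cxy, cyy]], the direction of
    the eigenvector with the largest eigenvalue is
    theta = 0.5 * atan2(2 * cxy, cxx - cyy), giving a result in (-90, 90].
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    cyy = sum((y - mean_y) ** 2 for _, y in points) / n
    cxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return math.degrees(0.5 * math.atan2(2.0 * cxy, cxx - cyy))
```

Note that image coordinates usually have the y axis pointing downwards, so the sign of the returned angle depends on the coordinate convention in use.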

17.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of (i) video partitioning, (ii) key frame computing, and (iii) key frame pruning. For the first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-)invariant properties, thus making the algorithm robust to many types of object/camera motions and scaling variances. The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.

18.
Robust camera pose and scene structure analysis for service robotics (total citations: 1; self-citations: 0; citations by others: 1)
Successful path planning and object manipulation in service robotics applications rely both on a good estimation of the robot’s position and orientation (pose) in the environment and on a reliable understanding of the visualized scene. In this paper, a robust real-time camera pose and scene structure estimation system is proposed. First, the pose of the camera is estimated through the analysis of the so-called tracks. The tracks include key features from the imaged scene and geometric constraints which are used to solve the pose estimation problem. Second, based on the calculated pose of the camera, i.e. robot, the scene is analyzed via a robust depth segmentation and object classification approach. In order to reliably segment the object’s depth, a feedback control technique at an image processing level has been used with the purpose of improving the robustness of the robotic vision system with respect to external influences, such as cluttered scenes and variable illumination conditions. The control strategy detailed in this paper is based on the traditional open-loop mathematical model of the depth estimation process. In order to control a robotic system, the obtained visual information is classified into objects of interest and obstacles. The proposed scene analysis architecture is evaluated through experimental results within a robotic collision avoidance system.

19.
Personalized, interactive news on the Web (total citations: 2; self-citations: 0; citations by others: 2)
We present Krakatoa Chronicle, an interactive, personalized newspaper on the World Wide Web implemented as a Java applet. The newspaper is similar in appearance to newspapers in the real world, with a multi-column layout and justified text. At the same time, it provides various interaction techniques for browsing the content of articles, giving relevance feedback, and dynamically changing layout. As users interact with the system, individual ‘user profiles’ are built up at the webserver site. These are used to tailor the newspaper's content and layout to each user's declared and inferred preferences. The system allows for a balancing of personal and community interests, allowing the user to navigate through a space of newspapers corresponding to a range of viewpoints.

20.
Standard methods for sub-pixel matching are iterative and nonlinear; they are also sensitive to false initialization and window deformation. In this paper, we present a linear method that incorporates information from neighboring pixels. Two algorithms are presented: one ‘fast’ and one ‘robust’. They both start from an initial rough estimate of the matching. The fast one is suitable for pairs of images requiring negligible window deformation. The robust method is slower but more general and more precise. It eliminates false matches in the initialization by using robust estimation of the local affine deformation. The first algorithm attains an accuracy of 0.05 pixels for interest points and 0.06 for random points in the translational case. For the general case, if the deformation is small, the second method gives an accuracy of 0.05 pixels, while for large deformation it gives an accuracy of about 0.06 pixels for points of interest and 0.10 pixels for random points. There are very few false matches in all cases, even if there are many in the initialization. Received: 24 July 1997 / Accepted: 4 December 1997
