Similar documents
 Found 20 similar documents (search time: 15 ms)
1.
We propose an edge-based method for 6DOF pose tracking of rigid objects using a monocular RGB camera. A critical problem for edge-based methods is finding the object contour points in the image that correspond to the known 3D model points. Previous methods often produce false object contour points under cluttered backgrounds and partial occlusions. In this paper, we propose a novel edge-based 3D object tracking method to tackle this problem. To find the object contour points, foreground and background clutter points are first filtered out using an edge color cue; the object contour points are then found by maximizing their edge confidence, which combines edge color and distance cues. Furthermore, the edge confidence is integrated into the edge-based energy function to reduce the influence of false contour points caused by cluttered backgrounds and partial occlusions. We also extend our method to multi-object tracking, which can handle mutual occlusions. We compare our method with recent state-of-the-art methods on challenging public datasets. Experiments demonstrate that our method improves robustness and accuracy against cluttered backgrounds and partial occlusions.

2.
In this paper we propose a novel framework for contour-based object detection in cluttered environments. Given a contour model for a class of objects, it is first decomposed into fragments hierarchically. Then, we group these fragments into part bundles, where a part bundle can contain overlapping fragments. Given a new image with a set of edge fragments, we develop an efficient voting method using local shape similarity between part bundles and edge fragments that generates high-quality candidate part configurations. We then use global shape similarity between the part configurations and the model contour to find the optimal configuration. Furthermore, we show that appearance information can be used to improve detection of objects with distinctive texture when the model contour does not sufficiently capture the deformation of the objects.

3.
This paper presents the results of an investigation and pilot study into an active binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognising objects in a cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of an investigation that yielded a maximum vergence error of ~6.5 pixels, while ~85% of known objects were recognised in five different cluttered scenes. Finally, a ‘stepping-stone’ visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the field of view resulting from any individual saccade.

4.
This paper presents a technique for automatic airborne target recognition and tracking in forward-looking infrared (FLIR) images with a complex background. An image splitting and merging method is applied to detect target signals. A complex background caused by clouds and sun glint generates clutter in the image, which can lead to false alarms. A Bayesian classifier trained on the NMI (normalized moment of inertia) feature is proposed for efficient clutter rejection. After classification, target candidates are passed to a tracking filter. The JDC-JIHPDAF is proposed as an efficient and robust multi-target tracking filter for cluttered environments. Experimental results on a wide range of real FLIR images demonstrate reliable classification and automatic target recognition performance.
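
The abstract above names the NMI feature without defining it. A minimal sketch, assuming one common definition (the square root of the moment of inertia about the intensity centroid, divided by the total intensity "mass"). The function name `nmi` and the list-of-rows image format are illustrative assumptions, not the authors' implementation.

```python
import math

def nmi(image):
    """Normalized moment of inertia of a 2-D intensity grid (list of rows).

    Assumed definition: sqrt(J) / m, where m is the total intensity and
    J is the intensity-weighted sum of squared distances to the centroid.
    """
    mass = sum(sum(row) for row in image)
    if mass == 0:
        raise ValueError("image has no foreground mass")
    # Intensity-weighted centroid (cx along columns, cy along rows).
    cx = sum(x * v for row in image for x, v in enumerate(row)) / mass
    cy = sum(y * v for y, row in enumerate(image) for v in row) / mass
    # Moment of inertia about the centroid.
    inertia = sum(((x - cx) ** 2 + (y - cy) ** 2) * v
                  for y, row in enumerate(image)
                  for x, v in enumerate(row))
    return math.sqrt(inertia) / mass

# Two copies of the same 2x2 blob at different positions.
blob_a = [[0, 0, 0, 0, 0],
          [0, 1, 1, 0, 0],
          [0, 1, 1, 0, 0],
          [0, 0, 0, 0, 0]]
blob_b = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 1, 1],
          [0, 0, 0, 1, 1]]
```

Under this definition the value depends only on the blob's shape relative to its own centroid, so `nmi(blob_a)` and `nmi(blob_b)` agree. That translation invariance is what makes such a feature a plausible input to a clutter-rejection classifier.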

5.
Recognizing solid objects by alignment with an image   Total citations: 20, self-citations: 15, citations by others: 5
In this paper we consider the problem of recognizing solid objects from a single two-dimensional image of a three-dimensional scene. We develop a new method for computing a transformation from a three-dimensional model coordinate frame to the two-dimensional image coordinate frame, using three pairs of model and image points. We show that this transformation always exists for three noncollinear points, and is unique up to a reflective ambiguity. The solution method is closed-form and only involves second-order equations. We have implemented a recognition system that uses this transformation method to determine possible alignments of a model with an image. Each of these hypothesized matches is verified by comparing the entire edge contours of the aligned object with the image edges. Using the entire edge contours for verification, rather than a few local feature points, reduces the chance of finding false matches. The system has been tested on partly occluded objects in highly cluttered scenes.

6.
Mel BW  Fiser J 《Neural computation》2000,12(4):731-762
We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors (Malsburg's classical binding problem). We begin by deriving an analytical model that makes explicit how recognition performance is affected by the number of objects that must be distinguished, the number of features included in the representation, the complexity of individual objects, and the clutter load, that is, the amount of visual material in the field of view in which multiple objects must be simultaneously recognized, independent of pose, and without explicit segmentation. Using the domain of text to model object recognition in cluttered scenes, we show that with corrections for the nonuniform probability and nonindependence of text features, the analytical model achieves good fits to measured recognition rates in simulations involving a wide range of clutter loads, word sizes, and feature counts. We then introduce a greedy algorithm for feature learning, derived from the analytical model, which grows a representation by choosing those conjunctive features that are most likely to distinguish objects from the cluttered backgrounds in which they are embedded. We show that the representations produced by this algorithm are compact, decorrelated, and heavily weighted toward features of low conjunctive order. Our results provide a more quantitative basis for understanding when spatially invariant conjunctive features can support unambiguous perception in multiobject scenes, and lead to several insights regarding the properties of visual representations optimized for specific recognition tasks.

7.
Mel B  Fiser J 《Neural computation》2000,12(2):247-278

8.
9.
10.
Probabilistic Models of Appearance for 3-D Object Recognition   Total citations: 6, self-citations: 0, citations by others: 6
We describe how to model the appearance of a 3-D object using multiple views, learn such a model from training images, and use the model for object recognition. The model uses probability distributions to describe the range of possible variation in the object's appearance. These distributions are organized on two levels. Large variations are handled by partitioning training images into clusters corresponding to distinctly different views of the object. Within each cluster, smaller variations are represented by distributions characterizing uncertainty in the presence, position, and measurements of various discrete features of appearance. Many types of features are used, ranging in abstraction from edge segments to perceptual groupings and regions. A matching procedure uses the feature uncertainty information to guide the search for a match between model and image. Hypothesized feature pairings are used to estimate a viewpoint transformation taking account of feature uncertainty. These methods have been implemented in an object recognition system, OLIVER. Experiments show that OLIVER is capable of learning to recognize complex objects in cluttered images, while acquiring models that represent those objects using relatively few views.

11.
The goal of object categorization is to locate and identify instances of an object category within an image. Recognizing an object in an image is difficult when images include occlusion, poor quality, noise or background clutter, and this task becomes even more challenging when many objects are present in the same scene. Several models for object categorization use appearance and context information from objects to improve recognition accuracy. Appearance information, based on visual cues, can successfully identify object classes up to a certain extent. Context information, based on the interaction among objects in the scene or global scene statistics, can help successfully disambiguate appearance inputs in recognition tasks. In this work we address the problem of incorporating different types of contextual information for robust object categorization in computer vision. We review different ways of using contextual information in the field of object categorization, considering the most common levels of extraction of context and the different levels of contextual interactions. We also examine common machine learning models that integrate context information into object recognition frameworks and discuss scalability, optimizations and possible future approaches.

12.
13.
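In aircraft images, the contrast between target and background is usually strong, but the target's position, orientation, and scale also vary considerably, which makes aircraft target recognition difficult. This paper proposes an aircraft image recognition method based on target moments and corner features. First, the targets in the test image and the template image are segmented, and the segmentation results are normalized according to their moment features. Then the similarity between the two is computed from the corner point set of the target in the test image and the edge point set of the target in the template image, which yields the recognition result. Experiments show that the method is computationally fast and adaptable.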

14.
15.
Many current recognition systems use variations on constrained tree search to locate objects in cluttered environments. If the system is simply finding instances of an object known to be in the scene, then previous formal analysis has shown that the expected amount of search is quadratic in the number of model and data features when all the data is known to come from a single object, but is exponential when spurious data is included. If one can group the data into subsets likely to have come from a single object, then terminating the search once a “good enough” interpretation is found reduces the expected search to cubic. Without successful grouping, terminated search is still exponential. These results apply to finding instances of a known object in the data. What happens when the object is not present? In this article, we turn to the problem of selecting models from a library, and examine the combinatorial cost of determining that an incorrectly chosen candidate object is not present in the data. We show that the expected search is again exponential, implying that naive approaches to library indexing are likely to carry an expensive overhead, since an exponential amount of work is needed to weed out each incorrect model. The analytic results are shown to be in agreement with empirical data for cluttered object recognition.
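
The kind of constrained tree search with early termination discussed in the abstract above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the analysed systems: features are 2-D points, the only constraint is that pairwise distances must agree between data and model, and `None` acts as a wildcard for spurious data. The names `search`, `good_enough` and `tol` are hypothetical.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def search(data, model, good_enough, tol=1e-6):
    """Depth-first interpretation-tree search with early termination.

    Returns the first assignment (one model index or None per data point)
    that has at least `good_enough` real matches, or None if there is none.
    """
    def consistent(assign, j):
        i = len(assign)  # index of the data point being assigned
        for k, m in enumerate(assign):
            if m is None:
                continue
            if m == j:  # each model feature is used at most once
                return False
            # Pairwise constraint: distances must agree within tol.
            if abs(dist(data[i], data[k]) - dist(model[j], model[m])) > tol:
                return False
        return True

    def extend(assign, matched):
        if matched >= good_enough:  # terminate as soon as it is good enough
            return assign + [None] * (len(data) - len(assign))
        i = len(assign)
        # Prune branches that can no longer reach the threshold.
        if i == len(data) or matched + (len(data) - i) < good_enough:
            return None
        for j in range(len(model)):
            if consistent(assign, j):
                found = extend(assign + [j], matched + 1)
                if found is not None:
                    return found
        return extend(assign + [None], matched)  # treat data[i] as clutter

    return extend([], 0)

model = [(0, 0), (3, 0), (0, 4)]
scene = [(10, 10), (50, 7), (13, 10), (10, 14)]  # shifted model plus one clutter point
clutter_only = [(0, 0), (100, 1), (7, 50)]
```

With these inputs, `search(scene, model, 3)` stops at the first interpretation that matches three features, while `search(clutter_only, model, 3)` has to exhaust the tree before reporting that the object is absent. The second case is the one the abstract identifies as expensive.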

16.
J. C.  J. S. 《Pattern recognition》2002,35(12):2711-2718
This paper addresses the problem of tracking objects with complex motion dynamics or shape changes. It is assumed that some of the visual features detected in the image (e.g., edge strokes) are outliers, i.e., they do not belong to the object boundary. A robust tracking algorithm is proposed that efficiently tracks an object with complex shape or motion changes in cluttered environments. The algorithm relies on multiple models, i.e., a bank of stochastic motion models switched according to a probabilistic mechanism. Robust filtering methods are used to estimate the label of the active model as well as the state trajectory.

17.
This paper presents a new robot-vision system architecture for real-time moving object localization. The 6-DOF (3 translational and 3 rotational) motion of the objects is detected and tracked accurately in clutter using a model-based approach, without information about the objects’ initial positions. An object identification task and an object tracking task are combined under this architecture. The computational time lag between the two tasks is absorbed by a large amount of frame memory. The tasks are implemented as independent software modules using stereo-vision-based methods that can deal with objects of various shapes with edges, from planar to smooth-curved objects, in cluttered environments. This architecture also leads to failure-recoverable object tracking, because the tracking processes can be recovered automatically even if the moving objects are lost while tracking. Experimental results obtained with prototype systems demonstrate the effectiveness of the proposed architecture.

18.
The explosion of the Internet provides us with a tremendous resource of images shared online. It also confronts vision researchers with the problem of finding effective methods to navigate the vast amount of visual information. Semantic image understanding plays a vital role in solving this problem. One important task in image understanding is object recognition, in particular generic object categorization. Critical to this problem are the issues of learning and datasets. Abundant data helps to train a robust recognition system, while a good object classifier can help to collect a large number of images. This paper presents a novel object recognition algorithm that performs automatic dataset collection and incremental model learning simultaneously. The goal of this work is to use the tremendous resources of the web to learn robust object category models for detecting and searching for objects in real-world cluttered scenes. Humans continuously update their knowledge of objects when new examples are observed. Our framework emulates this human learning process by iteratively accumulating model knowledge and image examples. We adapt a non-parametric latent topic model and propose an incremental learning framework. Our algorithm is capable of automatically collecting much larger object category datasets for 22 randomly selected classes from the Caltech 101 dataset. Furthermore, our system offers not only more images in each object category but also a robust object category model and meaningful image annotation. Our experiments show that OPTIMOL is capable of collecting image datasets that are superior to the well-known manually collected object datasets Caltech 101 and LabelMe.

19.
Robust Object Detection with Interleaved Categorization and Segmentation   Total citations: 5, self-citations: 0, citations by others: 5
This paper presents a novel method for detecting and localizing objects of a visual category in cluttered real-world scenes. Our approach considers object categorization and figure-ground segmentation as two interleaved processes that closely collaborate towards a common goal. As shown in our work, the tight coupling between those two processes allows them to benefit from each other and improve the combined performance. The core part of our approach is a highly flexible learned representation for object shape that can combine the information observed on different training examples in a probabilistic extension of the Generalized Hough Transform. The resulting approach can detect categorical objects in novel images and automatically infer a probabilistic segmentation from the recognition result. This segmentation is then in turn used to again improve recognition by allowing the system to focus its efforts on object pixels and to discard misleading influences from the background. Moreover, the information from where in the image a hypothesis draws its support is employed in an MDL based hypothesis verification stage to resolve ambiguities between overlapping hypotheses and factor out the effects of partial occlusion. An extensive evaluation on several large data sets shows that the proposed system is applicable to a range of different object categories, including both rigid and articulated objects. In addition, its flexible representation allows it to achieve competitive object detection performance already from training sets that are between one and two orders of magnitude smaller than those used in comparable systems.
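
The voting idea behind a Generalized Hough Transform can be sketched as follows. This is a simplified illustration under stated assumptions, not the probabilistic extension or learned representation the abstract describes: each detected feature carries a label, a hypothetical `codebook` maps a label to offsets toward the object centre, and each feature spreads one unit of vote evenly over its offsets. All names and the toy data are illustrative.

```python
from collections import Counter

def vote_for_centers(features, codebook, cell=10):
    """Accumulate object-centre votes on a coarse grid.

    features: list of (x, y, label) detections.
    codebook: label -> list of (dx, dy) offsets from feature to object centre.
    cell:     side length of one accumulator bin, in pixels.
    """
    acc = Counter()
    for x, y, label in features:
        offsets = codebook.get(label, [])
        for dx, dy in offsets:
            cx, cy = x + dx, y + dy
            # Each feature distributes a total weight of 1 over its offsets.
            acc[(cx // cell, cy // cell)] += 1.0 / len(offsets)
    return acc

def best_hypothesis(acc):
    """Return (bin, score) of the strongest centre hypothesis, or None."""
    return max(acc.items(), key=lambda kv: kv[1]) if acc else None

# Toy example: a "wheel" may lie left or right of the centre, a "roof" above it.
codebook = {"wheel": [(20, -10), (-20, -10)], "roof": [(0, 15)]}
features = [(30, 60, "wheel"), (70, 60, "wheel"), (50, 35, "roof")]
acc = vote_for_centers(features, codebook)
```

Here the two ambiguous wheel features each place half a vote at the true centre and half at a wrong location, and only the bin where all three features agree accumulates a high score. Recording which features voted for the winning bin is the natural hook for the segmentation and verification steps mentioned in the abstract.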

20.
Many current recognition systems terminate a search once an interpretation that is good enough is found. The author formally examines the combinatorics of this approach, showing that choosing correct termination procedures can dramatically reduce the search. In particular, the author provides conditions on the object model and the scene clutter such that the expected search is at most quartic. The analytic results are shown to be in agreement with empirical data for cluttered object recognition. These results imply that it is critical to use techniques that select subsets of the data likely to have come from a single object before establishing a correspondence between data and model features.
