Similar Articles
20 similar articles were retrieved.
1.
The Al‐Robotics team was selected as one of the 25 finalist teams out of 143 applications received to participate in the first edition of the Mohamed Bin Zayed International Robotic Challenge (MBZIRC), held in 2017. In particular, one of the competition Challenges offered us the opportunity to develop a cooperative approach with multiple unmanned aerial vehicles (UAVs) searching, picking up, and dropping static and moving objects. This paper presents the approach that our team Al‐Robotics followed to address Challenge 3 of the MBZIRC. First, we give an overview of the overall architecture of the system, with the different modules involved. Second, we describe the procedure that we followed to design the aerial platforms, as well as all their onboard components. Then, we explain the techniques that we used to develop the software functionalities of the system. Finally, we discuss our experimental results and the lessons that we learned before and during the competition. The cooperative approach was validated with fully autonomous missions in experiments prior to the actual competition. We also analyze the results that we obtained during the competition trials.

2.
Recently, robots have been introduced into warehouses and factories for automation and are expected to execute dual-arm manipulation as humans do and to manipulate large, heavy, and unbalanced objects. We focus on the target picking task in cluttered environments and aim to realize a robot picking system in which the robot selects and executes the proper grasping motion, choosing between single-arm and dual-arm motions. In this paper, we propose a few-experiential learning-based target picking system with selective dual-arm grasping. In our system, a robot first learns grasping points and semantic and instance labels of objects with an automatically synthesized dataset. The robot then executes and collects grasp trial experiences in the real world and retrains the grasping point prediction model with the collected trial experiences. Finally, the robot evaluates candidate combinations of object instance, grasping strategy, and grasping points, and selects and executes the optimal grasping motion. In the experiments, we evaluated our system by conducting target picking experiments with the dual-arm humanoid robot Baxter in a cluttered, warehouse-like environment.

3.
We present an approach to the recognition of complex-shaped objects in cluttered environments based on edge information. We first use example images of a target object in typical environments to train a classifier cascade that determines whether edge pixels in an image belong to an instance of the desired object or the clutter. Presented with a novel image, we use the cascade to discard clutter edge pixels and group the object edge pixels into overall detections of the object. The features used for the edge pixel classification are localized, sparse edge density operations. Experiments validate the effectiveness of the technique for recognition of a set of complex objects in a variety of cluttered indoor scenes under arbitrary out-of-image-plane rotation. Furthermore, our experiments suggest that the technique is robust to variations between training and testing environments and is efficient at runtime.

4.
Real-world actions often occur in crowded, dynamic environments. This poses a difficult challenge for current approaches to video event detection because it is difficult to segment the actor from the background due to distracting motion from other objects in the scene. We propose a technique for event recognition in crowded videos that reliably identifies actions in the presence of partial occlusion and background clutter. Our approach is based on three key ideas: (1) we efficiently match the volumetric representation of an event against oversegmented spatio-temporal video volumes; (2) we augment our shape-based features using flow; (3) rather than treating an event template as an atomic entity, we separately match by parts (both in space and time), enabling robustness against occlusions and actor variability. Our experiments on human actions, such as picking up a dropped object or waving in a crowd, show reliable detection with few false positives.

5.
Even though visual attention models using bottom-up saliency can speed up object recognition by predicting object locations, in the presence of multiple salient objects, saliency alone cannot discern target objects from the clutter in a scene. Using a metric named familiarity, we propose a top-down method for guiding attention towards target objects, in addition to bottom-up saliency. To demonstrate the effectiveness of familiarity, the unified visual attention model (UVAM), which combines top-down familiarity and bottom-up saliency, is applied to SIFT-based object recognition. The UVAM is tested on 3600 artificially generated images containing COIL-100 objects with varying amounts of clutter, and on 126 images of real scenes. The recognition times are reduced by 2.7× and 2×, respectively, with no reduction in recognition accuracy, demonstrating the effectiveness and robustness of the familiarity-based UVAM.
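The fusion of the two attention cues can be illustrated with a minimal sketch: a bottom-up saliency map and a top-down familiarity map, both assumed to be given as 2D arrays on the same grid, are normalized and combined into a single attention map whose peak drives the next fixation. The weighted-sum fusion, the weight w_top_down, and all function names are illustrative assumptions, not the formulation used in the paper.

```python
import numpy as np

def combine_attention(saliency, familiarity, w_top_down=0.5):
    """Fuse a bottom-up saliency map with a top-down familiarity map
    (illustrative weighted sum; the paper's exact rule may differ)."""
    def normalize(m):
        m = m.astype(np.float64)
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    fused = (1.0 - w_top_down) * normalize(saliency) + w_top_down * normalize(familiarity)
    y, x = np.unravel_index(np.argmax(fused), fused.shape)  # most attended location
    return fused, (x, y)

# Hypothetical usage: attend to the most promising location first, then
# hand that region to the SIFT-based recognizer.
saliency_map = np.random.rand(120, 160)     # stand-in for a bottom-up saliency map
familiarity_map = np.random.rand(120, 160)  # stand-in for a top-down familiarity map
attention_map, fixation = combine_attention(saliency_map, familiarity_map)
```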

6.
Tracking multiple objects is more challenging than tracking a single object. Some problems arise in multiple-object tracking that do not exist in single-object tracking, such as object occlusion, the appearance of a new object and the disappearance of an existing object, updating the occluded object, etc. In this article, we present an approach to handling multiple-object tracking in the presence of occlusions, background clutter, and changing appearance. The occlusion is handled by considering the predicted trajectories of the objects based on a dynamic model and likelihood measures. We also propose target-model-update conditions, ensuring the proper tracking of multiple objects. The proposed method is implemented in a probabilistic framework such as a particle filter in conjunction with a color feature. The particle filter has proven very successful for nonlinear and non-Gaussian estimation problems. It approximates a posterior probability density of the state, such as the object’s position, by using samples or particles, where each particle denotes a hypothetical state of the tracked object together with its weight. The observation likelihood of the objects is modeled based on a color histogram. The sample weight is measured based on the Bhattacharyya coefficient, which measures the similarity between each sample’s histogram and a specified target model. The algorithm can successfully track multiple objects in the presence of occlusion and noise. Experimental results show the effectiveness of our method in tracking multiple objects.
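The particle weighting step described above can be sketched as follows, assuming OpenCV and NumPy: each particle's image patch is summarized by a normalized color histogram, compared with the target model through the Bhattacharyya coefficient, and mapped to a weight with a Gaussian kernel. The hue-saturation histogram, the Gaussian mapping, and the function names are assumptions made for illustration rather than the paper's exact choices.

```python
import numpy as np
import cv2

def color_histogram(patch, bins=16):
    """HSV hue-saturation histogram of an image patch, L1-normalized."""
    hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256]).flatten()
    return hist / (hist.sum() + 1e-12)

def bhattacharyya_coefficient(p, q):
    """Similarity between two normalized histograms (1 = identical)."""
    return float(np.sum(np.sqrt(p * q)))

def weight_particles(frame, particles, target_model, patch_size=(32, 32), sigma=0.2):
    """Assign each particle a likelihood weight from its color histogram.

    particles: (N, 2) array of candidate (x, y) object centers.
    The Gaussian mapping from Bhattacharyya distance to weight is a common
    choice, not necessarily the one used in the paper.
    """
    h, w = patch_size
    weights = np.zeros(len(particles))
    for i, (x, y) in enumerate(particles.astype(int)):
        patch = frame[max(y - h // 2, 0):y + h // 2, max(x - w // 2, 0):x + w // 2]
        if patch.size == 0:
            continue  # particle drifted outside the frame
        rho = bhattacharyya_coefficient(color_histogram(patch), target_model)
        weights[i] = np.exp(-(1.0 - rho) / (2.0 * sigma ** 2))
    s = weights.sum()
    return weights / s if s > 0 else np.full(len(particles), 1.0 / len(particles))
```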

7.
8.
A new method for tracking contours of moving objects in clutter is presented. For a given object, a model of its contours is learned from training data in the form of a subset of contour space. Greater complexity is added to the contour model by analyzing rigid and non-rigid transformations of contours separately. In the course of tracking, multiple contours may be observed due to the presence of extraneous edges in the form of clutter; the learned model guides the algorithm in picking out the correct one. The algorithm, which is posed as a solution to a minimization problem, is made efficient by the use of several iterative schemes. Results of applying the proposed algorithm to the tracking of a flexing finger and of a conversing individual's lips are presented.

9.
We present a novel and light‐weight approach to capture and reconstruct structured 3D models of multi‐room floor plans. Starting from a small set of registered panoramic images, we automatically generate a 3D layout of the rooms and of all the main objects inside. Such a 3D layout is directly suitable for use in a number of real‐world applications, such as guidance, location, routing, or content creation for security and energy management. Our novel pipeline introduces several contributions to indoor reconstruction from purely visual data. In particular, we automatically partition panoramic images in a connectivity graph, according to the visual layout of the rooms, and exploit this graph to support object recovery and room boundary extraction. Moreover, we introduce a plane‐sweeping approach to jointly reason about the content of multiple images and solve the problem of object inference in a top‐down 2D domain. Finally, we combine these methods in a fully automated pipeline for creating a structured 3D model of a multi‐room floor plan and of the location and extent of clutter objects. These contributions make our pipeline able to handle cluttered scenes with complex geometry that are challenging to existing techniques. The effectiveness and performance of our approach are evaluated on both real‐world and synthetic models.

10.
Accurate Object Recognition with Shape Masks
In this paper we propose an object recognition approach that is based on shape masks—generalizations of segmentation masks. As shape masks carry information about the extent (outline) of objects, they provide a convenient tool to exploit the geometry of objects. We apply our ideas to two common object class recognition tasks—classification and localization. For classification, we extend the orderless bag-of-features image representation. In the proposed setup, shape masks can be seen as weak geometrical constraints over bag-of-features. Those constraints can be used to reduce background clutter and help recognition. For localization, we propose a new recognition scheme based on high-dimensional hypothesis clustering. Shape masks allow us to go beyond bounding boxes and determine the outline (approximate segmentation) of the object during localization. Furthermore, the method easily learns and detects possible object viewpoints and articulations, which are often well characterized by the object outline. Our experiments reveal that shape masks can improve the recognition accuracy of state-of-the-art methods while returning richer recognition answers at the same time. We evaluate the proposed approach on the challenging natural-scene Graz-02 object classes dataset.
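As an illustration of how a shape mask can act as a weak geometric constraint over a bag-of-features representation, the sketch below down-weights local features that fall outside the mask instead of discarding them. The soft weighting scheme, the bg_weight parameter, and the function names are assumptions for illustration only, not the paper's exact formulation.

```python
import numpy as np

def masked_bag_of_features(descriptors, keypoints, shape_mask, vocab, bg_weight=0.1):
    """Build a bag-of-features histogram weighted by a shape mask.

    descriptors : (N, D) local descriptors (e.g. SIFT).
    keypoints   : (N, 2) integer (x, y) locations of the descriptors.
    shape_mask  : 2D array in [0, 1]; ~1 inside the expected object outline.
    vocab       : (K, D) visual vocabulary (cluster centers).
    Features falling outside the mask still contribute, but with a small
    weight, so the mask acts as a weak geometric constraint rather than a
    hard segmentation.
    """
    hist = np.zeros(len(vocab))
    for desc, (x, y) in zip(descriptors, keypoints):
        word = np.argmin(np.linalg.norm(vocab - desc, axis=1))  # nearest visual word
        inside = shape_mask[int(y), int(x)]                     # mask support at the feature
        hist[word] += bg_weight + (1.0 - bg_weight) * inside
    s = hist.sum()
    return hist / s if s > 0 else hist
```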

11.
We study the problem of detecting objects in still, gray-scale images. Our primary focus is the development of a learning-based approach to the problem that makes use of a sparse, part-based representation. A vocabulary of distinctive object parts is automatically constructed from a set of sample images of the object class of interest; images are then represented using parts from this vocabulary, together with spatial relations observed among the parts. Based on this representation, a learning algorithm is used to automatically learn to detect instances of the object class in new images. The approach can be applied to any object with distinguishable parts in a relatively fixed spatial configuration; it is evaluated here on difficult sets of real-world images containing side views of cars, and is seen to successfully detect objects in varying conditions amidst background clutter and mild occlusion. In evaluating object detection approaches, several important methodological issues arise that have not been satisfactorily addressed in previous work. A secondary focus of this paper is to highlight these issues, and to develop rigorous evaluation standards for the object detection problem. A critical evaluation of our approach under the proposed standards is presented.

12.
Tracking is a major issue of virtual and augmented reality applications. Single object tracking on monocular video streams is fairly well understood. However, when it comes to multiple objects, existing methods lack scalability and can recognize only a limited number of objects. Thanks to recent progress in feature matching, state-of-the-art image retrieval techniques can deal with millions of images. However, these methods do not focus on real-time video processing and cannot track retrieved objects. In this paper, we present a method that combines the speed and accuracy of tracking with the scalability of image retrieval. At the heart of our approach is a bi-layer clustering process that allows our system to index and retrieve objects based on tracks of features, thereby effectively summarizing the information available on multiple video frames. Dynamic learning of new viewpoints as the camera moves naturally yields the kind of robustness and reliability expected from an augmented reality engine. As a result, our system is able to track multiple objects in real time, recognizing them with low delay from a database of more than 300 entries. We released the source code of our system in a package called Polyora.

13.
This paper presents a new robot-vision system architecture for real-time moving object localization. The 6-DOF (3 translation and 3 rotation) motion of the objects is detected and tracked accurately in clutter using a model-based approach without information about the objects’ initial positions. An object identification task and an object tracking task are combined under this architecture. The computational time-lag between the two tasks is absorbed by a large amount of frame memory. The tasks are implemented as independent software modules using stereo-vision-based methods which can deal with objects of various shapes with edges, ranging from planar to smooth-curved objects, in cluttered environments. This architecture also leads to failure-recoverable object tracking, because the tracking processes can be automatically recovered even if the moving objects are lost while tracking. Experimental results obtained with prototype systems demonstrate the effectiveness of the proposed architecture.

14.
One of the central problems in automated target recognition is to accommodate the infinite variety of clutter in real military environments. The principal focus of our paper is on the construction of metric spaces where the metric measures the distance between objects of interest invariant to the infinite variety of clutter. Such metrics are formulated using second-order random field models. Our results indicate that this approach significantly improves detection/classification rates of targets in clutter.

15.
Category-level object recognition, segmentation, and tracking in videos become highly challenging when applied to sequences from a hand-held camera that features extensive motion and zooming. An additional challenge is then to develop a fully automatic video analysis system that works without manual initialization of a tracker or other human intervention, both during training and during recognition, despite background clutter and other distracting objects. Moreover, our working hypothesis states that category-level recognition is possible based only on an erratic, flickering pattern of interest point locations, without extracting additional features. Compositions of these points are then tracked individually by estimating a parametric motion model. Groups of compositions segment a video frame into the various objects that are present and into background clutter. Objects can then be recognized and tracked based on the motion of their compositions and on the shape they form. Finally, the combination of this flow-based representation with an appearance-based one is investigated. Besides evaluating the approach on a challenging video categorization database with significant camera motion and clutter, we also demonstrate that it generalizes to action recognition in a natural way. Electronic supplementary material: the online version of this article contains supplementary material, which is available to authorized users. This work was supported in part by the Swiss National Science Foundation under contract no. 200021-107636.

16.
A Geometric-Feature-Based Image Segmentation Algorithm for Lingwu Long Jujubes
When an intelligent picking robot harvests Lingwu long jujubes, the target images captured by its vision system suffer, in natural scenes, from touching fruits, occlusion, overlap, and shadows, which cause objects to be misidentified during image target recognition and are extremely unfavorable for intelligent picking. To address this problem, a geometric-feature-based image segmentation algorithm for Lingwu long jujube images is proposed. Since the outline of a Lingwu long jujube is close to an ellipse, a geometric model of the fruit's outline is built from extensive statistics of the fruit's shape features. A binary image is obtained through a series of preprocessing steps, and successive morphological erosion is then applied to locate and mark the approximate centroid of each target, thereby determining the number of targets. Taking each marked centroid as the center of the model, the geometric model is placed in the transformed binary image, and the boundary curve of the placed model is used to fit the segmentation line between targets in the jujube image, thus achieving segmentation of the Lingwu long jujube image. Experimental results show that this method can simply and quickly resolve touching-fruit and shadow problems between image targets while maintaining high accuracy; for images with only slight fruit adhesion, the segmentation accuracy reaches 92.31%.
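A minimal OpenCV sketch of the centroid-marking and model-placement steps described above: the binary fruit mask is repeatedly eroded so that touching fruits come apart, the centroids of the remaining blobs are taken as fruit centers, and an elliptical model placed at each centroid provides the segmentation boundary. The kernel size, iteration count, area threshold, and ellipse axes are placeholder values; in the paper the ellipse parameters come from measured shape statistics of Lingwu long jujubes.

```python
import cv2
import numpy as np

def estimate_fruit_centroids(binary, kernel_size=5, max_iters=20, min_area=50):
    """Erode the binary fruit mask up to max_iters times (stopping early if it
    would vanish), which helps touching fruits separate, then return the
    centroid of each remaining blob above min_area."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    eroded = binary.copy()
    for _ in range(max_iters):
        nxt = cv2.erode(eroded, kernel)
        if cv2.countNonZero(nxt) == 0:
            break
        eroded = nxt
    n, _, stats, centroids = cv2.connectedComponentsWithStats(eroded)
    return [tuple(map(int, centroids[i]))
            for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= min_area]

def draw_ellipse_model(image, centroids, axes=(40, 25), angle=0):
    """Overlay the elliptical fruit model at each centroid; its boundary serves
    as the segmentation line between touching fruits."""
    out = image.copy()
    for (cx, cy) in centroids:
        cv2.ellipse(out, (cx, cy), axes, angle, 0, 360, (0, 255, 0), 2)
    return out
```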

17.
From Images to Shape Models for Object Detection
We present an object class detection approach which fully integrates the complementary strengths offered by shape matchers. Like an object detector, it can learn class models directly from images, and can localize novel instances in the presence of intra-class variations, clutter, and scale changes. Like a shape matcher, it finds the boundaries of objects, rather than just their bounding-boxes. This is achieved by a novel technique for learning a shape model of an object class given images of example instances. Furthermore, we also integrate Hough-style voting with a non-rigid point matching algorithm to localize the model in cluttered images. As demonstrated by an extensive evaluation, our method can localize object boundaries accurately and does not need segmented examples for training (only bounding-boxes).

18.
A Target Detection Method Based on Multiscale Fractal Features
Automatic target detection against natural background clutter is a fundamental problem in target detection. Based on the observation that the fractal features of man-made targets in natural scenes change sharply as the scale varies, an automatic target detection method based on the extrema of fractal parameters across scales is proposed. Extensive experimental results show that this method performs well in automatically detecting small man-made targets against natural background clutter.
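One way to make the scale-sensitivity idea concrete is sketched below: a box-counting style fractal estimate is computed for each local window at several scale pairs, and the spread (maximum minus minimum) of the estimates across scales is used as the detection feature, on the assumption that natural textures give nearly scale-invariant estimates while man-made targets do not. The differential box-counting variant, window size, step, and scale pairs are illustrative choices, not the settings used in the paper.

```python
import numpy as np

def local_fractal_dimension(window, scales=(2, 4)):
    """Box-counting style fractal estimate of a gray-level window: for each
    scale s the window is tiled into s x s cells, the number of gray-level
    boxes covering each cell is counted, and the slope of log(count) vs
    log(s) is returned."""
    h, w = window.shape
    counts, used_scales = [], []
    for s in scales:
        ch, cw = h // s, w // s
        if ch == 0 or cw == 0:
            continue
        boxes = 0
        for i in range(s):
            for j in range(s):
                cell = window[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
                boxes += int(np.ceil((cell.max() - cell.min() + 1) / max(ch, 1)))
        counts.append(boxes)
        used_scales.append(s)
    slope, _ = np.polyfit(np.log(used_scales), np.log(counts), 1)
    return slope

def fractal_extremum_map(image, win=32, step=16, scale_sets=((2, 4), (4, 8), (8, 16))):
    """Feature map: spread of the fractal estimate across scale pairs; a
    large spread flags windows likely to contain man-made targets."""
    img = image.astype(np.float64)
    rows = (img.shape[0] - win) // step + 1
    cols = (img.shape[1] - win) // step + 1
    feat = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            window = img[r * step:r * step + win, c * step:c * step + win]
            dims = [local_fractal_dimension(window, scales=s) for s in scale_sets]
            feat[r, c] = max(dims) - min(dims)
    return feat
```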

19.
This paper presents a multiple-model real-time tracking technique for video sequences, based on the mean-shift algorithm. The proposed approach incorporates spatial information from several connected regions into the histogram-based representation model of the target, and enables multiple models to be used to represent the same object. The use of several regions to capture the color spatial information in a single combined model allows us to increase the object tracking efficiency. By using multiple models, we can make the tracking scheme robust enough to work with sequences containing illumination and pose changes. We define a model selection function that takes into account both the similarity of the model with the information present in the image, and the target dynamics. In the tracking experiments presented, our method successfully coped with lighting changes, occlusion, and clutter.
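The two ingredients above, a spatially-aware multi-region histogram model and a selection function mixing appearance similarity with target dynamics, can be sketched as follows. The 2x2 region grid, the hue histograms, the Bhattacharyya-coefficient similarity, the weight alpha, and all function names are assumptions for illustration; the paper's exact model and selection function may differ.

```python
import numpy as np
import cv2

def region_model(frame, bbox, grid=(2, 2), bins=16):
    """Spatially-aware target model: one normalized hue histogram per
    sub-region of the bounding box, concatenated into a single vector.
    The bbox (x, y, w, h) is assumed large enough for the grid."""
    x, y, w, h = bbox
    hsv = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    parts = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            cell = hsv[i * h // grid[0]:(i + 1) * h // grid[0],
                       j * w // grid[1]:(j + 1) * w // grid[1]]
            hist = cv2.calcHist([cell], [0], None, [bins], [0, 180]).flatten()
            parts.append(hist / (hist.sum() + 1e-12))
    return np.concatenate(parts)

def select_model(frame, candidate_bboxes, models, predicted_center, alpha=0.7):
    """Model selection: score each (candidate box, stored model) pair by a
    weighted sum of appearance similarity (Bhattacharyya coefficient) and
    agreement with the motion prediction, and keep the best pair."""
    best = (None, None, -np.inf)
    for bbox in candidate_bboxes:
        hist = region_model(frame, bbox)
        cx, cy = bbox[0] + bbox[2] / 2.0, bbox[1] + bbox[3] / 2.0
        dynamics = 1.0 / (1.0 + np.hypot(cx - predicted_center[0], cy - predicted_center[1]))
        for k, model in enumerate(models):
            similarity = float(np.sum(np.sqrt(hist * model)))
            score = alpha * similarity + (1.0 - alpha) * dynamics
            if score > best[2]:
                best = (bbox, k, score)
    return best  # (best bbox, index of best model, score)
```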

20.
We propose an edge-based method for 6DOF pose tracking of rigid objects using a monocular RGB camera. One of the critical problems for edge-based methods is to search for the object contour points in the image corresponding to the known 3D model points. However, previous methods often produce false object contour points in the case of cluttered backgrounds and partial occlusions. In this paper, we propose a novel edge-based 3D object tracking method to tackle this problem. To search for the object contour points, foreground and background clutter points are first filtered out using an edge color cue, then object contour points are found by maximizing their edge confidence, which combines edge color and distance cues. Furthermore, the edge confidence is integrated into the edge-based energy function to reduce the influence of false contour points caused by cluttered backgrounds and partial occlusions. We also extend our method to multi-object tracking, which can handle mutual occlusions. We compare our method with recent state-of-the-art methods on challenging public datasets. Experiments demonstrate that our method improves robustness and accuracy against cluttered backgrounds and partial occlusions.
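The edge-confidence idea can be illustrated with a small sketch that scores each candidate contour point along a search line by combining a color-consistency cue with a distance cue to the projected model point, and keeps the maximum. The weighted-sum combination, the Gaussian distance term, sigma_d, w_color, and the function names are assumptions for illustration; the paper's formulation may differ.

```python
import numpy as np

def edge_confidence(color_score, distance, sigma_d=10.0, w_color=0.5):
    """Confidence of a candidate contour point along a search line.

    color_score : in [0, 1], how well the colors on the two sides of the
                  candidate edge match the learned foreground/background
                  statistics (higher = more consistent).
    distance    : pixel distance from the projected 3D model point.
    The weighted sum of the two cues is one reasonable combination, not
    necessarily the one used in the paper.
    """
    distance_score = np.exp(-(distance ** 2) / (2.0 * sigma_d ** 2))
    return w_color * color_score + (1.0 - w_color) * distance_score

def pick_contour_point(candidates):
    """Among the edge candidates found along one search line, keep the one
    with maximal confidence; candidates is a list of (position, color_score,
    distance) tuples for edges that survived the color-based clutter filter."""
    best = max(candidates, key=lambda c: edge_confidence(c[1], c[2]))
    return best[0], edge_confidence(best[1], best[2])
```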
