Similar Documents
Found 20 similar documents (search time: 375 ms)
1.
In many mobile robot applications, the high cost of damaging the robot or its environment makes even rare failures unacceptable. To mitigate this risk, a robot must be able to detect potentially hazardous situations before it experiences a major failure. This problem therefore becomes one of novelty and change detection: how a robot can identify when perception inputs differ from prior inputs seen during training or previous operation in the same area. With this ability, the system can either avoid novel locations to minimize risk or stop and enlist human help via supervisory control or teleoperation. We present an anytime novelty detection algorithm that deals with the noisy, redundant, high-dimensional feature spaces common in robotics by utilizing prior class information within the training set. This approach is also well suited for online use when a constantly adjusting environmental model is beneficial. Furthermore, we address the problem of change detection in an environment of repeated operation by framing it as a location-specific version of novelty detection, and we present an online scene segmentation algorithm that improves accuracy across diverse environments. We validate these approaches through extensive experiments onboard two outdoor mobile robot platforms, show that our approaches are robust to noisy sensor data and moderate registration errors, and argue that such abilities could be key to increasing the real-world applications and impact of mobile robotics. © 2011 Wiley Periodicals, Inc.
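As a rough illustration of class-conditioned novelty scoring (a sketch under our own assumptions, not the paper's anytime algorithm): fit a simple Gaussian model per prior class in the training set, and score an input by its minimum Mahalanobis distance to all classes. The per-class Gaussians and the regularization constant are illustrative choices.

```python
# Hedged sketch of class-conditioned novelty scoring: feature vectors are
# compared against per-class Gaussian models fit on training data, and the
# minimum Mahalanobis distance over classes serves as the novelty score.
import numpy as np

def fit_class_models(features, labels):
    """Fit a mean/covariance model for each prior class in the training set."""
    models = {}
    for c in np.unique(labels):
        X = features[labels == c]
        mu = X.mean(axis=0)
        # Regularize so noisy, redundant dimensions keep the covariance invertible.
        cov = np.cov(X, rowvar=False) + 1e-3 * np.eye(X.shape[1])
        models[c] = (mu, np.linalg.inv(cov))
    return models

def novelty_score(x, models):
    """Minimum Mahalanobis distance to any known class; large => novel."""
    return min(np.sqrt((x - mu) @ prec @ (x - mu)) for mu, prec in models.values())

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 8))
labels = rng.integers(0, 3, size=200)
models = fit_class_models(train, labels)
print(novelty_score(rng.normal(size=8), models))       # in-distribution: small
print(novelty_score(10 + rng.normal(size=8), models))  # far from training: large
```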

2.
A camera's shutter controls the light reaching the camera sensor. Different shutters lead to wildly different results and are often used as an artistic tool in movies, e.g., to indirectly control the amount of motion blur. However, a physical camera is limited to a single shutter setting at any given moment. ShutterApp enables users to define spatio-temporally varying virtual shutters that go beyond the options available in real-world camera systems. A user provides a sparse set of annotations that define shutter functions at selected locations in key frames. From this input, our solution defines shutter functions for each pixel of the video sequence using a suitable interpolation technique, which are then employed to derive the output video. Our solution performs in real time on commodity hardware. As a result, users can explore different options interactively, reaching a new level of expressiveness without having to rely on specialized hardware or laborious editing.
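A minimal sketch of what a per-pixel virtual shutter could look like once the sparse annotations have been interpolated into dense shutter functions; the weighted-integral formulation and window length are our assumptions, not ShutterApp's actual implementation.

```python
# Minimal sketch of a spatio-temporally varying virtual shutter, assuming the
# per-pixel shutter has already been interpolated from sparse key-frame
# annotations into a weight w(t, y, x) over a temporal window of input frames.
# The output pixel is the weighted temporal integral of the incoming frames.
import numpy as np

def apply_virtual_shutter(frames, weights):
    """frames: (T, H, W) stack; weights: (T, H, W) per-pixel shutter functions.

    Each pixel integrates light over time according to its own shutter curve,
    emulating effects (e.g., per-region motion blur) no physical shutter offers.
    """
    w = weights / np.clip(weights.sum(axis=0, keepdims=True), 1e-8, None)
    return (w * frames).sum(axis=0)

T, H, W = 16, 4, 4
frames = np.random.rand(T, H, W)
# Example: left half uses a long "open" shutter (strong blur), right half a short one.
weights = np.zeros((T, H, W))
weights[:, :, : W // 2] = 1.0          # open for the whole window
weights[T // 2, :, W // 2 :] = 1.0     # open for a single instant
print(apply_virtual_shutter(frames, weights).shape)  # (4, 4)
```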

3.
In this article, we propose a new approach to the map-building task: the implementation of the Spatial Semantic Hierarchy (SSH), proposed by B. Kuipers, on a real robot fitted with an omnidirectional camera. Kuipers' original formulation of the SSH was slightly modified in order to manage more efficiently the knowledge the real robot collects while moving through the environment. The sensory data experienced by the robot are transformed by the different levels of the SSH to obtain a compact representation of the environment. This knowledge is stored in the form of a topological map and, eventually, a metrical map. The aim of this article is to show that a catadioptric omnidirectional camera is a good sensor for the SSH and couples nicely with several of its elements. The panoramic view and rotational invariance of our omnidirectional camera make the identification and labelling of places straightforward. A deeper insight is that tracking and identifying events in an omnidirectional image, such as occlusions and alignments, can be used to segment continuous sensory image data into the discrete topological and metric elements of a map. The proposed combination of the SSH and omnidirectional vision provides a powerful general framework for robot mapping and offers new insights into the concept of “place.” Some preliminary experiments performed with a real robot in an unmodified office environment are presented.

4.
In this work we propose methods that exploit context sensor modalities to detect interesting events and extract high-level contextual information about the recording activity in user-generated videos. Indeed, most camera-enabled electronic devices contain various auxiliary sensors such as accelerometers, compasses, and GPS receivers. Data captured by these sensors during media acquisition have already been used to limit camera degradations such as shake and to provide basic tagging information such as location. However, exploiting the sensor-recording modality for subsequent higher-level information extraction, such as detecting interesting events, has been the subject of rather limited research, further constrained to specialized acquisition setups. In this work, we show how these sensor modalities allow inferring information (camera movements, content degradations) about each individual video recording. In addition, we consider a multi-camera scenario, where multiple user-generated recordings of a common scene (e.g., music concerts) are available. For such scenarios we jointly analyze the multiple video recordings and their associated sensor modalities in order to extract higher-level semantics of the recorded media: based on the orientation of the cameras we identify the region of interest of the recorded scene, and by exploiting correlation in the motion of different cameras we detect generic interesting events and estimate their relative positions. Furthermore, by also analyzing the audio content captured by multiple users, we detect more specific interesting events. We show that the proposed multimodal analysis methods perform well on various recordings obtained at real live music performances.
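One deliberately simplified reading of the cross-camera motion cue: if compass-derived angular velocities of several cameras agree in direction over a time window, many users are panning together, which suggests a generic interesting event. The agreement measure, thresholds, and synthetic data below are our own illustrative choices, not the paper's method.

```python
# Illustrative sketch: threshold the cross-camera agreement of compass-derived
# angular velocities to flag moments when several cameras pan together.
import numpy as np

def interesting_events(headings, window=5, agree_thresh=0.8):
    """headings: (n_cameras, T) compass readings in radians, shared timeline."""
    rates = np.diff(np.unwrap(headings, axis=1), axis=1)   # angular velocities
    same_sign = np.abs(np.mean(np.sign(rates), axis=0))    # cross-camera agreement
    smooth = np.convolve(same_sign, np.ones(window) / window, mode="same")
    return np.flatnonzero(smooth > agree_thresh)           # candidate event times

headings = 0.05 * np.random.randn(4, 200)
headings[:, 100:130] += np.linspace(0, 3.0, 30)  # all four cameras pan together
print(interesting_events(headings))              # indices cluster near 100-130
```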

5.
Learning human–robot interaction logic from example interaction data has the potential to leverage “big data” to reduce the effort and time spent on designing interaction logic or crafting interaction content. Previous work has demonstrated techniques by which a robot can learn motion and speech behaviors from non-annotated human–human interaction data, but these techniques only enable a robot to respond to human-initiated inputs; they do not enable the robot to proactively initiate interaction. In this work, we propose a method for learning both human-initiated and robot-initiated behavior for a social robot from human–human example interactions, which we demonstrate for a shopkeeper interacting with a customer in a camera-shop scenario. This was achieved by extending an existing technique by (1) introducing the concept of a customer yield action, (2) incorporating interaction history, represented as sequences of discretized actions, as input for training and generating robot behavior, and (3) using an “attention mechanism” in our learning system that learns which parts of the interaction history are more important for generating robot behaviors. The proposed method trains a robot to generate multimodal actions consisting of speech and locomotion behaviors. We compared the proposed method with the previous technique in two ways. Cross-validation on the training data showed higher social appropriateness of predicted behaviors using the proposed technique, and a user study of live interaction with a robot showed that participants perceived the proposed technique to produce behaviors that were more proactive, more socially appropriate, and better in overall quality.
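The “attention mechanism” over interaction history can be pictured with a bare-bones dot-product attention sketch; the embeddings, dimensionality, and scoring function here are illustrative assumptions rather than the authors' trained model.

```python
# Illustrative sketch (not the authors' exact architecture) of attention over
# an interaction history of discretized actions: each past action embedding is
# scored against the current input, and the softmax-weighted summary
# conditions the robot's next action.
import numpy as np

def attend(history, query):
    """history: (n, d) embeddings of past actions; query: (d,) current input."""
    scores = history @ query                  # relevance of each past action
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over the history
    return weights @ history, weights         # context vector + attention map

rng = np.random.default_rng(1)
history = rng.normal(size=(6, 16))  # e.g., six discretized shopkeeper/customer actions
query = rng.normal(size=16)         # embedding of the customer's latest action
context, weights = attend(history, query)
print(weights.round(2))  # which parts of the history mattered for this behavior
```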

6.
A time-of-flight camera can help a service robot sense its 3D environment. In this paper, we introduce methods for sensor calibration and 3D data segmentation that allow such a camera to be used to automatically plan grasps and manipulation actions for a service robot. Impedance control is used intensively to compensate for remaining modeling errors and to apply the computed forces. The methods are demonstrated in three service-robot applications. Sensor-based motion planning allows the robot to move within dynamic, cluttered environments without collision. Unknown objects can be detected and grasped. In the autonomous ice-cream-serving scenario, the robot captures the surface of the ice cream and plans a manipulation trajectory to scoop out a portion.

7.
An autonomous mobile robot must have the ability to navigate in an unknown environment, and the simultaneous localization and map building (SLAM) problem is central to this ability. Vision sensors are attractive for autonomous mobile robots because they are information-rich and impose few restrictions on applications. However, many vision-based SLAM methods using a general pin-hole camera suffer from variations in illumination and from occlusion, because they mostly extract corner points for the feature map. Moreover, due to the narrow field of view of the pin-hole camera, they cannot cope with high-speed camera motion. To solve these problems, this paper presents a new SLAM method which uses vertical lines extracted from an omnidirectional camera image and horizontal lines from range sensor data. Thanks to the large field of view of the omnidirectional camera, features remain in the image long enough to estimate the pose of the robot and the features more accurately. Furthermore, since the proposed SLAM uses lines rather than corner points as features, it reduces the effect of illumination changes and partial occlusion. Moreover, we use not only the lines at wall corners but also the many other vertical lines at doors, columns, and information panels on the wall, which cannot be extracted by a range sensor. Finally, since the horizontal lines are used to estimate the positions of the vertical-line features, no camera calibration is required. Experimental work based on MORIS, our mobile robot test bed, moving at a human's pace in a real indoor environment, verifies the efficacy of this approach.
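To see why vertical lines pair well with an omnidirectional camera: for a central catadioptric camera with a vertical axis, world-vertical lines project approximately to radial lines through the image center, so a bearing histogram of edge energy can surface them. This sketch assumes that geometry; the bin count and threshold are illustrative.

```python
# Hedged sketch: accumulate gradient magnitude into bearing-angle bins around
# the image center; peaks indicate radial lines, i.e., likely vertical lines
# in the world for a vertically mounted omnidirectional camera.
import numpy as np

def radial_line_bearings(gray, center, n_bins=360, thresh=0.5):
    """Return bearing bins whose edge energy suggests a vertical-line feature."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ys, xs = np.indices(gray.shape)
    theta = np.arctan2(ys - center[0], xs - center[1])
    bins = ((theta + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    hist /= hist.max() + 1e-8
    return np.flatnonzero(hist > thresh)

img = np.random.rand(64, 64)  # stand-in for an unwarped omnidirectional frame
print(radial_line_bearings(img, center=(32, 32)))
```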

8.
We present a novel optimisation framework for estimating the multi-body motion segmentation and 3D reconstruction of a set of point trajectories in the presence of missing data. The proposed solution not only assigns the trajectories to the correct motions but also solves for the 3D location of the multi-body shape and fills the missing entries in the measurement matrix. The solution is based on two fundamental principles: each of the multi-body motions is controlled by a set of metric constraints given by the specific camera model, and the shape matrix that describes the multi-body 3D shape is generally sparse. We jointly include these constraints in a single optimisation framework which, starting from an initial segmentation, iteratively enforces them in three stages. First, metric constraints are used to estimate the 3D metric shape and to fill the missing entries according to an orthographic camera model. Then, wrongly segmented trajectories are detected using sparse optimisation of the shape matrix. A final reclassification strategy assigns the detected points to the right motion or discards them as outliers. We provide experiments showing consistent improvements over previous approaches on both synthetic and real data.

9.
Reliable obstacle detection and classification in rough and unstructured terrain, such as agricultural fields or orchards, remains a challenging problem. These environments involve large variations in both geometry and appearance, challenging perception systems that rely on a single sensor modality. Geometrically, tall grass, fallen leaves, or terrain roughness can mistakenly be perceived as nontraversable or might even obscure actual obstacles. Likewise, traversable grass or dirt roads and obstacles such as trees and bushes might be visually ambiguous. In this paper, we combine appearance- and geometry-based detection methods by probabilistically fusing lidar and camera sensing with semantic segmentation using a conditional random field. We apply a state-of-the-art multimodal fusion algorithm from the scene-analysis domain and adapt it for obstacle detection in agriculture with moving ground vehicles. This involves explicitly handling sparse point cloud data and exploiting spatial, temporal, and multimodal links between corresponding 2D and 3D regions. The proposed method was evaluated on a diverse data set comprising a dairy paddock and different orchards, gathered with a perception research robot in Australia. Results showed that for a two-class classification problem (ground and nonground), only the camera benefited from information provided by the other modality, with an increase in the mean classification score of 0.5%. However, as more classes were introduced (ground, sky, vegetation, and object), both modalities complemented each other, with improvements of 1.4% in 2D and 7.9% in 3D. Finally, introducing temporal links between successive frames resulted in further improvements of 0.2% in 2D and 1.5% in 3D.

10.
An innovative neuro-evolutionary approach to mobile robot egomotion estimation with a 3D ToF camera is proposed. The system is composed of two main modules following a preprocessing step. The first module is a Neural Gas network that computes a vector quantization of the preprocessed camera 3D point cloud. The second module is an Evolution Strategy that estimates the robot motion parameters by performing a registration process, searching the space of linear transformations restricted to translation and rotation, between the codebooks obtained for successive camera readings. The fitness function is the matching error between the predicted codebook and the observed codebook corresponding to the next camera reading. In this paper, we report results of an implementation of this system tested on data from a real mobile robot, and provide several comparisons between our approach and other well-known registration algorithms.
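The fitness function described in the abstract can be sketched directly: transform the previous frame's codebook by a candidate rotation and translation and measure the matching error against the next frame's codebook. For brevity this sketch is 2D and uses a nearest-neighbour error; the actual system operates on 3D ToF data.

```python
# Minimal sketch of the registration fitness an evolution strategy would
# minimize: candidate (rotation, translation) maps the previous Neural-Gas
# codebook onto the next one; fitness is the mean nearest-neighbour error.
import numpy as np

def fitness(params, codebook_prev, codebook_next):
    angle, tx, ty = params
    R = np.array([[np.cos(angle), -np.sin(angle)],
                  [np.sin(angle),  np.cos(angle)]])
    predicted = codebook_prev @ R.T + np.array([tx, ty])
    d = np.linalg.norm(predicted[:, None, :] - codebook_next[None, :, :], axis=2)
    return d.min(axis=1).mean()  # matching error: predicted vs. observed codes

rng = np.random.default_rng(2)
cb = rng.normal(size=(32, 2))
true = (0.3, 0.5, -0.2)
R = np.array([[np.cos(true[0]), -np.sin(true[0])],
              [np.sin(true[0]),  np.cos(true[0])]])
cb_next = cb @ R.T + np.array(true[1:])
print(fitness(true, cb, cb_next))             # ~0 at the true motion
print(fitness((0.0, 0.0, 0.0), cb, cb_next))  # clearly worse elsewhere
```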

11.
We present a real-time multi-view facial capture system facilitated by synthetic training imagery. Our method achieves high-quality markerless facial performance capture in real time from multi-view helmet camera data, employing an actor-specific regressor. The regressor training is tailored to the specified actor's appearance, and we further condition it for the expected illumination conditions and the physical capture rig by generating the training data synthetically. In order to leverage the information present in live imagery, which is typically provided by multiple cameras, we propose a novel multi-view regression algorithm that uses multi-dimensional random ferns. We show that higher quality can be achieved by regressing on multiple video streams than with previous approaches designed to operate on only a single view. Furthermore, we evaluate possible camera placements and propose a novel camera configuration that allows cameras to be mounted outside the actor's field of view; this is very beneficial, as the cameras are then less of a distraction for the actor and allow an unobstructed line of sight to the director and other actors. Our new real-time facial capture approach has immediate application in on-set virtual production, in particular given the ever-growing demand for motion-captured facial animation in visual effects and video games.

12.
In this paper, a concept of virtual sensors is proposed for efficient obstacle avoidance during robot motion. A virtual sensor combines encoder values with real distance data to derive new sensor readings that incorporate the mobility of the robot. Simulations on Windows XP illustrate the proposed approach using distance data acquired from both virtual and actual sensors. For comparison with the results developed in this paper, we use the conventional artificial potential field method based on actual distances as a baseline. Data from the virtual sensors yield smoother and safer obstacle-avoidance trajectories with respect to both the obstacles and the robot's mobility.
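A minimal sketch, under our own assumptions about the fusion rule, of how encoder-derived motion could be folded into raw range data so that readings reflect the robot's mobility; the reaction horizon and the cosine projection are illustrative choices, not the paper's formulation.

```python
# Hedged sketch of the virtual-sensor idea: fuse raw range readings with the
# encoder-estimated velocity so that obstacles ahead of a fast-moving robot
# appear closer than the raw data alone would indicate.
import numpy as np

def virtual_ranges(ranges, angles, v, horizon=0.5):
    """ranges/angles: raw scan; v: encoder-estimated forward velocity (m/s).

    Each reading is reduced by the motion expected within `horizon` seconds
    along that ray, giving a mobility-aware "virtual" distance.
    """
    closing_speed = v * np.cos(angles)  # velocity component toward each ray
    effective = ranges - np.maximum(closing_speed, 0.0) * horizon
    return np.maximum(effective, 0.0)

angles = np.linspace(-np.pi / 2, np.pi / 2, 7)
ranges = np.full_like(angles, 2.0)
print(virtual_ranges(ranges, angles, v=1.0).round(2))
# Front rays drop toward 1.5 m; side rays stay near 2.0 m.
```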

13.
In this paper, we present a real-time, high-precision visual localization system for an autonomous vehicle which employs only low-cost stereo cameras to localize the vehicle within an a priori map built using a more expensive 3D LiDAR sensor. To this end, we construct two different visual maps: a sparse feature visual map for visual odometry (VO) based motion tracking, and a semidense visual map for registration against the prior LiDAR map. To register two point clouds sourced from different modalities (i.e., cameras and LiDAR), we leverage the probabilistic weighted normal distributions transformation (ProW-NDT), which takes particular account of the uncertainty of the source point clouds. The registration results are then fused via pose graph optimization to correct the VO drift. Moreover, surfels extracted from the prior LiDAR map are used to refine the sparse 3D visual features, further improving VO-based motion estimation. The proposed system has been tested extensively in both simulated and real-world experiments, showing that robust, high-precision, real-time localization can be achieved.

14.
In this paper, we consider the problem of planning optimal paths for a differential-drive robot with limited sensing that must maintain visibility of a fixed landmark as it navigates in its environment. In particular, we assume that the robot's vision sensor has a limited field of view (FOV), and that the fixed landmark must remain within the FOV throughout the robot's motion. We first investigate the nature of extremal paths that satisfy the FOV constraint; these extremal paths saturate the camera pan angle. We then show that optimal paths are composed of straight-line segments and sections of these extremal paths. We provide a complete characterization of the shortest paths for the system by partitioning the plane into a set of disjoint regions, such that the structure of the optimal path is invariant over each individual region.
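The FOV constraint underlying the extremal paths is easy to state concretely: the landmark's bearing in the robot frame must stay within the camera's half field of view, and extremal paths hold this constraint at its bound. A small feasibility check, with an illustrative landmark position and angles:

```python
# Sketch of the field-of-view constraint: the fixed landmark must lie within
# the camera FOV; paths that keep |bearing| == half_fov are the saturated
# (extremal) paths discussed in the abstract.
import numpy as np

def fov_ok(robot_xy, robot_heading, landmark_xy, half_fov):
    """True iff the landmark lies inside the camera's field of view."""
    dx, dy = np.asarray(landmark_xy) - np.asarray(robot_xy)
    bearing = np.arctan2(dy, dx) - robot_heading       # bearing in robot frame
    bearing = (bearing + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    return abs(bearing) <= half_fov

landmark = (0.0, 0.0)
print(fov_ok((2.0, 0.0), np.pi, landmark, np.radians(30)))      # facing it: True
print(fov_ok((2.0, 0.0), np.pi / 2, landmark, np.radians(30)))  # turned away: False
```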

15.
Virtual world explorations by using topological and semantic knowledge
This paper is dedicated to virtual-world exploration techniques. Automatic camera control is important in many fields, such as computational geometry, visual servoing, robot motion, and graph drawing. The paper introduces a high-level camera control approach for virtual environments. The proposed method addresses real-time 3D scene exploration and consists of two steps. In the first step, a set of good viewpoints is chosen to give the user maximum knowledge of the scene. The second step uses these viewpoints to compute a camera path between them. Finally, we define a notion of semantic distance between objects of the scene to improve the approach.

16.
High-quality video editing usually requires accurate layer separation in order to resolve occlusions. However, most existing bilayer segmentation algorithms require either considerable user intervention or a simple stationary-camera configuration with a known background, which is difficult to meet in many real-world online applications. This paper demonstrates that various visually appealing montage effects can be created online from live video captured by a rotating camera, by accurately retrieving the camera state and segmenting out the dynamic foreground. The key contribution is a novel, fast bilayer segmentation method that can effectively extract the dynamic foreground under a rotational camera configuration and is robust to imperfect background estimation and complex background colors. Our system can create a variety of live visual effects, including but not limited to realistic virtual object insertion, background substitution and blurring, non-photorealistic rendering, and camouflage effects. A variety of challenging examples demonstrate the effectiveness of our method.

17.
Hyperspectral cameras sample many different spectral bands at each pixel, enabling advanced detection and classification algorithms. However, their limited spatial resolution and the need to measure the camera motion to create hyperspectral images make them unsuitable for nonsmooth moving platforms such as unmanned aerial vehicles (UAVs). We present a procedure to build hyperspectral images from line-sensor data without camera-motion information or extraneous sensors. Our approach relies on an accompanying conventional camera, exploiting the homographies between its images for mosaic construction. We provide experimental results from a low-altitude UAV, achieving high-resolution spectroscopy with our system.
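A sketch of the homography-estimation step with OpenCV, under the assumption that standard feature matching (ORB plus RANSAC, our choice) stands in for the paper's pipeline; chaining the returned frame-to-frame homographies is what would place line-scan data into a common mosaic frame.

```python
# Hedged sketch: estimate the homography between successive frames of the
# accompanying conventional camera from feature matches.
import cv2
import numpy as np

def frame_to_frame_homography(img1, img2):
    orb = cv2.ORB_create(1000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H  # maps img1 coordinates into img2; chain these to build the mosaic

# Synthetic test: shift a textured image 10 px right and recover the homography.
img1 = (np.random.rand(240, 320) * 255).astype(np.uint8)
M = np.float32([[1, 0, 10], [0, 1, 0]])
img2 = cv2.warpAffine(img1, M, (320, 240))
print(frame_to_frame_homography(img1, img2).round(2))  # translation ~ [1 0 10]
```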

18.
This paper proposes an algorithm which uses image registration to estimate a non-uniform motion blur point spread function (PSF) caused by camera shake. Our study is based on a motion blur model that represents the blur effects of camera shake using a set of planar perspective projections (i.e., homographies). This representation can fully describe the 3D camera-shake motions that cause non-uniform motion blur. We transform the non-uniform PSF estimation problem into a set of image registration problems that estimate the homographies of the motion blur model one by one through the Lucas-Kanade algorithm. We demonstrate the performance of our algorithm using both synthetic and real-world examples, and we discuss the effectiveness and limitations of our algorithm for non-uniform deblurring.
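The homography-based blur model is straightforward to synthesize, which also shows what registration must recover: the blurred image is (approximately) the mean of the sharp image warped by homographies sampled along the shake trajectory. The rotation-only trajectory below is an illustrative stand-in, not the paper's data.

```python
# Sketch of the abstract's blur model: a shaken camera's blurred image as the
# average of the sharp image warped by a set of homographies. The paper's
# contribution is estimating such homographies one by one via Lucas-Kanade.
import cv2
import numpy as np

def synthesize_shake_blur(sharp, homographies):
    h, w = sharp.shape[:2]
    warped = [cv2.warpPerspective(sharp, H, (w, h)) for H in homographies]
    return np.mean(warped, axis=0).astype(sharp.dtype)

sharp = (np.random.rand(120, 160) * 255).astype(np.uint8)
# Small in-plane rotations about the image center stand in for a shake path.
Hs = [np.vstack([cv2.getRotationMatrix2D((80, 60), float(deg), 1.0), [0, 0, 1]])
      for deg in np.linspace(-2, 2, 9)]
blurred = synthesize_shake_blur(sharp, Hs)
print(blurred.shape, blurred.dtype)  # (120, 160) uint8
```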

19.
Robot motion controllers (especially for robots with non-Cartesian kinematics) must meet strong real-time demands, and in industrial use they are primarily designed to work without sensor feedback. Within the controller, different coordinate levels are necessary, ranging from the Cartesian world down to the realization of motions by the joints. Connecting a sensor at the Cartesian level results in a long reaction time, which is often insufficient either for the data rate of the sensor or for the demands of the application. This paper describes a method whose advantages come from integrating the sensor in parallel with the robot motion control: great modularity through decentralization, fast real-time reactions through parallel processing of the sensor data, the ability to implement various application-specific control algorithms for the sensor data, relatively slow communication with the robot motion control at the Cartesian level, and fast, immediate sensor influence on the robot joints. Since sensor corrections are performed using a differential method (the inverse Jacobian matrix), the fast corrections must be limited in magnitude. Nevertheless, corrections to robot motion paths can reach any value by overlaying the fast corrections with additive slow correction values.
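The differential correction described at the end can be sketched for a toy arm: a small Cartesian correction dx maps to joint increments through the inverse Jacobian, clamped because the linearization only holds locally. The 2-link planar kinematics and limits below are illustrative, not the paper's setup.

```python
# Minimal sketch of the fast joint-level sensor correction: solve J(q) dq = dx
# and clamp dq, since the differential (inverse-Jacobian) method is only valid
# for small steps; larger path corrections come from the slower Cartesian loop.
import numpy as np

def jacobian_2link(q, l1=0.5, l2=0.4):
    """Jacobian of a planar 2-link arm at joint angles q = (q1, q2)."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def fast_joint_correction(q, dx, max_step=0.05):
    """Map a Cartesian sensor correction dx to limited joint increments."""
    dq = np.linalg.solve(jacobian_2link(q), dx)
    return np.clip(dq, -max_step, max_step)  # stay in the differential regime

q = np.array([0.4, 0.8])
print(fast_joint_correction(q, np.array([0.01, -0.02])))
```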

20.
This paper deals with a motion control system for a space robot with a manipulator. Many motion controllers require the positions of the robot body and the manipulator hand with respect to an inertial coordinate system, and a camera-based visual sensor is frequently used to measure them. However, there are two difficulties in such measurements. The first is that the camera is mounted on the robot body, and hence it is difficult to directly measure the position of the robot body itself. The second is that the sampling period of a vision system with a general-purpose camera is much longer than that of a typical servo system. In this paper, we develop an adaptive state observer that overcomes both difficulties. To investigate its performance, we design a motion control system that combines the observer with a PD control input, and we conduct numerical simulations for the control system. Simulation results demonstrate the effectiveness of the proposed observer.
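A toy sketch of the timing structure the abstract describes: a fast PD loop acts on the observer's state estimate, while camera measurements correct that estimate only at the much slower vision rate. The double-integrator plant, gains, and rates are our illustrative assumptions; the paper's observer is adaptive and the dynamics are those of a space robot.

```python
# Hedged sketch: PD control on an estimated state with slow vision updates.
# A 1 kHz servo loop uses the observer's prediction between ~30 Hz camera
# measurements, which arrive as innovation corrections.
import numpy as np

dt, vision_period = 0.001, 0.033   # 1 kHz servo loop, ~30 Hz camera (assumed)
kp, kd = 25.0, 8.0                 # illustrative PD gains
L = 0.6                            # illustrative observer correction gain
x, v = 1.0, 0.0                    # true state (unit-mass double integrator)
x_hat, v_hat = 0.0, 0.0            # observer estimate

for k in range(3000):
    u = -kp * x_hat - kd * v_hat           # PD input computed on the *estimate*
    v += u * dt; x += v * dt               # plant integration
    v_hat += u * dt; x_hat += v_hat * dt   # observer prediction between images
    if k % int(vision_period / dt) == 0:   # slow camera measurement arrives
        x_hat += L * (x - x_hat)           # innovation correction
print(round(x, 4))  # state regulated near 0 despite slow vision feedback
```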
