Similar literature
20 similar documents found
1.
We demonstrate how 3D head tracking and pose estimation can be achieved effectively and efficiently from noisy RGB-D sequences. Our proposal leverages a random forest framework designed to regress the 3D head pose at every frame in a temporal tracking manner. One peculiarity of the algorithm is that it jointly exploits (1) a generic training dataset of 3D head models, which is learned once offline, and (2) an online refinement with subject-specific 3D data, which helps the tracker withstand slight facial deformations and adapt its forest to the specific characteristics of an individual subject. The combination of these two components allows our algorithm to remain robust even under extreme poses, where the user's face is no longer visible in the image. Finally, we also propose another solution that utilizes a multi-camera system, such that the data simultaneously acquired from multiple RGB-D sensors helps the tracker handle challenging conditions that affect a subset of the cameras. Notably, the proposed multi-camera framework yields real-time performance of approximately 8 ms per frame given six cameras and one CPU core, and scales linearly to 30 fps with 25 cameras.
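
The two timing figures quoted above can be checked against each other with a little arithmetic. The sketch below assumes that "scales linearly" means the per-frame cost is proportional to the number of cameras; that reading, and the function names, are assumptions made for illustration and are not taken from the paper.

```python
def per_frame_ms(num_cameras, ms_at_ref=8.0, ref_cameras=6):
    """Per-frame cost extrapolated linearly from the quoted reference point
    (about 8 ms for six cameras). Proportional scaling is an assumption."""
    return ms_at_ref * num_cameras / ref_cameras


def frames_per_second(ms_per_frame):
    """Convert a per-frame cost in milliseconds into a frame rate."""
    return 1000.0 / ms_per_frame


cost_25 = per_frame_ms(25)            # roughly 33.3 ms per frame
rate_25 = frames_per_second(cost_25)  # roughly 30 fps
print(round(cost_25, 1), round(rate_25, 1))
```

Under this assumption, 25 cameras cost about 33.3 ms per frame, which matches the quoted 30 fps.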

2.
Recently, technologies such as face detection, facial landmark localisation, and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as “in-the-wild”). This is partially attributed to the fact that comprehensive “in-the-wild” benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not yet been thoroughly evaluated is deformable face tracking “in-the-wild”. Until now, performance has mainly been assessed qualitatively, by visually inspecting the results of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300 VW benchmark. We evaluate many different architectures, focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model-free tracking plus generic facial landmark localisation, and (c) hybrid approaches using state-of-the-art face detection, model-free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.

3.
In this paper, a landmark selection and tracking approach is presented for mobile robot navigation in natural environments, using textural distinctiveness-based saliency detection and spatial information acquired from stereo data. The presented method focuses on achieving high tracking robustness rather than self-positioning accuracy. The landmark selection method is designed to select a small number of the most salient feature points in a wide variety of sparse unknown environments to ensure successful matching. Landmarks are selected by an iterative algorithm from a textural distinctiveness-based saliency map extended with spatial information, where a repulsive potential field is created around the position of each already selected landmark so that landmarks are better distributed, which increases robustness. The template matching of landmarks is aided by visual odometry-based motion estimation. Other robustness-increasing strategies include estimating landmark positions with unscented Kalman filters as well as from surrounding landmarks. Experimental results show that the introduced method is robust and suitable for natural environments.
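
The iterative selection with a repulsive field can be illustrated with a minimal sketch. It works on 2D points rather than stereo data, and the Gaussian-shaped penalty, the `strength` and `radius` parameters, and the function name are all hypothetical choices for illustration; the abstract does not specify the form of the potential field.

```python
import math


def select_landmarks(saliency, count, strength=1.0, radius=2.0):
    """Greedily pick `count` landmarks from {(x, y): saliency_score}.

    After each pick, the scores of the remaining candidates are lowered by a
    repulsive penalty that decays with distance from the chosen landmark, so
    later picks spread out instead of clustering around one salient spot.
    """
    scores = dict(saliency)
    chosen = []
    for _ in range(min(count, len(scores))):
        best = max(scores, key=scores.get)
        chosen.append(best)
        del scores[best]
        for point in scores:
            distance = math.dist(point, best)
            # Assumed Gaussian-shaped repulsion; not taken from the paper.
            scores[point] -= strength * math.exp(-(distance / radius) ** 2)
    return chosen


candidates = {(0, 0): 1.0, (1, 0): 0.95, (10, 0): 0.6}
chosen = select_landmarks(candidates, count=2)
print(chosen)  # → [(0, 0), (10, 0)]
```

Without the penalty, the second pick would be the nearly identical neighbour at (1, 0); with it, the more distant candidate wins.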

4.
This paper presents a robust airborne 3D Visual Simultaneous Localization and Mapping (VSLAM) solution based on a stereovision system. We propose three innovative contributions to Airborne VSLAM. The first is the development of an alternative data fusion scheme based on nonlinear H∞ filtering. This scheme is based on a 3D vision observation model and avoids issues linked with classical Extended Kalman Filtering (EKF) techniques, such as linearization errors, the initialization problem and noise statistics assumptions. The second contribution is a consistency and observability analysis for Airborne VSLAM. The third contribution is a new approach to map management, based on the k-nearest landmark concept, which allows efficient loop closure detection and map building. This approach considerably reduces the complexity of our Airborne VSLAM algorithm, which becomes independent of the number of landmarks in the map. Simulation results show the efficiency of the proposed Airborne VSLAM solution, which compares favourably with other techniques.

5.
In this paper, we propose a high-speed vision system that can be applied to real-time face tracking at 500 fps using GPU acceleration of a boosting-based face tracking algorithm. By assuming a small image displacement between frames, which is a property of high-frame-rate vision, we develop an improved boosting-based face tracking algorithm for fast face tracking by enhancing the Viola–Jones face detector. In the improved algorithm, face detection is accelerated by reducing the number of window searches for Haar-like features, and, by introducing skin color extraction into the boosting-based face detector, the tracked face pattern can be localized pixel-wise even when the window is sparsely scanned for a larger face pattern. The improved boosting-based face tracking algorithm is implemented on a GPU-based high-speed vision platform, and face tracking can be executed in real time at 500 fps for an 8-bit color image of 512 × 512 pixels. To verify the effectiveness of the developed face tracking system, we install it on a two-axis mechanical active vision system and perform several experiments on tracking face patterns.

6.
In depth map generation algorithms, the parameter settings that yield an accurate disparity map estimate are usually chosen empirically or based on unplanned experiments. An algorithm's performance is measured by the distance between its results and the ground truth, following Middlebury's standards. This work presents a systematic statistical approach, including exploratory data analyses on over 14000 images and designs of experiments using 31 depth maps, to measure the relative influence of the parameters and to fine-tune them based on the number of bad pixels. The implemented methodology improves the performance of adaptive-weight-based dense depth map algorithms. As a result, the algorithm improves from 16.78% to 14.48% bad pixels using a classical exploratory data analysis of over 14000 existing images, while designs of computer experiments with 31 runs yielded even better performance, lowering bad pixels from 16.78% to 13%.
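
The bad-pixel measure mentioned above can be sketched as follows. This is a minimal illustration that assumes a disparity error threshold of 1.0 and marks unknown ground-truth pixels with `None`; both conventions, and the function name, are assumptions rather than details stated in the abstract.

```python
def bad_pixel_percent(disparity, ground_truth, threshold=1.0):
    """Percentage of valid pixels whose disparity error exceeds `threshold`.

    Both maps are lists of rows. Ground-truth entries set to None are treated
    as unknown and skipped (an assumed convention for this sketch).
    """
    total = 0
    bad = 0
    for row_est, row_gt in zip(disparity, ground_truth):
        for est, gt in zip(row_est, row_gt):
            if gt is None:
                continue
            total += 1
            if abs(est - gt) > threshold:
                bad += 1
    return 100.0 * bad / total if total else 0.0


estimate = [[10, 12, 7], [5, 5, 9]]
truth = [[10, 10, 7], [5, None, 5]]
rate = bad_pixel_percent(estimate, truth)
print(rate)  # → 40.0
```

Here five pixels have known ground truth and two of them are off by more than the threshold, giving 40% bad pixels.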

7.
Smartphone-based pedestrian tracking in indoor corridor environments   Cited by: 1 (self-citations: 0; citations by others: 1)
As the use of smartphones spreads rapidly, user localization becomes an important issue for providing diverse location-based services (LBS). While tracking users in outdoor environments is easily done with GPS, indoor tracking is not trivial. One common technique for indoor user tracking is to employ inertial sensors, but such a system must be able to handle noisy sensors, which would normally lead to cumulative location errors. To reduce these errors, additional infrastructure has often been deployed. In addition, previous work has used highly accurate sensors or sensors strapped to the body. This paper presents a stand-alone pedestrian tracking system that uses only the magnetometer and accelerometer in a smartphone, for indoor corridor environments that are normally laid out in a perpendicular design. Our system provides reasonably accurate pedestrian locations without additional infrastructure or sensors. The experimental results show that the location error is less than approximately 7 m, which is considered adequate for indoor LBS applications.
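
One way a perpendicular corridor layout could limit heading drift is to snap each noisy magnetometer heading to the nearest corridor direction before applying a step. The sketch below is a hypothetical illustration of that idea, not the authors' method: the fixed step length, the snapping rule and all names are assumptions, and step detection from the accelerometer is reduced to one heading sample per detected step.

```python
import math


def snap_heading(heading_deg, corridor_offset_deg=0.0):
    """Snap a heading to the nearest of four perpendicular corridor axes.

    `corridor_offset_deg` is the assumed orientation of the building's main
    corridor relative to magnetic north.
    """
    relative = (heading_deg - corridor_offset_deg) % 360.0
    snapped = (round(relative / 90.0) % 4) * 90.0
    return (snapped + corridor_offset_deg) % 360.0


def dead_reckon(start, step_headings_deg, step_length_m=0.7):
    """Advance one fixed-length step per heading sample.

    Headings are measured clockwise from north, with north along +y.
    """
    x, y = start
    for heading in step_headings_deg:
        angle = math.radians(snap_heading(heading))
        x += step_length_m * math.sin(angle)
        y += step_length_m * math.cos(angle)
    return x, y


# Two steps heading roughly north, then two steps heading roughly east.
position = dead_reckon((0.0, 0.0), [3.0, 358.0, 87.0, 95.0])
print(position)
```

With these four samples the estimate ends about 1.4 m north and 1.4 m east of the start, regardless of the few degrees of heading noise.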

8.
The Active Appearance Model (AAM) is an algorithm for fitting a generative model of object shape and appearance to an input image. AAM allows accurate, real-time tracking of human faces in 2D and can be extended to track faces in 3D by constraining its fitting with a linear 3D morphable model. Unfortunately, this AAM-based 3D tracking does not provide adequate accuracy and robustness, as we show in this paper. We introduce a new constraint into AAM fitting that uses depth data from a commodity RGBD camera (Kinect). This addition significantly reduces 3D tracking errors. We also describe how to initialize the 3D morphable face model used in our tracking algorithm by computing the user's face shape parameters from a batch of tracked frames. The described face tracking algorithm is used in Microsoft's Kinect system.

9.
We present a novel automatic method for high-resolution, non-rigid dense 3D point tracking. High-quality dense point clouds of non-rigid geometry moving at video speeds are acquired using a phase-shifting structured light ranging technique. To use such data for the temporal study of subtle motions such as those seen in facial expressions, an efficient non-rigid 3D motion tracking algorithm is needed to establish inter-frame correspondences. The novelty of this paper is the development of an algorithmic framework for 3D tracking that unifies tracking of intensity and geometric features, using harmonic maps with added feature correspondence constraints. While previous uses of harmonic maps provided only global alignment, the proposed introduction of interior feature constraints also allows non-rigid deformations to be tracked accurately. The harmonic map between two topological disks is a diffeomorphism with minimal stretching energy and bounded angle distortion. The map is stable, insensitive to resolution changes and robust to noise. Due to the strong implicit and explicit smoothness constraints imposed by the algorithm and the high-resolution data, the resulting registration/deformation field is smooth and continuous and gives dense one-to-one inter-frame correspondences. Our method is validated through a series of experiments demonstrating its accuracy and efficiency.

10.
Ergonomics, 2012, 55(12): 2057-2066
Marker-less 2D video tracking was studied as a practical means to measure upper limb kinematics for ergonomics evaluations. Hand activity level (HAL) can be estimated from speed and duty cycle. Accuracy was measured using a cross-correlation template-matching algorithm for tracking a region of interest on the upper extremities. Ten participants performed a paced load transfer task while varying HAL (2, 4, and 5) and load (2.2 N, 8.9 N and 17.8 N). Speed and acceleration measured from 2D video were compared against ground truth measurements using 3D infrared motion capture. The median absolute difference between 2D video and 3D motion capture was 86.5 mm/s for speed and 591 mm/s² for acceleration, and less than 93 mm/s for speed and 656 mm/s² for acceleration when camera pan and tilt were within ±30 degrees. Single-camera 2D video had sufficient accuracy (< 100 mm/s) for evaluating HAL.

Practitioner Summary: This study demonstrated that, compared against 3D motion capture for a simulated repetitive motion task, 2D video tracking had sufficient accuracy to measure HAL for ascertaining the American Conference of Governmental Industrial Hygienists Threshold Limit Value® for repetitive motion when the camera is located within ±30 degrees of the plane of motion.
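
The comparison described above rests on deriving speed and acceleration from tracked positions and then summarizing the disagreement with a reference. A minimal sketch using finite differences is given below; the 30 fps frame rate, the assumption that positions are already calibrated to millimetres, the sample values and the function names are all illustrative and not taken from the study.

```python
import math
import statistics


def speeds_mm_s(points_mm, fps):
    """Speed between consecutive tracked positions (finite difference)."""
    return [math.dist(a, b) * fps for a, b in zip(points_mm, points_mm[1:])]


def accelerations_mm_s2(speeds, fps):
    """Change in speed between consecutive samples (finite difference)."""
    return [(b - a) * fps for a, b in zip(speeds, speeds[1:])]


def median_abs_difference(measured, reference):
    """Median of the absolute sample-by-sample differences."""
    return statistics.median(abs(m - r) for m, r in zip(measured, reference))


track = [(0.0, 0.0), (3.0, 4.0), (9.0, 12.0)]     # hypothetical positions in mm
video_speed = speeds_mm_s(track, fps=30)           # → [150.0, 300.0]
video_accel = accelerations_mm_s2(video_speed, 30)  # → [4500.0]
error = median_abs_difference(video_speed, [140.0, 320.0])  # → 15.0
print(video_speed, video_accel, error)
```

Raw finite differences amplify tracking noise, so a practical pipeline would usually smooth the trajectory first; that step is omitted here.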

11.
While research on articulated human motion and pose estimation has progressed rapidly in the last few years, there has been no systematic quantitative evaluation of competing methods to establish the current state of the art. We present data obtained using a hardware system that is able to capture synchronized video and ground-truth 3D motion. The resulting HumanEva datasets contain multiple subjects performing a set of predefined actions with a number of repetitions. On the order of 40,000 frames of synchronized motion capture and multi-view video (resulting in over one quarter million image frames in total) were collected at 60 Hz with an additional 37,000 time instants of pure motion capture data. A standard set of error measures is defined for evaluating both 2D and 3D pose estimation and tracking algorithms. We also describe a baseline algorithm for 3D articulated tracking that uses a relatively standard Bayesian framework with optimization in the form of Sequential Importance Resampling and Annealed Particle Filtering. In the context of this baseline algorithm we explore a variety of likelihood functions, prior models of human motion and the effects of algorithm parameters. Our experiments suggest that image observation models and motion priors play important roles in performance, and that in a multi-view laboratory environment, where initialization is available, Bayesian filtering tends to perform well. The datasets and the software are made available to the research community. This infrastructure will support the development of new articulated motion and pose estimation algorithms, will provide a baseline for the evaluation and comparison of new methods, and will help establish the current state of the art in human pose estimation and tracking.
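
Sequential Importance Resampling, named above as part of the baseline, can be shown on a toy problem. The sketch below tracks a single scalar state rather than an articulated body pose; the random-walk motion prior, the Gaussian likelihood, the noise levels and the function name are assumptions for illustration, and the annealing stage is not shown.

```python
import math
import random


def sir_step(particles, observation, rng, motion_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle for a 1D state."""
    # Predict: diffuse each particle with an assumed random-walk motion prior.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: assumed Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(42)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
for _ in range(20):
    particles = sir_step(particles, observation=3.0, rng=rng)
estimate = sum(particles) / len(particles)
print(estimate)
```

After repeated observations near 3.0, the particle mean settles close to that value even though the initial particles were spread uniformly over a wide range.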

12.
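Existing 3D map-building algorithms mostly emphasize mapping accuracy, which leads to low mapping efficiency and high cost. To improve map-building efficiency, an improved algorithm is proposed that recognizes and extracts cylinders from landmark objects and uses their axis features as landmarks to build a simplified map. Based on the random sample consensus (RANSAC) algorithm, candidate cylinder models are generated for the target model to be extracted from the point cloud and matched against it. The optimal threshold for the iterative process is obtained by computing the homography matrix and its error function, which yields the best-matching cylinder model and improves extraction efficiency. The extracted cylinder axis then describes the landmark's spatial position, and the cylinder radius describes its spatial geometry. Simulation comparisons with the conventional RANSAC method show that the approach effectively simplifies the map and lays a foundation for subsequent landmark recognition and path planning.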
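
The RANSAC idea behind the cylinder extraction can be sketched in a simplified form. The example below fits a circle to 2D points, which corresponds to the cross-section of a cylinder whose axis is perpendicular to the plane. It uses a fixed inlier tolerance and plain RANSAC, so it does not reproduce the homography-based threshold selection described in the abstract; all names and parameter values are illustrative assumptions.

```python
import math
import random


def circle_from_points(a, b, c):
    """Circle (cx, cy, r) through three points, or None if they are collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)


def ransac_circle(points, rng, iterations=200, tolerance=0.05):
    """Return the sampled circle with the most inliers and its inlier count."""
    best_model, best_inliers = None, 0
    for _ in range(iterations):
        model = circle_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        inliers = sum(1 for x, y in points
                      if abs(math.hypot(x - cx, y - cy) - r) <= tolerance)
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# 40 points on a circle (centre (2, -1), radius 1.5) plus 10 distant outliers.
on_circle = [(2.0 + 1.5 * math.cos(2 * math.pi * k / 40),
              -1.0 + 1.5 * math.sin(2 * math.pi * k / 40)) for k in range(40)]
outliers = [(5.0 + i, 5.0 + (i * i % 7)) for i in range(10)]
model, inlier_count = ransac_circle(on_circle + outliers, random.Random(0))
print(model, inlier_count)
```

The recovered centre and radius play the roles that the abstract assigns to the cylinder axis (landmark position) and cylinder radius (landmark geometry).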

13.
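In indoor spaces whose layout changes frequently or that contain many dynamic obstacles, global localization of mobile robots still faces considerable practical challenges. For such scenes, a new, easily deployed global localization system for indoor robots based on artificial landmarks is implemented. The system attaches artificial landmarks to the ceiling, where they are unlikely to be occluded, as references, and achieves stable global localization with only a single camera. The system is divided by function into two processes: map building and global localization. During map building, the system takes the pose estimates output by a laser SLAM algorithm as the reference and uses the camera's observations of the landmarks to automatically estimate the landmark poses in the global coordinate frame, building an artificial landmark map. During global localization, the system estimates the pose in real time from the camera's observations of landmarks with known poses in the map, combined with pre-integration information from fused odometry and IMU data. Extensive experiments show that the localization error of a robot operating within the area covered by the system stays within 10 cm, and that poses are output in real time throughout operation, meeting the requirements of typical practical global localization applications for indoor mobile robots.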

14.
We present a simultaneous localization and mapping (SLAM) algorithm that uses Bézier curves as static landmark primitives rather than feature points. Our approach allows us to estimate the full six degrees of freedom pose of a robot while providing a structured map that can be used to assist a robot in motion planning and control. We demonstrate how to reconstruct the three-dimensional (3D) location of curve landmarks from a stereo pair and how to compare the 3D shape of curve landmarks between chronologically sequential stereo frames to solve the data association problem. We also present a method to combine curve landmarks for mapping purposes, resulting in a map with a continuous set of curves that contain fewer landmark states than conventional point-based SLAM algorithms. We demonstrate our algorithm's effectiveness with numerous experiments, including comparisons to existing state-of-the-art SLAM algorithms.

15.
In this paper, the first application of a unique 3D sensor to sequential 3D map building in unknown cluttered urban search and rescue (USAR) environments is proposed. The sensor utilizes a digital fringe projection and phase-shifting technique to provide real-time 2D and 3D sensory information of the environment. The proposed sensor differs from current technologies in that high-resolution 3D information of rubble-filled environments can be acquired from a single sensor at a speed of 30 frames per second (fps). Furthermore, we propose the development of a novel, robust and reliable landmark identification technique that utilizes both 2D and 3D depth images taken by the sensor for 3D mapping. Preliminary experiments show the potential of the real-time 3D sensory system and landmark identification scheme for robotic 3D mapping in unknown cluttered USAR-like environments.

16.
This paper presents a robust framework for tracking complex objects in video sequences. The multiple hypothesis tracking (MHT) algorithm reported in (IEEE Trans. Pattern Anal. Mach. Intell. 18(2) (1996)) is modified to accommodate high-level representations (2D edge maps, 3D models) of objects for tracking. The framework exploits the advantages of the MHT algorithm, which is capable of resolving data association/uncertainty, and integrates it with object matching techniques to provide robust behavior while tracking complex objects. To track objects in 2D, a 4D feature is used to represent edge/line segments, which are tracked using MHT. In many practical applications, 3D models provide more information about the object's pose (i.e., rotation information in the transformation space), which cannot be recovered using 2D edge information. Hence, a 3D model-based object tracking algorithm is also presented. A probabilistic Hausdorff image matching algorithm is incorporated into the framework to determine the geometric transformation that best maps the model features onto their corresponding features in the image plane. A 3D model of the object is used to constrain the tracker to operate in a consistent manner. Experimental results on real and synthetic image sequences are presented to demonstrate the efficacy of the proposed framework.
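
For orientation, the classical Hausdorff distance that underlies Hausdorff-based matching can be computed as in the sketch below. This is the plain deterministic form on 2D point sets, not the probabilistic variant used in the paper, and the point sets and function names are made up for illustration.

```python
import math


def directed_hausdorff(source, target):
    """Largest distance from any source point to its nearest target point."""
    return max(min(math.dist(p, q) for q in target) for p in source)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


model_points = [(0, 0), (1, 0)]
image_points = [(0, 0), (1, 0), (1, 3)]
print(directed_hausdorff(model_points, image_points))  # → 0.0
print(hausdorff(model_points, image_points))           # → 3.0
```

The directed distance from model to image is zero because every model point has an exact match, while the stray image point at (1, 3) drives the symmetric distance to 3. A matcher would search for the transformation of the model points that minimizes such a measure.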

17.
We present a new algorithm for tracking multiple 3D objects that offers robustness, real-time processing and fast object registration. Many augmented reality applications need to track 3D objects using natural features in real time and with high accuracy, and to register a target object within a few seconds. The prevalent object tracking algorithm uses FERN for feature extraction, which takes a long time to register and learn a target object in order to achieve high-quality performance. Our method provides not only high accuracy but also a fast target object registration time of about 0.3 ms in the same environment, together with real-time processing. These features are achieved using SURF, ROI, double robust filtering and optimized multi-core parallelization. With our method, multiple 3D objects can be tracked quickly and with high accuracy.

18.
Monte Carlo localization (MCL) uses a reference map to estimate the pose of a ground robot in outdoor environments. However, MCL performs poorly when it uses an elevation map built by an aerial mapping system, because three-dimensional (3D) environments are observed differently from the air and the ground and such an elevation map cannot represent outdoor environments in detail. Although other types of maps have been proposed to improve localization performance, an elevation map is still used as the main reference map in some applications. Therefore, we propose a new feature to improve localization performance with an elevation map. This feature is extracted from 3D range data and represents the part of an object that can be commonly observed from both the air and the ground. As a result, this feature is likely to be accurately matched with an elevation map, and its average error is much smaller than that of unclassified sensing data. Experimental results in real environments show that the success rate of global localization increased and the error of local tracking decreased. Thus, the proposed feature can be very useful for localizing an outdoor ground robot when an elevation map is used as a reference map. © 2010 Wiley Periodicals, Inc.

19.
Landing an autonomous spacecraft within 100 m of a mapped target is a navigation challenge in planetary exploration. Vision-based approaches attempt to pair 2D features detected in camera images with 3D mapped landmarks to reach the required precision. This paper presents a vision-aided inertial navigation system for pinpoint planetary landing called LION (Landing Inertial and Optical Navigation). It can fly over any type of terrain, whatever the topography. LION uses measurements from a novel image-to-map matcher to update, through a tight data fusion scheme, the state of an extended Kalman filter propagated with inertial data. The image processing uses the state and covariance predictions from the filter to determine the regions and extraction scales in which to search for non-ambiguous landmarks in the image. The image scale management process operates per landmark and greatly improves the repeatability rate between the map and descent images. A lunar-representative optical test bench called Visilab was also designed to test LION. The observability of absolute navigation performance in Visilab is evaluated with a model developed specifically for this purpose. Finally, the system's performance is evaluated at a number of altitudes, along with its robustness to off-nadir camera angle, illumination changes, a different map generation process and non-planar topography. The error converges to a mean of 4 m and a 3-RMS dispersion of 47 m at 3 km of altitude on the test setup at scale.

20.
We present a novel method for planning coverage paths for inspecting complex structures on the ocean floor using an autonomous underwater vehicle (AUV). Our method initially uses a 2.5-dimensional (2.5D) prior bathymetric map to plan a nominal coverage path that allows the AUV to pass its sensors over all points on the target area. The nominal path uses a standard mowing-the-lawn pattern in effectively planar regions, while in regions with substantial 3D relief it follows horizontal contours of the terrain at a given offset distance. We then go beyond previous approaches in the literature by considering the vehicle's state uncertainty rather than relying on the unrealistic assumption of an idealized path execution. Toward that end, we present a replanning algorithm based on a stochastic trajectory optimization that reshapes the nominal path to cope with the actual target structure perceived in situ. The replanning algorithm runs onboard the AUV in real time during the inspection mission, adapting the path according to the measurements provided by the vehicle's range-sensing sonars. Furthermore, we propose a pipeline of state-of-the-art surface reconstruction techniques we apply to the data acquired by the AUV to obtain 3D models of the inspected structures that show the benefits of our planning method for 3D mapping. We demonstrate the efficacy of our method in experiments at sea using the GIRONA 500 AUV, where we cover part of a breakwater structure in a harbor and an underwater boulder rising from 40 m up to 27 m depth.
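
The mowing-the-lawn pattern used for the planar regions can be sketched as a list of waypoints over a rectangle. This is only the textbook back-and-forth sweep over an axis-aligned area; the rectangle bounds, track spacing and function name are illustrative assumptions, and the contour-following and replanning parts of the method are not covered.

```python
import math


def lawnmower_path(x_min, x_max, y_min, y_max, spacing):
    """Back-and-forth waypoints covering an axis-aligned rectangle.

    Tracks run parallel to the x axis and are `spacing` apart in y; the sweep
    direction alternates so consecutive tracks join at the same side.
    """
    tracks = int(math.floor((y_max - y_min) / spacing + 1e-9)) + 1
    path = []
    for i in range(tracks):
        y = y_min + i * spacing
        row = [(x_min, y), (x_max, y)]
        if i % 2 == 1:
            row.reverse()  # sweep back in the opposite direction
        path.extend(row)
    return path


waypoints = lawnmower_path(0, 10, 0, 4, spacing=2)
print(waypoints)
# → [(0, 0), (10, 0), (10, 2), (0, 2), (0, 4), (10, 4)]
```

In practice the spacing would be chosen from the sensor footprint so that adjacent tracks overlap slightly.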
