Similar Literature
20 similar documents found (search time: 31 ms)
1.
In this paper, we introduce a method to estimate the object’s pose from multiple cameras. We focus on direct estimation of the 3D object pose from 2D image sequences. Scale-Invariant Feature Transform (SIFT) is used to extract corresponding feature points from adjacent images in the video sequence. We first demonstrate that centralized pose estimation from the collection of corresponding feature points in the 2D images from all cameras can be obtained as a solution to a generalized Sylvester’s equation. We subsequently derive a distributed solution to pose estimation from multiple cameras and show that it is equivalent to the solution of the centralized pose estimation based on Sylvester’s equation. Specifically, we rely on collaboration among the multiple cameras to provide an iterative refinement of the independent solution to pose estimation obtained for each camera based on Sylvester’s equation. The proposed approach to pose estimation from multiple cameras relies on all of the information available from all cameras to obtain an estimate at each camera even when the image features are not visible to some of the cameras. The resulting pose estimation technique is therefore robust to occlusion and sensor errors from specific camera views. Moreover, the proposed approach does not require matching feature points among images from different camera views nor does it demand reconstruction of 3D points. Furthermore, the computational complexity of the proposed solution grows linearly with the number of cameras. Finally, computer simulation experiments demonstrate the accuracy and speed of our approach to pose estimation from multiple cameras.
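The abstract above casts centralized multi-camera pose estimation as a generalized Sylvester's equation. The paper's exact generalized form is not reproduced here, but the core linear-algebra step can be illustrated on the classic Sylvester equation A X + X B = Q; the solver below is a hypothetical dense sketch via Kronecker vectorization, not the authors' distributed algorithm.

```python
import numpy as np

def solve_sylvester_vec(A, B, Q):
    """Solve A X + X B = Q by vectorization:
    (I kron A + B^T kron I) vec(X) = vec(Q).
    A dense sketch, fine only for small toy problems."""
    m, n = Q.shape
    M = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(m))
    x = np.linalg.solve(M, Q.flatten(order="F"))
    return x.reshape((m, n), order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)  # shift spectra so A and -B share no eigenvalue
B = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)
X_true = rng.standard_normal((3, 3))
Q = A @ X_true + X_true @ B

X = solve_sylvester_vec(A, B, Q)
print(np.allclose(X, X_true))
```

The vectorized system is solvable whenever no eigenvalue of A is the negative of an eigenvalue of B, which the diagonal shift above guarantees with high probability for this toy data.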

2.
Pose estimation has long been a key problem in 3D reconstruction. To guarantee real-time performance under the limited computing resources of mobile devices and to improve the accuracy of trajectory computation, a tightly coupled real-time pose optimization method for mobile platforms is proposed. First, image data and motion-sensor data are acquired and preprocessed (feature extraction, pre-integration, and so on); then the reprojection error and the inertial-sensor error are computed under epipolar geometry constraints; finally, the pose trajectory is computed by jointly optimizing the weighted errors. The tightly coupled strategy can make effective use of the image...

3.
席志红  韩双全  王洪旭 《计算机应用》2019,39(10):2847-2851
To address the problem that dynamic objects degrade pose estimation in indoor Simultaneous Localization and Mapping (SLAM) systems, a semantic-segmentation-based SLAM system for dynamic scenes is proposed. After the camera captures an image, PSPNet (Pyramid Scene Parsing Network) first performs semantic segmentation on it; image feature points are then extracted, those falling on dynamic objects are discarded, and the camera pose is estimated from the remaining static feature points; finally, a semantic point-cloud map and a semantic octree map are constructed. Repeated comparative tests on five dynamic sequences from public datasets show that, relative to a SLAM system using the SegNet network, the standard deviation of the absolute trajectory error of the proposed system decreases by 6.9%-89.8%, and in highly dynamic scenes the standard deviations of translational and rotational drift improve by up to 73.61% and 72.90%, respectively. The results show that the improved system significantly reduces pose estimation error in dynamic scenes and estimates the camera pose accurately in such scenes.
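The entry above discards feature points that fall on dynamically labelled pixels before pose estimation. A minimal sketch of that filtering step, assuming the segmentation network has already produced a boolean dynamic-object mask (PSPNet itself is not reproduced here):

```python
import numpy as np

def filter_static_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints whose pixel falls outside the dynamic-object mask.

    keypoints    : (N, 2) integer array of (row, col) pixel coordinates
    dynamic_mask : (H, W) boolean array, True where segmentation labels a dynamic class
    """
    rows, cols = keypoints[:, 0], keypoints[:, 1]
    keep = ~dynamic_mask[rows, cols]
    return keypoints[keep]

# Toy 4x4 image: the right half is labelled as a dynamic object.
mask = np.zeros((4, 4), dtype=bool)
mask[:, 2:] = True
kps = np.array([[0, 0], [1, 3], [2, 1], [3, 2]])
static = filter_static_keypoints(kps, mask)
print(static)  # keeps [0, 0] and [2, 1]
```

The surviving static points would then feed a standard PnP or epipolar pose solver.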

4.
We approach mosaicing as a camera tracking problem within a known parameterized surface. From a video of a camera moving within a surface, we compute a mosaic representing the texture of that surface, flattened onto a planar image. Our approach works by defining a warp between images as a function of surface geometry and camera pose. Globally optimizing this warp to maximize alignment across all frames determines the camera trajectory, and the corresponding flattened mosaic image. In contrast to previous mosaicing methods which assume planar or distant scenes, or controlled camera motion, our approach enables mosaicing in cases where the camera moves unpredictably through proximal surfaces, such as in medical endoscopy applications.

5.
In this paper, we describe a real-time algorithm for computing the ego-motion of a vehicle relative to the road. The algorithm uses as input only those images provided by a single omnidirectional camera mounted on the roof of the vehicle. The front ends of the system are two different trackers. The first one is a homography-based tracker that detects and matches robust scale-invariant features that most likely belong to the ground plane. The second one uses an appearance-based approach and gives high-resolution estimates of the rotation of the vehicle. This planar pose estimation method has been successfully applied to videos from an automotive platform. We give an example of camera trajectory estimated purely from omnidirectional images over a distance of 400 m. For performance evaluation, the estimated path is superimposed onto a satellite image. Finally, we use image mosaicing to obtain a textured 2-D reconstruction of the estimated path.

6.
Fast generation of wide-field-of-view video panoramas
李庆忠  耿晓玲  王冰 《计算机工程》2009,35(22):170-172
To address the accumulation of registration errors in video image mosaicing, a fast video panorama generation method suited to observing large static scenes is proposed, including direct stitching algorithms for a camera working in one-dimensional single-strip scanning mode and in two-dimensional multi-strip scanning mode. Experimental results show that, in either scanning mode, the method stitches high-quality video panoramas quickly and accurately, effectively avoids the accumulation and propagation of local registration errors, and sidesteps the complexity and cost of general global registration algorithms.

7.
Monocular visual SLAM (simultaneous localization and mapping) applied to pose estimation for UAV aerial images suffers from scale ambiguity, trajectory drift caused by accumulated error in large scenes, and the fact that it yields only a relative pose in a local coordinate frame. To address these problems, a real-time pose estimation scheme for UAV aerial images is proposed. First, the visual images are tracked in real time; RTK (real-time kinematic) information is then introduced to obtain the transformation between the visual coordinate frame and the world frame, resolving the scale ambiguity and the trajectory drift; finally, a pose in the world frame is obtained. Since visual SLAM on a video stream processes redundant images and increases the burden of image storage, capture, and computation, the scheme instead computes poses from non-consecutively captured, low-overlap images. Experiments in real scenes show that the scheme is more accurate than the mainstream open-source frameworks ORB-SLAM2, DSO, and OpenMVG, and achieves a mean overall trajectory error within 10 cm.
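The entry above uses RTK information to fix the scale and map the visual trajectory into the world frame. The paper's exact alignment procedure is not specified here; a standard stand-in is the Umeyama least-squares similarity alignment, sketched below on a hypothetical toy trajectory with a known scale, rotation, and translation.

```python
import numpy as np

def umeyama_align(src, dst):
    """Least-squares similarity transform (s, R, t) with dst ~ s * R @ src + t.
    src, dst : (N, 3) corresponding trajectory points."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    A, B = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(B.T @ A / len(src))  # cross-covariance SVD
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:                 # guard against reflections
        D[2, 2] = -1
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / A.var(0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# Recover a known transform from exact correspondences.
rng = np.random.default_rng(0)
src = rng.standard_normal((20, 3))
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
s_true, t_true = 2.0, np.array([1.0, 2.0, 3.0])
dst = s_true * src @ R_true.T + t_true
s, R, t = umeyama_align(src, dst)
print(round(s, 3))
```

With noiseless correspondences the transform is recovered exactly; with real RTK/visual data it is a least-squares fit.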

8.
To address the high system requirements of face pose estimation and the difficulty of meeting real-time constraints on mobile phones, a real-time face pose estimation system for Android phones is implemented. First, a frontal face image and an image offset by a certain angle are captured by the camera, and a simple 3D face model is built with the Structure from Motion (SfM) algorithm. Then, feature points corresponding to the 3D face model are extracted from the live face image, and the face pose angles are estimated with the Pose from Orthography and Scaling with Iterations (POSIT) algorithm. Finally, the 3D face model is rendered on the phone screen in real time via OpenGL. Experimental results show that face pose can be detected and displayed in live video at 20 frames/s, close to the computer-side 3D face pose estimation algorithm based on affine correspondence, and that detection on large image sequences reaches 50 frames/s, satisfying the performance and real-time requirements of face pose detection on Android phones.

9.
In this paper, we present a pipeline for camera pose and trajectory estimation, and image stabilization and rectification for dense as well as wide baseline omnidirectional images. The proposed pipeline transforms a set of images taken by a single hand-held camera to a set of stabilized and rectified images augmented by the computed camera 3D trajectory and a reconstruction of feature points facilitating visual object recognition. The paper generalizes previous works on camera trajectory estimation done on perspective images to omnidirectional images and introduces a new technique for omnidirectional image rectification that is suited for recognizing people and cars in images. The performance of the pipeline is demonstrated on real image sequences acquired in urban as well as natural environments.

10.
刘辉  张雪波  李如意  苑晶 《控制与决策》2024,39(6):1787-1800
Laser SLAM (simultaneous localization and mapping) algorithms rely on environmental structural features for pose estimation and map construction; in scenes lacking such features, their pose estimation accuracy and robustness degrade, or the algorithms fail outright. Exploiting the fact that an inertial measurement unit (IMU) is independent of the environment while a camera depends on visual texture, a stereo-vision-aided laser-inertial SLAM algorithm is proposed to resolve the degeneration of pure laser SLAM when structural features are scarce. A stereo visual-inertial odometry algorithm provides a visual pose prior to the laser scan-matching module, and joint pose estimation then accounts for both visual constraints and laser structural-feature constraints. In addition, a combined strategy of complementary filtering and factor-graph optimization is proposed to align the laser odometry frame with the inertial frame; the laser poses are fused with IMU data in a factor graph to constrain the IMU bias and to supply fallback relative pose predictions for laser scan matching when visual odometry fails. To further improve global trajectory accuracy, a hybrid loop-closure detection strategy combining the iterative closest point (ICP) matching algorithm with image-feature matching is proposed, and 6-DOF pose-graph optimization significantly reduces odometry drift...
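The entry above pairs a complementary filter with factor-graph optimization for frame alignment. As an illustration of the filtering half only, here is a minimal 1-D complementary filter fusing a biased gyro with a noisy absolute angle measurement; the parameters (`dt`, `alpha`, the bias value) are illustrative assumptions, not the paper's.

```python
import numpy as np

def complementary_filter(gyro_rates, accel_angles, dt=0.01, alpha=0.98):
    """1-D complementary filter: trust the integrated gyro at high frequency
    and the absolute (e.g. accelerometer-derived) angle at low frequency."""
    angle = accel_angles[0]
    out = []
    for w, a in zip(gyro_rates, accel_angles):
        angle = alpha * (angle + w * dt) + (1.0 - alpha) * a
        out.append(angle)
    return np.array(out)

# A stationary sensor at 0.5 rad: the gyro reports a 0.01 rad/s bias,
# the absolute channel reports the true angle plus noise.
rng = np.random.default_rng(0)
n = 1000
angles = complementary_filter(np.full(n, 0.01),
                              0.5 + rng.normal(0.0, 0.05, n))
print(round(float(angles[-1]), 2))
```

The absolute channel pins down the slow drift from the gyro bias while the gyro smooths out the measurement noise, which is the same division of labor the paper exploits between sensors.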

11.
While research on articulated human motion and pose estimation has progressed rapidly in the last few years, there has been no systematic quantitative evaluation of competing methods to establish the current state of the art. We present data obtained using a hardware system that is able to capture synchronized video and ground-truth 3D motion. The resulting HumanEva datasets contain multiple subjects performing a set of predefined actions with a number of repetitions. On the order of 40,000 frames of synchronized motion capture and multi-view video (resulting in over one quarter million image frames in total) were collected at 60 Hz with an additional 37,000 time instants of pure motion capture data. A standard set of error measures is defined for evaluating both 2D and 3D pose estimation and tracking algorithms. We also describe a baseline algorithm for 3D articulated tracking that uses a relatively standard Bayesian framework with optimization in the form of Sequential Importance Resampling and Annealed Particle Filtering. In the context of this baseline algorithm we explore a variety of likelihood functions, prior models of human motion and the effects of algorithm parameters. Our experiments suggest that image observation models and motion priors play important roles in performance, and that in a multi-view laboratory environment, where initialization is available, Bayesian filtering tends to perform well. The datasets and the software are made available to the research community. This infrastructure will support the development of new articulated motion and pose estimation algorithms, will provide a baseline for the evaluation and comparison of new methods, and will help establish the current state of the art in human pose estimation and tracking.
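The baseline tracker above relies on Sequential Importance Resampling. A minimal 1-D SIR particle filter, with an assumed Gaussian motion prior and observation likelihood rather than the paper's image-based likelihoods, looks like this:

```python
import numpy as np

rng = np.random.default_rng(1)

def sir_step(particles, weights, observation, motion_std=0.5, obs_std=1.0):
    """One Sequential Importance Resampling step for a 1-D random-walk state."""
    # Propagate through the motion prior.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Reweight by the observation likelihood (Gaussian model).
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles, weights = particles[idx], np.full(len(particles), 1.0 / len(particles))
    return particles, weights

# Track a static state near 5.0 from noisy observations, starting broad.
particles = rng.normal(0.0, 5.0, size=2000)
weights = np.full(2000, 1.0 / 2000)
for obs in [5.1, 4.9, 5.0, 5.2, 4.8]:
    particles, weights = sir_step(particles, weights, obs)
estimate = np.sum(particles * weights)
print(round(float(estimate), 1))
```

In the paper's setting the state is a high-dimensional articulated pose and the likelihood compares rendered body models against multi-view images, but the propagate/reweight/resample loop is the same.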

12.
To address the limited reconstruction range, the lack of an effective relocalization strategy, and the accumulated error of the KinectFusion algorithm, a 3D reconstruction method based on random-fern encoding is proposed. Random-fern encoding is used to build a camera-path loop-closure detection strategy that reduces the error accumulated over long reconstructions; after a camera pose estimation failure, relocalization is performed by retrieving similar keyframes; and integration with the InfiniTAM framework enlarges the reconstruction range. Comparative experiments on the RGB-D SLAM benchmark datasets show that the proposed method greatly enlarges the reconstruction range, relocalizes effectively after camera tracking failure, and reduces the error accumulated over long reconstructions, making the 3D reconstruction process more stable and the recovered camera poses more accurate.
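The entry above retrieves keyframes by random-fern encoding. A toy sketch of the encoding idea, using random pixel-pair intensity comparisons and Hamming distance for retrieval (the actual method's fern layout and RGB-D channels are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

def fern_code(image, pairs):
    """Binary code: one bit per random pixel-pair intensity comparison."""
    a = image[pairs[:, 0, 0], pairs[:, 0, 1]]
    b = image[pairs[:, 1, 0], pairs[:, 1, 1]]
    return (a < b).astype(np.uint8)

# 32 random comparison pairs inside a 64x64 frame.
pairs = rng.integers(0, 64, size=(32, 2, 2))
frame1 = rng.integers(0, 256, size=(64, 64))
frame2 = frame1.copy()                          # identical view
frame3 = rng.integers(0, 256, size=(64, 64))    # unrelated view

c1, c2, c3 = (fern_code(f, pairs) for f in (frame1, frame2, frame3))
d12 = int(np.sum(c1 != c2))   # Hamming distance of identical frames
d13 = int(np.sum(c1 != c3))   # distance of unrelated frames
print(d12, d13)
```

Similar frames yield small Hamming distances, so the code doubles as a cheap keyframe-retrieval index for relocalization.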

13.
Error analysis of the P3P pose measurement method
From an engineering perspective, under the condition that the three control points form an isosceles triangle, the relationship between the error of the P3P pose measurement method and the errors of its input parameters is studied. The relationship between pose measurement error and input error is first derived theoretically; on this basis, statistical error analysis and direct simulation show that, among the input errors, the image-coordinate detection error and the camera intrinsic calibration error have a large influence on the measured pose, while the measurement error of the target model is negligible. This conclusion offers guidance for the design and implementation of pose measurement systems.
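The conclusion above, that image-coordinate detection error matters, can be probed with a small Monte Carlo experiment: perturb the detected projections and measure the spread of the inter-point ray angle that P3P consumes as input. The focal length, point positions, and 0.5 px noise level below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
f = 800.0                       # assumed focal length in pixels
p1 = np.array([100.0, 50.0])    # two detected control-point projections
p2 = np.array([-120.0, 80.0])

def ray(p):
    """Unit bearing ray through an image point (principal point at origin)."""
    v = np.array([p[0], p[1], f])
    return v / np.linalg.norm(v)

def angle(a, b):
    return np.arccos(np.clip(a @ b, -1.0, 1.0))

true_angle = angle(ray(p1), ray(p2))

# Monte Carlo propagation of 0.5 px detection noise into the inter-ray angle.
errs = []
for _ in range(5000):
    n1 = p1 + rng.normal(0.0, 0.5, 2)
    n2 = p2 + rng.normal(0.0, 0.5, 2)
    errs.append(angle(ray(n1), ray(n2)) - true_angle)
sigma_deg = np.degrees(np.std(errs))
print(round(float(sigma_deg), 4))
```

The same loop, applied instead to perturbed intrinsics or perturbed 3D model points, is how the paper's style of statistical comparison of input-error sources can be replicated.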

14.

Tracking the head in a video stream is a common thread seen within computer vision literature, supplying the research community with a large number of challenging and interesting problems. Head pose estimation from monocular cameras is often considered an extended application after the face tracking task has already been performed. This often involves passing the resultant 2D data through a simpler algorithm that best fits the data to a static 3D model to determine the 3D pose estimate. This work describes the 2.5D constrained local model, combining a deformable 3D shape point model with 2D texture information to provide direct estimation of the pose parameters, avoiding the need for additional optimization strategies. It achieves this through an analytical derivation of a Jacobian matrix describing how changes in the parameters of the model create changes in the shape within the image through a full-perspective camera model. In addition, the model has very low computational complexity and can run in real-time on modern mobile devices such as tablets and laptops. The point distribution model of the face is built in a unique way, so as to minimize the effect of changes in facial expressions on the estimated head pose and hence make the solution more robust. Finally, the texture information is trained via local neural fields—a deep learning approach that utilizes small discriminative patches to exploit spatial relationships between the pixels and provide strong peaks at the optimal locations.

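The model above estimates pose by deriving a Jacobian of the projected shape with respect to the pose parameters under a full-perspective camera. The paper derives this analytically; a finite-difference sketch of the same quantity, using a hypothetical first-order rotation parameterization and focal length, is:

```python
import numpy as np

def project(points, pose, f=500.0):
    """Full-perspective projection of 3D points under a pose vector
    (rx, ry, rz, tx, ty, tz), with a first-order (small-angle) rotation."""
    rx, ry, rz, tx, ty, tz = pose
    R = np.array([[1.0, -rz, ry],
                  [rz, 1.0, -rx],
                  [-ry, rx, 1.0]])
    P = points @ R.T + np.array([tx, ty, tz])
    return f * P[:, :2] / P[:, 2:3]

def numeric_jacobian(points, pose, eps=1e-6):
    """Finite-difference Jacobian of the projected shape w.r.t. the 6 pose parameters."""
    base = project(points, pose).ravel()
    J = np.zeros((base.size, 6))
    for k in range(6):
        dp = pose.copy()
        dp[k] += eps
        J[:, k] = (project(points, dp).ravel() - base) / eps
    return J

# One point at depth 10: du/dtx = f/z = 50, du/dtz = -f*x/z^2 = -5.
J = numeric_jacobian(np.array([[1.0, 1.0, 10.0]]), np.zeros(6))
print(np.round(J[0], 3))
```

An analytical Jacobian, as in the paper, gives the same numbers without the finite-difference loop and is what makes direct real-time parameter updates cheap.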

15.
We propose a framework to reconstruct the 3D pose of a human for animation from a sequence of single-view video frames. The framework for pose construction starts with background estimation, and the performer's silhouette is extracted using image subtraction for each frame. Then the body silhouettes are automatically labeled using a model-based approach. Finally, the 3D pose is constructed from the labeled human silhouette by assuming orthographic projection. The proposed approach does not require camera calibration. It assumes that the input video has a static background, it has no significant perspective effects, and the performer is in an upright position. The proposed approach requires minimal user interaction.

16.
Robust pose estimation from a planar target
In theory, the pose of a calibrated camera can be uniquely determined from a minimum of four coplanar but noncollinear points. In practice, there are many applications of camera pose tracking from planar targets, and there are also a number of recent pose estimation algorithms which perform this task in real-time, but all of these algorithms suffer from pose ambiguities. This paper investigates the pose ambiguity for planar targets viewed by a perspective camera. We show that pose ambiguities - two distinct local minima of the corresponding error function - exist even for cases with wide angle lenses and close range targets. We give a comprehensive interpretation of the two minima and derive an analytical solution that locates the second minimum. Based on this solution, we develop a new algorithm for unique and robust pose estimation from a planar target. In the experimental evaluation, this algorithm outperforms four state-of-the-art pose estimation algorithms.

17.
Observability of 3D Motion
This paper examines the inherent difficulties in observing 3D rigid motion from image sequences. It does so without considering a particular estimator. Instead, it presents a statistical analysis of all the possible computational models which can be used for estimating 3D motion from an image sequence. These computational models are classified according to the mathematical constraints that they employ and the characteristics of the imaging sensor (restricted field of view and full field of view). Regarding the mathematical constraints, there exist two principles relating a sequence of images taken by a moving camera. One is the epipolar constraint, applied to motion fields, and the other the positive depth constraint, applied to normal flow fields. 3D motion estimation amounts to optimizing these constraints over the image. A statistical modeling of these constraints leads to functions which are studied with regard to their topographic structure, specifically as regards the errors in the 3D motion parameters at the places representing the minima of the functions. For conventional video cameras possessing a restricted field of view, the analysis shows that for algorithms in both classes which estimate all motion parameters simultaneously, the obtained solution has an error such that the projections of the translational and rotational errors on the image plane are perpendicular to each other. Furthermore, the estimated projection of the translation on the image lies on a line through the origin and the projection of the real translation. The situation is different for a camera with a full (360 degree) field of view (achieved by a panoramic sensor or by a system of conventional cameras). In this case, at the locations of the minima of the above two functions, either the translational or the rotational error becomes zero, while in the case of a restricted field of view both errors are non-zero. 
Although some ambiguities still remain in the full field of view case, the implication is that visual navigation tasks, such as visual servoing, involving 3D motion estimation are easier to solve by employing panoramic vision. Also, the analysis makes it possible to compare properties of algorithms that first estimate the translation and on the basis of the translational result estimate the rotation, algorithms that do the opposite, and algorithms that estimate all motion parameters simultaneously, thus providing a sound framework for the observability of 3D motion. Finally, the introduced framework points to new avenues for studying the stability of image-based servoing schemes.

18.
19.
In this study, we proposed a high-density three-dimensional (3D) tunnel measurement method, which estimates the pose changes of cameras based on a point set registration algorithm regarding 2D and 3D point clouds. To detect small deformations and defects, high-density 3D measurements are necessary for tunnel construction sites. The line-structured light method uses an omnidirectional laser to measure a high-density cross-section point cloud from camera images. To estimate the pose changes of cameras in tunnels, which have few textures and distinctive shapes, cooperative robots are useful because they estimate the pose by aggregating relative poses from the other robots. However, previous studies mounted several sensors for both the 3D measurement and pose estimation, increasing the size of the measurement system. Furthermore, the lack of 3D features makes it difficult to match point clouds obtained from different robots. The proposed measurement system consists of a cross-section measurement unit and a pose estimation unit; one camera was mounted for each unit. To estimate the relative poses of the two cameras, we designed a 2D–3D registration algorithm for the omnidirectional laser light, and implemented hand-truck and unmanned aerial vehicle systems. In the measurement of a tunnel with a width of 8.8 m and a height of 6.4 m, the error of the point cloud measured by the proposed method was 162.8 and 575.3 mm along 27 m, respectively. In a hallway measurement, the proposed method generated less errors in straight line shapes with few distinctive shapes compared with that of the 3D point set registration algorithm with Light Detection and Ranging.

20.
Traffic violation is the main cause of traffic accidents. To reduce the incidence of traffic accidents, the common practice at present is to strengthen the penalties for traffic violation. However, little attention has been paid to issuing warnings for dangerous driving behaviors, especially for the case where two vehicles have a good chance of collision. In this paper, a framework for collision risk estimation using an RGB-D camera is proposed for vehicles running on the urban road, where the depth information is fused with the video information for accurate calculation of the position and speed of the vehicles, two essential parameters for motion trajectory estimation. Considering that the motion trajectory or its differences can be considered a steady signal, a method based on autoregressive integrated moving average (ARIMA) models is presented to predict the vehicle trajectory. Then, the collision risk is estimated based on the predicted trajectory. The experiments are carried out on data from real vehicles. The results show that the accuracy of position and speed estimation can be guaranteed on urban roads and that the error of trajectory prediction is very minor, which is unlikely to have a significant impact on calculating the probability of collision in most situations, so the proposed framework is effective in collision risk estimation.
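The framework above predicts vehicle trajectories with ARIMA models. As a self-contained sketch (avoiding any particular statistics library), an ARIMA(1,1,0) forecast can be written as: difference the series once, fit the AR(1) coefficient by least squares (here without an intercept, a simplifying assumption), then integrate the predicted differences back.

```python
import numpy as np

def forecast_arima_110(series, steps=1):
    """1-D ARIMA(1,1,0) forecast: difference once, fit AR(1) on the
    differences by least squares, integrate the predictions back."""
    d = np.diff(series)
    x, y = d[:-1], d[1:]
    phi = (x @ y) / (x @ x)          # AR(1) coefficient on the differences
    preds, last, last_d = [], series[-1], d[-1]
    for _ in range(steps):
        last_d = phi * last_d        # predicted next difference
        last = last + last_d         # integrate back to the position
        preds.append(last)
    return np.array(preds)

# A vehicle position advancing with nearly constant velocity.
track = np.array([0.0, 1.0, 2.1, 2.9, 4.0, 5.0])
preds = forecast_arima_110(track, steps=2)
print(np.round(preds, 3))
```

For near-constant-velocity motion the fitted coefficient stays close to 1, so the forecast continues the current speed, which is the behavior the collision-risk estimate builds on.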


Copyright©北京勤云科技发展有限公司  京ICP备09084417号