Similar Literature
20 similar documents found (search time: 31 ms)
1.
Objective: Traditional monocular depth measurement methods have the advantages of simple equipment, low cost, and fast computation, but they require complex camera calibration and apply only in specific scenes. To address this, a depth measurement method based on motion-parallax cues is proposed: feature points are extracted from the images, and the relationship between feature points and image depth yields the measurement. Method: Two images are segmented to obtain the region containing the measured object; an improved scale-invariant feature transform (SIFT) algorithm proposed in this paper matches the two images, and the matching and segmentation results are combined to obtain matches on the measured object; the convex hull of the matched feature points is computed with the Graham scan, and the length of the longest segment on the hull is obtained; finally, the image depth is derived from the basic principles of camera imaging and triangle geometry. Results: Experiments show that the method improves both measurement accuracy and run time. When the object in the image is unoccluded, the error between actual and measured distance is 2.60% and measurement takes 1.577 s; with partial occlusion, the method still performs well, with an error of 3.19% and a time of 1.689 s. Conclusion: Estimating image depth from feature points in two images is robust to partial occlusion of objects, avoids the complex camera calibration process, and has practical value.
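The geometric core of this abstract (convex hull of the matched feature points, longest hull segment, triangulated depth) can be sketched in Python as below. This is a minimal illustration, not the paper's implementation: the hull uses the monotone-chain method, which is output-equivalent to the Graham scan the paper names, and the pinhole relation Z = f·B/d with the parameter names shown is an assumption.

```python
import math

def convex_hull(points):
    """Monotone-chain convex hull (counter-clockwise), output-equivalent
    to a Graham scan for this purpose."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):  # z-component of (a-o) x (b-o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def longest_hull_segment(hull):
    """Length of the longest segment between any two hull vertices."""
    return max(math.dist(p, q) for i, p in enumerate(hull) for q in hull[i + 1:])

def depth_from_parallax(focal_px, baseline_m, disparity_px):
    """Classic triangulation relation (illustrative): Z = f * B / d."""
    return focal_px * baseline_m / disparity_px
```

With an unoccluded object, the hull of its matched feature points is stable enough that the longest hull segment serves as a scale reference for the triangulation step.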

2.
A calibration-free 3D robotic-arm grasping method for multi-object environments, based on an improved YOLOv2, is proposed. First, to reduce the bounding-box overlap rate and the 3D distance-computation error of the YOLOv2 detector in multi-object scenes, an improved YOLOv2 algorithm is proposed. It detects and recognizes target objects in the image, giving their positions in the RGB image. Then, using the depth image, a K-means++ clustering algorithm quickly computes the distance from each target to the camera and estimates the target's size and pose, while also detecting the manipulator's position and computing the manipulator-to-target distance. Finally, based on the target's size, pose, and distance to the manipulator, a PID controller drives the manipulator to grasp the object. The improved YOLOv2 yields more accurate bounding boxes with smaller box intersections, improving the accuracy of distance detection and size/pose estimation. To avoid tedious calibration, a calibration-free grasping method is proposed in place of Jacobian-based calibration-free estimation, giving better generality. Experiments verify that the proposed framework classifies and localizes objects in images fairly accurately, and that a Universal Robot 3 arm can grasp arbitrarily placed objects with good accuracy.
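The distance-computation step described above (clustering the depth pixels inside a detected box) can be sketched as a minimal K-means++ seeding plus Lloyd iterations. The function names, the choice k=2 (object vs. background), and taking the nearest cluster center as the object distance are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kmeanspp_init(values, k, rng):
    """K-means++ seeding: later centers are sampled with probability
    proportional to squared distance from the centers chosen so far."""
    centers = [rng.choice(values)]
    for _ in range(k - 1):
        d2 = np.min([(values - c) ** 2 for c in centers], axis=0)
        centers.append(rng.choice(values, p=d2 / d2.sum()))
    return np.array(centers, dtype=float)

def object_distance(depth_roi, k=2, iters=20, seed=0):
    """Cluster the depth values inside a detected box and return the
    nearest cluster center, taken here as the object-to-camera distance."""
    vals = depth_roi[depth_roi > 0].astype(float).ravel()  # drop invalid zeros
    centers = kmeanspp_init(vals, k, np.random.default_rng(seed))
    for _ in range(iters):  # plain Lloyd refinement on 1D depth values
        labels = np.argmin(np.abs(vals[:, None] - centers[None, :]), axis=1)
        centers = np.array([vals[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return float(centers.min())
```

Because a detection box usually contains both the object and some background, a two-cluster split of the depth values is a cheap way to isolate the foreground depth.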

3.
Objective: Binocular vision is a good solution to the target distance-estimation problem. Existing binocular distance-estimation methods suffer either from low accuracy or from cumbersome data preparation, so an algorithm is needed that balances accuracy with convenient data preparation. Method: A network based on the R-CNN (region convolutional neural network) structure is proposed that performs object detection and distance estimation simultaneously. After the stereo image pair is fed into the network, a backbone extracts features, a stereo region-proposal network yields bounding boxes of the same object in the left and right images, and the paired in-box local features are fed to a disparity-estimation branch that estimates the object's distance. To obtain boxes of the same object in both images, the stereo proposal network replaces the original proposal network, and a stereo bounding-box branch regresses both boxes jointly; to improve disparity accuracy, a disparity-estimation branch based on group-wise correlation and 3D convolution, inspired by stereo disparity-map networks, is proposed. Results: On the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, the mean relative error is about 3.2%, far below disparity-map-based methods (11.3%) and close to 3D-detection-based methods (about 3.9%). The proposed disparity branch clearly improves accuracy, reducing the mean relative error from 5.1% to 3.2%. On a separately collected and annotated pedestrian surveillance dataset, the mean relative error is about 4.6%, showing the method applies effectively to surveillance scenes. Conclusion: The proposed stereo distance-estimation network combines the strengths of object detection and stereo disparity estimation and achieves high accuracy. It can be used with vehicle-mounted cameras and in surveillance, and is promising for other settings equipped with stereo cameras.

4.
Objective: Depth acquisition is a key technology for 3D reconstruction, virtual reality, and similar applications; monocular depth acquisition is the lowest-cost, and technically most difficult, form of non-contact 3D measurement. Traditional monocular methods rely on depth cues such as linear perspective, texture gradient, motion parallax, and focus/defocus; they are computationally expensive, demand high camera precision, and are limited in applicable scenes. This paper proposes a simple and fast monocular depth-extraction method based on the change in object-surface brightness caused by moving a point light source of fixed intensity through the scene. Method: First, the radiance of the illuminated object surface is obtained from a surface reflection model; then, combining photometric stereo, the relationship between surface radiance and camera image brightness is derived; with this relationship, experiments are designed that solve for depth from the image-brightness changes induced by moving the point source. Results: The algorithm recovers depth well in both simple and everyday scenes, with errors between estimated and actual depth below 10%. Conclusion: The method estimates depth from illumination-induced brightness changes, avoids complex camera calibration, has low computational complexity, and constitutes a new way to acquire scene depth.

5.
A simple and high-image-quality method is proposed for synthesizing viewpoint images from multi-camera images for a stereoscopic 3D display with head tracking. In this method, slices of images for depth layers are made using approximate depth information, the slices at each layer are linearly blended according to the distance between the viewpoint and the cameras, and the layers are overlaid from the perspective of the viewpoint. Because the linear blending automatically compensates for depth error through the visual effects of depth-fused 3D (DFD), the resulting image is natural to the observer's perception. Smooth motion parallax of wide-depth-range objects induced by viewpoint movement in the left-right and front-back directions is achieved using multi-camera images and approximate depth information. Because the calculation algorithm is very simple, it is suitable for real-time 3D display applications.
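The layer-slicing-and-blending pipeline in this abstract can be sketched as follows. The data layout (one image slice per camera per depth layer, NaN where a layer has no content, 1D camera positions indexed by camera id) is an assumed simplification of the paper's method.

```python
import numpy as np

def synthesize_viewpoint(layer_slices, cam_positions, viewpoint_x):
    """DFD-style synthesis sketch: each depth layer holds one image slice per
    camera; slices are linearly blended by viewpoint-to-camera distance, and
    layers are overlaid back-to-front (nearer layers overwrite where they
    have content).

    layer_slices: list (far -> near) of dicts {cam_index: HxW float image,
                  NaN where the layer has no content}.
    cam_positions: 1D array of camera x-positions, indexed by cam_index.
    """
    cam_positions = np.asarray(cam_positions, dtype=float)
    out = None
    for slices in layer_slices:                       # far -> near
        # inverse-distance weights between the viewpoint and each camera
        d = np.abs(cam_positions[list(slices)] - viewpoint_x)
        w = 1.0 / (d + 1e-6)
        w /= w.sum()
        blended = sum(wi * img for wi, img in zip(w, slices.values()))
        if out is None:
            out = blended
        else:
            mask = ~np.isnan(blended)                 # layer content occludes
            out[mask] = blended[mask]
    return out
```

The weights vary linearly with the viewpoint in the two-camera case, which is what lets the DFD effect mask residual depth error as the head moves.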

6.
This paper presents a 2D to 3D conversion scheme to generate a 3D human model using a single depth image with several color images. In building a complete 3D model, no prior knowledge such as a pre-computed scene structure and photometric and geometric calibrations is required, since the depth camera can directly acquire the calibrated geometric and color information in real time. The proposed method deals with the self-occlusion problem which often occurs in images captured by a monocular camera. When an image is obtained from a fixed view, it may not have data for a certain part of an object due to occlusion. The proposed method consists of the following steps to resolve this problem. First, the noise in the depth image is reduced by using a series of image processing techniques. Second, a 3D mesh surface is constructed using the proposed depth image-based modeling method. Third, the occlusion problem is resolved by removing the unwanted triangles in the occlusion region and filling the corresponding hole. Finally, textures are extracted and mapped to the 3D surface of the model to provide a photo-realistic appearance. Comparison results with related work demonstrate the efficiency of our method in terms of visual quality and computation time. It can be utilized in creating 3D human models in many 3D applications.

7.
While laser scanners can produce a high-precision 3D shape of a real object, appearance information of the object has to be captured by an image sensor, such as a digital camera. This paper proposes a novel and simple technique for colorizing 3D geometric models based on laser reflectivity. Laser scanners capture the range data of a target object from the sensors. Simultaneously, the power of the reflected laser is obtained as a by-product of the range data. The reflectance image, which is a collection of laser reflectance depicted as a grayscale image, contains rich appearance information about the target object. The proposed technique is an alternative to texture mapping, which has been widely used to realize photo-realistic 3D modeling but requires strict alignment between range data and texture images. The proposed technique first colorizes a reflectance image based on the similarity of color and reflectance images. Then the appearance information (color and texture information) is added to a 3D model by transferring the color in the colorized reflectance image to the corresponding range image. Some experiments and comparisons between texture mapping and the proposed technique demonstrate the validity of the proposed technique.

8.
《Advanced Robotics》2013,27(8):781-798
In this paper, an observational sensor system for tele-micro-operation is proposed with a dynamic focusing lens and a smart vision sensor using the 'depth from focus' criterion. Recently, micro-operations, such as micro-surgery, DNA manipulation, etc., have gained in importance. However, the small depth of focus of the microscope produces poor observability. For example, if the focus is on the object, the actuator cannot be seen with the microscope. On the other hand, if the focus is on the actuator, the object cannot be observed. In this sense, the 'all-in-focus image', which holds the in-focus texture all over the image, is useful for observing micro-environments with a microscope. One drawback of the all-in-focus image is that it carries no information about the depth of objects. It is also important to obtain the depth map and show the three-dimensional (3D) micro virtual environments in real time to actuate the micro objects intuitively. First, this paper reviews the 'depth from focus' criterion to achieve the all-in-focus image and simultaneous reconstruction of the 3D micro environments. After evaluating the validity of this criterion with off-line simulation, a real-time virtual reality (VR) micro camera system is proposed to achieve the micro VR environments with the 'depth from focus' criterion. This system is constructed with a dynamic focusing lens, which can change its focal distance at high frequency, and a smart vision system, which is capable of capturing and processing the image data at high speed with an SIMD architecture.

9.
The relationship between object distance (the range of palm movement relative to the lens) and image sharpness is studied, so that the allowable palm-to-lens movement range beyond the depth of field can be determined from the required palmprint-recognition accuracy. Experiments with three gradient-based sharpness measures show that all three sharpness values have a monotonic, single-valued, nonlinear relationship with object distance, with closely spaced curves. It can thus be inferred that the various gradient-based sharpness functions do not differ noticeably in how they characterize object distance.
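The abstract does not name its three gradient-based sharpness measures, so the sketch below shows three representative ones of the kind such studies typically compare (energy of gradient, Brenner, Tenengrad); treating these as the paper's measures is an assumption.

```python
import numpy as np

def energy_of_gradient(img):
    """Sum of squared first differences in x and y."""
    img = img.astype(float)
    return (np.diff(img, axis=1) ** 2).sum() + (np.diff(img, axis=0) ** 2).sum()

def brenner(img):
    """Brenner focus measure: squared difference at a 2-pixel offset."""
    img = img.astype(float)
    return ((img[:, 2:] - img[:, :-2]) ** 2).sum()

def tenengrad(img):
    """Sobel-based focus measure: sum of squared gradient magnitudes."""
    img = img.astype(float)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    def conv2(a, k):  # 'valid' 2D correlation by direct summation
        h, w = a.shape
        m = k.shape[0]
        out = np.zeros((h - m + 1, w - m + 1))
        for i in range(m):
            for j in range(m):
                out += k[i, j] * a[i:i + h - m + 1, j:j + w - m + 1]
        return out
    gx, gy = conv2(img, kx), conv2(img, ky)
    return (gx ** 2 + gy ** 2).sum()
```

All three scores fall as defocus blurs edges, which is what produces the monotonic sharpness-versus-object-distance curves reported in the abstract.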

10.
This paper presents a novel vision-based global localization that uses hybrid maps of objects and spatial layouts. We model indoor environments with a stereo camera using the following visual cues: local invariant features for object recognition and their 3D positions for object pose estimation. We also use the depth information at the horizontal centerline of the image, through which the optical axis passes, which is similar to the data from a 2D laser range finder. This allows us to build a topological node that is composed of a horizontal depth map and an object location map. The horizontal depth map describes the explicit spatial layout of each local space and provides metric information to compute the spatial relationships between adjacent spaces, while the object location map contains the pose information of objects found in each local space and the visual features for object recognition. Based on this map representation, we suggest a coarse-to-fine strategy for global localization. The coarse pose is estimated by means of object recognition and SVD-based point cloud fitting, and is then refined by stochastic scan matching. Experimental results show that our approach can serve both as an effective vision-based map representation and as a global localization method.

11.
Detecting objects, estimating their pose, and recovering their 3D shape are critical problems in many vision and robotics applications. This paper addresses the above needs using a two-stage approach. In the first stage, we propose a new method called DEHV - Depth-Encoded Hough Voting. DEHV jointly detects objects, infers their categories, estimates their pose, and infers/decodes objects' depth maps from either a single image (when no depth maps are available in testing) or a single image augmented with a depth map (when this is available in testing). Inspired by the Hough voting scheme introduced in [1], DEHV incorporates depth information into the process of learning distributions of image features (patches) representing an object category. DEHV takes advantage of the interplay between the scale of each object patch in the image and its distance (depth) from the corresponding physical patch attached to the 3D object. Once the depth map is given, a full reconstruction is achieved in a second (3D modelling) stage, where modified or state-of-the-art 3D shape and texture completion techniques are used to recover the complete 3D model. Extensive quantitative and qualitative experimental analysis on existing datasets [2], [3], [4] and a newly proposed 3D table-top object category dataset shows that our DEHV scheme obtains competitive detection and pose estimation results. Finally, the quality of 3D modelling in terms of both shape completion and texture completion is evaluated on a 3D modelling dataset containing both indoor and outdoor object categories. We demonstrate that our overall algorithm can obtain convincing 3D shape reconstruction from just one single uncalibrated image.

12.
Obtaining accurate 3D position and size of surrounding targets from high-resolution images underlies control and behavioral decision-making in autonomous driving, making image-based 3D object detection a research hotspot in the field. Existing surveys review the methodology and results in detail, but few systematically analyze the factors that limit the detection accuracy of current methods. Considering the demanding engineering requirements of autonomous driving and the predominance of data-driven methods, this paper systematically reviews academic and industrial progress in 3D object detection from the perspectives of common datasets and evaluation benchmarks, the influence of data, methodological constraints, and errors. First, academic results and industrial applications in autonomous driving are briefly introduced. Then four widely used datasets, including KITTI, are analyzed in detail with respect to acquisition equipment, data precision, and annotation, and their principal evaluation metrics are compared. Next, the main factors limiting algorithm performance, and the errors they cause, are analyzed from the data and methodology sides: on the data side, the constraints are mainly data precision, sample variation, annotation volume, and annotation standards; on the methodology side, they mainly include prior geometric relations, depth-prediction error, and data modality. Finally, domestic and international research is summarized, and future directions concerning datasets, evaluation metrics, and object depth prediction are proposed.

13.
To suppress false targets in object recognition for complex scenes, an AdaBoost-based classification method analyzes the 3D trajectories of candidate targets and, combined with features shared by true targets, further separates true targets from false ones. First, head regions of candidate targets are extracted from the depth images captured by a depth camera, and their 2D trajectories are obtained with Kalman-filter tracking. Second, camera calibration converts the 2D trajectories into 3D trajectories in space. Finally, a strong classifier trained with AdaBoost on positive and negative samples further classifies true versus false targets. Experimental results show that the method effectively improves recognition accuracy and adapts well to complex scenes.

14.
Integral (3D panoramic) imaging can record and display a true 3D scene. It records the scene with a microlens array, so the depth of any point in space can be obtained directly from a single exposure. This paper studies how to obtain object-space information directly with integral imaging. First, view images are extracted from the integral image by collecting the pixel at the same local position under every microlens; each view is a parallel-projection record of the original object-space scene along one particular direction. Then, by analyzing the optical imaging process of the integral image, a depth equation is derived that relates object depth to the parallax between corresponding views, so the depth of any point in space can be computed from its parallax across the corresponding views. Finally, the feasibility of the method is verified by measuring the thickness of a matchbox from an integral image. The results can be used in integral-image data processing itself and may provide a theoretical basis for developing new depth-measurement tools.

15.
Machine vision system for curved surface inspection
This application-oriented paper discusses a non-contact 3D range-data measurement system to improve the performance of an existing 2D herring-roe grading system. The existing system uses a single CCD camera with unstructured halogen lighting to acquire and analyze the 2D shape of the herring roe for size and deformity grading. Our system acts as an additional module that can be integrated into the existing 2D grading system, providing the third dimension to detect deformities in the herring roe that were not detected by the 2D analysis. Furthermore, the additional surface depth data increases the accuracy of the weight information used in the existing grading system. In the proposed system, multiple laser light stripes are projected onto the herring roe and a single B/W CCD camera records the image of the scene. The distortion of the projected line pattern is due to the surface curvature and orientation. Utilizing the linear relation between the projected line distortion and surface depth, the range data is recovered from a single camera image. The measurement technique is described, and the depth information is obtained through four steps: (1) image capture, (2) stripe extraction, (3) stripe coding, and (4) triangulation with system calibration. This depth information can then be converted into the curvature and orientation of the shape for deformity inspection, and also used for weight estimation. Preliminary results are included to show the feasibility and performance of the measurement technique. The accuracy and reliability of the computerized herring-roe grading system can be greatly improved by integrating this module into the existing system.
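Steps (2) and (4), stripe extraction and the linear depth mapping, might look as follows. The centroid-based stripe locator and the calibrated constants k and z0 are illustrative assumptions (the stripe is assumed visible in every column), not the paper's exact procedure.

```python
import numpy as np

def extract_stripe(img):
    """Per-column stripe location: intensity-weighted row centroid of each
    column (assumes the stripe is visible in every column)."""
    img = img.astype(float)
    rows = np.arange(img.shape[0])[:, None]
    return (img * rows).sum(axis=0) / img.sum(axis=0)

def depth_profile(img, ref_row, k, z0):
    """Linear triangulation model with assumed calibrated gain k and
    reference-plane depth z0: depth = z0 + k * (stripe row - reference row)."""
    return z0 + k * (extract_stripe(img) - ref_row)
```

The linearity holds only over the calibrated working range; in practice k and z0 come from imaging the stripe on a reference plane at known depths.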

16.
This paper proposes a manifold-learning-based action-recognition framework for recognizing human actions in depth-image sequences. Human joint positions are estimated from the depth data of a Kinect sensor, and relative joint-position differences are used as the feature representation. In the training stage, Laplacian eigenmaps (LE) manifold learning reduces the dimensionality of the training set, yielding motion models in a low-dimensional latent space. In the recognition stage, test sequences are mapped into the low-dimensional manifold space with nearest-neighbor interpolation and then matched; during matching, a modified Hausdorff distance measures the agreement and similarity between the test sequence and the training motion sets in the low-dimensional space. Experiments on data captured with a Kinect achieve good results, and tests on the MSR Action3D dataset show that with sufficient training samples the method outperforms previous approaches, indicating that it is well suited to human action recognition from depth-image sequences.
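The modified Hausdorff distance used for matching in the low-dimensional space is commonly defined as the larger of the two mean directed nearest-neighbour distances; the sketch below uses that definition as an assumption, since the abstract does not spell out its exact variant.

```python
import numpy as np

def modified_hausdorff(A, B):
    """Modified Hausdorff distance between two point sets (rows are points):
    the maximum of the two mean directed nearest-neighbour distances."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())
```

Averaging (rather than taking the max over points, as the classic Hausdorff distance does) makes the measure far less sensitive to a single outlier frame in a trajectory.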

17.
Dense depth information is widely used across computer-vision tasks, yet depth cameras typically fail to measure depth on glossy, transparent, or distant surfaces, leaving holes of varying size in the depth image. A hole-filling algorithm for single depth images that propagates along the normal direction is therefore proposed. The method diffuses along the surface variation of the object itself, turning depth-image completion into a geometric completion problem. The 2D depth image is first lifted to a 3D point cloud; the point cloud then shrinks inward along the normal direction of the hole boundary, with a normal-filter-like constraint added during shrinking to model the depth variation so that the filled points better fit the structure of the whole object; finally, the 3D points are projected back onto the 2D image. Tests on the NYU-Depth-v2 dataset show that the algorithm fills holes effectively.

18.
《Computers in Industry》2013,64(9):1115-1128
3D difference detection is the task of verifying whether the 3D geometry of a real object exactly corresponds to a 3D model of this object. We present an approach for 3D difference detection with a hand-held depth camera. In contrast to previous approaches, geometric differences can be detected in real time and from arbitrary viewpoints. The 3D difference detection accuracy is improved in two ways: first, the precision of the depth camera's pose estimation is improved by coupling the depth camera with a high-precision industrial measurement arm; second, the influence of depth measurement noise is reduced by integrating a 3D surface reconstruction algorithm. The effects of both enhancements are quantified by a ground-truth-based quantitative evaluation, both for a time-of-flight camera (SwissRanger 4000) and a structured-light depth camera (Kinect). With the proposed enhancements, differences of a few millimeters can be detected from 1 m measurement distance.

19.
To reconstruct objects from all viewpoints efficiently, accurately, and at low cost, a full-view 3D reconstruction method is proposed that fuses illumination constraints with a depth camera. For single-frame reconstruction, RGBD depth images are fused with shape from shading (SFS): an additional illumination constraint refines the raw depth values. For pairwise registration, fast point feature histograms (FPFH) features are matched, wrong pairs are filtered with random sample consensus (RANSAC) to solve a coarse registration matrix, and the result is refined with the iterative closest point (ICP) algorithm to obtain the frame-to-frame transform. For full-view reconstruction, bundle adjustment optimizes the camera poses, eliminating the accumulated error so that the first and last frames coincide exactly; the frames are finally fused into a complete model. Because surface illumination information is incorporated, the resulting 3D models are smoother and retain more surface detail, improving reconstruction accuracy; moreover, multi-albedo objects can be reconstructed under natural light from single shots, broadening the method's applicability. The whole procedure requires only a hand-held depth camera, with no turntable, making operation more convenient.
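The fine-registration step can be illustrated with one point-to-point ICP iteration: nearest-neighbour matching followed by a Kabsch/SVD rigid fit. This is a generic sketch of the ICP building block, not the paper's full FPFH + RANSAC + ICP pipeline, and it assumes row-vector point arrays.

```python
import numpy as np

def icp_step(src, dst):
    """One point-to-point ICP iteration: match each source point to its
    nearest destination point, then solve the best rigid transform (R, t)
    by the Kabsch/SVD method so that R @ p + t approximates the match."""
    d = np.linalg.norm(src[:, None] - dst[None, :], axis=2)   # pairwise dists
    matched = dst[d.argmin(axis=1)]                           # NN matching
    mu_s, mu_d = src.mean(axis=0), matched.mean(axis=0)
    H = (src - mu_s).T @ (matched - mu_d)                     # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                                  # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

In a full ICP loop this step is repeated, transforming `src` by (R, t) each time, until the mean matching residual stops decreasing; the coarse FPFH + RANSAC alignment is what makes the nearest-neighbour matches trustworthy at the start.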

20.
Human action recognition based on manifold learning
Objective: A manifold-learning-based action-recognition framework is proposed for recognizing human actions in depth-image sequences. Method: Human joint positions are estimated from the depth data of a Kinect sensor, and relative joint-position differences serve as the feature representation. In the training stage, Laplacian eigenmaps (LE) manifold learning reduces the dimensionality of the training set, yielding motion models in a low-dimensional latent space. In the recognition stage, test sequences are mapped into the low-dimensional manifold space with nearest-neighbor interpolation and then matched; during matching, a modified Hausdorff distance measures the agreement and similarity between the test sequence and the training motion sets. Results: Experiments on data captured with a Kinect achieve good results; tests on the MSR Action3D dataset show that with sufficient training samples the method outperforms previous approaches. Conclusion: The experimental results indicate that the method is well suited to human action recognition from depth-image sequences.
