Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
This paper extends the Region-based Deformable Net (RbDN) technique described in [1] to extract the 3D information of all objects in a scene from a single moving camera. The technique segments real-time video sequences captured by that camera. The deformation process tracks changes in the location and shape of the segments across frames. These changes, together with the camera displacement, are used to estimate the 3D information. The algorithm is completely autonomous and requires no prior knowledge, training, or assumptions about the contents of the sequence. It can handle the difficult case where the camera moves parallel to its optical axis. It can also estimate distances to objects more than 100 m away, as long as the camera displacement exceeds 10% of the expected distance to the objects.

2.
3D reconstruction is a major problem in computer vision. This paper considers the problem of reconstructing 3D structures from a 2D video sequence. The problem is challenging because it is difficult to identify the trajectory of each object point or pixel over time. Traditional stereo and volumetric 3D reconstruction methods suffer from the blank wall problem, and the dense depth map they estimate is not smooth, so actual geometric structures such as planes are lost. To retain the geometric structures embedded in the 3D scene, this paper proposes a novel surface fitting approach for dense 3D reconstruction. Specifically, we develop an expanded deterministic annealing algorithm that decomposes a 3D point cloud into multiple geometric structures and estimates the parameters of each structure. In this paper we consider only planes, but the methodology can be extended to other parametric geometric structures such as spheres, cylinders, and cones. The experimental results show that the new approach segments a 3D point cloud into appropriate geometric structures and generates an accurate dense 3D depth map.
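The per-structure parameter estimate mentioned in this abstract can be illustrated for the plane case. The sketch below is not the paper's deterministic annealing algorithm: it assumes the points have already been assigned to one plane and fits z = a*x + b*y + c by ordinary least squares. The function names `det3`, `fit_plane`, and `max_residual` are hypothetical, and this parameterization cannot represent planes parallel to the z axis.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to a list of (x, y, z) points."""
    # Accumulate the normal equations (A^T A) p = A^T z, where each row of A is [x, y, 1].
    sxx = sxy = syy = sx = sy = sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [sxz, syz, sz]
    det = det3(lhs)
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or too few to define a plane")
    # Cramer's rule: replace one column at a time with the right-hand side.
    params = []
    for col in range(3):
        replaced = [row[:] for row in lhs]
        for r in range(3):
            replaced[r][col] = rhs[r]
        params.append(det3(replaced) / det)
    return tuple(params)  # (a, b, c)


def max_residual(points, plane):
    """Largest absolute z-distance between the points and the fitted plane."""
    a, b, c = plane
    return max(abs(a * x + b * y + c - z) for x, y, z in points)


# Illustrative data: five points sampled from z = 2x - y + 5.
samples = [(0, 0, 5), (1, 0, 7), (0, 1, 4), (1, 1, 6), (2, 3, 6)]
plane = fit_plane(samples)
```

A residual such as `max_residual(samples, plane)` is one simple way to judge whether a cluster of points is well explained by a single plane.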

3.
Finding objects and tracking their poses are essential functions for service robots that must manipulate objects and interact with humans. We present novel algorithms for local feature matching for object detection and for 3D pose estimation. Our feature matching algorithm takes advantage of local geometric consistency for better performance, and the new 3D pose estimation algorithm solves for the pose in closed form using a homography, followed by a non-linear optimization step for stability. Advantages of our approach include better performance, minimal prior knowledge of the target pattern, and easy implementation and portability as a modularized software component. We have implemented our approach with both CPU-based and GPU-based feature extraction, and built an interoperable component that can be used in any Robot Technology (RT)-based control system. Experiments show that our approach produces very robust 3D pose estimates and maintains a very low false positive rate. It is also fast enough for on-line applications. We integrated our vision component into an autonomous robot system with a search-and-grasp task, and tested it with several objects found in an ordinary domestic environment. This paper presents the details of our approach, the design of our modular component, and the results of the experiments.
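A closed-form pose from a homography can be sketched with the standard planar-target decomposition H ~ K [r1 r2 t]. This is a textbook construction, not necessarily the exact algorithm of the paper, and it omits the non-linear refinement step. The camera model (square pixels, no skew, intrinsics `f`, `cx`, `cy`) and the function names are assumptions made for illustration.

```python
import math


def pose_from_homography(H, f, cx, cy):
    """Recover (R, t) from a plane-to-image homography H ~ K [r1 r2 t].

    Assumes K = [[f, 0, cx], [0, f, cy], [0, 0, 1]] and target points on Z = 0.
    """
    # M = K^-1 H, using the closed-form inverse of the simple K above.
    M = [
        [(H[0][j] - cx * H[2][j]) / f for j in range(3)],
        [(H[1][j] - cy * H[2][j]) / f for j in range(3)],
        [H[2][j] for j in range(3)],
    ]
    m1, m2, m3 = ([M[i][j] for i in range(3)] for j in range(3))
    # H is only defined up to scale; r1 must have unit length.
    scale = 1.0 / math.sqrt(sum(v * v for v in m1))
    if m3[2] < 0:
        scale = -scale  # keep the target in front of the camera (t_z > 0)
    r1 = [scale * v for v in m1]
    r2 = [scale * v for v in m2]
    # The third rotation column is the cross product r1 x r2.
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]
    t = [scale * v for v in m3]
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return R, t


def homography_from_pose(R, t, f, cx, cy, scale=1.0):
    """Forward model H = scale * K [r1 r2 t]; used here only to make test data."""
    cols = [[R[i][0] for i in range(3)], [R[i][1] for i in range(3)], list(t)]
    H = [[0.0] * 3 for _ in range(3)]
    for j, col in enumerate(cols):
        H[0][j] = scale * (f * col[0] + cx * col[2])
        H[1][j] = scale * (f * col[1] + cy * col[2])
        H[2][j] = scale * col[2]
    return H


# Synthetic round trip: rotate 0.3 rad about the y axis, place the target 2 units away.
angle = 0.3
R_true = [[math.cos(angle), 0.0, math.sin(angle)],
          [0.0, 1.0, 0.0],
          [-math.sin(angle), 0.0, math.cos(angle)]]
t_true = [0.1, -0.2, 2.0]
H = homography_from_pose(R_true, t_true, f=500.0, cx=320.0, cy=240.0, scale=3.0)
R_est, t_est = pose_from_homography(H, f=500.0, cx=320.0, cy=240.0)
```

With a noisy homography, r1 and r2 are no longer exactly orthonormal, which is one reason a refinement step such as the one the abstract mentions is useful.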

4.
Error control techniques such as error resilience (ER) and error concealment (EC) are efficient ways to mitigate lost macroblocks (MBs) in 3D video (3DV) communication systems. In this paper, we propose efficient and adaptive hybrid ER-EC algorithms for 3DV transmission over error-prone wireless channels. At the encoder, adaptive preprocessing ER mechanisms are proposed that use context-adaptive variable-length coding as the entropy coder, slice-structured coding modes, and explicit flexible macroblock ordering mapping. They assist the suggested EC techniques at the decoder in accurately reconstructing the erroneous MBs and frames. At the decoder, an efficient postprocessing EC technique with multiproposition methods is proposed to dynamically select the appropriate EC hypothesis method based on the size of the lost MBs, the faulty view, and the frame type. It conceals the received erroneous MBs of intra-encoded and inter-encoded frames of the transmitted 3DV by exploiting the temporal, spatial, and inter-view correlations among frames and views. To further improve the decoded 3DV quality, a weighted overlapping block motion and disparity compensation technique is used to reinforce the suggested ER-EC techniques. Experimental results on various 3DV streams show that the suggested techniques deliver acceptable subjective and objective 3DV quality. They achieve an average peak signal-to-noise ratio (PSNR) gain of almost 2.85 dB over conventional error control algorithms at a packet loss rate of 40%.
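The PSNR gain quoted in this abstract is a difference between two PSNR values measured against the same reference. A minimal sketch of that measurement follows; the 8-bit peak value of 255 and the tiny 2×2 frames are illustrative assumptions, not data from the paper.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized frames.

    Frames are lists of rows of pixel intensities; `peak` is the largest
    possible pixel value (255 for 8-bit video, assumed here).
    """
    squared_error = 0.0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


# Two hypothetical concealment results compared against the same reference frame.
reference = [[100, 100], [100, 100]]
concealed_a = [[110, 100], [100, 100]]  # one pixel off by 10
concealed_b = [[105, 100], [100, 100]]  # one pixel off by 5
gain_db = psnr(reference, concealed_b) - psnr(reference, concealed_a)
```

Halving the pixel error quarters the mean squared error, so `gain_db` here equals 10·log10(4), about 6 dB.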

5.
Wireless Networks - Three-dimensional video (3DV) comprises multiple video streams captured by different cameras around an object. Hence, it is imperative to fulfill efficient...

6.
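许雄, 陶强强, 沈飞, 郭忠义. 《红外与激光工程》 (Infrared and Laser Engineering), 2016, 45(9): 922002-0922002(8)
Based on the Stokes vector and the MC algorithm, the transmission performance of polarization information is studied in various scattering systems. Drawing on the scattering characteristics of the Stokes vector of polarized light, a PR method is proposed that reduces the influence of scattering on the incident polarized light. To verify the effectiveness and practicality of the PR method, polarization transmission and polarization-information recovery are simulated in several realistic atmospheric and underwater environments. The simulation results show that the PR method is better suited to turbid media with relatively large particle radii, and that longer wavelengths effectively reduce the loss of polarization information. The results also show that the downlink and uplink are not reciprocal in an inhomogeneous atmospheric medium. Underwater, the PR method is likewise used to reduce the effect of scattering on the degree of polarization of light; with it, the degree of linear polarization can be increased by up to 16%. These results are significant for future atmospheric and underwater quantum secure communication.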

7.
Virtual reality systems use digital models to provide interactive viewing. We present a 3D digital video system that attempts to provide the same capabilities for actual performances such as dancing. By recreating the original dynamic scene in 3D from video streams that capture it from multiple perspectives, the system allows photorealistic interactive playback from arbitrary viewpoints.

8.
9.
The multi-view video plus depth (MVD) format is regarded as the next-generation standard for advanced 3D video systems. MVD consists of multiple color videos with a depth value associated with each texture pixel. Using this representation and depth-image-based rendering techniques, new viewpoints for multi-view video applications can be generated. However, since MVD is captured from different viewing angles with different cameras, significant illumination and color differences can be observed between views. These color mismatches degrade the performance of view rendering algorithms by introducing visible artifacts that reduce view synthesis quality. To cope with this issue, we propose an effective method for correcting color inconsistencies in MVD. First, to avoid occlusion problems and make the correction as accurate as possible, we consider only the overlapping region when calculating the color mapping function. These common regions are determined using a reliable feature matching technique. In addition, to maintain temporal coherence, the correction is applied over a temporal sliding window. Experimental results show that the proposed method reduces the color difference between views and improves the view rendering process, providing high-quality results.

10.
3D video for tele-medicine applications is gradually gaining momentum because 3D technology can provide precise location information. However, the weak link in 3D video streaming is the wireless link that the communication system requires. Neglecting the wireless impairments can severely degrade the performance of 3D video streaming that carries complex, critical medical data. In this paper, we propose a systematic methodology for ensuring high performance of the 3D medical video streaming system. First, we present a recursive end-to-end distortion estimation approach for MVC (multiview video coding)-based 3D video streaming over error-prone networks that takes the 3D inter-view prediction into account. Then, based on this model, we develop a cross-layer optimization scheme that considers the LTE wireless physical layer (PHY). The optimization also takes the authentication requirements of 3D medical video into account. The proposed cross-layer optimization approach jointly controls the authentication, the video coding quantization of the 3D video, and the modulation and channel coding scheme (MCS) of the LTE wireless PHY to minimize the end-to-end video distortion. Experimental results show that the proposed approach provides superior 3D medical video streaming performance in terms of peak signal-to-noise ratio (PSNR) compared with state-of-the-art approaches, including joint source-channel optimized streaming with multi-path hash-chaining-based authentication and conventional video streaming with single-path hash-chaining-based authentication.

11.
Even though numerous algorithms exist for estimating the three-dimensional (3-D) structure of a scene from its video, the solutions obtained are often of unacceptable quality. To overcome some of the deficiencies, many application systems process more data than necessary, which raises the question: how is the accuracy of the solution related to the amount of data processed by the algorithm? Can we automatically recognize situations where the quality of the data is so bad that even a large number of additional observations will not yield the desired solution? Previous efforts to answer this question have used statistical measures such as second-order moments. These are useful if the estimate of the structure is unbiased and the higher-order statistical effects are negligible, which is often not the case. This paper introduces an alternative information-theoretic criterion for evaluating the quality of a 3-D reconstruction. The accuracy of the reconstruction is judged by considering the change in mutual information (MI), termed the incremental MI, between a scene and its reconstructions. An example of 3-D reconstruction from a video sequence using optical flow equations and a known noise distribution is considered, and it is shown how the MI can be computed from first principles. We present simulations on both synthetic and real data to demonstrate the effectiveness of the proposed criterion.

12.
Though constrained by payload and processing, small robots have found applications in collecting visual information from a scene. Typically these small robots do not carry data loggers and instead send the video to a hand-held device at a remote location for visual observation. Because of limitations in processing and control imposed by the mechatronic resources, the video captured by the robot is subject to unintended motion, which calls for digital video stabilization. For a lightweight solution, we avoid any external hardware and develop a Singular Value Decomposition (SVD)-based digital algorithm that avoids explicit feature tracking and motion estimation during stabilization. The process involves identifying a subspace of minimal dimension that contains information about the intentional motion alone. This work identifies the minimal subspace for video stabilization using the sliding window geometry method for practical implementation. Further, a shape-preserving filter is used to remove perturbations induced by the unintended motion, resulting in the reconstruction of the stabilized video sequence. Experimental results on two different small robots, a spherical robot in indoor settings and an Unmanned Aerial Vehicle (UAV) in outdoor settings, show quality outcomes without any change in the parameters of the proposed filter design. A performance comparison with existing methods on the quality of the stabilized video shows that the proposed method overcomes the non-availability of features for tracking caused by large motion amplitudes and limited onboard resources. With the proposed video stabilization method, small robots have the potential for wider use in remote visual observation.

13.
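《现代电子技术》 (Modern Electronics Technique), 2017, (12): 105-107
To improve the 3D visualization performance of robot human-machine interfaces, a 3D visual reconstruction design method based on GPU real-time graphics tracking and rendering is proposed. Computer vision methods are used to sample the visual features of the robot's human-machine interface, and the sampled pixel information is reconstructed as sparse scattered points. In the reconstructed 3D space, image processing is applied for noise reduction and edge correction, which improves how well fine graphical detail is rendered in the 3D interface. Simulation results show that interfaces designed with this method produce good visual output and strong human-machine interaction capability, giving the method high application value.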

14.
15.
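FPGA-based real-time depth estimation for 3D video systems   Cited by: 2 (self-citations: 1, citations by others: 1)
Depth estimation is the core front-end preprocessing technique in 3D video systems based on video plus depth images. Its main technical challenges include accuracy, real-time processing, and the acquisition of high-resolution depth maps. This paper proposes a hardware implementation of real-time depth estimation that mainly addresses processing speed while also taking accuracy and high resolution into account. The scheme implements depth estimation on a single FPGA, using a hybrid of the census transform and SAD (Sum of Absolute Differences) for point-by-point matching to obtain a dense depth map. The hardware design fully exploits the massive parallelism of the FPGA and uses pipelining to raise the throughput of the data path and the clock frequency of the whole design. Experiments show that the proposed scheme achieves real-time depth estimation for full HD (1920×1080) video. To support high-resolution images and to observe the depth of objects close to the camera, the disparity search range reaches 240 pixels and the frame rate reaches up to 69.6 fps, meeting the goals of real-time, high-definition processing.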
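The census-plus-SAD matching cost named in this abstract can be sketched in software. How the paper combines the two terms is not stated in the abstract, so the weighted sum below, the 3×3 window, and the parameter `census_weight` are assumptions; the winner-takes-all search and all function names are likewise illustrative, and nothing here reflects the FPGA pipeline.

```python
def census(img, x, y, radius=1):
    """Census signature: one bit per neighbour, set when the neighbour is darker than the centre."""
    centre = img[y][x]
    bits = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue
            bits = (bits << 1) | (1 if img[y + dy][x + dx] < centre else 0)
    return bits


def sad(left, right, x, y, d, radius=1):
    """Sum of absolute differences between the left window at x and the right window at x - d."""
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def hybrid_cost(left, right, x, y, d, radius=1, census_weight=4):
    """Assumed combination: SAD plus a weighted Hamming distance between census signatures."""
    hamming = bin(census(left, x, y, radius) ^ census(right, x - d, y, radius)).count("1")
    return sad(left, right, x, y, d, radius) + census_weight * hamming


def disparity_at(left, right, x, y, max_disparity, radius=1):
    """Winner-takes-all search over the disparities that keep the right window inside the image."""
    candidates = range(0, min(max_disparity, x - radius) + 1)
    return min(candidates, key=lambda d: hybrid_cost(left, right, x, y, d, radius))


# Synthetic rectified pair: the left image is the right image shifted by 2 pixels.
width, height, true_disparity = 12, 5, 2
right_img = [[(7 * x * x + 3 * x + 13 * y) % 251 for x in range(width)] for y in range(height)]
left_img = [[right_img[y][x - true_disparity] if x >= true_disparity else 0
             for x in range(width)] for y in range(height)]
estimated = disparity_at(left_img, right_img, x=6, y=2, max_disparity=4)
```

The census term depends only on intensity ordering, so it tolerates brightness offsets between cameras, while SAD keeps sensitivity to actual intensity differences; every candidate disparity is independent, which is what makes this kind of cost easy to evaluate in parallel.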

16.
In this paper, we propose an efficient coding method for digital hologram video using a three-dimensional (3D) scanning method and two-dimensional (2D) video compression techniques. It consists of separating the captured 3D image into R, G, and B color components, localization by segmenting the fringe pattern into segments of M×N pixels, frequency transformation by the 2D discrete cosine transform (2D DCT), 3D scanning of the segments to form a video sequence, classification of coefficients, and hybrid video coding with H.264/AVC, differential pulse code modulation (DPCM), and a lossless coding method. The experimental results show that the proposed method achieves compression ratios 8–16 times higher than those of previous work. We therefore expect it to help reduce the amount of digital hologram data for communication or storage.

17.
18.
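Real-time 3D information acquisition system   Cited by: 6 (self-citations: 1, citations by others: 5)
This paper describes the structure and image processing algorithms of an information acquisition system. The system uses a master-slave architecture in which the slave units, built on SDP and array processing, perform parallel real-time processing. One pass of image segmentation, laser-line extraction, front and side contour extraction, and color acquisition completes within 80 ms. The system raises scanning speed to a new order of magnitude, with a sampling rate of 1800 points/s, and processes and displays 3D graphics in real time. It greatly improves the cost-performance ratio, reduces memory requirements, lowers configuration requirements, and makes full use of hardware and software resources. Compared with similar foreign products, it is distinctive in color acquisition and in handling specially reflective regions. The system is particularly suited to remote machining, rapid prototyping, virtual reality, and 3D fax. Its development lays a foundation for the commercialization and practical use of 3D information acquisition technology.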

19.
Mobile robots are used in modern life; however, object recognition is still insufficient for robot navigation in crowded environments. Mobile robots must rapidly and accurately recognize the movements and shapes of pedestrians to navigate safely in pedestrian-rich spaces. This study proposes real-time, accurate, three-dimensional (3D) multi-pedestrian detection and tracking in crowded environments using a 3D light detection and ranging (LiDAR) point cloud. The pedestrian detection quickly segments a sparse 3D point cloud into individual pedestrians using a lightweight convolutional autoencoder and a connected-component algorithm. The multi-pedestrian tracking identifies the same pedestrians across consecutive frames using motion and appearance cues. In addition, it estimates pedestrians' dynamic movements with various patterns by adaptively mixing heterogeneous motion models. We evaluate the computational speed and accuracy of each module using the KITTI dataset. We demonstrate that our integrated system, which rapidly and accurately recognizes pedestrian movement and appearance using a sparse 3D LiDAR, is applicable to robot navigation in crowded spaces.

20.
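The goal of video denoising is to recover the original video from the observed noisy video. This work studies a video denoising algorithm based on 3D filtering. Each frame of the video sequence is first filtered in the wavelet domain using a Bayesian threshold, and inter-frame filtering is then applied across three consecutive frames. Simulation results demonstrate the effectiveness of the algorithm.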
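The two stages described in this abstract can be sketched as follows. The abstract does not say which Bayesian threshold or which inter-frame filter is used, so this sketch assumes a BayesShrink-style threshold with soft thresholding, estimates the noise level from the same coefficient band it thresholds, and uses a weighted average of three frames; the wavelet transform itself is omitted and all names are hypothetical.

```python
import math
import statistics


def bayes_shrink_threshold(detail):
    """BayesShrink-style threshold for one band of wavelet detail coefficients."""
    # Robust noise estimate from the median absolute coefficient.
    sigma_noise = statistics.median(abs(c) for c in detail) / 0.6745
    var_observed = sum(c * c for c in detail) / len(detail)
    var_signal = max(var_observed - sigma_noise ** 2, 0.0)
    if var_signal == 0.0:
        # The band looks like pure noise: choose a threshold that zeroes every coefficient.
        return max(abs(c) for c in detail)
    return sigma_noise ** 2 / math.sqrt(var_signal)


def soft_threshold(detail, threshold):
    """Shrink each coefficient toward zero by `threshold`, clamping at zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in detail]


def temporal_filter(prev_frame, cur_frame, next_frame, weights=(0.25, 0.5, 0.25)):
    """Weighted average of three consecutive frames (an assumed inter-frame filter)."""
    wp, wc, wn = weights
    return [[wp * p + wc * c + wn * n for p, c, n in zip(pr, cr, nr)]
            for pr, cr, nr in zip(prev_frame, cur_frame, next_frame)]


# Illustrative detail band: small noisy coefficients plus one strong edge response.
band = [1.0, -1.0, 1.0, -1.0, 10.0]
threshold = bayes_shrink_threshold(band)
denoised_band = soft_threshold(band, threshold)
smoothed = temporal_filter([[0, 8]], [[4, 8]], [[8, 8]])
```

In a full pipeline the thresholding would be applied to the detail bands of each frame before the inverse wavelet transform, and the temporal filter would then run on the reconstructed frames; plain averaging of this kind blurs moving regions, which a practical filter would need to guard against.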
