Similar Documents
20 similar documents found; search time: 46 ms
1.
Images and videos captured by portable devices (e.g., cellphones, DV cameras) often have limited fields of view. Image stitching, also referred to as mosaicing or panorama creation, can produce a wide-angle image by compositing several photographs together. Although various methods have been developed for image stitching in recent years, few works address the video stitching problem. In this paper, we present the first system to stitch videos captured by hand-held cameras. We first recover the 3D camera paths and a sparse set of 3D scene points using the CoSLAM system, and densely reconstruct the 3D scene in the overlapping regions. Then, we generate a smooth virtual camera path that stays in the middle of the original paths. Finally, the stitched video is synthesized along the virtual path as if it were taken from this new trajectory. The warping required for the stitching is obtained by optimizing over both temporal stability and alignment quality, while leveraging the 3D information at our disposal. The experiments show that our method produces high-quality stitching results for various challenging scenarios.
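The abstract does not give a formula for the virtual path; as a rough sketch, assume it is the per-frame midpoint of the two recovered camera-center trajectories with temporal Gaussian smoothing (the `sigma` value is illustrative, not the authors'):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def virtual_camera_path(path_a, path_b, sigma=5.0):
    """path_a, path_b: (N, 3) arrays of per-frame camera centers."""
    midpoint = 0.5 * (np.asarray(path_a) + np.asarray(path_b))
    # Temporal Gaussian smoothing suppresses hand-held jitter.
    return gaussian_filter1d(midpoint, sigma=sigma, axis=0)
```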

2.
This paper describes a new algorithm for visual control of an uncalibrated 3-DOF joint system using two weakly calibrated fixed cameras. The algorithm estimates the image Jacobian online, a matrix that linearly relates joint velocities to image-feature velocities. Our experiments show that using the fundamental matrix significantly increases the robustness of the estimation in the presence of noise compared with existing algorithms in the specialized literature.
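The paper's estimator additionally exploits the fundamental matrix between the two views; the classical baseline for online image-Jacobian estimation that such work builds on is a Broyden-style rank-one update, sketched here (the damping factor `alpha` is illustrative):

```python
import numpy as np

def broyden_update(J, dq, ds, alpha=0.1):
    """One online update of the image Jacobian J (m x n), where
    dq is the joint displacement (n,) and ds the observed
    image-feature displacement (m,); alpha damps the correction."""
    dq, ds = dq.reshape(-1, 1), ds.reshape(-1, 1)
    denom = float(dq.T @ dq)
    if denom < 1e-12:        # ignore negligible joint motions
        return J
    return J + alpha * (ds - J @ dq) @ dq.T / denom
```

A servo loop would then typically command joint velocities via the pseudo-inverse, e.g. `dq = np.linalg.pinv(J) @ (s_goal - s)`.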

3.
In this paper, we propose to capture image-based rendering scenes using a novel approach called active rearranged capturing (ARC). Given the total number of available cameras, ARC moves them strategically on the camera plane in order to minimize the sum of squared rendering errors for a given set of light rays to be rendered. Assuming the scene changes slowly, so that the optimized camera locations remain valid at the next time instance, we formulate the problem as a recursive weighted vector quantization problem, which can be solved efficiently. The ARC approach is verified on both synthetic and real-world scenes. In particular, a large self-reconfigurable camera array was built to demonstrate ARC's performance on real-world scenes. The system renders virtual views at 5-10 frames/s, depending on scene complexity, on a moderately equipped computer. Given the virtual viewpoint, the cameras move on a set of rails to perform ARC and improve the rendering quality on the fly.
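A minimal sketch of one weighted Lloyd-style update for the recursive weighted vector quantization formulation (the hard nearest-camera assignment and the array shapes are assumptions, not the paper's exact solver):

```python
import numpy as np

def weighted_vq_step(cam_xy, ray_xy, w):
    """cam_xy: (K, 2) camera positions on the camera plane;
    ray_xy: (N, 2) intersections of the rays to be rendered;
    w: (N,) per-ray importance weights."""
    d2 = ((ray_xy[:, None, :] - cam_xy[None, :, :]) ** 2).sum(-1)  # (N, K)
    nearest = d2.argmin(axis=1)            # assign each ray to a camera
    new_cam = cam_xy.copy()
    for k in range(len(cam_xy)):
        m = nearest == k
        if w[m].sum() > 0:                 # move camera to weighted centroid
            new_cam[k] = (w[m][:, None] * ray_xy[m]).sum(0) / w[m].sum()
    return new_cam
```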

4.
In virtual reality (VR) applications, content is usually generated by creating a 360° video panorama of a real-world scene. Although many capture devices are being released, producing high-resolution panoramas and displaying a virtual world in real time remains challenging because of the computational demands. In this paper, we propose a real-time 360° video foveated stitching framework that renders the scene at different levels of detail, aiming to create a high-resolution panoramic video in real time that can be streamed directly to the client. Our foveated stitching algorithm takes videos from multiple cameras as input and, combined with measurements of human visual attention (i.e., the acuity map and the saliency map), greatly reduces the number of pixels to be processed. We further parallelize the algorithm on the GPU to achieve a responsive interface and validate our results via a user study. Our system accelerates the graphics computation by a factor of 6 on a Google Cardboard display.
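The abstract names an acuity map without giving its form; a sketch using the common cortical-magnification falloff acuity(e) = e2 / (e2 + e), where e is eccentricity in degrees (the `ppd` and `e2` values are assumptions, not the paper's):

```python
import numpy as np

def acuity_map(h, w, gaze_xy, ppd=15.0, e2=2.3):
    """Relative acuity in [0, 1] per pixel for a given gaze point;
    ppd = display pixels per degree of visual angle."""
    ys, xs = np.mgrid[0:h, 0:w]
    ecc_deg = np.hypot(xs - gaze_xy[0], ys - gaze_xy[1]) / ppd
    return e2 / (e2 + ecc_deg)  # 1.0 at the fovea, decaying with eccentricity
```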

5.
Research on depth detection technology for augmented reality systems   Total citations: 5 (self-citations: 0, citations by others: 5)
倪剑, 闫达远, 周雅, 刘伟, 高宇. 《计算机应用》2006, 26(1): 132-134
Augmented reality is a branch of virtual reality that merges virtual augmentation information into real scenes. For an augmented reality system, the occlusion relationship between virtual and real objects during this merging must be judged from the depth information of the real environment. To acquire this depth information in real time, we built a complete augmented reality depth detection system based on the principles of binocular stereo vision. The system has low installation requirements; through calibration and rectification it is brought into the canonical binocular stereo configuration, which greatly simplifies the search for corresponding-point matches. The experimental results show that the system's algorithm is fast and stable and that the left and right disparity maps are of good quality, so the system can provide reliable depth information for virtual-real occlusion handling.
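A minimal modern sketch of the binocular disparity step with OpenCV (the paper predates these APIs; file names and matcher parameters are assumptions):

```python
import cv2

# Rectified left/right grayscale frames from the calibrated stereo rig.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global matching; numDisparities must be a multiple of 16.
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = stereo.compute(left, right).astype("float32") / 16.0  # to pixels

# Depth follows from depth = f * B / disparity, with focal length f (pixels)
# and baseline B (meters); near pixels then occlude virtual content behind them.
```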

6.
We present a real-time multi-view facial capture system facilitated by synthetic training imagery. Our method achieves high-quality markerless facial performance capture in real time from multi-view helmet-camera data, employing an actor-specific regressor. The regressor training is tailored to the specified actor's appearance, and we further condition it for the expected illumination conditions and the physical capture rig by generating the training data synthetically. In order to leverage the information present in live imagery, which is typically provided by multiple cameras, we propose a novel multi-view regression algorithm that uses multi-dimensional random ferns. We show that regressing on multiple video streams achieves higher quality than previous approaches designed to operate on only a single view. Furthermore, we evaluate possible camera placements and propose a novel camera configuration that allows cameras to be mounted outside the field of view of the actor, which is very beneficial: the cameras are then less of a distraction for the actor and allow an unobstructed line of sight to the director and other actors. Our new real-time facial capture approach has immediate application in on-set virtual production, in particular given the ever-growing demand for motion-captured facial animation in visual effects and video games.
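No implementation details of the ferns are given in the abstract; a generic random-fern regressor sketch for orientation (the paper's multi-dimensional, multi-view variant differs in its details; all names here are illustrative):

```python
import numpy as np

class Fern:
    """One fern: S binary intensity-comparison tests index a table of
    2**S regression outputs (e.g., facial-parameter updates)."""
    def __init__(self, tests, table):
        self.tests, self.table = tests, table  # tests: (S, 2) index pairs

    def predict(self, feats):
        # feats: one flat intensity vector sampled from all camera views,
        # so a test may compare pixels taken from different views.
        bits = feats[self.tests[:, 0]] > feats[self.tests[:, 1]]
        return self.table[int(bits @ (1 << np.arange(bits.size)))]

def regress(ferns, feats):
    # Average the votes of all ferns to get the parameter update.
    return np.mean([f.predict(feats) for f in ferns], axis=0)
```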

7.
We present a surveillance system, comprising wide field-of-view (FOV) passive cameras and pan/tilt/zoom (PTZ) active cameras, which automatically captures high-resolution videos of pedestrians as they move through a designated area. A wide-FOV static camera can track multiple pedestrians, while any PTZ active camera can capture a high-quality video of one pedestrian at a time. We formulate the multi-camera control strategy as an online scheduling problem and propose a solution that combines the information gathered by the wide-FOV cameras with weighted round-robin scheduling to guide the available PTZ cameras, such that each pedestrian is observed by at least one PTZ camera while in the designated area. A centerpiece of our work is the development and testing of experimental surveillance systems within a visually and behaviorally realistic virtual environment simulator. The simulator is valuable because our research would be largely infeasible in the real world, given the impediments to deploying and experimenting with appropriately complex camera sensor networks in large public spaces. In particular, we demonstrate our surveillance system in a virtual train station environment populated by autonomous, lifelike virtual pedestrians, wherein easily reconfigurable virtual cameras generate synthetic video feeds. The video streams emulate those generated by real surveillance cameras monitoring richly populated public spaces. A preliminary version of this paper appeared as [1].
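A toy weighted round-robin dispatcher in the spirit of the described scheduler (the weight semantics, e.g. priority derived from time remaining in the area, are an assumption):

```python
import itertools

def weighted_round_robin(weights):
    """weights: {pedestrian_id: positive int}. Yields ids forever, each
    appearing in proportion to its weight; rebuild the cycle whenever
    pedestrians enter or leave the monitored area."""
    return itertools.cycle([p for p, w in weights.items() for _ in range(w)])

# Whenever a PTZ camera becomes free, hand it the next pedestrian:
scheduler = weighted_round_robin({"ped_a": 3, "ped_b": 1})
next(scheduler)  # "ped_a" is served three times as often as "ped_b"
```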

8.
Automatic 3D animation generation techniques are becoming increasingly popular in areas of computer graphics such as video games and animated movies. They help automate the filmmaking process, even for non-professionals, with little or no intervention from animators and computer graphics programmers. Based on specified cinematographic principles and filming rules, they plan the sequence of virtual cameras that best renders a 3D scene. In this paper, we present an approach to automatic movie generation that uses linear temporal logic to express these filming and cinematography rules. We consider the filming of a 3D scene as a sequence of shots satisfying given filming rules, conveying constraints on the desirable configuration (position, orientation, and zoom) of the virtual cameras. The selection of camera configurations at different points in time is understood as a camera plan, which is computed using a temporal-logic-based planning system (TLPlan) to obtain a 3D movie. The camera planner is used within an automated planning application for generating demonstrations of 3D tasks involving a teleoperated robot arm on the International Space Station (ISS). A typical task demonstration involves moving the robot arm from one configuration to another. The main challenge is to automatically plan the configurations of the virtual cameras to film the arm in a manner that conveys the best awareness of the robot trajectory to the user. The robot trajectory is generated using a path planner; the camera planner is then invoked to find a sequence of virtual-camera configurations to film the trajectory.

9.
This work develops a safety measure for human–machine cooperative systems in which the machine region and the human region cannot be separated, owing to overlap and to movement of both the human and the machines. Modern production processes are becoming more and more flexible, so the devices used in the workplace must also support flexibility as much as possible; vision-based protective devices have exactly this characteristic. We present a neural system for advanced recognition of dangerous situations for safety control. The sequence of images from two cameras located above the robot is fed to a system of cellular neural networks (CNNs) realized on a PC. The networks detect a new object appearing in a safety field, determine its position with respect to the robot arm, and perform feature extraction on its image. Experiments conducted using artificial images (a virtual environment) and low-quality images (internet cameras) indicate that our system can work in real time and successfully detect dangerous situations. We found that the CNN is unable to detect a new object properly in the presence of a high level of noise, as a result of a percolation-type phase transition. An example of a possible application of the system is presented.
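The paper's detector is built from cellular neural networks; as a conventional stand-in for the same test, namely flagging a new object inside a safety field, here is a background-subtraction sketch with OpenCV (frame size, polygon, and threshold are assumptions):

```python
import cv2
import numpy as np

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
safety_mask = np.zeros((480, 640), np.uint8)          # frame size assumed
polygon = np.array([[100, 100], [540, 100], [540, 380], [100, 380]], np.int32)
cv2.fillPoly(safety_mask, [polygon], 255)             # safety field around robot

def danger(frame):
    fg = subtractor.apply(frame)                      # moving pixels this frame
    intrusion = cv2.bitwise_and(fg, safety_mask)      # restrict to safety field
    return cv2.countNonZero(intrusion) > 500          # pixel threshold assumed
```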

10.
In this paper we present a novel computer-vision-based hand-tracking technique that robustly tracks 6+4 DOF of the human hand in real time (at least 25 frames per second) with the help of three (or more) off-the-shelf consumer cameras. '6+4 DOF' means that the system can track the global pose (6 continuous parameters for translation and rotation) of 4 different gestures. A key feature of our system is its fully automatic real-time initialization procedure, which, along with a sound tracking-loss detector, makes the system fit for real-world applications. Because of this, our method acts as an enabling technology for unencumbered hand-based 3D human-computer interaction (HCI). Previously, using the hand as an input device with at least 6 DOF required either data gloves or markers. Using our tracking, we evaluated the hand as an input device for two prevalent virtual reality applications: fly-through exploration of a virtual world and a simple digital assembly simulation.

11.
This paper proposes a new method for rectification of a single-lens stereovision system with a triprism. The image plane of this camera captures three different views of the same scene behind the filter in one shot. These three sub-images can be treated as the images captured by three virtual cameras generated by the three-face (3F) filter (the triprism). A geometry-based method is proposed to determine the pose of the virtual cameras and to obtain the rotational and translational transformation matrices relative to the real camera. At the same time, the parallelogram rule and the rule of refraction are employed to determine the desired ray-sketching functions. Following this, the rectification transformation matrix applied to the images captured by the system is computed. The approach, based on geometric analysis of ray sketching, is significantly simpler to implement: it does not require the usual complicated calibration process. Experimental results are presented to show the effectiveness of the approach.

12.
Multi-projector displays today are automatically registered, both geometrically and photometrically, using cameras. Existing registration techniques assume pre-calibrated projectors and cameras that are devoid of imperfections such as lens distortion. In practice, however, these devices are usually imperfect and uncalibrated, and registering each of them is often more challenging than the multi-projector display registration itself. To make tiled projection-based displays accessible to the layman user, we should allow the use of inexpensive, uncalibrated devices that are prone to imperfections. In this paper, we make two important advances in this direction. First, we present a new geometric registration technique that can achieve geometric alignment in the presence of severe projector lens distortion using a relatively inexpensive low-resolution camera. This is achieved via a closed-form model that relates the projectors to the cameras in planar multi-projector displays using rational Bézier patches. It enables us to geometrically calibrate a 3000 × 2500 resolution planar multi-projector display, made of a 3 × 3 array of nine severely distorted projectors, using a low-resolution (640 × 480) VGA camera. Second, we present a photometric self-calibration technique for a projector-camera pair, which allows us to photometrically calibrate the same nine-projector display using a photometrically uncalibrated camera. To the best of our knowledge, this is the first work that allows geometrically imperfect projectors and photometrically uncalibrated cameras to be used in calibrating multi-projector displays.
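The projector-camera relation is expressed with rational Bézier patches; a generic evaluator for such a patch is shown below (the fitted control points and weights from the paper's calibration are not reproduced):

```python
import numpy as np
from math import comb

def rational_bezier_patch(P, W, u, v):
    """Evaluate a rational Bezier patch at (u, v) in [0, 1]^2.
    P: (n+1, m+1, 2) control points; W: (n+1, m+1) weights."""
    n, m = P.shape[0] - 1, P.shape[1] - 1
    Bu = np.array([comb(n, i) * u**i * (1 - u)**(n - i) for i in range(n + 1)])
    Bv = np.array([comb(m, j) * v**j * (1 - v)**(m - j) for j in range(m + 1)])
    wB = W * Bu[:, None] * Bv[None, :]                       # weighted basis
    return (wB[..., None] * P).sum(axis=(0, 1)) / wB.sum()   # projective division
```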

13.
《Advanced Robotics》2013, 27(8-9): 947-967
A wide field of view is required for many robotic vision tasks. Such a field of view may be acquired by a fisheye camera, which provides a full image compared with catadioptric visual sensors and does not increase the size or the fragility of the imaging system with respect to perspective cameras. While a unified model exists for all central catadioptric systems, many different models, approximating the radial distortions, exist for fisheye cameras. This paper shows that the unified projection model proposed for central catadioptric cameras is also valid for fisheye cameras in the context of robotic applications. The model consists of a projection onto a virtual unit sphere followed by a perspective projection onto an image plane, and is shown to be equivalent to almost all existing fisheye models. Calibration with four cameras and partial Euclidean reconstruction are carried out using this model and lead to convincing results. Finally, an application to a mobile robot navigation task is proposed and correctly executed along a 200-m trajectory.
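A sketch of the unified projection model adopted in the paper: project onto a virtual unit sphere, then perspectively onto the image plane of a camera shifted by ξ (the intrinsic parameters here are illustrative placeholders):

```python
import numpy as np

def unified_projection(X, xi, fx, fy, cx, cy):
    """Project 3D points X of shape (..., 3) with the unified model;
    xi = 0 reduces to a pinhole, larger xi models stronger distortion."""
    Xs = X / np.linalg.norm(X, axis=-1, keepdims=True)   # onto the unit sphere
    denom = Xs[..., 2] + xi                              # shifted perspective step
    return np.stack([fx * Xs[..., 0] / denom + cx,
                     fy * Xs[..., 1] / denom + cy], axis=-1)
```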

14.
Virtual reality uses a computer to simulate a three-dimensional virtual world that gives the user a sense of immersion, and 3D panoramas are an important scene representation in virtual reality. The simplest way to generate a 3D panorama is to shoot it directly with a dedicated panoramic camera, but such equipment is expensive, which limits its widespread use. This paper presents a low-cost method that generates panoramas by applying image-stitching techniques to an image sequence. The method has been used in the virtual campus system of Nanyang Institute of Technology (南阳理工学院), realizing real-scene roaming of the virtual campus at a greatly reduced cost.
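A minimal sketch of the stitching step using OpenCV's high-level stitcher (file names are assumptions, and the paper's own pipeline may differ):

```python
import cv2

# Overlapping photos taken while rotating the camera about its center.
images = [cv2.imread(f) for f in ("campus_1.jpg", "campus_2.jpg", "campus_3.jpg")]

stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, panorama = stitcher.stitch(images)
if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama.jpg", panorama)
```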

15.
16.
In this paper we address the problem of establishing a computational model for visual attention using cooperation between two cameras. More specifically, we wish to maintain a visual event within the field of view of a rotating and zooming camera through an understanding and modeling of the geometric and kinematic coupling between a static camera and an active camera. The static camera has a wide field of view, allowing panoramic surveillance at low resolution; high-resolution detail can be captured by the second camera, provided that it looks in the right direction. We derive an algebraic formulation of the coupling between the two cameras and specify the practical conditions that yield a unique solution. We also describe a method for separating a foreground event (such as a moving object) from its background while the camera rotates. A set of outdoor experiments shows the two-camera system in operation.
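The paper derives the general algebraic coupling between the two cameras; in the simpler, fully calibrated case, pointing the active camera at a 3D target reduces to two angles, sketched here (convention: x right, y down, z forward; R and t assumed known):

```python
import numpy as np

def pan_tilt_to(target, R, t):
    """target: 3D point in the static camera's frame; R, t: rigid transform
    from the static camera's frame to the active camera's home frame."""
    p = R @ target + t                            # target in active-camera frame
    pan = np.arctan2(p[0], p[2])                  # rotation about the vertical axis
    tilt = np.arctan2(-p[1], np.hypot(p[0], p[2]))
    return pan, tilt
```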

17.
This paper describes a framework for aerial imaging of high-dynamic-range (HDR) scenes for use in virtual reality applications, such as immersive panoramas and photorealistic superimposition of virtual objects using image-based lighting. We propose a complete and practical system to acquire full spherical HDR images from the sky, using two omnidirectional cameras mounted above and below an unmanned aircraft. The HDR images are generated by combining multiple omnidirectional images captured with automatically controlled exposures. Our system consists of methods for image completion, alignment, and color correction, as well as a novel approach to automatic exposure control that selects optimal exposures so as to avoid banding artifacts. Experimental results indicate that our system generates better spherical images than an ordinary spherical image-completion system in terms of naturalness and accuracy. In addition to proposing an imaging method, we carried out an experiment on display methods for aerial HDR immersive panoramas using spherical images acquired by the proposed system. The experiment demonstrated that, beyond the ordinary uses of HDR images, HDR imaging is beneficial to immersive panoramas viewed on an HMD.
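A minimal sketch of the multi-exposure fusion step using OpenCV's Debevec implementation (file names and exposure times are assumptions; the paper adds image completion, alignment, color correction, and its own exposure control on top of this):

```python
import cv2
import numpy as np

# Omnidirectional frames captured at bracketed exposures.
imgs = [cv2.imread(f) for f in ("exp_short.jpg", "exp_mid.jpg", "exp_long.jpg")]
times = np.array([1 / 1000, 1 / 250, 1 / 60], dtype=np.float32)  # seconds

response = cv2.createCalibrateDebevec().process(imgs, times)   # camera response
hdr = cv2.createMergeDebevec().process(imgs, times, response)  # float radiance map
cv2.imwrite("sky.hdr", hdr)
```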

18.
Low-cost telepresence for collaborative virtual environments   Total citations: 1 (self-citations: 0, citations by others: 1)
We present a novel low-cost method for visual communication and telepresence in a CAVE™-like environment, relying on 2D stereo-based video avatars. The system combines a selection of proven, efficient algorithms and approximations in a unique way, resulting in a convincing stereoscopic real-time representation of a remote user acquired in a spatially immersive display. The system was designed to extend existing projection systems with acquisition capabilities while requiring minimal hardware modifications and cost. It uses infrared-based image segmentation to enable concurrent acquisition and projection in an immersive environment without a static background. The setup consists of two color cameras and two additional black-and-white cameras used for segmentation in the near-IR spectrum. No special optics are needed, as the mask and color image are merged using image warping based on a depth estimate. The resulting stereo image stream is compressed, streamed across a network, and displayed as a frame-sequential stereo texture on a billboard in the remote virtual environment.
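A minimal sketch of the IR-keying idea (this assumes the near-IR and color views are already registered, whereas the paper warps the mask onto the color image using a depth estimate; the threshold is arbitrary):

```python
import cv2

ir = cv2.imread("ir_frame.png", cv2.IMREAD_GRAYSCALE)    # near-IR b/w camera
color = cv2.imread("color_frame.png")                    # matching color frame

_, mask = cv2.threshold(ir, 60, 255, cv2.THRESH_BINARY)  # IR-lit user is bright
mask = cv2.medianBlur(mask, 5)                           # remove speckle noise
avatar = cv2.bitwise_and(color, color, mask=mask)        # masked video avatar
```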

19.
This paper presents an efficient image-based approach to navigating a scene based on only three wide-baseline uncalibrated images, without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images, an accurate trifocal plane is extracted from the trifocal tensor of the three images. Next, based on a small number of feature marks entered through a simple GUI, correct dense disparity maps are obtained using our trinocular-stereo algorithm. Employing a barycentric warping scheme with the computed disparity, we can generate an arbitrary novel view within the triangle spanned by the three camera centers. Furthermore, after self-calibration of the cameras, 3D objects can be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. Three applications of the algorithm are demonstrated. The first is 4D video synthesis, which can be used to fill in the gap between a few sparsely located video cameras and synthetically generate a video from a virtual moving camera; this synthetic camera can view the dynamic scene from a novel viewpoint instead of the original static camera views. The second application is multiple-view morphing, where we can seamlessly fly through the scene over a 2D space constructed by more than three cameras. The last is dynamic scene synthesis using three still images, where several rigid objects may move in any orientation or direction; after segmenting the three reference frames into several layers, novel views of the dynamic scene can be generated by applying our algorithm. Finally, experiments illustrate that a series of photo-realistic virtual views can be generated to fly through a virtual environment covered by several static cameras.
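A sketch of the barycentric weights used to blend the three warped views for a virtual viewpoint inside the camera triangle (a standard construction; names are illustrative):

```python
import numpy as np

def barycentric_weights(p, c0, c1, c2):
    """Blending weights of viewpoint p within the triangle of camera
    centers c0, c1, c2 (2D positions on the trifocal plane)."""
    T = np.column_stack([c0 - c2, c1 - c2])
    w = np.linalg.solve(T, p - c2)        # weights for c0 and c1
    return np.array([w[0], w[1], 1.0 - w.sum()])

# The novel view is then, e.g., w0*warp0 + w1*warp1 + w2*warp2.
```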

20.
Augmented reality (AR) composes virtual objects with real scenes in a mixed environment where human-computer interaction carries richer semantic meaning. To seamlessly merge virtual objects with real scenes, correct occlusion handling is a significant challenge. We present an approach that separates occluded objects into multiple layers by utilizing depth, color, and neighborhood information. Scene depth is obtained by stereo cameras, and two local Gaussian kernels are used to represent color and spatial smoothness. These three cues are fused in a probability framework, in which the occlusion information can be safely estimated. We apply our method to handle occlusions in video-based AR, where virtual objects are otherwise simply overlaid on real scenes. Experimental results show that the approach correctly registers virtual and real objects in different depth layers and provides a spatially aware interaction environment.
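The abstract describes fusing the depth, color, and smoothness cues in a probability framework; a generic per-pixel log-likelihood fusion sketch under a naive independence assumption (array shapes are illustrative):

```python
import numpy as np

def layer_posterior(depth_ll, color_ll, smooth_ll):
    """Each argument: (L, H, W) per-pixel log-likelihoods for L layers.
    Returns a normalized per-pixel posterior over the layers."""
    log_p = depth_ll + color_ll + smooth_ll       # cues assumed independent
    log_p -= log_p.max(axis=0, keepdims=True)     # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=0, keepdims=True)

# Pixel-wise layer labels: layer_posterior(d, c, s).argmax(axis=0)
```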
