Similar Documents
20 similar documents found (search time: 46 ms)
1.
The view-independent visualization of 3D scenes is most often based on rendering accurate 3D models or utilizes image-based rendering techniques. To compute the 3D structure of a scene from a moving vision sensor, or to use image-based rendering approaches, we need to estimate the motion of the sensor from the recorded image information with high accuracy, a well-studied problem. In this work, we investigate the relationship between camera design and our ability to perform accurate 3D photography by examining the influence of camera design on the estimation of the motion and structure of a scene from video data. By relating the differential structure of the time-varying plenoptic function to different known and new camera designs, we can establish a hierarchy of cameras based upon the stability and complexity of the computations necessary to estimate structure and motion. At the low end of this hierarchy is the standard planar pinhole camera, for which the structure-from-motion problem is non-linear and ill-posed. At the high end is a camera we call the full-field-of-view polydioptric camera, for which the motion estimation problem can be solved independently of the depth of the scene, leading to fast and robust algorithms for 3D photography. In between are multiple-view cameras with a large field of view, which we have built, as well as omnidirectional sensors.
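To make the depth-independence claim concrete, the differential argument can be sketched as follows; the notation is ours (L for the time-varying plenoptic function, v and ω for the sensor's rigid motion), and the paper's exact formulation may differ. A ray with foot point x and direction r carried along by the moving sensor evolves linearly in (v, ω), so brightness constancy along rays gives a motion constraint with no depth term:

```latex
% Rays (x, r) carried along by a sensor moving rigidly with (v, w):
\dot{\mathbf{x}} = \mathbf{v} + \boldsymbol{\omega}\times\mathbf{x},
\qquad
\dot{\mathbf{r}} = \boldsymbol{\omega}\times\mathbf{r}
% Brightness constancy of the plenoptic function L along rays, dL/dt = 0:
\frac{\partial L}{\partial t}
  + \nabla_{\mathbf{x}} L \cdot \bigl(\mathbf{v} + \boldsymbol{\omega}\times\mathbf{x}\bigr)
  + \nabla_{\mathbf{r}} L \cdot \bigl(\boldsymbol{\omega}\times\mathbf{r}\bigr) = 0
```

A single pinhole samples L at one viewpoint only, so the viewpoint gradient is unobservable and depth re-enters through the image motion; a polydioptric camera observes many nearby viewpoints, which is what makes every term above measurable.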

2.
We present an image-based rendering system for viewpoint navigation through space and time of complex, dynamic real-world scenes. Our approach accepts unsynchronized, uncalibrated multi-video footage as input. Inexpensive consumer-grade camcorders suffice to acquire arbitrary scenes, for example outdoors, without elaborate recording setup procedures, also allowing for hand-held recordings. Instead of scene-depth estimation, layer segmentation, or 3D reconstruction, our approach is based on dense image correspondences, treating view interpolation uniformly in space and time: spatial viewpoint navigation, slow motion, or freeze-and-rotate effects can all be created in the same way. Acquisition simplification, integration of moving cameras, generalization to difficult scenes, and space-time symmetric interpolation add up to a widely applicable virtual video camera system.
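As a minimal illustration of interpolating directly from dense correspondences (a generic sketch, not the authors' algorithm; the given flow field and the nearest-neighbor splatting are simplifying assumptions), an in-between view at parameter t can be synthesized by moving each matched pixel pair to its linearly interpolated position:

```python
import numpy as np

def interpolate_view(img_a, img_b, flow_ab, t):
    """Synthesize an in-between view at parameter t in [0, 1] from a dense
    correspondence field flow_ab (H x W x 2: per-pixel offsets from A to B).

    For each pixel p in A with correspondence q = p + flow_ab(p), the blended
    color (1-t)*A(p) + t*B(q) is splatted to the interpolated position
    (1-t)*p + t*q. A minimal nearest-neighbor sketch only.
    """
    h, w = img_a.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    qx = np.clip(np.round(xs + flow_ab[..., 0]).astype(int), 0, w - 1)
    qy = np.clip(np.round(ys + flow_ab[..., 1]).astype(int), 0, h - 1)
    color = (1.0 - t) * img_a + t * img_b[qy, qx]   # blend along matches
    tx = np.clip(np.round((1 - t) * xs + t * qx).astype(int), 0, w - 1)
    ty = np.clip(np.round((1 - t) * ys + t * qy).astype(int), 0, h - 1)
    out = np.zeros_like(img_a, dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)
    np.add.at(out, (ty, tx), color)
    np.add.at(weight, (ty, tx), 1.0)
    return out / np.maximum(weight, 1e-8)[..., None]
```

Because the same machinery interpolates between frames of one camera (time) and between cameras (space), slow motion and viewpoint moves really are the same operation, as the abstract states.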

3.
We present a novel framework for real-time multi-perspective rendering. While most existing approaches are based on ray tracing, we present an alternative approach by emulating multi-perspective rasterization on the classical perspective graphics pipeline. To render a general multi-perspective camera, we first decompose the camera into piecewise linear primitive cameras called general linear cameras, or GLCs. We derive the closed-form projection equations for GLCs and show how to rasterize triangles onto GLCs via a two-pass rendering algorithm. In the first pass, we compute the GLC projection coefficients of each scene triangle using a vertex shader. The linear raster on the graphics hardware then interpolates these coefficients at each pixel. Finally, we use these interpolated coefficients to compute the projected pixel coordinates using a fragment shader. In the second pass, we move the pixels to their actual projected positions. To avoid holes, we treat neighboring pixels as triangles and re-render them onto the GLC image plane. We demonstrate our real-time multi-perspective rendering framework in a wide range of applications, including synthesizing panoramic and omnidirectional views, rendering reflections on curved mirrors, and creating multi-perspective faux animations. Compared with GPU-based ray-tracing methods, our rasterization approach scales better with scene complexity and can render scenes with a large number of triangles at interactive frame rates.
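The hole-filling idea of the second pass can be sketched on the CPU as follows; this is a crude stand-in (constant-color quad fill instead of the paper's triangle re-rendering on the GPU), and all names are illustrative:

```python
import numpy as np

def second_pass_scatter(colors, target_xy):
    """Second pass of a two-pass multi-perspective rasterizer, sketched on
    the CPU: each pixel carries the projected coordinates computed in the
    first pass (`target_xy`, H x W x 2), and neighboring pixels are treated
    as quads whose bounding boxes are filled to avoid holes.
    """
    h, w = colors.shape[:2]
    out = np.zeros_like(colors)
    for y in range(h - 1):
        for x in range(w - 1):
            quad = target_xy[y:y + 2, x:x + 2].reshape(-1, 2)
            x0, y0 = np.floor(quad.min(axis=0)).astype(int)
            x1, y1 = np.ceil(quad.max(axis=0)).astype(int) + 1
            x0, y0 = max(x0, 0), max(y0, 0)
            x1, y1 = min(x1, w), min(y1, h)
            if x0 < x1 and y0 < y1:
                out[y0:y1, x0:x1] = colors[y, x]   # constant fill per quad
    return out
```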

4.
This paper presents a novel method for virtual view synthesis that allows viewers to virtually fly through real soccer scenes, which are captured by multiple cameras in a stadium. The proposed method generates images of arbitrary viewpoints by view interpolation of real camera images near the chosen viewpoints. In this method, cameras do not need to be strongly calibrated, since projective geometry between the cameras is employed for the interpolation. To avoid the complex and unreliable process of 3D recovery, object scenes are segmented into several regions according to the geometric properties of the scene. Dense correspondence between real views, which is necessary for intermediate view generation, is automatically obtained by applying projective geometry to each region. By superimposing the intermediate images for all regions, virtual views of the entire soccer scene are generated. The effort for camera calibration is reduced and correspondence matching requires no manual operation; hence, the proposed method can easily be applied to dynamic events in a large space. An application for fly-through observation of soccer match replays is introduced along with the view-synthesis algorithm and experimental results. This is a new approach to providing arbitrary views of an entire dynamic event.
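A minimal sketch of per-region intermediate view generation, assuming correspondences for a region are already available (e.g. from dense matching): matched points are linearly interpolated and a region homography is fitted to warp the real view toward the virtual one. This is generic view morphing with OpenCV, not the paper's exact epipolar-geometry-based transfer:

```python
import cv2
import numpy as np

def intermediate_region(img_a, pts_a, pts_b, t):
    """Warp one scene region of view A toward an intermediate viewpoint.

    pts_a, pts_b: N x 2 corresponding points for this region in the two real
    views. Matched points are linearly interpolated at parameter t and a
    homography A -> intermediate is fitted with RANSAC, in the spirit of
    per-region projective interpolation.
    """
    pts_a = np.asarray(pts_a, np.float32)
    pts_b = np.asarray(pts_b, np.float32)
    pts_t = (1.0 - t) * pts_a + t * pts_b
    H, _ = cv2.findHomography(pts_a, pts_t, cv2.RANSAC)
    return cv2.warpPerspective(img_a, H, (img_a.shape[1], img_a.shape[0]))
```

Superimposing such warps for all regions, as the abstract describes, yields the full virtual view without any explicit 3D reconstruction.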

5.
This paper proposes a visual representation named the scene tunnel for capturing urban scenes along routes and visualizing them on the Internet. We scan scenes with multiple cameras or a fish-eye camera on a moving vehicle, which generates a real scene archive along streets that is more complete than previously proposed route panoramas. Using a translating spherical eye, properly set scanning planes, and a unique parallel-central projection, we explore the image acquisition of the scene tunnel from camera selection and alignment, slit calculation, and scene scanning to image integration. Scene tunnels cover tall buildings, the ground, and various viewing directions, and have uniform resolution along the street. The sequentially organized scene tunnel benefits texture mapping onto urban models. We analyze the shape characteristics of scene tunnels in order to design visualization algorithms. Combined with a global panorama and forward image caps, the capped scene tunnels provide continuous views directly for virtual or real navigation in a city. We render the scene tunnel dynamically by view warping, fast transmission, and flexible interaction. The compact and continuous scene tunnel facilitates model construction, data streaming, and seamless route traversal on the Internet and mobile devices.
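The core slit-scanning step can be illustrated with a short sketch, under the simplifying assumptions of a constant vehicle speed and pre-chosen slit columns (the paper's slit calculation and parallel-central projection are not reproduced): keep one pixel column per frame and per viewing direction, and stack the columns along the direction of travel.

```python
import numpy as np

def scan_scene_tunnel(frames, slit_cols):
    """Slit scanning for route panoramas / scene tunnels: from each frame of
    a sideways-looking camera on a moving vehicle, keep fixed pixel columns
    (one slit per viewing direction) and stack them along the route.
    Returns one pushbroom-style image per slit column.
    """
    return {c: np.stack([f[:, c] for f in frames], axis=1) for c in slit_cols}
```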

6.
We exploit the common constraint of having a right-angle corner of two rectangular planes in the scene to calibrate a perspective projection camera and compute its pose relative to the coordinate system defined by the corner. No metric information about the corner is assumed. The camera is constrained to have orthogonal image x- and y-axes with the same scale factor, which is valid for most real-world cameras. We then reproject the image of the corner to an arbitrary viewpoint. We can also compute the metric properties of the scene up to scale. We report experimental results of subjectively acceptable quality. The approach shows the power of exploiting constraints that are abundant in typical architectural scenes.
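A standard route to this kind of corner-based calibration, shown here as a hedged sketch since the abstract does not spell out the estimator, uses the vanishing points of the corner's mutually orthogonal edge directions: with square pixels the intrinsic matrix K has three unknowns, and each pair of orthogonal vanishing points gives one linear constraint on the image of the absolute conic.

```latex
% Square pixels (orthogonal axes, equal scales) leave three intrinsics:
K = \begin{pmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
\omega = (K K^{\top})^{-1}
% Vanishing points v_i, v_j of mutually orthogonal scene directions (the
% corner's edges) each constrain the image of the absolute conic:
\mathbf{v}_i^{\top}\,\omega\,\mathbf{v}_j = 0 \quad (i \neq j)
```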

7.
A method for capturing geometric features of real-world scenes relies on a simple modification of the capture setup. The system might conceivably be packaged into a portable, self-contained device. The multiflash imaging method bypasses 3D geometry acquisition and directly acquires depth edges from images. In place of expensive, elaborate equipment for geometry acquisition, we use a camera with multiple strategically positioned flashes. Instead of having to estimate the full 3D coordinates of points in the scene (using, for example, 3D cameras) and then look for depth discontinuities, our technique reduces the general 3D problem of depth edge recovery to one of 2D intensity edge detection. Our method could, in fact, help improve current 3D cameras, which tend to produce incorrect results near depth discontinuities. Exploiting the imaging geometry for rendering provides a simple and inexpensive solution for creating stylized images from real scenes. We believe that our camera will be a useful tool for professional artists and photographers, and we expect that it will also let the average user easily create stylized imagery. This article is available with a short video documentary on CD-ROM.
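The reduction to 2D intensity edge detection can be sketched as follows, based on the published multiflash idea that a depth edge casts a thin shadow on its far side in the image lit by each flash; the threshold and the per-pixel gradient test are simplifications of the method's epipolar-ray traversal, and all parameter names are illustrative:

```python
import numpy as np

def depth_edges(flash_images, flash_dirs, thresh=0.5):
    """Detect depth edges from images lit by flashes at different positions.

    flash_images: list of H x W grayscale float arrays, one per flash.
    flash_dirs: list of (dx, dy) unit vectors pointing away from each flash.
    Shadows abutting depth edges appear as dark regions in the ratio of each
    flash image to the pixelwise maximum; a sharp ratio drop along the
    away-from-flash direction marks a depth edge.
    """
    i_max = np.maximum.reduce(flash_images) + 1e-8
    edges = np.zeros(i_max.shape, dtype=bool)
    for img, (dx, dy) in zip(flash_images, flash_dirs):
        ratio = img / i_max
        gy, gx = np.gradient(ratio)
        drop = gx * dx + gy * dy        # directional derivative away from flash
        edges |= drop < -thresh
    return edges
```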

8.
Automatic Camera Placement for Image-Based Modeling
We present an automatic camera placement method for generating image-based models from scenes with known geometry. Our method first approximately determines the set of surfaces visible from a given viewing area and then selects a small set of appropriate camera positions to sample the scene from. We define a quality measure for a surface as seen, or covered, from the given viewing area. Along with each camera position, we store the set of surfaces which are best covered by this camera. Next, one reference view is generated from each camera position by rendering the scene. Pixels in each reference view that do not belong to the selected set of polygons are masked out.
The image-based model generated by our method covers every visible surface exactly once, associating it with a camera position from which it is covered with a quality that exceeds a user-specified threshold. The result is a compact, non-redundant image-based model with controlled quality.
The problem of covering every visible surface with a minimum number of cameras (guards) can be regarded as an extension of the well-known Art Gallery Problem. However, since the 3D polygonal model is textured, the camera-polygon visibility relation is not binary; instead, it carries a weight: the quality of the polygon's coverage.
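Since even the unweighted Art Gallery Problem is hard, a greedy strategy is the natural baseline for this coverage formulation. The sketch below is a generic weighted set-cover greedy, shown only to make the formulation concrete; the quality function and threshold are assumptions named in the docstring, and the paper's actual selection procedure may differ:

```python
def select_cameras(candidates, quality, q_min):
    """Greedy camera selection for weighted surface coverage.

    candidates: list of camera ids; quality[(cam, surf)]: coverage quality of
    surface `surf` from camera `cam` (absent/0 if invisible). Each round
    picks the camera covering the most not-yet-covered surfaces above the
    user-specified threshold q_min.
    """
    surfaces = {s for (_, s), q in quality.items() if q >= q_min}
    chosen, covered = [], set()
    while covered != surfaces:
        def gain(cam):
            return sum(1 for s in surfaces - covered
                       if quality.get((cam, s), 0.0) >= q_min)
        best = max(candidates, key=gain)
        if gain(best) == 0:          # remaining surfaces are not coverable
            break
        chosen.append(best)
        covered |= {s for s in surfaces - covered
                    if quality.get((best, s), 0.0) >= q_min}
    return chosen
```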

9.
Automatic camera control for scenes depicting human motion is an important topic in motion-capture-based animation, computer games, and other animation fields. This challenging control problem combines geometric constraints, visibility requirements, and aesthetic elements. Existing optimization-based approaches for overviews of human action are therefore often too demanding for online computation. In this paper, we introduce an effective automatic camera control which is extremely efficient and allows online performance. Rather than optimizing a complex quality measure, at each time step it selects one active camera from a multitude of cameras that render the dynamic scene. The selection is based on the correlation between each view stream and the human motion in the scene. Two factors allow for rapid selection among tens of candidate views in real time, even for complex multi-character scenes: the efficient rendering of the multitude of view streams, and optimized calculation of the correlations using a modified CCA. In addition to its simplicity and speed, the method exhibits good agreement both with cinematic idioms and with previous work on camera control for human motion. Our evaluations show that the method is able to cope with the challenges posed by severe occlusions, multiple characters, and complex scenes.
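To make the selection criterion concrete, here is a plain-CCA sketch (the paper uses a modified CCA and optimized rendering of the view streams, both omitted): canonical correlations between two centered data matrices are the singular values of the product of their orthonormal column bases, and the camera whose view stream correlates best with the motion is selected.

```python
import numpy as np

def first_canonical_corr(X, Y):
    """First canonical correlation between multivariate time series
    X (T x p) and Y (T x q), via orthonormal bases of the centered data."""
    def orthobasis(A):
        A = A - A.mean(axis=0)
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, s > 1e-10 * max(s.max(), 1e-30)]
    sv = np.linalg.svd(orthobasis(X).T @ orthobasis(Y), compute_uv=False)
    return sv[0] if sv.size else 0.0

def pick_active_camera(view_streams, motion):
    """Select the camera whose view stream best reflects the scene motion.

    view_streams: dict cam -> T x p feature matrix rendered from that view;
    motion: T x q matrix (e.g. joint trajectories). Sketch of the selection
    step only.
    """
    return max(view_streams,
               key=lambda c: first_canonical_corr(view_streams[c], motion))
```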

10.
Automated virtual camera control has been widely used in animation and interactive virtual environments. We have developed a free-viewpoint video system prototype based on multiple sparsely placed cameras that allows users to control the position and orientation of a virtual camera, enabling the observation of a real scene in three dimensions (3D) from any desired viewpoint. Automatic camera control can be activated to follow objects selected by the user. Our method combines a simple geometric model of the scene composed of planes (the virtual environment), augmented with visual information from the cameras and pre-computed tracking information of moving targets, to generate novel perspective-corrected 3D views of the virtual camera and moving objects. To achieve real-time rendering performance, view-dependent texture-mapped billboards are used to render the moving objects at their correct locations, and foreground masks are used to remove the moving objects from the projected video streams. The current prototype runs on a PC with a common graphics card and can generate virtual 2D views from three cameras of resolution 768×576 with several moving objects at about 11 fps.
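A sketch of the billboard placement step, assuming numpy 3-vectors and a world up direction; the prototype's actual renderer and view-dependent texture selection are not shown:

```python
import numpy as np

def billboard_corners(center, up, width, height, cam_pos):
    """Corners of a view-facing billboard for a tracked object: the quad is
    centered at `center` (the object's 3D location), keeps the world `up`
    direction, and rotates about it to face the virtual camera at `cam_pos`.
    """
    center = np.asarray(center, float)
    up = np.asarray(up, float)
    to_cam = np.asarray(cam_pos, float) - center
    to_cam -= up * np.dot(to_cam, up)      # project out the up component
    to_cam /= np.linalg.norm(to_cam) + 1e-12
    right = np.cross(up, to_cam)
    right /= np.linalg.norm(right) + 1e-12
    dx, dy = right * (width / 2.0), up * (height / 2.0)
    return [center - dx - dy, center + dx - dy,
            center + dx + dy, center - dx + dy]
```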

11.
This paper presents an efficient image-based approach to navigating a scene from only three wide-baseline uncalibrated images, without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images, an accurate trifocal plane is extracted from the trifocal tensor of these three images. Next, based on a small number of feature marks specified through a simple GUI, correct dense disparity maps are obtained using our trinocular-stereo algorithm. Employing a barycentric warping scheme with the computed disparity, we can generate an arbitrary novel view within the triangle spanned by the three camera centers. Furthermore, after self-calibration of the cameras, 3D objects can be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. Three applications of the tri-view morphing algorithm are demonstrated. The first is 4D video synthesis, which can be used to fill the gap between a few sparsely located video cameras, synthetically generating video from a virtual moving camera. This synthetic camera can be used to view the dynamic scene from a novel viewpoint instead of the original static camera views. The second application is multiple-view morphing, where we can seamlessly fly through the scene over a 2D space constructed by more than three cameras. The last is dynamic scene synthesis using three still images, where several rigid objects may move in any orientation or direction. After segmenting three reference frames into several layers, novel views of the dynamic scene can be generated by applying our algorithm. Finally, experiments are presented to illustrate that a series of photo-realistic virtual views can be generated to fly through a virtual environment covered by several static cameras.
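The warping core of tri-view morphing can be stated in one line: with the virtual camera expressed in barycentric coordinates w = (a, b, c) of the three camera centers, a matched triple of pixels is transferred to its weighted average. A schematic sketch of that step only; the pre-warping and dense disparity handling described above are omitted:

```python
import numpy as np

def triview_position(p1, p2, p3, w):
    """Barycentric transfer for tri-view morphing: a pixel with matches
    p1, p2, p3 (2-vectors) in the three reference views maps, for a virtual
    camera with barycentric weights w = (a, b, c) inside the triangle of
    camera centers, to the weighted position a*p1 + b*p2 + c*p3.
    """
    a, b, c = w
    return a * np.asarray(p1) + b * np.asarray(p2) + c * np.asarray(p3)
```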

12.
Omnistereo: panoramic stereo imaging
An omnistereo panorama consists of a pair of panoramic images, one panorama for the left eye and another for the right eye. The panoramic stereo pair provides a stereo sensation up to a full 360 degrees. Omnistereo panoramas can be constructed by mosaicing images from a single rotating camera. This approach also enables control of the stereo disparity, giving larger baselines for faraway scenes and smaller baselines for closer scenes. However, capturing omnistereo panoramas with a rotating camera makes it impossible to capture dynamic scenes at video rates and limits omnistereo imaging to stationary scenes. We present two possibilities for capturing omnistereo panoramas using optics without any moving parts. A special mirror is introduced such that viewing the scene through this mirror creates the same rays as those used with the rotating camera. A lens for omnistereo panoramas is also introduced, together with the design of the mirror. Omnistereo panoramas can also be rendered by computer graphics methods to represent virtual environments.
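The rotating-camera construction is easy to sketch: the two panoramas are mosaiced from vertical strips taken at symmetric offsets ±d from the image center, and d directly controls the effective stereo baseline. A schematic version assuming a constant angular step per frame and no alignment; which offset corresponds to which eye depends on the rotation direction:

```python
import numpy as np

def omnistereo_panoramas(frames, d, strip_w=2):
    """Build a panorama pair from a video taken by a camera rotating about a
    vertical axis, by concatenating vertical strips at +/- d pixels from the
    image center: the two off-center strip columns act as two eyes with an
    effective baseline controlled by d.
    """
    c = frames[0].shape[1] // 2
    pano_a = np.concatenate([f[:, c + d: c + d + strip_w] for f in frames], axis=1)
    pano_b = np.concatenate([f[:, c - d - strip_w: c - d] for f in frames], axis=1)
    return pano_a, pano_b
```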

13.
In this paper we present a scalable 3D video framework for capturing and rendering dynamic scenes. The acquisition system is based on multiple sparsely placed 3D video bricks, each comprising a projector, two grayscale cameras, and a color camera. Relying on structured light with complementary patterns, texture images and pattern-augmented views of the scene are acquired simultaneously by time-multiplexed projections and synchronized camera exposures. Using space–time stereo on the acquired pattern images, high-quality depth maps are extracted, whose corresponding surface samples are merged into a view-independent, point-based 3D data structure. This representation allows for effective photo-consistency enforcement and outlier removal, leading to a significant decrease of visual artifacts and a high resulting rendering quality using EWA volume splatting. Our framework and its view-independent representation allow for simple and straightforward editing of 3D video. In order to demonstrate its flexibility, we show compositing techniques and spatiotemporal effects.
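The depth-extraction step can be illustrated with a brute-force sketch of space-time matching, reduced here to a purely temporal window over rectified pattern-illuminated frames; the actual system also uses spatial support, subpixel refinement, and outlier handling:

```python
import numpy as np

def spacetime_disparity(left, right, max_d):
    """Winner-take-all disparity from a temporal matching window: `left` and
    `right` are T x H x W stacks of rectified frames under time-multiplexed
    patterns, and per-pixel SSD is aggregated over all T frames, which is
    what makes the projected stripe codes discriminative even without
    spatial support.
    """
    T, H, W = left.shape
    best = np.zeros((H, W), dtype=int)
    best_cost = np.full((H, W), np.inf)
    for d in range(max_d + 1):
        cost = np.full((H, W), np.inf)
        cost[:, d:] = ((left[:, :, d:] - right[:, :, :W - d]) ** 2).sum(axis=0)
        better = cost < best_cost
        best[better] = d
        best_cost[better] = cost[better]
    return best
```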

14.
The focused plenoptic camera differs from the traditional plenoptic camera in that its microlenses are focused on the photographed object rather than at infinity. The spatio-angular tradeoffs available with this approach enable rendering of final images that have significantly higher resolution than those from traditional plenoptic cameras. Unfortunately, this approach can result in visible artifacts when basic rendering is used. In this paper, we present two new methods that work together to minimize these artifacts. The first method is based on careful design of the optical system. The second method is computational and based on a new lightfield rendering algorithm that extracts the depth information of a scene directly from the lightfield and then uses that depth information in the final rendering. Experimental results demonstrate the effectiveness of these approaches.
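For reference, basic rendering for a focused plenoptic camera crops a central patch from each microimage and tiles the patches; the fixed patch size in this sketch is precisely the simplification that produces the artifacts the paper addresses, since the artifact-free patch size varies with local scene depth. The microlens geometry parameters (mstart, mpitch) are our assumptions, and microimage inversion is ignored:

```python
import numpy as np

def render_focused_plenoptic(raw, mstart, mpitch, patch):
    """Basic rendering for a focused plenoptic camera: crop a central
    `patch` x `patch` tile from each microimage (pitch `mpitch` pixels,
    first microlens starting at `mstart`) and tile the crops.
    """
    ny = (raw.shape[0] - mstart) // mpitch
    nx = (raw.shape[1] - mstart) // mpitch
    out = np.zeros((ny * patch, nx * patch) + raw.shape[2:], raw.dtype)
    off = mstart + (mpitch - patch) // 2
    for j in range(ny):
        for i in range(nx):
            tile = raw[off + j * mpitch: off + j * mpitch + patch,
                       off + i * mpitch: off + i * mpitch + patch]
            out[j * patch:(j + 1) * patch, i * patch:(i + 1) * patch] = tile
    return out
```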

15.
Pan-tilt-zoom (PTZ) cameras are well suited for object identification and recognition in far-field scenes. However, their effective use is complicated by the fact that continuous online camera calibration is needed, and the absolute pan, tilt, and zoom values provided by the camera actuators cannot be used because they are not synchronized with the video stream. Accurate calibration must therefore be extracted directly from the visual content of the frames. Moreover, the large and abrupt scale changes, the background changes caused by camera operation, and the need for camera-motion compensation make target tracking with these cameras extremely challenging. In this paper, we present a solution that provides continuous online calibration of PTZ cameras and is robust to rapid camera motion and to changes of the environment due to varying illumination or moving objects. The approach also scales beyond thousands of scene landmarks extracted with the SURF keypoint detector. The method directly derives the relationship between the position of a target on the ground plane and the corresponding scale and position in the image, and allows real-time tracking of multiple targets with a high and stable degree of accuracy even at far distances and any zoom level.
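Once the current frame is registered, the ground-to-image relationship can be expressed as a homography. The sketch below assumes such a homography H is available (e.g. chained from landmark-based frame-to-reference registration and a reference-to-ground calibration; both the chaining scheme and all names are illustrative assumptions, not the paper's API) and derives the image position plus a local scale factor from the Jacobian of the projective map:

```python
import numpy as np

def ground_to_image(H, x, y):
    """Map a ground-plane position (x, y) into the current PTZ frame using a
    ground-to-image homography H (3 x 3). Returns the image point and a
    local scale factor (sqrt of the Jacobian determinant of the projective
    map), usable to size a target's search window across zoom levels.
    """
    Xh = np.array([x, y, 1.0])
    p0, p1, w = H[0] @ Xh, H[1] @ Xh, H[2] @ Xh
    u, v = p0 / w, p1 / w
    # Jacobian of (x, y) -> (u, v); its determinant measures how
    # ground-plane area is magnified in the image at this point.
    J = np.array([[H[0, 0] * w - H[2, 0] * p0, H[0, 1] * w - H[2, 1] * p0],
                  [H[1, 0] * w - H[2, 0] * p1, H[1, 1] * w - H[2, 1] * p1]]) / w**2
    return (u, v), np.sqrt(abs(np.linalg.det(J)))
```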

16.
In virtual simulation applications, it is often necessary to use OpenGL to render large-scale static 3D scenes, such as urban architecture. Each scene unit generally has its own vertex data and texture. For large-scale data sets, it is impractical to render all scene units simultaneously; instead, parts of the scene must be rendered separately, which is known as scene partitioning and culling. Traditionally, the whole scene is partitioned into units on the CPU. We present a scheme that optimizes the GPU rendering pipeline to cull large-scale static scenes, which reduces CPU stalls and takes full advantage of the GPU's computational strengths to improve rendering efficiency.
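For reference, the per-unit visibility test at the heart of any such culling scheme is the standard plane/AABB "p-vertex" test, shown below as a vectorized CPU sketch; the paper's contribution is running this kind of test inside the GPU pipeline, which is not reproduced here:

```python
import numpy as np

def cull_aabbs(planes, boxes):
    """Frustum culling of scene-unit bounding boxes.

    planes: 6 x 4 array of frustum planes (a, b, c, d), normals pointing
    inward; boxes: N x 6 array of AABBs (min_xyz, max_xyz). Returns a
    boolean mask of potentially visible units.
    """
    mins, maxs = boxes[:, :3], boxes[:, 3:]
    visible = np.ones(len(boxes), dtype=bool)
    for a, b, c, d in planes:
        n = np.array([a, b, c])
        # p-vertex: the box corner farthest along the plane normal; if even
        # it is behind the plane, the whole box is outside the frustum.
        p = np.where(n > 0, maxs, mins)
        visible &= (p @ n + d) >= 0
    return visible
```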

17.
In 3D reconstruction, the recovery of the calibration parameters of the cameras is paramount, since it provides metric information about the observed scene, e.g., measures of angles and ratios of distances. Autocalibration enables the estimation of the camera parameters without using a calibration device, by instead enforcing simple constraints on the camera parameters. In the absence of information about the internal camera parameters, such as the focal length and the principal point, knowledge of the camera pixel shape is usually the only available constraint. Given a projective reconstruction of a rigid scene, we address the problem of the autocalibration of a minimal set of cameras with known pixel shape and otherwise arbitrarily varying intrinsic and extrinsic parameters. We propose an algorithm that only requires 5 cameras (the theoretical minimum), thus halving the number of cameras required by previous algorithms based on the same constraint. To this purpose, we introduce as our basic geometric tool the six-line conic variety (SLCV), consisting of the set of planes intersecting six given lines of 3D space in points of a conic. We show that the set of solutions of the Euclidean upgrading problem for three cameras with known pixel shape can be parameterized in a computationally efficient way. This parameterization is then used to solve autocalibration from five or more cameras, reducing the three-dimensional search space to a two-dimensional one. We provide experiments with real images showing the good performance of the technique.

18.
This paper presents a simple approach to capturing the appearance and structure of immersive scenes based on the imagery acquired with an omnidirectional video camera. The scheme proceeds by combining techniques from structure-from-motion with ideas from image-based rendering. An interactive photogrammetric modeling scheme is used to recover the locations of a set of salient features in the scene (points and lines) from image measurements in a small set of keyframe images. The estimates obtained from this process are then used as a basis for estimating the position and orientation of the camera at every frame in the video clip. By augmenting the video sequence with pose information, we provide the end-user with the ability to index the video sequence spatially as opposed to temporally. This allows the user to explore the immersive scene by interactively selecting the desired viewpoint and viewing direction
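The per-frame pose-tagging step can be sketched with a generic PnP solver, assuming measurements of the modeled features are available in every frame; note this sketch uses a perspective model as a stand-in, whereas the paper works with an omnidirectional camera whose projection model differs:

```python
import cv2
import numpy as np

def index_video_spatially(frames_2d, points_3d, K):
    """Augment each frame with camera pose so the clip can be indexed by
    viewpoint rather than by time. frames_2d: dict frame index -> N x 2
    measurements of the modeled scene features; points_3d: N x 3 structure
    recovered from the keyframes; K: 3 x 3 intrinsics.
    """
    poses = {}
    for t, pts2d in frames_2d.items():
        ok, rvec, tvec = cv2.solvePnP(points_3d.astype(np.float32),
                                      pts2d.astype(np.float32), K, None)
        if ok:
            R, _ = cv2.Rodrigues(rvec)
            poses[t] = (-R.T @ tvec).ravel()   # camera center in world coords
    # Nearest-neighbor lookup over the centers then gives spatial access.
    return poses
```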

19.
Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes
Long scenes can be imaged by mosaicing multiple images from cameras scanning the scene. We address the case of a video camera scanning a scene while moving along a long path, e.g. scanning a city street from a driving car, or scanning a terrain from a low-flying aircraft. A robust approach to this task is presented, which is applied successfully to sequences having thousands of frames even when using a hand-held camera. Examples are given on a few challenging sequences. The proposed system consists of two components: (i) motion and depth computation, and (ii) mosaic rendering. In the first part, a “direct” method is presented for computing motion and dense depth. Robustness of motion computation has been increased by limiting the motion model for the scanning camera. An iterative graph-cuts approach, with planar labels and a flexible similarity measure, allows the computation of dense depth for the entire sequence. In the second part, a new minimal aspect distortion (MAD) mosaicing uses depth to minimize the geometrical distortions of long panoramic images. In addition to MAD mosaicing, interactive visualization using X-Slits is also demonstrated. This research was supported by the Israel Science Foundation. Video examples and high resolution images can be viewed in .
