Similar Documents
20 similar documents found (search time: 869 ms)
1.
This paper presents a new class of interactive image editing operations designed to maintain consistency between multiple images of a physical 3D scene. The distinguishing feature of these operations is that edits to any one image propagate automatically to all other images, as if the (unknown) 3D scene had itself been modified. The modified scene can then be viewed interactively from any other camera viewpoint and under different scene illuminations. The approach is useful first as a power-assist that enables a user to quickly modify many images by editing just a few, and second as a means for constructing and editing image-based scene representations by manipulating a set of photographs. The approach works by extending operations like image painting, scissoring, and morphing so that they alter a scene's plenoptic function in a physically-consistent way, thereby affecting scene appearance from all viewpoints simultaneously. A key element in realizing these operations is a new volumetric decomposition technique for reconstructing a scene's plenoptic function from an incomplete set of camera viewpoints.
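As a rough illustration of edit propagation through a scene-centric representation (not the paper's actual volumetric decomposition), the sketch below assumes a reconstructed voxel cloud with per-voxel colors and known 3×4 camera matrices; visibility and occlusion handling are ignored, and the helper names (`project`, `propagate_paint`) are hypothetical.

```python
import numpy as np

def project(P, X):
    """Project Nx3 world points X through a 3x4 camera matrix P to pixels."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:3]

def propagate_paint(voxels, colors, P_edit, edit_mask, edit_color):
    """Recolor the voxels whose projection into the edited view falls inside
    edit_mask; re-rendering any other view of (voxels, colors) then shows the
    same edit, because the edit lives in the 3D scene rather than the pixels."""
    uv = np.round(project(P_edit, voxels)).astype(int)
    h, w = edit_mask.shape
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    hit = np.zeros(len(voxels), dtype=bool)
    hit[ok] = edit_mask[uv[ok, 1], uv[ok, 0]]
    colors[hit] = edit_color        # modified in place; every view reflects it
    return colors
```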

2.
Virtual world explorations by using topological and semantic knowledge
This paper is dedicated to virtual world exploration techniques. Automatic camera control is important in many fields, such as computational geometry, visual servoing, robot motion, and graph drawing. The paper introduces a high-level camera control approach for virtual environments. The proposed method addresses real-time 3D scene exploration and consists of two steps. In the first step, a set of good viewpoints is chosen to give the user maximum knowledge of the scene. The second step uses these viewpoints to compute a camera path between them. Finally, we define a notion of semantic distance between objects of the scene to improve the approach.
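The first step can be caricatured as a greedy, set-cover-style selection over a precomputed visibility table. This is a hedged sketch of "maximum knowledge of the scene", not the paper's viewpoint-quality measure; `visibility` is an assumed viewpoint-by-object boolean matrix.

```python
import numpy as np

def select_viewpoints(visibility, k):
    """Greedily pick up to k viewpoints, each maximizing the number of
    objects not yet seen. visibility: bool array (n_viewpoints, n_objects)."""
    seen = np.zeros(visibility.shape[1], dtype=bool)
    chosen = []
    for _ in range(k):
        gains = (visibility & ~seen).sum(axis=1)   # newly visible objects
        best = int(np.argmax(gains))
        if gains[best] == 0:                       # nothing new to see
            break
        chosen.append(best)
        seen |= visibility[best]
    return chosen
```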

3.
This paper presents a novel method for virtual view synthesis that allows viewers to virtually fly through real soccer scenes, which are captured by multiple cameras in a stadium. The proposed method generates images of arbitrary viewpoints by view interpolation of real camera images near the chosen viewpoints. In this method, cameras do not need to be strongly calibrated, since projective geometry between cameras is employed for the interpolation. To avoid the complex and unreliable process of 3-D recovery, object scenes are segmented into several regions according to the geometric properties of the scene. Dense correspondence between real views, which is necessary for intermediate view generation, is automatically obtained by applying projective geometry to each region. By superimposing intermediate images for all regions, virtual views of the entire soccer scene are generated. The effort for camera calibration is reduced and correspondence matching requires no manual operation; hence, the proposed method can be easily applied to dynamic events in a large space. An application for fly-through observation of soccer match replays is introduced along with the view-synthesis algorithm and experimental results. This is a new approach to providing arbitrary views of an entire dynamic event.
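As a rough illustration of interpolation from dense correspondence, the sketch below pushes pixels of one view toward their matches in the other and cross-dissolves the colors. Hole filling and the per-region projective warps of the paper are omitted, and `corr` is an assumed dense correspondence map.

```python
import numpy as np

def intermediate_view(img_a, img_b, corr, alpha):
    """Synthesize a view between two real cameras at weight alpha in [0, 1].
    corr[y, x] = (x', y') gives, for each pixel of view A, its match in B;
    pixels are splatted to linearly interpolated positions."""
    h, w = img_a.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    xb = corr[..., 0].astype(int).clip(0, w - 1)
    yb = corr[..., 1].astype(int).clip(0, h - 1)
    xt = ((1 - alpha) * xs + alpha * xb).astype(int)   # interpolated position
    yt = ((1 - alpha) * ys + alpha * yb).astype(int)
    out = np.zeros_like(img_a)
    out[yt, xt] = ((1 - alpha) * img_a + alpha * img_b[yb, xb]).astype(img_a.dtype)
    return out
```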

4.
This paper presents an efficient image-based approach to navigating a scene from only three wide-baseline uncalibrated images, without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images, an accurate trifocal plane is extracted from the trifocal tensor of the three images. Next, based on a small number of feature marks placed through a friendly GUI, correct dense disparity maps are obtained using our trinocular-stereo algorithm. Employing a barycentric warping scheme with the computed disparity, we can generate an arbitrary novel view within the triangle spanned by the three camera centers. Furthermore, after self-calibration of the cameras, 3D objects can be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. Three applications of the tri-view morphing algorithm are demonstrated. The first is 4D video synthesis, which can fill the gap between a few sparsely located video cameras to synthetically generate a video from a virtual moving camera; this synthetic camera can be used to view the dynamic scene from a novel viewpoint instead of the original static camera views. The second application is multiple-view morphing, where we can seamlessly fly through the scene over a 2D space constructed by more than three cameras. The last is dynamic scene synthesis using three still images, where several rigid objects may move in any orientation or direction; after segmenting the three reference frames into several layers, novel views of the dynamic scene can be generated by applying our algorithm. Finally, experiments illustrate that a series of photo-realistic virtual views can be generated to fly through a virtual environment covered by several static cameras.
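A minimal sketch of the barycentric combination step, assuming the three reference views have already been warped to the novel viewpoint; the warping itself (the hard part) is omitted and the helper names are illustrative.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of 2D point p in the triangle (a, b, c),
    e.g. the novel viewpoint inside the triangle of camera centers."""
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]], dtype=float)
    w = np.linalg.solve(T, np.asarray(p, float) - np.asarray(c, float))
    return np.array([w[0], w[1], 1.0 - w[0] - w[1]])

def tri_view_blend(warped_views, weights):
    """Blend three images pre-warped to the novel viewpoint by those weights."""
    return sum(w * img for w, img in zip(weights, warped_views))
```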

5.
Our paper introduces a novel approach for controlling stereo camera parameters in interactive 3D environments in a way that specifically addresses the interplay between binocular depth perception and the saliency of scene contents. Our proposed Dynamic Attention-Aware Disparity Control (DADC) method produces depth-rich stereo rendering that improves viewer comfort through joint optimization of the stereo parameters. In constructing the optimization model, we consider the importance of scene elements, as well as their distance to the camera and the locus of attention on the display. Our method also optimizes the depth effect of a given scene by considering the individual user's stereoscopic disparity range, and maintains a comfortable viewing experience by controlling the accommodation/convergence conflict. We validate our method in a formal user study that also reveals its advantages, such as superior quality and practical relevance.
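A linearized stand-in for the idea of jointly choosing a convergence depth and a disparity scale so that salient content lands inside a viewer's comfort range; this is not the paper's optimization model, and all names and the moment-based heuristic are illustrative assumptions.

```python
import numpy as np

def stereo_params(depths, saliency, d_min, d_max):
    """Pick a convergence depth and a disparity scale so that saliency-
    weighted scene depths map into the comfort range [d_min, d_max] of
    screen disparities. A crude heuristic, not a joint optimization."""
    w = saliency / saliency.sum()
    conv = float((w * depths).sum())             # converge on salient content
    span = max(conv - depths.min(), depths.max() - conv)
    scale = min(abs(d_min), abs(d_max)) / span   # fit the comfort zone
    return conv, scale
```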

6.
We introduce a novel method for enabling stereoscopic viewing of a scene from a single pre-segmented image. Rather than attempting full 3D reconstruction or accurate depth map recovery, we hallucinate a rough approximation of the scene's 3D model using a number of simple depth and occlusion cues and shape priors. We begin by depth-sorting the segments, each of which is assumed to represent a separate object in the scene, resulting in a collection of depth layers. The shapes and textures of the partially occluded segments are then completed using symmetry and convexity priors. Next, each completed segment is converted to a union of generalized cylinders yielding a rough 3D model for each object. Finally, the object depths are refined using an iterative ground fitting process. The hallucinated 3D model of the scene may then be used to generate a stereoscopic image pair, or to produce images from novel viewpoints within a small neighborhood of the original view. Despite the simplicity of our approach, we show that it compares favorably with state-of-the-art depth ordering methods. A user study was conducted showing that our method produces more convincing stereoscopic images than existing semi-interactive and automatic single image depth recovery methods.
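A toy stand-in for the final rendering step, assuming the depth layers have already been completed as RGBA images listed far to near: each layer is shifted by plus/minus half its disparity to form the two eye views. Wrap-around at image borders and per-pixel depth within a layer are ignored.

```python
import numpy as np

def stereo_pair(layers, disparities):
    """Composite depth-sorted RGBA layers (far to near) into a stereo pair
    by shifting each layer horizontally; nearer layers overwrite farther ones."""
    left = np.zeros_like(layers[0])
    right = np.zeros_like(layers[0])
    for layer, d in zip(layers, disparities):
        s = int(round(d / 2))
        for view, shift in ((left, s), (right, -s)):
            moved = np.roll(layer, shift, axis=1)   # toy: wraps at borders
            mask = moved[..., 3] > 0                # alpha marks the segment
            view[mask] = moved[mask]
    return left, right
```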

7.
Free-viewpoint video (FVV) is a promising approach that allows users to control their viewpoint and generate virtual views from any desired perspective. The individual user viewpoints are synthesized from two or more camera streams and corresponding depth sequences. In the case of continuous viewpoint changes, the camera inputs of the view-synthesis process must be changed seamlessly to avoid starvation of the viewpoint synthesizer. Starvation occurs when the desired user viewpoint cannot be synthesized from the currently streamed camera views, and thus the FVV playout interrupts. In this paper, we propose three different camera handover schemes (TCC, MA, and SA) based on viewpoint prediction to minimize the probability of playout stalls and to find the trade-off between image quality and camera handover frequency. Our simulation results show that the introduced camera switching methods can reduce the handover frequency by more than 40%; hence, viewpoint-synthesis starvation and playout interruption can be minimized. By providing seamless viewpoint changes, the quality of experience can be significantly improved, making the new FVV service more attractive in the future.
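A hedged sketch of prediction-driven handover, with the viewpoint parameterized on a 1D camera arc for simplicity. The constant-velocity predictor is an illustrative stand-in for the paper's TCC/MA/SA schemes, and all names are assumptions.

```python
def plan_handover(view_pos, velocity, active_pair, pair_range, lookahead):
    """Predict the viewpoint `lookahead` seconds ahead with a constant-
    velocity model; request a new camera pair if the prediction leaves the
    active pair's synthesizable interval, switching before starvation."""
    predicted = view_pos + velocity * lookahead
    lo, hi = pair_range[active_pair]
    if lo <= predicted <= hi:
        return active_pair                 # keep streaming current cameras
    for pair, (a, b) in pair_range.items():
        if a <= predicted <= b:            # pair that covers the prediction
            return pair
    return active_pair                     # no better pair: stay put
```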

8.
In this paper, we describe how geometrically correct and visually realistic shadows can be computed for objects composited into a single view of a target scene. Compared to traditional single-view compositing methods, which either do not deal with shadow effects or manually create shadows for the composited objects, our approach efficiently exploits the geometric and photometric constraints extracted from a single target image to synthesize shadows for the inserted objects that are consistent with the overall target scene. In particular, we explore (i) the constraints provided by imaged scene structure, e.g. vanishing points of orthogonal directions, for camera calibration and thus explicit determination of the locations of the camera and the light source; (ii) the relatively weaker geometric constraint, the planar homology, that models the imaged shadow relations when explicit camera calibration is not possible; and (iii) the photometric constraints required to match the color characteristics of the synthesized shadows with those of the original scene. For each constraint, we demonstrate working examples followed by our observations. To show the accuracy and the applications of the proposed method, we present results for a variety of target scenes, including footage from commercial Hollywood movies and 3D video games.
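Constraint (ii) can be made concrete with the standard planar-homology parameterization H = I + mu * v * l^T / (v^T l). Treating the vertex as the imaged light source and the axis as the imaged intersection line of the object and shadow planes follows the usual shadow-geometry setup; this is our assumption about the paper's exact formulation, not a quotation of it.

```python
import numpy as np

def planar_homology(vertex, axis, mu):
    """H = I + mu * (v @ l.T) / (v . l): a planar homology with vertex v
    (e.g. the imaged light source) and axis l (the imaged intersection line
    of the two planes). H maps imaged object points to their cast shadows."""
    v = np.asarray(vertex, float).reshape(3, 1)
    l = np.asarray(axis, float).reshape(3, 1)
    return np.eye(3) + mu * (v @ l.T) / float(v.T @ l)

def cast_shadow(H, pts):
    """Map homogeneous image points (N x 3) through the homology."""
    s = (H @ pts.T).T
    return s[:, :2] / s[:, 2:3]
```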

9.
10.
We present a novel approach to optimally retarget videos for varied displays with differing aspect ratios by preserving salient scene content discovered via eye tracking. Our algorithm performs editing with cut, pan, and zoom operations by optimizing the path of a cropping window within the original video while seeking to (i) preserve salient regions and (ii) adhere to the principles of cinematography. Our approach is (a) content agnostic, as the same methodology is employed to re-edit a wide-angle video recording or a close-up movie sequence captured with a static or moving camera, and (b) independent of video length, and can in principle re-edit an entire movie in a single pass. Our algorithm consists of two steps. The first step employs gaze transition cues to detect time stamps where new cuts are to be introduced in the original video, via dynamic programming. A subsequent step optimizes the cropping window path (to create pan and zoom effects) while accounting for the original and new cuts. The cropping window path is designed to include maximum gaze information and is composed of piecewise constant, linear, and parabolic segments. It is obtained via L1-regularized convex optimization, which ensures a smooth viewing experience. We test our approach on a wide variety of videos and demonstrate significant improvement over the state-of-the-art, both in terms of computational complexity and qualitative aspects. A study performed with 16 users confirms that our approach results in a superior viewing experience compared to gaze-driven re-editing [JSSH15] and letterboxing methods, especially for wide-angle static camera recordings.
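The path optimization admits a compact sketch with cvxpy (an assumed, illustrative solver choice): L1 penalties on the first, second, and third temporal differences encourage the piecewise constant, linear, and parabolic segments the abstract describes. The gaze-fidelity term and weights are illustrative, not the paper's.

```python
import cvxpy as cp
import numpy as np

def crop_path(gaze_x, crop_w, frame_w, l1=10.0, l2=100.0, l3=100.0):
    """Optimize the horizontal center of a fixed-width cropping window:
    track gaze, with L1 penalties on 1st/2nd/3rd differences yielding
    piecewise constant / linear / parabolic (static / pan / ease) motion."""
    n = len(gaze_x)
    x = cp.Variable(n)
    cost = (cp.sum_squares(x - gaze_x)
            + l1 * cp.norm1(cp.diff(x, 1))
            + l2 * cp.norm1(cp.diff(x, 2))
            + l3 * cp.norm1(cp.diff(x, 3)))
    bounds = [x >= crop_w / 2, x <= frame_w - crop_w / 2]  # stay in frame
    cp.Problem(cp.Minimize(cost), bounds).solve()
    return x.value

# e.g. path = crop_path(np.asarray(gaze_samples), crop_w=720, frame_w=1920)
```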

11.
12.
Image geo-tagging has drawn a great deal of attention in recent years. The geographic information associated with images can be used to enable applications such as location recognition and virtual navigation. In this paper, we propose a novel approach for accurate mobile image geo-tagging in urban areas. The approach provides a comprehensive set of geo-context information derived from the current image, including the real location of the camera and its viewing angle, as well as the location of the captured scene. Moreover, the parsed building facades and their geometric structures can also be estimated. First, for the image to be geo-tagged, we perform partial-duplicate image retrieval to filter crowd-sourced images capturing the same scene. We then employ the structure-from-motion technique to reconstruct a sparse 3D point cloud of the scene. Meanwhile, the geometric structure of the query image is analyzed to extract building facades. Finally, by combining the reconstructed 3D scene model and the extracted structure information, we register the camera location and viewing direction to a real-world map. The captured building location and facade orientation are also aligned. The effectiveness of the proposed system is demonstrated by experimental results.
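The final registration step might look like the following sketch, which assumes a similarity transform aligning the SfM reconstruction to geo-referenced points has already been estimated; how the paper obtains that alignment is not stated here, and all names and conventions are illustrative.

```python
import numpy as np

def camera_geo_pose(R, t, sim_scale, sim_R, sim_t):
    """Map an SfM camera pose (world-to-camera: x_cam = R @ X + t) into map
    coordinates via a similarity transform (sim_scale, sim_R, sim_t).
    Returns the camera position and its heading vector on the map."""
    C = -R.T @ t                                        # camera center
    C_geo = sim_scale * sim_R @ C + sim_t               # center on the map
    heading = sim_R @ R.T @ np.array([0.0, 0.0, 1.0])   # optical axis
    return C_geo, heading
```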

13.
In this paper we propose a novel method for detecting and removing shadows from a single image thereby obtaining a high-quality shadow-free image. With minimal user assistance, we first identify shadowed and lit areas on the same surface in the scene using an illumination-invariant distance measure. These areas are used to estimate the parameters of an affine shadow formation model. A novel pyramid-based restoration process is then applied to produce a shadow-free image, while avoiding loss of texture contrast and introduction of noise. Unlike previous approaches, we account for varying shadow intensity inside the shadowed region by processing it from the interior towards the boundaries. Finally, to ensure a seamless transition between the original and the recovered regions we apply image inpainting along a thin border. We demonstrate that our approach produces results that are in most cases superior in quality to those of previous shadow removal methods. We also show that it is possible to easily composite the extracted shadow onto a new background or modify its size and direction in the original image.
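The affine shadow formation model admits a compact sketch. Moment matching below is an assumed stand-in for the paper's parameter estimation, and the pyramid-based restoration and boundary inpainting are omitted entirely.

```python
import numpy as np

def affine_shadow_params(shadow_px, lit_px):
    """Per-channel affine model lit = a * shadowed + b, estimated by matching
    the first two moments of shadowed and lit samples (N x 3 arrays) taken
    from the same surface."""
    a = lit_px.std(axis=0) / (shadow_px.std(axis=0) + 1e-6)
    b = lit_px.mean(axis=0) - a * shadow_px.mean(axis=0)
    return a, b

def remove_shadow(img, mask, a, b):
    """Apply the affine correction inside the boolean shadow mask."""
    out = img.astype(float)
    out[mask] = out[mask] * a + b
    return np.clip(out, 0, 255).astype(np.uint8)
```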

14.
Human beings are very skillful at reaching for and grasping objects under varied conditions, even when faced with a wide variety of object positions, locations, structures, and orientations. This natural ability, controlled by the human brain, is called eye–hand coordination. To understand this behavior it is necessary to study eye and hand movements simultaneously. This paper proposes a novel approach to detecting grasping movements by means of computer vision techniques. The solution fuses two viewpoints: one obtained from an eye tracker capturing the user's perspective, and a second captured by a wearable camera attached to the user's wrist. Using information from these two viewpoints, it is possible to characterize multiple hand movements in conjunction with eye-gaze movements through a hidden Markov model framework. This paper shows that combining the two sources makes it possible to detect hand gestures using only the objects contained in the scene, without markers on the object surfaces. In addition, it is possible to predict which object the user intends to grasp before the grasp actually occurs.
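A hedged sketch of the HMM framework using hmmlearn, an illustrative library choice that is not necessarily the authors'; the fused per-frame features and the three movement phases are assumptions.

```python
import numpy as np
from hmmlearn import hmm

def train_grasp_hmm(sequences, n_states=3):
    """Fit a Gaussian HMM over fused eye + wrist-camera features (e.g. gaze
    velocity, hand velocity, grip aperture per frame); the hidden states
    play the role of movement phases such as reach, pre-shape, and grasp."""
    X = np.vstack(sequences)
    lengths = [len(s) for s in sequences]
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    return model

def detect_phases(model, seq):
    """Most likely movement phase per frame (Viterbi decoding)."""
    return model.predict(seq)
```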

15.
We present a novel approach to directing and controlling virtual crowds using navigation fields. Our method guides one or more agents toward desired goals based on guidance fields. The system allows the user to specify these fields either by sketching paths directly in the scene via an intuitive authoring interface or by importing motion flow fields extracted from crowd video footage. We propose a novel formulation for blending the input guidance fields to create singularity-free, goal-directed navigation fields. Our method can be easily combined with current local collision-avoidance methods, and we use two such methods as examples to highlight the potential of our approach. We illustrate its performance on several simulation scenarios.
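A deliberately simple stand-in showing only the data flow of field blending: a weighted sum of input vector fields followed by renormalization. This does not reproduce the paper's singularity-free formulation, which is its actual contribution.

```python
import numpy as np

def blend_guidance_fields(fields, weights, eps=1e-6):
    """Blend sketched or video-extracted guidance fields, each an (h, w, 2)
    vector field, into one navigation field; renormalizing keeps agent
    speed well defined, though cancellation regions remain degenerate."""
    blended = sum(w * f for w, f in zip(weights, fields))
    mag = np.linalg.norm(blended, axis=-1, keepdims=True)
    return blended / np.maximum(mag, eps)
```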

16.
A method is proposed for estimating the position of people in a scene when their head locations are known in the image plane. An extension of the approach is presented for processing several observations of the same person. It is shown that the proposed algorithm can be incorporated into existing tracking methods that operate on video from a static camera.
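One common way to realize this, assuming a calibrated camera and an assumed average head height above the ground plane; the extension to multiple observations (e.g. averaging the per-frame estimates) is omitted.

```python
import numpy as np

def ground_position(head_uv, K, R, t, head_height=1.7):
    """Back-project an image head location to 3D by intersecting its viewing
    ray with the plane z = head_height (z up, ground at z = 0). The camera
    is given by intrinsics K and world-to-camera pose x_cam = R @ X + t."""
    uv1 = np.array([head_uv[0], head_uv[1], 1.0])
    ray = R.T @ np.linalg.inv(K) @ uv1      # ray direction in world frame
    origin = -R.T @ t                       # camera center in world frame
    s = (head_height - origin[2]) / ray[2]  # intersect the head plane
    head_3d = origin + s * ray
    return head_3d[:2]                      # (x, y) position on the ground
```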

17.
Omnidirectional video enables direct surround immersive viewing of a scene by warping the original image into the correct perspective for a given viewing direction. However, novel views from viewpoints off the camera path can only be obtained by solving the three-dimensional motion and calibration problem. In this paper we address the case of a parabolic catadioptric camera – a paraboloidal mirror in front of an orthographic lens – and we introduce a new representation, called the circle space, for points and lines in such images. In this circle space, we formulate an epipolar constraint involving a 4×4 fundamental matrix. We prove that the intrinsic parameters can be inferred in closed form from the two-dimensional subspace of the new fundamental matrix: from two views if the parameters are constant, or from three views if they vary. Three-dimensional motion and structure can then be estimated from the decomposition of the fundamental matrix.

18.
19.
Existing tracking methods designed for interaction with projection-based displays generally require visible artifacts to be introduced into the environment in order to guarantee adequate stability and accuracy. For instance, in optical approaches, either the camera sensor or the reference pattern used for tracking is often located within the user's sight (or interferes with it), thus occluding portions of the scene or altering the perception of the virtual environment. Several ways to tackle these issues have recently been explored; the proposed approaches basically aim at making the presence of tracking references in the virtual space transparent to the user. However, such solutions introduce potentially critical constraints on the required hardware or environment configuration. In this work, a novel tracking approach based on imperceptible fiducial markers is proposed. The approach relies on a hiding technique that allows digital images to be embedded in (and retrieved from) a projected scene by exploiting the properties of light polarization and additive color mixing. In particular, the virtual scene is produced by overlapping the light beams of two projectors, with marker hiding handled via color compensation. A prototype setup was deployed in which interaction with a flat-surface projection environment was evaluated in terms of tracking accuracy and artifact-avoidance performance, using a consumer camera equipped with a polarizing filter. Although the tests presented in this article represent only a preliminary and partial evaluation of the proposed approach, they provide encouraging results indicating that the technique could be applied in more complex interaction scenarios with limited hardware requirements.
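A toy sketch of the complementary-image idea under strong assumptions: the two projector beams mix additively, so the embedded pattern cancels for the unaided viewer, while a polarizing filter isolates one beam for the tracking camera. The function and its parameters are illustrative, not the deployed prototype's pipeline.

```python
import numpy as np

def embed_marker(frame, marker, strength=8):
    """Split one frame into two complementary projector images: one adds the
    marker pattern, the other subtracts it. Their additive sum approximates
    the original frame; a polarizer-equipped camera sees only one beam,
    where the marker remains detectable. marker: (h, w) array in {0, 1}."""
    delta = strength * (marker.astype(float) - 0.5)
    a = np.clip(frame.astype(float) + delta[..., None], 0, 255).astype(np.uint8)
    b = np.clip(frame.astype(float) - delta[..., None], 0, 255).astype(np.uint8)
    return a, b
```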

20.
The system described in this paper provides a real-time 3D visual experience by using an array of 64 video cameras and an integral photography display with 60 viewing directions. The live 3D scene in front of the camera array is reproduced by the full-color, full-parallax autostereoscopic display with interactive control of viewing parameters. The main technical challenge is fast and flexible conversion of the data from the 64 multicamera images to the integral photography format. Based on image-based rendering techniques, our conversion method first renders 60 novel images corresponding to the viewing directions of the display, and then arranges the rendered pixels to produce an integral photography image. For real-time processing on a single PC, all the conversion processes are implemented on a GPU with GPGPU techniques. The conversion method also allows a user to interactively control viewing parameters of the displayed image for reproducing the dynamic 3D scene with desirable parameters. This control is performed as a software process, without reconfiguring the hardware system, by changing the rendering parameters such as the convergence point of the rendering cameras and the interval between the viewpoints of the rendering cameras.
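The pixel-rearrangement step can be sketched as a simple interleave, assuming one rendered image per viewing direction and a lenslet covering a lens_h × lens_w block of display pixels. Real displays additionally need calibration and direction-dependent flipping, and the GPU implementation is not shown; this is a CPU-side illustration only.

```python
import numpy as np

def to_integral_photography(views, lens_w, lens_h):
    """Interleave per-direction rendered views into an integral photography
    image: within each lenslet's pixel block, pixel (dy, dx) comes from
    directional view index dy * lens_w + dx. Requires
    len(views) == lens_w * lens_h."""
    n, h, w, c = np.asarray(views).shape
    assert n == lens_w * lens_h
    out = np.zeros((h * lens_h, w * lens_w, c), dtype=views[0].dtype)
    for i, view in enumerate(views):
        dy, dx = divmod(i, lens_w)
        out[dy::lens_h, dx::lens_w] = view
    return out
```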
