Similar Documents
20 similar documents found (search time: 31 ms)
1.
We describe a novel multiplexing approach to achieve tradeoffs in space, angle and time resolution in photography. We explore the problem of mapping useful subsets of time‐varying 4D lightfields in a single snapshot. Our design is based on using a dynamic mask in the aperture and a static mask close to the sensor. The key idea is to exploit scene‐specific redundancy along spatial, angular and temporal dimensions and to provide a programmable or variable resolution tradeoff among these dimensions. This allows a user to reinterpret the single captured photo as either a high spatial resolution image, a refocusable image stack or a video for different parts of the scene in post‐processing. A lightfield camera or a video camera forces an a‐priori choice of space‐angle‐time resolution. We demonstrate a single prototype which provides flexible post‐capture abilities not possible using either a single‐shot lightfield camera or a multi‐frame video camera. We show several novel results including digital refocusing on objects moving in depth and capturing multiple facial expressions in a single photo.

2.
Automatic camera control for scenes depicting human motion is an important topic in motion capture‐based animation, computer games, and other animation‐based fields. This challenging control problem combines geometric constraints, visibility requirements, and aesthetic elements. Therefore, existing optimization‐based approaches for human action overview are often too demanding for online computation. In this paper, we introduce an effective automatic camera control which is extremely efficient and allows online performance. Rather than optimizing a complex quality measurement, at each time step it selects one active camera from a multitude of cameras that render the dynamic scene. The selection is based on the correlation between each view stream and the human motion in the scene. Two factors allow for rapid selection among tens of candidate views in real‐time, even for complex multi‐character scenes: the efficient rendering of the multitude of view streams, and optimized calculation of the correlations using a modified CCA. In addition to the method's simplicity and speed, it exhibits good agreement with both cinematic idioms and previous human motion camera control work. Our evaluations show that the method is able to cope with the challenges put forth by severe occlusions, multiple characters and complex scenes.
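As a concrete illustration of the selection step described above, here is a minimal Python sketch: it scores each candidate view by the canonical correlation between its rendered stream and the scene's motion signal, then picks the best view. The feature shapes and the use of scikit-learn's plain CCA are illustrative assumptions; the paper's modified CCA and feature extraction are not reproduced.

```python
# Minimal sketch of correlation-based camera selection (assumptions: per-view
# feature streams and a motion-feature matrix are extracted elsewhere; plain
# scikit-learn CCA stands in for the paper's modified CCA).
import numpy as np
from sklearn.cross_decomposition import CCA

def view_motion_correlation(view_feats, motion_feats):
    """Canonical correlation between one view stream (T, d_v) and the
    scene motion signal (T, d_m) over a temporal window."""
    u, v = CCA(n_components=1).fit_transform(view_feats, motion_feats)
    return abs(np.corrcoef(u[:, 0], v[:, 0])[0, 1])

def select_active_camera(all_view_feats, motion_feats):
    """Pick the view whose stream correlates best with the human motion."""
    scores = [view_motion_correlation(f, motion_feats) for f in all_view_feats]
    return int(np.argmax(scores))

# Toy usage: 10 candidate views over a 30-frame window.
rng = np.random.default_rng(0)
views = [rng.standard_normal((30, 16)) for _ in range(10)]
motion = rng.standard_normal((30, 8))
print("active camera:", select_active_camera(views, motion))
```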

3.
We present a near‐instant method for acquiring facial geometry and reflectance using a set of commodity DSLR cameras and flashes. Our setup consists of twenty‐four cameras and six flashes which are fired in rapid succession with subsets of the cameras. Each camera records only a single photograph and the total capture time is less than the 67 ms blink reflex. The cameras and flashes are specially arranged to produce an even distribution of specular highlights on the face. We employ this set of acquired images to estimate diffuse color, specular intensity, specular exponent, and surface orientation at each point on the face. We further refine the facial base geometry obtained from multi‐view stereo using the estimated diffuse and specular photometric information. This allows submillimeter surface mesostructure detail to be obtained via shape‐from‐specularity. The final system uses commodity components and produces models suitable for authoring high‐quality digital human characters.

4.
We present a novel multi‐view, projective texture mapping technique. While previous multi‐view texturing approaches lead to blurring and ghosting artefacts if the 3D geometry and/or camera calibration are imprecise, we propose a texturing algorithm that warps (“floats”) projected textures at run‐time to preserve crisp, detailed texture appearance. Our GPU implementation achieves interactive to real‐time frame rates. The method is generally applicable and can be used in combination with many image‐based rendering methods or projective texturing applications. By using Floating Textures in conjunction with, e.g., visual hull rendering, light field rendering, or free‐viewpoint video, improved rendering results are obtained from fewer input images, less accurately calibrated cameras, and coarser 3D geometry proxies.

5.
Light field videos express the entire visual information of an animated scene, but their sheer size typically makes capture, processing and display an off‐line process, i.e., the time between initial capture and final display is far from real‐time. In this paper we propose a solution for one of the key bottlenecks in such a processing pipeline: reliable depth reconstruction, possibly for many views. This is enabled by a novel correspondence algorithm converting the video streams from a sparse array of off‐the‐shelf cameras into an array of animated depth maps. The algorithm is based on a generalization of the classic multi‐resolution Lucas‐Kanade correspondence algorithm from a pair of images to an entire array. A special inter‐image confidence consolidation allows recovery from unreliable matching in some locations and some views. It can be implemented efficiently in massively parallel hardware, allowing for interactive computations. The resulting depth quality as well as the computation performance compares favorably to other state‐of‐the‐art light field‐to‐depth approaches, as well as stereo matching techniques. Another outcome of this work is a data set of light field videos that are captured with multiple variants of sparse camera arrays.
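The pairwise building block the paper generalizes is classic pyramidal Lucas‐Kanade matching; a short sketch of that block via OpenCV follows. The array-wide generalization and the inter-image confidence consolidation are the paper's contributions and are not reproduced here; window size and pyramid depth are illustrative.

```python
# Sketch of the pairwise building block behind the array-wide correspondence:
# classic multi-resolution Lucas-Kanade tracking between two neighboring
# views, via OpenCV. Parameter values are illustrative.
import cv2

def pairwise_lk(img_a, img_b, max_corners=500):
    """Track sparse features from view A to view B with pyramidal LK."""
    gray_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
    pts_a = cv2.goodFeaturesToTrack(gray_a, maxCorners=max_corners,
                                    qualityLevel=0.01, minDistance=7)
    pts_b, status, err = cv2.calcOpticalFlowPyrLK(
        gray_a, gray_b, pts_a, None, winSize=(21, 21), maxLevel=4)
    good = status.ravel() == 1
    # `err` is a per-point matching residual; it could seed the per-location
    # confidence that an array-wide consolidation pass would then refine.
    return pts_a[good], pts_b[good], err[good]
```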

6.
4D Video Textures (4DVT) introduce a novel representation for rendering video‐realistic interactive character animation from a database of 4D actor performances captured in a multiple‐camera studio. 4D performance capture reconstructs dynamic shape and appearance over time but is limited to free‐viewpoint video replay of the same motion. Interactive animation from 4D performance capture has so far been limited to surface shape only. 4DVT is the final piece in the puzzle, enabling video‐realistic interactive animation through two contributions: a layered view‐dependent texture map representation which supports efficient storage, transmission and rendering from multiple‐view video capture; and a rendering approach that combines multiple 4DVT sequences in a parametric motion space, maintaining video‐quality rendering of dynamic surface appearance whilst allowing high‐level interactive control of character motion and viewpoint. 4DVT is demonstrated for multiple characters and evaluated both quantitatively and through a user study which confirms that the visual quality of the captured video is maintained. The 4DVT representation achieves a >90% reduction in size and halves the rendering cost.

7.
We present a novel approach for analyzing the quality of multi‐agent crowd simulation algorithms. Our approach is data‐driven, taking as input a set of user‐defined metrics and reference training data, either synthetic or from video footage of real crowds. Given a simulation, we formulate crowd analysis as an anomaly detection problem and exploit state‐of‐the‐art outlier detection algorithms to address it. To that end, we introduce a new framework for the visual analysis of crowd simulations. Our framework allows us to capture potentially erroneous behaviors on a per‐agent basis, either by automatically detecting outliers based on individual evaluation metrics or by accounting for multiple evaluation criteria in a principled fashion using Principal Component Analysis and the notion of Pareto optimality. We discuss optimizations necessary to allow real‐time performance on large datasets and demonstrate the applicability of our framework through the analysis of simulations created by several widely‐used methods, including a simulation from a commercial game.
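For the multi-criteria part, one plausible reading is that an agent is flagged when no other agent dominates its per-metric error vector, i.e. it lies on the Pareto front of high error. Below is a hedged numpy sketch of that test; the metric names and the higher-is-worse convention are assumptions, not the paper's exact formulation.

```python
# Flag agents on the Pareto front of their per-metric error vectors as
# candidate outliers (one plausible reading of the Pareto-optimality idea;
# metric values are assumed precomputed, higher is worse).
import numpy as np

def pareto_front_outliers(errors):
    """errors: (n_agents, n_metrics). Returns a boolean mask of agents not
    dominated by any other agent (maximally erroneous in the Pareto sense)."""
    n = errors.shape[0]
    on_front = np.ones(n, dtype=bool)
    for i in range(n):
        others = np.delete(errors, i, axis=0)
        # Agent i is dominated if some other agent is >= on every metric
        # and strictly > on at least one.
        dominated = np.any(np.all(others >= errors[i], axis=1) &
                           np.any(others > errors[i], axis=1))
        on_front[i] = not dominated
    return on_front

# Usage with 3 assumed metrics (e.g. speed deviation, collisions, path error).
errs = np.array([[0.1, 0.2, 0.1],
                 [0.9, 0.8, 0.7],   # clearly anomalous agent
                 [0.2, 0.1, 0.3]])
print(pareto_front_outliers(errs))  # -> [False  True False]
```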

8.
Capturing exposure sequences to compute high dynamic range (HDR) images causes motion blur in cases of camera movement. This also applies to light‐field cameras: frames rendered from multiple blurred HDR light‐field perspectives are also blurred. While the recording times of exposure sequences cannot be reduced for a single‐sensor camera, we demonstrate how this can be achieved for a camera array. Thus, we decrease capture time and reduce motion blur for HDR light‐field video recording. Applying a spatio‐temporal exposure pattern while capturing frames with a camera array reduces the overall recording time and enables the estimation of camera movement within one light‐field video frame. By estimating depth maps and local point spread functions (PSFs) from multiple perspectives with the same exposure, regional motion deblurring can be supported. Missing exposures at various perspectives are then interpolated.
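A minimal sketch of what such a spatio-temporal exposure pattern could look like: a round-robin assignment in which every exposure of the bracket is always present at several perspectives simultaneously. The round-robin layout is an illustrative assumption, not necessarily the paper's specific pattern.

```python
# Minimal sketch of a spatio-temporal exposure pattern for a camera array:
# at each frame, each camera gets one exposure from the HDR bracket, and the
# assignment rotates over time. Round-robin layout is an assumption.
import numpy as np

def staggered_exposure_pattern(n_cameras, exposures, n_frames):
    """Return an (n_frames, n_cameras) array of exposure times (seconds)."""
    exposures = np.asarray(exposures)
    pattern = np.empty((n_frames, n_cameras))
    for t in range(n_frames):
        for c in range(n_cameras):
            pattern[t, c] = exposures[(c + t) % len(exposures)]
    return pattern

# 6 cameras cycling through a 3-exposure bracket: at any frame, each exposure
# is seen from two perspectives, which is what enables per-exposure depth and
# PSF estimation across views.
print(staggered_exposure_pattern(6, [1/250, 1/60, 1/15], n_frames=2))
```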

9.
Recent work has shown that it is possible to register multiple projectors on non‐planar surfaces using a single uncalibrated camera instead of a calibrated stereo pair, when dealing with a special class of non‐planar surfaces: vertically extruded surfaces. However, this requires the camera view to contain the entire display surface. This is often impossible for large displays, which are especially common in visualization, edutainment, training and simulation applications. In this paper we present a new method that achieves accurate geometric registration even when the field‐of‐view of the uncalibrated camera can cover only a part of the vertically extruded display at a time. We pan and tilt the camera from a single point and employ a multi‐view approach to register the projectors on the display. This allows the method to scale easily both in terms of camera resolution and display size. To the best of our knowledge, our method is the first to achieve a scalable multi‐view geometric registration of large vertically extruded displays with a single uncalibrated camera. The method can also handle the different situation of multiple similarly oriented cameras in different locations, provided the camera focal length is known.

10.
We present a method for capturing the skeletal motions of humans using a sparse set of potentially moving cameras in an uncontrolled environment. Our approach is able to track multiple people even in front of cluttered and non‐static backgrounds, with unsynchronized cameras of varying image quality and frame rate. We rely entirely on optical information and do not make use of additional sensor information (e.g. depth images or inertial sensors). Our algorithm simultaneously reconstructs the skeletal pose parameters of multiple performers and the motion of each camera. This is facilitated by a new energy functional that captures the alignment of the model and the camera positions with the input videos in an analytic way. The approach can be adopted in many practical applications to replace complex and expensive motion capture studios with a few consumer‐grade cameras, even in uncontrolled outdoor scenes. We demonstrate this on challenging multi‐view video sequences captured with unsynchronized and moving (e.g. mobile‐phone or GoPro) cameras.

11.
We present a multi‐view stereo reconstruction technique that directly produces a complete high‐fidelity head model with consistent facial mesh topology. While existing techniques decouple shape estimation and facial tracking, our framework jointly optimizes for stereo constraints and consistent mesh parameterization. Our method is therefore free from drift and fully parallelizable for dynamic facial performance capture. We produce highly detailed facial geometries with artist‐quality UV parameterization, including secondary elements such as eyeballs, mouth pockets, nostrils, and the back of the head. Our approach consists of deforming a common template model to match multi‐view input images of the subject, while satisfying cross‐view, cross‐subject, and cross‐pose consistencies using a combination of 2D landmark detection, optical flow, and surface and volumetric Laplacian regularization. Since the flow is never computed between frames, our method is trivially parallelized by processing each frame independently. Accurate rigid head pose is extracted using a PCA‐based dimension reduction and denoising scheme. We demonstrate high‐fidelity performance capture results with challenging head motion and complex facial expressions around eye and mouth regions. While the quality of our results is on par with the current state‐of‐the‐art, our approach can be fully parallelized, does not suffer from drift, and produces face models with production‐quality mesh topologies.

12.
Videos captured by consumer cameras often exhibit temporal variations in color and tone caused by camera auto‐adjustments like white‐balance and exposure. When such videos are sub‐sampled to play fast‐forward, as in the increasingly popular timelapse and hyperlapse formats, these temporal variations are exacerbated and appear as visually disturbing high‐frequency flickering. Previous techniques to photometrically stabilize videos typically rely on computing dense correspondences between video frames, and use these correspondences to remove all color changes in the video sequence. However, this approach is limited in fast‐forward videos, which often have large content changes and may also exhibit changes in scene illumination that should be preserved. In this work, we propose a novel photometric stabilization algorithm for fast‐forward videos that is robust to large content variation across frames. We compute pairwise color and tone transformations between neighboring frames and smooth these pairwise transformations while taking into account the possibility of scene/content variations. This allows us to eliminate high‐frequency fluctuations while still adapting to real variations in scene characteristics. We evaluate our technique on a new dataset consisting of controlled synthetic and real videos, and demonstrate that it outperforms the state‐of‐the‐art.
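A hedged Python sketch of the core idea: estimate a simple per-channel gain between neighboring frames, then remove only the high-frequency part of the accumulated photometric drift, so that slow, real illumination changes survive. The mean-based gain and Gaussian smoothing are simplifying assumptions; the paper's pairwise transforms and content-aware smoothing are richer.

```python
# Hedged sketch: per-channel gains between neighboring frames, with only the
# high-frequency part of the accumulated photometric drift removed, so slow
# real illumination changes survive. Mean-based gains and Gaussian smoothing
# are simplifying assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def stabilize(frames, sigma=5.0):
    """frames: list of float (H, W, 3) images in [0, 1]; returns corrected copies."""
    means = np.array([f.reshape(-1, 3).mean(axis=0) for f in frames])
    log_gain = np.diff(np.log(means + 1e-6), axis=0)   # frame-to-frame log gains
    traj = np.vstack([np.zeros(3), np.cumsum(log_gain, axis=0)])  # photometric trajectory
    smooth = gaussian_filter1d(traj, sigma=sigma, axis=0)         # low-frequency part
    # Multiply each frame by the inverse of the high-frequency residual.
    return [np.clip(f * np.exp(smooth[i] - traj[i]), 0.0, 1.0)
            for i, f in enumerate(frames)]
```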

13.
3D garment capture is an important component of various applications such as free‐viewpoint video, virtual avatars, online shopping, and virtual cloth fitting. Due to the complexity of the deformations, capturing 3D garment shapes requires controlled and specialized setups. A viable alternative is image‐based garment capture. Capturing 3D garment shapes from a single image, however, is a challenging problem, and current solutions come with assumptions on the lighting, camera calibration, complexity of the human or mannequin poses considered, and, more importantly, a stable physical state for the garment and the underlying human body. In addition, most of these works require manual interaction and exhibit high run‐times. We propose a new technique that overcomes these limitations, making garment shape estimation from an image a practical approach for dynamic garment capture. Starting from synthetic garment shape data generated through physically based simulations from various human bodies in complex poses obtained through MoCap sequences, and rendered under varying camera positions and lighting conditions, our novel method learns a mapping from rendered garment images to the underlying 3D garment model. This is achieved by training Convolutional Neural Networks (CNNs) to estimate 3D vertex displacements from a template mesh with a specialized loss function. We illustrate that this technique is able to recover the global shape of dynamic 3D garments from a single image under varying factors such as challenging human poses, self‐occlusions, and various camera poses and lighting conditions, at interactive rates. Results improve further when more than one view is integrated. Additionally, we show applications of our method to videos.
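An illustrative PyTorch sketch of the learning setup: a small CNN regressing per-vertex 3D displacements from a rendered garment image. The architecture, input size, and vertex count are assumptions, and a plain L2 vertex loss stands in for the paper's specialized loss.

```python
# Illustrative sketch (assumed architecture, input size and vertex count):
# a small CNN regressing per-vertex 3D displacements of a template mesh
# from a rendered garment image. Plain L2 stands in for the paper's loss.
import torch
import torch.nn as nn

N_VERTS = 5000  # assumed template-mesh vertex count

class GarmentNet(nn.Module):
    def __init__(self, n_verts=N_VERTS):
        super().__init__()
        self.n_verts = n_verts
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(128 * 4 * 4, n_verts * 3)

    def forward(self, img):                 # img: (B, 3, H, W)
        x = self.features(img).flatten(1)
        return self.head(x).view(-1, self.n_verts, 3)  # per-vertex displacements

# One toy optimization step on random data.
net = GarmentNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
img, target = torch.randn(2, 3, 128, 128), torch.randn(2, N_VERTS, 3)
opt.zero_grad()
loss = ((net(img) - target) ** 2).mean()
loss.backward()
opt.step()
```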

14.
We present a markerless performance capture system that can acquire the motion and the texture of human actors performing fast movements using only commodity hardware. To this end we introduce two novel concepts: First, a staggered surround multi‐view recording setup that enables us to perform model‐based motion capture on motion‐blurred images, and second, a model‐based deblurring algorithm which is able to handle disocclusion, self‐occlusion and complex object motions. We show that the model‐based approach is not only a powerful strategy for tracking but also for deblurring highly complex blur patterns.

15.
Cinemagraphs are a popular new type of visual media that lie between photos and video: some parts of the frame are animated and loop seamlessly, while other parts of the frame remain completely still. Cinemagraphs are especially effective for portraits because they capture the nuances of our dynamic facial expressions. We present a completely automatic algorithm for generating portrait cinemagraphs from a short video captured with a hand‐held camera. Our algorithm uses a combination of face tracking and point tracking to segment face motions into two classes: gross, large‐scale motions that should be removed from the video, and dynamic facial expressions that should be preserved. This segmentation informs a spatially‐varying warp that removes the large‐scale motion, and a graph‐cut segmentation of the frame into dynamic and still regions that preserves the finer‐scale facial expression motions. We demonstrate the success of our method with a variety of results and a comparison to previous work.
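A sketch of the "remove gross motion" step under a strong simplification: fit a global similarity transform from each frame's tracked face points back to a reference frame and warp with it. The paper instead uses a spatially-varying warp; this single global transform is only a stand-in.

```python
# Simplified sketch of gross-motion removal: a global similarity transform
# (4 DoF) fitted to tracked landmarks replaces the paper's spatially-varying
# warp, leaving finer expression motion in place.
import cv2

def remove_gross_motion(frame, pts_frame, pts_ref):
    """Warp `frame` so its tracked points align with the reference points.

    pts_frame, pts_ref: (N, 2) float32 arrays of corresponding landmarks.
    """
    M, _inliers = cv2.estimateAffinePartial2D(pts_frame, pts_ref)
    h, w = frame.shape[:2]
    return cv2.warpAffine(frame, M, (w, h))
```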

16.
Controlling a crowd using multi‐touch devices appeals to the computer games and animation industries, as such devices provide a high‐dimensional control signal that can effectively define the crowd formation and movement. However, existing works relying on pre‐defined control schemes require the users to learn a scheme that may not be intuitive. We propose a data‐driven gesture‐based crowd control system, in which the control scheme is learned from example gestures provided by different users. In particular, we build a database of paired samples of gestures and crowd motions. To effectively generalize the gesture style of different users, such as the use of different numbers of fingers, we propose a set of gesture features for representing a set of hand gesture trajectories. Similarly, to represent crowd motion trajectories of different numbers of characters over time, we propose a set of crowd motion features that are extracted from a Gaussian mixture model. Given a run‐time gesture, our system retrieves the K nearest gestures from the database and interpolates the corresponding crowd motions in order to generate the run‐time control. Our system is accurate and efficient, making it suitable for real‐time applications such as real‐time strategy games and interactive animation controls.
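A minimal numpy sketch of the run-time lookup: retrieve the K nearest example gestures in feature space and blend their paired crowd motions with inverse-distance weights. The blend is an illustrative stand-in for the paper's interpolation, and feature extraction is assumed done elsewhere.

```python
# K-nearest-neighbor lookup over paired (gesture, crowd motion) samples;
# inverse-distance blending is an assumed stand-in for the paper's
# interpolation scheme.
import numpy as np

def knn_crowd_control(query_feat, gesture_feats, crowd_motions, k=3):
    """gesture_feats: (n, d); crowd_motions: (n, m) paired motion features."""
    d = np.linalg.norm(gesture_feats - query_feat, axis=1)
    idx = np.argsort(d)[:k]                 # K nearest example gestures
    w = 1.0 / (d[idx] + 1e-6)
    w /= w.sum()
    return w @ crowd_motions[idx]           # weighted blend of their motions
```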

17.
In virtual reality (VR) applications, the content is usually generated by creating a 360° video panorama of a real‐world scene. Although many capture devices are being released, producing high‐resolution panoramas and displaying a virtual world in real‐time remains challenging due to its computationally demanding nature. In this paper, we propose a real‐time 360° video foveated stitching framework that renders the scene at different levels of detail, aiming to create a high‐resolution panoramic video in real‐time that can be streamed directly to the client. Our foveated stitching algorithm takes videos from multiple cameras as input and, combined with measurements of human visual attention (i.e., the acuity map and the saliency map), greatly reduces the number of pixels to be processed. We further parallelize the algorithm on the GPU to achieve a responsive interface and validate our results via a user study. Our system accelerates graphics computation by a factor of 6 on a Google Cardboard display.
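As one ingredient, here is a hedged sketch of an acuity map that falls off with eccentricity from the gaze point, which can drive how many pixels each panorama region receives. The 1/(1 + e/e2) falloff and the constants are illustrative assumptions, not the paper's calibration.

```python
# Hedged sketch of an eccentricity-based acuity map (the 1/(1 + e/e2)
# falloff, pixels-per-degree and e2 constants are illustrative assumptions).
import numpy as np

def acuity_map(height, width, gaze, ppd=15.0, e2=2.3):
    """Relative acuity per pixel; gaze is (row, col) of the gaze point."""
    ys, xs = np.mgrid[0:height, 0:width]
    ecc_deg = np.hypot(ys - gaze[0], xs - gaze[1]) / ppd  # eccentricity in degrees
    return 1.0 / (1.0 + ecc_deg / e2)

# Low-acuity regions can be stitched and rendered at coarser levels of detail.
amap = acuity_map(1080, 1920, gaze=(540, 960))
```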

18.
Eleven tone‐mapping operators intended for video processing are analyzed and evaluated with camera‐captured and computer‐generated high‐dynamic‐range content. After optimizing the parameters of the operators in a formal experiment, we inspect and rate the artifacts (flickering, ghosting, temporal color consistency) and color rendition problems (brightness, contrast and color saturation) they produce. This allows us to identify major problems and challenges that video tone‐mapping needs to address. Then, we compare the tone‐mapping results in a pair‐wise comparison experiment to identify the operators that, on average, can be expected to perform better than the others and to assess the magnitude of differences between the best performing operators.

19.
Statistical shape modeling is a widely used technique for the representation and analysis of the shapes and shape variations present in a population. A statistical shape model models the distribution in a high‐dimensional shape space, where each shape is represented by a single point. We present a design study on the intuitive exploration and visualization of shape spaces and shape models. Our approach focuses on the dual‐space nature of these spaces. The high‐dimensional shape space represents the population, whereas object space represents the shape of the 3D object associated with a point in shape space. A 3D object view provides local details for a single shape. The high‐dimensional points in shape space are visualized using a 2D scatter plot projection, the axes of which can be manipulated interactively. This results in a dynamic scatter plot, with the further extension that each point is visualized as a small version of the object shape that it represents. We further enhance the population‐object duality with a new type of view aimed at shape comparison. This new “shape evolution view” visualizes shape variability along a single trajectory in shape space, and serves as a link between the two spaces described above. Our three‐view exploration concept strongly emphasizes linked interaction between all spaces. Moving the cursor over the scatter plot or evolution views, shapes are dynamically interpolated and shown in the object view. Conversely, camera manipulation in the object view affects the object visualizations in the other views. We present a GPU‐accelerated implementation, and show the effectiveness of the three‐view approach on a number of real‐world cases. In these, we demonstrate how this multi‐view approach can be used to visually explore important aspects of a statistical shape model, including specificity, compactness and reconstruction error.
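The standard machinery behind such a shape space is PCA over corresponded vertex coordinates: 2D projections of the PCA coefficients drive the scatter plot, and linear interpolation between two points yields the in-between shapes shown while moving the cursor. A minimal numpy sketch, assuming shapes in dense correspondence (same vertex count and ordering):

```python
# Minimal shape-space sketch: PCA coefficients give the points plotted in
# the scatter view; interpolating between two coefficient vectors yields
# the dynamically interpolated shapes. Dense correspondence is assumed.
import numpy as np

def build_shape_space(shapes, n_modes=10):
    """shapes: (n_samples, n_verts, 3). Returns mean, modes, per-shape coefficients."""
    X = shapes.reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)  # PCA via SVD
    modes = Vt[:n_modes]              # principal modes of shape variation
    coeffs = (X - mean) @ modes.T     # each shape as a point in shape space
    return mean, modes, coeffs

def interpolate_shape(mean, modes, c0, c1, t):
    """Shape at parameter t along the straight line from c0 to c1 in shape space."""
    c = (1 - t) * c0 + t * c1
    return (mean + c @ modes).reshape(-1, 3)
```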

20.
Images and videos captured by portable devices (e.g., cellphones, DV cameras) often have limited fields of view. Image stitching, also referred to as mosaicing or panorama creation, can produce a wide‐angle image by compositing several photographs together. Although various methods have been developed for image stitching in recent years, few works address the video stitching problem. In this paper, we present the first system to stitch videos captured by hand‐held cameras. We first recover the 3D camera paths and a sparse set of 3D scene points using the CoSLAM system, and densely reconstruct the 3D scene in the overlapping regions. Then, we generate a smooth virtual camera path which stays in the middle of the original paths. Finally, the stitched video is synthesized along the virtual path as if it had been taken from this new trajectory. The warping required for the stitching is obtained by optimizing over both temporal stability and alignment quality, while leveraging the 3D information at our disposal. Experiments show that our method produces high‐quality stitching results for various challenging scenarios.
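A minimal sketch of the virtual-path step under simplifying assumptions: take the midpoint of the two recovered camera trajectories and low-pass filter it. The paper's path generation is richer, and camera orientations (handled separately in practice) are omitted here.

```python
# Sketch of the virtual camera path: midpoint of the two recovered
# trajectories, Gaussian-smoothed. Midpoint-plus-smoothing is a
# simplifying assumption; orientations are omitted.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def virtual_path(path_a, path_b, sigma=10.0):
    """path_a, path_b: (T, 3) camera positions. Returns a smooth middle path."""
    mid = 0.5 * (path_a + path_b)
    return gaussian_filter1d(mid, sigma=sigma, axis=0)
```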
