20 similar documents found.
1.
K. Aberman M. Shi J. Liao D. Lischinski B. Chen D. Cohen‐Or 《Computer Graphics Forum》2019,38(2):219-233
We present a new video‐based performance cloning technique. After training a deep generative network using a reference video capturing the appearance and dynamics of a target actor, we are able to generate videos where this actor reenacts other performances. All of the training data and the driving performances are provided as ordinary video segments, without motion capture or depth information. Our generative model is realized as a deep neural network with two branches, both of which train the same space‐time conditional generator, using shared weights. One branch, responsible for learning to generate the appearance of the target actor in various poses, uses paired training data, self‐generated from the reference video. The second branch uses unpaired data to improve generation of temporally coherent video renditions of unseen pose sequences. Through data augmentation, our network is able to synthesize images of the target actor in poses never captured by the reference video. We demonstrate a variety of promising results, where our method is able to generate temporally coherent videos, for challenging scenarios where the reference and driving videos consist of very different dance performances.
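As a rough illustration of the two‐branch training idea described above, here is a minimal PyTorch‐style sketch (not the authors' code): a single pose‐conditioned generator receives gradients from a paired reconstruction branch and from an unpaired branch. All module names, sizes, and the simple flicker penalty standing in for the learned temporal‐coherence objective are illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): two branches sharing one generator.
import torch
import torch.nn as nn

class PoseConditionedGenerator(nn.Module):
    """Maps a pose map (e.g. a rendered skeleton image) to an RGB frame."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
    def forward(self, pose_map):
        return self.net(pose_map)

G = PoseConditionedGenerator()                 # one set of shared weights
opt = torch.optim.Adam(G.parameters(), 2e-4)

def paired_step(pose_map, target_frame):
    # Branch 1: paired data self-generated from the reference video.
    return nn.functional.l1_loss(G(pose_map), target_frame)

def unpaired_step(pose_seq):
    # Branch 2: unpaired pose sequences; penalize frame-to-frame flicker
    # (a crude stand-in for a learned temporal-coherence objective).
    frames = torch.stack([G(p) for p in pose_seq])
    return (frames[1:] - frames[:-1]).abs().mean()

# One combined update: both branches backpropagate into the shared weights.
pose, frame = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
seq = [torch.randn(1, 3, 64, 64) for _ in range(4)]
loss = paired_step(pose, frame) + 0.1 * unpaired_step(seq)
opt.zero_grad(); loss.backward(); opt.step()
```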
2.
We present a deep learning based technique that enables novel‐view videos of human performances to be synthesized from sparse multi‐view captures. While performance capture from a sparse set of videos has received significant attention, relatively little progress has been made for non‐rigid subjects such as human bodies. The rich articulation of the human body makes it challenging to synthesize and interpolate novel views well. To address this problem, we propose a novel deep learning based framework that directly predicts novel‐view videos of human performances without explicit 3D reconstruction. Our method consists of two steps: novel‐view prediction and detail enhancement. We first learn a novel deep generative query network for view prediction, synthesizing novel‐view performances from a sparse set of just five or fewer camera videos. Then, we use a new generative adversarial network to enhance fine‐scale details of the first step's results. This opens up the possibility of high‐quality, low‐cost video‐based performance synthesis, which is gaining popularity for VR and AR applications. We demonstrate a variety of promising results, where our method synthesizes more robust and accurate performances than existing state‐of‐the‐art approaches when only sparse views are available.
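The view‐prediction step is described as a generative query network; the toy sketch below (an assumption‐laden illustration, not the paper's architecture) shows the characteristic GQN pattern of order‐invariant fusion of per‐view scene codes followed by decoding against a query camera, with a residual network standing in for the adversarially trained detail enhancer.

```python
# Illustrative two-step pipeline; all sizes and layers are assumptions.
import torch
import torch.nn as nn

class ViewPredictor(nn.Module):
    """GQN-flavored: sum per-view scene codes, decode with the query camera."""
    def __init__(self, cam_dim=7, code_dim=128):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(32 * 32 * 3 + cam_dim, 256),
                                    nn.ReLU(), nn.Linear(256, code_dim))
        self.decode = nn.Sequential(nn.Linear(code_dim + cam_dim, 256),
                                    nn.ReLU(), nn.Linear(256, 32 * 32 * 3))
    def forward(self, images, cams, query_cam):
        codes = [self.encode(torch.cat([im.flatten(), c]))
                 for im, c in zip(images, cams)]      # one code per input view
        scene = torch.stack(codes).sum(0)             # order-invariant fusion
        return self.decode(torch.cat([scene, query_cam])).view(3, 32, 32)

class DetailEnhancer(nn.Module):
    """Stand-in for the adversarially trained refinement network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, 3, padding=1)
    def forward(self, coarse):
        return coarse + self.net(coarse)              # residual refinement

views = [torch.rand(3, 32, 32) for _ in range(5)]    # "five or fewer" cameras
cams = [torch.rand(7) for _ in range(5)]              # position + orientation
coarse = ViewPredictor()(views, cams, torch.rand(7))
final = DetailEnhancer()(coarse.unsqueeze(0))
```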
3.
Crispin Deul Tassilo Kugelstadt Marcel Weiler Jan Bender 《Computer Graphics Forum》2018,37(6):313-324
In this paper, we present a novel direct solver for the efficient simulation of stiff, inextensible elastic rods within the position‐based dynamics (PBD) framework. It is based on the XPBD algorithm, which extends PBD to simulate elastic objects with physically meaningful material parameters. XPBD approximates an implicit Euler integration and solves the system of non‐linear equations using a non‐linear Gauss–Seidel solver. However, this solver requires many iterations to converge for complex models, and if convergence is not reached, the material becomes too soft. In contrast, we use Newton iterations in combination with our direct solver to solve the non‐linear equations, which significantly improves convergence by solving all constraints of an acyclic structure (tree) simultaneously. Our solver only requires a few Newton iterations to achieve high stiffness and inextensibility. We model inextensible rods and trees using rigid segments connected by constraints. Bending and twisting constraints are derived from the well‐established Cosserat model. The high performance of our solver is demonstrated in highly realistic simulations of rods consisting of tens of thousands of segments. In summary, our method allows the efficient simulation of stiff rods in the PBD framework with a speedup of two orders of magnitude compared to the original XPBD approach.
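For context, the non‐linear Gauss–Seidel baseline that the paper improves on can be sketched in a few lines. The snippet below simulates a pinned chain of particles with XPBD distance constraints, using the standard multiplier update Δλ = (−C − α̃λ)/(w + α̃) with α̃ = α/Δt²; the paper's contribution replaces these per‐constraint sweeps with Newton iterations and a direct solve over the whole tree. Parameters are illustrative.

```python
# XPBD non-linear Gauss-Seidel baseline on a chain of distance constraints.
import numpy as np

n, rest, dt = 20, 0.1, 1.0 / 60.0
alpha_tilde = 1e-8 / dt**2            # compliance / dt^2 (near-inextensible)
x = np.stack([np.arange(n) * rest, np.zeros(n)], axis=1)  # horizontal rod
v = np.zeros_like(x)
inv_m = np.ones(n); inv_m[0] = 0.0    # first particle pinned
gravity = np.array([0.0, -9.81])

for step in range(100):
    x_prev = x.copy()
    v += dt * gravity * (inv_m > 0)[:, None]
    x += dt * v
    lam = np.zeros(n - 1)             # one multiplier per distance constraint
    for it in range(20):              # Gauss-Seidel sweeps (often not enough)
        for j in range(n - 1):
            d = x[j + 1] - x[j]
            dist = np.linalg.norm(d)
            C = dist - rest           # constraint violation
            grad = d / dist           # gradient w.r.t. x[j+1]
            w = inv_m[j] + inv_m[j + 1]
            dlam = (-C - alpha_tilde * lam[j]) / (w + alpha_tilde)
            lam[j] += dlam
            x[j] -= inv_m[j] * dlam * grad
            x[j + 1] += inv_m[j + 1] * dlam * grad
    v = (x - x_prev) / dt             # PBD velocity update
```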
4.
Simon Rodriguez Adrien Bousseau Fredo Durand George Drettakis 《Computer Graphics Forum》2018,37(4):119-131
Street‐level imagery is now abundant but does not have sufficient capture density to be usable for Image‐Based Rendering (IBR) of facades. We present a method that exploits repetitive elements in facades ‐ such as windows ‐ to perform data augmentation, in turn improving camera calibration, reconstructed geometry and overall rendering quality for IBR. The main intuition behind our approach is that a few views of several instances of an element provide similar information to many views of a single instance of that element. We first select similar instances of an element from 3–4 views of a facade and transform them into a common coordinate system, creating a “platonic” element. We use this common space to refine the camera calibration of each view of each instance and to reconstruct a 3D mesh of the element with multi‐view stereo, which we regularize to obtain a piecewise‐planar mesh aligned with dominant image contours. Observing the same element under multiple views also allows us to identify reflective areas ‐ such as glass panels ‐ which we use at rendering time to generate plausible reflections using an environment map. Our detailed 3D mesh, augmented set of views, and reflection mask enable image‐based rendering of much higher quality than results obtained using the input images directly.
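One building block of such a pipeline, warping each detected instance into a shared rectified frame, can be sketched with a plain homography. The corner coordinates below are placeholders, and the paper's actual method additionally refines calibration and reconstructs a per‐element 3D mesh.

```python
# Hedged sketch: rectify each window instance into a common frame.
import cv2
import numpy as np

def rectify_instance(image, corners, size=(128, 128)):
    """Map a quad (4 corners, clockwise from top-left) to a common rectangle."""
    dst = np.float32([[0, 0], [size[0], 0], [size[0], size[1]], [0, size[1]]])
    H = cv2.getPerspectiveTransform(np.float32(corners), dst)
    return cv2.warpPerspective(image, H, size)

img = np.zeros((480, 640, 3), np.uint8)          # stand-in facade photo
window_corners = [(100, 50), (220, 60), (215, 200), (95, 190)]
patch = rectify_instance(img, window_corners)    # one observation of element
# Combining many such patches approximates the "platonic" element.
```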
5.
The Bidirectional Texture Function (BTF) is a data‐driven solution to render materials with complex appearance. A typical capture contains tens of thousands of images of a material sample under varying viewing and lighting conditions. While capable of faithfully recording complex light interactions in the material, the main drawback is the massive memory requirement, both for storing and rendering, making effective compression of BTF data a critical component in practical applications. Common compression schemes used in practice are based on matrix factorization techniques, which preserve the discrete format of the original dataset. While this approach generalizes well to different materials, rendering with the compressed dataset still relies on interpolating between the closest samples. Depending on the material and the angular resolution of the BTF, this can lead to blurring and ghosting artifacts. An alternative approach uses analytic model fitting to approximate the BTF data, using continuous functions that naturally interpolate well, but whose expressive range is often not wide enough to faithfully recreate materials with complex non‐local lighting effects (subsurface scattering, inter‐reflections, shadowing and masking…). In light of these observations, we propose a neural network‐based BTF representation inspired by autoencoders: our encoder compresses each texel to a small set of latent coefficients, while our decoder additionally takes in a light and view direction and outputs a single RGB vector at a time. This allows us to continuously query reflectance values in the light and view hemispheres, eliminating the need for linear interpolation between discrete samples. We train our architecture on fabric BTFs with a challenging appearance and compare to standard PCA as a baseline. We achieve competitive compression ratios and high‐quality interpolation/extrapolation without blurring or ghosting artifacts.
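The decoder described above admits a very small sketch: a per‐texel latent code concatenated with light and view directions, mapped through an MLP to a single RGB value. Latent size and layer widths below are assumptions, not the paper's.

```python
# Minimal sketch of the autoencoder-style BTF decoder; sizes are assumptions.
import torch
import torch.nn as nn

class BTFDecoder(nn.Module):
    def __init__(self, latent_dim=8, hidden=64):
        super().__init__()
        # input: per-texel latent code + light dir (3) + view dir (3)
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim + 6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3))                  # one RGB value per query
    def forward(self, latent, wi, wo):
        return self.mlp(torch.cat([latent, wi, wo], dim=-1))

decoder = BTFDecoder()
texel_code = torch.randn(8)                        # stored per texel (compressed)
wi = torch.tensor([0.0, 0.0, 1.0])                 # light direction
wo = nn.functional.normalize(torch.tensor([0.3, 0.0, 0.95]), dim=0)  # view dir
rgb = decoder(texel_code, wi, wo)                  # continuous angular query
```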
6.
Presenting high‐fidelity 3D content on compact portable devices with low computational power is challenging. Smartphones, tablets and head‐mounted displays (HMDs) suffer from thermal and battery‐life constraints and thus cannot match the render quality of desktop PCs and laptops. Streaming rendering makes it possible to show high‐quality content but can suffer from potentially high latency. We propose an approach that efficiently captures shading samples in object space and packs them into a texture. Streaming this texture to the client, we support temporal frame up‐sampling with high fidelity, low latency and high mobility. We introduce two novel sample distribution strategies and a novel triangle representation in the shading atlas space. Since such a system requires dynamic parallelism, we propose an implementation exploiting the power of hardware‐accelerated tessellation stages. Our approach allows fast decoding and rendering of extrapolated views on a client device by using hardware‐accelerated interpolation between shading samples and a set of potentially visible geometry. A comparison to existing shading methods shows that our sample distributions allow better client shading quality than previous atlas streaming approaches and outperform image‐based methods in all relevant aspects.
7.
D. Sýkora O. Jamriška O. Texler J. Fišer M. Lukáč J. Lu E. Shechtman 《Computer Graphics Forum》2019,38(2):83-91
8.
9.
In this paper, we propose a novel motion controller for the online generation of natural character locomotion that adapts to new situations such as changing user control or applying external forces. This controller continuously estimates the next footstep while walking and running, and automatically switches the stepping strategy based on situational changes. To develop the controller, we devise a new physical model called an inverted‐pendulum‐based abstract model (IPAM). The proposed abstract model represents high‐dimensional character motions, inheriting the naturalness of captured motions by estimating the appropriate footstep location, speed and switching time at every frame. The estimation is achieved by a deep learning based regressor that extracts important features in captured motions. To validate the proposed controller, we train the model using captured motions of a human stopping, walking, and running in a limited space. Then, the motion controller generates human‐like locomotion with continuously varying speeds, transitions between walking and running, and collision response strategies in a cluttered space in real time.
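The inverted‐pendulum reasoning behind footstep estimation has a classic closed form, the capture point of the linear inverted pendulum (x_cp = x + τv with τ = √(z/g)); the paper's IPAM and learned regressor are far richer, so the following is intuition only.

```python
# Worked example: capture point of a linear inverted pendulum.
import math

g, z0 = 9.81, 1.0                 # gravity, pendulum (pelvis) height
tau = math.sqrt(z0 / g)           # LIP time constant

def capture_point(com_pos, com_vel):
    """Where to step so a linear inverted pendulum comes to rest."""
    return com_pos + tau * com_vel

print(capture_point(0.0, 1.2))    # walking at 1.2 m/s -> step ~0.38 m ahead
```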
10.
Applying motion‐capture data to multi‐person interaction between virtual characters is challenging because one needs to preserve the interaction semantics while also satisfying the general requirements of motion retargeting, such as preventing penetration and preserving naturalness. An efficient means of representing interaction semantics is by defining the spatial relationships between the body parts of characters. However, existing methods consider only the character skeleton and thus are not suitable for capturing skin‐level spatial relationships. This paper proposes a novel method for retargeting interaction motions with respect to character skins. Specifically, we introduce the aura mesh, which is a volumetric mesh that surrounds a character's skin. The spatial relationships between two characters are computed from the overlap of the skin mesh of one character and the aura mesh of the other, and then the interaction motion retargeting is achieved by preserving the spatial relationships as much as possible while satisfying other constraints. We show the effectiveness of our method through a number of experiments.
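As a loose stand‐in for the skin/aura overlap computation, the sketch below finds aura sample points of one character near the skin vertices of the other and records weighted offset descriptors; a real aura mesh is volumetric, so the radius queries here only illustrate the idea.

```python
# Hypothetical overlap descriptors between a skin mesh and an aura mesh.
import numpy as np
from scipy.spatial import cKDTree

skin_a = np.random.rand(500, 3)           # character A skin vertices
aura_b = np.random.rand(800, 3) * 1.2     # character B aura sample points

tree = cKDTree(aura_b)
neighbors = tree.query_ball_point(skin_a, r=0.1)   # overlap candidates

descriptors = []
for i, js in enumerate(neighbors):
    for j in js:
        offset = skin_a[i] - aura_b[j]
        weight = max(0.0, 1.0 - np.linalg.norm(offset) / 0.1)  # falloff
        descriptors.append((i, j, offset, weight))
# Retargeting would then try to preserve these weighted offsets while
# satisfying the usual constraints (no penetration, naturalness).
```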
11.
Sensorimotor control is an essential mechanism for human motions, from involuntary reflex actions to intentional motor skill learning, such as walking, jumping, and swimming. Humans perform various motions according to different task goals and physiological sensory perception; however, most existing computational approaches for motion simulation and generation rarely consider the effects of human perception. The assumption of perfect perception (i.e., no sensory errors) in existing approaches restricts the generated motion types and makes dynamical reactions less realistic. We propose a general framework for sensorimotor control, integrating a balance controller and a vestibular model, to generate perception‐aware motions. By exploiting simulated perception, more natural responses that are closer to human reactions can be generated. For example, motion sickness caused by impaired vestibular function induces postural instability and body sway. Our approach generates physically correct motions and reasonable reactions to external stimuli, since the spatial orientation estimate produced by the vestibular system is essential for preserving balance. We evaluate our framework by demonstrating standing balance on a rotational platform with different angular speeds and durations. The generated motions show that either faster angular speeds or longer rotational durations cause more severe motion sickness. Our results demonstrate that sensorimotor control, integrating human perception and physically‐based control, offers considerable potential for providing more human‐like behaviors, especially for perceptual illusions of human beings, including visual, proprioceptive, and tactile sensations.
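The core effect, estimation errors turning into sway, can be reproduced with a toy model: a PD balance controller that acts on a perceived tilt corrupted by a drifting vestibular bias. Everything below (gains, bias shape, noise level) is an assumption for illustration, not the paper's model.

```python
# Toy inverted pendulum balanced on a biased orientation estimate.
import numpy as np

dt, g, L = 0.002, 9.81, 1.0
theta, omega = 0.02, 0.0           # true tilt (rad) and angular velocity
kp, kd = 400.0, 40.0               # ankle-strategy PD gains
rng = np.random.default_rng(0)

for i in range(5000):
    bias = 0.01 * np.sin(0.5 * i * dt)        # slowly drifting sensory error
    perceived = theta + bias + rng.normal(0.0, 0.002)
    torque = -kp * perceived - kd * omega     # controller sees the estimate
    alpha = (g / L) * np.sin(theta) + torque  # unit-mass pendulum dynamics
    omega += alpha * dt
    theta += omega * dt
# theta now sways around -bias instead of settling at zero.
```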
12.
Color scribbling is a unique form of illustration where artists use compact, overlapping, and monochromatic scribbles at microscopic scale to create astonishing colorful images at macroscopic scale. The creation process is skill‐demanding and time‐consuming, typically involving drawing monochromatic scribbles layer by layer to delicately depict true‐color subjects using a limited color palette. In this work, we present a novel computational framework for the automatic generation of color scribble images from arbitrary raster images. The core contribution of our work lies in a novel color dithering model tailor‐made for synthesizing a smooth color appearance using multiple layers of overlapped monochromatic strokes. Specifically, our system reconstructs the appearance of the input image by (i) generating layers of monochromatic scribbles based on a limited color palette derived from the input image, and (ii) optimizing the drawing sequence among layers to minimize the visual color dissimilarity between the dithered image and the original image as well as the color banding artifacts. We demonstrate the effectiveness and robustness of our algorithm with various convincing results synthesized from a variety of input images with different stroke patterns. The experimental study further shows that our approach faithfully captures the scribble style and the color presentation at the microscopic and macroscopic scales, respectively, which is otherwise difficult for state‐of‐the‐art methods.
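For reference, classical error‐diffusion dithering to a small palette looks like the sketch below; the paper's dithering model is scribble‐specific and layered, so this is only the standard baseline it generalizes.

```python
# Floyd-Steinberg dithering to a limited palette (classical baseline).
import numpy as np

def dither(img, palette):
    """img: HxWx3 float in [0,1]; palette: Kx3. Returns palette indices."""
    img = img.astype(np.float64).copy()
    h, w, _ = img.shape
    out = np.zeros((h, w), np.int64)
    for y in range(h):
        for x in range(w):
            k = np.argmin(((palette - img[y, x]) ** 2).sum(1))
            out[y, x] = k
            err = img[y, x] - palette[k]           # diffuse quantization error
            if x + 1 < w:               img[y, x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x] += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return out

palette = np.array([[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 1, 0]], float)
indices = dither(np.random.rand(16, 16, 3), palette)
```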
13.
We present a new motion‐compensated hierarchical compression scheme (HMLFC) for encoding light field images (LFI) that is suitable for interactive rendering. Our method combines two different approaches, motion compensation schemes and hierarchical compression methods, to exploit redundancies in LFI. The motion compensation schemes capture the redundancies in local regions of the LFI efficiently (local coherence) and the hierarchical schemes capture the redundancies present across the entire LFI (global coherence). Our hybrid approach combines the two schemes effectively, capturing both local and global coherence to improve the overall compression rate. We compute a tree from the LFI using a hierarchical scheme and use phase‐shifted motion compensation techniques at each level of the hierarchy. Our representation provides random access to the pixel values of the light field, which makes it suitable for interactive rendering applications using a small run‐time memory footprint. Our approach is GPU friendly and allows parallel decoding of LF pixel values. We highlight the performance on two‐plane parameterized light fields and obtain a compression ratio of 30–800× with a PSNR of 40–45 dB. Overall, we observe a ~2–5× improvement in compression rates using HMLFC over prior light field compression schemes that provide random access capability. In practice, our algorithm can render new views of resolution 512 × 512 on an NVIDIA GTX‐980 at ~200 fps.
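The quality figures quoted above are plain PSNR values; for reference, the measure is computed as follows.

```python
# PSNR between a reference and a reconstruction (peak = 1.0 for float images).
import numpy as np

def psnr(reference, reconstructed, peak=1.0):
    mse = np.mean((reference - reconstructed) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.random.rand(512, 512, 3)
b = a + np.random.normal(0, 0.005, a.shape)   # small reconstruction error
print(psnr(a, b))                              # ~46 dB for sigma = 0.005
```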
14.
Recent neural style transfer frameworks have obtained astonishing visual quality and flexibility in Single‐style Transfer (SST), but little attention has been paid to Multi‐style Transfer (MST), which refers to simultaneously transferring multiple styles to the same image. Compared to SST, MST has the potential to create more diverse and visually pleasing stylization results. In this paper, we propose the first MST framework to automatically incorporate multiple styles into one result based on regional semantics. We first improve the existing SST backbone network by introducing a novel multi‐level feature fusion module and a patch attention module to achieve better semantic correspondences and preserve richer style details. For MST, we design a conceptually simple yet effective region‐based style fusion module to insert into the backbone. It assigns corresponding styles to content regions based on semantic matching, and then seamlessly combines multiple styles together. Comprehensive evaluations demonstrate that our framework outperforms existing SST and MST methods.
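Region‐based fusion can be illustrated with AdaIN‐style statistics matching applied per region mask; the paper's module operates on learned semantic correspondences, so the masks and features below are placeholders.

```python
# Conceptual sketch of region-based style fusion via AdaIN statistics.
import torch

def adain(content, style, eps=1e-5):
    """Match channel-wise mean/std of content features to the style's."""
    c_mu = content.mean((2, 3), keepdim=True)
    c_std = content.std((2, 3), keepdim=True)
    s_mu = style.mean((2, 3), keepdim=True)
    s_std = style.std((2, 3), keepdim=True)
    return s_std * (content - c_mu) / (c_std + eps) + s_mu

content_feat = torch.randn(1, 256, 32, 32)
style_feats = [torch.randn(1, 256, 32, 32) for _ in range(2)]  # two styles
masks = [torch.zeros(1, 1, 32, 32), torch.zeros(1, 1, 32, 32)]
masks[0][..., :16] = 1.0           # style 0 -> left region
masks[1][..., 16:] = 1.0           # style 1 -> right region

fused = sum(m * adain(content_feat, s) for m, s in zip(masks, style_feats))
```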
15.
C. D. Tharindu Mathew Paulo R. Knob Soraia Raupp Musse Daniel G. Aliaga 《Computer Graphics Forum》2019,38(1):455-469
We present a system to generate a procedural environment that produces a desired crowd behaviour. Instead of altering the behavioural parameters of the crowd itself, we automatically alter the environment to yield such desired crowd behaviour. This novel inverse approach is useful both to crowd simulation in virtual environments and to urban crowd planning applications. Our approach tightly integrates and extends a space discretization crowd simulator with inverse procedural modelling. We extend crowd simulation by goal exploration (i.e. agents are initially unaware of the goal locations), variable‐appealing sign usage and several acceleration schemes. We use Markov chain Monte Carlo to quickly explore the solution space and yield interactive design. We have applied our method to a variety of virtual and real‐world locations, yielding crowd simulation performance one order of magnitude faster than related methods and a several‐fold improvement in crowd indicators.
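The explore‐and‐accept loop is ordinary Metropolis–Hastings over environment parameters; the sketch below uses a hypothetical `simulate_crowd` cost as a stand‐in for the actual simulator and indicators.

```python
# Generic Metropolis-Hastings skeleton of the inverse design loop.
import math
import random

def simulate_crowd(env):
    """Placeholder: run the crowd simulator and return a cost measuring the
    deviation from the desired behaviour (travel time, congestion, ...)."""
    return sum((v - 1.0) ** 2 for v in env)

env = [random.uniform(0, 2) for _ in range(8)]   # e.g. sign placements/appeal
cost, T = simulate_crowd(env), 0.5
for it in range(2000):
    cand = list(env)
    i = random.randrange(len(cand))
    cand[i] += random.gauss(0.0, 0.1)            # local proposal
    c = simulate_crowd(cand)
    if c < cost or random.random() < math.exp((cost - c) / T):
        env, cost = cand, c                      # accept
print(cost)
```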
16.
We present a novel approach to optimally retarget videos for varied displays with differing aspect ratios by preserving salient scene content discovered via eye tracking. Our algorithm performs editing with cut, pan and zoom operations by optimizing the path of a cropping window within the original video while seeking to (i) preserve salient regions, and (ii) adhere to the principles of cinematography. Our approach is (a) content agnostic, as the same methodology is employed to re‐edit a wide‐angle video recording or a close‐up movie sequence captured with a static or moving camera, and (b) independent of video length, and can in principle re‐edit an entire movie in one shot. Our algorithm consists of two steps. The first step employs gaze transition cues to detect time stamps where new cuts are to be introduced in the original video via dynamic programming. A subsequent step optimizes the cropping window path (to create pan and zoom effects), while accounting for the original and new cuts. The cropping window path is designed to include maximum gaze information, and is composed of piecewise constant, linear and parabolic segments. It is obtained via L1‐regularized convex optimization, which ensures a smooth viewing experience. We test our approach on a wide variety of videos and demonstrate significant improvement over the state‐of‐the‐art, both in terms of computational complexity and qualitative aspects. A study performed with 16 users confirms that our approach results in a superior viewing experience as compared to gaze driven re‐editing [JSSH15] and letterboxing methods, especially for wide‐angle static camera recordings.
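The segment structure follows from the regularizer: an L1 penalty on the first, second and third differences of the path makes those derivatives sparse, yielding piecewise constant (static shot), linear (pan) and parabolic (ease‐in/out) pieces. A sketch with synthetic gaze data, using cvxpy (weights are illustrative):

```python
# L1-regularized cropping-window path over synthetic gaze positions.
import cvxpy as cp
import numpy as np

T = 300
gaze_x = np.cumsum(np.random.randn(T)) + 320      # noisy gaze x-positions
x = cp.Variable(T)                                # cropping-window center path
data_term = cp.sum_squares(x - gaze_x)            # keep gaze inside the window
smooth = (200 * cp.norm1(cp.diff(x, 1)) +         # sparse 1st diff: static
          500 * cp.norm1(cp.diff(x, 2)) +         # sparse 2nd diff: linear pan
          500 * cp.norm1(cp.diff(x, 3)))          # sparse 3rd diff: parabolic
cp.Problem(cp.Minimize(data_term + smooth)).solve()
path = x.value
```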
17.
Mazen Al Borno Ludovic Righetti Michael J. Black Scott L. Delp Eugene Fiume Javier Romero 《Computer Graphics Forum》2018,37(8):81-92
Motion capture is often retargeted to new, and sometimes drastically different, characters. When the characters take on realistic human shapes, however, we become more sensitive to the motion looking right. This means adapting it to be consistent with the physical constraints imposed by different body shapes. We show how to take realistic 3D human shapes, approximate them using a simplified representation, and animate them so that they move realistically using physically‐based retargeting. We develop a novel spacetime optimization approach that learns and robustly adapts physical controllers to new bodies and constraints. The approach automatically adapts the motion of the mocap subject to the body shape of a target subject. This motion respects the physical properties of the new body and every body shape results in a different and appropriate movement. This makes it easy to create a varied set of motions from a single mocap sequence by simply varying the characters. In an interactive environment, successful retargeting requires adapting the motion to unexpected external forces. We achieve robustness to such forces using a novel LQR‐tree formulation. We show that the simulated motions look appropriate to each character's anatomy and their actions are robust to perturbations.
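For background, the LQR machinery that an LQR‐tree stitches together region by region can be shown on a discrete‐time double integrator; the paper's tree formulation covers a whole region of state space with many such locally valid controllers.

```python
# Single LQR controller for a discrete-time double integrator.
import numpy as np
from scipy.linalg import solve_discrete_are

dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])      # position/velocity dynamics
B = np.array([[0.0], [dt]])
Q, R = np.diag([10.0, 1.0]), np.array([[0.1]])

P = solve_discrete_are(A, B, Q, R)         # Riccati solution
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # optimal feedback gain

x = np.array([1.0, 0.0])                   # perturbed initial state
for _ in range(500):
    u = -K @ x                             # drive the state back to zero
    x = A @ x + B @ u
print(x)                                   # -> near the origin
```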
18.
Riccardo Roveri A. Cengiz Öztireli Ioana Pandele Markus Gross 《Computer Graphics Forum》2018,37(2):87-99
With the widespread use of 3D acquisition devices, there is an increasing need of consolidating captured noisy and sparse point cloud data for accurate representation of the underlying structures. There are numerous algorithms that rely on a variety of assumptions such as local smoothness to tackle this ill‐posed problem. However, such priors lead to loss of important features and geometric detail. Instead, we propose a novel data‐driven approach for point cloud consolidation via a convolutional neural network based technique. Our method takes a sparse and noisy point cloud as input, and produces a dense point cloud accurately representing the underlying surface by resolving ambiguities in geometry. The resulting point set can then be used to reconstruct accurate manifold surfaces and estimate surface properties. To achieve this, we propose a generative neural network architecture that can input and output point clouds, unlocking a powerful set of tools from the deep learning literature. We use this architecture to apply convolutional neural networks to local patches of geometry for high quality and efficient point cloud consolidation. This results in significantly more accurate surfaces, as we illustrate with a diversity of examples and comparisons to the state‐of‐the‐art.
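The patch‐based learning idea can be caricatured in a few lines: a shared per‐point MLP with order‐invariant pooling encodes a sparse noisy patch, and a decoder emits a denser one. Layer sizes and the 4× upsampling factor below are arbitrary choices, not the paper's architecture.

```python
# Toy patch-consolidation network (assumed sizes; not the paper's model).
import torch
import torch.nn as nn

class PatchConsolidator(nn.Module):
    def __init__(self, n_in=64, up=4, feat=128):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, feat), nn.ReLU(),
                                       nn.Linear(feat, feat))
        self.decoder = nn.Sequential(nn.Linear(feat, feat), nn.ReLU(),
                                     nn.Linear(feat, n_in * up * 3))
        self.n_out = n_in * up
    def forward(self, patch):                 # patch: (B, n_in, 3)
        f = self.point_mlp(patch).max(dim=1).values   # order-invariant pooling
        return self.decoder(f).view(-1, self.n_out, 3)

sparse_patch = torch.randn(2, 64, 3) * 0.05      # noisy local neighborhoods
dense_patch = PatchConsolidator()(sparse_patch)  # (2, 256, 3)
```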
19.
Chen‐Hui Hu Chien‐Ying Lee Yen‐Ting Liou Feng‐Yu Sung Wen‐Chieh Lin 《Computer Graphics Forum》2019,38(6):66-78
Skiing is a popular recreational sport, and competitive skiing has long been an event at the Winter Olympic Games. Because skiing covers a wide range of outdoor terrain, motion capture is difficult and usually not a practical way to generate skiing animations. Physical simulation offers a more viable alternative. However, skiing simulation is challenging as skiing involves many complicated motor skills and physics, such as balance keeping, movement coordination, articulated body dynamics and ski‐snow interaction. In particular, as no reference motions — usually from MOCAP data — are readily available for guiding the high‐level motor control, we additionally need to synthesize plausible reference motions. To solve this problem, sports techniques are exploited for reference motion planning. We propose a physics‐based framework that employs kinetic analyses of skiing techniques and the ski–snow contact model to generate realistic skiing motions. By simulating the inclination, angulation and weighting/unweighting techniques, stable and plausible carving turns and bump skiing animations can be generated. We evaluate our framework by demonstrating various skiing motions with different speeds, curvature radii and bump sizes. Our results show that employing the sports techniques used by athletes provides considerable potential to generate agile sport motions without reference motions.
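One piece of the kinetic analysis has a compact closed form: in a steady carved turn the skier's inclination balances gravity against the centripetal demand, tan φ = v²/(gr). A worked example:

```python
# Inclination angle for steady carved turns at various speeds and radii.
import math

g = 9.81
for v, r in [(10.0, 15.0), (15.0, 15.0), (15.0, 30.0)]:  # m/s, m
    phi = math.degrees(math.atan(v * v / (g * r)))
    print(f"v={v:4.1f} m/s  r={r:4.1f} m  inclination={phi:4.1f} deg")
# Faster speeds or tighter radii demand more inclination (34, 57, 37 deg).
```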
20.
This paper proposes a deep learning‐based image tone enhancement approach that can maximally enhance the tone of an image while preserving naturalness. Our approach does not require ground‐truth images carefully generated by human experts for training. Instead, we train a deep neural network to mimic the behavior of a previous classical filtering method that produces drastic but possibly unnatural‐looking tone enhancement results. To preserve naturalness, we adopt the generative adversarial network (GAN) framework as a regularizer. To suppress artifacts caused by the generative nature of the GAN framework, we also propose an imbalanced cycle‐consistency loss. Experimental results show that our approach can effectively enhance the tone and contrast of an image while preserving naturalness better than previous state‐of‐the‐art approaches.
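An imbalanced cycle‐consistency term can be sketched as two differently weighted cycle directions; the direction and magnitude of the imbalance below are guesses for illustration, not the paper's actual loss, and the two convolutions are placeholders for full generators.

```python
# Hedged sketch of an imbalanced cycle-consistency loss (weights assumed).
import torch
import torch.nn as nn

enhance = nn.Conv2d(3, 3, 3, padding=1)   # G: input photo -> enhanced
restore = nn.Conv2d(3, 3, 3, padding=1)   # F: enhanced -> input photo

def imbalanced_cycle_loss(x, y, w_forward=1.0, w_backward=10.0):
    # forward cycle x -> G -> F -> x: weighted lightly, allowing bold edits
    lf = nn.functional.l1_loss(restore(enhance(x)), x)
    # backward cycle y -> F -> G -> y: weighted heavily, pinning down F
    lb = nn.functional.l1_loss(enhance(restore(y)), y)
    return w_forward * lf + w_backward * lb

x = torch.rand(1, 3, 64, 64)              # input-domain photo
y = torch.rand(1, 3, 64, 64)              # enhanced-domain photo
loss = imbalanced_cycle_loss(x, y)
```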