Found 20 similar articles (search time: 0 ms)
A practical way to generate a high dynamic range (HDR) video using off‐the‐shelf cameras is to capture a sequence with alternating exposures and reconstruct the missing content at each frame. Unfortunately, existing approaches are typically slow and cannot handle challenging cases. In this paper, we propose a learning‐based approach to address this difficult problem. To do this, we use two sequential convolutional neural networks (CNNs) to model the entire HDR video reconstruction process. In the first step, we align the neighboring frames to the current frame by estimating the flows between them using a network that is specifically designed for this application. We then combine the aligned and current images using another CNN to produce the final HDR frame. We perform end‐to‐end training by minimizing the error between the reconstructed and ground truth HDR images on a set of training scenes. We produce our training data synthetically from existing HDR video datasets and simulate the imperfections of standard digital cameras using a simple approach. Experimental results demonstrate that our approach produces high‐quality HDR videos and is an order of magnitude faster than the state‐of‐the‐art techniques for sequences with two and three alternating exposures.
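
To make the alternating-exposure idea concrete, the following is a minimal sketch of a classical weighted exposure merge. It is not the learned CNN merge described in the abstract. The function names, the gamma-curve camera model, and the triangle weighting are illustrative assumptions, and the frames are assumed to be already aligned.

```python
def simulate_ldr(radiance, exposure_time, gamma=2.2):
    """Hypothetical camera model: scale by exposure, clip to [0, 1], gamma-encode."""
    return [min(1.0, r * exposure_time) ** (1.0 / gamma) for r in radiance]


def merge_exposures(ldr_frames, exposure_times, gamma=2.2):
    """Merge aligned LDR frames (lists of values in [0, 1]) into HDR radiance.

    Each pixel is linearized, divided by its exposure time, and averaged with
    a triangle weight that trusts mid-range values more than values close to
    0 or 1 (under- or over-exposed).
    """
    num_pixels = len(ldr_frames[0])
    hdr = []
    for p in range(num_pixels):
        weighted_sum = 0.0
        weight_total = 0.0
        for frame, t in zip(ldr_frames, exposure_times):
            z = frame[p]
            # Small floor keeps the weight total positive for clipped pixels.
            w = max(1e-3, 1.0 - abs(2.0 * z - 1.0))
            weighted_sum += w * (z ** gamma) / t
            weight_total += w
        hdr.append(weighted_sum / weight_total)
    return hdr


times = [1.0, 0.25]          # long and short exposure
scene = [0.5, 3.0]           # second pixel saturates in the long exposure
frames = [simulate_ldr(scene, t) for t in times]
hdr = merge_exposures(frames, times)
```

In this toy setup the saturated pixel is recovered almost entirely from the short exposure, which is the content that an alternating-exposure video has to reconstruct for every frame.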

Digital videos such as those captured by a smartphone often exhibit exposure inconsistencies or a poorly exposed sky, or simply suffer from an uninteresting or plain‐looking sky. Professionals may edit these videos with advanced and time‐consuming tools, unavailable to most users, to replace the sky with a more expressive or imaginative one. In this work, we propose an algorithm for automatic replacement of the sky region in a video with a different sky, providing nonprofessional users with a simple yet efficient tool to seamlessly replace the sky. The method is fast, achieving close to real‐time performance on mobile devices, and the user's involvement can be as limited as simply selecting the replacement sky.

Estimating the correspondence between images using optical flow is the key component of image fusion; however, computing optical flow between a pair of facial images that include backgrounds is challenging due to large differences in illumination, texture, color and background. To improve optical flow results for image fusion, we propose a novel flow estimation method, wavelet flow, which can handle both the face and the background in the input images. The key idea is that instead of computing flow directly between the input image pair, we estimate the image flow by incorporating multi‐scale image transfer and optical flow guided wavelet fusion. Multi‐scale image transfer helps to preserve the background and lighting detail of the input, while optical flow guided wavelet fusion produces a series of intermediate images for further optimization of fusion quality. Our approach can significantly improve the performance of the optical flow algorithm and provide more natural fusion results for both faces and backgrounds in the images. We evaluate our method on a variety of datasets to demonstrate its superior performance.

This paper proposes a scale‐adaptive filtering method to improve the performance of structure‐preserving texture filtering for image smoothing. With classical texture filters, it is usually challenging to smooth texture at multiple scales while preserving salient structures in an image. We address this issue within the framework of adaptive bilateral filtering, where the scales of the Gaussian range kernels are allowed to vary from pixel to pixel. Based on direction‐wise statistics, our method effectively distinguishes texture from structure, identifies an appropriate scope around each pixel to be smoothed, and thus infers an optimal smoothing scale for it. By filtering with varying‐scale kernels, the image is smoothed adaptively according to the distribution of texture. Our experimental results show that the proposed scheme needs fewer iterations and improves texture filtering performance in terms of preserving geometric structures at multiple scales, even after aggressive smoothing of the original image.
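
The idea of a bilateral filter whose range kernel scale varies per pixel can be sketched on a 1D signal as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the per-sample scales are passed in by hand here, whereas the paper infers them from direction‐wise statistics.

```python
import math


def adaptive_bilateral_1d(signal, sigma_spatial, sigma_range_per_sample, radius):
    """Bilateral filter on a 1D signal with a per-sample range sigma.

    A large range sigma at a sample smooths across bigger intensity
    differences; a small one leaves that sample almost untouched.
    """
    n = len(signal)
    out = []
    for i in range(n):
        sigma_r = sigma_range_per_sample[i]
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_spatial = math.exp(-((i - j) ** 2) / (2.0 * sigma_spatial ** 2))
            diff = signal[i] - signal[j]
            w_range = math.exp(-(diff ** 2) / (2.0 * sigma_r ** 2))
            weighted_sum += w_spatial * w_range * signal[j]
            weight_total += w_spatial * w_range
        # weight_total >= 1 because the center sample always has weight 1.
        out.append(weighted_sum / weight_total)
    return out


# Fine texture (amplitude 0.1) on both sides of a strong step edge.
textured_step = [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 1.0, 1.1, 1.0, 1.1, 1.0, 1.1]
smoothed = adaptive_bilateral_1d(textured_step, 2.0, [0.2] * 12, 3)
```

With a range sigma of 0.2 the 0.1-amplitude texture is averaged out, while the step of height 1.0 receives a negligible range weight and stays sharp.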

In this work, we present a method to vectorize raster images of line art. Inverting the rasterization procedure is inherently ill‐conditioned, as there exist many possible vector images that could yield the same raster image. However, not all of these vector images are equally useful to the user, especially if further edits are desired. We therefore define the problem as computing an instance segmentation of the most likely set of paths that could have created the raster image. Once the segmentation is computed, we use existing vectorization approaches to vectorize each path, and then combine all paths into the final output vector image. To determine which set of paths is most likely, we train a pair of neural networks to provide semantic clues that help resolve ambiguities at intersection and overlap regions. These predictions are made considering the full context of the image, and are then globally combined by solving a Markov Random Field (MRF). We demonstrate the flexibility of our method by generating results on character datasets, a synthetic random line dataset, and a dataset composed of human‐drawn sketches. In all cases, our system accurately recovers paths that adhere to the semantics of the drawings.

Removing specular highlights from an image is a fundamental research problem in computer vision and computer graphics. While various methods have been proposed, they typically do not work well for real‐world images due to the presence of rich textures, complex materials, hard shadows, occlusions, colored illumination, and so on. In this paper, we present a novel specular highlight removal method for real‐world images. Our approach is based on two observations about real‐world images: (i) the specular highlight is often small in size and sparse in distribution; (ii) the remaining diffuse image can be represented by a linear combination of a small number of basis colors with sparse encoding coefficients. Based on these two observations, we design an optimization framework for simultaneously estimating the diffuse and specular highlight images from a single image. Specifically, we recover the diffuse components of the regions with specular highlights by encouraging sparseness of the encoding coefficients using the L0 norm. Moreover, the encoding coefficients and the specular highlight are subject to non‐negativity constraints, according to additive color mixing theory and the definition of illumination, respectively. Extensive experiments have been performed on a variety of images to validate the effectiveness of the proposed method and its superiority over previous methods.

This paper proposes a deep learning‐based image tone enhancement approach that can maximally enhance the tone of an image while preserving the naturalness. Our approach does not require carefully generated ground‐truth images by human experts for training. Instead, we train a deep neural network to mimic the behavior of a previous classical filtering method that produces drastic but possibly unnatural‐looking tone enhancement results. To preserve the naturalness, we adopt the generative adversarial network (GAN) framework as a regularizer for the naturalness. To suppress artifacts caused by the generative nature of the GAN framework, we also propose an imbalanced cycle‐consistency loss. Experimental results show that our approach can effectively enhance the tone and contrast of an image while preserving the naturalness compared to previous state‐of‐the‐art approaches.

Image composition extracts the content of interest (COI) from a source image and blends it into a target image to generate a new image. In the majority of existing works, the COI is manually extracted and then overlaid on top of the target image. However, in practice, it is often necessary to deal with situations in which the COI is partially occluded by the target image content. In this regard, both extracting the COI and cropping its occluded part require intensive user interaction, which is laborious and seriously reduces composition efficiency. This paper addresses these challenges by proposing an efficient image composition method. First, we extract the semantic contents of the images by using state‐of‐the‐art deep learning methods. As a result, the COI can be selected with clicks only, which greatly reduces the required user interaction. Second, according to the user's operations (such as translation or scaling) on the COI, we can effectively infer the occlusion relationships between the COI and the contents of the target image. Thus, the COI can be adaptively embedded into the target image without the need to crop its occluded part. The procedures of content extraction and occlusion handling are therefore significantly simplified, and work efficiency is markedly improved. Experimental results show that compared to existing works, our method can reduce the number of user interactions to approximately one‐tenth and increase the speed of image composition by more than ten times.

Smoothing noise while preserving strong edges in images is an important problem in image processing. Image smoothing filters can be either explicit (based on local weighted averages) or implicit (based on global optimization). Implicit methods are usually time‐consuming and cannot be applied to joint image filtering tasks, i.e., leveraging the structural information of a guidance image to filter a target image. Previous deep learning based image smoothing filters are all implicit and unavailable for joint filtering. In this paper, we propose to learn explicit guidance feature maps as well as offset maps from the guidance image and smoothing parameter, which can be utilized to smooth the input itself or to filter images in other target domains. We design a deep convolutional neural network consisting of a fully convolutional block for guidance and offset map extraction together with a stacked spatially varying deformable convolution block for joint image filtering. Our models can approximate several representative image smoothing filters with high accuracy comparable to state‐of‐the‐art methods, and serve as general tools for other joint image filtering tasks, such as color interpolation, depth map upsampling, saliency map upsampling, flash/non‐flash image denoising and RGB/NIR image denoising.

Copying an element from a photo and pasting it into a painting is a challenging task. Applying photo compositing techniques in this context yields subpar results that look like a collage, and existing painterly stylization algorithms, which are global, perform poorly when applied locally. We address these issues with a dedicated algorithm that carefully determines the local statistics to be transferred. We ensure both spatial and inter‐scale statistical consistency and demonstrate that both aspects are key to generating quality results. To cope with the diversity of abstraction levels and types of paintings, we introduce a technique to adjust the parameters of the transfer depending on the painting. We show that our algorithm produces significantly better results than photo compositing or global stylization techniques and that it enables creative painterly edits that would be otherwise difficult to achieve.

Palette‐based image decomposition has attracted increasing attention in recent years. A specific class of approaches has been proposed based on RGB‐space geometry, which construct convex hulls whose vertices act as palette colors. However, such palettes are not guaranteed to contain representative colors that actually appear in the image, which makes editing palette colors for recoloring less intuitive and less predictable. Hence, we propose an improved geometric approach to address this issue. We use a polyhedron, but not necessarily a convex hull, in the RGB space to represent the color palette. We then formulate the task of palette extraction as an optimization problem that can be solved in a few seconds. Our palette has a higher degree of representativeness and maintains a relatively similar level of accuracy compared with previous methods. For layer decomposition, we compute layer opacities via simple mean value coordinates, which can provide instant feedback without precomputation. We have demonstrated our method for image recoloring on a variety of examples. In comparison with state‐of‐the‐art works, our approach is generally more intuitive and efficient, with fewer artifacts.
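
As a rough illustration of mean value coordinates, the sketch below computes them for a point inside a 2D polygon using the standard tangent half-angle formula. The abstract applies the idea to a polyhedron in 3D RGB space, so this 2D version is a simplified analogue under that assumption, and the function name is hypothetical. The point is assumed to lie strictly inside the polygon.

```python
import math


def mean_value_coordinates(point, polygon):
    """Mean value coordinates of `point` with respect to a 2D polygon.

    `polygon` is a list of (x, y) vertices in consistent order. The returned
    weights sum to 1 and reproduce `point` as a weighted sum of the vertices.
    """
    px, py = point
    n = len(polygon)
    offsets = [(vx - px, vy - py) for vx, vy in polygon]
    dists = [math.hypot(dx, dy) for dx, dy in offsets]

    # tan(alpha_i / 2), where alpha_i is the signed angle at `point`
    # between vertex i and vertex i + 1.
    half_tans = []
    for i in range(n):
        ax, ay = offsets[i]
        bx, by = offsets[(i + 1) % n]
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        half_tans.append(math.tan(math.atan2(cross, dot) / 2.0))

    # half_tans[i - 1] wraps around to the last entry when i == 0.
    raw = [(half_tans[i - 1] + half_tans[i]) / dists[i] for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
weights = mean_value_coordinates((0.25, 0.6), square)
```

Because the weights are given in closed form, they can be evaluated per pixel with no precomputation, which is consistent with the instant feedback mentioned in the abstract.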

We present a convolutional neural network architecture for performing joint design of color filter array (CFA) patterns and demosaicing. Our generic model allows the training of CFAs of arbitrary sizes, optimizing each color filter over the entire RGB color space. The patterns and algorithms produced by our method provide high‐quality color reconstructions. We demonstrate the effectiveness of our approach by showing that its results achieve higher PSNR than the ones obtained with state‐of‐the‐art techniques on all standard demosaicing datasets, both for noise‐free and noisy scenarios. Our method can also be used to obtain demosaicing strategies for pre‐defined CFAs, such as the Bayer pattern, for which our results also surpass even the demosaicing algorithms specifically designed for such a pattern.

Applying motion‐capture data to multi‐person interaction between virtual characters is challenging because one needs to preserve the interaction semantics while also satisfying the general requirements of motion retargeting, such as preventing penetration and preserving naturalness. An efficient means of representing interaction semantics is by defining the spatial relationships between the body parts of characters. However, existing methods consider only the character skeleton and thus are not suitable for capturing skin‐level spatial relationships. This paper proposes a novel method for retargeting interaction motions with respect to character skins. Specifically, we introduce the aura mesh, which is a volumetric mesh that surrounds a character's skin. The spatial relationships between two characters are computed from the overlap of the skin mesh of one character and the aura mesh of the other, and then the interaction motion retargeting is achieved by preserving the spatial relationships as much as possible while satisfying other constraints. We show the effectiveness of our method through a number of experiments.

Motion capture sequences may contain erroneous data, especially when the motion is complex or performers are interacting closely and occlusions are frequent. Common practice is to have specialists visually detect the abnormalities and fix them manually. In this paper, we present a method to automatically analyze and fix motion capture sequences by using self‐similarity analysis. The premise of this work is that human motion data has a high degree of self‐similarity. Therefore, given enough motion data, erroneous motions are distinct when compared to other motions. We utilize motion‐words that consist of short sequences of transformations of groups of joints around a given motion frame. We search for the K‐nearest neighbors (KNN) set of each word using dynamic time warping and use it to detect and fix erroneous motions automatically. We demonstrate the effectiveness of our method on various examples, and evaluate it by comparing to alternative methods and to manual cleaning.
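
The combination of dynamic time warping and a K‐nearest‐neighbor search can be sketched on scalar sequences as follows. This is a simplified illustration: real motion‐words are sequences of joint transformations, and the scoring function and all names below are assumptions, not the paper's actual detection rule.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two scalar sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


def knn_outlier_score(index, words, k):
    """Mean DTW distance from words[index] to its k nearest other words.

    A word that is far from even its nearest neighbors is a candidate
    erroneous motion.
    """
    dists = sorted(dtw_distance(words[index], w)
                   for i, w in enumerate(words) if i != index)
    return sum(dists[:k]) / k


words = [
    [0, 1, 2, 3],
    [0, 1, 1, 2, 3],     # same motion, slightly slower
    [0, 0, 1, 2, 3],     # same motion, delayed start
    [5, -5, 5, -5],      # glitchy word
]
scores = [knn_outlier_score(i, words, k=2) for i in range(len(words))]
```

The first three words differ only in timing, so their DTW distances to each other are zero, while the fourth word stands out with a large score.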

Spatially and temporally adaptive algorithms can substantially improve the computational efficiency of many numerical schemes in computational mechanics and physics‐based animation. Recently, a crucial need for temporal adaptivity in the Material Point Method (MPM) is emerging due to the potentially substantial variation of material stiffness and velocities in multi‐material scenes. In this work, we propose a novel temporally adaptive symplectic Euler scheme for MPM with regional time stepping (RTS), where different time steps are used in different regions. We design a time stepping scheduler operating at the granularity of small blocks to maintain a natural consistency with the hybrid particle/grid nature of MPM. Our method utilizes the Sparse Paged Grid (SPGrid) data structure and simultaneously offers high efficiency and notable ease of implementation with a practical multi‐threaded particle‐grid transfer strategy. We demonstrate the efficacy of our asynchronous MPM method on various examples including elastic objects, granular media, and fluids.
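
The motivation for regional time stepping can be illustrated with a toy example: two uncoupled spring-mass "regions" of very different stiffness, each advanced with symplectic Euler using its own power-of-two subdivision of a shared global step. This is only a sketch of the time-stepping idea under these assumptions. It is not an MPM implementation, it has no particle-grid transfers or SPGrid, and the step-size rule and all names are hypothetical.

```python
import math


def symplectic_euler_step(x, v, stiffness, mass, dt):
    """One symplectic Euler step: update velocity first, then position."""
    v = v + dt * (-stiffness * x / mass)
    x = x + dt * v
    return x, v


def substeps_for(stiffness, mass, global_dt, safety=0.1):
    """Smallest power-of-two subdivision that keeps the local step stable."""
    omega = math.sqrt(stiffness / mass)
    stable_dt = safety * 2.0 / omega   # stability limit for this scheme is 2 / omega
    n = 1
    while global_dt / n > stable_dt:
        n *= 2
    return n


def advance_regions(regions, global_dt, num_global_steps):
    """Advance every region to the same global time with its own local step."""
    for _ in range(num_global_steps):
        for r in regions:
            n = substeps_for(r["k"], r["m"], global_dt)
            local_dt = global_dt / n
            for _ in range(n):
                r["x"], r["v"] = symplectic_euler_step(
                    r["x"], r["v"], r["k"], r["m"], local_dt)
    return regions


def energy(r):
    return 0.5 * r["m"] * r["v"] ** 2 + 0.5 * r["k"] * r["x"] ** 2


regions = [
    {"k": 1.0, "m": 1.0, "x": 1.0, "v": 0.0},      # soft region
    {"k": 10000.0, "m": 1.0, "x": 1.0, "v": 0.0},  # stiff region
]
initial_energy = [energy(r) for r in regions]
advance_regions(regions, global_dt=0.1, num_global_steps=100)
```

Here the soft region takes one step per global step while the stiff region takes 64, so the small step is paid for only where the stiffness requires it.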

High dynamic range (HDR) imaging provides the capability of handling real‐world lighting, as opposed to traditional low dynamic range (LDR) imaging, which struggles to accurately represent scenes with a higher dynamic range. However, most imaging content is still available only in LDR. This paper presents a method for generating HDR content from LDR content based on deep Convolutional Neural Networks (CNNs), termed ExpandNet. ExpandNet accepts LDR images as input and generates images with an expanded range in an end‐to‐end fashion. The model attempts to reconstruct missing information that was lost from the original signal due to quantization, clipping, tone mapping or gamma correction. The added information is reconstructed from learned features, as the network is trained in a supervised fashion using a dataset of HDR images. The approach is fully automatic and data driven; it does not require any heuristics or human expertise. ExpandNet uses a multiscale architecture which avoids the use of upsampling layers to improve image quality. The method performs well compared to expansion/inverse tone mapping operators quantitatively on multiple metrics, even for badly exposed inputs.

Despite recent advances in surveying techniques, publicly available Digital Elevation Models (DEMs) of terrains are low‐resolution except for selected places on Earth. In this paper, we present a new method to turn low‐resolution DEMs into plausible and faithful high‐resolution terrains. Unlike other approaches for terrain synthesis/amplification (fractal noise, hydraulic and thermal erosion, multi‐resolution dictionaries), we benefit from high‐resolution aerial images to produce highly detailed DEMs mimicking the features of the real terrain. We explore different architectures for Fully Convolutional Neural Networks to learn upsampling patterns for DEMs from detailed training sets (high‐resolution DEMs and orthophotos), yielding up to one order of magnitude more resolution. Our comparative results show that our method outperforms competing data amplification approaches in terms of elevation accuracy and terrain plausibility.

The stochastic nature of Monte Carlo rendering algorithms inherently produces noisy images. Essentially, three approaches have been developed to solve this issue: improving the ray‐tracing strategies to reduce pixel variance, providing adaptive sampling by increasing the number of rays in the regions that need it, and filtering the noisy image as a post‐process. Although the algorithms in the last category introduce bias, they remain highly attractive as they quickly improve the visual quality of the images, are compatible with all sorts of rendering effects, have a low computational cost and, for some of them, avoid deep modifications of the rendering engine. In this paper, we build upon recent advances in both non‐local and collaborative filtering methods to propose a new efficient denoising operator for Monte Carlo rendering. Starting from the local statistics that emanate from each pixel's sample distribution, we enrich the image with local covariance measures and introduce a non‐local Bayesian filter which is specifically designed to address the noise stemming from Monte Carlo rendering. The resulting algorithm only requires the rendering engine to provide, for each pixel, a histogram and a covariance matrix of its color samples. Compared to state‐of‐the‐art sample‐based methods, we obtain improved denoising results, especially in dark areas, with a large increase in speed and more robustness with respect to the main parameter of the algorithm. We provide a detailed mathematical exposition of our Bayesian approach, discuss extensions to multiscale execution, adaptive sampling and animated scenes, and experimentally validate it on a collection of scenes.

Video frame interpolation (VFI) enables many important applications such as slow motion playback and frame rate conversion. However, one major challenge in using VFI is accurately handling high dynamic range (HDR) scenes with complex motion. To this end, we explore the possible advantages of dual-exposure sensors that readily provide sharp short and blurry long exposures that are spatially registered and whose ends are temporally aligned. This way, motion blur registers temporally continuous information on the scene motion that, combined with the sharp reference, enables more precise motion sampling within a single camera shot. We demonstrate that this facilitates a more complex motion reconstruction in the VFI task, as well as HDR frame reconstruction that so far has been considered only for the originally captured frames, not in-between interpolated frames. We design a neural network trained in these tasks that clearly outperforms existing solutions. We also propose a metric for scene motion complexity that provides important insights into the performance of VFI methods at test time.

Cosine‐Weighted B‐spline (CWB) interpolation [Csé13] was originally proposed for volumetric data sampled on the Body‐Centered Cubic (BCC) lattice. The BCC lattice is well known to be optimal for sampling isotropically band‐limited signals above the Nyquist limit. However, the Face‐Centered Cubic (FCC) lattice has recently been proven to be optimal for low‐rate sampling. CWB interpolation is a state‐of‐the‐art technique on the BCC lattice, which outperforms, for example, the previously proposed box‐spline interpolation in terms of both efficiency and visual quality. In this paper, we show that CWB interpolation can be adapted to the FCC lattice as well, and that it yields signal reconstructions as isotropic as those on the BCC lattice.
