One of the most common tasks in image and video editing is the local adjustment of various properties (e.g., saturation or brightness) of regions within an image or video. Edge‐aware interpolation of user‐drawn scribbles offers a less effort‐intensive approach to this problem than traditional region selection and matting. However, the technique suffers from a number of limitations, such as reduced performance in the presence of texture contrast and the inability to handle fragmented appearances. We significantly improve the performance of edge‐aware interpolation for this problem by adding a boosting‐based classification step that learns to discriminate between the appearance of scribbled pixels. We show that this novel data term in combination with an existing edge‐aware optimization technique achieves substantially better results for the local image and video adjustment problem than edge‐aware interpolation techniques without classification, or related methods such as matting techniques or graph cut segmentation.

This paper investigates a new approach to color transfer. Rather than transferring color from one image to another globally, we propose a system with a stroke‐based user interface to provide a direct indication mechanism. We further present a multiple local color transfer method. Through our system, the user can easily enhance a defective (source) photo by referring to other good‐quality (target) images simply by drawing some strokes. The system then performs the multiple local color transfer automatically. The system consists of two major steps. First, the user draws strokes on the source and target images to indicate corresponding regions as well as the regions he or she wants to preserve. The regions to be preserved are masked out using an improved graph cuts algorithm. Second, a multiple local color transfer method transfers the color from the target image(s) to the source image through gradient‐guided pixel‐wise color transfer functions. As a result, the defective (source) image can be enhanced seamlessly based on good‐quality (target) examples through an interactive and intuitive stroke‐based user interface.

This paper presents a novel video stabilization approach that leverages the multi‐plane structure of a video scene to stabilize inter‐frame motion. As opposed to previous stabilization procedures that operate on a single plane, our approach primarily deals with multiplane videos and builds their multi‐plane structure so that stabilization can be performed in the respective planes. To this end, a robust plane detection scheme is devised to detect multiple planes by classifying feature trajectories according to the reprojection errors generated by plane‐induced homographies. Then, an improved planar stabilization technique is applied by conforming to the compensated homography in each plane. Finally, the multiple stabilized planes are coherently fused by content‐preserving image warps to obtain the output stabilized frames. Our approach does not need any stereo reconstruction, yet is able to produce commendable results owing to its awareness of the multi‐plane structure during stabilization. Experimental results demonstrate the effectiveness and efficiency of our approach for robust stabilization of multiplane videos.
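The plane detection step described above can be illustrated with a minimal sketch. It assumes the candidate homographies are already known and labels each feature match by the homography with the smallest reprojection error. The function names, the threshold `max_error`, and the use of single frame-pair matches instead of full trajectories are assumptions made for illustration, not details from the paper.

```python
def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def reprojection_error(H, src, dst):
    """Euclidean distance between H(src) and the observed point dst."""
    u, v = apply_homography(H, src)
    return ((u - dst[0]) ** 2 + (v - dst[1]) ** 2) ** 0.5


def assign_planes(homographies, matches, max_error=2.0):
    """Label each (src, dst) match with the index of the best-fitting
    homography, or -1 if no homography explains it within max_error."""
    labels = []
    for src, dst in matches:
        errors = [reprojection_error(H, src, dst) for H in homographies]
        best = min(range(len(errors)), key=errors.__getitem__)
        labels.append(best if errors[best] <= max_error else -1)
    return labels


# Two hypothetical planes whose induced motion is a pure translation.
H_a = [[1, 0, 5], [0, 1, 0], [0, 0, 1]]  # shift by (5, 0)
H_b = [[1, 0, 0], [0, 1, 3], [0, 0, 1]]  # shift by (0, 3)
matches = [((0, 0), (5, 0)), ((1, 1), (1, 4)), ((2, 2), (50, 50))]
labels = assign_planes([H_a, H_b], matches)
```

In a real pipeline the homographies themselves would have to be estimated robustly, for example with RANSAC, and the errors would be accumulated along each trajectory over several frames.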

Despite their high popularity, common high dynamic range (HDR) methods are still limited in their practical applicability: They assume that the input images are perfectly aligned, an assumption that is often violated in practice. Our paper not only frees the user from this unrealistic limitation, but even turns the missing alignment into an advantage: By exploiting the multiple exposures, we can create a super‐resolution image. The alignment step is performed by a modern energy‐based optic flow approach that takes into account the varying exposure conditions. Moreover, it produces dense displacement fields with subpixel precision. As a consequence, our approach can handle arbitrarily complex motion patterns caused by severe camera shake and moving objects. Additionally, it benefits from several advantages over existing strategies: (i) It is robust under outliers (noise, occlusions, saturation problems) and allows for sharp discontinuities in the displacement field. (ii) The alignment step requires neither camera calibration nor knowledge of the exposure times. (iii) It can be efficiently implemented on CPU and GPU architectures. After the alignment is performed, we use the obtained subpixel‐accurate displacement fields as input for an energy‐based, joint super‐resolution and HDR (SR‐HDR) approach. It introduces robust data terms and anisotropic smoothness terms into the SR‐HDR literature. Our experiments with challenging real‐world data demonstrate that these novelties are pivotal for the favourable performance of our approach.

Color quantization replaces the color of each pixel with the closest representative color, and thus partitions the resulting image into uniformly-colored regions. As a consequence, continuous, detailed variations of color over the corresponding regions in the original image are lost through color quantization. In this paper, we present a novel blind scheme for restoring such variations from a color-quantized input image without a priori knowledge of the quantization method. Our scheme identifies which pairs of uniformly-colored regions in the input image should have continuous variations of color in the resulting image. Then, such regions are seamlessly stitched through optimization while preserving the closest representative colors. The user can optionally indicate which regions should be separated or stitched by scribbling constraint brushes across the regions. We demonstrate the effectiveness of our approach through diverse examples, such as photographs, cartoons, and artistic illustrations.
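To make the loss of continuous variation concrete, the following minimal sketch quantizes a smooth gray ramp against a three-color palette by nearest-color lookup. The palette, the ramp, and the function name `quantize` are illustrative assumptions; this shows only the forward quantization step, not the restoration scheme proposed in the paper.

```python
def quantize(pixels, palette):
    """Replace each RGB pixel with the nearest palette color
    (squared Euclidean distance in RGB space)."""
    def nearest(color):
        return min(
            palette,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)),
        )
    return [nearest(c) for c in pixels]


# A smooth gray ramp with six distinct levels: 0, 51, 102, 153, 204, 255.
gradient = [(v, v, v) for v in range(0, 256, 51)]
# Hypothetical three-color palette.
palette = [(0, 0, 0), (128, 128, 128), (255, 255, 255)]

quantized = quantize(gradient, palette)
distinct_before = len(set(gradient))
distinct_after = len(set(quantized))
```

The six input levels collapse into three flat regions, which is the banding that a restoration method has to undo.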

A novel method is given for content‐aware video resizing, i.e. targeting video to a new resolution (which may involve aspect ratio change) from the original. We precompute a per‐pixel cumulative shrinkability map which takes into account both the importance of each pixel and the need for continuity in the resized result. (If both x and y resizing are required, two separate shrinkability maps are used; otherwise one suffices.) A random walk model is used for efficient offline computation of the shrinkability maps. The latter are stored with the video to create a multi‐sized video, which permits arbitrary‐sized new versions of the video to be created later very efficiently in real time, e.g. by a video‐on‐demand server supplying video streams to multiple devices with different resolutions. These shrinkability maps are highly compressible, so the resulting multi‐sized videos are typically less than three times the size of the original compressed video. A scaling function operates on the multi‐sized video to give the new pixel locations in the result, yielding a high‐quality content‐aware resized video. Despite the great efficiency and low storage requirements of our method, we produce results of comparable quality to state‐of‐the‐art methods for content‐aware image and video resizing.

Image completion techniques aim to complete selected regions of an image in a natural‐looking manner with little or no user interaction. Video completion, the space–time equivalent of the image completion problem, inherits and extends both the difficulties and the solutions of the original 2D problem, but also imposes new ones, mainly temporal coherency and space complexity (videos contain significantly more information than images). Data‐driven approaches to completion have been established as a favoured choice, especially when large regions have to be filled. In this survey, we present the current state of the art in data‐driven video completion techniques. For unacquainted researchers, we aim to provide a broad yet easy‐to‐follow introduction to the subject (including an extensive review of the image completion foundations) and early guidance on the challenges ahead. For a versed reader, we offer a comprehensive review of the contemporary techniques, organized by their approaches to key aspects of the problem.

Many useful algorithms for processing images and geometry fall under the general framework of high‐dimensional Gaussian filtering. This family of algorithms includes bilateral filtering and non‐local means. We propose a new way to perform such filters using the permutohedral lattice, which tessellates high‐dimensional space with uniform simplices. Our algorithm is the first implementation of a high‐dimensional Gaussian filter that is both linear in input size and polynomial in dimensionality. Furthermore it is parameter‐free, apart from the filter size, and achieves a consistently high accuracy relative to ground truth (> 45 dB). We use this to demonstrate a number of interactive‐rate applications of filters in as high as eight dimensions.
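The claim that bilateral filtering is a special case of high‐dimensional Gaussian filtering can be illustrated with a brute‐force sketch: each sample is lifted to a position vector (scaled coordinate, scaled intensity), and a unit‐variance Gaussian is applied in that lifted space. This is a quadratic‐time reference version written for clarity, not the permutohedral lattice algorithm of the paper; the function names and the test signal are assumptions.

```python
import math


def gaussian_filter_nd(positions, values):
    """Brute-force normalized Gaussian filter with unit standard deviation
    in the space spanned by the position vectors. Cost is O(N^2)."""
    out = []
    for p in positions:
        num = 0.0
        den = 0.0
        for q, v in zip(positions, values):
            dist2 = sum((a - b) ** 2 for a, b in zip(p, q))
            w = math.exp(-0.5 * dist2)
            num += w * v
            den += w
        out.append(num / den)
    return out


def bilateral_1d(signal, sigma_space, sigma_range):
    """Bilateral filter of a 1D signal, expressed as a Gaussian filter
    over 2D position vectors (index / sigma_space, value / sigma_range)."""
    positions = [(i / sigma_space, s / sigma_range) for i, s in enumerate(signal)]
    return gaussian_filter_nd(positions, signal)


# A noisy step edge: low values on the left, high values on the right.
step = [0.0, 0.1, 0.0, 0.1, 1.0, 0.9, 1.0, 0.9]
smoothed = bilateral_1d(step, sigma_space=2.0, sigma_range=0.2)
```

Samples on opposite sides of the step are far apart in the lifted space, so they barely influence each other and the edge survives the smoothing. Non‐local means fits the same template with patch descriptors in place of the single intensity coordinate.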

High‐quality video editing usually requires accurate layer separation in order to resolve occlusions. However, most of the existing bilayer segmentation algorithms require either considerable user intervention or a simple stationary camera configuration with known background, requirements that are difficult to meet in many real‐world online applications. This paper demonstrates that various visually appealing montage effects can be created online from a live video captured by a rotating camera, by accurately retrieving the camera state and segmenting out the dynamic foreground. The key contribution is a novel fast bilayer segmentation method that can effectively extract the dynamic foreground under a rotational camera configuration, and is robust to imperfect background estimation and complex background colors. Our system can create a variety of live visual effects, including but not limited to realistic virtual object insertion, background substitution and blurring, non‐photorealistic rendering and camouflage effects. A variety of challenging examples demonstrate the effectiveness of our method.

Many video sequences consist of a locally dynamic background containing moving foreground subjects. In this paper we propose a novel way of re‐displaying these sequences, by giving the user control over a virtual camera frame. Based on video mosaicing, we first compute a static high quality background panorama. After segmenting and removing the foreground subjects from the original video, the remaining elements are merged into a dynamic background panorama, which seamlessly extends the original video footage. We then re‐display this augmented video by warping and cropping the panorama. The virtual camera can have an enlarged field‐of‐view and a controlled camera motion. Our technique is able to process videos with complex camera motions, reconstructing high quality panoramas without parallax artefacts, visible seams or blurring, while retaining repetitive dynamic elements.

This paper proposes an algorithm which uses image registration to estimate a non‐uniform motion blur point spread function (PSF) caused by camera shake. Our study is based on a motion blur model which models blur effects of camera shakes using a set of planar perspective projections (i.e., homographies). This representation can fully describe motions of camera shakes in 3D which cause non‐uniform motion blurs. We transform the non‐uniform PSF estimation problem into a set of image registration problems which estimate homographies of the motion blur model one‐by‐one through the Lucas‐Kanade algorithm. We demonstrate the performance of our algorithm using both synthetic and real world examples. We also discuss the effectiveness and limitations of our algorithm for non‐uniform deblurring.
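A homography‐based blur model of the kind described above is commonly written as a weighted sum of warped copies of the sharp image. The form below is a sketch under assumed notation (B, L, H_i, w_i and n are not taken from the paper): B is the observed blurred image, L the latent sharp image, H_i the homography for the i‐th camera pose during the exposure, w_i the fraction of the exposure spent at that pose, and n additive noise.

```latex
B(\mathbf{x}) \;=\; \sum_{i=1}^{N} w_i \, L\!\left(H_i \mathbf{x}\right) + n(\mathbf{x}),
\qquad w_i \ge 0, \quad \sum_{i=1}^{N} w_i = 1
```

Here x is a pixel position in homogeneous coordinates. Under this reading, estimating the non‐uniform PSF amounts to recovering the set of homographies (and their weights), which is why each one can be posed as an image registration problem.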

Restoration of photographs damaged by camera shake is a challenging task that has attracted increasing attention in recent years. Despite the important progress of blind deconvolution techniques, the finest details of the blur kernel cannot be recovered entirely, owing to the ill-posed nature of the problem. Moreover, the additional constraints and prior assumptions make these approaches relatively limited.
In this paper we introduce a novel technique that removes the undesired blur artifacts from photographs taken by hand-held digital cameras. Our approach is based on the observation that, in general, several consecutive photographs taken by users share image regions that project the same scene content. Therefore, we take advantage of additional sharp photographs of the same scene. Based on several invariant local feature points, filtered from the given blurred/non-blurred images, our approach matches the keypoints and estimates the blur kernel using additional statistical constraints.
We also present a simple deconvolution technique that preserves edges while minimizing the ringing artifacts in the restored latent image. The experimental results show that our technique is able to infer the blur kernel accurately while significantly reducing the artifacts of the spoilt images.

Videos captured by consumer cameras often exhibit temporal variations in color and tone that are caused by camera auto‐adjustments like white‐balance and exposure. When such videos are sub‐sampled to play fast‐forward, as in the increasingly popular forms of timelapse and hyperlapse videos, these temporal variations are exacerbated and appear as visually disturbing high frequency flickering. Previous techniques to photometrically stabilize videos typically rely on computing dense correspondences between video frames, and use these correspondences to remove all color changes in the video sequences. However, this approach is limited in fast‐forward videos that often have large content changes and also might exhibit changes in scene illumination that should be preserved. In this work, we propose a novel photometric stabilization algorithm for fast‐forward videos that is robust to large content‐variation across frames. We compute pairwise color and tone transformations between neighboring frames and smooth these pair‐wise transformations while taking into account the possibility of scene/content variations. This allows us to eliminate high‐frequency fluctuations, while still adapting to real variations in scene characteristics. We evaluate our technique on a new dataset consisting of controlled synthetic and real videos, and demonstrate that our technique outperforms the state‐of‐the‐art.

Image blur caused by object motion attenuates high frequency content of images, making post‐capture deblurring an ill‐posed problem. The recoverable frequency band quickly becomes narrower for faster object motion as high frequencies are severely attenuated and virtually lost. This paper proposes to translate a camera sensor circularly about the optical axis during exposure, so that high frequencies can be preserved for a wide range of in‐plane linear object motion in any direction within some predetermined speed. That is, although no object may be photographed sharply at capture time, differently moving objects captured in a single image can be deconvolved with similar quality. In addition, circular sensor motion is shown to facilitate blur estimation thanks to distinct frequency zero patterns of the resulting motion blur point‐spread functions. An analysis of the frequency characteristics of circular sensor motion in relation to linear object motion is presented, along with deconvolution results for photographs captured with a prototype camera.

Collaborative filtering collects similar patches, jointly filters them and scatters the output back to input patches; each pixel gets a contribution from each patch that overlaps with it, allowing signal reconstruction from highly corrupted data. Exploiting self‐similarity, however, requires finding matching image patches, which is an expensive operation. We propose a GPU‐friendly approximate nearest‐neighbour (ANN) algorithm that produces high‐quality results for any type of collaborative filter. We evaluate our ANN search against state‐of‐the‐art ANN algorithms in several application domains. Our method is orders of magnitude faster, yet provides results of similar or higher quality than previous work.

The viewfinder of a digital camera has traditionally been used for one purpose: to display to the user a preview of what is seen through the camera's lens. High quality cameras are now available on devices such as mobile phones and PDAs, which provide a platform where the camera is a programmable device, enabling applications such as online computational photography, computer vision‐based interactive gaming, and augmented reality. For such online applications, the camera viewfinder provides the user's main interaction with the environment. In this paper, we describe an algorithm for aligning successive viewfinder frames. First, an estimate of inter‐frame translation is computed by aligning integral projections of edges in two images. The estimate is then refined to compute a full 2D similarity transformation by aligning point features. Our algorithm is robust to noise, never requires storing more than one viewfinder frame in memory, and runs at 30 frames per second on standard smartphone hardware. We use viewfinder alignment for panorama capture, low‐light photography, and a camera‐based game controller.
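The first stage, translation estimation from integral projections, can be sketched as follows. The sketch sums each frame along rows and columns and searches for the 1D shift that best aligns the resulting profiles. For brevity it projects raw intensities instead of edge maps, and the function names, the search range `max_shift`, and the synthetic frames are assumptions made for illustration.

```python
def projections(img):
    """Return (row sums, column sums) of a 2D image given as nested lists."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows, cols


def best_shift(a, b, max_shift):
    """Return the shift s minimizing the mean squared difference between
    a[i] and b[i + s] over the overlapping samples.
    Assumes max_shift is smaller than the profile length."""
    best, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pairs = [(a[i], b[i + s]) for i in range(len(a)) if 0 <= i + s < len(b)]
        cost = sum((x - y) ** 2 for x, y in pairs) / len(pairs)
        if cost < best_cost:
            best, best_cost = s, cost
    return best


def estimate_translation(img_a, img_b, max_shift=3):
    """Estimate (dx, dy) such that content in img_a appears shifted by
    dx columns and dy rows in img_b."""
    rows_a, cols_a = projections(img_a)
    rows_b, cols_b = projections(img_b)
    return best_shift(cols_a, cols_b, max_shift), best_shift(rows_a, rows_b, max_shift)


def make_frame(top, left, size=8):
    """Synthetic frame: a 2x2 bright square on a dark background."""
    img = [[0] * size for _ in range(size)]
    for r in range(top, top + 2):
        for c in range(left, left + 2):
            img[r][c] = 1
    return img


frame_a = make_frame(2, 2)
frame_b = make_frame(3, 4)  # square moved 1 row down and 2 columns right
shift = estimate_translation(frame_a, frame_b)
```

Only two 1D profiles per frame need to be kept, which is consistent with the stated goal of never storing more than one viewfinder frame in memory.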

Edge‐preserving image filtering is a valuable tool for a variety of applications in image processing and computer vision. Motivated by a new simple but effective local Laplacian filter, we propose a scalable and efficient image filtering framework to extend this edge‐preserving image filter and construct a uniform implementation in O(N) time. The proposed framework is built upon a practical global‐to‐local strategy. The input image is first remapped globally by a series of tentative remapping functions to generate a virtual candidate image sequence (Virtual Image Pyramid Sequence, VIPS). This sequence is then recombined locally into a single output image by a flexible edge‐aware pixel‐level fusion rule. To avoid halo artifacts, both the output image and the virtual candidate image sequence are transformed into multi‐resolution pyramid representations. Four examples, namely single image dehazing, multi‐exposure fusion, fast edge‐preserving filtering and tone‐mapping, are presented as concrete applications of the proposed framework. Experiments on filtering effect and computational efficiency indicate that the proposed framework can be used to build a wide range of fast image filters that yield visually compelling results.

Video capture is limited by the trade‐off between spatial and temporal resolution: when capturing videos of high temporal resolution, the spatial resolution decreases due to bandwidth limitations in the capture system. Achieving both high spatial and temporal resolution is only possible with highly specialized and very expensive hardware, and even then the same basic trade‐off remains. The recent introduction of compressive sensing and sparse reconstruction techniques allows for the capture of single‐shot high‐speed video, by coding the temporal information in a single frame, and then reconstructing the full video sequence from this single‐coded image and a trained dictionary of image patches. In this paper, we first analyse this approach, and find insights that help improve the quality of the reconstructed videos. We then introduce a novel technique, based on convolutional sparse coding (CSC), and show how it outperforms the state‐of‐the‐art, patch‐based approach in terms of flexibility and efficiency, due to the convolutional nature of its filter banks. The key idea for CSC high‐speed video acquisition is extending the basic formulation by imposing an additional constraint in the temporal dimension, which enforces sparsity of the first‐order derivatives over time.
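One plausible way to write such an objective is sketched below. All notation is assumed for illustration: y is the single coded image, S_t the per‐frame coding mask, d_k the learned convolutional filters, Z_{k,t} the sparse feature map of filter k at frame t, and β, γ regularization weights. Whether the temporal term acts on the feature maps (as written here) or on the reconstructed frames is also an assumption; the abstract does not specify it.

```latex
\min_{Z} \;\; \frac{1}{2} \Big\| \, y - \sum_{t=1}^{T} S_t \odot \sum_{k=1}^{K} d_k * Z_{k,t} \Big\|_2^2
\;+\; \beta \sum_{k=1}^{K} \sum_{t=1}^{T} \big\| Z_{k,t} \big\|_1
\;+\; \gamma \sum_{k=1}^{K} \sum_{t=1}^{T-1} \big\| Z_{k,t+1} - Z_{k,t} \big\|_1
```

The first term fits the coded measurement, the second is the usual CSC sparsity prior, and the third penalizes first‐order temporal differences so that the representation changes sparsely over time.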

The topological structure of scalar, vector, and second‐order tensor fields provides an important mathematical basis for data analysis and visualization. In this paper, we extend this framework towards higher‐order tensors. First, we establish formal uniqueness properties for a geometrically constrained tensor decomposition. This allows us to define and visualize topological structures in symmetric tensor fields of orders three and four. We clarify that in 2D, degeneracies occur at isolated points, regardless of tensor order. However, for orders higher than two, they are no longer equivalent to isotropic tensors, and their fractional Poincaré index prevents us from deriving continuous vector fields from the tensor decomposition. Instead, sorting the terms by magnitude leads to a new type of feature, lines along which the resulting vector fields are discontinuous. We propose algorithms to extract these features and present results on higher‐order derivatives and higher‐order structure tensors.

