Similar literature
20 similar documents found (search time: 31 ms)
1.
A new method for focus measure computation is proposed to reconstruct 3D shape from an image sequence acquired under a varying focus plane. Adaptive histogram equalization is applied to enhance varying contrast across different image regions for better detection of sharp intensity variations. The fast discrete curvelet transform (FDCT) is employed for enhanced representation of singularities along curves in an input image, followed by noise removal using a bivariate shrinkage scheme based on locally estimated variance. The FDCT coefficients with high activity are exploited to detect high-frequency variations of pixel intensities in a sequence of images. Finally, the focus measure is computed utilizing neighborhood support of these coefficients to reconstruct the shape and a well-focused image of the scene being probed.

2.
We present a batch method for recovering Euclidean camera motion from sparse image data. The main purpose of the algorithm is to recover the motion parameters using as much of the available information and as few computational steps as possible. The algorithm thus places itself in the gap between factorisation schemes, which make use of all available information in the initial recovery step, and sequential approaches, which are able to handle sparseness in the image data. Euclidean camera matrices are approximated via the affine camera model, thus making the recovery direct in the sense that no intermediate projective reconstruction is made. Using a little-known closure constraint, the FA-closure, we are able to formulate the camera coefficients linearly in the entries of the affine fundamental matrices. The novelty of the presented work is twofold. Firstly, the presented formulation allows not only for particularly good conditioning of the estimation of the initial motion parameters but also for an unprecedented diversity in the choice of possible regularisation terms. Secondly, the new autocalibration scheme presented here is in practice guaranteed to yield a Least Squares Estimate of the calibration parameters. As a by-product, the affine camera model is rehabilitated as a useful model for most cameras and scene configurations, e.g. wide angle lenses observing a scene at close range. Experiments on real and synthetic data demonstrate the ability to reconstruct scenes which are very problematic for previous structure from motion techniques due to local ambiguities and error accumulation.

3.
We investigate the feasibility of reconstructing an arbitrarily-shaped specular scene (refractive or mirror-like) from one or more viewpoints. By reducing shape recovery to the problem of reconstructing individual 3D light paths that cross the image plane, we obtain three key results. First, we show how to compute the depth map of a specular scene from a single viewpoint, when the scene redirects incoming light just once. Second, for scenes where incoming light undergoes two refractions or reflections, we show that three viewpoints are sufficient to enable reconstruction in the general case. Third, we show that it is impossible to reconstruct individual light paths when light is redirected more than twice. Our analysis assumes that, for every point on the image plane, we know at least one 3D point on its light path. This leads to reconstruction algorithms that rely on an “environment matting” procedure to establish pixel-to-point correspondences along a light path. Preliminary results for a variety of scenes (mirror, glass, etc.) are also presented. Part of this research was conducted while K. Kutulakos was serving as a Visiting Scholar at Microsoft Research Asia.

4.
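Objective: Objective evaluation is an important research area in image fusion and a powerful tool for assessing the performance of fusion algorithms. Dozens of evaluation metrics of different types already exist, but application areas, including visible and infrared image fusion, still lack a unified basis for choosing among them. To make it easier to compare the performance of different fusion algorithms, a general analysis method for objective evaluation metrics is proposed and applied to visible and infrared image fusion. Method: The objective evaluation metrics in a visible and infrared image benchmark dataset are divided into two classes: metrics based on the fused image alone, and metrics based on both the source images and the fused image. The Kendall correlation coefficient is used to analyze the correlation between fusion metrics, and clustering yields groups of metrics. The Borda count ranking method is used to compute an overall ranking of the algorithms, and the correlation between each single-metric ranking and the overall ranking is analyzed to obtain a set of metrics with high consistency. The coefficient of variation is used to analyze how much each metric's mean value fluctuates across algorithms, so as to select metrics that fully reflect the differences between algorithms. Combining the correlation, consistency, and coefficient-of-variation analyses, a representative set of recommended metrics is summarized. Results: On two groups of image sources, 13 pairs of color visible and infrared images and 8 pairs of grayscale visible and infrared images, the objective evaluation data of different image fusion algorithms were analyzed statistically, yielding a recommended metric set for visible and infrared image fusion (standard deviation and edge preservation) as an important reference for evaluating fusion algorithm performance. Compared with existing methods, the experiments cover 20 fusion algorithms and 13 objective evaluation metrics and do not depend on subjective evaluation results. Conclusion...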
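The Borda count ranking step named in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `borda_ranking`, the scoring convention (n-1 points for first place down to 0 for last), and the alphabetical tie-break are all assumptions made for illustration.

```python
def borda_ranking(rankings):
    """Combine several best-first rankings of the same items with a Borda count.

    rankings: list of lists; each inner list orders the same algorithm names
    from best to worst according to one evaluation metric (hypothetical input).
    Returns (overall_order, scores).
    """
    n = len(rankings[0])
    scores = {name: 0 for name in rankings[0]}
    for order in rankings:
        for position, name in enumerate(order):
            # First place earns n-1 points, last place earns 0 (assumed convention).
            scores[name] += n - 1 - position
    # Highest total first; ties broken alphabetically so the result is deterministic.
    overall = sorted(scores, key=lambda name: (-scores[name], name))
    return overall, scores


# Three hypothetical metrics ranking three hypothetical fusion algorithms.
per_metric = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
overall_order, totals = borda_ranking(per_metric)
```

Each single-metric ranking could then be compared against `overall_order`, which is the kind of consistency check the abstract describes.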

5.
In this paper we study the problem of recovering the 3D shape, reflectance, and non-rigid motion properties of a dynamic 3D scene. Because these properties are completely unknown and because the scene's shape and motion may be non-smooth, our approach uses multiple views to build a piecewise-continuous geometric and radiometric representation of the scene's trace in space-time. A basic primitive of this representation is the dynamic surfel, which (1) encodes the instantaneous local shape, reflectance, and motion of a small and bounded region in the scene, and (2) enables accurate prediction of the region's dynamic appearance under known illumination conditions. We show that complete surfel-based reconstructions can be created by repeatedly applying an algorithm called Surfel Sampling that combines sampling and parameter estimation to fit a single surfel to a small, bounded region of space-time. Experimental results with the Phong reflectance model and complex real scenes (clothing, shiny objects, skin) illustrate our method's ability to explain pixels and pixel variations in terms of their underlying causes: shape, reflectance, motion, illumination, and visibility.

6.
Shape-from-focus (SFF) is a passive technique widely used in image processing for obtaining depth-maps. This technique is attractive since it only requires a single monocular camera with focus control, thus avoiding correspondence problems typically found in stereo, as well as more expensive capturing devices. However, one of its main drawbacks is its poor performance when the change in the focus level is difficult to detect. Most research in SFF has focused on improving the accuracy of the depth estimation. Less attention has been paid to the problem of providing quality measures in order to predict the performance of SFF without prior knowledge of the recovered scene. This paper proposes a reliability measure aimed at assessing the quality of the depth-map obtained using SFF. The proposed reliability measure (the R-measure) analyzes the shape of the focus measure function and estimates the likelihood of obtaining an accurate depth estimation without any previous knowledge of the recovered scene. The proposed R-measure is then applied for determining the image regions where SFF will not perform correctly in order to discard them. Experiments with both synthetic and real scenes are presented.
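To make the basic SFF idea concrete, here is a minimal sketch of how a depth index can be picked per pixel from a focus stack. It does not implement the paper's R-measure. The absolute 4-neighbour Laplacian is used as a stand-in focus measure, and the function names `laplacian_focus` and `depth_from_focus` are hypothetical.

```python
def laplacian_focus(image):
    """Absolute 4-neighbour Laplacian per interior pixel; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    fm = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            fm[y][x] = abs(
                4 * image[y][x]
                - image[y - 1][x] - image[y + 1][x]
                - image[y][x - 1] - image[y][x + 1]
            )
    return fm


def depth_from_focus(stack):
    """For each pixel, return the index of the frame with the largest focus measure.

    stack: list of equally sized grayscale images (lists of rows), one per
    focus setting. Ties resolve to the earliest frame.
    """
    measures = [laplacian_focus(img) for img in stack]
    h, w = len(stack[0]), len(stack[0][0])
    return [
        [max(range(len(stack)), key=lambda k: measures[k][y][x]) for x in range(w)]
        for y in range(h)
    ]


# Tiny synthetic stack: only the middle frame has a sharp centre pixel.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
sharp = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
soft = [[5, 5, 5], [5, 6, 5], [5, 5, 5]]
depth = depth_from_focus([flat, sharp, soft])
```

Where the focus measure curve over frame index is flat, as at the border pixels here, the argmax is arbitrary. That is the situation a reliability measure of the kind described above is meant to flag.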

7.
s and t within a given planar figure F is considered. The approach contains basic methodology developed for any parallel or distributed system. The 2D scene or the edge of F is represented in the n Cartesian coordinate system (n-CCS). Several algorithms for the shortest path are given, each one to be applied in specified circumstances depending on the exact machine model or on additional information concerning geometrical properties of the figure. If these algorithms are implemented in a parallel depth search machine (PDSM), then the shortest path can be computed in time O(1). The maximum number of processors used is O(n). The given methodology can also be adapted for producing an approximate solution when the shortest path is approximated by polygonal lines.

8.
Traditional depth estimation methods typically exploit the effect of either the variations in internal parameters such as aperture and focus (as in depth from defocus), or variations in extrinsic parameters such as position and orientation of the camera (as in stereo). When operating off-the-shelf (OTS) cameras in a general setting, these parameters influence the depth of field (DOF) and field of view (FOV). While DOF mandates one to deal with defocus blur, a larger FOV necessitates camera motion during image acquisition. As a result, for unfettered operation of an OTS camera, it becomes inevitable to account for pixel motion as well as optical defocus blur in the captured images. We propose a depth estimation framework using calibrated images captured under general camera motion and lens parameter variations. Our formulation seeks to generalize the constrained areas of stereo and shape from defocus (SFD)/focus (SFF) by handling, in tandem, various effects such as focus variation, zoom, parallax and stereo occlusions, all under one roof. One of the associated challenges in such an unrestrained scenario is the problem of removing user-defined foreground occluders in the reference depth map and image (termed inpainting of depth and image). Inpainting is achieved by exploiting the cue from motion parallax to discover (in other images) the correspondence/color information missing in the reference image. Moreover, considering the fact that the observations could be differently blurred, it is important to ensure that the degree of defocus in the missing regions (in the reference image) is coherent with the local neighbours (defocus inpainting).

9.
A Theory of Shape by Space Carving   Total citations: 30, self-citations: 9, citations by others: 21
In this paper we consider the problem of computing the 3D shape of an unknown, arbitrarily-shaped scene from multiple photographs taken at known but arbitrarily-distributed viewpoints. By studying the equivalence class of all 3D shapes that reproduce the input photographs, we prove the existence of a special member of this class, the photo hull, that (1) can be computed directly from photographs of the scene, and (2) subsumes all other members of this class. We then give a provably-correct algorithm, called Space Carving, for computing this shape and present experimental results on complex real-world scenes. The approach is designed to (1) capture photorealistic shapes that accurately model scene appearance from a wide range of viewpoints, and (2) account for the complex interactions between occlusion, parallax, shading, and their view-dependent effects on scene-appearance.

10.
Lithography as deep as 400 μm has been carried out to fabricate X-ray refractive lenses using a low energy synchrotron source (AURORA-2 S, 0.7 GeV). The lens, made of PMMA, has two parabolic curvatures with radii $R = 4$ μm and apertures $A = 2(2Rz)^{1/2} = 179$ μm, thus the aspect ratio $z/R = 250$ for its curvatures, which is too great for traditional techniques to achieve. Upon fabrication of the lenses, precision of the curvatures has been evaluated by digital imaging analysis. The lens can singly focus a beam of hard X-rays into several microns at a reasonable focal length $F = 1.5$ m. Advantages of using a low energy source for the LIGA process will be discussed regarding problems such as thick absorbers demanded by the LIGA mask and heat-load occurring in thick resist layers. Received: 10 August 2001/Accepted: 24 September 2001
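As a quick consistency check on the quoted lens geometry, the aperture relation can be evaluated with the stated values. The depth z is not given directly in the abstract; taking it from the stated aspect ratio z/R = 250 is an assumption of this sketch.

```latex
\begin{align*}
z &= 250\,R = 250 \times 4\ \mu\mathrm{m} = 1000\ \mu\mathrm{m},\\
A &= 2\,(2Rz)^{1/2}
   = 2\sqrt{2 \times 4\ \mu\mathrm{m} \times 1000\ \mu\mathrm{m}}
   = 2\sqrt{8000}\ \mu\mathrm{m}
   \approx 178.9\ \mu\mathrm{m}.
\end{align*}
```

This rounds to the 179 μm aperture quoted in the abstract, so the three stated quantities (R, A, and z/R) agree with one another.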

11.
A Closed-Form Solution to Non-Rigid Shape and Motion Recovery   Total citations: 2, self-citations: 0, citations by others: 2
Recovery of three-dimensional (3D) shape and motion of non-static scenes from a monocular video sequence is important for applications like robot navigation and human computer interaction. If every point in the scene randomly moves, it is impossible to recover the non-rigid shapes. In practice, many non-rigid objects, e.g. the human face under various expressions, deform with certain structures. Their shapes can be regarded as a weighted combination of certain shape bases. Shape and motion recovery under such situations has attracted much interest. Previous work on this problem (Bregler, C., Hertzmann, A., and Biermann, H. 2000. In Proc. Int. Conf. Computer Vision and Pattern Recognition; Brand, M. 2001. In Proc. Int. Conf. Computer Vision and Pattern Recognition; Torresani, L., Yang, D., Alexander, G., and Bregler, C. 2001. In Proc. Int. Conf. Computer Vision and Pattern Recognition) utilized only orthonormality constraints on the camera rotations (rotation constraints). This paper proves that using only the rotation constraints results in ambiguous and invalid solutions. The ambiguity arises from the fact that the shape bases are not unique. An arbitrary linear transformation of the bases produces another set of eligible bases. To eliminate the ambiguity, we propose a set of novel constraints, basis constraints, which uniquely determine the shape bases. We prove that, under the weak-perspective projection model, enforcing both the basis and the rotation constraints leads to a closed-form solution to the problem of non-rigid shape and motion recovery. The accuracy and robustness of our closed-form solution are evaluated quantitatively on synthetic data and qualitatively on real video sequences.

12.
Recently, various techniques of shape reconstruction using cast shadows have been proposed. These techniques have the advantage that they can be applied to various scenes, including outdoor scenes, without using special devices. Previously proposed techniques usually require calibration of camera parameters and light source positions, and such calibration processes limit the range of application of these techniques. In this paper, we propose a method to reconstruct 3D scenes even when the camera parameters or light source positions are unknown. The technique first recovers the shape with 4-DOF indeterminacy using coplanarities obtained by cast shadows of straight edges or visible planes in a scene, and then upgrades the shape using metric constraints obtained from the geometrical constraints in the scene. In order to circumvent the need for calibrations and special devices, we propose both linear and nonlinear methods in this paper. Experiments using simulated and real images verified the effectiveness of this technique.

13.
We propose a novel framework called transient imaging for image formation and scene understanding through impulse illumination and time images. Using time-of-flight cameras and multi-path analysis of global light transport, we pioneer new algorithms and systems for scene understanding through time images. We demonstrate that our proposed transient imaging framework allows us to accomplish tasks that are well beyond the reach of existing imaging technology. For example, one can infer the geometry of not only the visible but also the hidden parts of a scene, enabling us to look around corners. Traditional cameras estimate intensity per pixel I(x,y). Our transient imaging camera captures a 3D time-image I(x,y,t) for each pixel and uses an ultra-short pulse laser for illumination. Emerging technologies are supporting cameras with a temporal-profile per pixel at picosecond resolution, allowing us to capture an ultra-high speed time-image. This time-image contains the time profile of irradiance incident at a sensor pixel. We experimentally corroborated our theory with free space hardware experiments using a femtosecond laser and a picosecond accurate sensing device. The ability to infer the structure of hidden scene elements, unobservable by both the camera and illumination source, will create a range of new computer vision opportunities.

14.
Abstract: The motion image quality of video systems with hold‐type displays, such as LCDs or OLEDs, was studied with regard to dynamic spatial frequency response and data from subjective evaluations on motion blur. The system parameters of motion image quality, namely frame rate (F) and temporal aperture (At), were investigated and their required values were derived. A smaller temporal aperture and/or higher frame rate can improve the dynamic response and motion image quality, but the parameters required in order to maintain a good dynamic response for high motion image velocity, such as a frame rate of 900 Hz, seem very difficult to implement. Therefore, the performance goal of video systems is set on the “limit of acceptance” for motion image quality, as a compromise. An equation relating motion image velocity to the required parameter values is derived based on dynamic response and data from subjective evaluations found in published studies. Possible examples of parameter sets are obtained from the equation. Those are (F = 300 Hz, At = 5/6), (F = 240 Hz, At = 2/3), (F = 120 Hz, At = 1/3), and (F > 360 Hz, At = 1).
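One way to read the listed parameter sets is through the per-frame hold time. The sketch below assumes the temporal aperture At is the fraction of the frame period during which the image is displayed, so that hold time = At / F. That interpretation and the name `hold_time` are assumptions, not taken from the abstract.

```python
from fractions import Fraction


def hold_time(frame_rate_hz, temporal_aperture):
    """Per-frame hold time in seconds, assuming hold time = At / F."""
    return Fraction(temporal_aperture) / frame_rate_hz


# The three fully specified parameter sets quoted in the abstract.
parameter_sets = [
    (300, Fraction(5, 6)),
    (240, Fraction(2, 3)),
    (120, Fraction(1, 3)),
]
hold_times = [hold_time(f, at) for f, at in parameter_sets]
hold_times_ms = [float(t) * 1000 for t in hold_times]
```

Under this assumption all three sets give the same hold time of 1/360 s (about 2.78 ms), and the fourth set (F > 360 Hz, At = 1) gives a hold time below that value.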

15.
We present an integrated, fully GPU‐based processing pipeline to interactively render new views of arbitrary scenes from calibrated but otherwise unstructured input views. In a two‐step procedure, our method first generates for each input view a dense proxy of the scene using a new multi‐view stereo formulation. Each scene proxy consists of a structured cloud of feature aware particles which automatically have their image space footprints aligned to depth discontinuities of the scene geometry and hence effectively handle sharp object boundaries and occlusions. We propose a particle optimization routine combined with a special parameterization of the view space that enables an efficient proxy generation as well as robust and intuitive filter operators for noise and outlier removal. Moreover, our generic proxy generation allows us to flexibly handle scene complexities ranging from small objects up to complete outdoor scenes. The second phase of the algorithm combines these particle clouds in real‐time into a view‐dependent proxy for the desired output view and performs a pixel‐accurate accumulation of the colour contributions from each available input view. This makes it possible to reconstruct even fine‐scale view‐dependent illumination effects. We demonstrate how all these processing stages of the pipeline can be implemented entirely on the GPU with memory efficient, scalable data structures for maximum performance. This allows us to generate new output renderings of high visual quality from input images in real‐time.

16.
The purpose of this study is to assess the relative performance of four different gap-filling approaches across a range of land-surface conditions, including both homogeneous and heterogeneous areas as well as in scenes with abrupt changes in landscape elements. The techniques considered in this study include: (1) Kriging and co-Kriging; (2) geostatistical neighbourhood similar pixel interpolator (GNSPI); (3) a weighted linear regression (WLR) algorithm; and (4) the direct sampling (DS) method. To examine the impact of image availability and the influence of temporal distance on the selection of input training data (i.e. time separating the training data from the gap-filled target image), input images acquired within the same season (temporally close) as well as in different seasons (temporally far) to the target image were examined, as was the case of using information only within the target image itself. Root mean square error (RMSE), mean spectral angle (MSA), and coefficient of determination ($R^2$) were used as the evaluation metrics to assess the prediction results. In addition, the overall accuracy (OA) and kappa coefficient (kappa) were used to assess a land-cover classification based on the gap-filled images. Results show that all of the gap-filling approaches provide satisfactory results for the homogeneous case, with $R^2 > 0.93$ for bands 1 and 2 in all cases and $R^2 > 0.80$ for bands 3 and 4 in most cases. For the heterogeneous example, GNSPI performs the best, with $R^2 > 0.85$ for all tested cases. WLR and GNSPI exhibit equivalent accuracy when a temporally close input image is used (i.e. WLR and GNSPI both have an $R^2$ equal to 0.89 for band 1). For the case of abrupt changes in scene elements or in the absence of ancillary data, the DS approach outperforms the other tested methods.
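The evaluation metrics named in this abstract can be sketched as follows. The exact definitions used in the study are not given here, so this assumes the common forms: RMSE over paired values, the coefficient of determination as 1 - SS_res/SS_tot, and the spectral angle as the angle between two band vectors. All function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def spectral_angle(spectrum_a, spectrum_b):
    """Angle in radians between two per-pixel band vectors."""
    dot = sum(a * b for a, b in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(a * a for a in spectrum_a))
    norm_b = math.sqrt(sum(b * b for b in spectrum_b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)


# Hypothetical reference and gap-filled values for a handful of pixels.
reference = [1.0, 2.0, 3.0, 4.0]
filled = [1.0, 2.0, 3.0, 5.0]
error = rmse(reference, filled)
fit = r_squared(reference, filled)
```

A mean spectral angle would then be the average of `spectral_angle` over all gap pixels, each compared between its reference and predicted band vectors.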

17.
We present a novel algorithm to denoise deep Monte Carlo renderings, in which pixels contain multiple colour values, each for a different range of depths. Deep images are a more expressive representation of the scene than conventional flat images. However, since each depth bin receives only a fraction of the flat pixel's samples, denoising the bins is harder due to the less accurate mean and variance estimates. Furthermore, deep images lack a regular structure in depth: the number of depth bins and their depth ranges vary across pixels. This prevents a straightforward application of patch‐based distance metrics frequently used to improve the robustness of existing denoising filters. We address these constraints by combining a flat image‐space non‐local means filter operating on pixel colours with a deep cross‐bilateral filter operating on auxiliary features (albedo, normal, etc.). Our approach significantly reduces noise in deep images while preserving their structure. To our best knowledge, our algorithm is the first to enable efficient deep‐compositing workflows with denoised Monte Carlo renderings. We demonstrate the performance of our filter on a range of scenes highlighting the challenges and advantages of denoising deep images.

18.
We consider the problem of scheduling two jobs A and B on a set of m uniform parallel machines. Each job is assumed to be independent from the other: job A and job B are made up of $n_A$ and $n_B$ operations, respectively. Each operation is defined by its processing time and possibly additional data such as a due date, a weight, etc., and must be processed on a single machine. All machines are uniform, i.e. each machine has its own processing speed. Notice that we consider the special case of equal-size operations, i.e. all operations have the same processing time. The scheduling of operations of job A must be achieved to minimize a general cost function $F_A$, whereas it is the makespan that must be minimized when scheduling the operations of job B. Problems of this kind are called multiple agent scheduling problems. As we are dealing with two conflicting criteria, we focus on the calculation of strict Pareto optima for the $F_A$ and $C_{\mathrm{max}}^{B}$ criteria. In this paper we consider different min-max and min-sum versions of function $F_A$ and provide special properties as well as polynomial time algorithms.

19.
3D models of objects and scenes are critical to many academic disciplines and industrial applications. Of particular interest is the emerging opportunity for 3D graphics to serve artificial intelligence: computer vision systems can benefit from synthetically-generated training data rendered from virtual 3D scenes, and robots can be trained to navigate in and interact with real-world environments by first acquiring skills in simulated ones. One of the most promising ways to achieve this is by learning and applying generative models of 3D content: computer programs that can synthesize new 3D shapes and scenes. To allow users to edit and manipulate the synthesized 3D content to achieve their goals, the generative model should also be structure-aware: it should express 3D shapes and scenes using abstractions that allow manipulation of their high-level structure. This state-of-the-art report surveys historical work and recent progress on learning structure-aware generative models of 3D shapes and scenes. We present fundamental representations of 3D shape and scene geometry and structures, describe prominent methodologies including probabilistic models, deep generative models, program synthesis, and neural networks for structured data, and cover many recent methods for structure-aware synthesis of 3D shapes and indoor scenes.

20.
In this paper, we focus on the ‘reverse editing’ problem in movie analysis, i.e., the extraction of film takes, the original camera shots that a film editor extracts and arranges to produce a finished scene. The ability to disassemble final scenes and shots into takes is essential for nonlinear browsing, content annotation and the extraction of higher order cinematic constructs from film. A two-part framework for take extraction is proposed. The first part focuses on filtering out action-driven scenes for which take extraction is not useful. The second part focuses on extracting film takes using agglomerative hierarchical clustering methods along with different similarity metrics and group distances. We demonstrate our findings with 10 movies.
