Similar Articles (20 results)
2.
Statistics-based colour constancy algorithms work well as long as there are many colours in a scene; however, they fail when a scene comprises only a few surfaces. In contrast, physics-based algorithms, built on an understanding of physical processes such as highlights and interreflections, can in theory solve for colour constancy even when there are as few as two surfaces in a scene. Unfortunately, physics-based theories rarely work outside the lab. In this paper we show that a combination of physical and statistical knowledge leads to a surprisingly simple and powerful colour constancy algorithm, one that also works well for images of natural scenes. From a physical standpoint we observe that, under the dichromatic model of image formation, the colour signals coming from a single uniformly coloured surface map to a line in chromaticity space. One component of the line is defined by the colour of the illuminant (i.e. specular highlights) and the other by the surface's matte, or Lambertian, reflectance. We then make the statistical observation that the chromaticities of common light sources all closely follow the Planckian locus of black-body radiators. It follows that by intersecting the dichromatic line with the Planckian locus we can estimate the chromaticity of the illumination, and hence solve for colour constancy even when there is a single surface in the scene. When there are many surfaces in a scene, the individual estimates from each surface are averaged together to improve accuracy. In a set of experiments on real images we show that our approach delivers very good colour constancy, with performance significantly better than previous dichromatic algorithms.
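The two steps described above (fitting the dichromatic line to one surface's pixel chromaticities, then intersecting it with the Planckian locus) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the locus is assumed to be a precomputed table of black-body chromaticities, and the "intersection" is approximated by the locus sample nearest the fitted line.

```python
import numpy as np

def dichromatic_illuminant_estimate(chroma, locus):
    """Estimate the illuminant chromaticity from one surface.

    chroma : (N, 2) chromaticities of pixels from a single surface
    locus  : (M, 2) sampled points approximating the Planckian locus

    Fit the dichromatic line to the pixel chromaticities (total least
    squares via SVD) and return the locus sample closest to that line.
    """
    mean = chroma.mean(axis=0)
    centered = chroma - mean
    # principal direction of the pixel cloud = dichromatic line direction
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    d = vt[0]                               # unit direction of the line
    # perpendicular distance of each locus sample to the fitted line
    rel = locus - mean
    perp = rel - np.outer(rel @ d, d)
    dist = np.linalg.norm(perp, axis=1)
    return locus[np.argmin(dist)]           # locus point nearest the line
```

Per-surface estimates obtained this way could then be averaged over all surfaces in the scene, as the abstract describes.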

3.
Statistics of natural image categories
In this paper we study the statistical properties of natural images belonging to different categories and their relevance for scene and object categorization tasks. We discuss how second-order statistics are correlated with image categories, scene scale and objects. We propose that scene categorization could be computed in a feedforward manner, providing top-down and contextual information very early in the visual processing chain. Results show how visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification. We show how simple image statistics can be used to predict the presence and absence of objects in the scene before exploring the image.

4.
Image mosaic construction is about stitching together a number of images of the same scene to construct a single image with a larger field of view. The majority of previous work was rooted in the use of a single image-to-image mapping, termed a planar homography, to represent the imaged scene. However, the mapping is applicable only when the imaged scene is a single planar surface, is very distant from the cameras, or is imaged under a pure rotation of the camera, and that greatly limits the range of applications of the mosaicking methods. This paper presents a novel mosaicking solution for scenes that are polyhedral (thus consisting of multiple surfaces) and that may be pictured at close range to the camera. The solution has two major advantages. First, it requires only a few correspondences over the entire scene, not correspondences over every surface patch, to work. Second, it conquers a seemingly impossible task: warping image data of surfaces that are visible in only one of the input images, which we refer to as singly visible surfaces, to another viewpoint to constitute the mosaic there. We also provide a detailed analysis of what determines whether a singly visible surface can be mosaicked. Experimental results on real image data are presented to illustrate the performance of the method.
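For reference, the planar homography mentioned above maps points between two views with a single 3x3 matrix in homogeneous coordinates. A minimal sketch of applying such a mapping (generic, not the paper's multi-surface method):

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2-D points through a 3x3 planar homography H:
    lift to homogeneous coordinates, multiply, then divide
    by the third (perspective) coordinate."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

For a pure translation H this reduces to adding an offset; a general H also models perspective distortion of a single plane, which is exactly the restriction the paper works around.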

5.
The volume of raw range image data required to represent even a single scene can be extensive; hence direct interpretation of range images can incur a very high computational cost. Range image feature extraction has been identified as a mechanism to produce a more compact scene representation, in particular using features such as edges and surfaces, and hence enables less costly scene interpretation for applications such as object recognition and robot navigation. We present an approach to edge detection in range images that can be used directly with any range data, regardless of whether the data have a regular or irregular spatial distribution. The approach is evaluated with respect to edge-location accuracy, and visual results are also provided.

6.
When we take a picture through transparent glass, the image we obtain is often a linear superposition of two images: the image of the scene beyond the glass plus the image of the scene reflected by the glass. Decomposing the single input image into two images is a massively ill-posed problem: in the absence of additional knowledge about the scene being viewed, there are an infinite number of valid decompositions. In this paper we focus on an easier problem: user-assisted separation, in which the user interactively labels a small number of gradients as belonging to one of the layers. Even given labels on part of the gradients, the problem is still ill-posed and additional prior knowledge is needed. Following recent results on the statistics of natural images, we use a sparsity prior over derivative filters. This sparsity prior is optimized using the iterative reweighted least squares (IRLS) approach. Our results show that a prior derived from the statistics of natural images gives far superior performance compared to a Gaussian prior, and it enables good separations from a modest number of labeled gradients.
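The IRLS optimization of a sparsity prior can be illustrated on the generic problem min_x ||Ax - b||^2 + lam * sum |x_i|^p with p < 1. The sketch below is the standard IRLS loop with assumed parameter values, not the paper's implementation (which applies the prior to derivative-filter outputs of the two layers):

```python
import numpy as np

def irls_sparse(A, b, p=0.8, lam=0.1, iters=30, eps=1e-6):
    """Minimise ||Ax - b||^2 + lam * sum |x_i|^p via IRLS.

    Each step replaces the non-quadratic penalty |x_i|^p by a weighted
    quadratic w_i * x_i^2 with w_i = (p/2) * |x_i|^(p-2), turning the
    problem into a ridge-regression solve; eps floors the weights so
    near-zero entries do not blow up.
    """
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iters):
        w = (p / 2.0) * np.maximum(np.abs(x), eps) ** (p - 2.0)
        x = np.linalg.solve(A.T @ A + lam * np.diag(w), A.T @ b)
    return x
```

The characteristic behaviour is visible even in one dimension per component: entries with small evidence are driven toward zero (the sparsity), while well-supported entries are only mildly shrunk, unlike under a Gaussian (quadratic) prior which shrinks everything proportionally.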

7.
Image analysis in the visual system is well adapted to the statistics of natural scenes. Investigations of natural image statistics have so far mainly focused on static features. The present study is dedicated to the measurement and the analysis of the statistics of optic flow generated on the retina during locomotion through natural environments. Natural locomotion includes bouncing and swaying of the head and eye movement reflexes that stabilize gaze onto interesting objects in the scene while walking. We investigate the dependencies of the local statistics of optic flow on the depth structure of the natural environment and on the ego-motion parameters. To measure these dependencies we estimate the mutual information between correlated data sets. We analyze the results with respect to the variation of the dependencies over the visual field, since the visual motions in the optic flow vary depending on visual field position. We find that retinal flow direction and retinal speed show only minor statistical interdependencies. Retinal speed is statistically tightly connected to the depth structure of the scene. Retinal flow direction is statistically mostly driven by the relation between the direction of gaze and the direction of ego-motion. These dependencies differ at different visual field positions, such that certain areas of the visual field provide more information about ego-motion and other areas provide more information about depth. The statistical properties of natural optic flow may be used to tune the performance of artificial vision systems that imitate human behavior, and may be useful for analyzing properties of natural vision systems.
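The mutual-information estimates between correlated quantities (e.g. retinal speed and scene depth) can be computed with a simple histogram estimator. A minimal sketch, with the bin count as a free parameter (the study's own estimator and binning choices are not reproduced here):

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of the mutual information I(X; Y) in bits:
    I = sum p(x,y) * log2( p(x,y) / (p(x) * p(y)) )."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)     # marginal of X, column vector
    py = pxy.sum(axis=0, keepdims=True)     # marginal of Y, row vector
    nz = pxy > 0                            # skip empty cells (0 * log 0 = 0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```

For identical variables this returns the (binned) entropy; for independent variables it returns approximately zero, which is the contrast the study exploits to quantify how tightly flow statistics are coupled to depth and ego-motion.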

8.
Progress in scene understanding requires reasoning about the rich and diverse visual environments that make up our daily experience. To this end, we propose the Scene Understanding database, a nearly exhaustive collection of scenes categorized at the same level of specificity as human discourse. The database contains 908 distinct scene categories and 131,072 images. Given this data, with both scene and object labels available, we perform an in-depth analysis of co-occurrence statistics and contextual relationships. To better understand this large-scale taxonomy of scene categories, we perform two human experiments: we quantify human scene recognition accuracy, and we measure how typical each image is of its assigned scene category. Next, we perform computational experiments: scene recognition with global image features, indoor versus outdoor classification, and “scene detection,” in which we relax the assumption that one image depicts only one scene category. Finally, we relate human experiments to machine performance and explore the relationship between human and machine recognition errors and the relationship between image “typicality” and machine recognition accuracy.

9.
This paper describes an approach to training a database of building images under the supervision of a user; the database is then applied to recognize buildings in an urban scene. Given a set of training images, we first detect the building facets and calculate their properties, such as area, wall color histogram and a list of local features. All facets of each building surface are used to construct a common model whose initial parameters are selected randomly from one of these facets. The common model is then updated step by step using the spatial relationships of the remaining facets and an SVD-based (singular value decomposition) approximative vector. To verify the correspondence of image pairs, we propose a new cross-ratio-based technique, which is more suitable for building surfaces than several previous approaches. Finally, the trained database is used to recognize a set of test images. The proposed method reduces the database to approximately 0.148 of its original size, while automatically rejecting randomly repeated features from the scene and the natural noise of local features. Furthermore, we show that the problem of multiple buildings is solved by separately analyzing each surface of a building.
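The cross ratio underlying the verification technique above is the classic projective invariant of four collinear points. A minimal sketch using scalar coordinates along the line (illustrative only; the paper's exact construction on building surfaces is not reproduced here):

```python
def cross_ratio(x1, x2, x3, x4):
    """Cross ratio (x1, x2; x3, x4) of four collinear points, given by
    their scalar coordinates along the line:
        ((x3 - x1) * (x4 - x2)) / ((x4 - x1) * (x3 - x2)).
    It is invariant under any projective map x -> (a*x + b) / (c*x + d),
    which is why it survives perspective imaging of a planar facade."""
    return ((x3 - x1) * (x4 - x2)) / ((x4 - x1) * (x3 - x2))
```

Because camera projection restricted to a line is exactly such a projective map, matching cross ratios computed from four corresponding feature points in two images is evidence that the correspondence is geometrically consistent.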

10.
Intrinsic images are a mid-level representation of an image that decomposes the image into reflectance and illumination layers. The reflectance layer captures the color/texture of surfaces in the scene, while the illumination layer captures shading effects caused by interactions between scene illumination and surface geometry. Intrinsic images have a long history in computer vision and, recently, in computer graphics, and have been shown to be a useful representation for tasks ranging from scene understanding and reconstruction to image editing. In this report, we review and evaluate past work on this problem. Specifically, we discuss each work in terms of the priors it imposes on the intrinsic image problem. We introduce a new synthetic ground-truth dataset that we use to evaluate the validity of these priors and the performance of the methods. Finally, we evaluate the performance of the different methods in the context of image-editing applications.

11.
Starting from measured scene luminances, the retinal images of high-dynamic-range (HDR) test targets were calculated. These test displays contain 40 gray squares with a 50% average surround. In order to approximate a natural scene, the surround area was made up of half-white and half-black squares of different sizes. In this display, the spatial-frequency distribution approximates a 1/f function of energy vs. spatial frequency. Images with 2.7 and 5.4 optical density ranges were compared. Although the target luminances are very different, after computing the retinal image according to the CIE scatter glare formula, it was found that the retinal ranges are very similar. Intraocular glare strongly restricts the range of the retinal image. Furthermore, uniform, equiluminant target patches are spatially transformed to different gradients with unequal retinal luminances. The usable dynamic range of the display correlates with the range on the retina. Observers report that appearances of white and black squares are constant and uniform, despite the fact that the retinal stimuli are variable and non-uniform. Human vision uses complex spatial processing to calculate appearance from retinal arrays. Spatial image processing increases apparent contrast with increased white area in the surround. Post-retinal spatial vision counteracts glare.

12.
Extracting objects from range and radiance images
In this paper, we present a pipeline and several key techniques necessary for editing a real scene captured with both cameras and laser range scanners. We develop automatic algorithms to segment the geometry from range images into distinct surfaces, register texture from radiance images with the geometry, and synthesize compact high-quality texture maps. The result is an object-level representation of the scene which can be rendered with modifications to structure via traditional rendering methods. The segmentation algorithm for geometry operates directly on the point cloud from multiple registered 3D range images instead of a reconstructed mesh. It is a top-down algorithm which recursively partitions a point set into two subsets using a pairwise similarity measure. The result is a binary tree with individual surfaces as leaves. Our image registration technique performs a very efficient search to automatically find the camera poses for arbitrary position and orientation relative to the geometry. Thus, we can take photographs from any location without precalibration between the scanner and the camera. The algorithms have been applied to large-scale real data. We demonstrate our ability to edit a captured scene by moving, inserting, and deleting objects.
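The top-down recursive bipartition that produces the binary tree of surfaces can be sketched as follows. This is a toy stand-in: the paper uses a pairwise similarity measure over points and normals, whereas the sketch below simply splits a spread-out point set at the median of its principal axis until leaves are compact:

```python
import numpy as np

def partition(points, idx=None, min_size=2, spread=1.0):
    """Recursively bipartition a 2-D/3-D point set into a binary tree.
    Leaves (lists of point indices) stand in for individual surfaces;
    a set becomes a leaf when it is small or spatially compact."""
    if idx is None:
        idx = np.arange(len(points))
    centered = points[idx] - points[idx].mean(axis=0)
    if len(idx) <= min_size or np.abs(centered).max() < spread:
        return idx.tolist()                       # leaf: one "surface"
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]                       # principal-axis coordinates
    left = idx[proj <= np.median(proj)]
    right = idx[proj > np.median(proj)]
    return [partition(points, left, min_size, spread),
            partition(points, right, min_size, spread)]
```

The recursive structure (split, recurse on both halves, leaves are surfaces) matches the abstract's description; only the splitting criterion is simplified.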

13.
In this paper we present the first large-scale scene attribute database. First, we perform crowdsourced human studies to find a taxonomy of 102 discriminative attributes. We discover attributes related to materials, surface properties, lighting, affordances, and spatial layout. Next, we build the “SUN attribute database” on top of the diverse SUN categorical database. We use crowdsourcing to annotate attributes for 14,340 images from 707 scene categories. We perform numerous experiments to study the interplay between scene attributes and scene categories. We train and evaluate attribute classifiers and then study the feasibility of attributes as an intermediate scene representation for scene classification, zero-shot learning, automatic image captioning, semantic image search, and parsing natural images. We show that when used as features for these tasks, low-dimensional scene attributes can compete with or improve on state-of-the-art performance. The experiments suggest that scene attributes are an effective low-dimensional feature for capturing high-level context and semantics in scenes.

14.
Split Aperture Imaging for High Dynamic Range
Most imaging sensors have limited dynamic range and hence are sensitive to only a part of the illumination range present in a natural scene. The dynamic range can be improved by acquiring multiple images of the same scene under different exposure settings and then combining them. In this paper, we describe a camera design for simultaneously acquiring multiple images. The cross-section of the incoming beam from a scene point is partitioned into as many parts as the required number of images. This is done by splitting the aperture into multiple parts and directing the beam exiting from each in a different direction using an assembly of mirrors. A sensor is placed in the path of each beam, and the exposure of each sensor is controlled either by appropriately setting its exposure parameter or by splitting the incoming beam unevenly. The resulting multiple exposure images are used to construct a high dynamic range image. We have implemented a video-rate camera based on this design, and the results obtained are presented.
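The final step, combining the differently exposed images into one high-dynamic-range estimate, is commonly done by dividing each image by its exposure time and averaging with confidence weights. A generic sketch with a simple "hat" weight (not the authors' calibration pipeline; a linear sensor response is assumed):

```python
import numpy as np

def merge_exposures(images, exposure_times, eps=1e-8):
    """Merge differently exposed images of the same scene into a
    radiance estimate.  Each image is divided by its exposure time and
    the results are averaged with a hat weight that trusts mid-range
    pixels more than under- or over-exposed ones.

    images         : list of float arrays in [0, 1], same shape
    exposure_times : one exposure time per image
    """
    num = np.zeros_like(images[0], dtype=float)
    den = np.zeros_like(images[0], dtype=float)
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)   # hat weight, peak at 0.5
        num += w * img / t
        den += w
    return num / (den + eps)
```

In the split-aperture design the inputs are captured in a single shot, so this merge can run at video rate without any frame alignment.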

15.
Bayesian Defogging
Atmospheric conditions induced by suspended particles, such as fog and haze, severely alter scene appearance. Restoring the true scene appearance from a single observation made in such bad weather remains a challenging task due to the inherent ambiguity in the image formation process. In this paper, we introduce a novel Bayesian probabilistic method that jointly estimates the scene albedo and depth from a single foggy image by fully leveraging their latent statistical structures. Our key idea is to model the image with a factorial Markov random field in which the scene albedo and depth are two statistically independent latent layers, and to estimate them jointly. We show that we may exploit natural image and depth statistics as priors on these hidden layers and estimate the scene albedo and depth with a canonical expectation-maximization algorithm with alternating minimization. We experimentally evaluate the effectiveness of our method on a number of synthetic and real foggy images. The results demonstrate that the method achieves accurate factorization even on scenes that are challenging for past methods, which constrain and estimate only one of the latent variables.
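The ambiguity referred to above comes from the standard fog image-formation model I = J·t + A·(1 − t) with transmission t = exp(−β·depth): many (albedo, depth) pairs explain the same observation. Given the airlight A and a depth estimate, inverting the model is direct; a minimal sketch (the paper estimates albedo and depth jointly, whereas here both A and depth are assumed known):

```python
import numpy as np

def dehaze(I, depth, airlight, beta=1.0, t_min=0.05):
    """Invert the standard fog model
        I = J * t + A * (1 - t),   t = exp(-beta * depth)
    to recover the clear-scene radiance J, given a depth map and the
    airlight colour A."""
    t = np.exp(-beta * depth)
    t = np.maximum(t, t_min)       # avoid amplifying noise at far range
    return (I - airlight * (1.0 - t)) / t
```

Floors on the transmission (here `t_min`) are a common practical guard: at large depths t is tiny, so dividing by it amplifies sensor noise.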

17.
Segmentation and classification of range images
The recognition of objects in three-dimensional space is a desirable capability of a computer vision system. Range images, which directly measure 3-D surface coordinates of a scene, are well suited for this task. In this paper we report a procedure to detect connected planar, convex, and concave surfaces of 3-D objects. This is accomplished in three stages. The first stage segments the range image into "surface patches" by a square error criterion clustering algorithm using surface points and associated surface normals. The second stage classifies these patches as planar, convex, or concave based on a non-parametric statistical test for trend, curvature values, and eigenvalue analysis. In the final stage, boundaries between adjacent surface patches are classified as crease or noncrease edges, and this information is used to merge compatible patches to produce reasonable faces of the object(s). This procedure has been successfully applied to a large number of real and synthetic images, four of which we present in this paper.
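The eigenvalue analysis used in the classification stage can be illustrated by testing a patch for planarity: if the smallest eigenvalue of the 3-D point covariance is negligible relative to the largest, the points have essentially no spread off a plane. A simplified sketch with an assumed tolerance (the paper combines this with a trend test and curvature values):

```python
import numpy as np

def is_planar(points, tol=1e-3):
    """Classify a 3-D surface patch as planar via eigenvalue analysis:
    the eigenvalues of the point covariance measure spread along three
    orthogonal axes, and a (near-)zero smallest eigenvalue means the
    points lie (almost) on a plane.

    points : (N, 3) array of surface points from one patch
    """
    cov = np.cov(points.T)
    ev = np.sort(np.linalg.eigvalsh(cov))   # ascending eigenvalues
    return bool(ev[0] <= tol * ev[2])
```

The eigenvector for the smallest eigenvalue is then the patch normal, which is also useful for the later crease/noncrease edge classification between adjacent patches.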

20.
Reflectance-based object recognition
Neighboring points on a smoothly curved surface have similar surface normals and illumination conditions. Therefore, their brightness values can be used to compute the ratio of their reflectance coefficients. Based on this observation, we develop an algorithm that estimates a reflectance ratio for each region in an image with respect to its background. The algorithm is efficient, as it computes ratios for all image regions in just two raster scans. The region reflectance ratio represents a physical property that is invariant to illumination and imaging parameters. Several experiments are conducted to demonstrate the accuracy and robustness of the ratio invariant. The ratio invariant is used to recognize objects from a single brightness image of a scene. Object models are automatically acquired and represented using a hash table. Recognition and pose estimation algorithms are presented that use ratio estimates of scene regions, as well as their geometric properties, to index the hash table. The result is a hypothesis for the existence of an object in the image. This hypothesis is verified using the ratios and locations of other regions in the scene. This approach to recognition is effective for objects with printed characters and pictures. Recognition experiments are conducted on images with illumination variations, occlusions, and shadows. The paper concludes with a discussion of the simultaneous use of reflectance and geometry for visual perception.
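The per-region reflectance ratio can be sketched as follows. This is an illustrative reconstruction, not the paper's two-raster-scan algorithm: brightness of pixels just inside a region boundary is compared with neighbouring pixels just outside it, and the ratios are pooled with a median.

```python
import numpy as np

def reflectance_ratio(img, mask):
    """Median brightness ratio between pixels just inside a region
    (mask == True) and their 4-neighbours just outside it.  Because
    neighbouring points share illumination and surface normal, this
    approximates the region/background reflectance ratio.
    (Uses np.roll, so the region should not touch the image border.)"""
    samples = []
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rolled_mask = np.roll(mask, shift, axis=(0, 1))
        rolled_img = np.roll(img, shift, axis=(0, 1))
        pairs = rolled_mask & ~mask      # outside pixel, inside neighbour
        samples.append(rolled_img[pairs] / img[pairs])
    return float(np.median(np.concatenate(samples)))
```

Because illumination cancels in the ratio, the value stays stable under lighting changes, which is what makes it usable as a hash-table index for recognition.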
