Similar Literature (20 results found)
1.
Model-based recognition of 3D objects from single images   (cited 1 time: 0 self-citations, 1 by others)
In this work, we treat major problems of object recognition which have received relatively little attention lately. Among them are the loss of depth information in the projection from a 3D object to a single 2D image, and the complexity of finding feature correspondences between images. We use geometric invariants to reduce the complexity of these problems. There are no geometric invariants of a projection from 3D to 2D. However, given certain modeling assumptions about the 3D object, such invariants can be found. The modeling assumptions can be either a particular model or a generic assumption about a class of models. Here, we use such assumptions for single-view recognition. We find algebraic relations between the invariants of a 3D model and those of its 2D image under general projective projection. These relations can be described geometrically as invariant models in a 3D invariant space, illuminated by invariant “light rays,” and projected onto an invariant version of the given image. We apply the method to real images.
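For readers unfamiliar with the machinery, the elementary example of such a quantity (my illustration, not the paper's construction) is the cross-ratio of four collinear points, which is preserved by any projective projection of their line:

```latex
% Cross-ratio of four collinear points with signed positions t_1,\dots,t_4 on their line;
% it is unchanged under any projective (pinhole) projection of that line.
\mathrm{Cr}(p_1,p_2;p_3,p_4) \;=\; \frac{(t_3 - t_1)\,(t_4 - t_2)}{(t_3 - t_2)\,(t_4 - t_1)}
```

The algebraic relations in the paper constrain invariants of this general kind computed on the 3D model and on its 2D image.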

2.
In this paper, we present an image retrieval technique for specific objects based on salient regions. The salient regions we select are invariant to geometric and photometric variations. Those salient regions are detected based on low-level features, and need to be classified into different types before they can be applied to further vision tasks. We first classify the selected regions into four types, including blobs, edges and lines, textures, and texture boundaries, by using the correlations with the neighbouring regions. Then, some specific region types are chosen for further object retrieval applications. We observe that regions selected from images of the same object are more similar to each other than regions selected from images of different objects. Correlation is used as the similarity measure between regions selected from different images. Two images are considered to contain the same object if some regions selected from the first image are highly correlated to some regions selected from the second image. Two data sets are employed for the experiments: the first data set contains human face images of a number of different people and is used for testing the retrieval algorithm on distinguishing specific objects of the same category; the second data set contains images of different objects and is used for testing the retrieval algorithm on distinguishing objects of different categories. The results show that our method is very effective on specific object retrieval.
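A minimal sketch of the correlation-based matching step described above (my own illustration; the patch normalization and the 0.9 threshold are assumptions, not values from the paper):

```python
import numpy as np

def region_similarity(patch_a: np.ndarray, patch_b: np.ndarray) -> float:
    """Normalized cross-correlation between two equally sized gray-level patches."""
    a = patch_a.astype(float).ravel() - patch_a.mean()
    b = patch_b.astype(float).ravel() - patch_b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def same_object(regions_a, regions_b, threshold=0.9) -> bool:
    """Declare a match if any region from image A correlates highly with a region from image B."""
    return any(region_similarity(ra, rb) >= threshold
               for ra in regions_a for rb in regions_b)
```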

3.
Shape retrieval using invariant moments and boundary direction   (cited 10 times: 1 self-citation, 10 by others)
Shape-based image retrieval has long been a difficult problem in content-based image retrieval, and methods that describe shape by perimeter, area, corner ratio and the like do not achieve satisfactory retrieval results. This paper proposes a new retrieval method for image shape. First, the image is smoothed with the Canny operator and a boundary-direction histogram feature is extracted. Second, invariant moments are used to describe the regional features of the shape; these moment features are unaffected by scaling, translation and rotation of the image. Finally, to overcome the drawback that invariant moments attend only to the object region and ignore the image boundary, the invariant-moment and boundary-direction features are combined, which yields better retrieval results. Experiments on shape retrieval of medical images are reported, and the results and conclusions are given.
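A hedged Python/OpenCV sketch of this kind of combined descriptor (the Canny thresholds, bin count and simple concatenation are my assumptions for illustration, not the paper's exact settings):

```python
import cv2
import numpy as np

def shape_descriptor(gray: np.ndarray, n_bins: int = 36) -> np.ndarray:
    """Boundary-direction histogram combined with the 7 Hu invariant moments."""
    edges = cv2.Canny(gray, 100, 200)                 # smoothing + edge extraction (thresholds assumed)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    angles = np.arctan2(gy, gx)[edges > 0]            # gradient direction at boundary pixels
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    hist = hist / max(hist.sum(), 1)                  # normalized boundary-direction histogram
    hu = cv2.HuMoments(cv2.moments(gray)).ravel()     # region features, invariant to scale/translation/rotation
    return np.concatenate([hist, hu])                 # combined feature used for retrieval by distance
```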

4.
A Multibody Factorization Method for Independently Moving Objects   (cited 6 times: 0 self-citations, 6 by others)
The structure-from-motion problem has been extensively studied in the field of computer vision. Yet, the bulk of the existing work assumes that the scene contains only a single moving object. The more realistic case where an unknown number of objects move in the scene has received little attention, especially in its theoretical treatment. In this paper we present a new method for separating and recovering the motion and shape of multiple independently moving objects in a sequence of images. The method does not require prior knowledge of the number of objects, nor does it depend on any grouping of features into an object at the image level. For this purpose, we introduce a mathematical construct of object shapes, called the shape interaction matrix, which is invariant to both the object motions and the selection of coordinate systems. This invariant structure is computable solely from the observed trajectories of image features, without grouping them into individual objects. Once the matrix is computed, it allows features to be segmented into objects by transforming it into a canonical form, as well as the shape and motion of each object to be recovered. The theory works under a broad set of linear projection models (scaled orthography, paraperspective and affine), but excludes projective cameras.
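A compact numpy sketch of the shape interaction matrix idea (the rank parameter and the reading of near-zero entries are assumptions for illustration; under an affine camera each rigid object typically contributes up to 4 to the rank):

```python
import numpy as np

def shape_interaction_matrix(tracks: np.ndarray, rank: int) -> np.ndarray:
    """tracks: (2*F, P) measurement matrix of P feature trajectories over F frames
    (x rows stacked on y rows). Returns the P x P shape interaction matrix; entries
    near zero suggest the two features belong to independently moving objects."""
    _, _, vt = np.linalg.svd(tracks, full_matrices=False)
    v_r = vt[:rank].T            # leading right singular vectors span the combined shape space
    return v_r @ v_r.T           # invariant to each object's motion and to the coordinate frames
```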

5.
This paper develops a theory of frequency domain invariants in computer vision. We derive novel identities using spherical harmonics, which are the angular frequency domain analog to common spatial domain invariants such as reflectance ratios. These invariants are derived from the spherical harmonic convolution framework for reflection from a curved surface. Our identities apply in a number of canonical cases, including single and multiple images of objects under the same and different lighting conditions. One important case we consider is two different glossy objects in two different lighting environments. For this case, we derive a novel identity, independent of the specific lighting configurations or BRDFs, that allows us to directly estimate the fourth image if the other three are available. The identity can also be used as an invariant to detect tampering in the images. While this paper is primarily theoretical, it has the potential to lay the mathematical foundations for two important practical applications. First, we can develop more general algorithms for inverse rendering problems, which can directly relight and change material properties by transferring the BRDF or lighting from another object or illumination. Second, we can check the consistency of an image, to detect tampering or image splicing.
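A sketch of the flavour of such an identity in the spherical harmonic convolution setting (the notation is mine, not the paper's exact statement: B^{ij}_{lm} are harmonic coefficients of the image of object i under lighting j, \rho^i_l the object's BRDF filter, L^j_{lm} the lighting coefficients):

```latex
B^{ij}_{lm} = \rho^{i}_{l}\, L^{j}_{lm}
\quad\Longrightarrow\quad
B^{11}_{lm}\, B^{22}_{lm} \;=\; B^{12}_{lm}\, B^{21}_{lm} \qquad \text{for every } (l,m),
```

so the coefficients of the fourth object/lighting combination follow from the other three without knowing the BRDFs or the lighting environments.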

6.
NeTra: A toolbox for navigating large image databases   (cited 17 times: 0 self-citations, 17 by others)
We present here an implementation of NeTra, a prototype image retrieval system that uses color, texture, shape and spatial location information in segmented image regions to search and retrieve similar regions from the database. A distinguishing aspect of this system is its incorporation of a robust automated image segmentation algorithm that allows object- or region-based search. Image segmentation significantly improves the quality of image retrieval when images contain multiple complex objects. Images are segmented into homogeneous regions at the time of ingest into the database, and image attributes that represent each of these regions are computed. In addition to image segmentation, other important components of the system include an efficient color representation, and indexing of color, texture, and shape features for fast search and retrieval. This representation allows the user to compose interesting queries such as “retrieve all images that contain regions that have the color of object A, texture of object B, shape of object C, and lie in the upper part of the image”, where the individual objects could be regions belonging to different images. A Java-based web implementation of NeTra is available at http://vivaldi.ece.ucsb.edu/Netra.

7.
The main task of digital image processing is to recognize properties of real objects based on their digital images. These images are obtained by some sampling device, like a CCD camera, and represented as finite sets of points that are assigned some value in a gray-level or color scale. Based on technical properties of sampling devices, these points are usually assumed to form a square grid and are modeled as finite subsets of Z^2. Therefore, a fundamental question in digital image processing is which features in the digital image correspond, under certain conditions, to properties of the underlying objects. In practical applications this question is mostly answered by visually judging the obtained digital images. In this paper we present a comprehensive answer to this question with respect to topological properties. In particular, we derive conditions relating properties of real objects to the grid size of the sampling device which guarantee that a real object and its digital image are topologically equivalent. These conditions also imply that two digital images of a given object are topologically equivalent. This means, for example, that shifting or rotating an object or the camera cannot lead to topologically different images, i.e., topological properties of obtained digital images are invariant under shifting and rotation.

8.
Registration error, noise, and illumination changes are the main factors that degrade change-detection performance; using image structure information for change detection can effectively overcome them. This paper proposes a change-detection method that describes image structure with differential invariants. Differential invariants are invariant to translation and rotation and are fairly robust to noise. First, feature descriptors are constructed from differential invariants; then, within a search window, the Mahalanobis distances between descriptors are computed, and the minimum distance is compared with a threshold to decide whether a change has occurred. Experiments show that the proposed algorithm is robust to both noise and registration error.
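A minimal Python sketch of the windowed decision step (the descriptor construction, the covariance estimate and the threshold are assumptions; only the minimum-distance-versus-threshold logic follows the abstract):

```python
import numpy as np

def changed(desc_ref: np.ndarray, window_descs: np.ndarray,
            cov: np.ndarray, tau: float) -> bool:
    """desc_ref: differential-invariant descriptor at a pixel of image 1.
    window_descs: (N, d) descriptors from the search window at the same location in image 2.
    Flags a change when even the closest descriptor in the window exceeds the threshold tau."""
    cov_inv = np.linalg.inv(cov)
    diffs = window_descs - desc_ref
    d2 = np.einsum('nd,dk,nk->n', diffs, cov_inv, diffs)   # squared Mahalanobis distances
    return float(np.sqrt(d2.min())) > tau
```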

9.
10.
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of objects, our goal is to simultaneously learn the names and appearances of the objects. Only a small fraction of local features within any given image are associated with a particular caption word, and captions may contain irrelevant words not associated with any image object. We propose a novel algorithm that uses the repetition of feature neighborhoods across training images and a measure of correspondence with caption words to learn meaningful feature configurations (representing named objects). We also introduce a graph-based appearance model that captures some of the structure of an object by encoding the spatial relationships among the local visual features. In an iterative procedure, we use language (the words) to drive a perceptual grouping process that assembles an appearance model for a named object. Results of applying our method to three data sets in a variety of conditions demonstrate that, from complex, cluttered, real-world scenes with noisy captions, we can learn both the names and appearances of objects, resulting in a set of models invariant to translation, scale, orientation, occlusion, and minor changes in viewpoint or articulation. These named models, in turn, are used to automatically annotate new, uncaptioned images, thereby facilitating keyword-based image retrieval.

11.
We present an active object recognition strategy which combines the use of an attention mechanism for focusing the search for a 3D object in a 2D image, with a viewpoint control strategy for disambiguating recovered object features. The attention mechanism consists of a probabilistic search through a hierarchy of predicted feature observations, taking objects into a set of regions classified according to the shapes of their bounding contours. We motivate the use of image regions as a focus-feature and compare their uncertainty in inferring objects with the uncertainty of more commonly used features such as lines or corners. If the features recovered during the attention phase do not provide a unique mapping to the 3D object being searched, the probabilistic feature hierarchy can be used to guide the camera to a new viewpoint from where the object can be disambiguated. The power of the underlying representation is its ability to unify these object recognition behaviors within a single framework. We present the approach in detail and evaluate its performance in the context of a project providing robotic aids for the disabled.

12.
Tom, Jan. Pattern Recognition, 2003, 36(12): 2895-2907
The paper is devoted to the recognition of objects and patterns deformed by imaging geometry as well as by unknown blurring. We introduce a new class of features invariant simultaneously to blurring with a centrosymmetric PSF and to affine transformation. As we prove in the paper, they can be constructed by combining affine moment invariants and blur invariants derived earlier. The combined invariants allow objects in the degraded scene to be recognized without any restoration.
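As a flavour of the blur-invariance ingredient (a standard fact stated in my own notation, not the paper's combined construction): for a normalized, centrosymmetric PSF h all odd-order central moments of h vanish, so the third-order central moments survive the blur unchanged:

```latex
g = f * h, \qquad \mu^{(h)}_{pq} = 0 \ \text{for } p+q \ \text{odd}
\;\;\Longrightarrow\;\;
\mu^{(g)}_{pq} = \mu^{(f)}_{pq} \quad \text{for } p+q = 3 .
```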

13.
A central task of computer vision is to automatically recognize objects in real-world scenes. The parameters defining image and object spaces can vary due to lighting conditions, camera calibration and viewing position. It is therefore desirable to look for geometric properties of the object which remain invariant under such changes in the observation parameters. The study of such geometric invariance is a field of active research. This paper presents the theory and computation of projective invariants formed from points and lines using the geometric algebra framework. This work shows that geometric algebra is a very elegant language for expressing projective invariants using n views. The paper compares projective invariants involving two and three cameras using simulated and real images. Illustrations of the application of such projective invariants in visually guided grasping, camera self-localization and reconstruction of shape and motion complement the experimental part.

14.
The use of traditional moment invariants in object recognition is limited to simple geometric transforms, such as rotation, scaling and affine transformation of the image. This paper introduces so-called implicit moment invariants. Implicit invariants measure the similarity between two images factorized by admissible image deformations. For many types of image deformations traditional invariants do not exist but implicit invariants can be used as features for object recognition. In the paper we present implicit moment invariants with respect to polynomial transform of spatial coordinates, describe their stable and efficient implementation by means of orthogonal moments, and demonstrate their performance in artificial as well as real experiments.

15.
Edge and corner detection by photometric quasi-invariants   (cited 4 times: 0 self-citations, 4 by others)
Feature detection is used in many computer vision applications such as image segmentation, object recognition, and image retrieval. For these applications, robustness with respect to shadows, shading, and specularities is desired. Features based on derivatives of photometric invariants, which we call full invariants, provide the desired robustness. However, because computation of photometric invariants involves nonlinear transformations, these features are unstable and, therefore, impractical for many applications. We propose a new class of derivatives which we refer to as quasi-invariants. These quasi-invariants are derivatives which share with full photometric invariants the property that they are insensitive to certain photometric edges, such as shadow or specular edges, but without the inherent instabilities of full photometric invariants. Experiments show that the quasi-invariant derivatives are less sensitive to noise and introduce less edge displacement than full invariant derivatives. Moreover, quasi-invariants significantly outperform the full invariant derivatives in terms of discriminative power.

16.
The “Six-line Problem” arises in computer vision and in the automated analysis of images. Given a three-dimensional (3D) object, one extracts geometric features (for example, six lines) and then, via techniques from algebraic geometry and geometric invariant theory, produces a set of 3D invariants that represents that feature set. Suppose that later an object is encountered in an image (for example, a photograph taken by a camera modeled by standard perspective projection, i.e. a “pinhole” camera), and suppose further that six lines are extracted from the object appearing in the image. The problem is to decide whether the object in the image is the original 3D object. To answer this question, two-dimensional (2D) invariants are computed from the lines in the image. One can show that conditions for geometric consistency between the 3D object features and the 2D image features can be expressed as a set of polynomial equations in the combined set of two- and three-dimensional invariants. The object in the image is geometrically consistent with the original object if the set of equations has a solution. One well-known method to attack such sets of equations is with resultants. Unfortunately, the size and complexity of this problem made it appear overwhelming until recently. This paper describes a solution obtained using our own variant of the Cayley–Dixon–Kapur–Saxena–Yang resultant. There is reason to believe that the resultant technique we employ here may solve other complex polynomial systems.

17.
In this paper, we derive new geometric invariants for structured 3D points and lines from a single image under projective transform, and we propose a novel model-based 3D object recognition algorithm using them. Based on the matrix representation of the transformation between space features (points and lines) and the corresponding projected image features, new geometric invariants are derived via the determinant-ratio technique. First, an invariant for six points on two adjacent planes is derived, which is shown to be equivalent to Zhu's result [1], but in a simpler formulation. Then, two new geometric invariants for structured lines are investigated: one for five lines on two adjacent planes and the other for six lines on four planes. Using the derived invariants, a novel 3D object recognition algorithm is developed, in which a hashing technique with thresholds and multiple invariants per model is employed to overcome the over-invariant and false-alarm problems. Experiments on real images show that the derived invariants remain stable even in a noisy environment, and the proposed 3D object recognition algorithm is quite robust and accurate.
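For flavour, the determinant-ratio construction in its simplest classical form, for five coplanar points in homogeneous coordinates (this is the textbook example, not the paper's six-point/two-plane invariant):

```latex
I(p_1,\dots,p_5) \;=\; \frac{\det[p_4\;p_3\;p_1]\,\det[p_5\;p_2\;p_1]}{\det[p_4\;p_2\;p_1]\,\det[p_5\;p_3\;p_1]}
```

Each point index appears the same number of times in the numerator and the denominator, so the unknown projective scale factors and the determinant of the homography cancel, leaving a quantity that can be computed identically from the model and from its image.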

18.
19.
Image feature recognition based on geometric invariants   (cited 6 times: 0 self-citations, 6 by others)
Feature recognition in images is an important problem in image processing and recognition, and geometric invariants have been widely used as feature values in many fields. In practice, affine and projective invariants, which remain unchanged under affine and projective transformations, are commonly adopted as feature values. Exploiting the characteristics of the specific images at hand, this paper uses four classes of affine and projective invariants to build a feature-value space and identifies the feature points in an image following a four-step recognition strategy, thereby accomplishing the recognition task. Experiments show that these four classes of invariants can recognize the features in real images fairly well.

20.
Orthogonal variant moments features in image analysis   (cited 1 time: 0 self-citations, 1 by others)
Moments are statistical measures used to obtain relevant information about an object under study (e.g., signals, images or waveforms), for example to describe the shape of an object to be recognized by a pattern recognition system. Invariant moments (e.g., the Hu invariant set) are a special kind of these measures, designed to remain constant under transformations such as object rotation, scaling, translation, or image illumination changes, in order to improve the reliability of a pattern recognition system. The classical moment-invariant methodology is based on determining a set of transformations (or perturbations) under which the system must remain unaltered. Although very well established, the classical theory has mainly been used for processing single static images (i.e., snapshots); the use of image moments to analyze image sequences or video from a dynamic point of view has not been sufficiently explored and is currently a subject of much interest. In this paper, we propose the use of variant moments as an alternative to the classical approach. This approach differs clearly from the classical moment-invariant approach and, in specific domains, has important advantages. The difference between the classical invariant and the proposed variant approach is mainly (but not solely) conceptual: invariants are sensitive to any image change or perturbation for which they are not invariant, so any unexpected perturbation affects the measurement (i.e., it is subject to uncertainty); a variant moment, on the contrary, is designed to be sensitive to a specific perturbation, i.e., to measure a transformation rather than be invariant to it, so if that perturbation occurs it is measured, and unexpected disturbances do not affect the objective of the measurement, thus confronting uncertainty. Furthermore, since the proposed variant moments are orthogonal (i.e., uncorrelated), the total inherent uncertainty can be considerably reduced. The presented approach has been applied to interesting open problems in computer vision such as shape analysis, image segmentation, tracking object deformations and object motion tracking, obtaining encouraging results and demonstrating the effectiveness of the proposed approach.
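A minimal Python illustration of the "variant" idea (my own sketch, not the paper's moment set): plain first-order moments are not translation invariant, so their frame-to-frame change directly measures the motion of a tracked object:

```python
import numpy as np

def centroid(frame: np.ndarray) -> np.ndarray:
    """First-order raw moments (centroid) of a gray-level frame; a variant quantity,
    since it changes whenever the imaged object moves."""
    m00 = float(frame.sum())
    ys, xs = np.mgrid[:frame.shape[0], :frame.shape[1]]
    return np.array([(xs * frame).sum(), (ys * frame).sum()]) / max(m00, 1e-9)

def translation_between(prev_frame: np.ndarray, next_frame: np.ndarray) -> np.ndarray:
    """The change of the variant moment between frames measures the object's translation."""
    return centroid(next_frame) - centroid(prev_frame)
```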
