Similar Literature
 20 similar documents were retrieved (search time: 15 ms).
1.
Traditional approaches to three dimensional object recognition exploit the relationship between three dimensional object geometry and two dimensional image geometry. The capability of object recognition systems can be improved by also incorporating information about the color of object surfaces. Using physical models for image formation, the authors derive invariants of local color pixel distributions that are independent of viewpoint and the configuration, intensity, and spectral content of the scene illumination. These invariants capture information about the distribution of spectral reflectance which is intrinsic to a surface and thereby provide substantial discriminatory power for identifying a wide range of surfaces including many textured surfaces. These invariants can be computed efficiently from color image regions without requiring any form of segmentation. The authors have implemented an object recognition system that indexes into a database of models using the invariants and that uses associated geometric information for hypothesis verification and pose estimation. The approach to recognition is based on the computation of local invariants and is therefore relatively insensitive to occlusion. The authors present several examples demonstrating the system's ability to recognize model objects in cluttered scenes independent of object configuration and scene illumination. The discriminatory power of the invariants has been demonstrated by the system's ability to process a large set of regions over complex scenes without generating false hypotheses.
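The exact invariants in this paper are derived from physical image-formation models; as a simplified, hypothetical illustration of the idea, the sketch below (NumPy) computes two local color-region statistics, the per-channel coefficient of variation and the between-channel correlations, both of which are unchanged when each channel is rescaled by a constant gain, a crude model of a change in illumination intensity or spectral content. It is not the authors' derivation.

```python
import numpy as np

def gain_invariant_stats(region):
    """region: (N, 3) array of RGB pixels from a local image region.
    Returns statistics invariant to independent per-channel gains."""
    mean = region.mean(axis=0)
    std = region.std(axis=0)
    cv = std / (mean + 1e-8)                  # coefficient of variation per channel
    corr = np.corrcoef(region, rowvar=False)  # between-channel correlations
    return np.concatenate([cv, corr[np.triu_indices(3, k=1)]])

# Scaling each channel by a different gain leaves the statistics unchanged.
rng = np.random.default_rng(0)
pixels = rng.uniform(0.1, 1.0, size=(500, 3))
gains = np.array([0.6, 1.3, 2.0])             # simulated illumination change
a = gain_invariant_stats(pixels)
b = gain_invariant_stats(pixels * gains)
print(np.allclose(a, b))                      # True
```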

2.
Objective: With the rapid development of modern communication and sensing technology, multimedia data on the Internet is growing rapidly, which brings convenience to daily life but also poses challenges for making effective use of the information. To fully exploit the rich information in web images, and considering the diversity of image types on the Internet and the fact that different types require different processing, this paper designs a hierarchical fast classification algorithm for the two main image types on today's Internet: natural scene images and synthetic images. Method: The algorithm has two layers. The first layer extracts global features from the differences the two classes exhibit in color, saturation, and edge contrast, and performs an initial classification with a support vector machine (SVM); images classified with low confidence in the first layer are passed to the second layer. In the second layer, the system encodes the texture of different types of local regions with a bag-of-words model to obtain local features and completes the final classification with a second SVM. For this hierarchical framework, two strategies for fusing the two classifiers are proposed: fusion of classifier outputs and fusion of global and local features. To test the practicality of the algorithm, a database containing more than 30,000 images was collected and released. Results: The proposed global and local features are highly discriminative for the two image classes. On a single-core Intel Xeon(R) (2.50 GHz) CPU, classification accuracy reaches 98.26% at over 40 frames/s. Comparative experiments with convolutional-neural-network-based methods show that the proposed algorithm matches the performance of shallow networks while consuming fewer computational resources. Conclusion: Based on the differences between natural scene images and synthetic images in color, saturation, edge contrast, and local texture, fast and effective global and local features are designed and combined with a hierarchical classification framework to classify the two image types quickly. The algorithm balances accuracy and speed and can be used in practical projects with strict real-time requirements, such as image retrieval and data mining.
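As a rough sketch of the two-layer decision flow described above (not the paper's actual features or thresholds), the snippet below trains two scikit-learn SVMs: a fast classifier on hypothetical global features and a second classifier on hypothetical bag-of-words features that is consulted only when the first classifier's confidence falls below a threshold. `global_feats`, `bow_feats`, and the 0.9 threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def train_two_stage(global_feats, bow_feats, labels):
    """Train both stages; each SVM exposes class probabilities."""
    clf1 = SVC(probability=True).fit(global_feats, labels)
    clf2 = SVC(probability=True).fit(bow_feats, labels)
    return clf1, clf2

def predict_two_stage(clf1, clf2, global_feats, bow_feats, thresh=0.9):
    proba1 = clf1.predict_proba(global_feats)
    pred = clf1.classes_[proba1.argmax(axis=1)]
    low_conf = proba1.max(axis=1) < thresh           # send uncertain images to stage 2
    if low_conf.any():
        pred[low_conf] = clf2.predict(bow_feats[low_conf])
    return pred

# Toy usage with random features standing in for the real descriptors.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
g = rng.normal(size=(200, 16)) + y[:, None]          # "global" features
b = rng.normal(size=(200, 64)) + y[:, None]          # "bag-of-words" features
clf1, clf2 = train_two_stage(g, b, y)
print((predict_two_stage(clf1, clf2, g, b) == y).mean())
```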

3.
Color images captured under various environments are often not ready to deliver the desired quality due to adverse effects caused by uncontrollable illumination settings. In particular, when the illuminate color is not known a priori, the colors of the objects may not be faithfully reproduced and thus impose difficulties in subsequent image processing operations. Color correction thus becomes a very important pre-processing procedure where the goal is to produce an image as if it is captured under uniform chromatic illumination. On the other hand, conventional color correction algorithms using linear gain adjustments focus only on color manipulations and may not convey the maximum information contained in the image. This challenge can be posed as a multi-objective optimization problem that simultaneously corrects the undesirable effect of illumination color cast while recovering the information conveyed from the scene. A variation of the particle swarm optimization algorithm is further developed in the multi-objective optimization perspective that results in a solution achieving a desirable color balance and an adequate delivery of information. Experiments are conducted using a collection of color images of natural objects that were captured under different lighting conditions. Results have shown that the proposed method is capable of delivering images with higher quality.
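The paper formulates correction as a multi-objective optimization; the sketch below collapses the two objectives into a single weighted sum purely for illustration, and uses a bare-bones particle swarm (NumPy) to search for per-channel gains. The gray-world cast measure, the variance-based information proxy, and all PSO constants are assumptions, not the paper's formulation.

```python
import numpy as np

def objective(gains, img, w=0.5):
    """Lower is better: channel-mean imbalance plus loss of contrast."""
    corrected = np.clip(img * gains, 0.0, 1.0)
    means = corrected.reshape(-1, 3).mean(axis=0)
    cast = np.sum((means - means.mean()) ** 2)   # deviation from gray-world balance
    info = -corrected.var()                      # crude stand-in for information content
    return w * cast + (1.0 - w) * info

def pso_color_gains(img, n_particles=30, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.5, 2.0, size=(n_particles, 3))   # candidate per-channel gains
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p, img) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.1, 3.0)
        vals = np.array([objective(p, img) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

img = np.random.default_rng(2).uniform(size=(64, 64, 3)) * np.array([1.0, 0.7, 0.5])
print(pso_color_gains(img))   # gains that roughly undo the simulated color cast
```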

4.
Visual learning and recognition of 3-d objects from appearance
The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image to eigenspace. The object is recognized based on the manifold it lies on. The exact position of the projection on the manifold determines the object's pose in the image. A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. Sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with less than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper is concluded with a discussion on various issues related to the proposed learning and recognition methodology.
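A minimal sketch of the appearance-manifold idea using scikit-learn's PCA (the original work builds the eigenspace and a pose-parametrized manifold explicitly; here a nearest-neighbor lookup in the subspace stands in for manifold search, and the image sizes, dimensions, and toy data are illustrative assumptions).

```python
import numpy as np
from sklearn.decomposition import PCA

def build_eigenspace(images, n_dims=20):
    """images: (n_samples, h*w) flattened, brightness-normalized training views."""
    pca = PCA(n_components=n_dims)
    coords = pca.fit_transform(images)   # each training view becomes a point in eigenspace
    return pca, coords

def recognize(pca, coords, labels, poses, query):
    """Project the query view and return the label/pose of the closest training point."""
    q = pca.transform(query.reshape(1, -1))
    idx = np.argmin(np.linalg.norm(coords - q, axis=1))
    return labels[idx], poses[idx]

# Toy usage: two "objects", each sampled at several synthetic poses.
rng = np.random.default_rng(3)
views = rng.normal(size=(40, 32 * 32))
labels = np.repeat([0, 1], 20)
poses = np.tile(np.linspace(0, 360, 20, endpoint=False), 2)
pca, coords = build_eigenspace(views, n_dims=10)
print(recognize(pca, coords, labels, poses, views[5]))   # recovers object 0 and the pose of view 5
```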

5.
6.
Multi-exposure image fusion merges a sequence of images of the same scene taken at different exposure levels directly into a single high-quality image containing more scene detail. To address the poor local contrast and color distortion of existing algorithms, a new multi-exposure image fusion algorithm based on the Retinex model is proposed. First, based on the Retinex model, an illumination estimation algorithm splits the exposure sequence into an incident-illumination component sequence and a reflectance component sequence, and the two sequences are then fused with different methods. For the illumination components, the goal is to preserve the global brightness variation of the scene while suppressing the influence of over-exposed and under-exposed regions; for the reflectance components, a well-exposedness measure is used to better preserve the color and detail of the scene. The proposed algorithm is analyzed both subjectively and objectively. Experimental results show that, compared with traditional image-domain fusion algorithms, the proposed algorithm improves structural similarity (SSIM) by 1.7% on average and handles image color and local detail better.
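The decomposition step can be illustrated with a single-scale Retinex sketch (NumPy/SciPy): the illumination component is estimated with a Gaussian low-pass filter and the reflectance is the residual in the log domain. The filter scale and the log-domain formulation are common conventions, not necessarily the illumination-estimation algorithm used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def retinex_decompose(gray, sigma=15.0, eps=1e-6):
    """Split a [0, 1] grayscale exposure into illumination and reflectance components."""
    log_img = np.log(gray + eps)
    log_illum = np.log(gaussian_filter(gray, sigma) + eps)   # smooth estimate of incident light
    log_refl = log_img - log_illum                           # reflectance = image / illumination
    return np.exp(log_illum), np.exp(log_refl)

# The exposure sequence would then be fused per component: illumination maps with a weighting
# that suppresses over-/under-exposed regions, reflectance maps with a well-exposedness measure.
exposures = [np.clip(np.random.default_rng(4).random((64, 64)) * s, 0, 1) for s in (0.4, 1.0, 1.6)]
components = [retinex_decompose(e) for e in exposures]
print(components[0][0].shape, components[0][1].shape)
```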

7.
Intrinsic images are a mid-level representation of an image that decompose the image into reflectance and illumination layers. The reflectance layer captures the color/texture of surfaces in the scene, while the illumination layer captures shading effects caused by interactions between scene illumination and surface geometry. Intrinsic images have a long history in computer vision and recently in computer graphics, and have been shown to be a useful representation for tasks ranging from scene understanding and reconstruction to image editing. In this report, we review and evaluate past work on this problem. Specifically, we discuss each work in terms of the priors they impose on the intrinsic image problem. We introduce a new synthetic ground-truth dataset that we use to evaluate the validity of these priors and the performance of the methods. Finally, we evaluate the performance of the different methods in the context of image-editing applications.

8.
Retrieving images from large and varied collections using image content as a key is a challenging and important problem. We present a new image representation that provides a transformation from the raw pixel data to a small set of image regions that are coherent in color and texture. This "Blobworld" representation is created by clustering pixels in a joint color-texture-position feature space. The segmentation algorithm is fully automatic and has been run on a collection of 10,000 natural images. We describe a system that uses the Blobworld representation to retrieve images from this collection. An important aspect of the system is that the user is allowed to view the internal representation of the submitted image and the query results. Similar systems do not offer the user this view into the workings of the system; consequently, query results from these systems can be inexplicable, despite the availability of knobs for adjusting the similarity metrics. By finding image regions that roughly correspond to objects, we allow querying at the level of objects rather than global image properties. We present results indicating that querying for images using Blobworld produces higher precision than does querying using color and texture histograms of the entire image in cases where the image contains distinctive objects.
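A rough sketch of the grouping step, assuming per-pixel color and position features (the actual Blobworld system also uses texture features and fits a Gaussian mixture with EM; plain k-means on a simplified feature vector is used here only to show clustering in the joint feature space).

```python
import numpy as np
from sklearn.cluster import KMeans

def blobworld_like_segmentation(img, n_regions=5):
    """img: (h, w, 3) float RGB in [0, 1]. Cluster pixels in a joint color-position space."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([
        img.reshape(-1, 3),               # color
        ys.ravel() / h, xs.ravel() / w,   # normalized position keeps regions spatially coherent
    ])
    labels = KMeans(n_clusters=n_regions, n_init=10, random_state=0).fit_predict(feats)
    return labels.reshape(h, w)

img = np.random.default_rng(5).random((48, 64, 3))
regions = blobworld_like_segmentation(img)
print(np.unique(regions))   # region ids 0..n_regions-1, one per "blob"
```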

9.
The observed image texture for a rough surface has a complex dependence on the illumination and viewing angles due to effects such as foreshortening, local shading, interreflections, and the shadowing and occlusion of surface elements. We introduce the dimensionality surface as a representation for the visual complexity of a material sample. The dimensionality surface defines the number of basis textures that are required to represent the observed textures for a sample as a function of ranges of illumination and viewing angles. Basis textures are represented using multiband correlation functions that consider both within and between color band correlations. We examine properties of the dimensionality surface for real materials using the Columbia Utrecht Reflectance and Texture (CUReT) database. The analysis shows that the dependence of the dimensionality surface on ranges of illumination and viewing angles is approximately linear with a slope that depends on the complexity of the sample. We extend the analysis to consider the problem of recognizing rough surfaces in color images obtained under unknown illumination and viewing geometry. We show, using a set of 12,505 images from 61 material samples, that the information captured by the multiband correlation model allows surfaces to be recognized over a wide range of conditions. We also show that the use of color information provides significant advantages for three-dimensional texture recognition.

10.
Person re-identification is an important technique for automatically searching for pedestrians in surveillance video, and it consists of two parts: feature representation and metric learning. An effective feature representation should be robust to changes in illumination and viewpoint, while discriminative metric learning improves the matching accuracy of pedestrian images. However, most existing features are based on either local or global representations and do not integrate the fine details and the overall appearance of a pedestrian well, and metric learning is usually carried out in a linear feature space, which cannot efficiently exploit the nonlinear structure of the feature space. To address this, an enhanced Local Maximal Occurrence feature representation (eLOMO) is designed, which fuses the fine details and the overall appearance of a pedestrian image in line with the human visual recognition mechanism; and a kernelized KISSME metric learning method (k-KISSME) is proposed, which is simple and efficient to compute, requiring only the estimation of two inverse covariance matrices. In addition, the Retinex transform and a scale-invariant texture descriptor are applied to handle illumination and viewpoint changes. Experiments show that the method provides a rich and complete pedestrian feature representation and improves the re-identification rate compared with mainstream existing methods.
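The core of KISSME, which k-KISSME kernelizes, needs only the covariances of feature differences for matching and non-matching pairs. A linear-space sketch is below; the kernelization and the eLOMO descriptor are not shown, and projecting M back onto the positive semi-definite cone is a common practical step rather than a detail taken from this paper.

```python
import numpy as np

def kissme_metric(diff_pos, diff_neg):
    """diff_pos/diff_neg: (n, d) feature differences for matching / non-matching pairs.
    Returns M such that d(x, y) = (x - y)^T M (x - y)."""
    cov_pos = np.cov(diff_pos, rowvar=False)
    cov_neg = np.cov(diff_neg, rowvar=False)
    M = np.linalg.inv(cov_pos) - np.linalg.inv(cov_neg)
    # Clip negative eigenvalues so M defines a valid (pseudo-)metric.
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, 0)) @ V.T

def mahalanobis_like(M, x, y):
    d = x - y
    return float(d @ M @ d)

rng = np.random.default_rng(6)
pos = rng.normal(scale=0.5, size=(500, 8))   # same-person differences: small
neg = rng.normal(scale=2.0, size=(500, 8))   # different-person differences: large
M = kissme_metric(pos, neg)
print(mahalanobis_like(M, rng.normal(size=8), rng.normal(size=8)))
```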

11.
This paper proposes a novel method based on Spectral Regression (SR) for efficient scene recognition. First, a new SR approach, called Extended Spectral Regression (ESR), is proposed to perform manifold learning on a huge number of data samples. Then, an efficient Bag-of-Words (BOW) based method is developed which employs ESR to encapsulate local visual features with their semantic, spatial, scale, and orientation information for scene recognition. In many applications, such as image classification and multimedia analysis, there are a huge number of low-level feature samples in a training set, which prohibits directly applying SR to perform manifold learning on such a dataset. In ESR, we first group the samples into tiny clusters, and then devise an approach to reduce the size of the similarity matrix for graph learning. In this way, subspace learning on the graph Laplacian for a vast dataset is computationally feasible on a personal computer. In the ESR-based scene recognition, we first propose an enhanced low-level feature representation which combines the scale, orientation, spatial position, and local appearance of a local feature. Then, ESR is applied to embed the enhanced low-level image features. The ESR-based feature embedding not only generates a low-dimensional feature representation but also integrates various aspects of the low-level features into the compact representation. The bag-of-words is then generated from the embedded features for image classification. Comparative experiments on open benchmark datasets for scene recognition demonstrate that the proposed method outperforms baseline approaches. It is suitable for real-time applications on mobile platforms, e.g., tablets and smart phones.

12.
Spectral reflectance is an intrinsic characteristic of objects that is independent of illumination and the used imaging sensors. This direct representation of objects is useful for various computer vision tasks, such as color constancy and material discrimination. In this work, we present a novel system for spectral reflectance recovery with high temporal resolution by exploiting the unique color-forming mechanism of digital light processing (DLP) projectors. DLP projectors use color wheels, which are composed of a number of color segments and rotate quickly to produce the desired colors. Making effective use of this mechanism, we show that a DLP projector can be used as a light source with spectrally distinct illuminations when the appearance of a scene under the projector’s irradiation is captured with a high-speed camera. Based on the measurements, the spectral reflectance of scene points can be recovered using a linear approximation of the surface reflectance. Our imaging system is built from off-the-shelf devices, and is capable of taking multi-spectral measurements as fast as 100 Hz. We carefully evaluated the accuracy of our system and demonstrated its effectiveness by spectral relighting of static as well as dynamic scenes containing different objects.
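The recovery step reduces to linear least squares once the reflectance is written in a low-dimensional basis; a self-contained numerical sketch is below. The illumination spectra, sensor response, and smooth cosine basis are synthetic placeholders, not the actual DLP color-wheel or camera characteristics.

```python
import numpy as np

rng = np.random.default_rng(7)
wavelengths = np.linspace(400, 700, 61)    # nm, 5 nm steps
n_basis, n_illum = 6, 8

# Hypothetical smooth reflectance basis and per-segment illumination spectra.
basis = np.cos(np.outer(np.arange(n_basis), np.linspace(0, np.pi, 61)))   # (n_basis, 61)
illums = np.abs(rng.normal(size=(n_illum, 61)))                           # (n_illum, 61)
sensor = np.exp(-0.5 * ((wavelengths - 550) / 80.0) ** 2)                 # single-band response

# Forward model: one measurement per illumination, c_k = sum_lambda l_k * s * r,
# with r(lambda) = sum_i alpha_i * b_i(lambda).
true_coeff = rng.normal(size=n_basis)
true_refl = true_coeff @ basis
A = (illums * sensor) @ basis.T                                           # (n_illum, n_basis)
measurements = A @ true_coeff + 0.001 * rng.normal(size=n_illum)

# Recover the reflectance coefficients from the multi-illuminant measurements.
coeff_hat, *_ = np.linalg.lstsq(A, measurements, rcond=None)
print(np.max(np.abs(true_refl - coeff_hat @ basis)))                      # small reconstruction error
```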

13.
An image representation method using vector quantization (VQ) on color and texture is proposed in this paper. The proposed method is also used to retrieve similar images from database systems. The basic idea is a transformation from the raw pixel data to a small set of image regions, which are coherent in color and texture space. A scheme is provided for object-based image retrieval. Features for image retrieval are the three color features (hue, saturation, and value) from the HSV color model and five textural features (ASM, contrast, correlation, variance, and entropy) from the gray-level co-occurrence matrices. Once the features are extracted from an image, eight-dimensional feature vectors represent each pixel in the image. The VQ algorithm is used to rapidly cluster those feature vectors into groups. A representative feature table based on the dominant groups is obtained and used to retrieve similar images according to the object within the image. This method can retrieve similar images even in cases where objects are translated, scaled, and rotated.
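A compressed sketch of the VQ indexing idea: the 8-dimensional HSV + co-occurrence features are replaced by placeholder vectors here, k-means serves as the vector quantizer, the representative table is simply the centroids of the most populated groups, and the symmetric nearest-centroid table distance is an assumed matching rule rather than the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans

def representative_table(pixel_feats, n_codes=16, n_dominant=4):
    """Quantize per-pixel 8-d features and keep the centroids of the dominant groups."""
    km = KMeans(n_clusters=n_codes, n_init=10, random_state=0).fit(pixel_feats)
    counts = np.bincount(km.labels_, minlength=n_codes)
    dominant = np.argsort(counts)[::-1][:n_dominant]
    return km.cluster_centers_[dominant]

def table_distance(t1, t2):
    """Symmetric nearest-centroid distance between two representative tables."""
    d = np.linalg.norm(t1[:, None, :] - t2[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

rng = np.random.default_rng(8)
db = [rng.normal(loc=i, size=(2000, 8)) for i in range(3)]         # three "images" of pixel features
tables = [representative_table(f) for f in db]
query = representative_table(rng.normal(loc=1, size=(2000, 8)))    # resembles image 1
print(int(np.argmin([table_distance(query, t) for t in tables])))  # -> 1
```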

14.
Traditional flame recognition methods based on physical signals are easily disturbed by the external environment, and existing flame-image feature extraction methods discriminate poorly between flames and backgrounds, so recognition accuracy drops when the flame type or scene changes. To address this, a fast flame recognition method based on local feature filtering and an extreme learning machine is proposed, which introduces color-space information into the scale-invariant feature transform (SIFT) algorithm. First, video files are converted into frames and SIFT feature descriptors are extracted from all images. Second, local noise keypoints are further filtered using the characteristics of flames in color space, and the descriptors are converted into feature vectors with a bag-of-keypoints (BOK) method. Finally, the vectors are fed into an extreme learning machine for training, quickly producing a flame recognition model. Experiments on public flame datasets and real fire-scene images show that the method achieves high recognition rates and fast detection for different scenes and flame types, with an accuracy above 97%; for a test set of 4301 images the model needs only 2.19 s. Compared with three other methods (a support vector machine based on information entropy, texture features, and flame spread rate; a support vector machine based on SIFT and flame color-space characteristics; and an extreme learning machine based on SIFT alone), the proposed method is superior in both test-set accuracy and model construction time.
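The extreme learning machine at the final stage trains in closed form: hidden-layer weights are random and only the output weights are solved with a pseudo-inverse. A generic sketch is below; the bag-of-keypoints inputs are replaced by random vectors, and the hidden size and sigmoid activation are arbitrary choices, not taken from the paper.

```python
import numpy as np

class ELM:
    """Single-hidden-layer extreme learning machine for classification."""
    def __init__(self, n_hidden=200, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))   # random, never trained
        self.b = self.rng.normal(size=self.n_hidden)
        H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))             # hidden activations
        T = np.eye(n_classes)[y]                                      # one-hot targets
        self.beta = np.linalg.pinv(H) @ T                             # closed-form output weights
        return self

    def predict(self, X):
        H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))
        return (H @ self.beta).argmax(axis=1)

rng = np.random.default_rng(9)
y = rng.integers(0, 2, 400)                        # flame / non-flame
X = rng.normal(size=(400, 64)) + 2.0 * y[:, None]  # stand-in for bag-of-keypoints histograms
print((ELM().fit(X, y).predict(X) == y).mean())
```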

15.
16.
17.
In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases at video frame rates. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems, but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted at each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between frame-by-frame tracking and pose recognition.

18.
Yin Hui, Cao Yongfeng, Sun Hong. Acta Automatica Sinica (自动化学报), 2010, 36(8): 1099-1106
A multi-dimensional pyramid representation algorithm is proposed, and AdaBoost based on this representation is used for urban scene classification of high-resolution synthetic aperture radar (SAR) images. The multi-dimensional pyramid representation first computes a pyramid representation vector for each dimension of the local features and then concatenates all the pyramid vectors into a multi-dimensional pyramid vector. This overcomes the problem that, when handling high-dimensional local features, the discriminative power of the pyramid representation vector produced by the standard pyramid representation is constrained by computational efficiency. AdaBoost based on the pyramid representation and AdaBoost based on the multi-dimensional pyramid representation are compared on a TerraSAR-X image library and on a large TerraSAR-X image. The experimental results show that, compared with the former, the latter significantly improves computational efficiency while maintaining classification accuracy.

19.
Recognition by linear combinations of models
An approach to visual object recognition in which a 3D object is represented by the linear combination of 2D images of the object is proposed. It is shown that for objects with sharp edges as well as with smooth bounding contours, the set of possible images of a given object is embedded in a linear space spanned by a small number of views. For objects with sharp edges, the linear combination representation is exact. For objects with smooth boundaries, it is an approximation that often holds over a wide range of viewing angles. Rigid transformations (with or without scaling) can be distinguished from more general linear transformations of the object by testing certain constraints placed on the coefficients of the linear combinations. Three alternative methods of determining the transformation that matches a model to a given image are proposed.
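The central claim is easy to verify numerically for the orthographic case: the image coordinates of a novel view of a rigid point set lie in the span of the coordinates of two stored views. A sketch assuming random 3D points and orthographic projection is below; the constraint test on the coefficients that separates rigid from general linear transformations is omitted.

```python
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(10)
P = rng.normal(size=(3, 40))            # rigid 3D object: 40 feature points

def view(P, seed):
    """Orthographic projection of the rotated object: returns image x and y coordinates."""
    R = Rotation.random(random_state=seed).as_matrix()
    Q = R @ P
    return Q[0], Q[1]

x1, y1 = view(P, 1)                     # stored model view 1
x2, y2 = view(P, 2)                     # stored model view 2
x3, y3 = view(P, 3)                     # novel view to be explained

# Each coordinate row of the novel view is a linear combination of the stored rows.
A = np.stack([x1, y1, x2, y2, np.ones_like(x1)], axis=1)
coeff, *_ = np.linalg.lstsq(A, x3, rcond=None)
print(np.max(np.abs(A @ coeff - x3)))   # ~1e-15: the novel x-coordinates are reproduced exactly
```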

20.
Most classical image classification algorithms are based on RGB or grayscale images and do not make good use of the depth information of objects or scenes. To address this, an image classification method based on fused RGB-D features is proposed. First, dense SIFT local features are extracted from the RGB image and Gist global features from the depth map, and the two kinds of features are fused. Next, a visual dictionary is built from the fused features with an improved K-means algorithm, which overcomes the strong dependence of the traditional K-means algorithm on the choice of initial points; in the image representation stage, LLC sparse coding is introduced to encode the fused features against the corresponding visual dictionary. Finally, a linear SVM performs the classification. Experimental results show that the proposed algorithm effectively improves image classification accuracy.
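Of the pipeline pieces above, the LLC step has a simple closed form: each descriptor is reconstructed from its k nearest visual words by solving a small regularized linear system. The sketch below follows the commonly used approximated LLC solution; the codebook size, k, and the regularizer are illustrative, and the fused RGB-D features are replaced by a random descriptor.

```python
import numpy as np

def llc_code(x, codebook, k=5, lam=1e-4):
    """Locality-constrained Linear Coding for one descriptor x against a (K, d) codebook."""
    d2 = np.sum((codebook - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]                 # k nearest visual words
    B = codebook[idx] - x                    # shift to the descriptor's local frame
    C = B @ B.T                              # local covariance
    C += lam * np.trace(C) * np.eye(k)       # regularization for numerical stability
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                             # enforce the sum-to-one (shift-invariance) constraint
    code = np.zeros(len(codebook))
    code[idx] = w                            # sparse: only k non-zero coefficients
    return code

rng = np.random.default_rng(11)
codebook = rng.normal(size=(256, 128))       # visual dictionary (e.g. from K-means)
desc = rng.normal(size=128)                  # one dense-SIFT-like descriptor
code = llc_code(desc, codebook)
print(np.count_nonzero(code), code.sum())    # 5 non-zeros, coefficients sum to 1
```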
