Similar literature
 Found 20 similar documents (search time: 62 ms)
1.
2.
Silhouette-based occluded object recognition through curvature scale space   Total citations: 4, self-citations: 0, citations by others: 4
A complete and practical system for occluded object recognition has been developed which is very robust with respect to noise and local deformations of shape (due to weak perspective distortion, segmentation errors and non-rigid material) as well as scale, position and orientation changes of the objects. The system has been tested on a wide variety of free-form 3D objects. An industrial application is envisaged where a fixed camera and a light-box are utilized to obtain images. Within the constraints of the system, every rigid 3D object can be modeled by a limited number of classes of 2D contours corresponding to the object's resting positions on the light-box. The contours in each class are related to each other by a 2D similarity transformation. The Curvature Scale Space technique [26, 28] is then used to obtain a novel multi-scale segmentation of the image and the model contours. Object indexing [16, 32, 36] is used to narrow down the search space. An efficient local matching algorithm is utilized to select the best matching models. Received: 5 August 1996 / Accepted: 19 March 1997
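As a rough illustration of the curvature scale space idea mentioned in this abstract, the sketch below smooths a closed contour with Gaussians of increasing width and records where the curvature changes sign. It is a minimal sketch under stated assumptions, not the system described above: the function names, the 3-sigma kernel truncation, and the finite-difference curvature are all choices made here for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Discrete Gaussian weights, truncated at 3*sigma and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_closed(values, sigma):
    """Circularly convolve one coordinate function of a closed contour."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [sum(kernel[j + radius] * values[(i + j) % n]
                for j in range(-radius, radius + 1))
            for i in range(n)]

def curvature(xs, ys):
    """Signed curvature of a closed contour from central finite differences."""
    n = len(xs)
    result = []
    for i in range(n):
        p, q = (i - 1) % n, (i + 1) % n
        dx, dy = (xs[q] - xs[p]) / 2.0, (ys[q] - ys[p]) / 2.0
        ddx = xs[q] - 2.0 * xs[i] + xs[p]
        ddy = ys[q] - 2.0 * ys[i] + ys[p]
        result.append((dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5)
    return result

def zero_crossings(kappa):
    """Indices i where the curvature changes sign between samples i and i+1."""
    n = len(kappa)
    return [i for i in range(n) if kappa[i] * kappa[(i + 1) % n] < 0]

def css_crossings(xs, ys, sigmas):
    """Map each smoothing scale (in samples) to its curvature zero crossings."""
    return {s: zero_crossings(curvature(smooth_closed(xs, s),
                                        smooth_closed(ys, s)))
            for s in sigmas}
```

Plotting the crossing positions against the scale gives the kind of multi-scale contour description that the abstract refers to; concavities disappear from the plot as the scale grows and the contour becomes convex.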

3.
The goal of object categorization is to locate and identify instances of an object category within an image. Recognizing an object in an image is difficult when images include occlusion, poor quality, noise or background clutter, and this task becomes even more challenging when many objects are present in the same scene. Several models for object categorization use appearance and context information from objects to improve recognition accuracy. Appearance information, based on visual cues, can successfully identify object classes up to a certain extent. Context information, based on the interaction among objects in the scene or global scene statistics, can help successfully disambiguate appearance inputs in recognition tasks. In this work we address the problem of incorporating different types of contextual information for robust object categorization in computer vision. We review different ways of using contextual information in the field of object categorization, considering the most common levels of extraction of context and the different levels of contextual interactions. We also examine common machine learning models that integrate context information into object recognition frameworks and discuss scalability, optimizations and possible future approaches.

4.
The explosion of the Internet provides us with a tremendous resource of images shared online. It also confronts vision researchers with the problem of finding effective methods to navigate the vast amount of visual information. Semantic image understanding plays a vital role towards solving this problem. One important task in image understanding is object recognition, in particular, generic object categorization. Critical to this problem are the issues of learning and dataset. Abundant data helps to train a robust recognition system, while a good object classifier can help to collect a large amount of images. This paper presents a novel object recognition algorithm that performs automatic dataset collecting and incremental model learning simultaneously. The goal of this work is to use the tremendous resources of the web to learn robust object category models for detecting and searching for objects in real-world cluttered scenes. Humans continuously update the knowledge of objects when new examples are observed. Our framework emulates this human learning process by iteratively accumulating model knowledge and image examples. We adapt a non-parametric latent topic model and propose an incremental learning framework. Our algorithm is capable of automatically collecting much larger object category datasets for 22 randomly selected classes from the Caltech 101 dataset. Furthermore, our system offers not only more images in each object category but also a robust object category model and meaningful image annotation. Our experiments show that OPTIMOL is capable of collecting image datasets that are superior to the well known manually collected object datasets Caltech 101 and LabelMe.

5.
We describe a flexible model for representing images of objects of a certain class, known a priori, such as faces, and introduce a new algorithm for matching it to a novel image, thereby performing image analysis. The flexible model, known as a multidimensional morphable model, is learned from example images of objects of a class. In this paper we introduce an effective stochastic gradient descent algorithm that automatically matches a model to a novel image. Several experiments demonstrate the robustness and the broad range of applicability of morphable models. Our approach can provide novel solutions to several vision tasks, including the computation of image correspondence, object verification and image compression.

6.
7.
8.
The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, simpler techniques are applicable under restricted conditions. The approach exploits image transformations that are specific to the relevant object class, and learnable from example views of other “prototypical” objects of the same class. In this paper, we introduce such a technique by extending the notion of linear class proposed by the authors (1992). For linear object classes, it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively “rotate” high-resolution face images from a single 2D view.
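The linear-class property stated in this abstract can be illustrated with a small sketch: if a novel view is a linear combination of prototype views, the same weights applied to the prototypes' views in a second pose give the novel object's view in that pose. The code below is an illustrative sketch, not the authors' implementation; the function names and the least-squares solution via normal equations are assumptions made here, and each view is assumed to be a flat list of 2D point coordinates.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def coefficients(prototypes, view):
    """Least-squares weights expressing `view` as a combination of `prototypes`."""
    gram = [[sum(x * y for x, y in zip(p, q)) for q in prototypes]
            for p in prototypes]
    rhs = [sum(x * y for x, y in zip(p, view)) for p in prototypes]
    return solve(gram, rhs)

def transfer(prototypes_a, prototypes_b, view_a):
    """Predict the pose-B view of an object seen only in pose A."""
    alpha = coefficients(prototypes_a, view_a)
    size = len(prototypes_b[0])
    return [sum(w * p[k] for w, p in zip(alpha, prototypes_b))
            for k in range(size)]
```

The prediction is exact only when the novel object really lies in the span of the prototypes and the pose-A prototype views are linearly independent; otherwise it is a least-squares approximation.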

9.
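Objective: To address background interference in fine-grained image classification, a classification model based on segmentation with top-down attention maps is proposed. Methods: First, a convolutional neural network performs an initial classification of the fine-grained image dataset, yielding a base network model. Visualization analysis of this model shows that only some image regions contribute to the target class; the trained base network is therefore used to compute the spatial support of image pixels for the relevant class, generating a top-down attention map that detects the key regions in the image. The attention map is then used to initialize the GraphCut algorithm, which segments the key object region and thereby improves the discriminability of the image. Finally, CNN features are extracted from the segmented image to perform fine-grained classification. Results: The model uses only image-level class labels. In experiments on the public fine-grained image datasets Cars196 and Aircrafts100, it achieves average classification accuracies of 86.74% and 84.70%, respectively. These results indicate that introducing attention information on top of the GoogLeNet model can further improve the accuracy of fine-grained image classification. Conclusion: The semantic segmentation strategy based on top-down attention maps improves fine-grained image classification performance. Because it requires no bounding-box or part annotations, the model is general and robust, and is applicable to salient object detection, foreground segmentation, and fine-grained image classification.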

10.
Statistics of natural image categories   Total citations: 9, self-citations: 0, citations by others: 9
In this paper we study the statistical properties of natural images belonging to different categories and their relevance for scene and object categorization tasks. We discuss how second-order statistics are correlated with image categories, scene scale and objects. We propose how scene categorization could be computed in a feedforward manner in order to provide top-down and contextual information very early in the visual processing chain. Results show how visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification. We show how simple image statistics can be used to predict the presence and absence of objects in the scene before exploring the image.

11.
This paper presents a rule-based query language for an object-oriented database model. The database model supports complex objects, object identity, classes and types, and a class/type hierarchy. The instances are described by ‘object relations’ which are functions from a set of objects to value sets and other object sets. The rule language is based on object-terms which provide access to objects via the class hierarchy. Rules are divided into two classes: object-preserving rules manipulating existing objects (yielding a new ‘view’ on objects available in the object base) and object-generating rules creating new objects with properties derived from existing objects. The derived object sets are included in a class lattice. We give conditions for whether the instances of the ‘rules’ heads are ‘consistent’, i.e. represent object relations where the properties of the derived objects are functionally determined by the objects.

12.
Computer, 1996, 29(10): 119-121
Most applications must keep objects from one session to the next. This is known as persistence. But objects are not raw data: They are instances of classes. What happens if an object's class (its generator) changes from one session to the next? This problem is known as schema evolution (the term schema is borrowed from relational databases). This column defines a framework for addressing schema evolution in object technology.

13.
Recovering the 3-D shape of an object from its 2-D image contour is an important problem in computer vision. In this correspondence, the author motivates and develops two object-based heuristics. The structured nature of objects is the motivation for the nonaccidental alignment criterion: parallel coordinate axes within the object's bounding contour correspond to object-centered coordinate axes. The regularity and symmetry inherent in many man-made objects is the motivation for the orthogonal basis constraint. An oblique set of coordinate axes in the image is presumed to be the projection of an orthogonal set of 3-D coordinate axes in the scene. These object-based heuristics are used to recover shape in both real and synthetic images.

14.
Conics-based stereo, motion estimation, and pose determination   Total citations: 13, self-citations: 1, citations by others: 12
Stereo vision, motion and structure parameter estimation, and pose determination are three important problems in 3-D computer vision. The first step in all of these problems is to choose and to extract primitives and their features in images. In most previous work, edge points or straight line segments are used as primitives and their local properties as features. Few methods have been presented in the literature using more compact primitives and their global features. This article presents an approach using conics as primitives. For stereo vision, a closed-form solution is provided for both establishing the correspondence of conics in images and the reconstruction of conics in space. With this method, the correspondence is uniquely determined and the reconstruction is global. It is shown that the method can be extended to higher-degree (degree ≥ 3) planar curves. For motion and structure parameter estimation, it is shown that, in general, two sequential images of at least three conics are needed in order to determine the camera motion. A complicated nonlinear system must be solved in this case. In particular, if we are given two images of a pair of coplanar conics, a closed-form solution of camera motion is presented. In a CAD-based vision system, the object models are available, and this makes it possible to recognize 3-D objects and to determine their poses from a single image. For pose determination, it is shown that if there exist two conics on the surface of an object, the object's pose can be determined by an efficient one-dimensional search. In particular, if two conics are coplanar, a closed-form solution of the object's pose is presented. Uniqueness analysis and some experiments with real or synthesized data are presented in this article.

15.
Extensibility and dynamic schema evolution are among the attractive features that lead to the wide acceptance of the object-oriented paradigm. Not knowing all class hierarchy details should not prevent a user from introducing new classes when necessary. Naive or professional users may define new classes either by using class definition constructs or as views. However, improper placement of such classes leads to a flat hierarchy with many things duplicated. To overcome this problem, we automated the process in order to help the user find the most appropriate position with respect to her class in the hierarchy regardless of her knowledge of the hierarchy. The system must be responsible for the proper placement of new classes because only the system has complete knowledge of the details of the class hierarchy, especially in a dynamic environment where changes are very frequent. In other published work, we proved that to define a view it is enough to have the set of objects that qualify to be in a view in addition to having message expressions (possible paths) that lead to desired values within those objects. Here, we go further to map a view that is intended to be persistent into a class. Then we investigate the proper position of that class in the hierarchy. To achieve this, we consider current characteristics of a new class in order to derive its relationship with other existing classes in the hierarchy. Another advantage of the presented model is that views that generate new objects are still updatable simply because we based the creation of new objects on existing identities. In other words, an object participates inside view objects by its identity regardless of which particular values from that object are of interest to the view. Values are reachable via message expressions, not violating encapsulation. 
This way, actual values are present in only one place and can be updated. Received: 19 March 1999, Accepted: 26 December 2003, Published online: 8 April 2004. Edited by: R. Topor.

16.
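Automatic image annotation is the task of automatically recognizing the meaningful objects in an image and assigning them the corresponding semantic keywords. Although this is not difficult for humans, it is an arduous and challenging task for computers. Since human object recognition usually proceeds from coarse to fine, this paper proposes a hierarchical annotation scheme. First, the input image is automatically segmented into multiple regions, and each region is coarsely classified by a support vector machine. Because the coarse classification results directly affect the subsequent fine classification, we build statistical contextual semantic relations to correct wrong coarse labels. Next, to finely classify each coarsely labeled region, we propose a semi-supervised expectation-maximization algorithm, which not only finds representative patterns for the fine classes under each coarse category but also reclassifies the coarsely classified regions to give them fine labels. Finally, we apply the contextual semantic relations again to correct inappropriate fine labels. To demonstrate the effectiveness of this scheme, we developed a prototype image annotation system; the experimental results show that the hierarchical annotation scheme is effective.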

17.
Object Detection and Localization by Dynamic Template Warping   Total citations: 1, self-citations: 0, citations by others: 1
A simple method is presented for detecting, localizing and recognizing instances of classes of objects, while accommodating a wide variation in an object's pose. The method utilizes a small two-dimensional template that is warped into an image, and converts localization to a one-dimensional sub-problem, with the search for a match between image and template executed by dynamic programming. For roughly cylindrical objects (like heads), the method recovers three of the six degrees of freedom of motion (2 translation, 1 rotation), and accommodates two more degrees of freedom in the search process (1 rotation, 1 translation). Experiments demonstrate that the method provides an efficient search strategy that outperforms normalized correlation. This is demonstrated in the example domain of face detection and localization, and can be extended to more general detection tasks. An additional technique recovers rough object pose from the match results, and is used in a two stage recognition experiment in conjunction with maximization of mutual information.
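To give a feel for how a one-dimensional match can be searched by dynamic programming, the sketch below computes a classic dynamic-time-warping cost between a 1-D template and a 1-D signal. This is a generic textbook technique swapped in for illustration, not the warping method of the paper above; the function name and the absolute-difference cost are assumptions.

```python
def dtw_cost(template, signal):
    """Minimum cumulative absolute difference over monotonic alignments.

    cost[i][j] holds the best cost of aligning the first i template samples
    with the first j signal samples; each sample may be stretched to cover
    several samples of the other sequence.
    """
    inf = float("inf")
    rows, cols = len(template), len(signal)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            d = abs(template[i - 1] - signal[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch the signal sample
                                 cost[i][j - 1],      # stretch the template sample
                                 cost[i - 1][j - 1])  # advance both
    return cost[rows][cols]
```

Because each table cell depends only on three neighbours, the search over all monotonic warps costs time proportional to the product of the two lengths rather than growing exponentially.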

18.
Image-based surface detail transfer   Total citations: 3, self-citations: 0, citations by others: 3
Changing an object's appearance by adding geometric details is desirable in many real-world applications. Bump mapping has acted as an alternative to adding geometrical details to an otherwise smooth object. But constructing visually interesting bump maps requires practice and artistic skills. Computer vision techniques have helped for modeling real-world objects and their surface details. In our approach, for cases where we only want to transfer geometrical details from one object to another, we might not need to explicitly compute 3D structures. In particular, our novel technique captures the geometrical details of an object from a single image in a way that is independent of the object's reflectance property. We can then transfer geometrical details to another surface, producing the appearance of a new surface with added geometrical details while preserving the object's reflectance property. Our method's advantages are that it's simple to implement, requires only a single image for each object, and produces effective results.

19.
In this article we propose a new method for accurate nonrigid motion analysis when point correspondence data is not available. Nonlinear finite element models are constructed by integrating range data and prior knowledge about an object's properties. The motion sequence is recovered given an initial alignment of the model with the first frame of the sequence. The main idea of the method is to find the forces that are responsible for the motion or shape deformation of the given object. The task is broken into subtasks of finding the forces for each frame. Both absolute values and directions of these forces are taken into consideration and iteratively varied not only for each frame, but also between the frames. Experimental results demonstrate the success of the proposed algorithm. The method is applied to man-made elastic materials and human hand modeling. It allows for recovery of single and multiple forces using restricted (elastic-articulated) and completely unrestricted (elastic) models. Our work demonstrates the possibility of accurate nonrigid motion analysis and force recovery from range image sequences containing nonrigid objects and large motion without interframe point correspondences.

20.
Zhuang Yan (庄严), Lu Xibin (卢希彬), Li Yunhui (李云辉). Acta Automatica Sinica (《自动化学报》), 2011, 37(10): 1232-1240
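This paper studies scene cognition for mobile robots in indoor 3D environments. The framework of an indoor scene is structured, whereas the diverse objects found indoors are difficult to describe with models. A region-growing algorithm is used to extract planar features, and the indoor scene framework is identified from the attributes of the planes and the spatial relations among them. To draw on object recognition methods from image processing, a representation of laser range data based on the Bearing Angle model is proposed, which converts the 3D point cloud into a 2D Bearing Angle image. Individual objects within the same class vary in shape, and the viewing angle also causes significant differences in the laser range data. To address these problems, a supervised learning method based on the Gentleboost algorithm is adopted, using object fragments and their positions relative to the object center as features, to recognize objects in indoor scenes. The scene-framework identification results are used to label regions such as ceiling, floor, walls, and doors in the Bearing Angle image, and the resulting semantic information is used to remove incorrect recognition results, which helps improve the recognition rate. Experimental results obtained on a real robot platform verify the effectiveness of the proposed method.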
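The abstract does not spell out the Bearing Angle computation. The sketch below assumes the commonly used definition, in which the bearing angle at a range sample is the angle between the current laser beam and the segment joining the current and previous measured points. The function names and the scan layout (a list of ranges with constant angular step `dphi`) are hypothetical and chosen here for illustration.

```python
import math

def bearing_angle(rho_prev, rho_curr, dphi):
    """Angle between the current beam and the segment to the previous point.

    Assumed definition: with consecutive ranges rho_prev and rho_curr that are
    separated by the angular step dphi, the law of cosines gives the chord
    length, and the angle at the current point follows from its cosine.
    """
    chord = math.sqrt(rho_curr ** 2 + rho_prev ** 2
                      - 2.0 * rho_curr * rho_prev * math.cos(dphi))
    if chord == 0.0:
        raise ValueError("consecutive points coincide; bearing angle undefined")
    cosine = (rho_curr - rho_prev * math.cos(dphi)) / chord
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding

def bearing_angle_row(ranges, dphi):
    """Bearing angles for one scan line (one value per consecutive pair)."""
    return [bearing_angle(a, b, dphi) for a, b in zip(ranges, ranges[1:])]
```

Applying `bearing_angle_row` to every scan line and mapping the angles to gray levels yields a 2D image of the kind the abstract calls a Bearing Angle image, on which image-based recognition methods can then operate.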
