Similar documents
 Found 20 similar documents (search time: 562 ms)
1.
3D object recognition is a difficult and yet an important problem in computer vision. A 3D object recognition system has two major components, namely: an object modeller and a system that performs the matching of stored representations to those derived from the sensed image. The performance of systems wherein the construction of object models is done by training from one or more images of the objects has not been very satisfactory. Although objects used in a robotic workcell or in assembly processes have been designed using a CAD system, the vision systems used for recognition of these objects are independent of the CAD database. This paper proposes a scheme for interfacing the CAD database of objects and the computer vision processes used for recognising these objects. CAD models of objects are processed to generate vision-oriented features that appear in the different views of the object, and the same features are extracted from images of the object to identify the object and its pose.

2.
Modeling two-dimensional and three-dimensional objects is an important theme in computer graphics. Two main types of models are used in both cases: boundary representations, which represent the surface of an object explicitly but represent its interior only implicitly, and constructive solid geometry representations, which model a complex object, surface and interior together, as a boolean combination of simpler objects. Because neither representation is good for all applications, conversion between the two is often necessary. We consider the problem of converting boundary representations of polyhedral objects into constructive solid geometry (CSG) representations. The CSG representations for a polyhedron P are based on the half-spaces supporting the faces of P. For certain kinds of polyhedra this problem is equivalent to the corresponding problem for simple polygons in the plane. We give a new proof that the interior of each simple polygon can be represented by a monotone boolean formula based on the half-planes supporting the sides of the polygon and using each such half-plane only once. Our main contribution is an efficient and practical O(n log n) algorithm for doing this boundary-to-CSG conversion for a simple polygon of n sides. We also prove that such nice formulae do not always exist for general polyhedra in three dimensions. The first author would like to acknowledge the support of the National Science Foundation under Grants CCR87-00917 and CCR90-02352. The fourth author was supported in part by a National Science Foundation Graduate Fellowship. This work was begun while the first author was visiting the DEC Systems Research Center.
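The monotone-formula result covers all simple polygons; in the special case of a convex polygon the CSG formula collapses to a single AND of the supporting half-planes. A minimal sketch of that convex special case (an illustration only, not the paper's O(n log n) algorithm; the polygon is assumed counter-clockwise):

```python
def supporting_halfplanes(polygon):
    """For each edge (p, q) of a CCW polygon, return (a, b, c) such that
    a*x + b*y + c >= 0 holds on the interior side of that edge."""
    planes = []
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # inward normal of edge p -> q for a CCW polygon
        a, b = -(y2 - y1), (x2 - x1)
        c = -(a * x1 + b * y1)
        planes.append((a, b, c))
    return planes

def inside_convex(polygon, pt):
    """Evaluate the CSG formula H1 AND H2 AND ... AND Hn at a point."""
    x, y = pt
    return all(a * x + b * y + c >= 0 for a, b, c in supporting_halfplanes(polygon))
```

For non-convex simple polygons the paper shows a monotone AND/OR formula still exists with each half-plane used exactly once, but constructing it requires the full algorithm.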

3.
We advance new active computer vision algorithms based on the Feature space Trajectory (FST) representations of objects and a neural network processor for computation of distances in global feature space. Our algorithms classify rigid objects and estimate their pose from intensity images. They also indicate how to automatically reposition the sensor if the class or pose of an object is ambiguous from a given viewpoint and they incorporate data from multiple object views in the final object classification. An FST in a global eigenfeature space is used to represent 3D distorted views of an object. Assuming that an observed feature vector consists of Gaussian noise added to a point on the FST, we derive a probability density function for the observation conditioned on the class and pose of the object. Bayesian estimation and hypothesis testing theory are then used to derive approximations to the maximum a posteriori probability pose estimate and the minimum probability of error classifier. Confidence measures for the class and pose estimates, derived using Bayes theory, determine when additional observations are required, as well as where the sensor should be positioned to provide the most useful information.
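The central query in this scheme — find the FST point nearest an observed feature vector — can be sketched as a nearest-point search over piecewise-linear trajectories. A hedged illustration in plain Python (the names and the brute-force search are ours; the paper computes these distances with a neural network processor):

```python
def point_segment_distance(x, a, b):
    """Squared distance from point x to segment [a, b] in R^d,
    plus the parameter t of the closest point (0 at a, 1 at b)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    denom = sum(v * v for v in ab)
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(u * v for u, v in zip(ax, ab)) / denom))
    p = [ai + t * v for ai, v in zip(a, ab)]
    return sum((xi - pi) ** 2 for xi, pi in zip(x, p)), t

def classify_fst(x, trajectories):
    """Return (class_label, segment_index, t) of the closest FST point.
    `trajectories` maps a class label to an ordered list of feature
    vectors (one per stored view); t locates the pose along the segment."""
    best = None
    for label, views in trajectories.items():
        for i in range(len(views) - 1):
            d2, t = point_segment_distance(x, views[i], views[i + 1])
            if best is None or d2 < best[0]:
                best = (d2, label, i, t)
    return best[1], best[2], best[3]
```

The interpolation parameter t is what turns classification into pose estimation: it places the observation between two stored views of the winning class.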

4.
A computer vision system is presented for shape synthesis and recognition of three-dimensional objects using an attributed hypergraph representation. The vision system is capable of: (1) constructing an attributed hypergraph representation (AHR) based on the information extracted from an image with range data; (2) synthesizing several AHRs obtained from various views of an object to form a complete AHR of the object; and (3) recognizing any view of an object by finding the graph monomorphism between the AHR of that view and the complete AHR of a prototype object. This system is implemented on a Grinnell imaging system driven by a VAX 11/750 running VMS.

5.
Knowledge, 2002, 15(1-2): 111-118
We introduce a robotic-vision system which is able to extract object representations autonomously, utilising a tight interaction of visual perception and robotic action within a perception action cycle [Ecological Psychology 4 (1992) 121; Algebraic Frames for the Perception and Action Cycle, 1997, 1]. Controlled movement of the object grasped by the robot enables us to compute the transformations of entities which are used to represent aspects of objects and to find correspondences of entities within an image sequence. A general accumulation scheme allows us to acquire robust information from partly missing information extracted from single frames of an image sequence. Here we use this scheme with a preprocessing stage in which 3D-line segments are extracted from stereo images. However, the accumulation scheme can be used with any kind of preprocessing as long as the entities used to represent objects can be brought to correspondence by certain equivalence relations such as ‘rigid body motion'. We show that an accumulated representation can be applied within a tracking algorithm. The accumulation scheme is an important module of a vision-based robot system on which we are currently working. In this system, objects are planned to be represented by different visual and tactile entities. The object representations are going to be learned autonomously. We discuss the accumulation scheme in the context of this project.

6.
An approach for explicitly relating the shape of image contours to models of curved three-dimensional objects is presented. This relationship is used for object recognition and positioning. Object models consist of collections of parametric surface patches and their intersection curves; this includes nearly all representations used in computer-aided geometric design and computer vision. The image contours considered are the projections of surface discontinuities and occluding contours. Elimination theory provides a method for constructing the implicit equation of these contours for an object observed under orthographic or perspective projection. This equation is parameterized by the object's position and orientation with respect to the observer. Determining these parameters is reduced to a fitting problem between the theoretical contour and the observed data points. The proposed approach readily extends to parameterized models. It has been implemented for a simple world composed of various surfaces of revolution and tested on several real images.

7.
We are building a system that can rapidly determine the pose of a known object in an unknown view using a view class model of the object. The system inputs a three-dimensional CAD model; converts it to a three-dimensional vision model that contains the surfaces, edges, vertices, and topology of the object; and uses the vision model to determine the view classes or representative views of the object. In this paper we define the relational pyramid structure for describing the features in a particular view or view class of an object and the summary structure that is used to summarize the relational information in the relational pyramid. We then describe an accumulator-based method for rapidly determining the view class(es) that best match an unknown view of an object. This research was partially supported by the National Aeronautics and Space Administration (NASA) through a subcontract from Machine Vision International.

8.
Color indexing   (total citations: 327; self-citations: 11; citations by others: 316)
Computer vision is embracing a new research focus in which the aim is to develop visual skills for robots that allow them to interact with a dynamic, realistic environment. To achieve this aim, new kinds of vision algorithms need to be developed which run in real time and subserve the robot's goals. Two fundamental goals are determining the identity of an object with a known location, and determining the location of a known object. Color can be successfully used for both tasks. This article demonstrates that color histograms of multicolored objects provide a robust, efficient cue for indexing into a large database of models. It shows that color histograms are stable object representations in the presence of occlusion and over change in view, and that they can differentiate among a large number of objects. For solving the identification problem, it introduces a technique called Histogram Intersection, which matches model and image histograms, and a fast incremental version of Histogram Intersection, which allows real-time indexing into a large database of stored models. For solving the location problem it introduces an algorithm called Histogram Backprojection, which performs this task efficiently in crowded scenes.
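The Histogram Intersection match value itself is a one-liner: sum the bin-wise minima of image and model histograms, normalized by the model's pixel count. A minimal sketch with flat list histograms:

```python
def histogram_intersection(image_hist, model_hist):
    """Swain-Ballard match value: sum of bin-wise minima, normalized
    by the number of pixels counted in the model histogram."""
    overlap = sum(min(i, m) for i, m in zip(image_hist, model_hist))
    return overlap / sum(model_hist)
```

A score of 1.0 means every model bin is fully accounted for in the image; occlusion removes pixels from some bins and degrades the score gracefully, which is what makes the cue robust.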

9.
In recent years, the Deep Convolutional Neural Network (CNN) has shown impressive performance in the computer vision field. The ability to learn feature representations from large training datasets makes deep CNNs outperform traditional hand-crafted feature approaches on object classification and detection. However, computations for deep CNN models are time consuming due to their high complexity, which makes them hard to apply in real-world applications such as Advanced Driver Assistance Systems (ADAS). To reduce the computational complexity, several fast object detection frameworks have been proposed in the literature, such as SSD and YOLO. Although these methods can run in real time, they usually struggle to deal with small objects due to the difficulty of handling smaller input image sizes. Based on our observations, we propose a novel object detection framework which combines the feature representations learned from object-centric and scene-centric datasets, with the aim of improving the accuracy of detecting especially small objects. The experimental results on the MSCOCO dataset show that our method can indeed improve the detection accuracy of small objects, which leads to better overall results. We also evaluate our method on the PASCAL VOC 2012 dataset, and the results show that our method not only achieves state-of-the-art accuracy but, most importantly, also runs in real time.

10.
FORMS: A flexible object recognition and modelling system   (total citations: 4; self-citations: 1; citations by others: 3)
We describe a flexible object recognition and modelling system (FORMS) which represents and recognizes animate objects from their silhouettes. This consists of a model for generating the shapes of animate objects which gives a formalism for solving the inverse problem of object recognition. We model all objects at three levels of complexity: (i) the primitives, (ii) the mid-grained shapes, which are deformations of the primitives, and (iii) objects constructed by using a grammar to join mid-grained shapes together. The deformations of the primitives can be characterized by principal component analysis or modal analysis. When doing recognition, the representations of these objects are obtained in a bottom-up manner from their silhouettes by a novel method for skeleton extraction and part segmentation based on deformable circles. These representations are then matched to a database of prototypical objects to obtain a set of candidate interpretations. These interpretations are verified in a top-down process. The system is demonstrated to be stable in the presence of noise, the absence of parts, the presence of additional parts, and considerable variations in articulation and viewpoint. Finally, we describe how such a representation scheme can be automatically learnt from examples.

11.
Recognizing classes of objects from their shape is an unsolved problem in machine vision that entails the ability of a computer system to represent and generalize complex geometrical information on the basis of a finite amount of prior data. A practical approach to this problem is particularly difficult to implement, not only because the shape variability of relevant object classes is generally large, but also because standard sensing devices used to capture the real world only provide a partial view of a scene, so there is partial information pertaining to the objects of interest. In this work, we develop an algorithmic framework for recognizing classes of deformable shapes from range data. The basic idea of our component-based approach is to generalize existing surface representations that have proven effective in recognizing specific 3D objects to the problem of object classes using our newly introduced symbolic-signature representation that is robust to deformations, as opposed to a numeric representation that is often tied to a specific shape. Based on this approach, we present a system that is capable of recognizing and classifying a variety of object shape classes from range data. We demonstrate our system in a series of large-scale experiments that were motivated by specific applications in scene analysis and medical diagnosis.

12.
In computer vision, motion analysis is a fundamental problem. Applying the concepts of congruence checking in computational geometry and geometric hashing, which is a technique used for the recognition of partially occluded objects from noisy data, we present a new random sampling approach for the estimation of the motion parameters in two- and three-dimensional Euclidean spaces of both a completely measured rigid object and a partially occluded rigid object. We assume that the two- and three-dimensional positions of the vertices of the object in each image frame are determined using appropriate methods such as a range sensor or stereo techniques. We also analyze the relationships between the quantization errors and the errors in the estimation of the motion parameters by random sampling, and we show that the solutions obtained using our algorithm converge to the true solutions if the resolution of the digitization is increased.
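In 2D, a single randomly sampled pair of point correspondences determines the rigid motion completely, which is what makes a random-sampling estimator cheap. A sketch of that minimal-sample solver (illustrative only; the paper's full method adds congruence checking and geometric hashing to cope with occlusion and noise):

```python
import math

def rigid_motion_2d(p1, p2, q1, q2):
    """Recover the rotation angle and translation mapping (p1, p2) to
    (q1, q2), assuming the two pairs really are related by a rigid
    motion (a minimal two-point sample, as a random-sampling scheme
    would draw)."""
    vx, vy = p2[0] - p1[0], p2[1] - p1[1]   # model-frame edge
    wx, wy = q2[0] - q1[0], q2[1] - q1[1]   # image-frame edge
    theta = math.atan2(wy, wx) - math.atan2(vy, vx)
    c, s = math.cos(theta), math.sin(theta)
    # translation that carries the rotated p1 onto q1
    tx = q1[0] - (c * p1[0] - s * p1[1])
    ty = q1[1] - (s * p1[0] + c * p1[1])
    return theta, (tx, ty)
```

Repeatedly sampling pairs and keeping the motion hypothesis that the most correspondences agree with yields a robust estimate even when part of the object is occluded.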

13.
Learning to Recognize and Grasp Objects   (total citations: 1; self-citations: 1; citations by others: 1)
We apply techniques of computer vision and neural network learning to get a versatile robot manipulator. All work conducted follows the principle of autonomous learning from visual demonstration. The user must demonstrate the relevant objects, situations, and/or actions, and the robot vision system must learn from those. For approaching and grasping technical objects three principal tasks have to be done: calibrating the camera-robot coordination, detecting the desired object in the images, and choosing a stable grasping pose. These procedures are based on (nonlinear) functions, which are not known a priori and therefore have to be learned. We uniformly approximate the necessary functions by networks of gaussian basis functions (GBF networks). By modifying the number of basis functions and/or the size of the gaussian support the quality of the function approximation changes. The appropriate configuration is learned in the training phase and applied during the operation phase. All experiments are carried out in real world applications using an industrial articulation robot manipulator and the computer vision system KHOROS.
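The function approximator involved can be sketched directly: a network of Gaussian basis functions is a weighted sum of radial Gaussians with a shared support size. A minimal evaluator (illustrative; centre placement and the fitting of the weights are the part learned in the training phase):

```python
import math

def gbf_network(centers, weights, sigma):
    """Build f(x) = sum_k w_k * exp(-||x - c_k||^2 / (2 sigma^2)),
    a network of Gaussian basis functions with a shared support size."""
    def f(x):
        out = 0.0
        for c, w in zip(centers, weights):
            d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            out += w * math.exp(-d2 / (2.0 * sigma * sigma))
        return out
    return f
```

Increasing the number of centres or shrinking sigma trades smoothness for fidelity, which is exactly the configuration choice the abstract describes.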

14.
In this paper, we propose a space-variant image representation model based on properties of the magnocellular visual pathway, which performs motion analysis in the human retina. We then present an algorithm for tracking multiple objects in the proposed space-variant model. The proposed space-variant model has two effective image representations, for object recognition and motion analysis, respectively. Each image representation is based on properties of one of the two types of ganglion cell that begin the two basic visual pathways: one parvocellular, the other magnocellular. Through this model, we obtain efficient data reduction with little loss of important information. The proposed multiple-object tracking method operates in the space-variant image. Typically, an object-tracking algorithm consists of several processes such as detection, prediction, matching, and updating. In particular, the matching process plays an important role in multiple-object tracking. In traditional vision, the matching process is simple when the target objects are rigid. In space-variant vision, however, it is complicated even when the target is rigid, because an object region may deform in the space-variant coordinate system when the target moves to another position. We therefore propose a deformation formula to solve the matching problem in space-variant vision. By solving this problem, we can efficiently implement multiple-object tracking in space-variant vision.
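The matching difficulty described here can be seen in the standard log-polar mapping, a common stand-in for retina-like space-variant representations (the paper's ganglion-cell-based model differs in detail): a rigid translation in the image plane produces a position-dependent, nonrigid deformation in (log r, theta) coordinates.

```python
import math

def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map image coordinates to the standard space-variant (log-polar)
    representation centred at (cx, cy): resolution is highest near the
    centre ('fovea') and falls off logarithmically toward the periphery."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        raise ValueError("the fovea centre has no log-polar image")
    return math.log(r), math.atan2(dy, dx)
```

Shifting a target by the same image-plane offset changes its log-polar footprint differently depending on its eccentricity, which is why an explicit deformation formula is needed in the matching step.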

15.
Online object tracking in complex environments is an important but challenging problem in computer vision, especially under illumination changes and occlusion. With the emergence of commercial real-time depth cameras like Kinect, depth image-based object tracking, which is insensitive to illumination changes, has gained more and more attention. In this paper, we propose an online depth image-based object tracking method with sparse representation and object detection. In this framework, we combine tracking and detection to maintain both precision and efficiency under heavy occlusion. For tracking, objects are represented by sparse representations learned online with updates. For detection, we apply two different strategies, based on tracking-learning-detection and wider-search-window approaches. We evaluate our methods on both a subset of the public Princeton Tracking Benchmark dataset and our own driver face video recorded in a simulated driving environment. The quantitative evaluations of precision and running time on these two datasets demonstrate the effectiveness and efficiency of our proposed object tracking algorithms.

16.
Learning to Recognize and Grasp Objects   (total citations: 1; self-citations: 0; citations by others: 1)
Pauli, Josef. Machine Learning, 1998, 31(1-3): 239-258
We apply techniques of computer vision and neural network learning to get a versatile robot manipulator. All work conducted follows the principle of autonomous learning from visual demonstration. The user must demonstrate the relevant objects, situations, and/or actions, and the robot vision system must learn from those. For approaching and grasping technical objects three principal tasks have to be done: calibrating the camera-robot coordination, detecting the desired object in the images, and choosing a stable grasping pose. These procedures are based on (nonlinear) functions, which are not known a priori and therefore have to be learned. We uniformly approximate the necessary functions by networks of gaussian basis functions (GBF networks). By modifying the number of basis functions and/or the size of the gaussian support the quality of the function approximation changes. The appropriate configuration is learned in the training phase and applied during the operation phase. All experiments are carried out in real world applications using an industrial articulation robot manipulator and the computer vision system KHOROS.

17.
A sampled object representation (SOR) defines a graphical model using data obtained from a sampling process, which takes a collection of samples at discrete positions in space in order to capture certain geometrical and physical properties of one or more objects of interest. Examples of SORs include images, videos, volume datasets and point datasets. Unlike many commonly used data representations in computer graphics, SORs lack the geometrical, topological and semantic information that is much needed for controlling deformation and animation. Hence it poses a significant scientific and technical challenge to develop deformation and animation methods that operate upon SORs. Such methods can enable computer graphics and computer animation to benefit enormously from the advances of digital imaging technology. In this state of the art report, we survey a wide range of techniques that have been developed for manipulating, deforming and animating SORs. We consider a collection of elementary operations for manipulating SORs, which can serve as building blocks of deformation and animation techniques. We examine a collection of techniques that are designed to transform the geometric shape of deformable objects in sampled representations, and pay particular attention to their deployment in surgical simulation. We review a collection of techniques for animating digital characters in SORs, focusing on recent developments in volume animation.

18.
This paper presents a geospatial collision detection technique consisting of two methods: Find Object Distance (FOD) and Find Reflection Angle (FRA). We show how the geospatial collision detection technique, using a computer vision system, detects a computer-generated virtual object and a real object manipulated by a human user, and how the virtual object can be reflected on a real floor after being detected by a real object. In the geospatial collision detection technique, the FOD method detects the real and virtual objects, and the FRA method predicts the next moving directions of virtual objects. We demonstrate the two methods by implementing a floor-based Augmented Reality (AR) game, Ting Ting, which is played by bouncing fire-shaped virtual objects projected on a floor using bamboo-shaped real objects. The results reveal that the FOD and FRA methods of the geospatial collision detection technique enable smooth interaction between a real object manipulated by a human user and a virtual object controlled by a computer. The proposed technique is expected to be used in various AR applications as a low-cost interactive collision detection engine, such as in educational materials, interactive content including games, and entertainment equipment. Keywords: Augmented reality, collision detection, computer vision, game, human computer interaction, image processing, interfaces.

19.
Primates are very good at recognizing objects independent of viewing angle or retinal position, and they outperform existing computer vision systems by far. But invariant object recognition is only one prerequisite for successful interaction with the environment. An animal also needs to assess an object's position and relative rotational angle. We propose here a model that is able to extract object identity, position, and rotation angles. We demonstrate the model behavior on complex three-dimensional objects under translation and rotation in depth on a homogeneous background. A similar model has previously been shown to extract hippocampal spatial codes from quasi-natural videos. The framework for mathematical analysis of this earlier application carries over to the scenario of invariant object recognition. Thus, the simulation results can be explained analytically even for the complex high-dimensional data we employed.

20.
Brain imaging studies suggest that expert object recognition is a distinct visual skill, implemented by a dedicated anatomical pathway. Like all visual pathways, the expert recognition pathway begins with the early visual system (retina, LGN/SC, striate cortex). It is defined, however, by subsequent diffuse activation in the lateral occipital complex (LOC) and sharp foci of activation in the fusiform gyrus and right inferior frontal gyrus. This pathway recognizes familiar objects from familiar viewpoints under familiar illumination. Significantly, it identifies objects at both the categorical and instance (a.k.a. subcategorical) levels, and these processes cannot be dissociated. This paper presents a four-stage functional model of the expert object recognition pathway, where each stage models one area of anatomic activation. It implements this model in an end-to-end computer vision system and tests it on real images to provide feedback for the cognitive science and computer vision communities. Published online: 4 November 2004. Correspondence to: Bruce A. Draper. Kyungim Baek: current address: Department of Biomedical Engineering, Columbia University, New York, NY, USA.
