Similar articles
Found 20 similar articles; search took 296 ms
1.
Silhouette-based occluded object recognition through curvature scale space   Total citations: 4, self-citations: 0, citations by others: 4
A complete and practical system for occluded object recognition has been developed which is very robust with respect to noise and local deformations of shape (due to weak perspective distortion, segmentation errors and non-rigid material) as well as scale, position and orientation changes of the objects. The system has been tested on a wide variety of free-form 3D objects. An industrial application is envisaged where a fixed camera and a light-box are utilized to obtain images. Within the constraints of the system, every rigid 3D object can be modeled by a limited number of classes of 2D contours corresponding to the object's resting positions on the light-box. The contours in each class are related to each other by a 2D similarity transformation. The Curvature Scale Space technique [26, 28] is then used to obtain a novel multi-scale segmentation of the image and the model contours. Object indexing [16, 32, 36] is used to narrow down the search space. An efficient local matching algorithm is utilized to select the best matching models. Received: 5 August 1996 / Accepted: 19 March 1997

2.
Geometric fusion for a hand-held 3D sensor   Total citations: 2, self-citations: 0, citations by others: 2
Abstract. This article presents a geometric fusion algorithm developed for the reconstruction of 3D surface models from hand-held sensor data. Hand-held systems allow full 3D movement of the sensor to capture the shape of complex objects. Techniques previously developed for reconstruction from conventional 2.5D range image data cannot be applied to hand-held sensor data. A geometric fusion algorithm is introduced to integrate the measured 3D points from a hand-held sensor into a single continuous surface. The new geometric fusion algorithm is based on the normal-volume representation of a triangle, which enables incremental transformation of an arbitrary mesh into an implicit volumetric field function. This system is demonstrated for reconstruction of surface models from both hand-held sensor data and conventional 2.5D range images. Received: 30 August 1999 / Accepted: 21 January 2000

3.
In this paper, we present a correlation scheme that incorporates a color ring-projection representation for the automatic inspection of defects in textured surfaces. The proposed color ring projection transforms a 2-D color image into a 1-D color pattern as a function of radius. For a search window of width W, data dimensionality is reduced from O(W²) in the 2-D image to O(W) in the 1-D ring-projection space. The complexity of computing a correlation function is significantly reduced accordingly. Since the color ring-projection representation is invariant to rotation, the proposed method can be applied for both isotropic and oriented textures at arbitrary orientations. Experiments on regular textured surfaces have shown the efficacy of the proposed method. Received: 30 March 2000 / Accepted: 24 July 2001 Correspondence to: D.-M. Tsai (e-mail: iedmtsai@saturn.yzu.edu.tw)
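As a rough illustration of the ring-projection idea in the abstract above, the following Python sketch averages each colour channel over rings of rounded integer radius around the window centre. The function name `ring_projection`, the rounding of radii, and the list-of-tuples image format are assumptions made for this sketch; they are not details taken from the paper.

```python
import math


def ring_projection(window):
    """Reduce a square colour window to a 1-D pattern indexed by radius.

    `window` is a list of rows, each row a list of (r, g, b) tuples.
    Entry k of the result is the mean colour of all pixels whose rounded
    distance from the window centre equals k (hypothetical convention).
    """
    size = len(window)
    centre = (size - 1) / 2.0
    max_radius = int(centre)
    sums = [[0.0, 0.0, 0.0] for _ in range(max_radius + 1)]
    counts = [0] * (max_radius + 1)
    for y, row in enumerate(window):
        for x, pixel in enumerate(row):
            radius = int(round(math.hypot(x - centre, y - centre)))
            if radius > max_radius:
                continue  # corner pixels outside the inscribed circle are skipped
            counts[radius] += 1
            for channel in range(3):
                sums[radius][channel] += pixel[channel]
    return [
        tuple(s / n for s in ring) if n else (0.0, 0.0, 0.0)
        for ring, n in zip(sums, counts)
    ]


# Usage: a W x W window collapses to about W / 2 ring averages.
window = [[(x * 7 + y * 3, x, y) for x in range(5)] for y in range(5)]
pattern = ring_projection(window)
```

A pixel's distance from the centre does not change when the window is rotated about that centre, so the 1-D pattern of a window and of its 90-degree rotation coincide; for other angles the match is only approximate on a discrete grid.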

4.
Machine vision system for curved surface inspection   Total citations: 2, self-citations: 0, citations by others: 2
This application-oriented paper discusses a non-contact 3D range data measurement system to improve the performance of the existing 2D herring roe grading system. The existing system uses a single CCD camera with unstructured halogen lighting to acquire and analyze the 2D shape of the herring roe for size and deformity grading. Our system will act as an additional system module, which can be integrated into the existing 2D grading system, providing the additional third dimension to detect deformities in the herring roe that were not detected in the 2D analysis. Furthermore, the additional surface depth data will increase the accuracy of the weight information used in the existing grading system. In the proposed system, multiple laser light stripes are projected onto the herring roe and a single B/W CCD camera records the image of the scene. The distortion in the projected line pattern is due to the surface curvature and orientation. Utilizing the linear relation between the projected line distortion and surface depth, the range data is recovered from a single camera image. The measurement technique is described and the depth information is obtained through four steps: (1) image capture, (2) stripe extraction, (3) stripe coding, and (4) triangulation and system calibration. This depth information can then be converted into the curvature and orientation of the shape for deformity inspection, and also used for weight estimation. Preliminary results are included to show the feasibility and performance of our measurement technique. The accuracy and reliability of the computerized herring roe grading system can be greatly improved by integrating this system into the existing system in the future.

5.
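Scale estimation in 3D space is an important task in 3D reconstruction, and in practice there is also a need to estimate 3D scale from a single image. Scale estimation usually requires calibrating the camera first. Exploiting the fact that a monocular image obeys the principles of perspective, this paper proposes a camera calibration method based on two vanishing points and local scale information, from which the 3D scale of objects in a monocular image can be estimated. First, two mutually orthogonal groups of parallel lines are selected in the monocular image to obtain the coordinates of the two corresponding vanishing points. Next, the vanishing-point coordinates and the focal length are used to obtain the rotation matrix between the world coordinate system and the camera coordinate system, and the properties of the vanishing points together with the known local scale information are used to obtain the translation vector, which completes the calibration of the monocular camera. Finally, the 3D world coordinates corresponding to pixels in the 2D image are recovered, and the 3D distance between two pixels in the image is computed. Experimental results show that the method can effectively estimate the scale of buildings in a single image.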
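The rotation step of a two-vanishing-point calibration can be sketched in Python as follows. This is a minimal sketch under assumptions the abstract does not state: square pixels, zero skew, and a known principal point (for example the image centre). The abstract uses the focal length without saying how it is obtained; here it is estimated from the standard orthogonality constraint (v1 - c)·(v2 - c) = -f². The function name `calibrate_from_vanishing_points` is hypothetical, and the translation step, which needs the known local scale, is not covered.

```python
import math


def calibrate_from_vanishing_points(v1, v2, principal_point):
    """Estimate focal length and rotation from two orthogonal vanishing points.

    v1, v2: pixel coordinates of the vanishing points of two orthogonal
    world directions. Returns (f, (r1, r2, r3)), where r1, r2, r3 are the
    world X, Y, Z axis directions in camera coordinates, up to sign.
    """
    cx, cy = principal_point
    a = (v1[0] - cx, v1[1] - cy)
    b = (v2[0] - cx, v2[1] - cy)
    dot = a[0] * b[0] + a[1] * b[1]
    if dot >= 0:
        raise ValueError("vanishing points are not consistent with orthogonal directions")
    f = math.sqrt(-dot)  # orthogonality: a . b + f^2 = 0

    def unit(v):
        norm = math.sqrt(sum(c * c for c in v))
        return tuple(c / norm for c in v)

    # Back-project each vanishing point to a 3D direction in the camera frame.
    r1 = unit((a[0], a[1], f))
    r2 = unit((b[0], b[1], f))
    # The third axis is perpendicular to the first two.
    r3 = (
        r1[1] * r2[2] - r1[2] * r2[1],
        r1[2] * r2[0] - r1[0] * r2[2],
        r1[0] * r2[1] - r1[1] * r2[0],
    )
    return f, (r1, r2, r3)
```

Each recovered axis is only determined up to sign, because a vanishing point does not distinguish a direction from its opposite; a full pipeline would need an extra convention or scene knowledge to fix the signs.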

6.
Abstract. The image sequence in a video taken by a moving camera may suffer from irregular perturbations because of irregularities in the motion of the person or vehicle carrying the camera. We show how to use information in the image sequence to correct the effects of these irregularities so that the sequence is smoothed, i.e., is approximately the same as the sequence that would have been obtained if the motion of the camera had been smooth. Our method is based on the fact that the irregular motion is almost entirely rotational, and that the rotational image motion can be detected and corrected if a distant object, such as the horizon, is visible. Received: 14 February 2001 / Accepted: 11 February 2002 Correspondence to: A. Rosenfeld

7.
This paper introduces an accurate, efficient, and unified engine dedicated to dynamic animation of d-dimensional deformable objects. The objects are modelled as d-dimensional manifolds defined as functional combinations of a mesh of 3D control points, weighted by parametric blending functions. This model ensures that, at each time step, the object shape conforms to its manifold definition. The object motion is deduced from the dynamic animation of the control points. In fact, control points should be viewed as the degrees of freedom of the continuous object. The chosen dynamic equations (Lagrangian formalism) reflect this generic modelling scheme and yield an exact and computationally efficient linear system.

8.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation. This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used as a qualitative estimate of change in the motion direction, and correspondence of vertical segments provides precise 3D information about objects close to the robot. Assuming that detected linear edges in the scene are boundaries of planar surfaces, the 3D model of the scene is generated. The robot system is implemented and tested in a structured environment at our research center. Results from the robot navigation in real environments are presented and discussed. Received: 25 September 1996 / Accepted: 20 October 1996

9.
10.
The aim of the work reported here is the recovery, from a single image taken inside a roughly cylindrical brick sewer pipe of diameter up to one meter, of the pose of the camera relative to the central axis of the pipe. It is shown that the vanishing point associated with the longitudinal mortar lines carries valuable information about the pose. A method for the automatic detection of this point is presented and used to analyse the camera rotations underlying a number of sewer survey videos. It is similarly shown how the angles between the images of the longitudinal lines can be used to recover information about camera pose. The techniques might form an active part of a more comprehensive image understanding system recovering the three-dimensional shape of a surveyed pipe from survey videos and/or be used as an experimental tool during the design of such a system. Received: 24 June 1997 / Accepted: 17 March 1998

11.
Discovery of a perceptual distance function for measuring image similarity   Total citations: 3, self-citations: 0, citations by others: 3
For more than a decade, researchers have actively explored the area of image/video analysis and retrieval. Yet one fundamental problem remains largely unsolved: how to measure perceptual similarity between two objects. For this purpose, most researchers employ a Minkowski-type metric. Unfortunately, the Minkowski metric does not reliably find similarities in objects that are obviously alike. Through mining a large set of visual data, our team has discovered a perceptual distance function. We call the discovered function the dynamic partial function (DPF). When we empirically compare DPF to Minkowski-type distance functions in image retrieval and in video shot-transition detection using our image features, DPF performs significantly better. The effectiveness of DPF can be explained by similarity theories in cognitive psychology.
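The abstract above does not spell out how DPF is computed. The Python sketch below assumes the form in which DPF is usually described: a Minkowski-style sum taken over only the m smallest per-feature differences, so that a few wildly different features cannot dominate the distance. The function name `dpf_distance` and its parameters `m` and `r` are labels chosen for this sketch and should be read as assumptions.

```python
def dpf_distance(x, y, m, r=2):
    """Partial Minkowski-style distance over the m most similar features.

    x, y: equal-length feature vectors. m: number of smallest per-feature
    differences to keep (assumed DPF parameter). r: Minkowski exponent.
    With m == len(x) this reduces to the ordinary Minkowski distance.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if not 1 <= m <= len(x):
        raise ValueError("m must be between 1 and the number of features")
    deltas = sorted(abs(a - b) for a, b in zip(x, y))
    return sum(d ** r for d in deltas[:m]) ** (1.0 / r)


# Usage: the outlying third feature is ignored when m = 2.
partial = dpf_distance([0, 0, 0], [3, 4, 100], m=2)
full = dpf_distance([0, 0, 0], [3, 4, 100], m=3)
```

Which features are kept depends on the pair being compared, which is the "dynamic" part of the name; one consequence is that such a function need not satisfy the triangle inequality.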

12.
Abstract. This paper proposes a novel tracking strategy that can robustly track a person or other object within a fixed environment using a pan, tilt, and zoom camera with the help of a pre-recorded image database. We define a set of camera states which is sufficient to survey the environment for the target. Background images for these camera states are stored as an image database. During tracking, camera movements are restricted to these states. Tracking and segmentation are simplified, as each tracking image can be compared with the corresponding pre-recorded background image. Received: 26 August 1999 / Accepted: 22 February 2000

13.
We present two different approaches to the location and recovery of text in images of real scenes. The techniques we describe are invariant to the scale and 3D orientation of the text, and allow recovery of text in cluttered scenes. The first approach uses page edges and other rectangular boundaries around text to locate a surface containing text, and to recover a fronto-parallel view. This is performed using line detection, perceptual grouping, and comparison of potential text regions using a confidence measure. The second approach uses low-level texture measures with a neural network classifier to locate regions of text in an image. Then we recover a fronto-parallel view of each located paragraph of text by separating the individual lines of text and determining the vanishing points of the text plane. We illustrate our results using a number of images. Received May 20, 2001 / Accepted June 19, 2001

14.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances. The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.

15.
Detection, segmentation, and classification of specific objects are the key building blocks of a computer vision system for image analysis. This paper presents a unified model-based approach to these three tasks. It is based on using unsupervised learning to find a set of templates specific to the objects being outlined by the user. The templates are formed by averaging the shapes that belong to a particular cluster, and are used to guide a probabilistic search through the space of possible objects. The main difference from previously reported methods is the use of on-line learning, ideal for highly repetitive tasks. This results in faster and more accurate object detection, as system performance improves with continued use. Further, the information gained through clustering and user feedback is used to classify the objects for problems in which shape is relevant to the classification. The effectiveness of the resulting system is demonstrated in two applications: a medical diagnosis task using cytological images, and a vehicle recognition task. Received: 5 November 2000 / Accepted: 29 June 2001 Correspondence to: K.-M. Lee

16.
Abstract. For some multimedia applications, it has been found that domain objects cannot be represented as feature vectors in a multidimensional space. Instead, pair-wise distances between data objects are the only input. To support content-based retrieval, one approach maps each object to a k-dimensional (k-d) point and tries to preserve the distances among the points. Then, existing spatial access index methods such as the R-trees and KD-trees can support fast searching on the resulting k-d points. However, information loss is inevitable with such an approach since the distances between data objects can only be preserved to a certain extent. Here we investigate the use of a distance-based indexing method. In particular, we apply the vantage point tree (vp-tree) method. There are two important problems for the vp-tree method that warrant further investigation: the n-nearest neighbors search and the updating mechanisms. We study an n-nearest neighbors search algorithm for the vp-tree, which is shown by experiments to scale up well with the size of the dataset and the desired number of nearest neighbors, n. Experiments also show that searching in the vp-tree is more efficient than that for the -tree and the M-tree. Next, we propose solutions for the update problem for the vp-tree, and show by experiments that the algorithms are efficient and effective. Finally, we investigate the problem of selecting vantage points, propose a few alternative methods, and study their impact on the number of distance computations. Received June 9, 1998 / Accepted January 31, 2000
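To make the vp-tree idea in the abstract above concrete, here is a minimal Python sketch of a static vp-tree with an n-nearest-neighbours search. It is a generic textbook-style construction, not the paper's algorithm: vantage points are picked at random, the split radius is the median distance, and the update mechanisms discussed in the abstract are not covered. All names (`VPNode`, `build_vp_tree`, `nearest_neighbors`, `euclid`) are chosen for this sketch. Only a distance function is needed, which must satisfy the triangle inequality for the pruning to be valid.

```python
import heapq
import random


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build_vp_tree(points, dist, rng=None):
    """Recursively build a vp-tree from a list of points."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [dist(vantage, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d <= radius]
    outside = [p for p, d in zip(points, dists) if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))


def nearest_neighbors(root, query, n, dist):
    """Return the n nearest (distance, point) pairs, closest first."""
    if n <= 0:
        return []
    heap = []  # max-heap of the best n so far, stored as (-distance, tiebreak, point)
    tiebreak = 0

    def visit(node):
        nonlocal tiebreak
        if node is None:
            return
        d = dist(query, node.point)
        if len(heap) < n:
            heapq.heappush(heap, (-d, tiebreak, node.point))
            tiebreak += 1
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, tiebreak, node.point))
            tiebreak += 1
        # Search the side containing the query first.
        if d <= node.radius:
            near, far = node.inside, node.outside
        else:
            near, far = node.outside, node.inside
        visit(near)
        # By the triangle inequality, the far side can only hold a better
        # candidate if the current search ball reaches the split boundary.
        tau = -heap[0][0] if len(heap) == n else float("inf")
        if abs(d - node.radius) <= tau:
            visit(far)

    visit(root)
    return sorted((-neg_d, point) for neg_d, _, point in heap)


def euclid(a, b):
    """Example metric for 2-D points; any metric could be used instead."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
```

A typical call would be `nearest_neighbors(build_vp_tree(points, euclid), query, 5, euclid)`. The number of calls to `dist` is the natural cost measure here, which matches the abstract's focus on the number of distance computations.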

17.
Straight lines have to be straight   Total citations: 18, self-citations: 0, citations by others: 18
Most algorithms in 3D computer vision rely on the pinhole camera model because of its simplicity, whereas video optics, especially low-cost wide-angle or fish-eye lenses, generate a lot of non-linear distortion which can be critical. To find the distortion parameters of a camera, we use the following fundamental property: a camera follows the pinhole model if and only if the projection of every line in space onto the camera is a line. Consequently, if we find the transformation on the video image so that every line in space is viewed in the transformed image as a line, then we know how to remove the distortion from the image. The algorithm consists of first doing edge extraction on a possibly distorted video sequence, then doing polygonal approximation with a large tolerance on these edges to extract possible lines from the sequence, and then finding the parameters of our distortion model that best transform these edges to segments. Results are presented on real video images, compared with distortion calibration obtained by a full camera calibration method which uses a calibration grid. Received: 27 December 1999 / Accepted: 8 November 2000
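The principle in the abstract above (choose the distortion parameters that make images of straight lines straight again) can be sketched in Python with a deliberately simplified model. The single-coefficient radial model, the known distortion centre, the grid search, and every function name below are assumptions of this sketch; the paper's own distortion model and optimisation are not described in the abstract and may well differ.

```python
import math


def undistort(points, k, centre):
    """Apply a one-parameter radial correction: p -> c + (p - c) * (1 + k * r^2)."""
    cx, cy = centre
    corrected = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        scale = 1.0 + k * (dx * dx + dy * dy)
        corrected.append((cx + dx * scale, cy + dy * scale))
    return corrected


def straightness_error(points):
    """Sum of squared orthogonal distances to the best-fit line.

    This equals the smallest eigenvalue of the 2x2 scatter matrix of the
    points, and is zero exactly when the points are collinear.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    half_trace = (sxx + syy) / 2.0
    det = sxx * syy - sxy * sxy
    root = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return max(half_trace - root, 0.0)


def estimate_k(edge_chains, candidates, centre):
    """Pick the candidate k that makes all edge chains as straight as possible."""
    def total_error(k):
        return sum(straightness_error(undistort(chain, k, centre))
                   for chain in edge_chains)
    return min(candidates, key=total_error)
```

Here `edge_chains` stands for lists of edge points believed to come from straight scene lines, in coordinates normalised so that radii are of order one. A grid search keeps the sketch short; a practical implementation would more likely use a non-linear least-squares solver.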

18.
This paper presents a novel method for 3D camera calibration. The main objectives of this work are to calculate the focal length and the optical center of the camera. The proposed technique requires a single image having two vanishing points. A rectangular prism is employed as the calibration target to generate the vanishing points. The special arrangement of the calibration object improves the accuracy of the estimated intrinsic parameters. The vanishing points are found from the geometry of the perspective distortion of the prism edges in the image. From there, the picture plane and then the station point are fixed using the relations that are formulated. Experimental results of our method are compared with Zhang’s method. Results are tabulated to show the accuracy of the proposed approach.
Correspondence to: S. Murali

19.
This paper describes the design and implementation of a machine vision system CATALOG for detection and classification of some important internal defects in hardwood logs via analysis of computer axial tomography (CT or CAT) images. The defect identification and classification in CATALOG consists of two phases. The first phase comprises the segmentation of a single CT image slice, which results in the extraction of 2D defect-like regions from the CT image slice. The second phase comprises the correlation of the 2D defect-like regions across CT image slices in order to establish 3D support. The segmentation algorithm for a single CT image is a complex form of multiple-value thresholding that exploits both the prior knowledge of the wood structure within the log and the gray-level characteristics of the image. The algorithm for extraction of 2D defect-like regions in a single CT image first locates the pith of the log cross section, groups the pixels in the segmented image on the basis of their connectivity and classifies each 2D region as either a defect-like region or a defect-free region using shape, orientation and morphological features. Each 2D defect-like region is classified as a defect or non-defect via correlation across corresponding 2D defect-like regions in neighboring CT image slices. The 2D defect-like regions with adequate 3D support are labeled as true defects. The current version of CATALOG is capable of 3D reconstruction and rendering of the log and its internal defects from the individual CT image slices. CATALOG is also capable of simulation and rendering of key machining operations such as sawing and veneering on the 3D reconstructions of the logs. The current version of CATALOG is intended as a decision aid for sawyers and machinists in lumber mills and also as an interactive training tool for novice sawyers and machinists. Received: 1 August 1997 / Accepted: 25 August 1999

20.
Image feedback path tracking control using an uncalibrated CCD camera   Total citations: 2, self-citations: 0, citations by others: 2
Abstract. Image feedback path tracking (IFPT) control of a laser light point (LLP) using a CCD camera is studied in this paper. The tracking path and the LLP are assumed to be clearly focused in the scene, but no camera calibration is needed. A modified version of the thinning algorithm SPTA is proposed to skeletonize the path in a piecewise manner. The proposed thinning algorithm takes less computer time than the original SPTA and makes real-time skeletonization possible. Also included in the paper are the development of a control algorithm with image feedback to assure LLP tracking along the required path, and an experimental study to demonstrate how IFPT control can be realized in practice. Received: 10 November 1999 / Accepted: 9 March 2000
