Found 20 similar articles (search time: 296 ms)
1.
Farzin Mokhtarian 《Machine Vision and Applications》1997,10(3):87-97
A complete and practical system for occluded object recognition has been developed which is very robust with respect to noise
and local deformations of shape (due to weak perspective distortion, segmentation errors and non-rigid material) as well as
scale, position and orientation changes of the objects. The system has been tested on a wide variety of free-form 3D objects.
An industrial application is envisaged where a fixed camera and a light-box are utilized to obtain images. Within the constraints
of the system, every rigid 3D object can be modeled by a limited number of classes of 2D contours corresponding to the object's
resting positions on the light-box. The contours in each class are related to each other by a 2D similarity transformation.
The Curvature Scale Space technique [26, 28] is then used to obtain a novel multi-scale segmentation of the image and the model contours. Object indexing [16, 32, 36] is used to narrow down the search space. An efficient local matching algorithm is utilized to select the best
matching models.
Received: 5 August 1996 / Accepted: 19 March 1997
2.
Geometric fusion for a hand-held 3D sensor (Total citations: 2, self-citations: 0, citations by others: 2)
Abstract. This article presents a geometric fusion algorithm developed for the reconstruction of 3D surface models from hand-held sensor
data. Hand-held systems allow full 3D movement of the sensor to capture the shape of complex objects. Techniques previously
developed for reconstruction from conventional 2.5D range image data cannot be applied to hand-held sensor data. A geometric
fusion algorithm is introduced to integrate the measured 3D points from a hand-held sensor into a single continuous surface.
The new geometric fusion algorithm is based on the normal-volume representation of a triangle, which enables incremental transformation of an arbitrary mesh into an implicit volumetric field
function. This system is demonstrated for reconstruction of surface models from both hand-held sensor data and conventional
2.5D range images.
Received: 30 August 1999 / Accepted: 21 January 2000
3.
In this paper, we present a correlation scheme that incorporates a color ring-projection representation for the automatic
inspection of defects in textured surfaces. The proposed color ring projection transforms a 2-D color image into a 1-D color
pattern as a function of radius. For a search window of width W, data dimensionality is reduced from O(W^2) in the 2-D image to O(W) in the 1-D ring-projection space. The complexity of computing a correlation function is significantly reduced accordingly.
Since the color ring-projection representation is invariant to rotation, the proposed method can be applied for both isotropic
and oriented textures at arbitrary orientations. Experiments on regular textured surfaces have shown the efficacy of the proposed
method.
Received: 30 March 2000 / Accepted: 24 July 2001
Correspondence to: D.-M. Tsai (e-mail: iedmtsai@saturn.yzu.edu.tw)
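As a rough illustration of the ring-projection idea described in the abstract above, the following Python sketch averages each color channel over rings of integer radius around the window center, so a W x W window collapses to roughly W/2 values per channel. The function name `color_ring_projection`, the rounding of radii to integers, and the list-of-rows image layout are assumptions made for illustration; they are not details taken from the paper.

```python
import math

def color_ring_projection(window):
    """Map a square W x W color window to a 1-D pattern indexed by radius.

    `window` is a list of rows; each pixel is an (r, g, b) tuple.
    Entry k of the result is the mean color of all pixels whose rounded
    distance from the window center equals k.
    """
    w = len(window)
    c = (w - 1) / 2.0                      # window center
    max_r = w // 2                         # ignore the corners beyond this radius
    sums = [[0.0, 0.0, 0.0] for _ in range(max_r + 1)]
    counts = [0] * (max_r + 1)
    for y, row in enumerate(window):
        for x, pixel in enumerate(row):
            r = int(round(math.hypot(x - c, y - c)))
            if r <= max_r:
                counts[r] += 1
                for ch in range(3):
                    sums[r][ch] += pixel[ch]
    return [tuple(s / n for s in ring) if n else (0.0, 0.0, 0.0)
            for ring, n in zip(sums, counts)]
```

Because each ring collects pixels by distance from the center only, rotating the window about its center by 90 degrees leaves the pattern unchanged, which is the invariance the abstract relies on. For arbitrary angles on a discrete grid the invariance is only approximate.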
4.
Machine vision system for curved surface inspection (Total citations: 2, self-citations: 0, citations by others: 2)
Min-Fan Ricky Lee, Clarence W. de Silva, Elizabeth A. Croft, Q.M. Jonathan Wu 《Machine Vision and Applications》2000,12(4):177-188
This application-oriented paper discusses a non-contact 3D range data measurement system to improve the performance of the
existing 2D herring roe grading system. The existing system uses a single CCD camera with unstructured halogen lighting to
acquire and analyze the 2D shape of the herring roe for size and deformity grading. Our system will act as an
additional system module, which can be integrated into the existing 2D grading system, providing the additional third dimension
to detect deformities in the herring roe, which were not detected in the 2D analysis. Furthermore, the additional surface
depth data will increase the accuracy of the weight information used in the existing grading system. In the proposed system,
multiple laser light stripes are projected onto the herring roe and a single B/W CCD camera records the image of the scene.
The distortion in the projected line pattern is due to the surface curvature and orientation. Utilizing the linear relation
between the projected line distortion and surface depth, the range data was recovered from a single camera image.
The measurement technique is described and the depth information is obtained through four steps: (1) image capture, (2) stripe
extraction, (3) stripe coding, (4) triangulation, and system calibration. Then, this depth information can be converted into
the curvature and orientation of the shape for deformity inspection, and also used for the weight estimation.
Preliminary results are included to show the feasibility and performance of our measurement technique. The accuracy and reliability
of the computerized herring roe grading system can be greatly improved by integrating this system into the existing system in
the future.
5.
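Three-dimensional scale estimation is an important task in 3D reconstruction, and in practice there is also demand for estimating 3D scale from a single image. Scale estimation usually requires the camera to be calibrated first. Exploiting the fact that monocular images obey perspective projection, this paper proposes a camera calibration method based on two vanishing points and local scale information, from which the 3D scale of objects in a monocular image can be estimated. First, two mutually orthogonal groups of parallel lines are selected in the image to obtain the coordinates of the two corresponding vanishing points. Then, the rotation matrix between the world and camera coordinate systems is obtained from the vanishing point coordinates and the focal length, and the translation vector is obtained from the properties of the vanishing points and the known local scale information, which completes the calibration of the monocular camera. Finally, the 3D world coordinates corresponding to pixels in the 2D image are recovered, and the distance in 3D space between two pixels in the image is computed. Experimental results show that the method can effectively estimate the scale of buildings in a single image.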
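The rotation step in the abstract above can be sketched with the standard pinhole relations for vanishing points of orthogonal directions. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes square pixels, zero skew, and a known principal point `pp` (for example the image center), and the function names are hypothetical. The translation step, which needs the known local scale, is not covered.

```python
import math

def focal_from_vanishing_points(v1, v2, pp):
    """Focal length in pixels from the vanishing points of two orthogonal
    directions, using (v1 - pp) . (v2 - pp) + f^2 = 0."""
    dot = (v1[0] - pp[0]) * (v2[0] - pp[0]) + (v1[1] - pp[1]) * (v2[1] - pp[1])
    if dot >= 0:
        raise ValueError("vanishing points are inconsistent with orthogonal directions")
    return math.sqrt(-dot)

def rotation_from_vanishing_points(v1, v2, pp, f):
    """Columns (r1, r2, r3) of the world-to-camera rotation matrix.

    r1 and r2 are the back-projected vanishing point directions; r3 is their
    cross product. The sign of each axis is ambiguous and must be fixed by
    the application.
    """
    def unit(v):
        n = math.sqrt(sum(c * c for c in v))
        return tuple(c / n for c in v)

    r1 = unit((v1[0] - pp[0], v1[1] - pp[1], f))
    r2 = unit((v2[0] - pp[0], v2[1] - pp[1], f))
    r3 = (r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0])
    return r1, r2, unit(r3)
```

With noisy vanishing points and an independently measured focal length, r1 and r2 will not be exactly orthogonal, so a practical version would re-orthogonalize the result.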
6.
Abstract. The image sequence in a video taken by a moving camera may suffer from irregular perturbations because of irregularities
in the motion of the person or vehicle carrying the camera. We show how to use information in the image sequence to correct
the effects of these irregularities so that the sequence is smoothed, i.e., is approximately the same as the sequence that
would have been obtained if the motion of the camera had been smooth. Our method is based on the fact that the irregular motion
is almost entirely rotational, and that the rotational image motion can be detected and corrected if a distant object, such
as the horizon, is visible.
Received: 14 February 2001 / Accepted: 11 February 2002
Correspondence to: A. Rosenfeld
7.
This paper introduces an accurate, efficient, and unified engine dedicated to dynamic animation of d-dimensional deformable objects. The objects are modelled as d-dimensional manifolds defined as functional combinations of a mesh of 3D control points, weighted by parametric blending
functions. This model ensures that, at each time step, the object shape conforms to its manifold definitions. The object motion
is deduced from the dynamic animation of the control points. In fact, control points should be viewed as the degrees of freedom of
the continuous object. The chosen dynamic equations (Lagrangian formalism) reflect this generic modelling scheme and yield
an exact and computationally efficient linear system.
8.
We present an autonomous mobile robot navigation system using stereo fish-eye lenses for navigation in an indoor structured
environment and for generating a model of the imaged scene. The system estimates the three-dimensional (3D) position of significant
features in the scene, and by estimating its relative position to the features, navigates through narrow passages and makes
turns at corridor ends. Fish-eye lenses are used to provide a large field of view, which images objects close to the robot
and helps in making smooth transitions in the direction of motion. Calibration is performed for the lens-camera setup and
the distortion is corrected to obtain accurate quantitative measurements. A vision-based algorithm that uses the vanishing
points of extracted segments from a scene in a few 3D orientations provides an accurate estimate of the robot orientation.
This is used, in addition to 3D recovery via stereo correspondence, to maintain the robot motion in a purely translational
path, as well as to remove the effects of any drifts from this path from each acquired image. Horizontal segments are used
as a qualitative estimate of change in the motion direction, and correspondence of vertical segments provides precise 3D information
about objects close to the robot. Assuming detected linear edges in the scene as boundaries of planar surfaces, the 3D model
of the scene is generated. The robot system is implemented and tested in a structured environment at our research center.
Results from the robot navigation in real environments are presented and discussed.
Received: 25 September 1996 / Accepted: 20 October 1996
9.
10.
The aim of the work reported here is the recovery, from a single image taken inside a roughly cylindrical brick sewer pipe
of diameter up to one meter, of the pose of the camera relative to the central axis of the pipe. It is shown that the vanishing
point associated with the longitudinal mortar lines carries valuable information about the pose. A method for the automatic
detection of this point is presented and used to analyse the camera rotations underlying a number of sewer survey videos.
It is similarly shown how the angles between the images of the longitudinal lines can be used to recover information about
camera pose. The techniques might form an active part of a more comprehensive image understanding system recovering the three-dimensional
shape of a surveyed pipe from survey videos and/or be used as an experimental tool during the design of such a system.
Received: 24 June 1997 / Accepted: 17 March 1998
11.
For more than a decade, researchers have actively explored the area of image/video analysis and retrieval. Yet one fundamental
problem remains largely unsolved: how to measure perceptual similarity between two objects. For this purpose, most researchers
employ a Minkowski-type metric. Unfortunately, the Minkowski metric does not reliably find similarities in objects that are
obviously alike. Through mining a large set of visual data, our team has discovered a perceptual distance function. We call
the discovered function the dynamic partial function (DPF). When we empirically compare DPF to Minkowski-type distance functions in image retrieval and in video shot-transition
detection using our image features, DPF performs significantly better. The effectiveness of DPF can be explained by similarity theories in cognitive psychology.
12.
Yiming Ye, John K. Tsotsos, Eric Harley, Karen Bennet 《Machine Vision and Applications》2000,12(1):32-43
Abstract. This paper proposes a novel tracking strategy that can robustly track a person or other object within a fixed environment
using a pan, tilt, and zoom camera with the help of a pre-recorded image database. We define a set of camera states which
is sufficient to survey the environment for the target. Background images for these camera states are stored as an image database.
During tracking, camera movements are restricted to these states. Tracking and segmentation are simplified, as each tracking
image can be compared with the corresponding pre-recorded background image.
Received: 26 August 1999 / Accepted: 22 February 2000
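The comparison against a pre-recorded background described in the abstract above might look like the following sketch. The abstract does not say how the comparison is done; plain absolute differencing with a fixed threshold is an assumption here, as are the (pan, tilt, zoom) tuple used as the database key, the threshold value, and both function names.

```python
def segment_target(state, frame, background_db, threshold=30):
    """Return a binary mask of pixels that differ from the stored background.

    state         -- hypothetical (pan, tilt, zoom) key for the camera state
    frame         -- current grayscale image as a list of rows of ints
    background_db -- dict mapping each allowed camera state to its background
    threshold     -- assumed gray-level difference that counts as foreground
    """
    background = background_db[state]      # camera is restricted to stored states
    return [[abs(f - b) > threshold for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def target_centroid(mask):
    """Centroid (x, y) of the foreground pixels, or None if nothing differs."""
    xs = [x for row in mask for x, on in enumerate(row) if on]
    ys = [y for y, row in enumerate(mask) for on in row if on]
    if not xs:
        return None
    return sum(xs) / len(xs), sum(ys) / len(ys)
```

Restricting the camera to a fixed set of states is what makes the dictionary lookup sufficient: every frame has an exactly matching background, so no image registration is needed before differencing.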
13.
Paul Clark, Majid Mirmehdi 《International Journal on Document Analysis and Recognition》2002,4(4):243-257
We present two different approaches to the location and recovery of text in images of real scenes. The techniques we describe
are invariant to the scale and 3D orientation of the text, and allow recovery of text in cluttered scenes. The first approach
uses page edges and other rectangular boundaries around text to locate a surface containing text, and to recover a fronto-parallel
view. This is performed using line detection, perceptual grouping, and comparison of potential text regions using a confidence
measure. The second approach uses low-level texture measures with a neural network classifier to locate regions of text in
an image. Then we recover a fronto-parallel view of each located paragraph of text by separating the individual lines of text
and determining the vanishing points of the text plane. We illustrate our results using a number of images.
Received May 20, 2001 / Accepted June 19, 2001
14.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered
to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video
structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are
usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose
automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the
first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and
spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original
image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients
of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit
(quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances.
The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the
key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results
demonstrate the effectiveness and efficiency of our techniques.
15.
Detection, segmentation, and classification of specific objects are the key building blocks of a computer vision system for
image analysis. This paper presents a unified model-based approach to these three tasks. It is based on using unsupervised
learning to find a set of templates specific to the objects being outlined by the user. The templates are formed by averaging
the shapes that belong to a particular cluster, and are used to guide a probabilistic search through the space of possible
objects. The main difference from previously reported methods is the use of on-line learning, ideal for highly repetitive
tasks. This results in faster and more accurate object detection, as system performance improves with continued use. Further,
the information gained through clustering and user feedback is used to classify the objects for problems in which shape is
relevant to the classification. The effectiveness of the resulting system is demonstrated in two applications: a medical diagnosis
task using cytological images, and a vehicle recognition task.
Received: 5 November 2000 / Accepted: 29 June 2001
Correspondence to: K.-M. Lee
16.
Ada Wai-chee Fu, Polly Mei-shuen Chan, Yin-Ling Cheung, Yiu Sang Moon 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(2):154-173
Abstract. For some multimedia applications, it has been found that domain objects cannot be represented as feature vectors in a multidimensional
space. Instead, pair-wise distances between data objects are the only input. To support content-based retrieval, one approach
maps each object to a k-dimensional (k-d) point and tries to preserve the distances among the points. Then, existing spatial access index methods such as the R-trees
and KD-trees can support fast searching on the resulting k-d points. However, information loss is inevitable with such an approach since the distances between data objects can only
be preserved to a certain extent. Here we investigate the use of a distance-based indexing method. In particular, we apply
the vantage point tree (vp-tree) method. There are two important problems for the vp-tree method that warrant further investigation:
the n-nearest neighbors search and the updating mechanisms. We study an n-nearest neighbors search algorithm for the vp-tree, which is shown by experiments to scale up well with the size of the dataset
and the desired number of nearest neighbors, n. Experiments also show that the searching in the vp-tree is more efficient than that for the R*-tree and the M-tree. Next, we propose solutions for the update problem for the vp-tree, and show by experiments that the algorithms are
efficient and effective. Finally, we investigate the problem of selecting vantage points, propose a few alternative methods,
and study their impact on the number of distance computations.
Received June 9, 1998 / Accepted January 31, 2000
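To make the distance-based indexing idea in the abstract above concrete, here is a minimal vp-tree sketch with an n-nearest neighbors search. It is a generic textbook-style construction, not the algorithm evaluated in the paper: the random choice of vantage point, the median split, and all names (`VPNode`, `build_vp_tree`, `nearest_neighbors`) are assumptions, and the update mechanisms the paper studies are not covered. The only input the index uses is a distance function `dist`, which is assumed to satisfy the triangle inequality.

```python
import heapq
import random

class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point, self.radius = point, radius
        self.inside, self.outside = inside, outside

def build_vp_tree(points, dist, rng=None):
    """Split points recursively by their distance to a random vantage point."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    vp = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vp, 0.0, None, None)
    ds = [dist(vp, p) for p in points]
    radius = sorted(ds)[len(ds) // 2]      # median distance to the vantage point
    inside = [p for p, d in zip(points, ds) if d < radius]
    outside = [p for p, d in zip(points, ds) if d >= radius]
    return VPNode(vp, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))

def nearest_neighbors(root, query, n, dist):
    """Return the n closest points as (distance, point) pairs, nearest first."""
    if n <= 0:
        return []
    heap = []                              # max-heap on distance via negation
    visited = [0]                          # tie-breaker so points are never compared

    def visit(node):
        if node is None:
            return
        d = dist(query, node.point)
        entry = (-d, visited[0], node.point)
        visited[0] += 1
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, entry)
        if d < node.radius:
            near, far = node.inside, node.outside
        else:
            near, far = node.outside, node.inside
        visit(near)
        # Triangle inequality: every point on the far side is at least
        # |d - radius| away from the query, so skip it when that bound
        # already exceeds the current n-th best distance.
        tau = -heap[0][0] if len(heap) == n else float("inf")
        if abs(d - node.radius) <= tau:
            visit(far)

    visit(root)
    return sorted(((-nd, p) for nd, _, p in heap), key=lambda t: t[0])
```

For example, with 2-D tuples as points, `nearest_neighbors(build_vp_tree(pts, math.dist), q, 5, math.dist)` would return the five points of `pts` closest to `q` (this needs `import math` and Python 3.8 or later for `math.dist`).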
17.
Straight lines have to be straight (Total citations: 18, self-citations: 0, citations by others: 18)
Most algorithms in 3D computer vision rely on the pinhole camera model because of its simplicity, whereas video optics, especially
low-cost wide-angle or fish-eye lenses, generate a lot of non-linear distortion which can be critical. To find the distortion
parameters of a camera, we use the following fundamental property: a camera follows the pinhole model if and only if the projection
of every line in space onto the camera is a line. Consequently, if we find the transformation on the video image so that every
line in space is viewed in the transformed image as a line, then we know how to remove the distortion from the image. The
algorithm consists of first doing edge extraction on a possibly distorted video sequence, then doing polygonal approximation
with a large tolerance on these edges to extract possible lines from the sequence, and then finding the parameters of our
distortion model that best transform these edges to segments. Results are presented on real video images, compared with distortion
calibration obtained by a full camera calibration method which uses a calibration grid.
Received: 27 December 1999 / Accepted: 8 November 2000
18.
This paper presents a novel method for 3D camera calibration. Calculation of the focal length and the optical center of the
camera are the main objectives of this research work. The proposed technique requires a single image having two vanishing
points. A rectangular prism is employed as the calibration target to generate vanishing points. The special arrangement of
the calibration object adds more accuracy in finding the intrinsic parameters. Based on the geometry of the perspective distortion
of the edges of the prism in the image, vanishing points are found. Thereafter, the picture plane and then the station point
are fixed based on the relations that are formulated. Experimental results of our method are
compared with those of Zhang’s method. Results are tabulated to show the accuracy of the proposed approach.
Correspondence to: S. Murali
19.
CATALOG: a system for detection and rendering of internal log defects using computer tomography (Total citations: 1, self-citations: 0, citations by others: 1)
Suchendra M. Bhandarkar, Timothy D. Faust, Mengjin Tang 《Machine Vision and Applications》1999,11(4):171-190
This paper describes the design and implementation of a machine vision system CATALOG for detection and classification of
some important internal defects in hardwood logs via analysis of computer axial tomography (CT or CAT) images. The defect
identification and classification in CATALOG consists of two phases. The first phase consists of the segmentation of a single
CT image slice, which results in the extraction of 2D defect-like regions from the CT image slice. The second phase consists
of the correlation of the 2D defect-like regions across CT image slices in order to establish 3D support. The segmentation
algorithm for a single CT image is a complex form of multiple-value thresholding that exploits both the prior knowledge of
the wood structure within the log and the gray-level characteristics of the image. The algorithm for extraction of 2D defect-like
regions in a single CT image first locates the pith of the log cross section, groups the pixels in the segmented image on
the basis of their connectivity and classifies each 2D region as either a defect-like region or a defect-free region using
shape, orientation and morphological features. Each 2D defect-like region is classified as a defect or non-defect via correlation
across corresponding 2D defect-like regions in neighboring CT image slices. The 2D defect-like regions with adequate 3D support
are labeled as true defects. The current version of CATALOG is capable of 3D reconstruction and rendering of the log and its
internal defects from the individual CT image slices. CATALOG is also capable of simulation and rendering of key machining
operations such as sawing and veneering on the 3D reconstructions of the logs. The current version of CATALOG is intended
as a decision aid for sawyers and machinists in lumber mills and also as an interactive training tool for novice sawyers and
machinists.
Received: 1 August 1997 / Accepted: 25 August 1999
20.
Abstract. Image feedback path tracking (IFPT) control of a laser light point (LLP) using a CCD camera is studied in this paper. The
tracking path and the LLP are assumed clearly focused in the scene, but no camera calibration is needed. A modified version
of the thinning algorithm SPTA is proposed to skeletonize the path in a piecewise manner. The proposed thinning algorithm
takes less computer time than the original SPTA and makes real-time skeletonization possible. The paper also includes
the development of a control algorithm with image feedback to ensure LLP tracking along the required path, as well as
an experimental study to demonstrate how IFPT control can be realized in practice.
Received: 10 November 1999 / Accepted: 9 March 2000