Similar articles
Found 20 similar articles; search took 15 ms
1.
One of the most interesting goals of computer vision is the 3D structure recovery of scenes. Traditionally, two cues are used: structure from motion and structure from stereo, two subfields with complementary sets of assumptions and techniques. This paper introduces a new general framework of cooperation between stereo and motion. This framework combines the advantages of both cues: (i) easy correspondence from motion and (ii) accurate 3D reconstruction from stereo. First, we show how the stereo matching can be recovered from motion correspondences using only geometric constraints. Second, we propose a method of 3D reconstruction of both binocular and monocular features using all stereo pairs in the case of a calibrated stereo rig. Third, we perform an analysis of the performance of the proposed framework as well as a comparison with an affine method. Experiments involving real and synthetic stereo pairs indicate that rich and reliable information can be derived from the proposed framework. They also indicate that robust 3D reconstruction can be obtained even with short image sequences.

2.
In computer vision, motion analysis is a fundamental problem. Applying the concepts of congruence checking in computational geometry and geometric hashing, which is a technique used for the recognition of partially occluded objects from noisy data, we present a new random sampling approach for the estimation of the motion parameters in two- and three-dimensional Euclidean spaces of both a completely measured rigid object and a partially occluded rigid object. We assume that the two- and three-dimensional positions of the vertices of the object in each image frame are determined using appropriate methods such as a range sensor or stereo techniques. We also analyze the relationships between the quantization errors and the errors in the estimation of the motion parameters by random sampling, and we show that the solutions obtained using our algorithm converge to the true solutions if the resolution of the digitization is increased.

3.
This paper presents an original method for analyzing, in an unsupervised way, images supplied by high resolution sonar. We aim at segmenting the sonar image into three kinds of regions: echo areas (due to the reflection of the acoustic wave on the object), shadow areas (corresponding to a lack of acoustic reverberation behind an object lying on the sea-bed), and sea-bottom reverberation areas. This unsupervised method estimates the parameters of noise distributions, modeled by a Weibull probability density function (PDF), and the label field parameters, modeled by a Markov random field (MRF). For the estimation step, we adopt a maximum likelihood technique for the noise model parameters and a least-squares method to estimate the MRF prior model. Then, in order to obtain an accurate segmentation map, we have designed a two-step process that finds the shadow and the echo regions separately, using the previously estimated parameters. First, we introduce a scale-causal and spatial model called SCM (scale causal multigrid), based on a multigrid energy minimization strategy, to find the shadow class. Second, we propose an MRF monoscale model using a priori information (at different levels of knowledge) based on physical properties of each region, which allows us to distinguish echo areas from sea-bottom reverberation. This technique has been successfully applied to real sonar images and is compatible with automatic processing of massive amounts of data.

4.
The role of perceptual organization in motion analysis has heretofore been minimal. In this work we present a simple but powerful computational model and associated algorithms based on the use of perceptual organizational principles, such as temporal coherence (or common fate) and spatial proximity, for motion segmentation. The computational model does not use the traditional frame-by-frame motion analysis; rather it treats an image sequence as a single 3D spatio-temporal volume. It endeavors to find organizations in this volume of data over three levels: signal, primitive, and structural. The signal level is concerned with detecting individual image pixels that are probably part of a moving object. The primitive level groups these individual pixels into planar patches, which we call the temporal envelopes. Compositions of these temporal envelopes describe the spatio-temporal surfaces that result from object motion. At the structural level, we detect these compositions of temporal envelopes by utilizing the structure and organization among them. The algorithms employed to realize the computational model include 3D edge detection, Hough transformation, and graph-based methods to group the temporal envelopes based on Gestalt principles. The significance of the Gestalt relationships between any two temporal envelopes is expressed in probabilistic terms. One of the attractive features of the adopted algorithm is that it does not require the detection of special 2D features or the tracking of these features across frames. We demonstrate that even with simple grouping strategies, we can easily handle drastic illumination changes, occlusion events, and multiple moving objects, without the use of training and specific object or illumination models. We present results on a large variety of motion sequences to demonstrate this robustness.

5.
For decades, there has been an intensive research effort in the Computer Vision community to deal with video sequences. In this paper, we present a new method for recovering as much information as possible on displacement and projection parameters in monocular video sequences without calibration. This work follows previous studies on particular cases of displacement, scene geometry, and camera analysis and focuses on the particular forms of homographic matrices. It is already known that the number of particular cases involved in a complete study precludes an exhaustive test. To lower the algorithmic complexity, some authors propose to decompose all possible cases into a hierarchical tree data structure but these works are still in development (T. Viéville and D. Lingrand, Internat. J. Comput. Vision 31, 1999, 5–29). In this paper, we propose a new way to deal with the huge number of particular cases: (i) we use simple rules in order to eliminate some redundant cases and some physically impossible cases, and (ii) we divide the cases into subsets corresponding to particular forms determined by simple rules leading to a computationally efficient discrimination method. Finally, some experiments were performed on image sequences acquired either using a robotic system or manually in order to demonstrate that when several models are valid, the model with the fewest parameters gives the best estimation, regarding the free parameters of the problem. The experiments presented in this paper show that even if the selected case is an approximation of reality, the method is still robust.

6.
This paper describes the mathematical basis and application of a probabilistic model for recovering the direction of camera translation (heading) from optical flow. According to the theorem that heading cannot lie between two converging points in a stationary environment, one can compute the posterior probability distribution of heading across the image and choose the heading with the maximum a posteriori (MAP) probability. The model requires very simple computation, provides a confidence level for its judgments, applies to both linear and curved trajectories, functions in the presence of camera rotations, and exhibits accuracy as high as 0.1°–0.2° in random dot simulations.

7.
We present a method for automatically estimating the motion of an articulated object filmed by two or more fixed cameras. We focus our work on the case where the quality of the images is poor, and where only an approximation of a geometric model of the tracked object is available. Our technique uses physical forces applied to each rigid part of a kinematic 3D model of the object we are tracking. These forces guide the minimization of the differences between the pose of the 3D model and the pose of the real object in the video images. We use a fast recursive algorithm to solve the dynamical equations of motion of any 3D articulated model. We explain the key parts of our algorithms: how relevant information is extracted from the images, how the forces are created, and how the dynamical equations of motion are solved. A study of what kind of information should be extracted in the images and of when our algorithms fail is also presented. Finally we present some results about the tracking of a person. We also show the application of our method to the tracking of a hand in sequences of images, showing that the kind of information to extract from the images depends on their quality and on the configuration of the cameras.

8.
We present a new single-chip texture classifier based on the cellular neural network (CNN) architecture. Exploiting the dynamics of a locally interconnected 2D cell array of CNNs we have developed a theoretically new method for texture classification and segmentation. This technique differs from other convolution-based feature extraction methods since we utilize feedback convolution, and we use a genetic learning algorithm to determine the optimal kernel matrices of the network. The CNN operators we have found for texture recognition may combine different early vision effects. We show how the kernel matrices can be derived from the state equations of the network for convolution/deconvolution and nonlinear effects. The whole process includes histogram equalization of the textured images, filtering with the trained kernel matrices, and decision-making based on average gray-scale or texture energy of the filtered images. We present experimental results using digital CNN simulation with sensitivity analysis for noise, rotation, and scale. We also report a tested application performed on a programmable 22 × 20 CNN chip with optical inputs and an execution time of a few microseconds. We have found that this CNN chip with a simple 3 × 3 CNN kernel can reliably classify four textures. Using more templates for decision-making, we believe that more textures can be separated and adequate texture segmentation (< 1% error) can be achieved.

9.
This paper describes the theory and algorithms of distance transform for fuzzy subsets, called fuzzy distance transform (FDT). The notion of fuzzy distance is formulated by first defining the length of a path on a fuzzy subset and then finding the infimum of the lengths of all paths between two points. The length of a path π in a fuzzy subset of the n-dimensional continuous space $\mathbb{R}^n$ is defined as the integral of fuzzy membership values along π. Generally, there are infinitely many paths between any two points in a fuzzy subset and it is shown that the shortest one may not exist. The fuzzy distance between two points is defined as the infimum of the lengths of all paths between them. It is demonstrated that, unlike in hard convex sets, the shortest path (when it exists) between two points in a fuzzy convex subset is not necessarily a straight line segment. For any positive number θ≤1, the θ-support of a fuzzy subset is the set of all points in $\mathbb{R}^n$ with membership values greater than or equal to θ. It is shown that, for any fuzzy subset, for any nonzero θ≤1, fuzzy distance is a metric for the interior of its θ-support. It is also shown that, for any smooth fuzzy subset, fuzzy distance is a metric for the interior of its 0-support (referred to as support). FDT is defined as a process on a fuzzy subset that assigns to a point its fuzzy distance from the complement of the support. The theoretical framework of FDT in continuous space is extended to digital cubic spaces and it is shown that for any fuzzy digital object, fuzzy distance is a metric for the support of the object. A dynamic programming-based algorithm is presented for computing FDT of a fuzzy digital object. It is shown that the algorithm terminates in a finite number of steps and when it does so, it correctly computes FDT. Several potential applications of fuzzy distance transform in medical imaging are presented. Among these are the quantification of blood vessels and trabecular bone thickness in the regime of limited spatial resolution where these objects become fuzzy.
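
To make the definition of FDT on a digital object concrete, the following is a minimal Python sketch, not the paper's dynamic programming algorithm. It rests on assumptions that the abstract does not state: a 2D grid with 8-adjacency, a link length between adjacent pixels equal to the average of their membership values times their Euclidean distance, and Dijkstra-style propagation from the zero-membership pixels. The function name `fuzzy_distance_transform` is hypothetical.

```python
import heapq
import math


def fuzzy_distance_transform(mu):
    """Approximate FDT of a 2D fuzzy object.

    mu: list of rows of membership values in [0, 1].
    Returns a grid holding, for each pixel, the smallest accumulated
    path length to a pixel outside the support (membership == 0).
    Assumption: link length = mean membership * Euclidean step length.
    """
    rows, cols = len(mu), len(mu[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    heap = []
    # Pixels in the complement of the support are the sources (distance 0).
    for r in range(rows):
        for c in range(cols):
            if mu[r][c] == 0:
                dist[r][c] = 0.0
                heapq.heappush(heap, (0.0, r, c))
    # 8-adjacency offsets (an assumption of this sketch).
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = 0.5 * (mu[r][c] + mu[nr][nc]) * math.hypot(dr, dc)
                nd = d + step
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist
```

If the grid contains no zero-membership pixel, every value stays at infinity, because this sketch does not treat the area outside the grid as background.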

10.
11.
Considering a checkpoint and communication pattern, the rollback-dependency trackability (RDT) property stipulates that there is no hidden dependency between local checkpoints. In other words, if there is a dependency between two checkpoints due to a noncausal sequence of messages (Z-path), then there exists a causal sequence of messages (C-path) that doubles the noncausal one and that establishes the same dependency. This paper introduces the notion of RDT-compliance. A property defined on Z-paths is RDT-compliant if the causal doubling of Z-paths having this property is sufficient to ensure RDT. Based on this notion, the paper provides examples of such properties. Moreover, these properties are visible, i.e., they can be tested on the fly. One of these properties is shown to be minimal with respect to visible and RDT-compliant properties. In other words, this property defines a minimal visible set of Z-paths that have to be doubled for the RDT property to be satisfied. Then, a family of communication-induced checkpointing protocols that ensure on-the-fly RDT properties is considered. Assuming processes take local checkpoints independently (called basic checkpoints), protocols of this family direct them to take on-the-fly additional local checkpoints (called forced checkpoints) in order that the resulting checkpoint and communication pattern satisfies the RDT property. The second contribution of this paper is a new communication-induced checkpointing protocol . This protocol, based on a condition derived from the previous characterization, tracks a minimal set of Z-paths and breaks those not perceived as being doubled. Finally, a set of communication-induced checkpointing protocols are derived from . Each of these derivations considers a particular weakening of the general condition used by . It is interesting to note that some of these derivations produce communication-induced checkpointing protocols that have already been proposed in the literature.

12.
A contribution to the automatic 3-D reconstruction of complex urban scenes from aerial stereo pairs is proposed. It consists of segmenting the scene into two different kinds of components: the ground and the above-ground objects. The above-ground objects are classified either as buildings or as vegetation. The idea is to define appropriate regions of interest in order to achieve a relevant 3-D reconstruction. For that purpose, a digital elevation model of the scene is first computed and segmented into above-ground regions using a Markov random field model. Then a radiometric analysis is used to classify above-ground regions as building or vegetation, leading to the determination of the final above-ground objects. The originality of the method is its ability to cope with extended above-ground areas, even in the case of a sloping ground surface. This characteristic is necessary in an urban environment. Results are very robust to image and scene variability, and they enable the utilization of appropriate local 3-D reconstruction algorithms.

13.
Let $R$ be a commutative ring with 1, let $R\langle X_1,\dots,X_n\rangle/I$ be the polynomial algebra in the $n \ge 4$ noncommuting variables $X_1,\dots,X_n$ over $R$ modulo the set of commutator relations $I=\{(X_1+\cdots+X_n)*X_i = X_i*(X_1+\cdots+X_n) \mid 1 \le i \le n\}$. Furthermore, let $G$ be an arbitrary group of permutations operating on the indeterminates $X_1,\dots,X_n$, and let $(R\langle X_1,\dots,X_n\rangle/I)^G$ be the $R$-algebra of $G$-invariant polynomials in $R\langle X_1,\dots,X_n\rangle/I$. The first part of this paper is about an algorithm, which computes a representation for any $f \in (R\langle X_1,\dots,X_n\rangle/I)^G$ as a polynomial in multilinear $G$-invariant polynomials, i.e., the maximal variable degree of the generators of $(R\langle X_1,\dots,X_n\rangle/I)^G$ is at most 1. The algorithm works for any ring $R$ and for any permutation group $G$. In addition, we present a bound for the number of necessary generators for the representation of all $G$-invariant polynomials in $(R\langle X_1,\dots,X_n\rangle/I)^G$ with a total degree of at most $d$. The second part contains a first but promising analysis of $G$-invariant polynomials of solvable polynomial rings.

14.
This paper presents a general information-theoretic approach for obtaining lower bounds on the number of examples required for Probably Approximately Correct (PAC) learning in the presence of noise. This approach deals directly with the fundamental information quantities, avoiding a Bayesian analysis. The technique is applied to several different models, illustrating its generality and power. The resulting bounds add logarithmic factors to (or improve the constants in) previously known lower bounds.

15.
A Continuous Probabilistic Framework for Image Matching (total citations: 1; self-citations: 0; citations by others: 1)
In this paper we describe a probabilistic image matching scheme in which the image representation is continuous and the similarity measure and distance computation are also defined in the continuous domain. Each image is first represented as a Gaussian mixture distribution and images are compared and matched via a probabilistic measure of similarity between distributions. A common probabilistic and continuous framework is applied to the representation as well as the matching process, ensuring an overall system that is theoretically appealing. Matching results are investigated and the application to an image retrieval system is demonstrated.

16.
An atomic representation of a Herbrand model (ARM) is a finite set of (not necessarily ground) atoms over a given Herbrand universe. Each ARM represents a possibly infinite Herbrand interpretation. This concept has emerged independently in different branches of computer science as a natural and useful generalization of the concept of finite Herbrand interpretation. It was shown that several recursively decidable problems on finite Herbrand models (or interpretations) remain decidable on ARMs. The following problems are essential when working with ARMs: deciding the equivalence of two ARMs, deciding subsumption between ARMs, and evaluating clauses over ARMs. These problems were shown to be decidable, but their computational complexity has remained obscure so far. The previously published decision algorithms require exponential space. In this paper, we prove that all mentioned problems are coNP-complete.

17.
Face Detection: A Survey (total citations: 5; self-citations: 0; citations by others: 5)
In this paper we present a comprehensive and critical survey of face detection algorithms. Face detection is a necessary first step in face recognition systems, with the purpose of localizing and extracting the face region from the background. It also has several applications in areas such as content-based image retrieval, video coding, video conferencing, crowd surveillance, and intelligent human–computer interfaces. However, it was not until recently that the face detection problem received considerable attention among researchers. The human face is a dynamic object and has a high degree of variability in its appearance, which makes face detection a difficult problem in computer vision. A wide variety of techniques have been proposed, ranging from simple edge-based algorithms to composite high-level approaches utilizing advanced pattern recognition methods. The algorithms presented in this paper are classified as either feature-based or image-based and are discussed in terms of their technical approach and performance. Due to the lack of standardized tests, we do not provide a comprehensive comparative evaluation, but in cases where results are reported on common datasets, comparisons are presented. We also give a presentation of some proposed applications and possible application areas.

18.
19.
It is often difficult to come up with a well-principled approach to the selection of low-level features for characterizing images for content-based retrieval. This is particularly true for medical imagery, where gross characterizations on the basis of color and other global properties do not work. An alternative for medical imagery consists of the “scattershot” approach that first extracts a large number of features from an image and then reduces the dimensionality of the feature space by applying a feature selection algorithm such as the Sequential Forward Selection method. This contribution presents a better alternative to initial feature extraction for medical imagery. The proposed new approach consists of (i) eliciting from the domain experts (physicians, in our case) the perceptual categories they use to recognize diseases in images; (ii) applying a suite of operators to the images to detect the presence or the absence of these perceptual categories; (iii) ascertaining the discriminatory power of the perceptual categories through statistical testing; and, finally, (iv) devising a retrieval algorithm using the perceptual categories. In this paper we will present our proposed approach for the domain of high-resolution computed tomography (HRCT) images of the lung. Our empirical evaluation shows that feature extraction based on physicians' perceptual categories achieves significantly higher retrieval precision than the traditional scattershot approach. Moreover, the use of perceptually based features gives the system the ability to provide an explanation for its retrieval decisions, thereby instilling more confidence in its users.
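
For readers unfamiliar with the Sequential Forward Selection step that the scattershot baseline relies on, here is a minimal Python sketch of the generic greedy procedure. It is not taken from the paper: the function name, the `score` callback, and the stopping rule (a fixed number `k` of features) are all assumptions made for illustration.

```python
def sequential_forward_selection(features, score, k):
    """Greedy feature selection.

    features: candidate feature identifiers.
    score: hypothetical callback mapping a list of features to a number
           (higher is better), e.g. a cross-validated retrieval precision.
    k: number of features to keep (assumed stopping rule).
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Add the single feature that most improves the score of the
        # currently selected subset.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each round only ever adds one feature and never removes one, the result depends on the order in which good features are discovered, which is one reason a greedy search over a large scattershot feature pool can miss the best subset.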

20.
The distance transform (DT) is an image computation tool which can be used to extract the information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity for computing the distance transform is fully dependent on the distance metric used. In particular, the more exact the distance transform is, the longer the execution time will be. Nowadays, quite often thousands of images are processed in a limited time. It seems quite impossible for a sequential computer to do such a computation for the distance transform in real time. In order to provide efficient distance transform computation, it is highly desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide an $O(N^2)$ time sequential algorithm to compute the chessboard distance transform (CDT) of an $N \times N$ image, which is a DT using the chessboard distance metric. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of size $N \times N$ can be computed in $O(\log N)$ time on the EREW PRAM model using $O(N^2/\log N)$ processors, $O(\log\log N)$ time on the CRCW PRAM model using $O(N^2/\log\log N)$ processors, and $O(\log N)$ time on the hypercube computer using $O(N^2/\log N)$ processors. Following the mapping as proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of size $N \times N$ can be computed in $O(\log N)$ time on the EREW PRAM model using $O(N^2/\log N)$ processors, $O(\log\log N)$ time on the CRCW PRAM model using $O(N^2/\log\log N)$ processors, and $O(\log N)$ time on the hypercube computer using $O(N^2/\log N)$ processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only increase (add-one) operation and minimum operation are employed. So, the algorithms are especially efficient in practical applications.
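
As an illustration of a sequential chessboard distance transform that runs in time proportional to the number of pixels, the sketch below uses the classical two-pass raster scan with a 3 × 3 neighbourhood. This is a swapped-in textbook technique, not the diagonal propagation algorithm of the paper. It follows the convention stated in the abstract, where each output value is the distance to the nearest foreground pixel. Like the algorithm in the abstract, it uses only add-one and minimum operations. The function name is hypothetical.

```python
def chessboard_distance_transform(image):
    """Two-pass chessboard (L-infinity) distance transform.

    image: list of rows; truthy entries are foreground pixels.
    Returns, for every pixel, the chessboard distance to the nearest
    foreground pixel. If there is no foreground pixel, every entry
    keeps the sentinel value rows + cols.
    """
    rows, cols = len(image), len(image[0])
    big = rows + cols  # exceeds any chessboard distance inside the grid
    dist = [[0 if image[r][c] else big for c in range(cols)]
            for r in range(rows)]

    def relax(r, c, offsets):
        # Only add-one and minimum operations are needed.
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                dist[r][c] = min(dist[r][c], dist[nr][nc] + 1)

    # Forward pass: propagate from neighbours already visited
    # (upper row and left neighbour).
    for r in range(rows):
        for c in range(cols):
            relax(r, c, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))
    # Backward pass: propagate from the lower row and right neighbour.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            relax(r, c, ((1, 1), (1, 0), (1, -1), (0, 1)))
    return dist
```

Each pixel is visited twice with a constant number of neighbour checks, so an N × N image costs on the order of N² operations, matching the sequential bound quoted above.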
