Similar Documents
20 similar documents found (search time: 31 ms).
1.
Two novel systems computing dense three-dimensional (3-D) scene flow and structure from multiview image sequences are described in this paper. We do not assume rigidity of the scene motion, thus allowing for nonrigid motion in the scene. The first system, the integrated model-based system (IMS), assumes that each small local image region is undergoing 3-D affine motion. Nonlinear motion-model fitting based on both optical flow constraints and stereo constraints is then carried out on each local region in order to simultaneously estimate 3-D motion correspondences and structure. The second system, the extended gradient-based system (EGS), is a natural extension of two-dimensional (2-D) optical flow computation. In this method, a new hierarchical rule-based stereo matching algorithm is first developed to estimate the initial disparity map. Different constraints available under a multiview camera setup are further investigated and utilized in the proposed motion estimation. We use image segmentation information to adapt and maintain the motion and depth discontinuities. Within the EGS framework, we present two different formulations for 3-D scene flow and structure computation. One formulation assumes that the initial disparity map is accurate, while the other does not. Experimental results on both synthetic and real imagery demonstrate the effectiveness of our 3-D motion and structure recovery schemes. An empirical comparison between IMS and EGS is also reported.
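A minimal sketch of the local-region motion-model idea behind IMS, under simplifying assumptions: here a 3-D affine motion (matrix A and translation t) is fitted to hypothetical 3-D point correspondences by linear least squares, whereas the paper fits the model nonlinearly to optical-flow and stereo constraints without precomputed 3-D correspondences.

```python
import numpy as np

def fit_3d_affine_motion(P, Q):
    """Least-squares fit of a 3-D affine motion Q ~ A @ P + t.

    P, Q : (N, 3) arrays of corresponding 3-D points in two frames
    (hypothetical data; the paper estimates the model directly from
    optical-flow and stereo constraints instead).
    """
    N = P.shape[0]
    X = np.hstack([P, np.ones((N, 1))])          # (N, 4) homogeneous points
    # Solve X @ M = Q for M = [A | t] stacked row-wise.
    M, *_ = np.linalg.lstsq(X, Q, rcond=None)    # (4, 3)
    A, t = M[:3].T, M[3]
    return A, t

# Toy usage with a known rotation-plus-translation (a special affine motion).
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))
theta = 0.1
A_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
t_true = np.array([0.2, -0.1, 0.05])
Q = P @ A_true.T + t_true
A_est, t_est = fit_3d_affine_motion(P, Q)   # recovers A_true, t_true
```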

2.
A theory of the motion fields of curves   Cited by: 6 (self-citations: 6, other citations: 0)
This article reports a study of the motion field generated by moving 3-D curves that are observed by a camera. We first discuss the relationship between optical flow and motion field and show that the assumptions made in the computation of the optical flow are difficult to defend. We then go on to study the motion field of a general curve. We first study the general case of a curve moving nonrigidly and introduce the notion of isometric motion. In order to do this, we introduce the notion of the spatio-temporal surface and study its differential properties up to the second order. We show that, contrary to what is commonly believed, the full motion field of the curve (i.e., the component tangent to the curve) cannot be recovered from this surface. We also give the equations that characterize the spatio-temporal surface completely up to a rigid transformation. Those equations are the expressions of the first and second fundamental forms and the Gauss and Codazzi-Mainardi equations. We then relate those differential expressions computed on the spatio-temporal surface to quantities that can be computed from the image intensities. The actual values depend upon the choice of the edge detector. We then show that the hypothesis of a rigid 3-D motion in general allows the structure and the motion of the curve to be recovered, in fact without explicitly computing the tangential motion field, at the cost of introducing three-dimensional accelerations. We first study the motion field generated by the simplest kind of rigid 3-D curves, namely lines. This study is illuminating in that it paves the way for the study of general rigid curves and because of the useful results which are obtained. We then extend the results obtained in the case of lines to the case of general curves and show that at each point of the image curve two equations can be written relating the kinematic screw of the moving 3-D curve and its time derivative to quantities defined in the study of the general nonrigid motion that can be measured from the spatio-temporal surface and therefore from the image. This shows that the structure and the motion of the curve can be recovered from six image points only, without establishing any point correspondences. Finally, we study the cooperation between motion and stereo in the framework of this theory. The use of two cameras instead of one allows us to eliminate the three-dimensional accelerations, and the relations between the two spatio-temporal surfaces of the same rigidly moving 3-D curve can be used to help disambiguate stereo correspondences.

3.
Application of artificial neural networks in a three-dimensional cephalometric X-ray measurement system   Cited by: 2 (self-citations: 0, other citations: 2)
A method for reconstructing graphics and images in three-dimensional cephalometric X-ray measurement is presented. From 72 landmark points on frontal and lateral cephalometric radiographs, a three-dimensional perspective view of the craniomaxillofacial skeleton can be reconstructed. Using the LMBP algorithm for an artificial neural network, the 2-D projection onto a given plane of the 3-D coordinates of the patient's 72 craniomaxillofacial landmarks is taken as the desired output, and the 2-D coordinates, in the same plane, of the 72 corresponding landmarks on a craniomaxillofacial skeleton specimen are taken as the training samples; after training, an approximately linear relationship between the two is established. Because the network generalizes, it can be applied to transform every pixel of the specimen skeleton. The results show that the 72 transformed landmark coordinates agree well with the patient's, and the transformed image approximates the patient's craniomaxillofacial image, satisfying the requirements of orthodontic diagnosis while providing good visual quality.
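Since the abstract describes an approximately linear relationship learned between specimen and patient landmark projections, a minimal stand-in can be sketched with an ordinary least-squares affine mapping in place of the LMBP network; the 72-landmark arrays below are hypothetical placeholders, not the paper's data.

```python
import numpy as np

def fit_affine_2d(src, dst):
    """Fit dst ~ src @ W + b by least squares; src, dst are (72, 2) arrays
    of specimen and patient landmark projections (hypothetical data)."""
    X = np.hstack([src, np.ones((src.shape[0], 1))])   # homogeneous (72, 3)
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)        # (3, 2)
    return M[:2], M[2]                                  # W (2, 2), b (2,)

def apply_affine_2d(points, W, b):
    """Transform any specimen points (e.g. every pixel) into patient space."""
    return points @ W + b

# Hypothetical landmark sets standing in for the 72 cephalometric points.
rng = np.random.default_rng(1)
specimen = rng.uniform(0, 512, size=(72, 2))
patient = specimen @ np.array([[1.05, 0.02], [-0.03, 0.98]]) + np.array([4.0, -2.5])
W, b = fit_affine_2d(specimen, patient)
mapped = apply_affine_2d(specimen, W, b)    # agrees closely with `patient`
```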

4.
Motion field and optical flow: qualitative properties   Cited by: 7 (self-citations: 0, other citations: 7)
It is shown that the motion field, the 2-D vector field which is the perspective projection on the image plane of the 3-D velocity field of a moving scene, and the optical flow, defined as the estimate of the motion field which can be derived from the first-order variation of the image brightness pattern, are in general different, unless special conditions are satisfied. Therefore, dense optical flow is often ill-suited for computing structure from motion and for reconstructing the 3-D velocity field by algorithms which require a locally accurate estimate of the motion field. A different use of the optical flow is suggested. It is shown that the (smoothed) optical flow and the motion field can be interpreted as vector fields tangent to flows of planar dynamical systems. Stable qualitative properties of the motion field, which give useful information about the 3-D velocity field and the 3-D structure of the scene, can usually be obtained from the optical flow. The idea is supported by results from the theory of structural stability of dynamical systems.
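For reference (standard notation, not taken verbatim from the paper): with image brightness $E(x, y, t)$ and 2-D flow $\mathbf{v} = (u, v)$, the optical flow referred to here is the field consistent with the first-order brightness-change constraint

\[ \nabla E \cdot \mathbf{v} + \frac{\partial E}{\partial t} = 0, \]

which constrains only the flow component along the image gradient; this is one reason why the estimated flow and the true motion field coincide only under the special conditions discussed above.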

5.
This paper proposes a novel neural-network-based adaptive hybrid-reflectance three-dimensional (3-D) surface reconstruction model. The neural network automatically combines the diffuse and specular components into a hybrid model. The proposed model considers the characteristics of each point and the variant albedo to prevent the reconstructed surface from being distorted. The neural network inputs are the pixel values of the two-dimensional images to be reconstructed. The normal vectors of the surface can then be obtained from the output of the neural network after supervised learning, where the illuminant direction does not have to be known in advance. Finally, the obtained normal vectors are applied to enforce integrability when reconstructing 3-D objects. Facial images and images of other general objects were used to test the proposed approach. The experimental results demonstrate that the proposed neural-network-based adaptive hybrid-reflectance model can be successfully applied to objects generally, and perform 3-D surface reconstruction better than some existing approaches.
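As a rough illustration of what a hybrid-reflectance model combines (not the paper's learned, per-point combination), the sketch below mixes a Lambertian diffuse term with a Blinn-Phong-style specular term; the weights, shininess, and lighting here are hypothetical, whereas the paper lets the network learn the mix without a known illuminant direction.

```python
import numpy as np

def hybrid_reflectance(normals, light, view, w_d=0.7, w_s=0.3, shininess=20.0):
    """Toy hybrid model: intensity = w_d * diffuse + w_s * specular.

    normals : (N, 3) unit surface normals
    light, view : (3,) vectors toward the light source and the camera
    (all hypothetical; the paper's network learns the combination adaptively).
    """
    light = light / np.linalg.norm(light)
    view = view / np.linalg.norm(view)
    half = (light + view) / np.linalg.norm(light + view)   # Blinn-Phong half vector
    diffuse = np.clip(normals @ light, 0.0, None)
    specular = np.clip(normals @ half, 0.0, None) ** shininess
    return w_d * diffuse + w_s * specular

# Example: intensities of a few hypothetical surface normals.
n = np.array([[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]])
I = hybrid_reflectance(n, light=np.array([0.0, 0.3, 1.0]), view=np.array([0.0, 0.0, 1.0]))
```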

6.
A new approach for the interpretation of optical flow fields is presented. The flow field, which can be produced by a sensor moving through an environment with several independently moving, rigid objects, is allowed to be sparse, noisy, and partially incorrect. The approach is based on two main stages. In the first stage, the flow field is partitioned into connected segments of flow vectors, where each segment is consistent with a rigid motion of a roughly planar surface. In the second stage, segments are grouped under the hypothesis that they are induced by a single, rigidly moving object. Each hypothesis is tested by searching for three-dimensional (3-D) motion parameters which are compatible with all the segments in the corresponding group. Once the motion parameters are recovered, the relative environmental depth can be estimated as well. Experiments based on real and simulated data are presented.

7.
This paper is concerned with three-dimensional (3-D) analysis, and analysis-guided synthesis, of images showing 3-D motion of an observer relative to a scene. There are two objectives of the paper. First, it presents an approach to recovering 3-D motion and structure parameters from multiple cues present in a monocular image sequence, such as point features, optical flow, regions, lines, texture gradient, and vanishing line. Second, it introduces the notion that the cues that contribute the most to 3-D interpretation are also the ones that would yield the most realistic synthesis, thus suggesting an approach to analysis-guided 3-D representation. For concreteness, the paper focuses on flight image sequences of a planar, textured surface. The integration of information in these diverse cues is carried out using optimization. For reliable estimation, a sequential batch method is used to compute motion and structure. Synthesis is done by using (i) image attributes extracted from the image sequence, and (ii) simple, artificial image attributes which are not present in the original images. For display, real and/or artificial attributes are shown as a monocular or a binocular sequence. Performance evaluation is done through experiments with one synthetic sequence and two real image sequences digitized from a commercially available video tape and a laserdisc. The attribute-based representation of these sequences compressed their sizes by 502 and 367. The visualization sequence appears very similar to the original sequence in informal, monocular as well as stereo viewing on a workstation monitor.

8.
The accuracy and the dependence on parameters of a general scheme for the analysis of time-varying image sequences are discussed. The approach is able to produce vector fields from which it is possible to recover 3-D motion parameters such as time-to-collision and angular velocity. The numerical stability of the computed optical flow and the dependence of the recovery of 3-D motion parameters on spatial and temporal filtering is investigated. By considering optical flows computed on subsampled images or along single scanlines, it is also possible to recover 3-D motion parameters from reduced optical flows. An adequate estimate of time-to-collision can be obtained from sequences of images with spatial resolution reduced to 128×128 pixels or from sequences of single scanlines passing near the focus of expansion. The use of Kalman filtering increases the accuracy and the robustness of the estimation of motion parameters. The proposed approach seems to be able to provide not only a theoretical background but also practical tools that are adequate for the analysis of time-varying image sequences.
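One common way to obtain a rough time-to-collision estimate from a dense flow field (a textbook relation, not necessarily the filtering scheme used in the paper) is via the divergence of the flow: for approach toward a frontoparallel surface, div(v) ≈ 2/TTC. A minimal numpy sketch, assuming u and v are the horizontal and vertical flow components on a pixel grid:

```python
import numpy as np

def time_to_collision(u, v):
    """Estimate time-to-collision from flow divergence: div(v) ~ 2 / TTC
    for approach toward a frontoparallel surface (a simplifying assumption).

    u, v : (H, W) flow components in pixels per frame; TTC is returned in frames.
    """
    du_dx = np.gradient(u, axis=1)
    dv_dy = np.gradient(v, axis=0)
    divergence = np.mean(du_dx + dv_dy)   # average over the region of interest
    return 2.0 / divergence

# Synthetic diverging flow with a true TTC of 40 frames.
H, W = 128, 128
y, x = np.mgrid[0:H, 0:W]
xc, yc = x - W / 2, y - H / 2          # coordinates relative to the focus of expansion
ttc_true = 40.0
u, v = xc / ttc_true, yc / ttc_true
print(time_to_collision(u, v))          # close to 40
```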

9.
We present a novel approach to the coarse segmentation of tubular structures in three-dimensional (3-D) image data. Our algorithm, which requires only a few initial values and minimal user interaction, can be used to initialize complex deformable models and is based on an extension of the randomized Hough transform (RHT), a robust method for low-dimensional parametric object detection. Tubular structures are modeled as generalized cylinders. By means of a discrete Kalman filter, they are tracked through 3-D space. Our extensions to the RHT are a feature-adaptive selection of the sample size, expectation-dependent weighting of the input data, and a novel 3-D parameterization for straight elliptical cylinders. Experimental results obtained for 3-D synthetic as well as for 3-D medical images demonstrate the robustness of our approach w.r.t. image noise. We present the successful segmentation of tubular anatomical structures such as the aortic arch and the spinal cord.
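To illustrate the randomized Hough transform idea the algorithm builds on (using a 2-D circle rather than the paper's 3-D elliptical-cylinder parameterization), the sketch below repeatedly samples three edge points, solves for the circle they determine, and votes in a sparse accumulator; all names and parameters are illustrative only.

```python
import numpy as np
from collections import Counter

def circle_from_3_points(p1, p2, p3):
    """Center and radius of the circle through three 2-D points (None if collinear)."""
    A = 2.0 * np.array([p2 - p1, p3 - p1])
    b = np.array([p2 @ p2 - p1 @ p1, p3 @ p3 - p1 @ p1])
    if abs(np.linalg.det(A)) < 1e-9:
        return None
    c = np.linalg.solve(A, b)
    return c, np.linalg.norm(p1 - c)

def randomized_hough_circle(points, n_samples=2000, quant=2.0, seed=0):
    """Randomized Hough transform for one dominant circle in a 2-D point set.

    Votes are accumulated in a sparse dictionary keyed by quantized
    (cx, cy, r); the most-voted cell is returned (a simplified sketch)."""
    rng = np.random.default_rng(seed)
    votes = Counter()
    for _ in range(n_samples):
        idx = rng.choice(len(points), size=3, replace=False)
        result = circle_from_3_points(*points[idx])
        if result is None:
            continue
        c, r = result
        key = (round(c[0] / quant), round(c[1] / quant), round(r / quant))
        votes[key] += 1
    (cx, cy, r), _ = votes.most_common(1)[0]
    return np.array([cx, cy, r]) * quant

# Noisy points on a circle of radius 30 centered at (50, 60), plus clutter.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
circle_pts = np.stack([50 + 30 * np.cos(t), 60 + 30 * np.sin(t)], axis=1)
pts = np.vstack([circle_pts + rng.normal(0, 0.5, circle_pts.shape),
                 rng.uniform(0, 120, size=(50, 2))])
print(randomized_hough_circle(pts))   # approximately [50, 60, 30]
```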

10.
This paper presents a new fingerprint recognition method based on mel-frequency cepstral coefficients (MFCCs). In this method, cepstral features are extracted from a group of fingerprint images, which are first transformed to 1-D signals by lexicographic ordering. MFCCs and polynomial shape coefficients are extracted from these 1-D signals or their transforms to generate a database of features, which can be used to train a neural network. Fingerprint recognition can then be performed by extracting features from any new fingerprint image with the same method used in the training phase. These features are tested with the neural network. Different domains are tested and compared for efficient feature extraction from the lexicographically ordered 1-D signals. Experimental results show the success of the proposed cepstral method for fingerprint recognition at both low and high signal-to-noise ratios (SNRs). Results also show that the discrete cosine transform (DCT) is the most appropriate domain for feature extraction.
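A minimal sketch of the lexicographic-ordering-plus-transform step; the mel-filterbank/MFCC stage, the polynomial coefficients, and the neural network are omitted, and the DCT is used since the abstract reports it as the most effective domain. Function and parameter names are illustrative only.

```python
import numpy as np
from scipy.fft import dct

def cepstral_style_features(fingerprint_image, n_coeffs=32):
    """Turn a 2-D fingerprint image into a short feature vector.

    The image is lexicographically ordered into a 1-D signal, transformed
    with the DCT, and the lowest-order coefficients are kept (a simplified
    stand-in for the paper's MFCC/polynomial feature pipeline)."""
    signal = fingerprint_image.astype(float).ravel()     # lexicographic ordering
    signal = (signal - signal.mean()) / (signal.std() + 1e-9)
    coeffs = dct(signal, norm='ortho')
    return coeffs[:n_coeffs]                             # compact feature vector

# Hypothetical fingerprint image standing in for a database sample.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128))
features = cepstral_style_features(image)    # would feed the classifier / neural net
```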

11.
In this paper, we present a machine learning approach for subject-independent human action recognition using a depth camera, emphasizing the importance of depth in the recognition of actions. The proposed approach uses the flow information of all three dimensions to classify an action. In our approach, we obtain the 2-D optical flow and use it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space-time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion, as sketched below. These hierarchical fine-to-coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, referred to henceforth as PBL-McRBFN. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBL-McRBFN uses the sample overlapping conditions and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers, with every person and action represented in the training and testing datasets. The performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted with a leave-one-subject-out strategy and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked with the Video Analytics Lab (VAL) dataset and the Berkeley Multi-modal Human Action Database (MHAD).
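A sketch of the hierarchical fine-to-coarse averaging of 3-D motion mentioned above, assuming the per-pixel flow components (u, v, z) and a silhouette mask are already available; the grid sizes and names are illustrative, and the McRBFN classifier is omitted.

```python
import numpy as np

def hierarchical_flow_features(u, v, z, mask, grids=(8, 4, 2)):
    """Average 3-D motion (u, v, z flow) over coarse-to-fine grids on the silhouette.

    u, v, z : (H, W) flow components; mask : (H, W) boolean silhouette.
    Returns a 1-D feature vector concatenating the per-cell mean motion
    at each grid resolution (a simplified version of the described features)."""
    H, W = u.shape
    features = []
    for g in grids:
        ys = np.linspace(0, H, g + 1, dtype=int)
        xs = np.linspace(0, W, g + 1, dtype=int)
        for i in range(g):
            for j in range(g):
                cell = mask[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                for comp in (u, v, z):
                    block = comp[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                    features.append(block[cell].mean() if cell.any() else 0.0)
    return np.array(features)

# Hypothetical flow fields and silhouette for one frame pair.
rng = np.random.default_rng(0)
H, W = 240, 320
u, v, z = rng.normal(size=(3, H, W))
mask = np.zeros((H, W), dtype=bool)
mask[60:180, 100:220] = True
fv = hierarchical_flow_features(u, v, z, mask)   # input to the classifier
```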

12.
In this paper, we investigate a neural network with three-dimensional parameters for applications such as 3-D image processing, interpretation of 3-D transformations, and 3-D object motion. A 3-D vector represents a point in 3-D space, and an object can be represented by a set of these points. Thus, it is desirable to have a 3-D vector-valued neural network that deals with three signals as one cluster. In such a neural network, 3-D signals flow through the network and are the unit of learning. This article also presents a related 3-D back-propagation (3D-BP) learning algorithm, which is an extension of the conventional back-propagation algorithm in a single dimension. 3D-BP has an inherent ability to learn and generalize 3-D motion. The computational experiments presented in this paper evaluate the performance of the considered learning machine in the generalization of 3-D transformations and 3-D pattern recognition.
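A rough sketch of the forward pass of one 3-D vector-valued layer, where each connection weight is a 3×3 matrix acting on a 3-D signal; this is a plausible reading of the described network for illustration, not necessarily the authors' exact formulation, and all sizes are hypothetical.

```python
import numpy as np

def vector3_layer_forward(X, W, B):
    """Forward pass of one 3-D vector-valued layer.

    X : (n_in, 3)  input 3-D signals
    W : (n_out, n_in, 3, 3)  one 3x3 weight matrix per connection
    B : (n_out, 3)  3-D biases
    Each output neuron sums matrix-transformed 3-D inputs, then applies an
    element-wise nonlinearity (a simplified sketch of the 3-D network idea)."""
    pre = np.einsum('oikl,il->ok', W, X) + B   # (n_out, 3)
    return np.tanh(pre)

# Toy layer: 4 input vectors, 2 output vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W = rng.normal(scale=0.3, size=(2, 4, 3, 3))
B = np.zeros((2, 3))
Y = vector3_layer_forward(X, W, B)   # (2, 3) output 3-D signals
```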

13.
The authors present an iterative algorithm for the recovery of 2-D motion, i.e., for the determination of a transformation that maps one image onto another. The local ambiguity in measuring the motion of contour segments (the aperture problem) implies a reliance on measurements along the normal direction. Since the measured normal flow does not agree with the actual normal flow, the full flow recovered from this erroneous flow also possesses substantial error, and any attempt to recover the 3-D motion from such full flow fails. The proposed method is based on the observation that a polynomial approximation of the image flow provides sufficient information for 3-D motion computation. The use of an explicit flow model results in improved normal flow estimates through an iterative process. The authors discuss the adequacy and the convergence of the algorithm. The algorithm was tested on some synthetic and some simple natural time-varying images. The image flow recovered from this scheme is sufficiently accurate to be useful in 3-D structure and motion computation.
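A minimal sketch of the flow-model-fitting step: each normal-flow measurement constrains the full flow only along the gradient direction, n · v = b, and an affine (first-order polynomial) flow model can be recovered from many such constraints by least squares. The data here are synthetic, and the paper's iterative refinement loop is omitted.

```python
import numpy as np

def fit_affine_flow_from_normal_flow(xy, normals, b):
    """Recover affine flow u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y
    from normal-flow constraints n . (u, v) = b by least squares.

    xy : (N, 2) image positions, normals : (N, 2) unit gradient directions,
    b : (N,) measured normal-flow magnitudes (synthetic here)."""
    x, y = xy[:, 0], xy[:, 1]
    nx, ny = normals[:, 0], normals[:, 1]
    A = np.column_stack([nx, nx * x, nx * y, ny, ny * x, ny * y])
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a   # (a1, ..., a6)

# Synthetic example: a known affine flow observed only through its normal flow.
rng = np.random.default_rng(0)
a_true = np.array([0.5, 0.01, -0.02, -0.3, 0.03, 0.015])
xy = rng.uniform(-100, 100, size=(200, 2))
theta = rng.uniform(0, 2 * np.pi, 200)
normals = np.stack([np.cos(theta), np.sin(theta)], axis=1)
u = a_true[0] + a_true[1] * xy[:, 0] + a_true[2] * xy[:, 1]
v = a_true[3] + a_true[4] * xy[:, 0] + a_true[5] * xy[:, 1]
b = normals[:, 0] * u + normals[:, 1] * v
print(fit_affine_flow_from_normal_flow(xy, normals, b))   # recovers a_true
```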

14.
The main aim of this paper is to propose a new neural algorithm to perform a segmentation of an observed scene into regions corresponding to different moving objects, by analysing a time-varying image sequence. The method consists of a classification step, where the motion of small patches is recovered through an optimisation approach, and a segmentation step merging neighbouring patches characterised by the same motion. Classification of motion is performed without optical flow computation. Three-dimensional motion parameter estimates are obtained directly from the spatial and temporal image gradients by minimising an appropriate energy function with a Hopfield-like neural network. Network convergence is accelerated by integrating the quantitative estimation of the motion parameters with a qualitative estimate of dominant motion using the geometric theory of differential equations.

15.
Image and Vision Computing, 2001, 19(9-10): 669-678
Neural-network-based image techniques such as the Hopfield neural network have been proposed as an alternative approach for image segmentation and have demonstrated benefits over traditional algorithms. However, due to its architectural limitation, image segmentation using the traditional Hopfield neural network amounts to the same function as thresholding of image histograms. With this technique, high-level contextual information cannot be incorporated into the segmentation procedure. As a result, although the traditional Hopfield neural network is capable of segmenting noiseless images, it lacks noise robustness. In this paper, an innovative Hopfield neural network, called the contextual-constraint-based Hopfield neural cube (CCBHNC), is proposed for image segmentation. The CCBHNC uses a three-dimensional architecture with pixel classification implemented on its third dimension. With the three-dimensional architecture, the network is capable of taking into account each pixel's features and its surrounding contextual information. Besides the network architecture, the CCBHNC also differs from the original Hopfield neural network in that a competitive winner-take-all mechanism is imposed in the evolution of the network. The winner-take-all mechanism adeptly precludes the need to determine the values of the weighting factors for the hard constraints in the energy function in order to maintain feasible results. The proposed CCBHNC approach for image segmentation has been compared with two existing methods. The simulation results indicate that CCBHNC can produce more continuous and smoother images in comparison with the other methods.

16.
This paper presents an approach to understanding general 3-D motion of a rigid body from image sequences. Based on dynamics, a locally constant angular momentum (LCAM) model is introduced. The model is local in the sense that it is applied to a limited number of image frames at a time. Specifically, the model constrains the motion, over a local frame subsequence, to be a superposition of precession and translation. Thus, the instantaneous rotation axis of the object is allowed to change through the subsequence. The trajectory of the rotation center is approximated by a vector polynomial. The parameters of the model evolve in time so that they can adapt to long-term changes in motion characteristics. The nature and parameters of short-term motion can be estimated continuously with the goal of understanding motion through the image sequence. The estimation algorithm presented in this paper is linear, i.e., the algorithm consists of solving simultaneous linear equations. Based on the assumption that the motion is smooth, object positions and motion in the near future can be predicted, and short missing subsequences can be recovered. Noise smoothing is achieved by overdetermination and a least-squares criterion. The framework is flexible in the sense that it allows overdetermination in both the number of feature points and the number of image frames.

17.
A self-feedback Hopfield neural network image restoration algorithm based on continuous functions   Cited by: 1 (self-citations: 0, other citations: 1)
Building on an analysis of Hopfield neural network algorithms for image restoration, an improved fully parallel self-feedback algorithm based on continuous functions is proposed. The algorithm is used to restore images degraded by uniform linear motion blur, and the results are compared with images restored by Paik's method; the proposed method yields a significantly higher signal-to-noise ratio in the restored images and a faster restoration process.
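Since Hopfield-network restoration schemes of this kind minimize a quadratic energy of the form ||g - H f||^2 (plus regularization), a crude stand-in can be sketched as plain gradient descent on that energy for a uniform-linear-motion blur; this is a classical baseline for illustration, not the paper's self-feedback network, and the image, kernel length, and step size are hypothetical.

```python
import numpy as np
from scipy.signal import fftconvolve

def motion_blur_kernel(length=9):
    """Horizontal uniform-linear-motion blur PSF of the given length."""
    k = np.zeros((1, length))
    k[0, :] = 1.0 / length
    return k

def gradient_descent_restore(g, psf, n_iter=300, step=1.0):
    """Minimize ||g - h * f||^2 by gradient descent (a simple stand-in for the
    Hopfield-style energy minimization described in the abstract)."""
    f = g.copy()
    psf_flip = psf[::-1, ::-1]                     # adjoint = convolution with flipped PSF
    for _ in range(n_iter):
        residual = fftconvolve(f, psf, mode='same') - g
        grad = fftconvolve(residual, psf_flip, mode='same')
        f -= step * grad
    return np.clip(f, 0.0, 1.0)

# Hypothetical test: blur a synthetic image, then restore it.
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
psf = motion_blur_kernel(9)
blurred = fftconvolve(img, psf, mode='same') + rng.normal(0, 0.005, img.shape)
restored = gradient_descent_restore(blurred, psf)
```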

18.
In order to obtain three-dimensional ultrasound images of the heart, a scanner that is easy to operate is required. This paper describes a three-dimensional cardiac imaging method with motion based on orthogonal sectional images. These original sectional images may be acquired by a transesophageal approach using two miniature phased-array transducers mounted on the tip of a gastroscope, or by a simultaneous bi-frequency method using two mechanical scanning transducers placed on the thorax. The latter scanner is now commercially available. Images are recorded on video tape. Contour curves of the 2-D echocardiograms are extracted and reconstructed as a 3-D moving heart image. The cardiac volume curve versus time is also evaluated based on these three-dimensional data.

19.
The necessary and sufficient conditions that an object should satisfy so that its motion can be uniquely determined by a direct method are discussed. This direct method, based on the temporal-spatial gradient scheme, can estimate the three-dimensional (3-D) motion parameters of a rigid moving object from an image sequence by utilizing depth information of the object. It is shown that the 3-D motion cannot be uniquely determined only for eight kinds of objects with special geometric structure and surface pattern.

20.
A group theoretic approach to image representation and analysis is presented. The concept of a wavelet transform is extended to incorporate different types of groups. The wavelet approach is generalized to Lie groups that satisfy conditions of compactness and commutability, and to groups that are determined in a particular way by subgroups that satisfy these conditions. These conditions are fundamental to finding the invariance measure for the admissibility condition of a mother wavelet-type transform. The following special cases of interest in image representation and in biological and computer vision are discussed: 2-D and 3-D rigid motion, similarity and Lorentzian groups, and 2-D projective groups obtained from 3-D camera rotation. This research was supported by U.S.-Israel Binational Science Foundation grant 8800320, by the Franz Ollendorff Center of the Department of Electrical Engineering, and by the Fund for Promotion of Research at the Technion. J. Segman is a VATAT (Israel National Committee for Planning and Budgeting Universities) Fellow at the Technion.
