Similar literature
Found 20 similar documents; search took 30 ms.
1.
The recovery of a three-dimensional (3-D) model from a sequence of two-dimensional (2-D) images is very useful in medical image analysis. Image sequences obtained from the relative motion between the object and the camera or the scanner contain more 3-D information than a single image. Methods to visualize the computed tomograms can be divided into two approaches: the surface rendering approach and the volume rendering approach. In this paper, a new surface rendering method using optical flow is proposed. Optical flow is the apparent motion in the image plane produced by the projection of real 3-D motion onto the 2-D image. The 3-D motion of an object can be recovered from the optical-flow field using additional constraints. By extracting the surface information from 3-D motion, it is possible to obtain an accurate 3-D model of the object. Both synthetic and real image sequences have been used to illustrate the feasibility of the proposed method. The experimental results suggest that the proposed method is suitable for the reconstruction of 3-D models from ultrasound medical images as well as other computed tomograms.

2.
The initial conception of a model-based analysis synthesis image coding (MBASIC) system is described and a construction method for a three-dimensional (3-D) facial model that includes synthesis methods for facial expressions is presented. The proposed MBASIC system is an image coding method that utilizes a 3-D model of the object which is to be reproduced. An input image is first analyzed and an output image using the 3-D model is then synthesized. A very low bit rate image transmission can be realized because the encoder sends only the required analysis parameters. Output images can be reconstructed without the noise corruption that reduces naturalness because the decoder synthesizes images from a similar 3-D model.

In order to construct a 3-D model of a person's face, a method is developed which uses a 3-D wire frame face model. A full-face image is then projected onto this wire frame model. For the synthesis of facial expressions, two different methods are proposed: a clip-and-paste method and a facial structure deformation method.


3.
Tracking a dynamic set of feature points (total citations: 5; self-citations: 0; citations by others: 5)
We address the problems of tracking a set of feature points over a long sequence of monocular images as well as how to include and track new feature points detected in successive frames. Due to the 3-D movement of the camera, different parts of the images exhibit different image motion. Tracking discrete features can therefore be decomposed into several independent and local problems. Accordingly, we propose a localized feature tracking algorithm. The trajectory of each feature point is described by a 2-D kinematic model. Then, to track a feature point, an interframe motion estimation scheme is designed to obtain the estimates of interframe motion parameters. Subsequently, using the estimates of motion parameters, corresponding points are identified to subpixel accuracy. Afterwards, the temporal information is processed to facilitate the tracking scheme. Since different feature points are tracked independently, the algorithm is able to handle the image motion arising from general 3-D camera movements. In addition to tracking feature points detected at the beginning, an efficient way to dynamically include new points extracted in subsequent frames is devised so that the information in a sequence is preserved. Experimental results for several image sequences are also reported.

4.
Efficient optical camera tracking in virtual sets (total citations: 2; self-citations: 0; citations by others: 2)
Optical tracking systems have become particularly popular in virtual studio applications, tending to replace electromechanical ones. However, optical systems are reported to be inferior in terms of accuracy in camera motion estimation. Moreover, marker-based approaches often cause problems in image/video compositing and impose undesirable constraints on camera movement. The present work introduces a novel methodology for the construction of a two-tone blue screen, which allows the localization of the camera in three-dimensional (3-D) space on the basis of the captured sequence. At the same time, a novel algorithm is presented for the extraction of the camera's 3-D motion parameters based on 3-D-to-two-dimensional (2-D) line correspondences. Simulated experiments have been included to illustrate the performance of the proposed system.

5.
Reconstruction of a 3-D face model from a single 2-D face image is fundamentally important for face recognition and animation because the 3-D face model is invariant to changes of viewpoint, illumination, background clutter, and occlusions. Given a coupled training set that contains pairs of 2-D faces and the corresponding 3-D faces, we train a novel coupled radial basis function network (C-RBF) to recover the 3-D face model from a single 2-D face image. The C-RBF network explores: 1) the intrinsic representations of 3-D face models and those of 2-D face images; 2) mappings between a 3-D face model and its intrinsic representation; and 3) mappings between a 2-D face image and its intrinsic representation. Since a particular face can be reconstructed by its nearest neighbors, we can assume that the linear combination coefficients for a particular 2-D face image reconstruction are identical to those for the corresponding 3-D face model reconstruction. Therefore, we can reconstruct a 3-D face model by using a single 2-D face image based on the C-RBF network. Extensive experimental results on the BU3D database indicate the effectiveness of the proposed C-RBF network for recovering the 3-D face model from a single 2-D face image.

6.
We present a strategy based on human gait to achieve efficient tracking, recovery of ego-motion and 3-D reconstruction from an image sequence acquired by a single camera attached to a pedestrian. In the first phase, the parameters of the human gait are established by a classical frame-by-frame analysis, using a generalized least squares (GLS) technique. The gait model is non-linear, represented by a truncated Fourier series. In the second phase, this gait model is employed within a “predict–correct” framework using a maximum a posteriori, expectation-maximization (MAP-EM) strategy to obtain robust estimates of the ego-motion and scene structure, while continuously refining the gait model. Experiments on synthetic and real image sequences show that the use of the gait model results in more efficient tracking. This is demonstrated by improved matching and retention of features, and a reduction in execution time, when processing video sequences.
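As a rough illustration of the truncated Fourier series model mentioned in this abstract, the sketch below fits such a series to a periodic signal. It makes two assumptions that are not from the abstract: the period is taken as known (which makes the fit linear in the coefficients), and ordinary least squares is used in place of the GLS technique the authors name. The function name `fit_truncated_fourier` is hypothetical.

```python
import numpy as np

def fit_truncated_fourier(t, y, period, n_harmonics):
    """Ordinary least-squares fit of a truncated Fourier series.

    Assumes the period is known; returns coefficients ordered as
    [a0, a1, b1, a2, b2, ...] for a0 + sum_k a_k cos(k w t) + b_k sin(k w t).
    """
    omega = 2.0 * np.pi / period
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        columns.append(np.cos(k * omega * t))
        columns.append(np.sin(k * omega * t))
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Synthetic periodic signal with a known offset and two harmonics.
t = np.linspace(0.0, 4.0, 200, endpoint=False)
w = 2.0 * np.pi  # period of 1.0
y = 0.5 + 2.0 * np.cos(w * t) - 1.0 * np.sin(2.0 * w * t)
coeffs = fit_truncated_fourier(t, y, period=1.0, n_harmonics=2)
```

If the period also had to be estimated, the problem would become non-linear and would need an iterative solver instead of a single `lstsq` call.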

7.
We describe a novel approach for creating a three-dimensional (3-D) face structure from multiple image views of a human face taken at a priori unknown poses by appropriately morphing a generic 3-D face. A cubic explicit polynomial in 3-D is used to morph a generic face into the specific face structure. The 3-D face structure allows for accurate pose estimation as well as the synthesis of virtual images to be matched with a test image for face identification. The estimation of a person's 3-D face structure and pose is achieved through the use of a distance map metric. This distance map residual error (geometric-based face classifier) and the image intensity residual error are fused in identifying a person in the database from one or more arbitrary image view(s). Experimental results are shown on simulated data in the presence of noise, as well as for images of real faces, and promising results are obtained.

8.
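Vehicle detection, tracking, and classification in video images (total citations: 2; self-citations: 1; citations by others: 2)
This paper introduces a method that captures traffic images with a single fixed camera and detects, tracks, and classifies vehicles from the image sequence. The method can be roughly divided into three parts: extracting the background image and segmenting the image; calibrating the camera based on the pinhole model and computing the perspective projection matrix; and matching and tracking by region features to build target chains. The 3-D information of the targets is then recovered, and vehicle types are classified by model matching. Experiments show that the method is simple and feasible.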

9.
View generation for three-dimensional scenes from video sequences (total citations: 1; self-citations: 0; citations by others: 1)
This paper focuses on the representation and view generation of three-dimensional (3-D) scenes. In contrast to existing methods that construct a full 3-D model or those that exploit geometric invariants, our representation consists of dense depth maps at several preselected viewpoints from an image sequence. Furthermore, instead of using multiple calibrated stationary cameras or range scanners, we derive our depth maps from image sequences captured by an uncalibrated camera with only approximately known motion. We propose an adaptive matching algorithm that assigns various confidence levels to different regions in the depth maps. Nonuniform bicubic spline interpolation is then used to fill in low confidence regions in the depth maps. Once the depth maps are computed at preselected viewpoints, the intensity and depth at these locations are used to reconstruct arbitrary views of the 3-D scene. Specifically, the depth maps are regarded as vertices of a deformable 2-D mesh, which are transformed in 3-D, projected to 2-D, and rendered to generate the desired view. Experimental results are presented to verify our approach.

10.
The author's goal is to generate a virtual space close to the real communication environment between network users or between humans and machines. There should be an avatar in cyberspace that projects the features of each user with a realistic texture-mapped face to generate facial expression and action controlled by a multimodal input signal. Users can also get a view in cyberspace through the avatar's eyes, so they can communicate with each other by gaze crossing. The face fitting tool from multi-view camera images is introduced to make a realistic three-dimensional (3-D) face model with texture and geometry very close to the original. This fitting tool is a GUI-based system using easy mouse operation to pick up each feature point on a face contour and the face parts, which can enable easy construction of a 3-D personal face model. When an avatar is speaking, the voice signal is essential in determining the mouth shape feature. Therefore, a real-time mouth shape control mechanism is proposed by using a neural network to convert speech parameters to lip shape parameters. This neural network can realize an interpolation between specific mouth shapes given as learning data. The emotional factor can sometimes be captured by speech parameters. This media conversion mechanism is described. For dynamic modeling of facial expression, a muscle structure constraint is introduced to produce natural facial expressions with few parameters. We also tried to obtain muscle parameters automatically from a local motion vector on the face calculated by the optical flow in a video sequence.

11.
In this paper, we present a complete system for the recognition and localization of a three-dimensional (3-D) model from a sequence of monocular images with known motion. The originality of this system is twofold. First, it uses a purely 3-D approach, starting from the 3-D reconstruction of the scene and ending with the 3-D matching of the model. Second, unlike most monocular systems, we do not use token tracking to match successive images. Rather, subpixel contour matching is used to recover complete 3-D contours more precisely. In contrast with the token tracking approaches, which yield a representation of the 3-D scene based on disconnected segments or points, this approach provides us with a denser and higher level representation of the scene. The reconstructed contours are fused along successive images using a simple result derived from the Kalman filter theory. The fusion process increases the localization precision and the robustness of the 3-D reconstruction. Finally, corners are extracted from the 3-D contours. They are used to generate hypotheses of the model position, using a hypothesize-and-verify algorithm that is described in detail. This algorithm yields a robust recognition and precise localization of the model in the scene. Results are presented on infrared image sequences with different resolutions, demonstrating the precision of the localization as well as the robustness and the low computational complexity of the algorithms.
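The abstract does not say which Kalman filter result is used to fuse the contours. One plausible reading, offered here only as an assumption, is the standard scalar measurement update, which combines two independent Gaussian estimates of the same quantity by inverse-variance weighting. The helper name `fuse` is hypothetical.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same scalar quantity.

    This is the scalar Kalman measurement update: the gain weights the
    new estimate by how uncertain the current one is.
    """
    gain = var_a / (var_a + var_b)
    mean = mean_a + gain * (mean_b - mean_a)
    var = (1.0 - gain) * var_a
    return mean, var

# Two equally uncertain estimates: the fused mean is the average,
# and the fused variance is halved.
fused_mean, fused_var = fuse(1.0, 1.0, 3.0, 1.0)
```

The fused variance is never larger than either input variance, which matches the abstract's remark that fusion increases localization precision.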

12.
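Yang Min (杨敏), 《电子工程师》 (Electronic Engineer), 2005, 31(12): 33-35
Camera calibration is the process of obtaining the geometric and optical parameters of a camera, together with its 3-D position and orientation in an external reference coordinate system. This paper calibrates a webcam using the pinhole model. The method is based on the correspondence between the known 3-D coordinates of reference points on a calibration object and the pixel coordinates of their projections in the image. It proceeds in two steps: first, the camera projection matrix is estimated with a linear model; then, the intrinsic and extrinsic camera parameters are decomposed from the projection matrix. Experiments and computations on images of a real calibration object gave good results.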
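The two-step procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the linear step is taken to be the standard direct linear transform (DLT) solved by SVD, the decomposition is done with an RQ factorization built from NumPy's QR, and the function names are hypothetical. At least six non-coplanar reference points are assumed.

```python
import numpy as np

def estimate_projection_matrix(world_pts, image_pts):
    """Step 1: linear (DLT) estimate of the 3x4 projection matrix."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def decompose_projection_matrix(P):
    """Step 2: split P ~ K [R | t] into intrinsics K and extrinsics R, t."""
    P = np.asarray(P, dtype=float)
    if np.linalg.det(P[:, :3]) < 0:
        P = -P  # P is defined up to sign; pick the one giving det(R) = +1
    M = P[:, :3]
    # RQ factorization M = K R via QR of the row-reversed, transposed matrix.
    flip = np.flipud(np.eye(3))
    q, r = np.linalg.qr((flip @ M).T)
    K = flip @ r.T @ flip
    R = flip @ q.T
    # Force a positive diagonal on K, compensating in R.
    signs = np.diag(np.sign(np.diag(K)))
    K, R = K @ signs, signs @ R
    t = np.linalg.solve(K, P[:, 3])
    return K / K[2, 2], R, t
```

With noise-free correspondences the decomposition recovers the camera exactly up to numerical precision; with real measurements, the linear estimate is commonly refined afterwards by minimizing reprojection error.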

13.
Three-dimensional (3-D) scene reconstruction from broadcast video is a challenging problem with many potential applications, such as 3-D TV, free-view TV, augmented reality or three-dimensionalization of two-dimensional (2-D) media archives. In this paper, a flexible and effective system capable of efficiently reconstructing 3-D scenes from broadcast video is proposed, with the assumption that there is relative motion between camera and scene/objects. The system requires no a priori information or input other than the video sequence itself, and is capable of estimating the internal and external camera parameters and performing a 3-D motion-based segmentation, as well as computing a dense depth field. The system also serves as a showcase to present some novel approaches to moving object segmentation and to sparse and dense reconstruction. According to the simulations for both synthetic and real data, the system achieves a promising performance for typical TV content, indicating that it is a significant step towards the 3-D reconstruction of scenes from broadcast video.

14.
In this paper, after an overview of the literature concerning the imaging technologies applied to skin wound assessment, we present an original approach to build 3-D models of skin wounds from color images. The method can deal with uncalibrated images acquired with a handheld digital camera with free zooming. Compared with the cumbersome imaging systems already proposed, this novel solution uses a low-cost and user-friendly image acquisition device suitable for widespread application in health care centers. However, this method entails the development of a robust image processing chain. An original iterative matching scheme is used to generate a dense estimation of the surface geometry from two widely separated views. The best configuration for taking photographs lies between 15° and 30° for the vergence angle. The metric reconstruction of the skin wound is fully automated through self-calibration. From the 3-D model of the skin wound, accurate volumetric measurements are achieved. The accuracy of the inferred 3-D surface is validated by registration to a ground truth and repetitive tests on volume. The global precision of around 3% is in accordance with the clinical requirement of 5% for assessing the healing process.

15.
The construction of an accurate 3-D scene model is a fundamental aspect of any model-based image coding scheme. This article describes the generation of a triangular facet surface representation from the data acquired by a calibrated binocular (stereo) camera system.

16.
The paper gives an overview of model-based approaches applied to image coding, by looking at image source models. In these model-based schemes, which are different from the various conventional waveform coding methods, the 3-D properties of the scenes are taken into consideration. They can achieve very low bit rate image transmission. The 2-D model and 3-D model based approaches are explained. Among them, a 3-D model based method using a 3-D facial model and a 2-D model based method utilizing 2-D deformable triangular patches are described. Works related to 3-D model-based coding of facial images and some of the remaining problems are also described.

17.
Image overlay projection is a form of augmented reality that allows surgeons to view underlying anatomical structures directly on the patient surface. It improves intuitiveness of computer-aided surgery by removing the need for sight diversion between the patient and a display screen and has been reported to assist in 3-D understanding of anatomical structures and the identification of target and critical structures. Challenges in the development of image overlay technologies for surgery remain in the projection setup. Calibration, patient registration, view direction, and projection obstruction remain unsolved limitations to image overlay techniques. In this paper, we propose a novel, portable, and handheld-navigated image overlay device based on miniature laser projection technology that allows images of 3-D patient-specific models to be projected directly onto the organ surface intraoperatively without the need for intrusive hardware around the surgical site. The device can be integrated into a navigation system, thereby exploiting existing patient registration and model generation solutions. The position of the device is tracked by the navigation system's position sensor and used to project geometrically correct images from any position within the workspace of the navigation system. The projector was calibrated using modified camera calibration techniques, and images for projection are rendered using a virtual camera defined by the projector's extrinsic parameters. Verification of the device's projection accuracy yielded a mean projection error of 1.3 mm. Visibility testing of the projection performed on pig liver tissue found the device suitable for the display of anatomical structures on the organ surface. The feasibility of use within the surgical workflow was assessed during open liver surgery. We show that the device could be quickly and unobtrusively deployed within the sterile environment.

18.
Information about camera operations such as zoom, focus, pan, tilt and dollying is significant not only for efficient video coding, but also for content-based video representation. In this paper we describe a high-precision camera operation parameter measurement system and apply it to inferring image motion. First, we outline the implemented system, which is designed to provide camera operation parameters with the high precision required for image coding applications. Second, we calibrate the camera lens to determine its exact optical properties. A pin-hole camera model with 2nd order radial lens distortion and a two-image calibration technique are employed. Finally, we use the pan, tilt and zoom parameters measured by the system to infer image motion. The experimental results show that the inferred motion coincides with the actual motion very closely. Compared to the motion analysis techniques that estimate camera motion from video sequences, our approach does not suffer from ambiguity, and thus can provide reliable and accurate global image motion. The obtained motion can be applied to image mosaicing, moving object segmentation, object-based image coding, etc.
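A pin-hole model with second-order radial distortion is commonly written as a perspective division followed by a scaling of the normalized coordinates by 1 + k1·r². The sketch below follows that common form; whether the paper uses exactly this parameterization is an assumption, and the function and parameter names (`project_pinhole_radial`, `focal`, `k1`, `cx`, `cy`) are illustrative only.

```python
def project_pinhole_radial(point_cam, focal, k1, cx=0.0, cy=0.0):
    """Project a 3-D point in camera coordinates to pixel coordinates.

    Applies the ideal pin-hole projection, then a radial distortion
    factor (1 + k1 * r^2) on the normalized image coordinates.
    """
    X, Y, Z = point_cam
    x, y = X / Z, Y / Z              # ideal normalized image coordinates
    r2 = x * x + y * y               # squared distance from the optical axis
    factor = 1.0 + k1 * r2           # second-order radial distortion term
    return focal * factor * x + cx, focal * factor * y + cy

# A point on the optical axis lands on the principal point regardless of k1.
center = project_pinhole_radial((0.0, 0.0, 2.0), focal=1000.0, k1=-0.1, cx=320.0, cy=240.0)
# An off-axis point is pushed outward when k1 is positive.
u, v = project_pinhole_radial((1.0, 0.0, 2.0), focal=1000.0, k1=0.1)
```

In this form a zoom operation corresponds to a change in `focal`, which is one reason measured zoom parameters can be turned directly into predicted image motion.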

19.
Fourier-based approaches for three-dimensional (3-D) reconstruction are based on the relationship between the 3-D Fourier transform (FT) of the volume and the two-dimensional (2-D) FT of a parallel-ray projection of the volume. The critical step in the Fourier-based methods is the estimation of the samples of the 3-D transform of the image from the samples of the 2-D transforms of the projections on the planes through the origin of Fourier space, and vice versa for forward-projection (reprojection). The Fourier-based approaches have the potential for very fast reconstruction, but their straightforward implementation might lead to unsatisfactory results if careful attention is not paid to interpolation and weighting functions. In our previous work, we have investigated optimal interpolation parameters for the Fourier-based forward and back-projectors for iterative image reconstruction. The optimized interpolation kernels were shown to provide excellent quality comparable to the ideal sinc interpolator. This work presents an optimization of interpolation parameters of the 3-D direct Fourier method with Fourier reprojection (3D-FRP) for fully 3-D positron emission tomography (PET) data with incomplete oblique projections. The reprojection step is needed for the estimation (from an initial image) of the missing portions of the oblique data. In the 3D-FRP implementation, we use the gridding interpolation strategy, combined with proper weighting approaches in the transform and image domains. We have found that while the 3-D reprojection step requires similar optimal interpolation parameters as found in our previous studies on Fourier-based iterative approaches, the optimal interpolation parameters for the main 3D-FRP reconstruction stage are quite different. Our experimental results confirm that for the optimal interpolation parameters a very good image accuracy can be achieved even without any extra spectral oversampling, which is a common practice to decrease errors caused by interpolation in Fourier reconstruction.
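The relationship this abstract builds on is the Fourier slice theorem. The sketch below checks its discrete 2-D analogue numerically: the 1-D FFT of an axis-aligned projection of an image equals the corresponding central line of the image's 2-D FFT. This is only a lower-dimensional illustration of the principle, not the 3D-FRP method; for oblique directions the slice falls between grid samples, which is where the interpolation discussed in the abstract comes in.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))

# Parallel-ray projection along the vertical axis (sum over rows).
projection = image.sum(axis=0)

# Central slice of the 2-D spectrum: zero vertical frequency.
slice_from_2d = np.fft.fft2(image)[0, :]

# 1-D spectrum of the projection.
slice_from_projection = np.fft.fft(projection)

# Largest discrepancy between the two; expected to be at rounding level.
max_error = np.max(np.abs(slice_from_2d - slice_from_projection))
```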

20.
Estimation of global motion parameters by complex linear regression (total citations: 1; self-citations: 0; citations by others: 1)
Global motion is very likely to occur in image sequence analysis. For example, it arises if the observer is moving during the sequence acquisition (ego-motion). Our aim is to obtain a simple method to estimate in a reliable way a set of parameters that can take into account the presence of a global motion component, using only local information. The novelty of our approach is in regarding spatial shift, change of scale, and rotation (corresponding to usual camera effects such as pan and zoom) as a two-dimensional (2-D) Doppler effect. The mathematical treatment is carried out in the complex plane, so that the results can be easily deduced as an extension of the one-dimensional (1-D) case; in this way, we obtain simple expressions, well suited for a practical realization of the estimate. The method has been experimentally validated on both real pictures with a synthetic motion and real image sequences.
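To see why the complex plane makes this problem simple, note that a shift, a scale change, and a rotation together map an image point z = x + iy to w = a·z + b, where |a| is the scale, arg(a) is the rotation angle, and b is the shift. The sketch below fits a and b by least squares from point correspondences. It is a generic illustration of complex linear regression, not the paper's Doppler-based estimator, and it assumes correspondences are already available; the function name is hypothetical.

```python
import numpy as np

def fit_complex_similarity(points_before, points_after):
    """Least-squares fit of w = a * z + b with complex a and b."""
    z = np.asarray(points_before, dtype=complex)
    w = np.asarray(points_after, dtype=complex)
    design = np.column_stack([z, np.ones_like(z)])
    (a, b), *_ = np.linalg.lstsq(design, w, rcond=None)
    return a, b

# Synthetic global motion: zoom by 1.1, rotate by 0.2 rad, shift by (3, -2).
rng = np.random.default_rng(1)
z = rng.normal(size=50) + 1j * rng.normal(size=50)
a_true = 1.1 * np.exp(0.2j)
b_true = 3.0 - 2.0j
w = a_true * z + b_true

a_est, b_est = fit_complex_similarity(z, w)
scale, rotation = abs(a_est), np.angle(a_est)
```

A single complex regression with two unknowns replaces a real-valued fit with four coupled parameters, which is the kind of simplification the abstract refers to.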
