首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
This paper presents a new strategy to extract knowledge about the objects and their relative location in a complex scene when a single range image is taken. The analysis process is based on a range data distributed segmentation technique, which separates the components of the scene, and on a silhouette segmentation method, which classified the silhouette in real (non occluded) and false (occluded) parts. Finally, an occlusion graph provides a compact representation about the layout and relationship of the objects in the scene. This information is essential before higher level tasks in complex scenes – like recognition, understanding and robot interaction – are carried out. An extensive experimentation has been accomplished under real conditions in scenes of up to 12 objects yielding a very good performance. The experiments and results carried out validate the goodness of this approach in 3D environments.  相似文献   

2.
In this paper, we present a novel approach to recover a 3D human pose in real-time from a single depth image using principal direction analysis (PDA). Human body parts are first recognized from a human depth silhouette via trained random forests (RFs). PDA is applied to each recognized body part, which is presented as a set of points in 3D, to estimate its principal direction. Finally, a 3D human pose is recovered by mapping the principal direction to each body part of a 3D synthetic human model. We perform both quantitative and qualitative evaluations of our proposed 3D human pose recovering methodology. We show that our proposed approach has a low average reconstruction error of 7.07 degrees for four key joint angles and performs more reliably on a sequence of unconstrained poses than conventional methods. In addition, our methodology runs at a speed of 20 FPS on a standard PC, indicating that our system is suitable for real-time applications. Our 3D pose recovery methodology is applicable to applications ranging from human computer interactions to human activity recognition.  相似文献   

3.
利用一种基于法线的模型变形方法,从单张图像重建高质量的三维人脸.利用球谐函数和一个初始参考模型计算得到模型上每个顶点的法线,利用法线使参考模型变形.实验结果表明:提出的算法可以从单幅图像重建具有细节的高质量三维人脸.  相似文献   

4.
In this paper, we address the challenging problem of recovering the defocus map from a single image. We present a simple yet effective approach to estimate the amount of spatially varying defocus blur at edge locations. The input defocused image is re-blurred using a Gaussian kernel and the defocus blur amount can be obtained from the ratio between the gradients of input and re-blurred images. By propagating the blur amount at edge locations to the entire image, a full defocus map can be obtained. Experimental results on synthetic and real images demonstrate the effectiveness of our method in providing a reliable estimation of the defocus map.  相似文献   

5.
Guo  Xinru  Xu  Song  Lin  Xiangbo  Sun  Yi  Ma  Xiaohong 《Pattern Analysis & Applications》2022,25(1):157-167
Pattern Analysis and Applications - Based on the disentanglement representation learning theory and the cross-modal variational autoencoder (VAE) model, we derive a “Single Input Multiple...  相似文献   

6.
Human faces are remarkably similar in global properties, including size, aspect ratio, and location of main features, but can vary considerably in details across individuals, gender, race, or due to facial expression. We propose a novel method for 3D shape recovery of faces that exploits the similarity of faces. Our method obtains as input a single image and uses a mere single 3D reference model of a different person's face. Classical reconstruction methods from single images, i.e., shape-from-shading, require knowledge of the reflectance properties and lighting as well as depth values for boundary conditions. Recent methods circumvent these requirements by representing input faces as combinations (of hundreds) of stored 3D models. We propose instead to use the input image as a guide to "mold" a single reference model to reach a reconstruction of the sought 3D shape. Our method assumes Lambertian reflectance and uses harmonic representations of lighting. It has been tested on images taken under controlled viewing conditions as well as on uncontrolled images downloaded from the Internet, demonstrating its accuracy and robustness under a variety of imaging conditions and overcoming significant differences in shape between the input and reference individuals including differences in facial expressions, gender, and race.  相似文献   

7.
8.
9.

Due to severe articulation, self-occlusion, various scales, and high dexterity of the hand, hand pose estimation is more challenging than body pose estimation. Recently-developed body pose estimation algorithms are not suitable for addressing the unique challenges of hand pose estimation because they are trained without explicitly modeling structural relationships between keypoints. In this paper, we propose a novel cascaded hierarchical CNN(CH-HandNet) for 2D hand pose estimation from a single color image. The CH-HandNet includes three modules, hand mask segmentation, preliminary 2D hand pose estimation, and hierarchical estimation. The first module obtains a hand mask by hand mask segmentation network. The second module connects the hand mask and the intermediate image features to estimate the 2D hand heatmaps. The last module connects hand heatmaps with the intermediate image features and hand mask to estimate finger and palm heatmaps hierarchically. Finally, the extracted Finger(pinky,ring,middle,index) and Palm(thumb and palm) feature information are fused to estimate 2D hand pose. Experimental results on three datasets - OneHand 10k, Panoptic, and Eric.Lee, consistently shows that our proposed CH-HandNet outperforms previous state-of-the-art hand pose estimation methods.

  相似文献   

10.
This paper presents a 2D to 3D conversion scheme to generate a 3D human model using a single depth image with several color images. In building a complete 3D model, no prior knowledge such as a pre-computed scene structure and photometric and geometric calibrations is required since the depth camera can directly acquire the calibrated geometric and color information in real time. The proposed method deals with a self-occlusion problem which often occurs in images captured by a monocular camera. When an image is obtained from a fixed view, it may not have data for a certain part of an object due to occlusion. The proposed method consists of following steps to resolve this problem. First, the noise in a depth image is reduced by using a series of image processing techniques. Second, a 3D mesh surface is constructed using the proposed depth image-based modeling method. Third, the occlusion problem is resolved by removing the unwanted triangles in the occlusion region and filling the corresponding hole. Finally, textures are extracted and mapped to the 3D surface of the model to provide photo-realistic appearance. Comparison results with the related work demonstrate the efficiency of our method in terms of visual quality and computation time. It can be utilized in creating 3D human models in many 3D applications.  相似文献   

11.
Automatic estimation and removal of noise from a single image   总被引:1,自引:0,他引:1  
Image denoising algorithms often assume an additive white Gaussian noise (AWGN) process that is independent of the actual RGB values. Such approaches are not fully automatic and cannot effectively remove color noise produced by todays CCD digital camera. In this paper, we propose a unified framework for two tasks: automatic estimation and removal of color noise from a single image using piecewise smooth image models. We introduce the noise level function (NLF), which is a continuous function describing the noise level as a function of image brightness. We then estimate an upper bound of the real noise level function by fitting a lower envelope to the standard deviations of per-segment image variances. For denoising, the chrominance of color noise is significantly removed by projecting pixel values onto a line fit to the RGB values in each segment. Then, a Gaussian conditional random field (GCRF) is constructed to obtain the underlying clean image from the noisy input. Extensive experiments are conducted to test the proposed algorithm, which is shown to outperform state-of-the-art denoising algorithms.  相似文献   

12.
Reconstructing dynamic scenes with commodity depth cameras has many applications in computer graphics,computer vision,and robotics.However,due to the presence o...  相似文献   

13.
We propose a probabilistic formulation of joint silhouette extraction and 3D reconstruction given a series of calibrated 2D images. Instead of segmenting each image separately in order to construct a 3D surface consistent with the estimated silhouettes, we compute the most probable 3D shape that gives rise to the observed color information. The probabilistic framework, based on Bayesian inference, enables robust 3D reconstruction by optimally taking into account the contribution of all views. We solve the arising maximum a posteriori shape inference in a globally optimal manner by convex relaxation techniques in a spatially continuous representation. For an interactively provided user input in the form of scribbles specifying foreground and background regions, we build corresponding color distributions as multivariate Gaussians and find a volume occupancy that best fits to this data in a variational sense. Compared to classical methods for silhouette-based multiview reconstruction, the proposed approach does not depend on initialization and enjoys significant resilience to violations of the model assumptions due to background clutter, specular reflections, and camera sensor perturbations. In experiments on several real-world data sets, we show that exploiting a silhouette coherency criterion in a multiview setting allows for dramatic improvements of silhouette quality over independent 2D segmentations without any significant increase of computational efforts. This results in more accurate visual hull estimation, needed by a multitude of image-based modeling approaches. We made use of recent advances in parallel computing with a GPU implementation of the proposed method generating reconstructions on volume grids of more than 20 million voxels in up to 4.41 seconds.  相似文献   

14.
Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyzed the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.  相似文献   

15.
Stereo is an important cue for visually guided robots. While moving around in the world, such a robot can use dynamic fixation to overcome limitations in image resolution and field of view. In this paper, a binocular stereo system capable of dynamic fixation is presented. The external calibration is performed continuously taking temporal consistency into consideration, greatly simplifying the process. The essential matrix, which is estimated in real-time, is used to describe the epipolar geometry. It will be shown, how outliers can be identified and excluded from the calculations. An iterative approach based on a differential model of the optical flow, commonly used in structure from motion, is also presented and tested towards the essential matrix. The iterative method will be shown to be superior in terms of both computational speed and robustness, when the vergence angles are less than about 15. For larger angles, the differential model is insufficient and the essential matrix is preferably used instead  相似文献   

16.
Most of the existing approaches to bas-relief generation operate in image space, which is quite time-consuming in practice. This paper presents a different bas-relief generation algorithm based on geometric compression and starting from a 3D mesh input. The feature details are first extracted from the original objects using a spatial bilateral filtering technique. Then, a view-dependent coordinate mapping method is applied to build the height domain for the current view. After fitting the compression datum plane, the algorithm uses an adaptive compression function to scale and combine the Z values of the base mesh and the fine details. This approach offers control over the level of detail, making it flexible for the adjustment of the appearance of details. For a typical input mesh with 100 k triangles, this algorithm computes a bas-relief in 0.214 s.  相似文献   

17.
Hand pose estimation benefits large human computer interaction applications. The hand pose has high dimensions of freedom (dof) for joints, and various hand poses are flexible. Hand pose estimation is still a challenge problem. Since hand joints on the hand skeleton topology model have strict relationships between each other, we propose a hierarchical topology based approach to estimate 3D hand poses. First, we determine palm positions and palm orientations by detecting hand fingertips and calculating their directions in depth images. It is the global topology of hand poses. Moreover, we define connection relationships of finger joints as the local topology of hand model. Based on hierarchical topology, we extract angle features to describe hand poses, and adopt the regression forest algorithm to estimate 3D coordinates of hand joints. We further use freedom forrest algorithm to refine ambiguous poses in estimation to solve error accumulation problem. The hierarchical topology based approach ensures estimated hand poses in a reasonable topology, and improves estimation accuracy. We evaluate our approach on two public databases, and experiments illustrate its efficiency. Compared with state-of-the-art approaches, our approach improves estimation accuracy.  相似文献   

18.
This work reconstructs a 3D graphics model of an object with specular surfaces by its rotation. Continuous images are taken to measure highlights on the smooth surfaces and their motion. Coplanar extended lights determine a plane of rays to produce a highlight stripe on the object. 3D surfaces are then recovered from the moving stripe. We investigate global motion characteristics of the highlights in the epipolar-plane images so as to qualitatively identify surface types and control the modeling process. Under single and multiple plane-of-rays illuminations, we give two quantitative approaches for surface and normal recovery which use highlight orientation in the image and highlight motion in the epipolar-plane images. The computations employ a first-order differential equation and linear equations, respectively  相似文献   

19.
Neural Computing and Applications - The efficiency of convolutional neural networks (CNNs) facilitates 3D face reconstruction, which takes a single image as an input and demonstrates significant...  相似文献   

20.
Industrial esthetic designers typically produce hand-drawn sketches in the form of orthographic projections. A subsequent translation from 2D-drawings to 3D-models is usually necessary. This involves a considerably time consuming process, so that some automation is advisable.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号