Found 20 similar documents; search took 15 ms.
1.
A fast registration method based on implicit polynomial (IP) models is well suited to real-time pose estimation from a single clinical free-hand ultrasound (US) image, because it is robust to image noise, registers quickly without requiring correspondences, and allows fast IP coefficient transformation. However, it may lack accuracy or fail to register. In this paper, we present a novel registration method based on a coarse-to-fine IP representation. The approach starts with a high-speed and reliable registration using a coarse (low-degree) IP model and stops when the desired accuracy is achieved by a fine (high-degree) IP model. Over previous IP-to-point based methods, our contributions are: (i) keeping the efficiency without requiring pairwise correspondences, (ii) enhancing the robustness, and (iii) improving the accuracy. The experimental results demonstrate the good performance of our registration method and its ability to overcome the limitations of unconstrained freehand ultrasound data, resulting in fast, robust, and accurate registration.
2.
We describe an approach to category-level detection and viewpoint estimation for rigid 3D objects from single 2D images. In contrast to many existing methods, we directly integrate 3D reasoning with an appearance-based voting architecture. Our method relies on a nonparametric representation of a joint distribution of shape and appearance of the object class. Our voting method employs a novel parameterization of the joint detection and viewpoint hypothesis space, allowing efficient accumulation of evidence. We combine this with a re-scoring and refinement mechanism, using an ensemble of view-specific support vector machines. We evaluate the performance of our approach in detection and pose estimation of cars on a number of benchmark datasets. Finally, we introduce the “Weizmann Cars ViewPoint” (WCVP) dataset, a benchmark for evaluating continuous pose estimation.
3.
In this paper, a real-time 3D pose estimation algorithm using range data is described. The system relies on a novel 3D sensor that generates a dense range image of the scene. By not relying on brightness information, the proposed system guarantees robustness under a variety of illumination conditions and scene contents. Efficient face detection using global features and exploitation of prior knowledge, along with novel feature localization and tracking techniques, are described. Experimental results demonstrate accurate estimation of the six degrees of freedom of the head and robustness under occlusions, facial expressions, and head shape variability.
4.
This survey reviews advances in human motion capture and analysis from 2000 to 2006, following a previous survey of papers up to 2000 [T.B. Moeslund, E. Granum, A survey of computer vision-based human motion capture, Computer Vision and Image Understanding, 81(3) (2001) 231–268]. Human motion capture continues to be an increasingly active research area in computer vision, with over 350 publications over this period. A number of significant research advances are identified, together with novel methodologies for automatic initialization, tracking, pose estimation, and movement recognition. Recent research has addressed reliable tracking and pose estimation in natural scenes. Progress has also been made towards automatic understanding of human actions and behavior. This survey reviews recent trends in video-based human motion capture and analysis, as well as discussing open problems for future research towards automatic visual analysis of human movement.
5.
We present a system for human pose estimation by using a single frame and without making assumptions on temporal coherence. The system uses 3D voxel data reconstructed from multiple synchronized video streams as input, and computes, for each frame, a skeleton model which best fits the body pose. This system adopts a hierarchical approach where the head and torso locations are found first based on template fitting with their specific shapes and dimensions. It is followed by a limb detection procedure that estimates the pose parameters of four limbs. However, a problem generally faced with skeleton models is the means to find adequate measurements to fit the model. In this paper, voxel data, together with two novel local shape features, are used for this purpose. Experiments show that this system is robust to several perturbations associated with the input data, such as voxel reconstruction errors and complex poses with self-contact, and also allows unconstrained motions, such as fast or unpredictable movements.
6.
Hand pose estimation benefits a wide range of human-computer interaction applications. The hand has many degrees of freedom (DoF) at its joints, and hand poses are highly flexible, so hand pose estimation remains a challenging problem. Since hand joints in the hand skeleton topology model have strict relationships with each other, we propose a hierarchical topology-based approach to estimate 3D hand poses. First, we determine palm positions and palm orientations by detecting hand fingertips and calculating their directions in depth images; this constitutes the global topology of hand poses. Moreover, we define the connection relationships of finger joints as the local topology of the hand model. Based on this hierarchical topology, we extract angle features to describe hand poses and adopt the regression forest algorithm to estimate the 3D coordinates of hand joints. We further use a freedom forest algorithm to refine ambiguous poses in the estimation, addressing the error accumulation problem. The hierarchical topology-based approach ensures that estimated hand poses have a reasonable topology, and improves estimation accuracy. We evaluate our approach on two public databases, and experiments illustrate its efficiency. Compared with state-of-the-art approaches, our approach improves estimation accuracy.
7.
This paper deals with model-based pose estimation (or camera localization). We propose a direct approach that takes into account the image as a whole. For this, we consider a similarity measure, the mutual information. Mutual information is a measure of the quantity of information shared by two signals (or two images in our case). Exploiting this measure allows our method to deal with different image modalities (real and synthetic). Furthermore, it handles occlusions and illumination changes. Results with synthetic (benchmark) and real image sequences, with static or mobile camera, demonstrate the robustness of the method and its ability to produce stable and precise pose estimations.
8.
Eigendecomposition-based techniques are popular for a number of computer vision problems, e.g., object and pose estimation, because they are purely appearance based and they require few on-line computations. Unfortunately, they also typically require an unobstructed view of the object whose pose is being detected. The presence of occlusion and background clutter precludes the use of the normalizations that are typically applied and significantly alters the appearance of the object under detection. This work presents an algorithm that is based on applying eigendecomposition to a quadtree representation of the image dataset used to describe the appearance of an object. This allows decisions concerning the pose of an object to be based on only those portions of the image in which the algorithm has determined that the object is not occluded. The accuracy and computational efficiency of the proposed approach are evaluated on 16 different objects with up to 50% of the object being occluded and on images of ships in a dockyard.
Chu-Yin Chang
received the B.S. degree in mechanical engineering from National Central University, Chung-Li, Taiwan, ROC, in 1988, the M.S. degree in electrical engineering from the University of California, Davis, in 1993, and the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, in 1999. From 1999 to 2002, he was a Machine Vision Systems Engineer with Semiconductor Technologies and Instruments, Inc., Plano, TX. He is currently the Vice President of Energid Technologies, Cambridge, MA, USA. His research interests include computer vision, computer graphics, and robotics.
Anthony A. Maciejewski
received the BSEE, M.S., and Ph.D. degrees from Ohio State University in 1982, 1984, and 1987. From 1988 to 2001, he was a professor of Electrical and Computer Engineering at Purdue University, West Lafayette. He is currently the Department Head of Electrical and Computer Engineering at Colorado State University. He is a Fellow of the IEEE. A complete vita is available at:
Venkataramanan Balakrishnan
is Professor and Associate Head of Electrical and Computer Engineering at Purdue University, West Lafayette, Indiana. He received the B.Tech degree in electronics and communication and the President of India Gold Medal from the Indian Institute of Technology, Madras, in 1985. He then attended Stanford University, where he received the M.S. degree in statistics and the Ph.D. degree in electrical engineering in 1992. He joined Purdue University in 1994 after post-doctoral research at Stanford, Caltech, and the University of Maryland. His primary research interests are in convex optimization and large-scale numerical algebra, applied to engineering problems.
Rodney G. Roberts
received B.S. degrees in Electrical Engineering and Mathematics from Rose-Hulman Institute of Technology in 1987, and the MSEE and Ph.D. in Electrical Engineering from Purdue University in 1988 and 1992, respectively. From 1992 until 1994, he was a National Research Council Fellow at Wright-Patterson Air Force Base in Dayton, Ohio. Since 1994 he has been at the Florida A&M University - Florida State University College of Engineering, where he is currently a Professor of Electrical and Computer Engineering. His research interests are in the areas of robotics and image processing.
Kishor Saitwal
received the Bachelor of Engineering (B.E.) degree in Instrumentation and Controls from Vishwakarma Institute of Technology, Pune, India, in 1998, where he was ranked third in Pune University and was a recipient of a National Talent Search scholarship. He received the M.S. and Ph.D. degrees from the Electrical and Computer Engineering Department, Colorado State University, Fort Collins, in 2001 and 2006, respectively. He is currently with Behavioral Recognition Systems, Inc., performing research in computer-aided video surveillance systems. His research interests include image/video processing, computer vision, and robotics.
9.
Due to severe articulation, self-occlusion, various scales, and the high dexterity of the hand, hand pose estimation is more challenging than body pose estimation. Recently developed body pose estimation algorithms are not suitable for addressing the unique challenges of hand pose estimation because they are trained without explicitly modeling structural relationships between keypoints. In this paper, we propose a novel cascaded hierarchical CNN (CH-HandNet) for 2D hand pose estimation from a single color image. The CH-HandNet includes three modules: hand mask segmentation, preliminary 2D hand pose estimation, and hierarchical estimation. The first module obtains a hand mask with a hand mask segmentation network. The second module combines the hand mask and the intermediate image features to estimate the 2D hand heatmaps. The last module combines the hand heatmaps with the intermediate image features and the hand mask to estimate finger and palm heatmaps hierarchically. Finally, the extracted finger (pinky, ring, middle, index) and palm (thumb and palm) feature information is fused to estimate the 2D hand pose. Experimental results on three datasets (OneHand 10k, Panoptic, and Eric.Lee) consistently show that our proposed CH-HandNet outperforms previous state-of-the-art hand pose estimation methods.
10.
We propose a human motion tracking method that not only captures the motion of the skeleton model but also generates a sequence of surfaces using images acquired by multiple synchronized cameras. Our method extracts articulated postures with 42 degrees of freedom through a sequence of visual hulls. We seek a globally optimized solution for likelihood using local memorization of the “fitness” of each body segment. Our method efficiently avoids problems of local minima by using a mean combination and an articulated combination of particles selected according to the weights of the different body segments. The surface is produced by deforming the template and the details are recovered by fitting the deformed surface to 2D silhouette rims. The extracted posture and estimated surface are cooperatively refined by registering the corresponding body segments. In our experiments, the mean error between the samples of the deformed reference model and the target is about 2 cm and the mean matching difference between the images projected by the estimated surfaces and the original images is about 6%.
11.
This paper shows the analysis and design of feed-forward neural networks using the coordinate-free system of Clifford or geometric algebra. It is shown that real-, complex-, and quaternion-valued neural networks are simply particular cases of geometric algebra multidimensional neural networks, and that they can be generated using Support Multi-Vector Machines (SMVMs). In particular, the generation of RBF networks for neurocomputing in geometric algebra is easier using the SMVM, which finds the optimal parameters automatically. The use of SVMs in the geometric algebra framework expands their sphere of applicability for multidimensional learning. We introduce a novel method of geometric preprocessing utilizing hypercomplex or Clifford moments. This method is applied together with geometric MLPs for tasks of 2D pattern recognition. Interesting examples of non-linear problems, such as grasping an object along a non-linear curve and 3D pose recognition, show the effect of using adequate Clifford or geometric algebras, which ease the training of neural networks and of Support Multi-Vector Machines.
12.
We propose a robust algorithm for estimating the projective reconstruction from image features using a RANSAC-based triangulation method. In this method, we select input points randomly, separate the input points into inliers and outliers by computing their reprojection error, and correct the outliers so that they become inliers. The reprojection error and the outlier correction are computed using the triangulation method. After correcting the outliers, we can reliably recover projective motion and structure using the projective factorization method. Experimental results show that errors can be reduced significantly compared with previous research as a result of the robustly estimated projective reconstruction.
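The RANSAC loop described in this abstract (random sampling, inlier/outlier separation by residual, keeping the best-supported model) can be sketched in a few lines. Here plain 2D line fitting stands in for the paper's triangulation-based reprojection error, and all names and values are illustrative:

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.1, rng=None):
    """Minimal RANSAC: fit y = a*x + b, split points into inliers/outliers."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(n_iters):
        # minimal sample: two points define a candidate line
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # residuals play the role of the paper's reprojection error
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (a, b)
    return best_model, best_inliers

# 20 points on y = 2x + 1 plus 5 gross outliers
xs = np.linspace(0.0, 1.0, 20)
line_pts = np.column_stack([xs, 2 * xs + 1])
outlier_pts = np.array([[0.1, 5.0], [0.3, -3.0], [0.5, 7.0], [0.7, -2.0], [0.9, 6.0]])
data = np.vstack([line_pts, outlier_pts])
(a, b), inliers = ransac_line(data, rng=0)
```

The "correct the outliers" step of the paper would then re-project the flagged points onto the recovered model before the factorization stage.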
13.
Discriminative human pose estimation is the problem of inferring the 3D articulated pose of a human directly from an image feature. This is a challenging problem due to the highly non-linear and multi-modal mapping from the image feature space to the pose space. To address this problem, we propose a model employing a mixture of Gaussian processes, where each Gaussian process models a local region of the pose space. By employing the models in this way we are able to overcome the limitations of Gaussian processes applied to human pose estimation: their O(N³) time complexity and their uni-modal predictive distribution. Our model is able to give a multi-modal predictive distribution where each mode is represented by a different Gaussian process prediction. A logistic regression model is used to give a prior over each expert prediction, in a similar fashion to previous mixture-of-experts models. We show that this technique outperforms existing state-of-the-art regression techniques on human pose estimation data sets for ballet dancing, sign language, and the HumanEva data set.
14.
This paper presents a new model to identify 3D human poses in pictures, given a single input image. The proposed approach is based on a well-known model from the literature, with improvements in terms of biomechanical restrictions aimed at reducing the number of possible 3D postures that correctly represent the pose in the 2D image. Since the generated set of poses can contain more than one possible posture, we propose a ranking system to suggest the best generated postures according to a “comfort” criterion as well as shading characteristics in the image. The comfort criterion adopts assumptions in terms of pose equilibrium, while the shading criterion eliminates ambiguities between postures by taking the image illumination into account. We emphasize that the removal of ambiguous 3D poses for a single image is the main focus of this work. The achieved results were analyzed through visual inspection by users as well as against a state-of-the-art technique, and indicate that our model contributes to the solution of this challenging problem.
15.
Pattern Analysis and Applications - Based on the disentanglement representation learning theory and the cross-modal variational autoencoder (VAE) model, we derive a “Single Input Multiple...
16.
The paper presents an analysis of the stability of pose estimation. Stability is defined as the sensitivity of the pose parameters to noise in the image features used for estimating pose. The specific emphasis of the analysis is on determining how stability varies with viewpoint relative to an object, and on understanding the relationships between object geometry, viewpoint, and pose stability. Two pose estimation techniques are investigated. One uses a numerical scheme for finding pose parameters; the other is based on closed-form solutions. Both are “pose from trihedral vertices” techniques, which provide the rotation part of object pose based on the orientations of three edge segments. The analysis is based on generalized sensitivity analysis, propagating the uncertainty in edge segment orientations to the resulting effect on the pose parameters. It is shown that there is a precomputable, generic relationship between viewpoint and pose stability, and that there is a drastic difference in stability over the range of viewpoints. This viewpoint variation is shared by the two investigated techniques. Additionally, the paper offers an explicit way to determine the most robust viewpoints directly for any given vertex model. Experiments on real images show that the results of the work can be used to compute the variance in pose parameters for any given pose. For the predicted unstable viewpoints, the variance in pose parameters is on the order of 20 degrees squared, whereas the variance for robust viewpoints is on the order of 0.05 degrees squared, i.e., two orders of magnitude difference.
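The generalized sensitivity analysis this abstract describes amounts to first-order covariance propagation: if pose = f(measurements), then Σ_pose ≈ J Σ_meas Jᵀ with J the Jacobian of f. A generic sketch follows, with a numerical Jacobian and a hypothetical one-parameter "pose" (the angle of a 2D edge direction); the trihedral-vertex model itself is not reproduced here:

```python
import numpy as np

def propagate_covariance(f, x, cov_x, eps=1e-6):
    """First-order propagation of measurement covariance through f,
    using a forward-difference numerical Jacobian."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x), dtype=float)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x + dx), dtype=float) - y0) / eps
    return J @ cov_x @ J.T   # sigma_pose ~ J sigma_meas J^T

# toy "pose": the orientation angle of one edge direction vector
pose_fn = lambda v: np.array([np.arctan2(v[1], v[0])])
cov = propagate_covariance(pose_fn, [1.0, 0.0], np.eye(2) * 1e-4)
```

For an edge direction along the x-axis, only the y-component of the measurement noise perturbs the angle, so the propagated variance equals the input variance 1e-4; a stability analysis repeats this over viewpoints and compares the resulting pose variances.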
17.
Pose estimation is a problem that occurs in many applications. In machine vision, the pose is often a 2D affine pose. In several applications, a restricted class of 2D affine poses with five degrees of freedom consisting of an anisotropic scaling, a rotation, and a translation must be determined from corresponding 2D points. A closed-form least-squares solution for this problem is described. The algorithm can be extended easily to robustly deal with outliers.
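For the related 4-DoF case with isotropic scaling (a simplification of this paper's 5-DoF anisotropic problem), a closed-form least-squares solution is the classic SVD-based Procrustes/Umeyama fit. The sketch below is illustrative of that simpler case, not the paper's algorithm:

```python
import numpy as np

def fit_similarity_2d(src, dst):
    """Closed-form least-squares fit of dst ~ s * R @ src_i + t
    (isotropic scale s, rotation R, translation t)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(B.T @ A)        # SVD of the cross-covariance
    d = np.sign(np.linalg.det(U @ Vt))       # guard against reflections
    R = U @ np.diag([1.0, d]) @ Vt
    s = (S[0] + d * S[1]) / (A ** 2).sum()   # optimal scale
    t = mu_d - s * R @ mu_s
    return s, R, t

# recover a known transform from exact correspondences
rng = np.random.default_rng(1)
src = rng.random((30, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
s_true, t_true = 1.5, np.array([0.2, -0.4])
dst = s_true * src @ R_true.T + t_true
s, R, t = fit_similarity_2d(src, dst)
```

The anisotropic 5-DoF case adds one more scale parameter; the outlier-robust extension the abstract mentions would wrap such a solver in a RANSAC-style loop.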
18.
In this paper, we address the challenging problem of recovering the defocus map from a single image. We present a simple yet effective approach to estimate the amount of spatially varying defocus blur at edge locations. The input defocused image is re-blurred using a Gaussian kernel and the defocus blur amount can be obtained from the ratio between the gradients of input and re-blurred images. By propagating the blur amount at edge locations to the entire image, a full defocus map can be obtained. Experimental results on synthetic and real images demonstrate the effectiveness of our method in providing a reliable estimation of the defocus map.
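The gradient-ratio relation this abstract describes can be checked numerically in 1D: a step edge blurred by an unknown σ is re-blurred with a known σ₀, and the ratio R of gradient magnitudes at the edge satisfies R = sqrt((σ² + σ₀²)/σ²), so σ = σ₀ / sqrt(R² − 1). A minimal sketch with illustrative values (discretization introduces a small bias):

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at ~4 sigma."""
    radius = int(4 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

sigma_true, sigma_0 = 6.0, 3.0              # unknown edge blur, known re-blur
x = np.arange(-80, 81)
step = np.where(x < 0, 0.0, 1.0)            # ideal step edge
edge = np.convolve(step, gaussian_kernel(sigma_true), mode="same")
reblurred = np.convolve(edge, gaussian_kernel(sigma_0), mode="same")

# gradient-magnitude ratio at the edge location (the center sample)
c = len(x) // 2
ratio = abs(np.gradient(edge)[c]) / abs(np.gradient(reblurred)[c])
sigma_est = sigma_0 / np.sqrt(ratio ** 2 - 1)   # invert the ratio relation
```

Repeating this at every detected edge and propagating the per-edge σ values over the image yields the full defocus map described above.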
19.
A motion vision system is developed in which a moving object can be detected and image displacement can be estimated based on human visual characteristics and the use of a multiresolution image. The system consists of four parts: (1) temporal gradient, logical AND, and dynamic thresholding operations are used to obtain the primary mask; (2) a region growing algorithm is applied; (3) a hierarchical object detection algorithm is used to identify image patterns; and (4) the displacement of the image is estimated by breaking each frame of the motion sequence into local regions (edges). A search is undertaken to discover how the image pattern within a given region appears displaced. This search takes the form of motion channels, whose outputs are used to obtain the estimate of displacement. A correlative measure is proposed to match the patterns.
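Part (1), the primary mask from temporal gradients, dynamic thresholding, and a logical AND, can be sketched as follows; the specific thresholding rule and the synthetic frames are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def primary_mask(prev, curr, nxt, k=2.0):
    """Primary motion mask: absolute temporal gradients, a dynamic
    (statistics-based) threshold per difference image, then a logical AND."""
    d1 = np.abs(curr.astype(float) - prev.astype(float))
    d2 = np.abs(nxt.astype(float) - curr.astype(float))
    t1 = d1.mean() + k * d1.std()   # dynamic threshold from image statistics
    t2 = d2.mean() + k * d2.std()
    # AND keeps only pixels that changed in both frame pairs,
    # localizing the moving object in the middle frame
    return (d1 > t1) & (d2 > t2)

# synthetic frames: a bright vertical bar moving one pixel per frame
frames = [np.zeros((16, 16)) for _ in range(3)]
for t, f in enumerate(frames):
    f[5:8, 4 + t] = 1.0
mask = primary_mask(*frames)
```

On these frames the mask fires exactly on the bar's position in the middle frame (column 5, rows 5 to 7); region growing and the hierarchical detector would then operate on this mask.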
20.
Image denoising algorithms often assume an additive white Gaussian noise (AWGN) process that is independent of the actual RGB values. Such approaches are not fully automatic and cannot effectively remove the color noise produced by today's CCD digital cameras. In this paper, we propose a unified framework for two tasks: automatic estimation and removal of color noise from a single image using piecewise smooth image models. We introduce the noise level function (NLF), a continuous function describing the noise level as a function of image brightness. We then estimate an upper bound on the real noise level function by fitting a lower envelope to the standard deviations of per-segment image variances. For denoising, the chrominance of the color noise is significantly removed by projecting pixel values onto a line fit to the RGB values in each segment. Then, a Gaussian conditional random field (GCRF) is constructed to obtain the underlying clean image from the noisy input. Extensive experiments are conducted to test the proposed algorithm, which is shown to outperform state-of-the-art denoising algorithms.
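The NLF idea (noise level as a function of brightness) can be illustrated with a toy sketch: flat segments carrying brightness-dependent noise, per-segment standard deviations, and a simple line fit standing in for the paper's lower-envelope estimate over real image segments. All values here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
nlf_true = lambda b: 0.01 + 0.05 * b        # assumed ground-truth noise level function
levels = np.array([0.2, 0.4, 0.6, 0.8])     # brightness of four flat segments

# each segment: constant brightness plus brightness-dependent Gaussian noise
segments = [lvl + rng.normal(0.0, nlf_true(lvl), 10_000) for lvl in levels]
stds = np.array([seg.std() for seg in segments])

# recover the NLF from the (brightness, observed std) samples;
# a straight-line fit stands in for the paper's lower-envelope estimate
slope, intercept = np.polyfit(levels, stds, 1)
```

In the full method, segment variances also contain texture, which is why the paper fits a lower envelope (an upper bound on the pure noise contribution) rather than a direct regression as done here.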