2.
A new method of human body pose estimation based on 2D observation from a single camera is presented, aimed at video analysis and action recognition for smart surveillance. It employs a 3D model of the human body and a genetic algorithm combined with an annealed particle filter to search for the global optimum of the model state that best matches the object’s 2D observation. Additionally, a new motion cost metric is employed that considers the current pose and the history of body movement, favouring estimates with the smallest changes in motion speed compared to previous poses. The “genetic memory” concept is introduced for the genetic processing of both current and past states of the 3D model. The state of the art in human body tracking is presented and discussed. Details of the implemented method are described. Results of an experimental evaluation of the developed algorithm are included and discussed.
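The motion cost idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the metric's formula, so the function below is an assumption: it scores a candidate state by how much the frame-to-frame velocity changes relative to the two previous states, so that smoother motion gets a lower cost. The name `motion_cost` and the flat state-vector layout are hypothetical.

```python
import numpy as np

def motion_cost(candidate, prev, prev2):
    """Penalise changes in motion speed between consecutive frames.

    candidate, prev, prev2: model state vectors at frames t, t-1 and t-2
    (hypothetical layout; the paper's actual state encoding is not given).
    """
    candidate = np.asarray(candidate, dtype=float)
    prev = np.asarray(prev, dtype=float)
    prev2 = np.asarray(prev2, dtype=float)
    v_now = candidate - prev      # velocity implied by the candidate state
    v_before = prev - prev2       # velocity observed in the previous step
    return float(np.linalg.norm(v_now - v_before))
```

In a search such as the one described, a term like this would presumably be added to the image-matching cost so that, among states that fit the 2D observation equally well, the one continuing the previous motion is preferred.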
3.
Social media have ushered in alternative modalities to propagate news and developments rapidly. Just as traditional IR matured to modeling storylines from search results, we are now at a point to study how stories organize and evolve in additional mediums such as Twitter, a new frontier for intelligence analysis. This study takes as input news articles as well as social media feeds and extracts and connects entities into interesting storylines not explicitly stated in the underlying data. First, it proposes a novel method of spatio-temporal analysis on induced concept graphs that models storylines propagating through spatial regions in a time sequence. Second, it describes a method to control search space complexity by providing regions of exploration. And third, it describes ConceptRank as a ranking strategy that differentiates strongly-typed connections from weakly-bound ones. Extensive experiments on the Boston Marathon Bombings of April 15, 2013 as well as socio-political and medical events in Latin America, the Middle East, and the United States demonstrate storytelling’s high application potential, showcasing its use in event summarization and association analysis that identifies events before they hit the newswire.
4.
A space robotic system is expected to perform on-orbit servicing missions to rescue malfunctioning satellites in geostationary orbit (GEO). During final berthing and capture, it is difficult for a space robot to determine the relative pose (attitude and position) of a non-cooperative malfunctioning satellite, which is usually large and carries no artificial recognition devices. In this paper, a space robot with a monocular structured light vision subsystem is introduced to solve the problem. Firstly, the monocular structured light vision subsystem, composed of a single camera and a point light source, is designed. Secondly, a partial rectangular framework, which is very common on non-cooperative malfunctioning satellites, is chosen as the recognition object for non-cooperative pose measurement. Using projection constraints on the rectangle and circular points, a rectangle feature reconstruction algorithm is proposed. Thirdly, based on the reconstructed rectangle feature, a least-squares method of pose determination is presented. Lastly, using a semi-physical vision simulation system, several typical cases are simulated to verify the pose determination method for large non-cooperative targets. The results show the validity and flexibility of the proposed method.
5.
To support analysis and modelling of large amounts of spatio-temporal data having the form of spatially referenced time series (TS) of numeric values, we combine interactive visual techniques with computational methods from machine learning and statistics. Clustering methods and interactive techniques are used to group TS by similarity. Statistical methods for TS modelling are then applied to representative TS derived from the groups of similar TS. The framework includes interactive visual interfaces to a library of modelling methods supporting the selection of a suitable method, adjustment of model parameters, and evaluation of the models obtained. The models can be externally stored, communicated, and used for prediction and in further computational analyses. From the visual analytics perspective, the framework suggests a way to externalize spatio-temporal patterns emerging in the mind of the analyst as a result of interactive visual analysis: the patterns are represented in the form of computer-processable and reusable models. From the statistical analysis perspective, the framework demonstrates how TS analysis and modelling can be supported by interactive visual interfaces, particularly, in a case of numerous TS that are hard to analyse individually. From the application perspective, the framework suggests a way to analyse large numbers of spatial TS with the use of well-established statistical methods for TS analysis.
6.
Multimedia Tools and Applications - Three-dimensional image-based human pose recovery tries to retrieve 3D poses from 2D images. Therefore, one of the key problems is how to represent 2D images....
8.
The most successful approaches to video understanding and video matching use local spatio-temporal features as a sparse representation for video content. In the last decade, great interest in the evaluation of local visual features for images has been observed. The aim is to provide researchers with guidance when selecting the best approaches for new applications and data sets. FeEval, a framework for the evaluation of spatio-temporal features, is presented. For the first time, this framework allows for a systematic measurement of the stability and the invariance of local features in videos. FeEval consists of 30 original videos from a great variety of different sources, including HDTV shows, 1080p HD movies and surveillance cameras. The videos are iteratively varied by well-defined challenges, leading to a total of 1710 video clips. We measure coverage, repeatability and matching performance under these challenges. Similar to prior work on 2D images, this leads to a new robustness and matching measurement. Supporting the choices of recent state-of-the-art benchmarks, this allows for an in-depth analysis of spatio-temporal features in comparison to recent benchmark results.
9.
With the advances in imaging technologies for robot or machine vision, new imaging devices are being developed for robot navigation or image-based rendering. However, to satisfy certain design criteria, such as image resolution or viewing range, these devices are not necessarily designed to follow the perspective rule and, thus, the imaging rays may not pass through a common point. Such generalized imaging devices may not be perspective and, therefore, their poses cannot be estimated with traditional techniques. In this paper, we propose a systematic method for pose estimation of such a generalized imaging device. We formulate it as a nonperspective n point (NPnP) problem. The case with exact solutions, n=3, is investigated comprehensively. Approximate solutions can be found for n>3 in a least-squared-error manner by combining an initial-pose-estimation procedure and an orthogonally iterative procedure. The proposed method can be applied not only to nonperspective imaging devices but also to perspective ones. Results from experiments show that our approach can solve the NPnP problem accurately.
10.
This paper presents a novel method of foreground and shadow segmentation in monocular indoor image sequences. The models of background, edge information, and shadow are set up and adaptively updated. A Bayesian network is proposed to describe the relationships among the segmentation label, background, intensity, and edge information. A maximum a posteriori Markov random field (MAP-MRF) estimation is used to boost the spatial connectivity of segmented regions.
11.
In this paper, we develop a monocular vision system for online pose measurement of a 3-RRR planar parallel manipulator (PPM). By combining a global-shutter camera, an active marker array, an industrial personal computer, and a degenerated perspective-n-points (DPnP) algorithm, a monocular vision measurement system (MVMS) is established. To improve the measuring accuracy of the MVMS, factors that cause inaccuracy, including lens distortion, non-perpendicular angle, and uncertainty in the input parameters, are analyzed and modeled in detail. In the simulation, the effects of these error factors on the accuracy of the MVMS are shown quantitatively, and the DPnP algorithm is compared with other state-of-the-art PnP algorithms. Experimental tests on the constructed MVMS demonstrate that it can not only measure the pose of the 3-RRR PPM accurately and efficiently, but also offers higher operability and stability than a laser tracker.
12.
This paper proposes a new framework for video editing in the gradient domain. The spatio-temporal gradient fields of target videos are modified and/or mixed to generate a new gradient field, which is usually not integrable. We compare two methods to solve this “mixed gradient problem”, i.e., the variational method and loopy belief propagation. We propose a 3D video integration algorithm, which uses the variational method to find the potential function whose gradient field is closest to the mixed gradient field in the least-squares sense. The video is reconstructed by solving a 3D Poisson equation. The main contributions of our framework lie in three aspects: first, we derive a straightforward extension of current 2D gradient techniques to 3D space, resulting in a novel video editing framework that is very different from all current video editing software; secondly, we propose a fast and accurate 3D discrete Poisson solver that uses diagonal multigrids to solve the 3D Poisson equation and is up to twice as fast as a simple conventional multigrid algorithm; finally, we introduce a set of new applications, such as face replacement and painting, high dynamic range video compression and graph-cut-based video compositing. A set of gradient operators is also provided to the user for editing purposes. We evaluate our algorithm using a variety of examples for image/video or video/video pairs. The resulting video can be seamlessly reconstructed.
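The link between the least-squares fit and the Poisson equation mentioned in this abstract can be written out briefly. The notation is assumed here rather than taken from the paper: $\mathbf{G}$ is the mixed gradient field over the video volume $\Omega$ with coordinates $(x, y, t)$, and $f$ is the video to be recovered. Minimising the squared difference between $\nabla f$ and $\mathbf{G}$ gives, through the Euler-Lagrange equation, a Poisson equation for $f$:

```latex
\min_{f} \int_{\Omega} \lVert \nabla f - \mathbf{G} \rVert^{2} \, dx \, dy \, dt
\quad\Longrightarrow\quad
\nabla^{2} f = \nabla \cdot \mathbf{G} \quad \text{in } \Omega ,
\qquad
\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial t} \right).
```

The boundary conditions are not stated in the abstract and are left unspecified here.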
14.
This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness inside each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data types and operations. In the indexing layer, ST-Hadoop spatiotemporally loads and divides data across computation nodes in the Hadoop Distributed File System in a way that mimics spatio-temporal index structures, which results in orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and queries. In the operations layer, ST-Hadoop ships with support for three fundamental spatio-temporal queries, namely spatio-temporal range, top-k nearest neighbor, and join queries. The extensibility of ST-Hadoop allows others to extend features and operations easily using approaches similar to those described in the paper. Extensive experiments conducted on a large-scale 10 TB dataset containing over 1 billion spatio-temporal records show that ST-Hadoop achieves orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and operations. The key idea behind the performance gain in ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System.
15.
As one of the most crucial tasks of scene perception, Monocular Depth Estimation (MDE) has made considerable progress in recent years. Current MDE researchers are interested in the precision and speed of the estimation, but pay less attention to the generalization ability across scenes. For instance, MDE networks trained on outdoor scenes achieve impressive performance on outdoor scenes but poor performance on indoor scenes, and vice versa. To tackle this problem, we propose a self-distillation MDE framework to improve the generalization ability across different scenes. Specifically, we design a student encoder that extracts features from two datasets of indoor and outdoor scenes, respectively. After that, we introduce a dissimilarity loss to pull apart the encoded features of different scenes in the feature space. Finally, a decoder is adopted to estimate the final depth from the encoded features. By doing so, our self-distillation MDE framework can learn the depth estimation of two different datasets. To the best of our knowledge, we are the first to tackle the generalization problem across datasets of different scenes in the MDE field. Experiments demonstrate that our method reduces the degradation problem when an MDE network faces datasets with a complex data distribution. Note that evaluating on two datasets with a single network is more challenging than evaluating on two datasets with two different networks.
16.
In this paper, we consider the problem of 2D human pose estimation on stereo image pairs. In particular, we aim at estimating the location, orientation and scale of upper-body parts of people detected in stereo image pairs from realistic stereo videos that can be found on the Internet. To address this task, we propose a novel pictorial structure model to exploit the stereo information included in such stereo image pairs: the Stereo Pictorial Structure (SPS). To validate our proposed model, we contribute a new annotated dataset of stereo image pairs, the Stereo Human Pose Estimation Dataset (SHPED), obtained from YouTube stereoscopic video sequences, depicting people in challenging poses and diverse indoor and outdoor scenarios. The experimental results on SHPED indicate that SPS improves on state-of-the-art monocular models thanks to the appropriate use of the stereo information.
17.
The 2.1D sketch is a layered image representation, which assigns a partial depth ordering to over-segmented regions in a monocular image. This paper presents a global optimization framework for inferring the 2.1D sketch from a monocular image. Our method only uses over-segmented image regions (i.e., superpixels) as input, without any information about objects in the image, since (1) segmenting objects in images is a difficult problem on its own and (2) the objective of our proposed method is to be generic as an initial module useful for downstream high-level vision tasks. This paper formulates the inference of the 2.1D sketch using a global energy optimization framework. The proposed energy function consists of two components: (1) one is defined based on the local partial ordering relations (i.e., figure-ground) between two adjacent over-segmented regions, which captures the marginal information of the global partial depth ordering and (2) the other is defined based on the same depth layer relations among all the over-segmented regions, which groups regions of the same object to account for the over-segmentation issues. A hybrid evolution algorithm is utilized to minimize the global energy function efficiently. In experiments, we evaluated our method on a test data set containing 100 diverse real images from the Berkeley segmentation data set (BSDS500) with annotated ground truth. Experimental results show that our method can infer the 2.1D sketch with high accuracy.
18.
We present a novel method for pose transfer between two 2D human skeletons. When the bone lengths and proportions between the two skeletons are significantly dif...
20.
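A method is proposed for reconstructing 3D human pose from a single video image sequence under the constraint that at least one point with a known depth value exists. The camera parameters are obtained by calibration with a planar point array of known spacing, and, under the perspective projection model, the 3D information of the human joints is reconstructed from their 2D data in the single video image sequence. The human motion sequence is divided into several subsequences at points of abrupt motion change, which effectively removes the interference of ambiguity and yields a fairly accurate reconstruction of the 3D human pose. The experimental procedure and computed results are given, verifying the feasibility and accuracy of the algorithm.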
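To illustrate where the ambiguity in this kind of reconstruction comes from, here is a minimal sketch under the pinhole (perspective) model. It is not the paper's algorithm: it assumes, for illustration only, that the length of the body segment joining two joints is known, which the abstract does not state. Given one joint with known 3D position, the neighbouring joint must lie on its viewing ray at the segment length from the known joint, which leads to a quadratic in depth with up to two solutions. The function names and the intrinsics `f`, `cx`, `cy` are hypothetical.

```python
import numpy as np

def back_project(u, v, depth, f, cx, cy):
    """3D point on the viewing ray of pixel (u, v) at the given depth (pinhole model)."""
    return depth * np.array([(u - cx) / f, (v - cy) / f, 1.0])

def neighbour_depths(p_known, u, v, bone_length, f, cx, cy):
    """Depths Z at which the ray through pixel (u, v) is bone_length away from p_known.

    Solves |Z * d - p_known|^2 = bone_length^2, a quadratic in Z, and
    returns the real roots in ascending order (zero, one or two values).
    bone_length is an assumed input, not something the abstract provides.
    """
    p_known = np.asarray(p_known, dtype=float)
    d = np.array([(u - cx) / f, (v - cy) / f, 1.0])  # ray direction with unit z
    a = d @ d
    b = -2.0 * (d @ p_known)
    c = p_known @ p_known - bone_length ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []  # the ray never comes within bone_length of p_known
    root = np.sqrt(disc)
    return sorted({(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)})
```

The two roots correspond to the neighbouring joint lying nearer to or farther from the camera than the known joint; choosing between them per frame is the ambiguity that the abstract says is resolved by splitting the sequence at points of abrupt motion change.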