Similar Literature
20 similar documents found (search time: 31 ms)
1.
Stitching motions in multiple videos into a single video scene is a challenging task in current video fusion and mosaicing research and film production. In this paper, we present a novel method of video motion stitching based on the similarities of trajectory and position of foreground objects. First, multiple video sequences are registered in a common reference frame, whereby we estimate the static and dynamic backgrounds, with the former responsible for distinguishing the foreground from the background and the static region from the dynamic region, and the latter functioning in mosaicing the warped input video sequences into a panoramic video. Accordingly, the motion similarity is calculated by reference to trajectory and position similarity, whereby the corresponding motion parts are extracted from multiple video sequences. Finally, using the corresponding motion parts, the foregrounds of different videos and dynamic backgrounds are fused into a single video scene through Poisson editing, with the motions involved being stitched together. Our major contributions are a framework of multiple video mosaicing based on motion similarity and a method of calculating motion similarity from the trajectory similarity and the position similarity. Experiments on everyday videos show that the agreement of trajectory and position similarities with the real motion similarity plays a decisive role in determining whether two motions can be stitched. We acquire satisfactory results for motion stitching and video mosaicing.
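The combination of trajectory-shape and position similarity described above can be sketched as follows. The function names, the equal-length-trajectory assumption, and the 1/(1+d) distance-to-similarity mapping are illustrative choices for this sketch, not the paper's exact formulation.

```python
import numpy as np

def trajectory_similarity(a, b):
    """Shape similarity of two equal-length 2D trajectories (N x 2),
    computed on centred copies so absolute position is ignored."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    d = np.linalg.norm(a - b, axis=1).mean()
    return 1.0 / (1.0 + d)          # map mean distance into (0, 1]

def position_similarity(a, b):
    """Similarity of the trajectories' mean positions in the common frame."""
    d = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    return 1.0 / (1.0 + d)

def motion_similarity(a, b, w=0.5):
    """Weighted combination of trajectory and position similarity."""
    return w * trajectory_similarity(a, b) + (1 - w) * position_similarity(a, b)

t = np.linspace(0, 1, 50)
walk1 = np.stack([t, 0.1 * np.sin(6 * t)], axis=1)
walk2 = walk1 + np.array([0.2, 0.0])       # same shape, shifted position
print(motion_similarity(walk1, walk1))     # identical motions -> 1.0
print(motion_similarity(walk1, walk2))     # lower, driven by position term
```

Two motions whose combined similarity exceeds a threshold would then be candidates for stitching.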

2.
Objective: Traditional methods for 3D human pose estimation usually take a single-frame point cloud as input, which may ignore the inherent prior of human motion smoothness and therefore produce jitter artifacts. Real-image datasets with 2D human pose annotations are relatively easy to obtain, whereas collecting large-scale real-image datasets with high-quality 3D pose annotations for fully supervised training is difficult. We therefore propose a new 3D human pose estimation method for point cloud sequences. Method: Pose-related point clouds are first estimated from a depth image sequence. A neural network then exploits temporal information to encode the spatio-temporal features of the pose-related point cloud sequence. Weakly supervised deep learning is adopted so that the abundant, more easily obtained datasets with 2D pose annotations can be used. Finally, a multi-task network jointly trains human pose estimation and human motion prediction to improve optimization. Results: The algorithm is evaluated on two datasets. On the ITOP (invariant-top view) dataset, its mean average precision (mAP) is 0.99%, 13.18%, and 17.96% higher than the compared methods. On the NTU-RGBD dataset, its mAP is 7.03% higher than the state-of-the-art WSM (weakly supervised adversarial learning) method. Ablation experiments on ITOP verify the effectiveness of each component: compared with single-task training, jointly training pose estimation and motion prediction in the multi-task network improves mAP by more than 2%. Conclusion: The proposed point cloud sequence method fully exploits the prior of human motion continuity, yields smoother pose estimates, and performs well on both ITOP and NTU-RGBD. Under the joint multi-task optimization strategy, the pose estimation and motion prediction tasks reinforce each other.
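The weak-supervision idea, training against the cheaper 2D annotations by reprojecting a predicted 3D pose, can be illustrated with a minimal pinhole-projection loss. The camera intrinsics and function names below are assumptions for the sketch, not values from the paper.

```python
import numpy as np

def project(joints_3d, fx, fy, cx, cy):
    """Pinhole projection of (J, 3) camera-space joints to (J, 2) pixels."""
    x, y, z = joints_3d[:, 0], joints_3d[:, 1], joints_3d[:, 2]
    return np.stack([fx * x / z + cx, fy * y / z + cy], axis=1)

def weak_2d_loss(pred_3d, gt_2d, fx=500.0, fy=500.0, cx=160.0, cy=120.0):
    """Weakly supervised loss: reproject the predicted 3D pose and compare
    against 2D joint annotations (mean per-joint pixel error)."""
    return np.mean(np.linalg.norm(project(pred_3d, fx, fy, cx, cy) - gt_2d, axis=1))

# a joint on the optical axis at 1 m projects exactly to the principal point
pred = np.array([[0.0, 0.0, 1.0]])
gt = np.array([[160.0, 120.0]])
print(weak_2d_loss(pred, gt))   # -> 0.0
```

The 3D branch never needs 3D labels here; the gradient flows through the projection.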

3.
Communicative behaviors are a very important aspect of human behavior and deserve special attention when simulating groups and crowds of virtual pedestrians. Previous approaches have tended to focus on generating believable gestures for individual characters and talker‐listener behaviors for static groups. In this paper, we consider the problem of creating rich and varied conversational behaviors for data‐driven animation of walking and jogging characters. We captured ground truth data of participants conversing in pairs while walking and jogging. Our stylized splicing method takes as input a motion captured standing gesture performance and a set of looped full body locomotion clips. Guided by the ground truth metrics, we perform stylized splicing and synchronization of gesture with locomotion to produce natural conversations of characters in motion. Copyright © 2016 John Wiley & Sons, Ltd.

4.
Towards controlling the frequency of limit cycle locomotion as well as adapting to rough terrain and performing specific tasks, a novel and indirect method has recently been introduced using an active wobbling mass attached to limit cycle walkers. One of the strongest advantages of the method is the ease of its implementation, which makes it applicable to a wide variety of locomotion systems. To better understand the nonlinear dynamics and further enhance the methodology, we use a combined rimless wheel with an active wobbling mass as an example for nonlinear analysis in this paper. First, we introduce the simplified equation of motion and the gait frequency control method. Second, we obtain the Arnold tongue, which represents the region of entrained locomotion. In regions where the locomotion is not entrained, we explore chaotic and quasi-periodic gaits. To characterize the bistability of two different locomotions that underlies hysteresis phenomena, we compute the basins of attraction of the two locomotions. The present nonlinear analysis may help in understanding the detailed mechanism of indirectly controlled limit cycle walkers.

5.
Generating a visually appealing human motion sequence using low‐dimensional control signals is a major line of study in the motion research area in computer graphics. We propose a novel approach that allows us to reconstruct full body human locomotion using a single inertial sensing device, a smartphone. Smartphones are among the most widely used devices and incorporate inertial sensors such as an accelerometer and a gyroscope. To find a mapping between a full body pose and smartphone sensor data, we perform low dimensional embedding of full body motion capture data, based on a Gaussian Process Latent Variable Model. Our system ensures temporal coherence between the reconstructed poses by using a state decomposition model for automatic phase segmentation. Finally, application of the proposed nonlinear regression algorithm finds a proper mapping between the latent space and the sensor data. Our framework effectively reconstructs plausible 3D locomotion sequences. We compare the generated animation to ground truth data obtained using a commercial motion capture system.

6.
We present a new approach to motion rearrangement that preserves the syntactic structures of an input motion automatically by learning a context‐free grammar from the motion data. For grammatical analysis, we reduce an input motion into a string of terminal symbols by segmenting the motion into a series of subsequences, and then associating a group of similar subsequences with the same symbol. To obtain the most repetitive and precise set of terminals, we search for an optimal segmentation such that a large number of subsequences can be clustered into groups with little error. Once the input motion has been encoded as a string, a grammar induction algorithm is employed to build up a context‐free grammar so that the grammar can reconstruct the original string accurately as well as generate novel strings sharing their syntactic structures with the original string. Given any new strings from the learned grammar, it is straightforward to synthesize motion sequences by replacing each terminal symbol with its associated motion segment, and stitching every motion segment sequentially. We demonstrate the usefulness and flexibility of our approach by learning grammars from a large diversity of human motions, and reproducing their syntactic structures in new motion sequences.
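The symbolization step, segmenting a motion and mapping similar segments to the same terminal symbol, might be sketched as below. Fixed-length windows and uniform level bins stand in for the paper's optimal segmentation and clustering, so this is an illustrative simplification.

```python
import numpy as np

def motion_to_string(motion, seg_len, symbols="ABCD"):
    """Encode a 1-D motion signal as a terminal string: cut it into
    fixed-length segments and map each segment's mean level to one of a
    small set of symbols via uniform bins."""
    n = len(motion) // seg_len
    segs = motion[: n * seg_len].reshape(n, seg_len)
    levels = segs.mean(axis=1)
    # interior bin edges between min and max level, one symbol per bin
    bins = np.linspace(levels.min(), levels.max(), len(symbols) + 1)[1:-1]
    return "".join(symbols[i] for i in np.digitize(levels, bins))

# an alternating low/high motion becomes an alternating two-symbol string
motion = np.concatenate([np.zeros(10), np.ones(10), np.zeros(10), np.ones(10)])
print(motion_to_string(motion, 10))   # -> "ADAD"
```

A grammar inducer (e.g. SEQUITUR-style) would then compress the repeated "AD" into a nonterminal, capturing the motion's syntactic structure.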

7.
We develop a novel radar-based human motion recognition technique that exploits the temporal sequentiality of human motions. The stacked recurrent neural network (RNN) with long short-term memory (LSTM) units is employed to extract sequential features for automatic motion classification. The spectrogram of raw radar data is used as the network input to utilize the time-varying Doppler and micro-Doppler signatures for human motion characterization. Based on experimental data, we verified that a stacked RNN with two 36-cell LSTM layers successfully classifies six different types of human motions.
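The network input described here is a magnitude spectrogram of the raw radar return. A minimal STFT sketch follows; the Hann window and the window/hop sizes are hypothetical choices, not the paper's radar parameters.

```python
import numpy as np

def spectrogram(signal, win=64, hop=32):
    """Magnitude spectrogram of a 1-D signal via a Hann-windowed STFT.
    Rows are time frames, columns are frequency (Doppler) bins."""
    w = np.hanning(win)
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        frames.append(np.abs(np.fft.rfft(signal[start:start + win] * w)))
    return np.array(frames)

fs = 1000                               # assumed sample rate (Hz)
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 100 * t)       # a single 100 Hz Doppler tone
S = spectrogram(sig)
print(S.shape)                          # (time frames, frequency bins)
```

Each row of `S` would be one time step fed to the stacked LSTM layers.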

8.
Virtual mannequins need to navigate in order to interact with their environment. Their autonomy to accomplish navigation tasks is ensured by locomotion controllers. Control inputs can be user‐defined or automatically computed to achieve high‐level operations (e.g. obstacle avoidance). This paper presents a locomotion controller based on a motion capture edition technique. Controller inputs are the instantaneous linear and angular velocities of the walk. Our solution works in real time and supports at any time continuous changes of inputs. The controller combines three main components to synthesize locomotion animations in a four‐stage process. First, the Motion Library stores motion capture samples. Motion captures are analysed to compute quantitative characteristics. Second, these characteristics are represented in a linear control space. This geometric representation is appropriate for selecting and weighting three motion samples with respect to the input state. Third, locomotion cycles are synthesized by blending the selected motion samples. Blending is done in the frequency domain. Lastly, successive postures are extracted from the synthesized cycles in order to complete the animation of the moving mannequin. The method is demonstrated in this paper in a locomotion‐planning context. Copyright © 2006 John Wiley & Sons, Ltd.
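Selecting and weighting three motion samples in a linear control space amounts to computing barycentric weights of the input (linear velocity, angular velocity) point with respect to the triangle formed by the three selected samples. A sketch, assuming the triangle is non-degenerate; the function name is illustrative.

```python
import numpy as np

def barycentric_weights(p, a, b, c):
    """Barycentric weights of point p w.r.t. triangle (a, b, c) in a 2-D
    control space; the three selected motion samples would be blended
    with these weights. Weights sum to 1."""
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]])
    w = np.linalg.solve(T, np.asarray(p, dtype=float) - np.asarray(c, dtype=float))
    return np.array([w[0], w[1], 1.0 - w[0] - w[1]])

# input velocity state inside the triangle of three sample velocities
print(barycentric_weights((0.25, 0.25), (0, 0), (1, 0), (0, 1)))
```

Negative weights would indicate the input state lies outside the triangle, i.e. the three nearest samples were poorly chosen.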

9.
10.
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example. This means style can be controlled via only a short example motion clip, even for motion styles unseen during training. Our model uses a Variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings. The probabilistic nature of our framework further enables the generation of a variety of outputs given the input, addressing the stochastic nature of gesture motion. In a series of experiments, we first demonstrate the flexibility and generalizability of our model to new speakers and styles. In a user study, we then show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal. Finally, we release a high-quality dataset of full-body gesture motion including fingers, with speech, spanning across 19 different styles. Our code and data are publicly available at https://github.com/ubisoft/ubisoft-laforge-ZeroEGGS.

11.
Annotating unlabeled motion captures plays an important role in computer animation for motion analysis and motion editing purposes. Locomotion is a difficult case study: all the limbs of the human body are involved, whereas a low‐dimensional global motion is performed. The oscillatory nature of locomotion makes it difficult to distinguish straight steps from turning ones, especially for subtle orientation changes. In this paper we propose to geometrically model the center of mass trajectory during locomotion as a C‐continuous sequence of circular arcs. Our model accurately analyzes the global motion in the velocity‐curvature space. An experimental study demonstrates that an invariant law links curvature and velocity during straight walking. We finally illustrate how the resulting law can be used for annotation purposes: any unlabeled motion-captured walk can be transformed into an annotated sequence of straight and turning steps. Several examples demonstrate the robustness of our approach and compare it with classical threshold‐based techniques. Copyright © 2011 John Wiley & Sons, Ltd.
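Analyzing the global motion in velocity-curvature space requires estimating curvature from a sampled center-of-mass trajectory. A finite-difference sketch is shown below; it is not the paper's circular-arc fitting procedure, just the standard signed-curvature formula applied to discrete samples.

```python
import numpy as np

def curvature(traj, dt):
    """Signed curvature k = (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2) of a
    uniformly sampled 2-D trajectory (N x 2), via finite differences."""
    x, y = traj[:, 0], traj[:, 1]
    dx, dy = np.gradient(x, dt), np.gradient(y, dt)
    ddx, ddy = np.gradient(dx, dt), np.gradient(dy, dt)
    return (dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5

# sanity check: a circle of radius 2 has constant curvature 1/2
t = np.linspace(0, 2 * np.pi, 400)
circle = np.stack([2 * np.cos(t), 2 * np.sin(t)], axis=1)
k = curvature(circle, t[1] - t[0])
```

Plotting speed against `k` for each step would then reveal the curvature-velocity relation the paper exploits for annotation.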

12.
We propose a novel method for real-time face alignment in videos based on a recurrent encoder–decoder network model. Our proposed model predicts 2D facial point heat maps regularized by both detection and regression loss, while uniquely exploiting recurrent learning at both spatial and temporal dimensions. At the spatial level, we add a feedback loop connection between the combined output response map and the input, in order to enable iterative coarse-to-fine face alignment using a single network model, instead of relying on traditional cascaded model ensembles. At the temporal level, we first decouple the features in the bottleneck of the network into temporal-variant factors, such as pose and expression, and temporal-invariant factors, such as identity information. Temporal recurrent learning is then applied to the decoupled temporal-variant features. We show that such feature disentangling yields better generalization and significantly more accurate results at test time. We perform a comprehensive experimental analysis, showing the importance of each component of our proposed model, as well as superior results over the state of the art and several variations of our method in standard datasets.

13.
This paper presents an efficient technique for synthesizing motions by stitching, or splicing, an upper‐body motion retrieved from a motion space on top of an existing lower‐body locomotion of another motion. Compared to the standard motion splicing problem, motion space splicing imposes new challenges as both the upper and lower body motions might not be known in advance. Our technique is the first motion (space) splicing technique that propagates temporal and spatial properties of the lower‐body locomotion to the newly generated upper‐body motion and vice versa. Whereas existing techniques only adapt the upper‐body motion to fit the lower‐body motion, our technique also adapts the lower‐body locomotion based on the upper body task for a more coherent full‐body motion. In this paper, we will show that our decoupled approach is able to generate high‐fidelity full‐body motion for interactive applications such as games.

14.
Horse locomotion exhibits rich variations in gaits and styles. Although there have been many approaches proposed for animating quadrupeds, there is not much research on synthesizing horse locomotion. In this paper, we present a horse locomotion synthesis approach. A user can arbitrarily change a horse's moving speed and direction, and our system would automatically adjust the horse's motion to fulfill the user's commands. At preprocessing, we manually capture horse locomotion data from Eadweard Muybridge's famous photographs of animal locomotion and expand the captured motion database to various speeds for each gait. At runtime, our approach automatically changes gaits based on speed, synthesizes the horse's root trajectory, and adjusts its body orientation based on the horse's turning direction. We propose an asynchronous time warping approach to handle gait transition, which is critical for generating realistic and controllable horse locomotion. Our experiments demonstrate that our system can produce smooth, rich, and controllable horse locomotion in real time. Copyright © 2012 John Wiley & Sons, Ltd.
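The speed-based gait switching could be sketched as a simple threshold rule. The thresholds below are rough real-world horse gait transition speeds used purely for illustration; they are not values calibrated by the authors, and the paper's runtime additionally blends transitions with asynchronous time warping rather than switching abruptly.

```python
def select_gait(speed):
    """Map a commanded speed (m/s) to a horse gait.
    Thresholds are illustrative, not the paper's calibrated values."""
    if speed < 1.7:
        return "walk"
    if speed < 4.4:
        return "trot"
    if speed < 7.5:
        return "canter"
    return "gallop"

for s in (1.0, 3.0, 6.0, 10.0):
    print(s, select_gait(s))
```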

15.
3D human pose estimation from motion is an active research direction in computer vision. However, algorithm performance is affected by the complexity of 3D spatial information, self-occlusion of the human body, mapping uncertainty, and other problems. In this paper, we propose a 3D human joint localization method based on a multi-stage regression deep network and a 2D-to-3D point mapping algorithm. First, taking a single RGB image as input, we introduce heatmaps and multi-stage regression to progressively refine the coordinates of the human joint points. The 2D joint points are then fed into the mapping network to compute the 3D joint coordinates, completing the 3D human pose estimation task. The algorithm achieves an MPJPE of 40.7 on the Human3.6M dataset. Evaluation on this dataset shows that our method has clear advantages.
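Reading joint coordinates out of a heatmap is commonly done with a soft-argmax, a softmax-weighted average of pixel coordinates that stays differentiable for end-to-end training. Whether this paper uses a soft or hard argmax is not stated, so treat this as an illustrative sketch.

```python
import numpy as np

def soft_argmax(heatmap):
    """Differentiable 2-D joint coordinate (x, y) from a heatmap:
    softmax over the map, then the probability-weighted mean of the
    pixel coordinate grid."""
    h, w = heatmap.shape
    p = np.exp(heatmap - heatmap.max())   # stable softmax
    p /= p.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return np.array([(p * xs).sum(), (p * ys).sum()])

# a sharp peak at row 2, column 3 reads out as (x=3, y=2)
hm = np.zeros((5, 5))
hm[2, 3] = 50.0
print(soft_argmax(hm))
```

The resulting 2D joints are what a 2D-to-3D mapping network (as described above) would consume.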

16.
We address the question of how to characterize the outliers that may appear when matching two views of the same scene. The match is performed by comparing the difference of the two views at a pixel level aiming at a better registration of the images. When using digital photographs as input, we notice that an outlier is often a region that has been occluded, an object that suddenly appears in one of the images, or a region that undergoes an unexpected motion. By assuming that the error in pixel intensity generated by the outlier is similar to an error generated by comparing two random regions in the scene, we can build a model for the outliers based on the content of the two views. We illustrate our model by solving a pose estimation problem: the goal is to compute the camera motion between two views. The matching is expressed as a mixture of inliers versus outliers, and defines a function to minimize for improving the pose estimation. Our model has two benefits: First, it delivers a probability for each pixel to belong to the outliers. Second, our tests show that the method is substantially more robust than traditional robust estimators (M-estimators) used in image stitching applications, with only a slightly higher computational complexity.
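The inlier/outlier mixture can be sketched as a per-pixel posterior under a Gaussian inlier model and a constant outlier density. In the paper the outlier density is derived from comparing random regions of the scene; here it is simply a given constant, and all names are illustrative.

```python
import numpy as np

def inlier_posterior(residuals, sigma, p_out, outlier_density):
    """Per-pixel probability of being an inlier under a two-component
    mixture: Gaussian inlier likelihood vs. a constant outlier density."""
    inl = (1.0 - p_out) * np.exp(-residuals**2 / (2 * sigma**2)) \
          / (np.sqrt(2 * np.pi) * sigma)
    out = p_out * outlier_density
    return inl / (inl + out)

# a small residual is almost surely an inlier; a huge one is not
w = inlier_posterior(np.array([0.0, 10.0]), sigma=1.0,
                     p_out=0.2, outlier_density=0.01)
print(w)
```

These posteriors are exactly the per-pixel outlier probabilities the abstract mentions, and they can weight the pose-estimation objective.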

17.
Objective: Motivated by the need for real-time, accurate, and robust human motion analysis, and starting from the feature extraction and motion modeling problems, this paper proposes an example-based learning method for human motion analysis. Method: Building on a library of human pose examples, a motion detection method first extracts the human silhouette in each video frame. Next, a shape-context contour matching method retrieves a set of candidate poses for each frame from the library. Finally, motion analysis is performed through statistical modeling and transition-probability modeling. Results: In experiments on walking, running, and jumping test videos, the contour-based shape-context representation and matching show good expressive power. The method's average joint-angle error is about 5°, which effectively improves on the accuracy of competing algorithms. Conclusion: The proposed example-based learning method effectively analyzes human motion in monocular video, overcomes the depth ambiguity of the mapping, is robust to viewpoint changes, and offers good computational efficiency and accuracy.

18.
Existing motion deblurring algorithms struggle to restore compound motion blur that contains large-scale rotation. To address this, a neural network framework based on the U-net model is proposed. The framework fuses motion information into the network input, imposing a different motion constraint at each pixel. Through the network's encoder-decoder structure, a prediction is obtained for every pixel, yielding the restored image directly in an end-to-end manner. On a public benchmark the method is compared with current state-of-the-art deblurring algorithms: it improves PSNR (peak signal-to-noise ratio) by 0.14 dB over the best-performing method and reduces running time by 0.1 s compared with the fastest. Validation on a test set containing rotational motion further shows that the algorithm achieves good restoration quality.
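The PSNR figure quoted above is computed from the mean squared error between the restored image and its sharp reference; a minimal sketch of the metric:

```python
import numpy as np

def psnr(img, ref, peak=255.0):
    """Peak signal-to-noise ratio in dB between a restored image and its
    sharp reference: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

# a uniform error of 1 grey level gives MSE = 1, i.e. about 48.13 dB
ref = np.zeros((8, 8))
img = np.ones((8, 8))
print(psnr(img, ref))
```

A 0.14 dB gain at this scale corresponds to a small but consistent reduction in restoration error.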

19.
Objective: Among gait recognition algorithms, appearance-based methods are accurate and easy to implement but sensitive to appearance changes, while model-based methods are more robust to appearance changes but harder to build and less accurate. To obtain high accuracy together with robustness to appearance changes, a two-branch network that fuses appearance features and pose features is proposed, combining the advantages of both approaches. Method: The two-branch model contains an appearance branch and a pose branch. The appearance branch uses the GaitSet network to extract appearance features from silhouette images; the pose branch uses a five-layer convolutional network to extract pose features from pose skeletons. On top of these, a feature fusion module fuses the appearance and pose features; a channel attention mechanism enables fusion of features of arbitrary size, and the module is designed to suppress noise in the features during fusion. The fused gait features are finally used to identify pedestrians. Results: Experiments on the CASIA-B dataset (Institute of Automation, Chinese Academy of Sciences, Gait Dataset B) compare the algorithm with current mainstream gait recognition methods under two settings, cross-view and different walking conditions, using Rank-1 accuracy as the metric. Under the MT (medium-sample training) split of the cross-view setting, the algorithm reaches 93.4%, 84.8%, and 70.9% accuracy under the three walking conditions, improvements of 1.4%, 0.5%, and 8.4% over the second-best method. Under the different-walking-condition setting, it reaches 94.9% and 90.0% accuracy under the two walking conditions, the best performance overall. Conclusion: In scenarios where both appearance data and pose data are available, the algorithm effectively fuses the two kinds of information, obtaining richer gait features while reducing the influence of appearance changes, and improves gait recognition performance.
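The channel attention used in the fusion module can be sketched in squeeze-and-excitation style: global average pooling, two small dense layers, and sigmoid gates that reweight channels. The layer shapes and the NumPy formulation below are assumptions for illustration, not the paper's exact module.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention over a (C, H, W)
    feature map: global average pool -> dense + ReLU -> dense + sigmoid
    gates that reweight channels (here, to suppress noisy channels when
    fusing appearance and pose features)."""
    squeeze = feat.mean(axis=(1, 2))            # (C,) global average pool
    hidden = np.maximum(w1 @ squeeze, 0.0)      # ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gates in (0, 1)
    return feat * gates[:, None, None]

# identity weights: every channel is scaled by sigmoid(1) ~ 0.731
out = channel_attention(np.ones((4, 2, 2)), np.eye(4), np.eye(4))
print(out.shape)
```

Because each gate lies in (0, 1), the module can only attenuate channels, which is how it suppresses noise during fusion.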

20.
In this paper, we present an on‐line real‐time physics‐based approach to motion control with contact repositioning, based on a low‐dimensional dynamics model and example motion data. Our approach first generates a reference motion at run time, according to an on‐line user request, by transforming an example motion extracted from a motion library. Guided by the reference motion, it repeatedly generates an optimal control policy for a small time window, one at a time, over a sequence of partially overlapping windows, each covering a couple of footsteps of the reference motion, which supports on‐line performance. On top of this, our system dynamics and problem formulation allow us to derive closed‐form derivative functions by exploiting the low‐dimensional dynamics model together with the example motion data. These derivative functions and their sparse structures facilitate real‐time performance. Our approach also allows contact foot repositioning, so as to respond robustly to an external perturbation or an environmental change, as well as to perform locomotion tasks such as stepping on stones effectively.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号