Similar Literature
20 similar results found (search time: 15 ms)
1.
2.
We propose a novel approach for face tracking, resulting in a visual feedback loop: instead of trying to adapt a more or less realistic artificial face model to an individual, we construct from precise range data a specific texture and wireframe face model, whose realism allows the analysis and synthesis modules to visually cooperate in the image plane, by directly using 2D patterns synthesized by the face model. Unlike other feedback loops found in the literature, we do not explicitly handle the 3D complex geometric data of the face model, to make real-time manipulations possible. Our main contribution is a complete face tracking and pose estimation framework, with few assumptions about the face rigid motion (allowing large rotations out of the image plane), and without marks or makeup on the user's face. Our framework feeds the feature-tracking procedure with synthesized facial patterns, controlled by an extended Kalman filter. Within this framework, we present original and efficient geometric and photometric modelling techniques, and a reformulation of a block-matching algorithm to make it match synthesized patterns with real images, and avoid background areas during the matching. We also offer some numerical evaluations, assessing the validity of our algorithms, and new developments in the context of facial animation. Our face-tracking algorithm may be used to recover the 3D position and orientation of a real face and generate an MPEG-4 animation stream to reproduce the rigid motion of the face with a synthetic face model. It may also serve as a pre-processing step for further facial expression analysis algorithms, since it locates the position of the facial features in the image plane, and gives precise 3D information to take into account the possible coupling between pose and expressions of the analysed facial images.
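The reformulated block matching described in this abstract scores a synthesized facial patch against the real image while skipping background pixels. A minimal sketch of that idea, assuming a boolean foreground mask and a plain SSD criterion (the function name and search strategy are illustrative, not the paper's actual implementation):

```python
import numpy as np

def masked_block_match(image, pattern, mask, search=4):
    """Find the offset in `image` best matching `pattern`, scoring only
    pixels where `mask` is True (foreground of the synthesized facial
    patch), so background areas are ignored during the match."""
    ph, pw = pattern.shape
    best, best_off = np.inf, (0, 0)
    for dy in range(search + 1):
        for dx in range(search + 1):
            window = image[dy:dy + ph, dx:dx + pw]
            if window.shape != pattern.shape:
                continue  # window runs off the image border
            ssd = np.sum(((window - pattern) ** 2)[mask])
            if ssd < best:
                best, best_off = ssd, (dy, dx)
    return best_off
```

In the paper's loop, `pattern` would be rendered by the face model at the pose predicted by the extended Kalman filter, and the matched offsets would feed the filter's measurement update.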

3.
We describe the components of the system used for real-time facial communication using a cloned head. We begin by describing the automatic face cloning using two orthogonal photographs of a person. The steps in this process are face model matching and texture generation. After an introduction to the MPEG-4 parameters that we are using, we proceed with the explanation of the facial feature tracking using a video camera. The technique requires an initialization step and is further divided into mouth and eye tracking. These steps are explained in detail. We then explain the speech processing techniques used for real-time phoneme extraction and the subsequent speech animation module. We conclude with the results and comments on the integration of the modules towards a complete system.

4.
Due to the advent of the MPEG-4 standard, facial animation has been receiving significant attention lately. A common approach for facial animation is to use a mesh model. The physics-based transformation, elastic body spline (EBS), has been proposed to deform the facial mesh model and generate realistic expressions by assuming the whole facial image has the same elastic property. In this paper, we partition facial images into different regions and propose an iterative algorithm to find the elastic property of each facial region. By doing so, we can obtain the EBS for mesh vertices in the facial mesh model such that facial animation can be achieved more realistically.

5.
吴晓军  鞠光亮 《电子学报》2016,44(9):2141-2147
A markerless facial expression capture method is proposed. First, a uniform face mesh model covering 85% of the facial features is generated from ASM (Active Shape Model) facial feature points. Second, an expression capture method based on this face model is proposed: optical flow tracks the displacements of the feature points, with a particle filter used to stabilize the tracking results; the feature-point displacements drive the overall deformation of the mesh and serve as the initial value for mesh tracking, with a mesh deformation algorithm used to drive the mesh. Finally, the captured expression data are used to drive different face models, with different driving methods chosen according to the dimensionality of the model to reproduce the expression animation. Experimental results show that the proposed algorithm captures facial expressions well, and mapping the captured expressions onto both a 2D cartoon face and a 3D virtual face model yields good animation results.
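The particle-filter stabilization step mentioned above can be sketched as a bootstrap filter over a tracked 2D feature point: diffuse the particles with a simple motion model, weight them by how well they explain the noisy optical-flow observation, and resample. All parameter values here are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=500, proc_std=0.5, obs_std=2.0):
    """Bootstrap particle filter smoothing a noisy 2D feature-point track.
    `observations` is an (N, 2) array of raw tracked positions."""
    particles = np.tile(observations[0], (n_particles, 1)) \
        + rng.normal(0, obs_std, (n_particles, 2))
    estimates = []
    for z in observations:
        particles += rng.normal(0, proc_std, particles.shape)  # diffusion motion model
        d2 = np.sum((particles - z) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / obs_std ** 2)                   # Gaussian likelihood
        w /= w.sum()
        idx = rng.choice(n_particles, n_particles, p=w)        # resample
        particles = particles[idx]
        estimates.append(particles.mean(axis=0))               # posterior mean estimate
    return np.array(estimates)
```

In the method described above, the stabilized positions would then drive the mesh deformation as the initial value for mesh tracking.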

6.
In this paper, the authors have developed a system that animates 3D facial agents based on real-time facial expression analysis techniques and research on synthesizing facial expressions and text-to-speech capabilities. This system combines visual, auditory, and primary interfaces to communicate one coherent multimodal chat experience. Users can represent themselves using agents they select from a group that we have predefined. When a user shows a particular expression while typing a text, the 3D agent at the receiving end speaks the message aloud while it replays the recognized facial expression sequences and also augments the synthesized voice with appropriate emotional content. Because the visual data exchange is based on the MPEG-4 high-level Facial Animation Parameter for facial expressions (FAP 2), rather than real-time video, the method requires very low bandwidth.
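The bandwidth claim is easy to make concrete: a high-level expression FAP carries only expression indices and intensities per frame, not pixels. A back-of-the-envelope comparison, assuming a 5-byte payload per frame (this field layout is illustrative, not the normative MPEG-4 bitstream syntax):

```python
import struct

def pack_expression(expr1, intensity1, expr2, intensity2, blend):
    """Toy payload for one expression frame: two expression indices,
    two intensities, and a blend factor, one byte each."""
    return struct.pack("<BBBBB", expr1, intensity1, expr2, intensity2, blend)

frame_rate = 25
fap_bytes_per_sec = frame_rate * len(pack_expression(3, 40, 0, 0, 0))
raw_video_bytes_per_sec = frame_rate * 176 * 144 * 3  # QCIF RGB, uncompressed
```

Even against uncompressed QCIF video this is a four-orders-of-magnitude reduction, which is why the chat system above can run over very low-bandwidth links.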

7.
Referring to the new functionality of video access and coding, the survey presented here lies within the scope of MPEG-4 activities related to virtual character (VC) animation. We first describe how Amendment 1 of the MPEG-4 standard offers an appropriate framework for virtual human animation, gesture synthesis and compression/transmission. Specifically, face and body representation and animation are described in detail in terms of node syntax and animation stream encoding methods. Then, we discuss how this framework is extended within the ongoing standardization efforts by (1) allowing the animation of any kind of articulated model, and (2) addressing advanced modeling and animation concepts such as the “skin and bones”-based approach. The new syntax for node definition and animation streams is presented and discussed in terms of genericity and additional functionalities. The biomechanical properties, modeled by means of the character skeleton that defines the bone influence on the skin region, as well as the local spatial deformations simulating muscles, are supported by specific nodes. Animating the VC consists of instantiating bone transformations and muscle control curves. Interpolation techniques, inverse kinematics, discrete cosine transform and arithmetic encoding techniques make it possible to provide a highly compressed animation stream. The new animation framework extension tools are finally evaluated in terms of realism, complexity and transmission bandwidth within a sign language communication system.

8.
This paper presents an overview of some of the synthetic visual objects supported by MPEG-4 version 1, namely animated faces and animated arbitrary 2D uniform and Delaunay meshes. We discuss both specification and compression of face animation and 2D-mesh animation in MPEG-4. Face animation makes it possible to animate a proprietary face model or a face model downloaded to the decoder. We also address integration of the face animation tool with the text-to-speech interface (TTSI), so that face animation can be driven by text input.

9.
With better understanding of face anatomy and technical advances in computer graphics, 3D face synthesis has become one of the most active research fields for many human-machine applications, ranging from immersive telecommunication to the video games industry. In this paper we propose a method that automatically extracts features such as the eyes, mouth, eyebrows and nose from a given frontal face image. A generic 3D face model is then superimposed onto the face in accordance with the extracted facial features in order to fit the input face image by transforming the vertex topology of the generic face model. The specific 3D face can finally be synthesized by texturing the individualized face model. Once the model is ready, six basic facial expressions are generated with the help of MPEG-4 facial animation parameters. To generate transitions between these facial expressions we use 3D shape morphing between the corresponding face models and blend the corresponding textures. The novelty of our method is the automatic generation of the 3D model and the synthesis of faces with different expressions from a frontal neutral face image. Our method has the advantage that it is fully automatic, robust, fast and can generate various views of the face by rotation of the 3D model. It can be used in a variety of applications for which the accuracy of depth is not critical, such as games, avatars, and face recognition. We have tested and evaluated our system using a standard database, namely BU-3DFE.
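The transition step above (3D shape morphing plus texture blending) reduces, in its simplest form, to a linear blend of corresponding vertex arrays and texture maps. A minimal sketch under that assumption (toy data, not the paper's meshes):

```python
import numpy as np

def morph(verts_a, verts_b, tex_a, tex_b, t):
    """Linear morph between two expression meshes with texture blending.
    t = 0 gives expression A, t = 1 gives expression B; correspondence
    between vertices (and texels) is assumed to be already established."""
    verts = (1 - t) * verts_a + t * verts_b
    tex = (1 - t) * tex_a + t * tex_b
    return verts, tex

neutral = np.zeros((3, 3))  # toy vertex array: 3 vertices, xyz
smile = np.array([[0.0, 0.2, 0.0], [0.1, 0.3, 0.0], [-0.1, 0.3, 0.0]])
```

Stepping `t` from 0 to 1 over successive frames yields the expression transition; in practice one would ease the schedule rather than step linearly.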

10.
王海波  蔡骏  余兆明 《电视技术》2003,(2):10-13,25
MPEG-4 encodes face objects through facial definition parameters (FDP) and facial animation parameters (FAP), yielding video streams at extremely low bit rates. This paper discusses the structure of this type of face-object video bitstream in MPEG-4 and the processes of parameter encoding and decoding.

11.
The MPEG-4 standard allows composition of natural or synthetic video with facial animation. Based on this standard, an animated face can be inserted into natural or synthetic video to create new virtual working environments such as virtual meetings or virtual collaborative environments. For these applications, audio-to-visual conversion techniques can be used to generate a talking face that is synchronized with the voice. In this paper, we address audio-to-visual conversion problems by introducing a novel Hidden Markov Model Inversion (HMMI) method. In training audio-visual HMMs, the model parameters {av} can be chosen to optimize some criterion such as maximum likelihood. In inversion of audio-visual HMMs, visual parameters that optimize some criterion can be found based on the given speech and the model parameters {av}. By using the proposed HMMI technique, an animated talking face can be synchronized with audio and can be driven realistically. The HMMI technique, combined with the MPEG-4 standard to create a virtual conference system named VIRTUAL-FACE, is introduced to show the role of HMMI for applications of MPEG-4 facial animation.
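The core of HMM inversion can be illustrated per frame with joint Gaussian emissions over (audio, visual): choose the state whose audio marginal best explains the observed audio feature, then output the visual value that maximizes the joint likelihood given the audio, which for a Gaussian is the conditional mean. This single-frame sketch with toy one-dimensional features and hand-set state parameters is only an illustration of the idea, not the paper's full sequence-level optimization:

```python
import numpy as np

def invert_frame(a, states):
    """Pick the state best explaining audio feature `a`, then return the
    visual parameter maximizing the joint likelihood given `a`
    (Gaussian conditional mean E[v | a])."""
    best = max(states, key=lambda s: -0.5 * ((a - s["mu_a"]) ** 2) / s["var_a"]
                                     - 0.5 * np.log(s["var_a"]))
    return best["mu_v"] + best["cov_av"] / best["var_a"] * (a - best["mu_a"])

states = [  # toy trained states, e.g. closed vs open mouth
    {"mu_a": 0.0, "var_a": 1.0, "mu_v": 0.1, "cov_av": 0.5},
    {"mu_a": 5.0, "var_a": 1.0, "mu_v": 0.9, "cov_av": 0.5},
]
```

A full implementation would optimize the visual trajectory jointly over the whole utterance using the HMM's transition structure rather than frame by frame.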

12.
Compression of computer graphics data such as static and dynamic 3D meshes has received significant attention in recent years, since new applications require transmission over channels and storage on media with limited capacity. This includes pure graphics applications (virtual reality, games) as well as 3DTV and free viewpoint video. Efficient compression algorithms were developed first for static 3D meshes, and later for dynamic 3D meshes and animations. Standard formats are available, for instance, in MPEG-4: 3D mesh compression for static meshes, and Interpolator Compression for the animation part. For some important types of 3D objects, e.g. human head or body models, facial and body animation parameters have been introduced. Recent results for compression of general dynamic meshes have shown that the statistical dependencies within a mesh sequence can be exploited well by predictive coding approaches. Coders introduced so far use experimentally determined or heuristic thresholds for tuning the algorithms. In video coding, rate-distortion (RD) optimization is often used to avoid fixed thresholds and to select the optimum prediction mode. We applied these ideas and present here an RD-optimized dynamic 3D mesh coder. It includes different prediction modes as well as an RD cost computation that controls the mode selection across all possible spatial partitions of a mesh to find the clustering structure together with the associated prediction modes. The general coding structure is derived from statistical analysis of mesh sequences and exploits temporal as well as spatial mesh dependencies. To evaluate the coding efficiency of the developed coder, comparative coding experiments on mesh sequences at different resolutions were carried out.
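The RD-optimized mode decision described above follows the standard Lagrangian formulation from video coding: each candidate mode has a distortion D and a rate R, and the coder picks the mode minimizing J = D + λR. A minimal sketch with toy distortion/rate numbers (the mode names and values are illustrative):

```python
def select_mode(modes, lam):
    """Pick the prediction mode minimizing the RD cost J = D + lambda * R."""
    return min(modes, key=lambda m: m["D"] + lam * m["R"])

modes = [
    {"name": "temporal", "D": 1.0, "R": 10.0},  # good prediction, costly residual rate
    {"name": "spatial",  "D": 4.0, "R": 2.0},   # coarser prediction, cheap to signal
]
```

Small λ favors low distortion (temporal prediction here); large λ favors low rate (spatial prediction), which is exactly how the Lagrange multiplier sweeps out the RD curve.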

13.
The video analysis system described in this paper aims at facial expression recognition consistent with the MPEG-4 standardized parameters for facial animation, FAP. For this reason, two levels of analysis are necessary: low-level analysis to extract the MPEG-4 compliant parameters, and high-level analysis to estimate the expression of the sequence using these low-level parameters. The low-level analysis is based on an improved active contour algorithm that uses high-level information based on principal component analysis to locate the most significant contours of the face (eyebrows and mouth), and on motion estimation to track them. The high-level analysis takes as input the FAP produced by the low-level analysis tool and, by means of a Hidden Markov Model classifier, detects the expression of the sequence.

14.
15.
The MPEG-4 specifications have provided substantial progress in many areas of multimedia technology. Following MPEG tradition, MPEG-4 focuses on media coding. However, a couple of innovative aspects other than media coding characterise MPEG-4 with respect to its predecessors: the ability to code an audio-visual scene, and the ability to abstract from the delivery technology. This paper focuses its attention on the latter aspect, which is covered by part 6 of the MPEG-4 specification: DMIF (Delivery Multimedia Integration Framework). The paper explains the motivations that have driven the delivery technology abstraction, analyses the details of the DMIF architecture, and highlights the practical impact on the “bits-on-the-wire” and on conformance issues. It is important not to forget, throughout this paper, that the whole focus of this work is on real-time delivery of multimedia content.
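The delivery abstraction DMIF provides can be illustrated structurally: application code talks to one uniform channel-opening interface while concrete delivery technologies plug in underneath. The class and method names below are illustrative of the pattern only, not the normative DMIF Application Interface (DAI) signatures:

```python
class LocalDelivery:
    """Delivery backend reading from local storage."""
    def __init__(self, store):
        self.store = store

    def open_channel(self, name):
        return iter(self.store[name])

class NetworkDelivery:
    """Delivery backend fetching over a network (stubbed by a callback)."""
    def __init__(self, fetch):
        self.fetch = fetch

    def open_channel(self, name):
        return iter(self.fetch(name))

def play(delivery, name):
    # Application code: identical regardless of the delivery technology,
    # which is the point of the abstraction.
    return list(delivery.open_channel(name))
```

Swapping `LocalDelivery` for `NetworkDelivery` changes the "bits on the wire" without touching the application, mirroring how DMIF lets the same MPEG-4 terminal consume local files, broadcast, or interactive network content.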

16.
Real-time, realistic facial expression animation is a challenging topic in computer graphics. Addressing the computational complexity and coarse results of existing physically based model algorithms, this paper describes a combined morphing animation algorithm, developed on the Direct3D platform, that integrates a physical model with an expression-morphing algorithm, and the process of using it to generate realistic expression animation. Experiments show that this method greatly enhances the realism of the generated facial expression animation.

17.
This paper presents a novel view-based approach to quantify and reproduce facial expressions, by systematically exploiting the degrees of freedom allowed by a realistic face model. This approach embeds efficient mesh morphing and texture animations to synthesize facial expressions. We suggest using eigenfeatures, built from synthetic images, and designing an estimator to interpret the responses of the eigenfeatures on a facial expression in terms of animation parameters.
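The eigenfeature-plus-estimator pipeline can be sketched end to end: build a PCA basis from synthetically rendered expression images, project new images onto it, and fit a linear estimator from projection coefficients to animation parameters. The toy linear image generator and all dimensions below are assumptions for illustration, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "rendered expressions": each sample is an image vector produced
# by a known animation parameter p via a toy linear appearance model.
basis = rng.normal(size=(64,))              # one appearance mode (64-pixel image)
params = np.linspace(0.0, 1.0, 20)          # training animation parameters
images = np.outer(params, basis) + rng.normal(0, 0.01, (20, 64))

mean = images.mean(axis=0)
U, S, Vt = np.linalg.svd(images - mean, full_matrices=False)
eigenfeatures = Vt[:3]                      # top eigenfeatures of the training set

coeffs = (images - mean) @ eigenfeatures.T  # eigenfeature responses on training set
A, *_ = np.linalg.lstsq(np.c_[coeffs, np.ones(20)], params, rcond=None)

def estimate(img):
    """Interpret eigenfeature responses of `img` as an animation parameter."""
    c = (img - mean) @ eigenfeatures.T
    return np.r_[c, 1.0] @ A
```

Because the eigenfeatures are built from synthetic images, the estimator is trained entirely against the face model, which is what lets the approach quantify expressions without hand-labeled real footage.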

18.
In this paper a modular approach of gradual confidence for facial feature extraction over real video frames is presented. The problem is dealt with under general imaging conditions and soft presumptions. The proposed methodology copes with large variations in the appearance of diverse subjects, as well as of the same subject in various instances within real video sequences. Areas of the face that statistically seem to be outstanding form an initial set of regions that are likely to include information about the features of interest. Enhancement of these regions produces closed objects, which reveal—through the use of a fuzzy system—a dominant angle, i.e. the facial rotation angle. The object set is restricted using the dominant angle. An exhaustive search is performed among all candidate objects, matching a pattern that models the relative position of the eyes and the mouth. Labeling of the winner features can be used to evaluate the features extracted and provide feedback in an iterative framework. A subset of the MPEG-4 facial definition or facial animation parameter set can be obtained. This gradual feature revelation is performed under optimization for each step, producing a posteriori knowledge about the face and leading to a step-by-step visualization of the features in search.

19.
A 3D scanner can accurately acquire the geometry and texture of a human face, but the raw face scan data form only a single continuous surface, which does not match the actual structure of a face and cannot be used for facial animation. To address this problem, a method for face modeling from 3D scan data is proposed: a generic face model with a complete structure is first roughly fitted to the scan data, and detail-reconstruction techniques then recover the surface details and skin texture of the specific face. Experiments show that the 3D face model built with this method is highly realistic and structurally complete, and can generate continuous, natural expression animation.

20.