20 similar documents retrieved.
1.
With a better understanding of face anatomy and technical advances in computer graphics, 3D face synthesis has become one of the most active research fields for many human-machine applications, ranging from immersive telecommunication to the video game industry. In this paper we propose a method that automatically extracts features such as the eyes, mouth, eyebrows and nose from a given frontal face image. A generic 3D face model is then superimposed onto the face in accordance with the extracted facial features, fitting the input face image by transforming the vertex topology of the generic model. The specific 3D face can finally be synthesized by texturing the individualized face model. Once the model is ready, six basic facial expressions are generated with the help of MPEG-4 facial animation parameters. To generate transitions between these facial expressions we use 3D shape morphing between the corresponding face models and blend the corresponding textures. The novelty of our method is the automatic generation of the 3D model and the synthesis of faces with different expressions from a single frontal neutral face image. Our method has the advantage that it is fully automatic, robust and fast, and can generate various views of the face by rotating the 3D model. It can be used in a variety of applications for which the accuracy of depth is not critical, such as games, avatars and face recognition. We have tested and evaluated our system using a standard database, namely BU-3DFE.
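The expression transitions described above reduce to linear interpolation of vertex positions between two fitted models together with texture cross-fading. A minimal sketch of that step, assuming NumPy arrays of the given shapes (the shapes and names are illustrative, not taken from the paper):

```python
import numpy as np

def morph_expression(verts_a, verts_b, tex_a, tex_b, t):
    """Blend two expression keyframes that share the same topology.

    verts_a, verts_b: (N, 3) vertex arrays of the individualized model
    tex_a, tex_b:     (H, W, 3) float textures in [0, 1]
    t:                interpolation parameter in [0, 1]
    """
    verts = (1.0 - t) * verts_a + t * verts_b   # 3D shape morphing
    tex = (1.0 - t) * tex_a + t * tex_b         # texture blending
    return verts, tex

# Usage: 30 in-between frames from neutral to smile
# frames = [morph_expression(v_neutral, v_smile, t_neutral, t_smile, t)
#           for t in np.linspace(0.0, 1.0, 30)]
```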
2.
Realistic speech animation based on observed 3D face dynamics
Muller P. Kalberer G.A. Proesmans M. Van Gool L. 《Vision, Image and Signal Processing, IEE Proceedings -》2005,152(4):491-500
An efficient system for realistic speech animation is proposed. The system supports all steps of the animation pipeline, from the capture or design of 3D head models up to the synthesis and editing of the performance. This pipeline is fully 3D, which yields high flexibility in the use of the animated character. Real, detailed 3D face dynamics, observed at video frame rate for thousands of points on the faces of speaking actors, underpin the realism of the facial deformations. These are given a compact and intuitive representation via independent component analysis (ICA). Performances amount to trajectories through this 'viseme space'. When asked to animate a face, the system replicates the 'visemes' that it has learned, and adds the necessary coarticulation effects. Realism has been improved through comparisons with motion-captured ground truth. Faces for which no 3D dynamics have been observed can nonetheless be animated; their visemes are adapted automatically to their physiognomy by localising the face in a 'face space'.
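The 'viseme space' construction can be sketched with an off-the-shelf ICA such as scikit-learn's FastICA; this is a stand-in for the authors' exact ICA variant, and the data below is a random placeholder for real capture data:

```python
import numpy as np
from sklearn.decomposition import FastICA

# X: observed face dynamics, one row per video frame, columns are the
# stacked (x, y, z) offsets of tracked facial points (placeholder data)
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3 * 500))

ica = FastICA(n_components=10, random_state=0)
trajectory = ica.fit_transform(X)   # frames as points in the 'viseme space'

# A performance is a trajectory through this space; animation replays
# learned viseme coordinates and maps them back to 3D point offsets:
frame_offsets = ica.inverse_transform(trajectory[:1])
```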
3.
Cosker D.P. Marshall A.D. Rosin P.L. Hicks Y.A. 《Vision, Image and Signal Processing, IEE Proceedings -》2004,151(4):314-321
A system capable of producing near video-realistic animation of a speaker given only speech inputs is presented. The audio input is a continuous speech signal that requires no phonetic labelling, and the system is speaker independent. The system requires only a short video training corpus of a subject speaking a list of viseme-targeted words in order to achieve convincing, realistic facial synthesis. It learns the natural mouth and face dynamics of a speaker, allowing new facial poses unseen in the training video to be synthesised. To achieve this, the authors have developed a novel approach which utilises a hierarchical and nonlinear principal component analysis (PCA) model that couples speech and appearance. Animation of different facial areas, defined by the hierarchy, is performed separately and merged in post-processing using an algorithm which combines texture and shape PCA data. It is shown that the model is capable of synthesising videos of a speaker using new audio segments from both previously heard and unheard speakers.
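The speech-appearance coupling can be illustrated with a plain joint PCA, a deliberate simplification of the hierarchical, nonlinear model in the paper; all dimensions below are invented:

```python
import numpy as np
from sklearn.decomposition import PCA

# Each row pairs an audio feature vector with the synchronous appearance vector.
rng = np.random.default_rng(1)
audio = rng.normal(size=(1000, 13))    # per-frame speech features (placeholder)
appear = rng.normal(size=(1000, 40))   # shape + texture PCA coefficients
joint = np.hstack([audio, appear])

pca = PCA(n_components=12).fit(joint)
A = pca.components_[:, :13]            # audio block of each joint component
B = pca.components_[:, 13:]            # appearance block

# Synthesis: given audio only, solve for the joint coefficients through the
# audio block, then read off the predicted appearance.
a_new = rng.normal(size=(13,))
mean_a, mean_b = pca.mean_[:13], pca.mean_[13:]
coeffs, *_ = np.linalg.lstsq(A.T, a_new - mean_a, rcond=None)
appearance_pred = mean_b + coeffs @ B
```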
4.
Chandrasiri N.P. Naemura T. Ishizuka M. Harashima H. Barakonyi I. 《Multimedia, IEEE》2004,11(3):20-29
The authors have developed a system that animates 3D facial agents based on real-time facial expression analysis techniques and research on synthesizing facial expressions and text-to-speech capabilities. The system combines visual, auditory, and primary interfaces to communicate one coherent multimodal chat experience. Users represent themselves with agents selected from a predefined group. When a user shows a particular expression while typing text, the 3D agent at the receiving end speaks the message aloud, replays the recognized facial expression sequences, and augments the synthesized voice with appropriate emotional content. Because the visual data exchange is based on the MPEG-4 high-level Facial Animation Parameter for facial expressions (FAP 2), rather than real-time video, the method requires very low bandwidth.
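The low-bandwidth claim follows from exchanging a small expression parameter per update rather than video frames. A hypothetical message layout (not the actual MPEG-4 FAP bitstream syntax) makes the size concrete:

```python
import struct

# Hypothetical chat update: expression id (e.g. 1=joy ... 6=surprise),
# intensity in [0, 63], and a frame counter. Six bytes per update,
# versus tens of kilobytes for an encoded video frame.
def pack_expression_update(expression_id: int, intensity: int, frame: int) -> bytes:
    return struct.pack("<BBI", expression_id, intensity, frame)

msg = pack_expression_update(1, 40, 1234)
assert len(msg) == 6
```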
5.
Animation and its enabling technologies currently attract wide attention in industry, yet the realism of facial animation for expressions such as joy, anger, sorrow and happiness is still limited. Building on the Waters muscle model, a NURBS elastic muscle model is proposed: guided by anatomical knowledge, muscles are simulated with non-uniform rational B-spline curves. By changing the weights of the curve control points, an action vector can be found that controls the motion of the muscle and thereby synthesizes various facial expressions. The more control points there are, the more finely the muscle can be controlled, and the more realistically facial expressions can be simulated.
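A rational B-spline with adjustable control-point weights, the core of the model above, can be evaluated with SciPy via the homogeneous-coordinate trick; the knot vector, control points and weights here are invented for illustration:

```python
import numpy as np
from scipy.interpolate import BSpline

def nurbs_curve(ctrl_pts, weights, knots, degree, u):
    """Evaluate a NURBS curve as a ratio of two B-splines.

    ctrl_pts: (n, 2) control points of the muscle centre line
    weights:  (n,) positive weights; raising a weight pulls the curve
              toward that control point, which is how an action vector
              over the weights can drive muscle motion
    """
    ctrl_pts = np.asarray(ctrl_pts, dtype=float)
    weights = np.asarray(weights, dtype=float)
    num = BSpline(knots, ctrl_pts * weights[:, None], degree)(u)
    den = BSpline(knots, weights, degree)(u)
    return num / den[:, None]

# Cubic muscle with 5 control points and a clamped knot vector
knots = np.array([0, 0, 0, 0, 0.5, 1, 1, 1, 1], dtype=float)
pts = np.array([[0, 0], [1, 0.5], [2, 0.8], [3, 0.5], [4, 0]])
u = np.linspace(0.0, 1.0, 50)
relaxed = nurbs_curve(pts, np.ones(5), knots, 3, u)
contracted = nurbs_curve(pts, np.array([1, 1, 4, 1, 1]), knots, 3, u)  # mid-belly bulge
```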
6.
Obstructive sleep apnea (OSA) is a common disorder associated with anatomical abnormalities of the upper airways that affects 5% of the population. Acoustic parameters may be influenced by the vocal tract structure and soft tissue properties. We hypothesize that the speech signal properties of OSA patients differ from those of control subjects without OSA. Using speech signal processing techniques, we explored acoustic speech features of 93 subjects who were recorded using a text-dependent speech protocol and a digital audio recorder immediately prior to a polysomnography study. Following analysis of the study, subjects were divided into OSA (n=67) and non-OSA (n=26) groups. A Gaussian mixture model-based system was developed to model and classify between the groups; discriminative features such as vocal tract length and linear prediction coefficients were selected using a feature selection technique. Specificity and sensitivity of 83% and 79% were achieved for male OSA patients, and 86% and 84% for female OSA patients, respectively. We conclude that acoustic features from speech signals during wakefulness can detect OSA patients with good specificity and sensitivity. Such a system can serve as a basis for the future development of an OSA screening tool.
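The classification stage can be sketched with one GaussianMixture per class, scored by likelihood; feature dimensions and model sizes below are placeholders, not the paper's configuration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Placeholder acoustic features (e.g. vocal tract length estimate + LPCs)
X_osa = rng.normal(loc=0.5, size=(67, 14))
X_ctl = rng.normal(loc=-0.5, size=(26, 14))

gmm_osa = GaussianMixture(n_components=4, random_state=0).fit(X_osa)
gmm_ctl = GaussianMixture(n_components=4, random_state=0).fit(X_ctl)

def classify(x):
    # Higher average log-likelihood under the class model wins
    return "OSA" if gmm_osa.score(x) > gmm_ctl.score(x) else "non-OSA"

print(classify(rng.normal(loc=0.5, size=(1, 14))))
```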
7.
B. A. Echeagaray-Patrón V. I. Kober V. N. Karnaukhov V. V. Kuznetsov 《Journal of Communications Technology and Electronics》2017,62(6):648-652
Face recognition is one of the most rapidly developing areas of image processing and computer vision. In this work, a new method for face recognition and identification using 3D facial surfaces is proposed. The method is invariant to facial expression and pose variations in the scene, and uses 3D shape data without color or texture information. It is based on conformal mapping of the original facial surfaces onto a Riemannian manifold, followed by comparison of conformal and isometric invariants computed in this manifold. Computational results are presented using known 3D face databases that contain a significant amount of expression and pose variation.
8.
Distributed acoustic sensing technology was used for real-time speech reproduction and recognition, in which the voiceprint can be extracted by the Mel frequency cepstral coefficient (MFCC) method. A classic ancient Chinese poem, "You Zi Yin" (also called "A Traveler's Song"), was analyzed in both the time and frequency domains, and its real-time reproduction was achieved with a 116.91 ms time delay. The smaller-scaled MFCC0, at 1/12 of the MFCC matrix, was taken as a feature vector of each ...
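MFCC extraction of the kind used for the voiceprint is a one-liner in librosa; the signal here is a synthetic stand-in for the DAS-reconstructed speech, and the parameter choices are generic defaults rather than the paper's:

```python
import numpy as np
import librosa

sr = 16000
t = np.arange(sr * 2) / sr
y = np.sin(2 * np.pi * 220 * t).astype(np.float32)  # stand-in for recovered speech

# 13 MFCCs per frame; the first coefficient per frame is comparable in
# spirit to the reduced "MFCC0" feature mentioned above
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
mfcc0 = mfcc[0]
print(mfcc.shape, mfcc0.shape)
```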
9.
Learning multiview face subspaces and facial pose estimation using independent component analysis.
Stan Z. Li, XiaoGuang Lu, Xinwen Hou, Xianhua Peng, Qiansheng Cheng 《IEEE transactions on image processing》2005,14(6):705-712
An independent component analysis (ICA) based approach is presented for learning view-specific subspace representations of the face object from multiview face examples. ICA and its variants, namely independent subspace analysis (ISA) and topographic independent component analysis (TICA), take into account the higher order statistics needed for object view characterization. In contrast, principal component analysis (PCA), which de-correlates the second order moments, can hardly reveal good features for characterizing different views when the training data comprises a mixture of multiview examples and the learning is done in an unsupervised way with view-unlabeled data. We demonstrate that ICA, TICA, and ISA are able to learn view-specific basis components from the mixture data without supervision. We closely investigate the representations learned by ISA, reveal some surprising findings, and thereby explain the underlying reasons for the emergent formation of view subspaces. Extensive experimental results are presented.
10.
《Signal Processing Magazine, IEEE》2008,25(3):123-132
In this paper, real-time (RT) magnetic resonance imaging (MRI) is used to study speech production, in particular to capture vocal tract shaping.
11.
3D face synthesis has been extensively used in many applications over the last decade. Although many methods have been reported, automatic 3D face synthesis from a single video frame remains unsolved. An automatic 3D face synthesis algorithm is proposed which resolves a number of existing bottlenecks.
12.
Consider a real-time data acquisition and processing multiserver (e.g., unmanned air vehicles and machine controllers), multichannel (e.g., surveillance regions, communication channels, and assembly lines) system involving maintenance. In this kind of system, jobs are executed immediately upon arrival, conditional on system availability. The part of a job that is not served immediately is lost forever and cannot be processed later; queuing of jobs in such systems is therefore impossible. The effectiveness analysis of real-time systems in a multichannel environment is important. Several definitions of a performance effectiveness index for the real-time system under consideration are suggested. The real-time system, with exponentially distributed time-to-failure, maintenance, interarrival and duration times, is treated as a Markov chain in order to compute its steady-state probabilities and performance effectiveness index via analytic and numerical methods. Some interesting analytic results are presented concerning a worst-case analysis, which is most typical in high-performance data acquisition and control real-time systems.
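The steady-state probabilities mentioned above come from solving πQ = 0 subject to Σπ = 1 for the chain's generator matrix Q; a small numerical sketch with an invented generator:

```python
import numpy as np

# Toy generator matrix Q for a 3-state continuous-time Markov chain
# (rows sum to zero; off-diagonal entries are transition rates)
Q = np.array([[-0.7,  0.5,  0.2],
              [ 0.3, -0.4,  0.1],
              [ 0.0,  0.6, -0.6]])

# Solve pi Q = 0 with sum(pi) = 1 by appending the normalization row
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)  # steady-state probabilities, inputs to an effectiveness index
```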
13.
14.
3D facial data acquisition and NURBS surface reconstruction
To acquire a 3D point cloud of the face and convert it into a high-precision NURBS surface, a facial image acquisition and processing system was built, and its methods for automatic point-cloud registration and surface reconstruction were studied. First, a dual-grating 3D scanner projects sinusoidal fringes onto the face and converts the fringe patterns modulated by the face into two separate point clouds. Then, the measurement system is calibrated with a calibration block, and the rotation matrix and translation vector are computed via covariance to register the point clouds automatically into a complete facial point cloud, from which a triangle mesh is generated. Finally, through curvature detection, quadrilateral mesh generation, and construction of UV parameter lines, surface reconstruction produces a high-precision four-sided NURBS facial surface, and the reconstruction error is analyzed. The results show that the reconstructed facial surface satisfies G1 continuity with a standard deviation of 0.009222 mm. The high-precision facial surface reconstruction method presented here can be used for 3D inspection and reconstruction of complex surfaces.
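The registration step described above (rotation matrix and translation vector obtained from a covariance of corresponding points) is the standard Kabsch/SVD rigid alignment; a sketch assuming correspondences are already known:

```python
import numpy as np

def rigid_align(P, Q):
    """Find R, t minimizing ||R @ p + t - q|| over corresponding (N, 3) clouds."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)          # 3x3 covariance of the centred points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cq - R @ cp
    return R, t

# Usage: stitch the second scanner's cloud onto the first
# R, t = rigid_align(cloud_b_corr, cloud_a_corr)
# cloud_b_aligned = cloud_b @ R.T + t
```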
15.
16.
Facial emotion recognition has become an important part of visible-light face recognition applications and one of the most important areas of optical pattern recognition research. To further automate facial emotion recognition under visible light, this paper combines Viola-Jones, adaptive histogram equalization (AHE), the discrete wavelet transform (DWT), and a deep convolutional neural network (CNN) into an automatic facial emotion recognition algorithm. The algorithm uses Viola-Jones to locate the face and facial features, adaptive histogram equalization to enhance the facial image, and the DWT to extract facial features; finally, the extracted features are fed directly into deep CNN training to achieve automatic emotion recognition. Simulation experiments on the CK+ database and on visible-light face images achieved average accuracies of 97% and 95%, respectively. The results show that, across different facial features and emotions, the algorithm can accurately locate facial features under visible light, equalize visible-light image information, automatically identify emotion categories, and recognize multiple facial emotions simultaneously in a single frame, with a high recognition rate and good robustness.
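The preprocessing chain (Viola-Jones detection, adaptive equalization, wavelet features) maps onto standard OpenCV and PyWavelets calls; a hedged sketch with the CNN stage omitted and the input path invented:

```python
import cv2
import numpy as np
import pywt

# Viola-Jones face localization (the cascade file ships with OpenCV)
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
faces = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=5)
x, y, w, h = faces[0]                                # assume one face was found
face = cv2.resize(img[y:y + h, x:x + w], (128, 128))

# Adaptive histogram equalization (CLAHE variant)
face_eq = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(face)

# Single-level 2D DWT; the approximation band serves as the feature map
cA, (cH, cV, cD) = pywt.dwt2(face_eq.astype(np.float32), "haar")
features = cA.flatten()   # would be fed to the CNN classifier
```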
17.
18.
19.
20.
Feijun Jiang Mika Fischer Hazım Kemal Ekenel Bertram E. Shi 《Signal Processing: Image Communication》2013,28(9):1100-1113
Intuitively, integrating information from multiple visual cues, such as texture, stereo disparity, and image motion, should improve performance on perceptual tasks such as object detection. On the other hand, the additional effort required to extract and represent information from additional cues may increase computational complexity. In this work, we show that using a biologically inspired integrated representation of texture and stereo disparity information for a multi-view face detection task leads not only to improved detection performance but also to reduced computational complexity. Disparity information enables us to filter out 90% of image locations as being unlikely to contain faces. Performance improves because this filtering rejects 32% of the false detections made by a similar monocular detector at the same recall rate. Despite the additional computation required to compute disparity information, our binocular detector takes only 42 ms to process a pair of 640×480 images, 35% of the time required by the monocular detector. We also show that this integrated detector is computationally more efficient than a detector with similar performance in which texture and stereo information are processed separately.
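The filtering step amounts to keeping only pixels whose disparity is consistent with a face at a plausible working distance; a simplified sketch using OpenCV's block matcher, with invented thresholds and file names:

```python
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical stereo pair
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Dense disparity from a standard block matcher (fixed-point output, x16)
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Keep only locations whose disparity falls in the range expected for faces
# at working distance; run the texture-based detector on those alone
mask = (disparity > 8.0) & (disparity < 48.0)
candidates = np.argwhere(mask)
print(f"{100 * (1 - mask.mean()):.1f}% of locations filtered out")
```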