20 similar documents found; search took 31 ms
1.
With a better understanding of face anatomy and technical advances in computer graphics, 3D face synthesis has become one of the most active research fields for many human-machine applications, ranging from immersive telecommunication to the video-game industry. In this paper we propose a method that automatically extracts features such as the eyes, mouth, eyebrows and nose from a given frontal face image. A generic 3D face model is then superimposed onto the face in accordance with the extracted facial features, fitting the input face image by transforming the vertex topology of the generic model. The subject-specific 3D face is finally synthesized by texturing the individualized face model. Once the model is ready, six basic facial expressions are generated with the help of MPEG-4 facial animation parameters. To generate transitions between these expressions, we use 3D shape morphing between the corresponding face models and blend the corresponding textures. The novelty of our method is the automatic generation of a 3D model and the synthesis of faces with different expressions from a single frontal neutral face image. Our method is fully automatic, robust and fast, and can generate various views of the face by rotating the 3D model. It can be used in applications for which depth accuracy is not critical, such as games, avatars and face recognition. We have tested and evaluated our system using a standard database, namely BU-3DFE.
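The expression-transition step described above, 3D shape morphing plus texture blending, reduces to per-vertex and per-pixel linear interpolation. A minimal sketch (function names and array shapes are illustrative, not taken from the paper):

```python
import numpy as np

def morph_expression(v_src, v_dst, t):
    """Per-vertex linear interpolation between two expression meshes.

    v_src, v_dst: (N, 3) arrays of corresponding vertex positions; t in [0, 1].
    """
    return (1.0 - t) * v_src + t * v_dst

def blend_textures(tex_src, tex_dst, t):
    """Cross-dissolve two texture images of identical shape and dtype."""
    return ((1.0 - t) * tex_src + t * tex_dst).astype(tex_src.dtype)
```

Sweeping t from 0 to 1 while rendering the morphed mesh with the blended texture yields a transition between two of the six basic expressions.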
2.
Realistic speech animation based on observed 3D face dynamics
Muller P. Kalberer G.A. Proesmans M. Van Gool L. 《Vision, Image and Signal Processing, IEE Proceedings -》2005,152(4):491-500
An efficient system for realistic speech animation is proposed. The system supports all steps of the animation pipeline, from the capture or design of 3D head models up to the synthesis and editing of the performance. This pipeline is fully 3D, which yields high flexibility in the use of the animated character. Real, detailed 3D face dynamics, observed at video frame rate for thousands of points on the faces of speaking actors, underpin the realism of the facial deformations. These are given a compact and intuitive representation via independent component analysis (ICA). Performances amount to trajectories through this 'viseme space'. When asked to animate a face, the system replicates the 'visemes' that it has learned and adds the necessary coarticulation effects. Realism has been improved through comparisons with motion-captured ground truth. Faces for which no 3D dynamics have been observed can nonetheless be animated; their visemes are adapted automatically to their physiognomy by localising the face in a 'face space'.
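The 'viseme space' above is built with ICA. As a rough illustration of how such a space can be extracted from observed face dynamics, here is a compact numpy-only FastICA (deflation scheme, tanh nonlinearity); all names are hypothetical and this is a sketch, not the authors' implementation:

```python
import numpy as np

def fastica(X, n_components, n_iter=200, seed=0):
    """Minimal FastICA: X is (n_samples, n_features); returns estimated sources."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)
    # Whiten via SVD so the data has identity covariance.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    K = (Vt[:n_components] / s[:n_components, None]) * np.sqrt(X.shape[0])
    Z = X @ K.T
    W = np.zeros((n_components, n_components))
    for i in range(n_components):
        w = rng.standard_normal(n_components)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            wx = Z @ w
            g, g_prime = np.tanh(wx), 1 - np.tanh(wx) ** 2
            # One-unit fixed-point update: E[Z g(w'Z)] - E[g'(w'Z)] w
            w_new = (Z * g[:, None]).mean(axis=0) - g_prime.mean() * w
            # Deflation: remove projections onto previously found components.
            w_new -= W[:i].T @ (W[:i] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1) < 1e-8
            w = w_new
            if converged:
                break
        W[i] = w
    return Z @ W.T
```

In the paper's setting, each row of X would be one video frame of tracked 3D point positions, and a performance becomes a trajectory through the returned component coordinates.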
3.
Cosker D.P. Marshall A.D. Rosin P.L. Hicks Y.A. 《Vision, Image and Signal Processing, IEE Proceedings -》2004,151(4):314-321
A system capable of producing near video-realistic animation of a speaker, given only speech inputs, is presented. The audio input is a continuous speech signal, requires no phonetic labelling, and is speaker-independent. The system requires only a short video training corpus of a subject speaking a list of viseme-targeted words in order to achieve convincing, realistic facial synthesis. The system learns the natural mouth and face dynamics of a speaker, allowing new facial poses unseen in the training video to be synthesised. To achieve this, the authors have developed a novel approach that utilises a hierarchical and nonlinear principal components analysis (PCA) model which couples speech and appearance. Animation of different facial areas, defined by the hierarchy, is performed separately and merged in post-processing using an algorithm which combines texture and shape PCA data. It is shown that the model is capable of synthesising videos of a speaker using new audio segments from both previously heard and unheard speakers.
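The coupling of speech and appearance in one subspace can be illustrated with a purely linear stand-in for the paper's hierarchical nonlinear PCA: fit a joint PCA over concatenated audio and appearance features, then infer appearance from audio alone. Names and shapes are hypothetical:

```python
import numpy as np

def fit_joint_pca(audio, appearance, k):
    """Fit a k-dim linear subspace over concatenated audio+appearance rows."""
    X = np.hstack([audio, appearance])
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return Vt[:k], mean, audio.shape[1]

def appearance_from_audio(a, basis, mean, split):
    """Least-squares fit of the audio part, then reconstruct the appearance part."""
    Ba, Bv = basis[:, :split], basis[:, split:]
    coords, *_ = np.linalg.lstsq(Ba.T, a - mean[:split], rcond=None)
    return mean[split:] + coords @ Bv
```

Because both modalities share one set of subspace coordinates, estimating those coordinates from audio immediately yields an appearance estimate, which is the essence of the coupling idea (the paper's actual model is hierarchical and nonlinear).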
4.
Chandrasiri N.P. Naemura T. Ishizuka M. Harashima H. Barakonyi I. 《Multimedia, IEEE》2004,11(3):20-29
In this paper, the authors describe a system that animates 3D facial agents based on real-time facial expression analysis techniques and research on synthesizing facial expressions and text-to-speech capabilities. This system combines visual, auditory, and primary interfaces to deliver one coherent multimodal chat experience. Users represent themselves using agents selected from a predefined group. When a user shows a particular expression while typing a message, the 3D agent at the receiving end speaks the message aloud while replaying the recognized facial expression sequences, and also augments the synthesized voice with appropriate emotional content. Because the visual data exchange is based on the MPEG-4 high-level Facial Animation Parameter for facial expressions (FAP 2), rather than real-time video, the method requires very low bandwidth.
5.
Obstructive sleep apnea (OSA) is a common disorder, associated with anatomical abnormalities of the upper airways, that affects 5% of the population. Acoustic parameters may be influenced by vocal tract structure and soft tissue properties. We hypothesize that the speech signal properties of OSA patients differ from those of control subjects without OSA. Using speech signal processing techniques, we explored the acoustic speech features of 93 subjects who were recorded using a text-dependent speech protocol and a digital audio recorder immediately prior to a polysomnography study. Following analysis of the study, subjects were divided into OSA (n=67) and non-OSA (n=26) groups. A Gaussian mixture model-based system was developed to model and classify between the groups; discriminative features such as vocal tract length and linear prediction coefficients were selected using a feature selection technique. Specificity and sensitivity of 83% and 79% were achieved for male OSA patients, and 86% and 84% for female OSA patients, respectively. We conclude that acoustic features from speech signals recorded during wakefulness can detect OSA patients with good specificity and sensitivity. Such a system can serve as a basis for the future development of an OSA screening tool.
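The classification step can be sketched with a deliberately simplified version of the paper's GMM classifier: one Gaussian per class and a likelihood-ratio decision (the actual system used full mixtures over selected acoustic features; all names here are illustrative):

```python
import numpy as np

def fit_gaussian(X):
    """Fit a single multivariate Gaussian to row-sample matrix X."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized
    return mu, cov

def log_likelihood(X, mu, cov):
    """Per-row log-density of X under N(mu, cov)."""
    d = X - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    mahal = np.einsum('ij,jk,ik->i', d, inv, d)
    return -0.5 * (mahal + logdet + X.shape[1] * np.log(2 * np.pi))

def classify(X, params_osa, params_ctrl):
    """Label 1 (OSA) where the OSA model is more likely, else 0."""
    return (log_likelihood(X, *params_osa) >
            log_likelihood(X, *params_ctrl)).astype(int)
```

Replacing each single Gaussian with a mixture (and the raw features with selected ones such as vocal tract length and LPC coefficients) recovers the structure of the system described in the abstract.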
6.
B. A. Echeagaray-Patrón V. I. Kober V. N. Karnaukhov V. V. Kuznetsov 《Journal of Communications Technology and Electronics》2017,62(6):648-652
Face recognition is one of the most rapidly developing areas of image processing and computer vision. In this work, a new method for face recognition and identification using 3D facial surfaces is proposed. The method is invariant to facial expression and pose variations in the scene, and uses 3D shape data without color or texture information. It is based on conformal mapping of the original facial surfaces onto a Riemannian manifold, followed by comparison of conformal and isometric invariants computed in this manifold. Computational results are presented using known 3D face databases that contain a significant amount of expression and pose variation.
7.
Animation and its enabling techniques currently receive wide industry attention, yet the realism of facial animation expressing emotions such as joy, anger, sorrow and happiness is still unsatisfactory. Building on the Waters muscle model, a NURBS elastic muscle model is proposed that, guided by anatomical knowledge, simulates muscles with non-uniform rational B-spline curves. By changing the weights of the curve's control points, an action vector can be found to control muscle motion and thereby synthesize various facial expressions. The more control points there are, the more precisely the muscle can be controlled, allowing facial expressions to be simulated more realistically.
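The core of the proposed model, evaluating a non-uniform rational B-spline whose control-point weights steer the simulated muscle, can be sketched as follows (Cox-de Boor recursion; all names and values are illustrative, not from the paper):

```python
import numpy as np

def bspline_basis(i, p, t, knots):
    """Cox-de Boor recursion for the i-th degree-p B-spline basis at parameter t."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] != knots[i]:
        left = (t - knots[i]) / (knots[i + p] - knots[i]) \
            * bspline_basis(i, p - 1, t, knots)
    if knots[i + p + 1] != knots[i + 1]:
        right = (knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1]) \
            * bspline_basis(i + 1, p - 1, t, knots)
    return left + right

def nurbs_point(t, ctrl, weights, knots, degree=3):
    """Evaluate a NURBS curve: weighted control points normalized by the weight sum."""
    N = np.array([bspline_basis(i, degree, t, knots) for i in range(len(ctrl))])
    wN = weights * N
    return (wN[:, None] * ctrl).sum(axis=0) / wN.sum()
```

Raising a control point's weight pulls the curve toward that point, which is exactly the mechanism the abstract uses to drive muscle motion and hence expression synthesis.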
8.
《Optoelectronics Letters》2024,20(4)
Distributed acoustic sensing technology was used for real-time speech reproduction and recognition, in which the voiceprint can be extracted by the Mel-frequency cepstral coefficient (MFCC) method. A classic ancient Chinese poem, "You Zi Yin", also called "A Traveler's Song", was analyzed in both the time and frequency domains, and its real-time reproduction was achieved with a 116.91 ms time delay. The smaller-scale MFCC0, at 1/12 of the MFCC matrix, was taken as a feature vector of each line against the ambient noise, which provides a recognition method via cross-correlation among the 6 original and recovered verse pairs. The averaged cross-correlation coefficient of the matching pairs is calculated to be 0.5806, higher than the 0.1883 of the non-matched pairs, promising an accurate and fast method for real-time speech reproduction and recognition over a passive optical fiber.
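The verse-matching step, cross-correlating recovered features against the originals, can be sketched with normalized correlation over feature vectors (e.g., rows of a reduced MFCC matrix). Names are hypothetical and the MFCC extraction itself is omitted:

```python
import numpy as np

def corr_coeff(a, b):
    """Normalized correlation coefficient between two feature vectors."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))

def match_verse(recovered, originals):
    """Return the index of the original verse best correlated with `recovered`,
    plus all correlation scores."""
    scores = [corr_coeff(recovered, o) for o in originals]
    return int(np.argmax(scores)), scores
```

In the abstract's experiment, matching pairs scored around 0.58 on average versus about 0.19 for non-matching pairs, so a simple argmax over such scores suffices for recognition.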
9.
Learning multiview face subspaces and facial pose estimation using independent component analysis.
Stan Z Li XiaoGuang Lu Xinwen Hou Xianhua Peng Qiansheng Cheng 《IEEE transactions on image processing》2005,14(6):705-712
An independent component analysis (ICA) based approach is presented for learning view-specific subspace representations of the face object from multiview face examples. ICA and its variants, namely independent subspace analysis (ISA) and topographic independent component analysis (TICA), take into account the higher-order statistics needed for object view characterization. In contrast, principal component analysis (PCA), which de-correlates only the second-order moments, can hardly reveal good features for characterizing different views when the training data comprise a mixture of multiview examples and the learning is done in an unsupervised way with view-unlabeled data. We demonstrate that ICA, TICA, and ISA are able to learn view-specific basis components from the mixture data without supervision. We closely investigate the results learned by ISA, reveal some surprising findings, and thereby explain the underlying reasons for the emergent formation of view subspaces. Extensive experimental results are presented.
10.
《Signal Processing Magazine, IEEE》2008,25(3):123-132
In this paper, real-time (RT) magnetic resonance imaging (MRI) is used to study speech production, especially the capture of vocal tract shaping.
11.
3D face synthesis has been extensively used in many applications over the last decade. Although many methods have been reported, automatic 3D face synthesis from a single video frame still remains unsolved. An automatic 3D face synthesis algorithm is proposed that resolves a number of existing bottlenecks.
12.
Consider a real-time data acquisition and processing multiserver (e.g., unmanned air vehicles and machine controllers) and multichannel (e.g., surveillance regions, communication channels, and assembly lines) system involving maintenance. In this kind of system, jobs are executed immediately upon arrival, conditional on system availability. The part of a job that is not served immediately is lost forever and cannot be processed later; thus, queuing of jobs in such systems is impossible. The effectiveness analysis of real-time systems in a multichannel environment is important. Several definitions of a performance effectiveness index for the real-time system under consideration are suggested. The real-time system, with exponentially distributed time-to-failure, maintenance, interarrival, and duration times, is treated as a Markov chain in order to compute its steady-state probabilities and performance effectiveness index via analytic and numerical methods. Some interesting analytic results are presented concerning a worst-case analysis, which is most typical of high-performance data acquisition and control real-time systems.
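For a continuous-time Markov chain with generator matrix Q (rows summing to zero), the steady-state probabilities mentioned above solve pi Q = 0 with sum(pi) = 1. A minimal numerical sketch, with illustrative matrix values rather than the paper's model:

```python
import numpy as np

def steady_state(Q):
    """Stationary distribution of a CTMC generator Q: solve pi Q = 0, sum(pi) = 1.

    The normalization constraint is appended as an extra equation and the
    over-determined system is solved by least squares.
    """
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi
```

For a two-state up/down machine with failure rate 1 and repair rate 2, this yields the familiar availability mu / (lambda + mu) = 2/3 as the steady-state probability of the working state.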
13.
14.
15.
16.
17.
18.
Feijun Jiang Mika Fischer Hazım Kemal Ekenel Bertram E. Shi 《Signal Processing: Image Communication》2013,28(9):1100-1113
Intuitively, integrating information from multiple visual cues, such as texture, stereo disparity, and image motion, should improve performance on perceptual tasks such as object detection. On the other hand, the additional effort required to extract and represent information from additional cues may increase computational complexity. In this work, we show that using a biologically inspired integrated representation of texture and stereo disparity information for a multi-view face detection task leads not only to improved detection performance, but also to reduced computational complexity. Disparity information enables us to filter out 90% of image locations as being unlikely to contain faces. Performance is improved because this filtering rejects 32% of the false detections made by a similar monocular detector at the same recall rate. Despite the additional computation required to compute disparity information, our binocular detector takes only 42 ms to process a pair of 640×480 images, 35% of the time required by the monocular detector. We also show that this integrated detector is computationally more efficient than a detector with similar performance in which texture and stereo information are processed separately.
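The disparity-based filtering idea, rejecting candidate windows whose depth is implausible for a face, can be sketched as a simple gate on median disparity (thresholds and names are illustrative, not the paper's):

```python
import numpy as np

def disparity_gate(windows, disparity_map, d_min, d_max):
    """Keep only candidate windows whose median disparity implies a plausible depth.

    windows: list of (x, y, w, h) boxes; disparity_map: (H, W) array of disparities.
    """
    kept = []
    for (x, y, w, h) in windows:
        patch = disparity_map[y:y + h, x:x + w]
        if d_min <= np.median(patch) <= d_max:
            kept.append((x, y, w, h))
    return kept
```

Running the (expensive) texture-based classifier only on the windows this gate keeps is what lets the binocular detector reject most image locations cheaply and run faster than its monocular counterpart.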
19.
In this paper, we investigate feature extraction and feature selection methods, as well as classification methods, for an automatic facial expression recognition (FER) system. The FER system is fully automatic and consists of the following modules: face detection, facial detection, feature extraction, selection of optimal features, and classification. Face detection is based on the AdaBoost algorithm and is followed by the extraction of the frame with the maximum intensity of emotion using an inter-frame mutual information criterion. The selected frames are then processed to generate characteristic features using different methods, including Gabor filters, log-Gabor filters, the local binary pattern (LBP) operator, higher-order local autocorrelation (HLAC), and a recently proposed method called HLAC-like features (HLACLF). The most informative features are selected based on both wrapper and filter feature selection methods. Experiments on several facial expression databases compare the different methods.
20.
《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1975,63(10):1404-1414
The purpose of this research was to design and use a minicomputer-based data acquisition system in a clinical dental environment to digitize and analyze electromyographic (EMG) and jaw motion data. The quantification of EMG in clinical problems requires a significant number of patients, short analysis times, and efficient acquisition and analysis of data. A current extension of this system is to utilize interactive computer graphics to present these data for evaluation by clinical personnel. To facilitate EMG evaluation for clinical diagnostic purposes, a real-time capability was designed to accommodate high-speed simultaneous data collection from a multiplicity of signal sources, viz., 5 EMG channels, biting force, and jaw position. Once digitization and storage begin, data reduction and analysis programs provide immediate presentation of results. Application of the system to the examination of jaw motion has yielded clinically useful diagnostic information. These results suggest that fuller utilization of the potential of the computer system will provide deeper clinical and physiological insight and, therefore, better dental treatment for a large fraction of the population.