20 similar documents retrieved; search time: 171 ms
1.
Fast, efficient automatic lip-shape synthesis driven by speech, together with well-optimised synchronisation between audio and lip motion, is the core problem in speech-driven facial animation. This paper proposes a speech-driven facial animation method based on formant analysis. The speech signal is windowed into frames and transformed with the DFT; the first and second formants of each short-time spectrum are analysed and mapped to a control sequence, which is then post-processed (e.g. outlier removal). A set of dynamic basic mouth shapes is defined on a 3-D face model, and the control sequence is fed into the model on a timer to drive the animation. Experimental results show that the method is simple and fast, effectively synchronises speech and lip shape, and produces smooth, natural animation; it can be widely used for dubbing all kinds of virtual characters and shortens their production cycle.
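The front end described above (framing, DFT, first/second spectral peaks as formant stand-ins, mapping to a mouth control value) can be sketched as follows. The frame sizes, the peak-picking heuristic, and the `mouth_open` mapping are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    win = np.hamming(frame_len)
    return np.stack([x[i*hop:i*hop+frame_len] * win for i in range(n)])

def two_peaks(frame, fs):
    """Return the frequencies (Hz) of the two strongest spectral peaks,
    skipping bins within 100 Hz of the first peak."""
    spec = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    order = np.argsort(spec)[::-1]
    f1 = freqs[order[0]]
    second = next(i for i in order if abs(freqs[i] - f1) > 100.0)
    return sorted([f1, freqs[second]])

fs = 8000
t = np.arange(fs) / fs
# toy "vowel": energy near 700 Hz (F1-like) and 1200 Hz (F2-like)
x = np.sin(2*np.pi*700*t) + 0.8*np.sin(2*np.pi*1200*t)
frames = frame_signal(x)
f1, f2 = two_peaks(frames[0], fs)
mouth_open = np.clip((f1 - 300.0) / 700.0, 0.0, 1.0)  # hypothetical control mapping
```

A real system would smooth the resulting control sequence over frames before driving the mouth shapes.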
2.
3.
Since Chinese is a syllable-based language whose articulation follows a "date-pit" (weak-strong-weak) pattern, a model for describing Chinese dynamic visemes is proposed that models lip motion within a syllable and between syllables separately. Within a syllable, lip motion is described by sub-motion models based on initials (shengmu) and finals (yunmu): lip feature parameters are first extracted for the initial and final, mouth shapes are clustered by these parameters to obtain a simplified syllable viseme model, and the similarity in mouth shape between the lip sub-motions and the syllable articulation process is computed. Between syllables, a weight function with graded vowel influence simulates coarticulation: the influence of each vowel on the mouth shape of the following consonant is analysed first, and the actual articulated mouth shape is then controlled through the weight function. Experimental results show that, compared with monophone or triphone models for representing Chinese dynamic visemes, the method improves animation efficiency and makes Chinese speech-lip animation more plausible and natural.
4.
Modeling and animating specific 3-D faces is an area of great interest in computer graphics. This paper presents a new method for building and animating a specific face model from two orthogonal photographs. First, the accurate positions of facial feature points are obtained automatically with the snake active-contour tracking technique. A generic face model is then adapted to the specific face using the local elastic deformation method proposed in the paper, and a high-resolution texture image generated by image mosaicing is used for texture mapping. The method computes local facial deformation from the displacements of feature points and the positions of non-feature points relative to feature points, and can also realize drastic facial changes and movements. Combined with a muscle model, it performs real-time facial animation well, and is fast and efficient. Experimental results are given at the end.
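The local elastic deformation idea above, moving non-feature vertices by distance-weighted feature-point displacements, might be sketched like this; the unnormalized Gaussian weighting is a plausible stand-in for the paper's unspecified kernel, not its actual formulation.

```python
import numpy as np

def local_elastic_deform(vertices, feat_src, feat_dst, sigma=0.2):
    """Move each vertex by a blend of the feature-point displacements.
    Unnormalized Gaussian weights make each feature's influence decay
    with distance, so the deformation stays local."""
    disp = feat_dst - feat_src                                        # (k, 3)
    d2 = ((vertices[:, None, :] - feat_src[None, :, :])**2).sum(-1)   # (n, k)
    w = np.exp(-d2 / (2 * sigma**2))
    return vertices + w @ disp

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]])
feat_src = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
feat_dst = feat_src + np.array([[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]])
new_verts = local_elastic_deform(verts, feat_src, feat_dst)
```

A vertex coinciding with a feature point follows that feature almost exactly, while distant vertices barely move.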
5.
The goal of Chinese voice conversion is to transform the voice characteristics of a source speaker in Mandarin speech into those of a target speaker. The proposed conversion algorithm for Mandarin speakers consists of three parts. The first two use Gaussian mixture models to convert the spectral envelope (linear predictive coding) and its excitation (residual). The third models the prosody conversion rules with support vector regression and, exploiting the characteristics of Mandarin, adjusts the suprasegmental features with the pitch-synchronous overlap-add (PSOLA) algorithm. Compared with existing voice conversion algorithms, this one adjusts prosody according to Mandarin suprasegmental articulation, effectively achieves Mandarin voice conversion, and yields synthetic speech of high naturalness.
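The GMM-based envelope conversion can be illustrated with the standard joint-GMM conditional-expectation mapping; the 1-D features and hand-set mixture parameters below are toy values, not a trained model.

```python
import numpy as np

# Toy joint-GMM parameters (1-D source/target features, 2 mixtures);
# a real system would train these on aligned spectral-envelope frames.
weights = np.array([0.5, 0.5])
mu_x  = np.array([-1.0, 1.0])
mu_y  = np.array([-2.0, 2.0])
var_x = np.array([0.25, 0.25])   # Sigma_xx per component
cov_xy = np.array([0.2, 0.2])    # Sigma_yx per component

def convert(x):
    """Conditional-expectation mapping E[y | x] under the joint GMM."""
    lik = weights * np.exp(-(x - mu_x)**2 / (2 * var_x)) / np.sqrt(2 * np.pi * var_x)
    post = lik / lik.sum()                        # posterior p(k | x)
    comp = mu_y + cov_xy / var_x * (x - mu_x)     # per-component regression
    return float(post @ comp)

y = convert(-1.0)   # a frame near component 0 maps near mu_y[0]
```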
6.
3-D facial animation is a hot topic in computer graphics. Current 3-D animation models find the face difficult to simulate and the results insufficiently realistic. To simulate facial expressions both simply and realistically, a fitted abstract muscle model is proposed. Based on the abstract muscle model commonly used in facial animation, it improves the mathematical model of the wide linear muscle, using deformation parameters to control the muscle's shape and simulating facial muscle actions directly. Simulation experiments show that the fitted abstract muscle model reproduces complex mouth movements more faithfully. Compared with the traditional abstract muscle model, it has low computational complexity and produces more realistic results, giving it broad application prospects.
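As a rough illustration of the abstract-muscle idea (a Waters-style linear muscle, not the paper's improved wide linear muscle), vertices within the muscle's radius of influence are pulled toward the muscle head with a radial falloff:

```python
import numpy as np

def linear_muscle(verts, head, R=1.0, k=0.3):
    """Pull vertices toward the muscle head (attachment point).
    Influence falls off with distance and vanishes at radius R;
    k is the contraction parameter."""
    out = verts.copy()
    for j, v in enumerate(verts):
        d = np.linalg.norm(v - head)
        if 0.0 < d < R:
            fall = np.cos(d / R * np.pi / 2.0)   # 1 near the head, 0 at R
            out[j] = v + k * fall * (head - v)
    return out

head = np.array([0.0, 0.0])                      # 2-D for simplicity
verts = np.array([[0.5, 0.0], [2.0, 0.0]])
out = linear_muscle(verts, head)
```

Vertices outside the radius are untouched; vertices inside move toward the head in proportion to the falloff.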
7.
《信息技术》2015,(8):26-30
Most existing face recognition systems are based on 2-D face images captured by devices such as digital cameras; when the subject's pose or the camera's viewpoint changes, the image is often distorted to the point of being unrecognizable. When the input face is a 3-D image, arbitrary pose transformations become possible, so the subject can still be recognized. Face recognition under arbitrary pose can therefore be achieved through 3-D face reconstruction followed by spatial pose transformation. Building on existing 3-D face recognition algorithms, a more general method is proposed that combines 3-D depth data with 2-D RGB data and uses spatial transformations to represent the same face in multiple poses, from which a face database is built; at test time only an ordinary 2-D image is needed for recognition. Experimental results show that the method achieves good recognition performance on the collected face database.
8.
This paper proposes a 3-D facial surface measurement method based on infrared structured light: infrared fringes are projected onto the face, and Fourier transform profilometry is used to recover the 3-D information. The principle of the method, the system structure, and experimental results are given. The method is non-intrusive to the subject and also satisfies the requirement of covert measurement in specific environments, so it has clear application prospects in 3-D face recognition.
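Fourier transform profilometry, the core of the method above, can be demonstrated on a 1-D synthetic fringe: band-pass the spectrum around the carrier, take the phase of the inverse transform, and remove the carrier ramp. All signal parameters below are illustrative, not the paper's system values.

```python
import numpy as np

N = 512
x = np.arange(N)
f0 = 16 / N                                          # carrier (cycles/pixel)
phase_true = 1.5 * np.exp(-((x - 256) / 60.0)**2)    # toy height-induced phase
I = 0.5 + 0.4 * np.cos(2*np.pi*f0*x + phase_true)    # deformed fringe pattern

spec = np.fft.fft(I)
# keep only a band around the +f0 lobe (suppresses DC and the -f0 lobe)
band = np.zeros(N, dtype=complex)
lo, hi = 8, 24                                       # bins around carrier bin 16
band[lo:hi+1] = spec[lo:hi+1]
analytic = np.fft.ifft(band)
phase = np.unwrap(np.angle(analytic)) - 2*np.pi*f0*x  # remove carrier ramp
phase -= phase[0]                                    # reference to flat region
```

The recovered `phase` approximates `phase_true`; in a real system it would be converted to height via the triangulation geometry.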
9.
10.
11.
Chen JX Wechsler H Pullen JM Zhu Y MacMahon EB 《IEEE transactions on bio-medical engineering》2001,48(9):1042-1052
We present a new system that integrates computer graphics, physics-based modeling, and interactive visualization to assist knee studies and surgical operations. First, we discuss generating patient-specific three-dimensional (3-D) knee models from the patient's magnetic resonance images (MRIs). The 3-D model is obtained by deforming a reference model to match the MRI dataset. Second, we present knee motion simulation, which visualizes patient-specific motion data on the patient-specific knee model. Third, we introduce the visualization of biomechanical information on a patient-specific model, focusing on contact area, contact forces, and menisci deformation. Traditional methods have difficulty visualizing the knee contact area without invasive procedures; the approach presented here provides an alternative for visualizing the knee contact area and forces without any risk to the patient. Finally, a virtual surgery can be performed. The constructed 3-D knee model is the basis of motion simulation, biomechanical visualization, and virtual surgery. Knee motion simulation determines the knee rotation angles as well as knee contact points, and these parameters are used to solve the biomechanical model. Our results integrate 3-D construction, motion simulation, and biomechanical visualization into one system. Overall, the methodologies here are useful elements for future virtual medical systems where all the components of visualization, automated model generation, and surgery simulation come together.
12.
Markerless real-time 3-D target region tracking by motion backprojection from projection images (cited 3 times; self-citations: 0; citations by others: 0)
Rohlfing T Denzler J Grässl C Russakoff DB Maurer CR 《IEEE transactions on medical imaging》2005,24(11):1455-1468
Accurate and fast localization of a predefined target region inside the patient is an important component of many image-guided therapy procedures. This problem is commonly solved by registration of intraoperative 2-D projection images to 3-D preoperative images. If the patient is not fixed during the intervention, the 2-D image acquisition is repeated several times during the procedure, and the registration problem can be cast instead as a 3-D tracking problem. To solve the 3-D problem, we propose in this paper to apply 2-D region tracking to first recover the components of the transformation that are in-plane to the projections. The 2-D motion estimates of all projections are backprojected into 3-D space, where they are then combined into a consistent estimate of the 3-D motion. We compare this method to intensity-based 2-D to 3-D registration and a combination of 2-D motion backprojection followed by a 2-D to 3-D registration stage. Using clinical data with a fiducial marker-based gold-standard transformation, we show that our method is capable of accurately tracking vertebral targets in 3-D from 2-D motion measured in X-ray projection images. Using a standard tracking algorithm (hyperplane tracking), tracking is achieved at video frame rates but fails relatively often (32% of all frames tracked with target registration error (TRE) better than 1.2 mm, 82% of all frames tracked with TRE better than 2.4 mm). With intensity-based 2-D to 2-D image registration using normalized mutual information (NMI) and pattern intensity (PI), accuracy and robustness are substantially improved. NMI tracked 82% of all frames in our data with TRE better than 1.2 mm and 96% of all frames with TRE better than 2.4 mm. This comes at the cost of a reduced frame rate: 1.7 s average processing time per frame and projection device. Results using PI were slightly more accurate, but required on average 5.4 s per frame. These results are still substantially faster than 2-D to 3-D registration. We conclude that motion backprojection from 2-D motion tracking is an accurate and efficient method for tracking 3-D target motion, but tracking 2-D motion accurately and robustly remains a challenge.
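The backprojection step, combining per-view in-plane motion estimates into one 3-D motion, reduces, for a pure translation under orthographic views, to a stacked least-squares problem. The two-view geometry below is a toy example, not the paper's calibrated X-ray setup.

```python
import numpy as np

# Hypothetical setup: two orthogonal orthographic projections observe the
# same 3-D target translation; each measures only its in-plane components.
P1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])    # view 1 sees (x, y)
P2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0]])    # view 2 sees (x, z)
t_true = np.array([1.0, -2.0, 0.5])

u1 = P1 @ t_true                    # 2-D motion tracked in view 1
u2 = P2 @ t_true                    # 2-D motion tracked in view 2

# backproject: stack the projection equations and solve in 3-D
A = np.vstack([P1, P2])
b = np.concatenate([u1, u2])
t_est, *_ = np.linalg.lstsq(A, b, rcond=None)
```

With more views the system becomes overdetermined and the least-squares solve averages out per-view tracking noise.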
13.
Ren J Patel RV McIsaac KA Guiraudon G Peters TM 《IEEE transactions on medical imaging》2008,27(8):1061-1070
Two-dimensional or 3-D visual guidance is often used for minimally invasive cardiac surgery and diagnosis. This visual guidance suffers from several drawbacks such as limited field of view, intermittent loss of signal, and, in some cases, difficulty of interpretation. These limitations become more evident in beating-heart procedures, when the surgeon has to operate in the presence of heart motion. In this paper, we propose dynamic 3-D virtual fixtures (DVFs) to augment the visual guidance system with haptic feedback, providing the surgeon with more helpful guidance by constraining the surgeon's hand motions and thereby protecting sensitive structures. DVFs can be generated from preoperative dynamic magnetic resonance (MR) or computed tomography (CT) images and then mapped to the patient during surgery. We have validated the feasibility of the proposed method on several simulated surgical tasks using a volunteer's cardiac image dataset. Validation results show that the integration of visual and haptic guidance can permit a user to perform surgical tasks more easily and with a reduced error rate. We believe this is the first work in the field of virtual fixtures that explicitly considers heart motion.
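A virtual fixture of the kind described can be caricatured as a spring force that activates when the tool tip penetrates a forbidden region; in the dynamic case, the region's center (and possibly radius) would be updated every frame from the preoperative motion model. The spherical region and gain below are illustrative, not the paper's formulation.

```python
import numpy as np

def fixture_force(tip, center, radius, k=200.0):
    """Spring-like repulsive force once the tool tip penetrates a
    spherical forbidden region around a sensitive structure."""
    offset = tip - center
    d = np.linalg.norm(offset)
    pen = radius - d                  # > 0 means the tip is inside the region
    if pen <= 0 or d < 1e-9:
        return np.zeros(3)
    return k * pen * offset / d       # push outward along the surface normal

f_out = fixture_force(np.array([2.0, 0.0, 0.0]), np.zeros(3), 1.0)  # outside
f_in = fixture_force(np.array([0.5, 0.0, 0.0]), np.zeros(3), 1.0)   # inside
```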
14.
《IEEE transactions on systems, man and cybernetics. Part C, Applications and reviews》2005,35(1):116-125
This paper presents an integrated method to identify an object pattern from an image and track its movement over a sequence of images. The sequence of images comes from a single perspective video source capturing data from a precalibrated scene. This information is used to reconstruct the scene in three dimensions (3-D) within a virtual environment where a user can interact with and manipulate the system. The steps performed are the following: i) Identify an object pattern from a two-dimensional perspective video source; the user outlines the region of interest (ROI) in the initial frame, and the procedure builds a refined mask of the dominant object within the ROI using the morphological watershed algorithm. ii) Track the object pattern between frames using object matching within the masks of the previous and next frames, computing the motion parameters. iii) Match the identified object pattern against a library of shapes to identify a corresponding 3-D object. iv) Create a virtual environment that reconstructs the scene in 3-D using the 3-D object and the motion parameters. This method can be applied to real-life problems such as traffic management and material flow congestion analysis.
15.
Ohzu H. Habara K. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1996,84(5):782-798
The phrase "Concealing Telecommunications Networks" is first introduced as an ultimate philosophical concept in human-to-human, physically evolving "multimedia" communications, employing the same face-to-face mode used in natural communication. Virtual reality (VR) technologies and their current applications are then introduced, followed by cutting-edge research on "Teleconferencing with Realistic Sensations", a communications system that conceals the existence of telecommunications networks. Next, research activities on "vision" and "motion", the most important underlying human functions supporting technologies such as VR, are presented. These activities consist of 1) a perception model that explains how human beings mentally reconstruct 3-D shapes from 2-D information projected on the retina, and 2) research on the close relationship between the senses, i.e., auditory and visual perception, visual information, and muscular motion stimuli. As a practical application, an example of measuring eye movements for early detection of Alzheimer's disease is briefly introduced. Finally, some fundamental problems with stereoscopic 3-D displays on 2-D screens, which can make them more fatiguing than the natural environment, are discussed.
16.
Efficient optical camera tracking in virtual sets (cited 2 times; self-citations: 0; citations by others: 0)
Xirouhakis Y.S. Drosopoulos A.I. Delopoulos A.N. 《IEEE transactions on image processing》2001,10(4):609-622
Optical tracking systems have become particularly popular in virtual studio applications, tending to replace electromechanical ones. However, optical systems are reported to be inferior in terms of camera motion estimation accuracy. Moreover, marker-based approaches often cause problems in image/video compositing and impose undesirable constraints on camera movement. The present work introduces a novel methodology for the construction of a two-tone blue screen, which allows the localization of the camera in three-dimensional (3-D) space on the basis of the captured sequence. At the same time, a novel algorithm is presented for the extraction of the camera's 3-D motion parameters based on 3-D-to-two-dimensional (2-D) line correspondences. Simulated experiments are included to illustrate the performance of the proposed system.
17.
H Ertan Cetingül Yücel Yemez Engin Erzin A Murat Tekalp 《IEEE transactions on image processing》2006,15(10):2879-2891
There have been several studies that jointly use audio, lip intensity, and lip geometry information for speaker identification and speech-reading applications. This paper proposes using explicit lip motion information, instead of or in addition to lip intensity and/or geometry information, for speaker identification and speech-reading within a unified feature selection and discrimination analysis framework, and addresses two important issues: 1) Is using explicit lip motion information useful, and 2) if so, what are the best lip motion features for these two applications? The best lip motion features for speaker identification are considered to be those that result in the highest discrimination of individual speakers in a population, whereas for speech-reading, the best features are those providing the highest phoneme/word/phrase recognition rate. Several lip motion feature candidates have been considered, including dense motion features within a bounding box about the lip, lip contour motion features, and combinations of these with lip shape features. Furthermore, a novel two-stage spatial and temporal discrimination analysis is introduced to select the best lip motion features for speaker identification and speech-reading applications. Experimental results using a hidden-Markov-model-based recognition system indicate that using explicit lip motion information provides additional performance gains in both applications, and lip motion features prove more valuable for the speech-reading application.
18.
The 6-D (six-dimensional) motion errors of a helicopter payload platform (i.e., flight-trajectory and attitude-angle errors) significantly affect the quality of airborne LiDAR point clouds and hence the accuracy of the reconstructed 3-D models. Analysing how each motion error affects point-cloud quality is important for eliminating each error's influence in a targeted way and effectively improving the accuracy of airborne LiDAR 3-D imaging products. A transfer relationship is established between the 3-D position deviation of the airborne laser-scanning footprint and the six motion errors of the airborne platform. Numerical simulation is used to quantitatively compare the influence of the six motion errors on the point-cloud density distribution, yielding the characteristics and regularities of each error's influence. The simulation results show that the three attitude-angle errors of the helicopter payload platform affect point-cloud density more significantly, and their influence grows with flight height, whereas the three flight-trajectory errors have comparatively little influence.
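The altitude dependence reported above is easy to reproduce: a small attitude error rotates the laser vector, so the footprint displacement scales with flight height, while a platform position error shifts the footprint by a constant amount. The error magnitudes below are illustrative.

```python
import numpy as np

def rot(roll, pitch, yaw):
    """Z-Y-X rotation matrix from attitude angles (radians)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def footprint_error(h, d_angle=np.radians(0.1), d_pos=0.05):
    """Footprint displacement from a small roll error vs. a small
    position error, for a nadir-pointing beam at flight height h (m)."""
    beam = np.array([0.0, 0.0, -h])                       # nadir laser vector
    attitude_err = np.linalg.norm(rot(d_angle, 0, 0) @ beam - beam)
    position_err = d_pos                                  # shifts footprint directly
    return attitude_err, position_err

a100, p100 = footprint_error(100.0)
a500, p500 = footprint_error(500.0)
```

At 100 m a 0.1-degree roll error already displaces the footprint by about h times the angle (roughly 0.17 m), and the displacement grows linearly with height, while the position error's effect stays constant.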
19.
《IEEE transactions on medical imaging》1997,16(5):630-641
The recovery of a three-dimensional (3-D) model from a sequence of two-dimensional (2-D) images is very useful in medical image analysis. Image sequences obtained from the relative motion between the object and the camera or the scanner contain more 3-D information than a single image. Methods to visualize computed tomograms can be divided into two approaches: surface rendering and volume rendering. In this paper, a new surface rendering method using optical flow is proposed. Optical flow is the apparent motion in the image plane produced by the projection of real 3-D motion onto the 2-D image. The 3-D motion of an object can be recovered from the optical-flow field using additional constraints. By extracting the surface information from 3-D motion, it is possible to obtain an accurate 3-D model of the object. Both synthetic and real image sequences have been used to illustrate the feasibility of the proposed method. The experimental results suggest that the proposed method is suitable for the reconstruction of 3-D models from ultrasound medical images as well as other computed tomograms.
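A minimal Lucas-Kanade-style sketch of the optical-flow constraint mentioned above, estimating one translational flow vector for a whole window; the paper's full method goes on to recover 3-D motion and surface structure, which this does not attempt.

```python
import numpy as np

def lucas_kanade(I0, I1):
    """Solve the least-squares system built from the brightness-constancy
    constraint Ix*u + Iy*v + It = 0 over the whole window."""
    Iy, Ix = np.gradient(I0)              # np.gradient returns d/dy, d/dx
    It = I1 - I0
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow                           # (u, v) in pixels

# toy pair: a smooth blob shifted by one pixel in x
y, x = np.mgrid[0:32, 0:32]
I0 = np.exp(-((x - 16.0)**2 + (y - 16.0)**2) / 20.0)
I1 = np.exp(-((x - 17.0)**2 + (y - 16.0)**2) / 20.0)
u, v = lucas_kanade(I0, I1)
```

The estimate is approximate because the constraint linearizes the image; coarse-to-fine pyramids handle larger displacements.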
20.
A dynamic-programming-based matching method for motion estimation, which optimises a Bayesian maximum likelihood function in a 3-D optimisation space, is presented. The Bayesian function consists of a matching cost and an object-based 2-D regularisation cost. The method gives more accurate results than block-based matching, since the motion boundaries lie close to the actual object boundaries.
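Reduced to one scanline and a translational shift per position, the idea above might look like the following dynamic program with a data (matching) cost plus a smoothness (regularisation) cost; the cost weights and 1-D setting are illustrative simplifications of the paper's 3-D optimisation space.

```python
import numpy as np

def dp_match(left, right, max_d=3, lam=0.5):
    """Assign each position i in `left` a shift d minimising the data cost
    |left[i] - right[i-d]| plus a smoothness penalty lam*|d - d_prev|."""
    n = len(left)
    cost = np.full((n, max_d + 1), np.inf)
    back = np.zeros((n, max_d + 1), dtype=int)
    for i in range(n):
        for d in range(min(i, max_d) + 1):       # keep right index in range
            data = abs(left[i] - right[i - d])
            if i == 0:
                cost[i, d] = data
            else:
                prev = cost[i - 1] + lam * np.abs(np.arange(max_d + 1) - d)
                back[i, d] = int(np.argmin(prev))
                cost[i, d] = data + prev[back[i, d]]
    d = int(np.argmin(cost[-1]))                 # backtrack the optimal path
    path = [d]
    for i in range(n - 1, 0, -1):
        d = back[i, d]
        path.append(d)
    path.reverse()
    return np.array(path)

right = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
left = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0])   # right shifted by 2
path = dp_match(left, right)
```

The smoothness term keeps the recovered shift field piecewise constant, which is what pushes estimated motion boundaries toward actual object boundaries.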