Similar literature
 20 similar articles found; search time 171 ms
1.
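Pan Jin, Yang Weiying. 《电声技术》 2009, 33(5): 62-65
Fast, efficient automatic synthesis of lip shapes driven by speech, and better synchronization between speech and lip motion, are the key issues in speech-driven facial animation. A speech-driven facial animation method based on formant analysis is proposed. The speech signal is windowed, split into frames, and transformed with the DFT; the first and second formants of the short-time spectrum are then analyzed, the results are mapped to a control sequence, and the control sequence is post-processed, for example by removing outliers. Basic dynamic mouth shapes are defined for a 3-D face model, and the control sequence is fed into the model on a timer to drive the facial animation. Experimental results show that the method is simple and fast, synchronizes speech and lip shape effectively, and produces coherent, natural animation. It can be widely used for dubbing virtual characters and shortens their production cycle.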
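The front end described in this abstract (windowing and framing, DFT, then analysis of the first and second formants) can be sketched roughly as below. This is only an illustrative sketch, not the authors' implementation: the function names, the Hamming window, and the frame sizes are assumptions, and picking the two strongest spectral peaks is a crude stand-in for real formant estimation, which usually works on a smoothed spectral envelope.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + n] * window[n] for n in range(frame_len)])
    return frames


def dft_magnitude(frame):
    """Magnitude of the DFT for bins 0 .. N/2 (naive O(N^2) transform)."""
    n_pts = len(frame)
    mags = []
    for k in range(n_pts // 2 + 1):
        acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                  for n in range(n_pts))
        mags.append(abs(acc))
    return mags


def two_strongest_peaks(mags, sample_rate, n_pts):
    """Frequencies (Hz) of the two largest local maxima, in ascending order.

    Assumption: used here as a rough proxy for the first two formants.
    """
    peaks = [(mags[k], k) for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    top = sorted(peaks, reverse=True)[:2]
    return sorted(k * sample_rate / n_pts for _, k in top)
```

Applied frame by frame, the resulting pairs of frequencies form the kind of control sequence that the abstract maps onto mouth shapes; the mapping and the outlier removal are not specified in the abstract and are left out here.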

2.
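A 3-D scanner can accurately capture the geometry and texture of a face, but the raw scan is only a single continuous surface that does not match the actual structure of a face and cannot be used for facial animation. To address this, a method for face modeling from 3-D scan data is proposed: a generic face model with complete structure is first coarsely fitted to the scan data, and a detail reconstruction technique then restores the surface detail and skin texture of the specific face. Experiments show that the resulting 3-D face model is highly realistic and structurally complete, and can generate continuous, natural expression animation.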

3.
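Li Hao, Chen Yanyan, Tang Chaojing. 《信号处理》 2012, 28(3): 322-328
Because Chinese is a syllable-based language whose articulation follows a "date-pit" (spindle-shaped) pattern, a model of Chinese dynamic visemes is proposed that models lip motion within a syllable and between syllables separately. Within a syllable, lip sub-motion models based on initials and finals are used: lip feature parameters are first extracted during the pronunciation of initials and finals, mouth shapes are grouped by these parameters to obtain a simplified syllable viseme model, and the similarity in mouth shape between the lip sub-motions and the pronunciation of the syllable is then computed. Between syllables, a weight function with graded vowel influence simulates coarticulation: the influence of each vowel on the mouth shape of the following consonant is analyzed first, and the weight function then controls the actual mouth shape. Experimental results show that, compared with monophone or triphone models of Chinese dynamic visemes, the method improves animation efficiency and makes Chinese speech-driven lip animation more plausible and natural.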

4.
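Modeling and animating a specific 3-D face is an area of great interest in computer graphics. This paper proposes a new method for building and animating a model of a specific face from two orthogonal photographs. First, the active contour tracking technique (snake) automatically obtains accurate positions of the facial feature points. The local elastic deformation method introduced in the paper then customizes a generic face model to the specific face, and a high-resolution texture image generated by image mosaicing is used for texture mapping. The method computes local facial deformation from the displacements of the feature points and the positions of non-feature points relative to the feature points, and it can also produce large facial changes and motions. Combined with a muscle model, it achieves real-time facial animation well and is fast and efficient. Finally, experimental results are given.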

5.
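Sun Zhuo, Yue Zhenjun. 《电声技术》 2007, 31(6): 37-40
Chinese voice conversion aims to convert the voice characteristics of a source speaker in Chinese speech into those of a target speaker. The proposed conversion algorithm for Chinese speakers has three parts. The first two use Gaussian mixture models to convert the spectral envelope (linear predictive coding) and its excitation (residual). The third uses support vector regression to model the prosody conversion rules and, taking the characteristics of Chinese speech into account, uses the pitch-synchronous overlap-add algorithm to adjust suprasegmental features. Compared with existing voice conversion algorithms, the algorithm adjusts prosody according to the suprasegmental characteristics of Chinese pronunciation; it achieves Chinese voice conversion effectively and yields highly natural synthesized speech.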

6.
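3-D facial animation is a hot topic in computer graphics. Because current 3-D animation models make faces hard to simulate and give results that are not realistic enough, a fitted abstract muscle model is proposed to simulate facial expressions simply and realistically. Building on the abstract muscle model commonly used in facial animation, it improves the mathematical model of the wide linear muscle, using deformation parameters to control the shape of the wide linear muscle and to simulate facial muscle actions directly. Simulation experiments show that the fitted abstract muscle model simulates complex mouth movements more realistically. Compared with the traditional abstract muscle model, it has modest computational complexity and gives more realistic results, and it has broad application prospects.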

7.
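《信息技术》 2015, (8): 26-30
Most existing face recognition systems rely on 2-D face images captured by devices such as digital cameras; when the pose of the target or the orientation of the camera changes, the image is often distorted so much that recognition fails. When the input face is a 3-D image, it can be transformed to any pose, which allows the target to be recognized. Face recognition under arbitrary pose can therefore be achieved by reconstructing the 3-D face and applying spatial pose transformations. Building on a study of existing 3-D face recognition algorithms, a more general method is proposed that combines 3-D depth data with 2-D RGB data, uses spatial transformations to represent the same face in multiple poses, and thereby builds a face database; at test time only an ordinary 2-D image is needed for recognition. Experimental results show that the method achieves good recognition performance on the collected face database.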

8.
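This paper proposes a method for measuring the 3-D shape of a face based on infrared structured light: infrared fringes are projected onto the surface of the face, and Fourier transform profilometry is used to obtain 3-D face information. The principle of the method, the system structure, and experimental results are given. The method is non-intrusive to the subject and also meets the requirement of covert measurement in specific environments, and it has clear application prospects in 3-D face recognition.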

9.
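A new face localization method based on symmetry transforms and Gaussian derivatives   Cited by: 3 (self-citations: 2, citations by others: 1)
Song Haina, Kuang Gangyao, Yu Wenxian. 《电子学报》 2003, 31(9): 1433-1436
In practical face recognition systems, complex backgrounds, uncontrolled illumination, and imaging quality severely affect accurate face localization. For these conditions, this paper proposes a new method for accurate face localization in natural indoor environments. The method makes full use of the strong symmetry and the 3-D characteristics of the face: Gaussian derivatives are used to find image edges, and rule-based constraints are then imposed on the generalized symmetry transform and the radial symmetry transform. This accurately locates the point between the eyebrows and estimates the scale factor, which in turn allows the eyes to be located accurately. The method is highly robust.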
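The edge-finding step mentioned in this abstract (Gaussian derivatives applied to the image) can be illustrated in one dimension. The sketch below is an assumption-laden simplification: it filters a single image row rather than a 2-D image, the names `gaussian_derivative_kernel` and `edge_response` are hypothetical, and the symmetry transforms that follow in the paper are not shown.

```python
import math


def gaussian_derivative_kernel(sigma, radius):
    """Sampled first derivative of a Gaussian: g'(x) = -x / sigma^2 * g(x)."""
    kernel = []
    for x in range(-radius, radius + 1):
        g = math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
        kernel.append(-x / (sigma * sigma) * g)
    return kernel


def edge_response(row, sigma=1.0):
    """Convolve one image row with the Gaussian-derivative kernel.

    Large absolute values mark intensity edges; borders are handled by
    clamping indices to the row (an assumption made for this sketch).
    """
    radius = int(3 * sigma + 0.5)
    kernel = gaussian_derivative_kernel(sigma, radius)
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i - (k - radius), 0), n - 1)  # index i - x, clamped
            acc += w * row[j]
        out.append(acc)
    return out
```

For a row that steps from dark to bright, the response is positive and peaks at the step, while flat regions give a response of zero; a larger `sigma` trades localization for noise suppression.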

10.
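To address the shortcomings of earlier 3-D face model reconstruction algorithms (poor practicality, high complexity, the need for a generic face model, and sensitivity to noise), a reconstruction algorithm with low computational cost and no need for a generic face model is proposed. With feature points determined with manual assistance, the algorithm uses an energy-minimization constraint to perform an initial fusion of the depth maps and then applies an improved ICP algorithm to obtain an implicit 3-D face model. Transforming and projecting the model yields 2-D face images in different poses. Experimental results show an average fusion error of only 1.32 mm and realistic results. Compared with other algorithms, it also consumes less storage and is more stable.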

11.
We present a new system that integrates computer graphics, physics-based modeling, and interactive visualization to assist knee study and surgical operation. First, we discuss generating patient-specific three-dimensional (3-D) knee models from the patient's magnetic resonance images (MRIs). The 3-D model is obtained by deforming a reference model to match the MRI dataset. Second, we present simulating knee motion that visualizes patient-specific motion data on the patient-specific knee model. Third, we introduce visualizing biomechanical information on a patient-specific model. The focus is on visualizing contact area, contact forces, and menisci deformation. Traditional methods have difficulty in visualizing knee contact area without using invasive methods. The approach presented here provides an alternative of visualizing the knee contact area and forces without any risk to the patient. Finally, a virtual surgery can be performed. The constructed 3-D knee model is the basis of motion simulation, biomechanical visualization, and virtual surgery. Knee motion simulation determines the knee rotation angles as well as knee contact points. These parameters are used to solve the biomechanical model. Our results integrate 3-D construction, motion simulation, and biomechanical visualization into one system. Overall, the methodologies here are useful elements for future virtual medical systems where all the components of visualization, automated model generation, and surgery simulation come together.

12.
Accurate and fast localization of a predefined target region inside the patient is an important component of many image-guided therapy procedures. This problem is commonly solved by registration of intraoperative 2-D projection images to 3-D preoperative images. If the patient is not fixed during the intervention, the 2-D image acquisition is repeated several times during the procedure, and the registration problem can be cast instead as a 3-D tracking problem. To solve the 3-D problem, we propose in this paper to apply 2-D region tracking to first recover the components of the transformation that are in-plane to the projections. The 2-D motion estimates of all projections are backprojected into 3-D space, where they are then combined into a consistent estimate of the 3-D motion. We compare this method to intensity-based 2-D to 3-D registration and a combination of 2-D motion backprojection followed by a 2-D to 3-D registration stage. Using clinical data with a fiducial marker-based gold-standard transformation, we show that our method is capable of accurately tracking vertebral targets in 3-D from 2-D motion measured in X-ray projection images. Using a standard tracking algorithm (hyperplane tracking), tracking is achieved at video frame rates but fails relatively often (32% of all frames tracked with target registration error (TRE) better than 1.2 mm, 82% of all frames tracked with TRE better than 2.4 mm). With intensity-based 2-D to 2-D image registration using normalized mutual information (NMI) and pattern intensity (PI), accuracy and robustness are substantially improved. NMI tracked 82% of all frames in our data with TRE better than 1.2 mm and 96% of all frames with TRE better than 2.4 mm. This comes at the cost of a reduced frame rate, 1.7 s average processing time per frame and projection device. Results using PI were slightly more accurate, but required on average 5.4 s time per frame. 
These results are still substantially faster than 2-D to 3-D registration. We conclude that motion backprojection from 2-D motion tracking is an accurate and efficient method for tracking 3-D target motion, but tracking 2-D motion accurately and robustly remains a challenge.

13.
Two-dimensional or 3-D visual guidance is often used for minimally invasive cardiac surgery and diagnosis. This visual guidance suffers from several drawbacks such as limited field of view, loss of signal from time to time, and in some cases, difficulty of interpretation. These limitations become more evident in beating-heart procedures when the surgeon has to perform a surgical procedure in the presence of heart motion. In this paper, we propose dynamic 3-D virtual fixtures (DVFs) to augment the visual guidance system with haptic feedback, to provide the surgeon with more helpful guidance by constraining the surgeon's hand motions thereby protecting sensitive structures. DVFs can be generated from preoperative dynamic magnetic resonance (MR) or computed tomography (CT) images and then mapped to the patient during surgery. We have validated the feasibility of the proposed method on several simulated surgical tasks using a volunteer's cardiac image dataset. Validation results show that the integration of visual and haptic guidance can permit a user to perform surgical tasks more easily and with reduced error rate. We believe this is the first work presented in the field of virtual fixtures that explicitly considers heart motion.

14.
This paper presents an integrated method to identify an object pattern from an image, and track its movement over a sequence of images. The sequence of images comes from a single perspective video source, which is capturing data from a precalibrated scene. This information is used to reconstruct the scene in three dimensions (3-D) within a virtual environment where a user can interact and manipulate the system. The steps that are performed include the following: i) Identify an object pattern from a two-dimensional perspective video source. The user outlines the region of interest (ROI) in the initial frame; the procedure builds a refined mask of the dominant object within the ROI using the morphological watershed algorithm. ii) The object pattern is tracked between frames using object matching within the mask provided by the previous and next frame, computing the motion parameters. iii) The identified object pattern is matched with a library of shapes to identify a corresponding 3-D object. iv) A virtual environment is created to reconstruct the scene in 3-D using the 3-D object and the motion parameters. This method can be applied to real-life application problems, such as traffic management and material flow congestion analysis.

15.
The phrase “Concealing Telecommunications Networks” is first introduced as an ultimate philosophical concept in human-to-human or physically evolving “multimedia” communications by employing the same face-to-face mode that is used in natural communications. Then virtual reality (VR) technologies and their current applications are introduced, followed by an introduction of cutting-edge research on “Teleconferencing with Realistic Sensations”, which is a communications system that conceals the existence of telecommunications networks. Next, research activities on “vision” and “motion”, the most important underlying human functions supporting technologies such as VR, are presented. These activities consist of 1) a perception model that explains how human beings mentally reconstruct 3-D shapes from 2-D information projected on the retina, and 2) research on the close relationship between the senses, i.e., auditory and visual perception, visual information, and muscular motion stimuli. As a practical application, an example of measuring eye movements for early detection of Alzheimer's disease is briefly introduced. Finally, some fundamental problems with stereoscopic 3-D displays on 2-D screens, which can make them more fatiguing than the natural environment, are discussed.

16.
Efficient optical camera tracking in virtual sets   Cited by: 2 (self-citations: 0, citations by others: 2)
Optical tracking systems have become particularly popular in virtual studio applications, tending to substitute electromechanical ones. However, optical systems are reported to be inferior in terms of accuracy in camera motion estimation. Moreover, marker-based approaches often cause problems in image/video compositing and impose undesirable constraints on camera movement. The present work introduces a novel methodology for the construction of a two-tone blue screen, which allows the localization of the camera in three-dimensional (3-D) space on the basis of the captured sequence. At the same time, a novel algorithm is presented for the extraction of the camera's 3-D motion parameters based on 3-D-to-two-dimensional (2-D) line correspondences. Simulated experiments have been included to illustrate the performance of the proposed system.

17.
There have been several studies that jointly use audio, lip intensity, and lip geometry information for speaker identification and speech-reading applications. This paper proposes using explicit lip motion information, instead of or in addition to lip intensity and/or geometry information, for speaker identification and speech-reading within a unified feature selection and discrimination analysis framework, and addresses two important issues: 1) Is using explicit lip motion information useful, and, 2) if so, what are the best lip motion features for these two applications? The best lip motion features for speaker identification are considered to be those that result in the highest discrimination of individual speakers in a population, whereas for speech-reading, the best features are those providing the highest phoneme/word/phrase recognition rate. Several lip motion feature candidates have been considered including dense motion features within a bounding box about the lip, lip contour motion features, and combination of these with lip shape features. Furthermore, a novel two-stage, spatial, and temporal discrimination analysis is introduced to select the best lip motion features for speaker identification and speech-reading applications. Experimental results using a hidden-Markov-model-based recognition system indicate that using explicit lip motion information provides additional performance gains in both applications, and lip motion features prove more valuable in the case of the speech-reading application.

18.
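The 6-D (six-dimensional) motion errors of a helicopter payload platform, i.e., errors in flight trajectory and in attitude angles, significantly affect the quality of airborne LiDAR point clouds and hence the accuracy of 3-D reconstructed models. Analyzing how each motion error affects point-cloud quality is important for removing its influence in a targeted way and for effectively improving the accuracy of airborne LiDAR 3-D imaging products. The transfer relationship between the 3-D position deviation of the airborne laser scanning footprints and the six motion errors of the airborne platform is established. Numerical simulation is used to compare quantitatively the influence of the six motion errors on the density distribution of the laser point cloud, yielding the characteristics and patterns of their influence. The simulation results show that the three attitude-angle motion errors of the helicopter payload platform affect point-cloud density more strongly, and that their effect grows with flight altitude, whereas the three flight-trajectory motion errors have a relatively small effect.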
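The relationship this abstract describes, between platform motion errors and the position of the laser footprint, can be illustrated with a deliberately simple geometric sketch. Everything here is an assumption made for illustration, not the paper's model: flat ground at z = 0, a Z-Y-X (yaw, pitch, roll) rotation convention, a beam that scans across track, and the hypothetical function names `rotation`, `footprint`, and `deviation`.

```python
import math


def rotation(roll, pitch, yaw):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def footprint(position, attitude, scan_angle):
    """Intersect the laser beam with flat ground z = 0 (assumed terrain)."""
    x, y, h = position
    r = rotation(*attitude)
    beam = (0.0, math.sin(scan_angle), -math.cos(scan_angle))  # body frame
    d = [sum(r[i][j] * beam[j] for j in range(3)) for i in range(3)]
    t = -h / d[2]  # beam parameter at which z reaches 0 (d[2] < 0 when pointing down)
    return (x + t * d[0], y + t * d[1])


def deviation(height, pos_err=(0.0, 0.0, 0.0), att_err=(0.0, 0.0, 0.0),
              scan_angle=0.0):
    """Horizontal footprint offset caused by trajectory and attitude errors."""
    ideal = footprint((0.0, 0.0, height), (0.0, 0.0, 0.0), scan_angle)
    actual = footprint((pos_err[0], pos_err[1], height + pos_err[2]),
                       att_err, scan_angle)
    return math.hypot(actual[0] - ideal[0], actual[1] - ideal[1])
```

In this toy model a roll error of 0.001 rad at nadir shifts the footprint by `height * tan(0.001)`, so the offset grows in proportion to flight altitude, while a horizontal trajectory error shifts the footprint by the same amount at any altitude. That is consistent with the trend the abstract reports, though the paper's own analysis concerns point-cloud density rather than a single footprint.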

19.
The recovery of a three-dimensional (3-D) model from a sequence of two-dimensional (2-D) images is very useful in medical image analysis. Image sequences obtained from the relative motion between the object and the camera or the scanner contain more 3-D information than a single image. Methods to visualize the computed tomograms can be divided into two approaches: the surface rendering approach and the volume rendering approach. In this paper, a new surface rendering method using optical flow is proposed. Optical flow is the apparent motion in the image plane produced by the projection of real 3-D motion onto the 2-D image. The 3-D motion of an object can be recovered from the optical-flow field using additional constraints. By extracting the surface information from 3-D motion, it is possible to obtain an accurate 3-D model of the object. Both synthetic and real image sequences have been used to illustrate the feasibility of the proposed method. The experimental results suggest that the proposed method is suitable for the reconstruction of 3-D models from ultrasound medical images as well as other computed tomograms.

20.
Panis S., Cosmas J.P. Electronics Letters, 1996, 32(10): 872-873
A dynamic programming based matching method for motion estimation, which optimises a Bayesian maximum likelihood function in a 3-D optimisation space, is presented. The Bayesian function consists of a matching cost and an object based 2-D regularisation cost. The method gives results more accurate than block-based matching since the motion boundaries are close to the actual object boundaries.
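The general idea of combining a matching cost with a regularisation cost and minimising the total by dynamic programming can be sketched on a single scanline. This is a much reduced illustration and not the method of the paper: it works in one dimension instead of a 3-D optimisation space, it uses an absolute-difference matching cost and an absolute-difference smoothness penalty as stand-ins for the Bayesian terms, and the name `dp_match` and its parameters are hypothetical.

```python
def dp_match(prev_row, next_row, max_disp, smooth_weight):
    """Assign one displacement per pixel of prev_row by dynamic programming.

    Minimises sum_i |prev[i] - next[i + d_i]| + smooth_weight * |d_i - d_(i-1)|.
    Assumes both rows have the same length.
    """
    n = len(prev_row)
    disps = list(range(-max_disp, max_disp + 1))
    big = float("inf")

    def match_cost(i, d):
        j = i + d
        return abs(prev_row[i] - next_row[j]) if 0 <= j < len(next_row) else big

    cost = [match_cost(0, d) for d in disps]
    back = []  # back[i - 1][k]: best predecessor index for displacement k at pixel i
    for i in range(1, n):
        new_cost, pointers = [], []
        for d in disps:
            best_k = min(range(len(disps)),
                         key=lambda k: cost[k] + smooth_weight * abs(d - disps[k]))
            total = (cost[best_k] + smooth_weight * abs(d - disps[best_k])
                     + match_cost(i, d))
            new_cost.append(total)
            pointers.append(best_k)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the last pixel.
    k = min(range(len(disps)), key=lambda idx: cost[idx])
    path = [disps[k]]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(disps[k])
    return path[::-1]
```

Because the smoothness term is paid only where the displacement changes, the recovered field is piecewise constant. In textureless stretches several displacement fields can have the same total cost, so only the textured pixels are determined uniquely.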
