Similar Articles
1.
We present an approach to identify noncooperative individuals at a distance from a sequence of images, using 3-D face models. Most biometric features (such as fingerprints, hand shape, iris, or retinal scans) require cooperative subjects in close proximity to the biometric system. We process images acquired with an ultrahigh-resolution video camera, infer the location of the subjects' heads, use this information to crop the region of interest, build a 3-D face model, and use this 3-D model to perform biometric identification. To build the 3-D model, we use an image sequence, as natural head and body motion provides enough viewpoint variation to perform stereomotion for 3-D face reconstruction. We have conducted experiments on 2-D and 3-D databases collected in our laboratory. First, we found that metric 3-D face models can be used for recognition by applying a simple scaling method, even though the 3-D reconstruction has no exact scale. Second, experiments using a commercial 3-D matching engine suggest the feasibility of the proposed approach for recognition against 3-D galleries at a distance (3, 6, and 9 m). Moreover, we show initial 3-D face modeling results under various factors, including head motion, outdoor lighting conditions, and glasses. The evaluation results suggest that video data alone, at a distance of 3 to 9 m, can provide a 3-D face shape that supports successful face recognition. The performance of 3-D–3-D recognition with the currently generated models does not quite match that of 2-D–2-D recognition. We attribute this to the quality of the inferred models, which suggests a clear path for future research.
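The "simple scaling method" mentioned above normalizes away the unknown global scale of the reconstruction before matching. A minimal sketch of one plausible such normalization (the function name and unit-RMS convention are assumptions, not the paper's actual procedure):

```python
import numpy as np

def normalize_scale(points):
    """Center a 3-D point cloud and rescale it to unit RMS radius,
    removing the unknown global scale left by the reconstruction."""
    centered = points - points.mean(axis=0)
    rms = np.sqrt((centered ** 2).sum(axis=1).mean())
    return centered / rms

# Two reconstructions of the same face at different, unknown scales
rng = np.random.default_rng(0)
face = rng.normal(size=(100, 3))
probe = 2.7 * face                      # same shape, arbitrary scale

a, b = normalize_scale(face), normalize_scale(probe)
print(np.allclose(a, b))                # scale difference removed -> True
```

After such a normalization, two metric reconstructions of the same face become directly comparable by a 3-D matching engine.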

2.
Recently, we proposed a real-time tracker that simultaneously tracks the 3-D head pose and facial actions in monocular video sequences, which can be provided by low-quality cameras. This paper makes two main contributions. First, we propose an automatic 3-D face pose initialization scheme for the real-time tracker by adopting a 2-D face detector and an eigenface system. Second, we use the proposed methods (initialization and tracking) to enhance the human–machine interaction functionality of an AIBO robot. More precisely, we show how the orientation of the robot's camera (or any active vision system) can be controlled through the estimation of the user's head pose. Applications based on head-pose imitation, such as telepresence, virtual reality, and video games, can directly exploit the proposed techniques. Experiments on real videos confirm the robustness and usefulness of the proposed methods.

3.
We introduce a new model for personal recognition based on the 3-D geometry of the face. The model is designed for application scenarios where the acquisition conditions constrain the facial position. The 3-D structure of a facial surface is compactly represented by sets of contours (facial contours) extracted around the automatically pinpointed nose tip and inner eye corners. The metric used to decide whether a point on the face belongs to a facial contour is its geodesic distance from a given landmark. Iso-geodesic contours are inherently robust to head pose variations, including in-depth rotations of the face. Since these contours are extracted from rigid parts of the face, the resulting recognition algorithms are insensitive to changes in facial expression. The facial contours are encoded using innovative pose-invariant features, including Procrustean distances defined on pose-invariant curves. The extracted features are combined in a hierarchical manner to create three parallel face recognizers. Inspired by the effectiveness of region-ensemble approaches, the three recognizers constructed around the nose tip and inner corners of the eyes are fused at both the feature level and the match-score level to create a unified face recognition algorithm with boosted performance. The performance of the proposed algorithms is evaluated and compared with other algorithms from the literature on a large public database appropriate for the assumed constrained application scenario.
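The iso-geodesic contours described above can be approximated on a triangulated face mesh by running Dijkstra's algorithm over the mesh edges from a landmark vertex and keeping vertices whose distance falls in a narrow band. A toy sketch under that assumption (the graph, band width, and function names are illustrative, not the paper's implementation):

```python
import heapq

def geodesic_distances(adj, source):
    """Dijkstra over a mesh graph: adj maps vertex -> [(neighbor, edge_len)].
    Returns the approximate geodesic distance from `source` to every vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def iso_geodesic_contour(dist, radius, band=0.05):
    """Vertices whose geodesic distance from the landmark lies in a
    narrow band around `radius` -- one facial contour."""
    return sorted(v for v, d in dist.items() if abs(d - radius) <= band)

# Toy 5-vertex "mesh" with the nose-tip landmark at vertex 0
adj = {0: [(1, 1.0), (2, 1.0)],
       1: [(0, 1.0), (3, 1.0)],
       2: [(0, 1.0), (4, 1.1)],
       3: [(1, 1.0)],
       4: [(2, 1.1)]}
dist = geodesic_distances(adj, 0)
print(iso_geodesic_contour(dist, 1.0))  # -> [1, 2]
```

Because geodesic distances along the surface are preserved under rigid head rotation, contours extracted this way are stable across pose changes, which is the property the recognizer relies on.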

4.
3-D motion estimation in model-based facial image coding
An approach to estimating the motion of the head and facial expressions in model-based facial image coding is presented. An affine nonrigid motion model is set up. The specific knowledge about facial shape and facial expression is formulated in this model in the form of parameters. A direct method for estimating the two-view motion parameters, based on the affine model, is discussed. Based on the reasonable assumption that the 3-D motion of the face is almost smooth in the time domain, several approaches to predicting the motion of the next frame are proposed. Using a 3-D model, the approach is characterized by a feedback loop connecting computer vision and computer graphics. Embedding the synthesis techniques into the analysis phase greatly improves the performance of motion estimation. Simulations with long image sequences of real-world scenes indicate that the method not only greatly reduces computational complexity but also substantially improves estimation accuracy.
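The simplest predictor consistent with the smoothness assumption above is constant-velocity extrapolation of the motion parameters. A hedged sketch (the parameter vector layout is hypothetical; the paper proposes several predictors, of which this is only the most basic):

```python
import numpy as np

def predict_next(params_prev, params_curr):
    """Constant-velocity prediction: assuming the 3-D head motion is
    smooth in time, extrapolate the next frame's motion parameters
    linearly from the last two estimates."""
    return params_curr + (params_curr - params_prev)

# Motion-parameter vectors (e.g. rotation + translation) for two frames
p0 = np.array([0.00, 0.10, 0.00])
p1 = np.array([0.02, 0.12, 0.01])
p2_predicted = predict_next(p0, p1)
print(p2_predicted)
```

Such a prediction gives the two-view estimator a good starting point for the next frame, which is one way a temporal-smoothness assumption reduces computational cost.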

5.
Integral projections are a useful technique in many computer vision problems. In this paper, we present a perceptual interface which allows us to navigate through a virtual 3D world by using the movements of the face of human users. The system applies advanced computer vision techniques to detect, track, and estimate the pose of the user’s head. The core of the proposed approach consists of a face tracker, which is based on the computation, alignment, and analysis of integral projections. This technique provides a robust, accurate, and stable 2D location of the face in each frame of the input video. Then, 3D location and orientation of the head are estimated, using some predefined heuristics. Finally, the resulting 3D pose is transformed into control signals for the navigation in the virtual world. The proposed approach has been implemented and tested in a prototype, which is publicly available. Some experimental results are shown, proving the feasibility of the method. The perceptual interface is fast, stable, and robust to facial expression and illumination conditions.
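An integral projection collapses a 2-D image region into two 1-D signals: the mean (or sum) of intensities along each row and along each column. A minimal sketch of the computation (the toy image and the band-location step are illustrative, not the paper's tracker):

```python
import numpy as np

def integral_projections(gray):
    """Vertical and horizontal integral projections of a grayscale
    region: the mean intensity of each row and of each column."""
    return gray.mean(axis=1), gray.mean(axis=0)

# Toy 6x6 region with a bright horizontal band (rows 2-3); features
# such as the eyes and mouth produce similar peaks/valleys in the
# row projection, which is what the tracker aligns across frames
img = np.zeros((6, 6))
img[2:4, :] = 1.0
rows, cols = integral_projections(img)
print(int(np.argmax(rows)))   # -> 2 (band starts at row 2)
```

Because the projections are 1-D, aligning them between consecutive frames is far cheaper than 2-D template matching, which is what makes this kind of tracker fast and stable.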

6.
Sketching a recognizable human face involves artistic talent and an intuitive knowledge of which aspects of the face are important in recognition. A man-machine system, called WHATSISFACE, has been developed with which a nonartist can create, on a graphic display, any male Caucasian facial image resembling the face in a photograph in front of him. The computer system contains pre-stored facial features, an average face used as a starting point, and a heuristic strategy that guides the user through a carefully constructed sequence of questions, choices, and feature manipulations. The user makes all the visual decisions and can change individual features, or hierarchically organized sets of features, using analog input devices.

7.
Video conferencing provides an environment for multiple users linked on a network to have meetings. Since a large quantity of audio and video data is transferred to multiple users in real time, research into reducing the quantity of data to be transferred has been drawing attention. Such methods extract and transfer only the features of a user from the video data and then reconstruct the video conference using virtual humans. The disadvantage of such an approach is that only the positions and features of the hands and head are extracted and reconstructed, while the other virtual body parts do not follow the user. In order to enable a virtual human to accurately mimic the entire body of the user in a 3D virtual conference, we examined which features should be extracted to express a user more clearly and how they can be reproduced by a virtual human. This 3D video conferencing estimates the user’s pose by comparing predefined images with a photographed image of the user and generates a virtual human that takes the estimated pose. However, this requires predefining a diverse set of images for pose estimation and, moreover, it is difficult to define behaviors that express poses correctly. This paper proposes a framework to automatically generate both the pose images used to estimate a user’s pose and the behaviors required to present a user as a virtual human in a 3D video conference. The method for applying this framework to a 3D video conference on the basis of the automatically generated data is also described. In the experiment, the proposed framework was implemented on a mobile device, and the generation process for the poses and behaviors of the virtual human was verified. Finally, by applying programming by demonstration, we developed a system that can automatically collect the various data necessary for a video conference without any prior knowledge of the video conference system.

8.
A new approach to the generation of feature-point-driven facial animation is presented. In the proposed approach, a hypothetical face is used to control the animation of a face model. The hypothetical face is constructed by connecting predefined facial feature points to create a net, so that each facet of the net is represented by a Coons surface. Deformation of the face model is controlled by changing the shape of the hypothetical face, which is performed by changing the locations of the feature points and their tangents. Experimental results show that this hypothetical-face-based method can generate facial expressions that are visually almost identical to those of a real face.

9.
This paper presents a virtual try-on system based on augmented reality for design personalization of facial accessory products. The system offers several novel functions that support real-time evaluation and modification of eyeglasses frames. A 3D glasses model is embedded within the video stream of the person wearing the glasses. Machine learning algorithms are developed for instantaneous tracking of facial features without the use of markers. The tracking result enables continuous positioning of the glasses model on the user’s face as it moves during the try-on process. In addition to color and texture, the user can instantly modify the glasses shape through simple semantic parameters. These functions not only facilitate highly interactive product evaluation by human users, but also engage them in the design process. This work has thus implemented the concept of human-centric design personalization.

10.
In this paper, we present a system for real-time performance-driven facial animation. With the system, the user can control the facial expression of a digital character by acting out the desired facial action in front of an ordinary camera. First, we create a muscle-based 3D face model. The muscle actuation parameters are used to animate the face model. To increase the realism of the facial animation, the orbicularis oris in our face model is divided into an inner part and an outer part. We also establish the relationship between jaw rotation and facial surface deformation. Second, a real-time facial tracking method is employed to track the facial features of a performer in the video. Finally, the tracked facial feature points are used to estimate muscle actuation parameters to drive the face model. Experimental results show that our system runs in real time and outputs realistic facial animations. Compared with most existing performance-based facial animation systems, ours does not require facial markers, intrusive lighting, or special scanning equipment; thus it is inexpensive and easy to use.

11.
We describe a novel approach for 3-D ear biometrics using video. A series of frames is extracted from a video clip and the region of interest in each frame is independently reconstructed in 3-D using shape from shading. The resulting 3-D models are then registered using the iterative closest point algorithm. We iteratively consider each model in the series as a reference model and calculate the similarity between the reference model and every model in the series using a similarity cost function. Cross validation is performed to assess the relative fidelity of each 3-D model. The model that demonstrates the greatest overall similarity is determined to be the most stable 3-D model and is subsequently enrolled in the database. Experiments are conducted using a gallery set of 402 video clips and a probe of 60 video clips. The results (95.0% rank-1 recognition rate and 3.3% equal error rate) indicate that the proposed approach can produce recognition rates comparable to systems that use 3-D range data. To the best of our knowledge, we are the first to develop a 3-D ear biometric system that obtains a 3-D ear structure from a video sequence.
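The model-selection step above reduces to: given pairwise similarity scores between the per-frame 3-D models, enroll the one with the greatest overall similarity to the rest. A minimal sketch of that selection (the matrix values and mean-similarity criterion are illustrative assumptions; the paper uses its own similarity cost function):

```python
import numpy as np

def most_stable_model(similarity):
    """Given a symmetric matrix of pairwise similarity scores between
    the 3-D models reconstructed from each frame, return the index of
    the model with the greatest mean similarity to all the others."""
    n = similarity.shape[0]
    # Exclude self-similarity on the diagonal from the mean
    mask = ~np.eye(n, dtype=bool)
    mean_sim = (similarity * mask).sum(axis=1) / (n - 1)
    return int(np.argmax(mean_sim))

# Pairwise similarities for 3 per-frame models; model 1 agrees best
S = np.array([[1.0, 0.8, 0.5],
              [0.8, 1.0, 0.9],
              [0.5, 0.9, 1.0]])
print(most_stable_model(S))   # -> 1
```

Selecting the model that best agrees with its peers filters out frames where shape-from-shading produced an unreliable reconstruction before anything is enrolled in the gallery.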

12.
罗常伟, 於俊, 汪增福. 《自动化学报》 2014, 40(10): 2245-2252
This paper describes a real-time video-driven facial animation synthesis system. With the system, the user can control the expression of a virtual face simply by performing facial actions in front of a camera. First, a muscle-based 3D face model is built, and muscle actuation parameters are used to control facial deformation. To improve the realism of the facial animation, the orbicularis oris is divided into outer and inner parts, and the relationship between facial deformation and jaw rotation is established. Then, a real-time feature-point tracking algorithm is used to track the facial feature points in the video. Finally, the tracking results are converted into muscle actuation parameters to drive the facial animation. Experimental results show that the system runs in real time and that the synthesized animation is highly realistic. Compared with most existing video-driven facial animation methods, the system requires no facial markers or 3D scanning equipment, which greatly improves ease of use.

13.
Constructing a 3D individualized head model from two orthogonal views
A new scheme for constructing a 3D individualized head model automatically from only a side view and the front view of the face is presented. The approach instantiates a generic 3D head model based on a set of the individual's facial features extracted by a local maximum-curvature tracking (LMCT) algorithm that we have developed. A distortion vector field that deforms the generic model to that of the individual is computed by correspondence matching and interpolation. The two input facial images are blended and texture-mapped onto the 3D head model. Arbitrary views of a person can be generated from the two orthogonal images, and the scheme can be implemented efficiently on a low-cost, PC-based platform.

14.
This paper proposes a deep bidirectional long short-term memory (DBLSTM) approach to modeling the long-context, nonlinear mapping between audio and visual streams for a video-realistic talking head. In the training stage, an audio-visual stereo database is first recorded as a subject talks to a camera. The audio streams are converted into acoustic features, i.e., Mel-frequency cepstral coefficients (MFCCs), and their textual labels are also extracted. The visual streams, in particular the lower face region, are compactly represented by active appearance model (AAM) parameters, by which the shape and texture variations can be jointly modeled. Given pairs of audio and visual parameter sequences, a DBLSTM model is trained to learn the sequence mapping from the audio to the visual space. For any unseen speech audio, whether originally recorded or synthesized by text-to-speech (TTS), the trained DBLSTM model can predict a convincing AAM parameter trajectory for the lower face animation. To further improve the realism of the proposed talking head, the trajectory tiling method is adopted, using the DBLSTM-predicted AAM trajectory as a guide to select a smooth sequence of real sample images from the recorded database. We then stitch the selected lower-face image sequence back onto a background face video of the same subject, resulting in a video-realistic talking head. Experimental results show that the proposed DBLSTM approach outperforms the existing HMM-based approach in both objective and subjective evaluations.

15.
《Real》1996,2(2):67-79
Many researchers have studied techniques related to the analysis and synthesis of human heads under motion with face deformations. These techniques can be used for defining low-rate image compression algorithms (model-based image coding), cinema technologies, and video-phones, as well as for applications of virtual reality. Such techniques need real-time performance and a strong integration between the mechanisms of motion estimation and those of rendering and animation of the 3D synthetic head/face. In this paper, a complete and integrated system for tracking and synthesizing facial motions in real time with low-cost architectures is presented. Facial deformation curves, represented as spatiotemporal B-splines, are used for tracking in order to model the main facial features. In addition, the proposed system is capable of adapting a generic 3D wire-frame model of a head/face to the face that must be tracked; the simulations of the face deformations are therefore produced using a realistic patterned face.

16.
We have developed an easy-to-use and cost-effective system to construct textured 3D animated face models from videos with minimal user interaction. This is a particularly challenging task for faces due to a lack of prominent textures. We develop a robust system by following a model-based approach: we make full use of generic knowledge of faces in head motion determination, head tracking, model fitting, and multiple-view bundle adjustment. Our system first takes, with an ordinary video camera, images of the face of a person sitting in front of the camera and turning their head from one side to the other. After five manual clicks on two images to indicate the positions of the eye corners, nose tip, and mouth corners, the system automatically generates a realistic-looking 3D human head model that can be animated immediately (different poses, facial expressions, and talking). A user with a PC and a video camera can use our system to generate his or her face model in a few minutes. The face model can then be imported into a favorite game, and the user sees themselves and their friends take part in the game they are playing. We have demonstrated the system live on a laptop computer at many events and constructed face models for hundreds of people. It works robustly under various environment settings.

17.
To address the problem that existing methods for generating talking-face videos from speech ignore the speaker's head motion, a landmark-based, speech-driven talking-face video generation method is proposed. Facial contour landmarks and lip landmarks are used to represent the speaker's head motion and lip motion, respectively. A parallel multi-branch network converts the input speech into facial landmarks, and the final face video is generated from the continuous lip-landmark and head-landmark sequences together with a template image. Quantitative and qualitative experiments show that the method can synthesize clear, natural talking-face videos with head motion and achieves strong performance metrics.

18.
19.
20.
To address the shortcomings of traditional methods, such as limited generality, poor real-time performance, and incompatibility with video transmission systems, a 3D face modeling technique for a single image with arbitrary pose is proposed. First, an improved active appearance model based on initial position correction and model-instance selection is applied to the extraction of facial feature points. Then, the CANDIDE-3 wireframe model is adjusted by combining facial structural features with a spatial affine transformation, recovering the global position and matching the shape of the corresponding face. On this basis, the local coordinates are fine-tuned according to the symmetry of the region of interest, and a realistic texture is constructed. Experimental results show that the technique is unaffected by factors such as camera focal length, and the average modeling time for a single image is about 300 ms.
