Similar Documents
20 similar documents found (search time: 31 ms)
1.
Facial expressions convey much of the information on the human face that is essential for human–computer interaction. Developing robust algorithms that automatically recognize facial expressions with high recognition rates has been a challenge for the last 10 years. In this paper, we propose a novel feature selection procedure that recognizes basic facial expressions with high recognition rates by utilizing three-dimensional (3D) geometric facial feature positions. The paper presents a system that classifies expressions into one of the six basic emotional categories: anger, disgust, fear, happiness, sadness, and surprise. Its contribution is feature selection performed for each expression independently, achieving high recognition rates with the geometric facial features selected per expression. The novel feature selection procedure is entropy based and is applied independently to each of the six basic expressions. The system's performance is evaluated on the 3D facial expression database BU-3DFE. Experimental results show that the proposed method outperforms the latest methods reported in the literature.
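The per-expression, entropy-based selection described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's exact formulation: it assumes the geometric features have already been discretized into bins, and scores each feature by the entropy of its values over samples of one expression class (low entropy = consistent, hence selected).

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a discrete value sequence."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def select_features(samples, k):
    """Rank discretized feature columns by entropy and keep the k most
    consistent (lowest-entropy) ones for a single expression class."""
    n_features = len(samples[0])
    scores = [(entropy([s[i] for s in samples]), i) for i in range(n_features)]
    scores.sort()
    return [i for _, i in scores[:k]]

# Toy example: feature 0 is stable across samples of this expression,
# feature 1 is noisy, so feature 0 is selected.
samples = [(1, 3), (1, 0), (1, 2), (1, 1)]
print(select_features(samples, 1))  # -> [0]
```

Running the selection once per expression class, as the abstract describes, yields a different feature subset for each of the six basic expressions.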

2.
Facial Expression Recognition Based on Hidden Markov Models (HMM)   Cited by: 1 (self-citations: 1, citations by others: 1)
Wang Chong. 《通信技术》 (Communications Technology), 2007, 40(11): 359-361
Facial expression recognition is currently an active research topic. This paper outlines the facial expression recognition process and presents a recognition method based on Hidden Markov Models (HMM). By analyzing how facial expressions vary, facial features are extracted with the two-dimensional discrete cosine transform (2D-DCT), and an HMM is trained on a large sample set to recognize the facial expression in an image. Experimental results show that the method is an efficient approach to facial expression recognition.
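The 2D-DCT feature extraction step can be sketched as follows. This is a naive pure-Python illustration; the block size and the number of retained low-frequency coefficients are assumptions for the example, not values from the paper:

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of a square block (illustrative, not optimized)."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = []
    for u in range(n):
        row = []
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            row.append(alpha(u) * alpha(v) * s)
        out.append(row)
    return out

def low_freq_features(coeffs, k):
    """Keep the k lowest-frequency coefficients (top-left corner) as the
    compact feature vector that would be fed to the HMM."""
    flat = [(u + v, u, v, coeffs[u][v]) for u in range(len(coeffs))
            for v in range(len(coeffs))]
    flat.sort(key=lambda t: (t[0], t[1]))
    return [c for _, _, _, c in flat[:k]]
```

Most of the signal energy concentrates in the top-left (low-frequency) coefficients, which is why a small prefix of them suffices as an expression feature vector.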

3.
In this paper, we propose Learned Local Gabor Patterns (LLGP) for face representation and recognition. The proposed method is based on Gabor features and the concept of a texton, and defines the feature cliques that appear frequently in Gabor features as the basic patterns. Unlike Local Binary Patterns (LBP), whose patterns are predefined, the local patterns in our approach are learned from a patch set constructed by sampling patches from Gabor-filtered face images. The patterns in our approach are therefore face-specific and well suited to face perception tasks. Based on these learned patterns, each facial image is converted into multiple pattern maps, and the block-based histograms of these patterns are concatenated to form the representation of the face image. In addition, we propose an effective weighting strategy that exploits the discriminative power of different facial parts as well as of different patterns. The proposed approach is evaluated on two face databases, FERET and CAS-PEAL-R1. Extensive experimental results and comparisons with existing methods show the effectiveness of the LLGP representation and the weighting strategy. In particular, heterogeneous testing results show that the LLGP codebook generalizes impressively to unseen data.
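The block-based histogram concatenation used for the final descriptor can be illustrated as follows, assuming each pixel has already been mapped to a learned pattern index; the map, block size, and pattern count here are toy values, not the paper's settings:

```python
def block_histograms(pattern_map, n_patterns, block_size):
    """Divide a pattern-index map into non-overlapping blocks and
    concatenate the per-block pattern histograms (LLGP-style descriptor)."""
    h, w = len(pattern_map), len(pattern_map[0])
    descriptor = []
    for by in range(0, h, block_size):
        for bx in range(0, w, block_size):
            hist = [0] * n_patterns
            for y in range(by, min(by + block_size, h)):
                for x in range(bx, min(bx + block_size, w)):
                    hist[pattern_map[y][x]] += 1
            descriptor.extend(hist)
    return descriptor

# A 4x4 map with 2 learned patterns and 2x2 blocks yields 4 blocks,
# each contributing a 2-bin histogram to the concatenated descriptor.
toy_map = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 1, 0, 1],
           [1, 0, 1, 0]]
print(block_histograms(toy_map, 2, 2))  # 8 values, one 2-bin histogram per block
```

The weighting strategy from the abstract would then scale each block's histogram by a learned per-block (and per-pattern) weight before concatenation.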

4.

Face recognition has become accessible to experts and ordinary people alike, as it is a central non-intrusive biometric modality. In this paper, we introduce a new approach to face recognition under varying facial expressions. The proposed approach consists of two main steps, facial expression recognition and face recognition, which complement each other to improve face recognition across facial expression variation. In the first step, we select the most expressive regions responsible for facial expression appearance using the Mutual Information technique. This process not only improves facial expression classification accuracy but also reduces the feature vector size. In the second step, we use Principal Component Analysis (PCA) to build Eigenfaces for each facial expression class. Face recognition is then performed by projecting the face onto the Eigenfaces of the corresponding facial expression. The PCA technique significantly reduces the dimensionality of the original space, since face recognition is carried out in the reduced Eigenfaces space. An experimental study was conducted to evaluate the performance of the proposed approach in terms of face recognition accuracy and spatio-temporal complexity.
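The mutual-information scoring behind the region-selection step can be sketched as below. This is a generic estimator over discretized features and class labels, shown for illustration; how the paper bins region features is not specified in the abstract:

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Empirical I(F; Y) in bits between a discretized feature column
    and the expression class labels; high MI = expressive region."""
    n = len(feature)
    pf = Counter(feature)
    py = Counter(labels)
    pfy = Counter(zip(feature, labels))
    mi = 0.0
    for (f, y), c in pfy.items():
        p_joint = c / n
        mi += p_joint * math.log2(p_joint / ((pf[f] / n) * (py[y] / n)))
    return mi

# A region feature that mirrors the expression label is maximally
# informative; a constant feature carries no information.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # -> 1.0
print(mutual_information([1, 1, 1, 1], [0, 0, 1, 1]))  # -> 0.0
```

Ranking candidate facial regions by this score and keeping the top ones gives both the accuracy gain and the feature-vector shrinkage the abstract mentions.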


5.
One crucial application of intelligent robotic systems is remote surveillance using a security robot. A fundamental need in security is the ability to automatically detect an intruder entering a secure or restricted area, alert remote security personnel, and then enable them to track the intruder. In this article, we propose an Internet-based security robot system. The face recognition approach is "invariant" in the sense that it remains reliable when facial expressions, viewing perspectives, three-dimensional poses, individual appearance, and lighting vary and when occluding structures are present. The experiment uses a 33.6-kb/s modem Internet connection to successfully control a mobile robot remotely, showing that the streaming-technology-based approach greatly improves the "sensibility" of robot teleoperation. This improvement ensures that security personnel can use the Internet effectively and at low cost to remotely control a mobile robot to track and identify a potential intruder.

6.
Target recognition is a key module in modern human–computer interaction (HCI) and computer vision systems. It is pervasively used in many domains such as autonomous vehicles and robots, remote operation, and video surveillance. However, due to complicated environments and object occlusion, target recognition is still a challenging task. In this paper, we propose a novel target recognition algorithm for autonomous robots that leverages Kinect sensors. More specifically, we use the Kinect sensors to capture scene images in real time. We then present an improved HSV-based image segmentation algorithm to decompose the captured image, where morphological operations are employed for foreground target extraction. Afterward, we leverage a Spatial Pyramid (SP)-based scheme for visual feature extraction and adopt a new distance metric for target matching. Comprehensive experimental results show the effectiveness of the proposed method.
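The HSV thresholding plus morphological cleanup can be sketched with the standard library alone. This is a simplified stand-in for the paper's improved segmentation: the hue range and the plain 3x3 erosion are illustrative assumptions, not the authors' parameters:

```python
import colorsys

def hsv_mask(rgb_image, h_range, s_min, v_min):
    """Binary foreground mask from per-pixel HSV thresholds.
    h_range is (lo, hi) with hue normalized to [0, 1)."""
    mask = []
    for row in rgb_image:
        mrow = []
        for (r, g, b) in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            lo, hi = h_range
            mrow.append(1 if lo <= h <= hi and s >= s_min and v >= v_min else 0)
        mask.append(mrow)
    return mask

def erode(mask):
    """3x3 morphological erosion to suppress isolated noise pixels
    before extracting the foreground target."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = 1 if all(mask[y + dy][x + dx]
                                 for dy in (-1, 0, 1)
                                 for dx in (-1, 0, 1)) else 0
    return out
```

In practice the erosion (often paired with dilation, i.e. an opening) removes speckle from the thresholded mask so that connected-component extraction yields a clean target region.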

7.
8.
9.
In this paper, we propose an automatic facial expression exaggeration system, consisting of face detection, facial expression recognition, and facial expression exaggeration components, that generates exaggerated views of different expressions for an input face video. In addition, parallelized algorithms for the system are developed to reduce execution time on a multi-core embedded system. The experimental results show satisfactory expression exaggeration and computational efficiency in cluttered environments. Quantitative comparisons show that the proposed parallelization strategies provide significant computational speedup over a single-processor implementation on a multi-core embedded platform.

10.
11.
To address non-rigid variations in face images, such as rotation, pose, and expression changes, this paper proposes a sparse-representation face recognition algorithm based on Dense SIFT Feature Alignment (DSFA). The algorithm consists of two steps: first, the training and test samples are aligned using DSFA; then, an improved sparse representation model is designed for face recognition. To speed up the DSFA step, a coarse-to-fine hierarchical alignment mechanism is also designed. Experimental results show that the proposed method achieves the highest recognition accuracy on three benchmark datasets, ORL, AR, and LFW. Compared with traditional sparse representation methods, it improves recognition accuracy by 4.3% on average while running about 6 times faster.

12.
In real-world intelligent transportation systems, accuracy in vehicle license plate detection and recognition is critical. Many algorithms have been proposed for still images, but their accuracy on actual videos is not satisfactory. This stems from several problematic conditions in videos, such as vehicle motion blur, variety in viewpoints, outliers, and the lack of publicly available video datasets. In this study, we focus on these challenges and propose a license plate detection and recognition scheme for videos based on a temporal matching prior network. Specifically, to improve the robustness of detection and recognition in the presence of motion blur and outliers, forward and bidirectional matching priors between consecutive frames are combined with layer structures specifically designed for plate detection. We also built our own video dataset for training the proposed network. During training, we perform data augmentation based on image rotation to increase robustness to the various viewpoints in videos.

13.
Although several algorithms have been proposed for facial model adaptation from image sequences, an insufficient feature set for adapting a full facial model, imperfect matching of feature points, and imprecise head motion estimation may degrade the accuracy of model adaptation. In this paper, we propose to resolve these difficulties by integrating facial model adaptation, texture mapping, and head pose estimation as cooperative and complementary processes. Using an analysis-by-synthesis approach, salient facial feature points and head profiles are reliably tracked and extracted to form a growing, more complete feature set for model adaptation. More robust head motion estimation is achieved with the assistance of the textured facial model. The proposed scheme operates on image sequences acquired with a single uncalibrated camera and requires only little manual adjustment during initialization, making it a feasible approach for facial model adaptation.

14.
A system capable of producing near video-realistic animation of a speaker given only speech input is presented. The audio input is a continuous speech signal, requires no phonetic labelling, and is speaker-independent. The system needs only a short video training corpus of a subject speaking a list of viseme-targeted words to achieve convincing, realistic facial synthesis. It learns the natural mouth and face dynamics of a speaker, allowing new facial poses unseen in the training video to be synthesised. To achieve this, the authors have developed a novel approach that uses a hierarchical, nonlinear principal components analysis (PCA) model coupling speech and appearance. Animation of the different facial areas defined by the hierarchy is performed separately and merged in post-processing using an algorithm that combines texture and shape PCA data. It is shown that the model can synthesise videos of a speaker using new audio segments from both previously heard and unheard speakers.

15.
Facial expression recognition (FER) is an active research area that has attracted much attention from both academics and practitioners in different fields. In this paper, we investigate an interesting and challenging issue in FER where the training and testing samples come from different domains. In this context, the data and feature distributions are inconsistent, so most existing recognition methods may not perform well. Given this, we propose an effective dynamic constraint representation approach based on cross-domain dictionary learning for expression recognition. The proposed approach dynamically represents testing samples from the source and target domains, thereby fully exploiting the feature elasticity in a cross-domain dictionary, and can then predict class information for unlabeled testing samples. Comprehensive experiments on several public datasets confirm that the proposed approach is superior to some state-of-the-art methods.

16.
The analysis of moving objects in videos, especially the recognition of human motions and gestures, is attracting increasing attention in the computer vision field. However, most existing video analysis methods do not take video semantic information into account. The topological information of a video image plays an important role in describing the association relationships of the image content, which helps improve the discriminability of video feature expression. Based on these considerations, we propose a video semantic feature learning method that integrates image topological sparse coding with a dynamic time warping algorithm to improve gesture recognition in videos. The method divides video feature learning into two phases: semi-supervised video image feature learning and supervised optimization of video sequence features. A distance-weighting-based dynamic time warping algorithm and a K-nearest-neighbor algorithm are then leveraged to recognize gestures. We conduct comparative experiments on a table tennis video dataset. The experimental results show that the proposed method produces more discriminative video feature expressions and can effectively improve the recognition rate of gestures in sports video.
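The DTW-plus-nearest-neighbor recognition stage can be sketched as follows. The `weight` hook is a hypothetical stand-in for the paper's distance-weighting scheme (its exact form is not given in the abstract), and 1-NN is shown in place of the full K-NN for brevity:

```python
def dtw(a, b, weight=lambda i, j: 1.0):
    """Dynamic time warping distance between two 1-D feature sequences.
    `weight(i, j)` scales the local cost at cell (i, j); the default
    reduces this to plain unweighted DTW."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = weight(i - 1, j - 1) * abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def nearest_neighbor(query, labeled_seqs):
    """1-NN gesture classification under the DTW distance.
    labeled_seqs is a list of (sequence, label) pairs."""
    return min(labeled_seqs, key=lambda t: dtw(query, t[0]))[1]
```

Because DTW warps the time axis, a gesture performed at a different speed ([1, 2, 3] vs. [1, 2, 2, 3]) still matches its template at zero cost.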

17.
Still-to-video face recognition is an identification technique in which the training set consists of high-quality still images while the test set consists of low-quality video sequences. To address the difficulties of image alignment and motion blur, an improved sparse-representation still-to-video face recognition algorithm is proposed. Geometric features of face images in video are aligned using gradient variance information. The motion blur problem is handled by constructing a dictionary from multi-scale filtered versions of the images. Key frames are extracted from the video sequence using the cross-correlation coefficients between images. Experimental results show that the proposed algorithm clearly outperforms methods such as neural networks and support vector machines.
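The correlation-based keyframe extraction can be sketched as below: a new keyframe is kept whenever a frame's correlation with the last kept frame drops below a threshold. The 0.95 threshold is illustrative, not the paper's value:

```python
import math

def corr(x, y):
    """Pearson correlation coefficient between two flattened frames."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def key_frames(frames, threshold=0.95):
    """Keep the first frame, then every frame whose correlation with the
    most recently kept frame falls below `threshold`."""
    keys = [0]
    for i in range(1, len(frames)):
        if corr(frames[keys[-1]], frames[i]) < threshold:
            keys.append(i)
    return keys
```

Redundant near-duplicate frames are thus skipped, and only frames that differ substantially from the last keyframe enter the recognition stage.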

18.
A Micro-Expression (MiE) is an involuntary facial reaction that reflects the real emotions and thoughts of a human being. It is very difficult for an ordinary human to detect a MiE, since it is a very fast, localized facial reaction with low intensity. As a consequence, building an automatic system for MiE recognition is a challenging task for researchers. Previous works on MiE recognition have attempted to use the whole face, yet a facial MiE appears in a small region of the face, which makes extracting relevant features hard. In this paper, we propose a novel deep learning approach that leverages the locality of MiEs by learning spatio-temporal features from local facial regions using a composite architecture of a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). The proposed solution succeeds in extracting relevant local features for MiE recognition. Experimental results on benchmark datasets demonstrate that our solution achieves the highest recognition accuracy with respect to state-of-the-art methods.

19.
Facial expression recognition plays an important role in human–computer interaction and other artificial intelligence fields, yet current research has neglected the semantic information of the face. This paper proposes a facial expression recognition network that fuses local semantic and global information, consisting of two branches: a local semantic region extraction branch and a local-global feature fusion branch. First, a semantic segmentation network is trained on a face parsing dataset to obtain facial semantic parsing, and parsing for facial expression datasets is obtained via transfer learning. Regions meaningful for expression recognition, together with their semantic features, are extracted from the parsing, and the local semantic features are fused with global features to construct semantic local features. Finally, the semantic local features and global features are fused into a global semantic composite feature of the facial expression, which a classifier assigns to one of seven basic expressions. A partial-layer unfreezing training strategy is also proposed; it makes the semantic features better suited to expression recognition and reduces redundancy in the semantic information. Average recognition accuracies of 93.81% and 88.78% are achieved on the two public datasets JAFFE and KDEF, respectively, outperforming current deep learning and traditional methods. The experimental results show that the proposed network fusing local semantic and global information describes expression information well.

20.
Even though user-generated video sharing sites are tremendously popular, the experience of users watching videos is often unsatisfactory. Delays due to buffering before and during video playback at a client are quite common. In this paper, we present a prefetching approach for user-generated video sharing sites like YouTube. We motivate the need for prefetching with a PlanetLab-based measurement demonstrating that video playback on YouTube is often unsatisfactory, and introduce a series of prefetching schemes: (1) the conventional caching scheme, which caches all the videos that users have watched; (2) the search-result-based prefetching scheme, which prefetches videos appearing in the results of users' search queries; and (3) the recommendation-aware prefetching scheme, which prefetches videos appearing in the recommendation lists of the videos that users watch. We evaluate and compare the proposed schemes using user browsing pattern data collected from network measurement. We find that the recommendation-aware prefetching approach can achieve an overall hit ratio of up to 81%, while the hit ratio achieved by the caching scheme only reaches 40%. The recommendation-aware prefetching approach thus demonstrates strong potential for improving playback quality at the client. In addition, we explore the trade-offs and feasibility of implementing recommendation-aware prefetching.
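The comparison between conventional caching and recommendation-aware prefetching can be reproduced in miniature with a toy simulator. The watch sessions and recommendation lists below are synthetic stand-ins, not the paper's measured traces:

```python
def simulate(sessions, recommend, scheme):
    """Replay watch sessions and report the cache hit ratio.
    scheme 'cache' stores only watched videos; scheme 'recommend' also
    prefetches each watched video's recommendation list."""
    store, hits, total = set(), 0, 0
    for session in sessions:
        for video in session:
            total += 1
            if video in store:
                hits += 1
            store.add(video)
            if scheme == "recommend":
                store.update(recommend.get(video, ()))
    return hits / total

# Viewers often click through recommendations, so prefetching the
# recommendation list of each watched video raises the hit ratio.
recs = {"a": ["b", "c"], "b": ["c", "d"]}
sessions = [["a", "b", "c"], ["a", "b", "d"]]
print(simulate(sessions, recs, "cache"))      # -> 0.333...
print(simulate(sessions, recs, "recommend"))  # -> 0.833...
```

Even this tiny model shows the mechanism behind the 40% vs. 81% gap the paper reports: recommendation-aware prefetching has the next likely video in the store before it is requested.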


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号