Similar documents
20 similar documents retrieved (search time: 46 ms)
1.
The objective of this work is to recognize faces using video sequences both for training and for novel input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern vary widely and face images are of low resolution. There are three major areas of novelty: (i) illumination generalization is achieved by combining coarse histogram correction with fine illumination manifold-based normalization; (ii) pose robustness is achieved by decomposing each appearance manifold into semantic Gaussian pose clusters, comparing the corresponding clusters and fusing the results using an RBF network; (iii) a fully automatic recognition system based on the proposed method is described and extensively evaluated on 600 head-motion video sequences with extreme illumination, pose and motion pattern variation. On this challenging data set, our system consistently demonstrated a very high recognition rate (95% on average), significantly outperforming state-of-the-art methods from the literature.
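The "coarse histogram correction" step can be illustrated with plain histogram equalization, which flattens the intensity distribution of a face image before finer manifold-based normalization. A minimal NumPy sketch; the toy image and the specific equalization variant are assumptions, not taken from the paper:

```python
import numpy as np

def equalize_histogram(img):
    """Coarse illumination correction by histogram equalization.

    img: 2-D uint8 array (grayscale face image).
    Returns an image whose intensity histogram is approximately flat.
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]                        # normalize CDF to [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[img]                       # remap every pixel through the LUT

# A dark toy "face" image using only intensities 0..127
img = np.tile(np.arange(0, 128, dtype=np.uint8), (64, 2))
out = equalize_histogram(img)             # now spans the full 0..255 range
```

The lookup table is monotone, so the relative ordering of pixel intensities (and hence edges) is preserved while global lighting differences are reduced.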

2.
In this paper, we present a new algorithm that utilizes low-quality red, green, blue and depth (RGB-D) data from the Kinect sensor for face recognition under challenging conditions. The algorithm extracts multiple features and fuses them at the feature level. A Finer Feature Fusion technique is developed that removes redundant information and retains only the meaningful features for maximum possible class separability. We also introduce a new 3D face database acquired with the Kinect sensor, which has been released to the research community. This database contains over 5,000 facial images (RGB-D) of 52 individuals under varying pose, expression, illumination and occlusion. Under the first three variations and using only the noisy depth data, the proposed algorithm achieves a 72.5% recognition rate, significantly higher than the 41.9% achieved by the baseline LDA method. Combined with the texture information, a 91.3% recognition rate is achieved under illumination, pose and expression variations. These results suggest the feasibility of low-cost 3D sensors for real-time face recognition.
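The basic idea of feature-level fusion is to normalize each descriptor's features and concatenate them into one vector. A minimal sketch, assuming z-score normalization and random toy features (the paper's Finer Feature Fusion additionally prunes redundant dimensions, which is omitted here):

```python
import numpy as np

def fuse_features(feature_sets):
    """Feature-level fusion: z-score normalize each descriptor's feature
    matrix, then concatenate along the feature axis.

    feature_sets: list of (n_samples, d_i) arrays from different descriptors
    (e.g. one from RGB texture, one from depth).
    """
    normalized = []
    for F in feature_sets:
        mu, sigma = F.mean(axis=0), F.std(axis=0) + 1e-8  # avoid div by zero
        normalized.append((F - mu) / sigma)
    return np.hstack(normalized)

rng = np.random.default_rng(0)
rgb_feat = rng.random((10, 32))     # hypothetical texture features
depth_feat = rng.random((10, 16))   # hypothetical depth features
fused = fuse_features([rgb_feat, depth_feat])   # shape (10, 48)
```

Normalizing before concatenation keeps one modality from dominating the fused vector merely because of its scale.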

3.
4.

Face recognition techniques are widely used in many applications, such as automatic detection of crime scenes from surveillance cameras for public safety. In such real cases, pose and illumination differences between two matching faces strongly influence identification performance, and handling pose changes is an especially challenging task. In this paper, we propose a learned-warps-based similarity method to deal with face recognition across pose. Warps are learned between patches from probe faces and gallery faces using the Lucas-Kanade algorithm. Based on these warps, a frontal face registered in the gallery is transformed into a series of non-frontal viewpoints, which enables non-frontal probe faces to be matched with the frontal gallery face. Scale-invariant feature transform (SIFT) keypoints (interest points) are detected in the generated viewpoints and matched with the probe faces. Moreover, based on the learned warps, a probability likelihood is computed for two faces belonging to the same subject. Finally, a hybrid similarity combining the number of matching keypoints and the probability likelihood is proposed to describe the similarity between a gallery face and a probe face. Experimental results show that the proposed method achieves better recognition accuracy than the algorithms it was compared to, especially when the pose difference is within 40 degrees.
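A hybrid similarity of this kind can be sketched as a weighted blend of the normalized keypoint-match count and the warp-based likelihood. The weight `alpha` and the cap `max_matches` below are illustrative parameters, not values from the paper:

```python
import math

def hybrid_similarity(n_matches, log_likelihood, alpha=0.5, max_matches=100):
    """Blend (i) the normalized number of matched SIFT keypoints and
    (ii) a warp-based probability likelihood into one similarity score.

    n_matches:      number of matched keypoints between probe and gallery.
    log_likelihood: log-probability that the two faces are the same subject.
    """
    match_score = min(n_matches, max_matches) / max_matches  # in [0, 1]
    prob = math.exp(log_likelihood)                          # in (0, 1]
    return alpha * match_score + (1 - alpha) * prob

strong = hybrid_similarity(50, -1.0)   # many matches -> higher similarity
weak = hybrid_similarity(10, -1.0)     # few matches  -> lower similarity
```

Both ingredients are mapped into [0, 1] first so the blend stays a bounded score that can be thresholded or ranked directly.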


5.
Visual learning and recognition of 3-D objects from appearance
The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image into the eigenspace. The object is recognized based on the manifold it lies on, and the exact position of the projection on the manifold determines the object's pose in the image. A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. Sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with fewer than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper concludes with a discussion of various issues related to the proposed learning and recognition methodology.
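The eigenspace construction described above is essentially PCA: center the image set, take the leading singular vectors as the basis, and project new images into that basis. A minimal NumPy sketch with toy data (image sizes and dimensions are illustrative):

```python
import numpy as np

def build_eigenspace(images, k):
    """Compress a set of appearance images into a k-dimensional eigenspace.

    images: (n, d) matrix, one flattened image per row.
    Returns the mean image and a (k, d) orthonormal basis of eigenimages.
    """
    mean = images.mean(axis=0)
    X = images - mean
    # SVD of the centered data; rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return mean, Vt[:k]

def project(image, mean, basis):
    """Project a new image into the eigenspace (its manifold coordinates)."""
    return basis @ (image - mean)

rng = np.random.default_rng(0)
imgs = rng.random((50, 64))             # 50 toy "images" of 64 pixels each
mean, basis = build_eigenspace(imgs, k=10)
coords = project(imgs[0], mean, basis)  # 10-D point on the appearance manifold
```

In the full method, images of one object under varying pose and illumination trace out a manifold of such `coords` points; a novel image is classified by the nearest manifold, and its position along the manifold gives the pose.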

6.
7.
Person recognition remains a very challenging problem, and combining multiple poses for person recognition is a new topic, so accurately extracting multi-pose samples is a key step in person recognition. The poselets algorithm can detect all persons in an image together with their poses, but it cannot localize a person at a specific position. This paper therefore proposes a poselets-based method for extracting the pose of a person at a specified position. First, a filtering model is built from the head bounding box of the person at the specified position; the person boxes detected by the poselets algorithm are screened with this filtering model and the surviving candidates are ranked. Then, using the ranking scores, a maximum-weight bipartite matching algorithm matches the candidates to locate the target person at the specified position and extract the corresponding pose. Experiments show that the proposed algorithm can effectively and accurately detect the person at a specified position and extract the corresponding pose.
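The matching step above pairs detected person boxes with target positions so that the total ranking score is maximized. A minimal sketch using exhaustive search over assignments (fine for the handful of candidate boxes per image; a Hungarian-algorithm implementation would be used at scale). The score matrix is invented for illustration:

```python
from itertools import permutations

def max_weight_matching(score):
    """Maximum-weight bipartite matching by exhaustive search.

    score[i][j]: ranking score of assigning detected box i to target
    position j. Returns (best total score, tuple assign) where assign[j]
    is the box chosen for position j.
    """
    n_positions = len(score[0])
    best, best_assign = float("-inf"), None
    for perm in permutations(range(len(score)), n_positions):
        total = sum(score[i][j] for j, i in enumerate(perm))
        if total > best:
            best, best_assign = total, perm
    return best, best_assign

scores = [[0.9, 0.1],   # 3 detected boxes, 2 target positions
          [0.2, 0.8],
          [0.4, 0.3]]
total, assign = max_weight_matching(scores)  # boxes 0 and 1 win
```

Formulating the selection as a global matching avoids the greedy failure mode where one high-scoring box is claimed by the wrong position.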

8.
Face recognition under variable pose and illumination is a challenging problem in computer vision. In this paper, we address it by proposing a new residual-based deep face reconstruction neural network to extract discriminative pose-and-illumination-invariant (PII) features. Our deep model can transform face images of arbitrary pose and illumination to the frontal view with standard illumination. We propose a new triplet-loss training method instead of a Euclidean loss to optimize our model, which has two advantages: (a) training triplets can be easily augmented by freely choosing combinations of labeled face images, so overfitting can be avoided; (b) triplet-loss training makes the PII features more discriminative even when training samples have similar appearance. Using our PII features, we achieve 83.8% average recognition accuracy on the MultiPIE face dataset, which is competitive with state-of-the-art face recognition methods.
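The triplet loss can be written down in a few lines: pull the anchor toward a sample of the same identity (positive) and push it away from a different identity (negative) by at least a margin. A NumPy sketch with toy 2-D feature vectors; the margin value is illustrative, not the paper's:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on feature vectors:
    max(0, ||a - p||^2 - ||a - n||^2 + margin)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([1.0, 0.0])   # anchor feature
p = np.array([0.9, 0.1])   # same identity, similar feature
n = np.array([0.0, 1.0])   # different identity
loss = triplet_loss(a, p, n)   # zero: positive is already much closer
```

Because any (anchor, positive, negative) combination of labeled images forms a valid triplet, the training set grows combinatorially, which is the augmentation advantage the abstract mentions.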

9.
Facial landmark detection in still images has improved greatly in recent years; however, landmark detection and tracking in real videos remain challenging because of variations in head pose, occlusion and illumination. To address this problem, a video facial-landmark tracking algorithm based on multi-view constrained cascaded regression is proposed. First, a transformation between 3D and 2D sparse point sets is established and used to estimate the initial shape. Second, because the face images exhibit large pose differences, an affine transformation is applied to rectify the pose. When constructing the shape regression model, a multi-view constrained cascaded regression model is adopted to reduce shape variance, making the learned regressors more robust to shape variation. Finally, a re-initialization mechanism is employed, and when the landmarks are correctly located, a normalized cross-correlation (NCC) template-matching tracker establishes the shape relationship between consecutive frames. Experimental results on public datasets show that the algorithm's mean error is below 10% of the inter-ocular distance.
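The NCC template-matching step compares a patch around a landmark in the new frame against the template from the previous frame; a score of 1.0 means a perfect match up to a linear intensity change. A minimal NumPy sketch (patch sizes and data are illustrative):

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between an image patch and a template.

    Returns a value in [-1, 1]; 1.0 indicates a perfect match up to a
    per-patch gain and offset in intensity.
    """
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
template = rng.random((8, 8))
same = ncc(template * 2.0 + 3.0, template)  # gain/offset change -> still 1.0
```

This gain-and-offset invariance is what makes NCC a reasonable frame-to-frame association cue under mild lighting changes.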

10.
We present a novel approach to face recognition by constructing facial identity structures across views and over time, referred to as identity surfaces, in a Kernel Discriminant Analysis (KDA) feature space. This approach is aimed at addressing three challenging problems in face recognition: modelling faces across multiple views, extracting non-linear discriminatory features, and recognising faces over time. First, a multi-view face model is designed which can be automatically fitted to face images and sequences to extract the normalised facial texture patterns. This model is capable of dealing with faces with large pose variation. Second, KDA is developed to compute the most significant non-linear basis vectors with the intention of maximising the between-class variance and minimising the within-class variance. We applied KDA to the problem of multi-view face recognition, achieving a significant improvement in reliability and accuracy. Third, identity surfaces are constructed in a pose-parameterised discriminatory feature space. Dynamic face recognition is then performed by matching the object trajectory computed from a video input against model trajectories constructed on the identity surfaces. These two types of trajectories encode the spatio-temporal dynamics of moving faces.

11.
Designing a robust and efficient tracking algorithm for complex scenes remains challenging. We propose a new object tracking algorithm with adaptive appearance learning and occlusion detection in an efficient self-tuning particle filter framework. The appearance of an object is modeled with a set of weighted and ordered submanifolds, which guarantees adaptability under fast illumination or pose changes. To overcome occlusion, we use the reconstruction error of the appearance model to extract the occluded region by graph cuts, and the tracking result is improved with feedback from the occlusion detection. The motion model is also made adaptive to overcome abrupt motion. To improve the efficiency of the particle filter, the number of samples is tuned with respect to the scene conditions. Experimental results demonstrate that our algorithm achieves great robustness, high accuracy and good efficiency in challenging scenes.

12.
In this paper, the problem of non-collaborative person identification for secure access to facilities is addressed. The proposed solution combines face and speaker recognition techniques; integrating the two improves performance relative to either classifier alone. In non-collaborative scenarios, face recognition first requires detecting the face pattern and then recognizing it even in non-frontal poses. In the current work, histogram normalization, a boosting technique and linear discriminant analysis are exploited to handle typical problems such as illumination variability, occlusions and pose variation. In addition, a new temporal classification is proposed to improve the robustness of frame-by-frame classification, allowing known still-image classification techniques to be carried into a multi-frame context in which the image capture admits dynamics in the environment. For the audio, a method for automatic speaker identification in noisy environments is presented. In particular, we optimize a speech de-noising algorithm to improve the performance of the extended Kalman filter (EKF). As a baseline system for integration with the proposed de-noising algorithm, we use a conventional speaker recognition system based on Gaussian mixture models with mel-frequency cepstral coefficients (MFCCs) as features. To confirm the effectiveness of our methods, we performed video and speaker recognition tasks first separately and then with the results integrated. Two different corpora were used: (a) public corpora (ELDSR for audio and FERET for images) and (b) a dedicated audio/video corpus in which the speakers read a list of sentences while wearing a scarf or a full-face motorcycle helmet. Experimental results show that our methods significantly reduce the classification error rate.

13.
In this paper, we introduce a novel approach to modeling non-stationary random processes. Given a set of training samples sequentially, we can iteratively update the eigenspace to manifest the current statistics provided by each new sample. The updated eigenspace is derived based more on recent samples and less on older samples, controlled by a number of decay parameters. Extensive study has been performed on how to choose these decay parameters. Other existing eigenspace updating algorithms can be regarded as special cases of our algorithm. We show the effectiveness of the proposed algorithm with both synthetic data and practical applications on face recognition. Significant improvements have been observed on face images with different variations, such as pose, expression and illumination variations. We expect the proposed algorithm to have other applications in active recognition and modeling as well.
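The decayed-weighting idea can be shown on the simplest statistic, the mean: each update down-weights all older samples by a decay factor, so the estimate tracks a non-stationary stream. This is a sketch of the mean update only (the paper updates the full eigenspace, and the decay value here is illustrative):

```python
import numpy as np

def decayed_mean_update(mean, new_sample, decay=0.95):
    """One step of an exponentially decayed mean update.

    Older samples are geometrically down-weighted by `decay`, so recent
    samples dominate the estimate.
    """
    return decay * mean + (1.0 - decay) * new_sample

mean = np.zeros(4)                 # stale statistics
for _ in range(200):               # the process has drifted to all-ones
    mean = decayed_mean_update(mean, np.ones(4))
```

After enough steps the estimate converges to the new regime regardless of the stale starting statistics; a smaller `decay` adapts faster but is noisier.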

14.
廖斌, 吴文. 计算机应用 (Journal of Computer Applications), 2019, 39(2): 556-563
Traditional methods suffer from spatio-temporal incoherence when removing shadows from videos captured by freely moving cameras. To address this problem, a region-pair-guided illumination-transfer shadow removal method is proposed. First, the video is segmented by mean shift on scale-invariant feature transform (SIFT) feature vectors, and shadows are detected with a support vector machine (SVM) classifier. Then, each input frame is decomposed into overlapping 2D patches over which a Markov random field (MRF) is built, and an optical-flow-guided patch-matching scheme finds the best-matching non-shadow patch for each shadow patch. Finally, a local illumination-transfer operator recovers the illumination of the shadow patches, followed by global illumination optimization. Experimental results show that, compared with the traditional illumination-propagation method, the proposed method improves the comprehensive shadow-detection metric by about 6.23%, reduces the pixel root-mean-square error (RMSE) by about 30.12%, greatly shortens shadow-processing time, and produces shadow-free videos with better spatio-temporal coherence.

15.
Video shot boundary detection (SBD) is a fundamental step in automatic video content analysis for video indexing, summarization and retrieval. Despite valuable previous work in the literature, reliable detection of video shots remains a challenging issue with many unsolved problems. In this paper, we focus on hard cut detection and propose an automatic algorithm to accurately determine abrupt transitions in video. We suggest a fuzzy rule-based scene cut identification approach in which a set of fuzzy rules is evaluated to detect cuts. The main advantage of the proposed method is that we incorporate spatial and temporal features to describe video frames and model cut situations, according to the temporal dependency of video frames, as a set of fuzzy rules. Also, while existing cut detection algorithms are mainly threshold dependent, our method identifies cut transitions using fuzzy logic, which is more flexible. The proposed algorithm is evaluated on a variety of video sequences from different genres. Experimental results, in comparison with standard cut detection algorithms, confirm that our method is more robust to object and camera movements as well as illumination changes.

16.
Most face recognition techniques have been successful in dealing with high-resolution (HR) frontal face images. However, real-world face recognition systems are often confronted with the low-resolution (LR) face images with pose and illumination variations. This is a very challenging issue, especially under the constraint of using only a single gallery image per person. To address the problem, we propose a novel approach called coupled kernel-based enhanced discriminant analysis (CKEDA). CKEDA aims to simultaneously project the features from LR non-frontal probe images and HR frontal gallery ones into a common space where discrimination property is maximized. There are four advantages of the proposed approach: 1) by using the appropriate kernel function, the data becomes linearly separable, which is beneficial for recognition; 2) inspired by linear discriminant analysis (LDA), we integrate multiple discriminant factors into our objective function to enhance the discrimination property; 3) we use the gallery extended trick to improve the recognition performance for a single gallery image per person problem; 4) our approach can address the problem of matching LR non-frontal probe images with HR frontal gallery images, which is difficult for most existing face recognition techniques. Experimental evaluation on the multi-PIE dataset signifies highly competitive performance of our algorithm.

17.
Shadow removal for videos is an important and challenging vision task. In this paper, we present a novel shadow removal approach for videos captured by free moving cameras using illumination transfer optimization. We first detect the shadows of the input video using interactive fast video matting. Then, based on the shadow detection results, we decompose the input video into overlapped 2D patches, and find the coherent correspondences between the shadow and non-shadow patches via discrete optimization technique built on the patch similarity metric. We finally remove the shadows of the input video sequences using an optimized illumination transfer method, which reasonably recovers the illumination information of the shadow regions and produces spatio-temporal shadow-free videos. We also process the shadow boundaries to make the transition between shadow and non-shadow regions smooth. Compared with previous works, our method can handle videos captured by free moving cameras and achieve better shadow removal results. We validate the effectiveness of the proposed algorithm via a variety of experiments.
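The core of illumination transfer between a matched patch pair can be sketched as matching the shadow patch's first- and second-order statistics to those of its non-shadow correspondence. This is a deliberately minimal mean/std-matching sketch; the papers above add patch matching, boundary processing, and global optimization on top:

```python
import numpy as np

def transfer_illumination(shadow_patch, lit_patch):
    """Transfer illumination statistics from a non-shadow patch to its
    corresponding shadow patch by matching mean and standard deviation."""
    mu_s, sd_s = shadow_patch.mean(), shadow_patch.std() + 1e-8
    mu_l, sd_l = lit_patch.mean(), lit_patch.std()
    return (shadow_patch - mu_s) / sd_s * sd_l + mu_l

rng = np.random.default_rng(0)
lit = rng.random((16, 16))          # non-shadow patch
shadow = lit * 0.4                  # darkened copy of the same texture
recovered = transfer_illumination(shadow, lit)   # ~= lit
```

When the shadow is a pure multiplicative darkening of the same texture, this transform recovers the lit patch exactly; real shadows also shift color and add noise, which is why a global optimization pass follows.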

18.
In this paper, we present a novel video stabilization method with a pixel-wise motion model. In order to avoid distortion introduced by traditional feature points based motion models, we focus on constructing a more accurate model to capture the motion in videos. By taking advantage of dense optical flow, we can obtain the dense motion field between adjacent frames and set up a pixel-wise motion model which is accurate enough. Our method first estimates dense motion field between adjacent frames. A PatchMatch based dense motion field estimation algorithm is proposed. This algorithm is specially designed for similar video frames rather than arbitrary images to reach higher speed and better performance. Then, a simple and fast smoothing algorithm is performed to make the jittered motion stabilized. After that, we warp input frames using a weighted average algorithm to construct the output frames. Some pixels in output frames may be still empty after the warping step, so in the last step, these empty pixels are filled using a patch based image completion algorithm. We test our method on many challenging videos and demonstrate the accuracy of our model and the effectiveness of our method.
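The "simple and fast smoothing" step can be illustrated with a moving average over a 1-D motion trajectory: high-frequency jitter is removed while the low-frequency intended camera path is kept. The window size and toy trajectory are assumptions, not the paper's parameters:

```python
import numpy as np

def smooth_trajectory(path, radius=2):
    """Smooth a jittery 1-D camera-motion trajectory with a moving
    average of window size 2*radius + 1, padding the ends by replication
    so the output has the same length as the input."""
    padded = np.pad(path, radius, mode="edge")
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    return np.convolve(padded, kernel, mode="valid")

jitter = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])  # oscillating motion
smooth = smooth_trajectory(jitter)                  # much flatter path
```

The per-frame warp is then driven by the difference between the original and smoothed trajectories, so only the jitter component is compensated.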

19.
In this paper, we present a system for person re-identification in TV series. In the context of video retrieval, person re-identification refers to the task where a user clicks on a person in a video frame and the system then finds other occurrences of the same person in the same or different videos. The main characteristic of this scenario is that no previously collected training data is available, so no person-specific models can be trained in advance. Additionally, the query data is limited to the image that the user clicks on. These conditions pose a great challenge to the re-identification system, which has to find the same person in other shots despite large variations in the person’s appearance. In the study, facial appearance is used as the re-identification cue, since, in contrast to surveillance-oriented re-identification studies, the person can have different clothing in different shots. In order to increase the amount of available face data, the proposed system employs a face tracker that can track faces up to full profile views. This makes it possible to use a profile face image as query image and also to retrieve images with non-frontal poses. It also provides temporal association of the face images in the video, so that instead of using single images for query or target, whole tracks can be used. A fast and robust face recognition algorithm is used to find matching faces. If the match result is highly confident, our system adds the matching face track to the query set. Finally, if the user is not satisfied with the number of returned results, the system can present a small number of candidate face images and lets the user confirm the ones that belong to the queried person. These features help to increase the variation in the query set, making it possible to retrieve results with different poses, illumination conditions, etc. The system is extensively evaluated on two episodes of the TV series Coupling, showing very promising results.

20.
A multi-instance learning video face recognition algorithm based on an improved Fisher criterion
王玉, 申铉京, 陈海鹏. 自动化学报 (Acta Automatica Sinica), 2018, 44(12): 2179-2187
In video, pose variation makes key face frames hard to locate accurately, so video face recognition methods based on keyframe labeling suffer from low recognition rates. To solve this problem, this paper proposes a multi-instance learning video face recognition algorithm based on a Fisher-weighted criterion. Video face recognition is cast as a multi-instance problem: the normalized face frames of a video serve as the instances in a video bag, blocked TPLBP cascaded histograms serve as the instance texture features, and the instance feature weights are obtained with an improved Fisher criterion. A classifier is trained with a multi-instance learning algorithm in the instance feature space of the training set and then used to classify and predict test videos. In experiments on the Honda/UCSD video database and the YouTube Faces database, the algorithm attains high recognition accuracy, verifying its effectiveness; the method is also robust to uniform illumination changes and pose variation.
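As background for the texture features above, a plain LBP histogram can serve as a stand-in descriptor (the paper uses blocked TPLBP cascaded histograms, a patch-based LBP variant; this sketch shows only the basic 8-neighbor LBP on a toy image):

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbor LBP histogram of a 2-D grayscale array.

    Each interior pixel gets an 8-bit code: bit k is set if the k-th
    neighbor is >= the center pixel. Returns a normalized 256-bin histogram.
    """
    c = img[1:-1, 1:-1]                       # center pixels
    codes = np.zeros_like(c, dtype=np.int32)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        neigh = img[1 + dy:img.shape[0] - 1 + dy,
                    1 + dx:img.shape[1] - 1 + dx]
        codes |= (neigh >= c).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
img = (rng.random((16, 16)) * 255).astype(np.uint8)
h = lbp_histogram(img)                        # 256-bin texture descriptor
```

In the multi-instance setting, one such histogram (per block, then concatenated) is computed for every normalized face frame, and the resulting instance features fill the video bag.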
