Similar Documents
1.
Deep learning has risen in popularity as a face recognition technology in recent years. FaceNet, a deep convolutional neural network (DCNN) developed by Google, represents each face with a 128-byte embedding and reported 99.63% accuracy on the well-known Labeled Faces in the Wild (LFW) dataset. However, the accuracy and validation rate of FaceNet drop as the resolution of the input images gradually decreases. This paper develops a new facial recognition system that produces higher accuracy and validation rates on low-resolution face images. The proposed system, Extended OpenFace, performs facial recognition using three different features: (i) facial landmarks, (ii) head pose, and (iii) eye gaze. It performs facial landmark detection with a Scattered Gated Expert Network Constrained Local Model (SGEN-CLM) and detects head pose and eye gaze with an Enhanced Constrained Local Neural Field (ECLNF). Extended OpenFace employs a simple Support Vector Machine (SVM) for training and testing on the face images. The system's performance is assessed on low-resolution datasets such as LFW and the Indian Movie Face Database (IMFDB). The results demonstrate that Extended OpenFace achieves a 12% higher accuracy rate and a 22% higher validation rate than FaceNet on low-resolution images.
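To make the classification stage above concrete, the following is a minimal Python sketch of training an SVM on concatenated landmark, head-pose, and gaze features. The feature dimensions and synthetic data are illustrative assumptions; the SGEN-CLM and ECLNF extraction steps are not reproduced here.

```python
# Minimal sketch of the final classification stage: an SVM trained on
# concatenated facial-landmark, head-pose and eye-gaze features.
# Feature extraction (SGEN-CLM / ECLNF) is assumed done elsewhere;
# the shapes and synthetic data here are purely illustrative.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_ids = 200, 10
landmarks = rng.normal(size=(n_samples, 68 * 2))  # 68 (x, y) landmark points
head_pose = rng.normal(size=(n_samples, 3))       # yaw, pitch, roll
eye_gaze = rng.normal(size=(n_samples, 2))        # horizontal/vertical gaze angles
X = np.hstack([landmarks, head_pose, eye_gaze])
y = rng.integers(0, n_ids, size=n_samples)        # identity labels

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```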

2.
Changes in eyebrow configuration, in conjunction with other facial expressions and head gestures, are used to signal essential grammatical information in signed languages. This paper proposes an automatic recognition system for non-manual grammatical markers in American Sign Language (ASL) based on a multi-scale, spatio-temporal analysis of head pose and facial expressions. The analysis takes account of gestural components of these markers, such as raised or lowered eyebrows and different types of periodic head movements. To advance the state of the art in non-manual grammatical marker recognition, we propose a novel multi-scale learning approach that exploits spatio-temporal low-level and high-level facial features. Low-level features are based on information about facial geometry and appearance, as well as head pose, and are obtained through accurate 3D deformable model-based face tracking. High-level features are based on the identification of gestural events, of varying duration, that constitute the components of linguistic non-manual markers. Specifically, we recognize events such as raised and lowered eyebrows, head nods, and head shakes. We also partition these events into temporal phases. We separate the anticipatory transitional movement (the onset) from the linguistically significant portion of the event, and we further separate the core of the event from the transitional movement that occurs as the articulators return to the neutral position towards the end of the event (the offset). This partitioning is essential for the temporally accurate localization of the grammatical markers, which could not be achieved at this level of precision with previous computer vision methods. In addition, we analyze and use the motion patterns of these non-manual events. Those patterns, together with the information about the type of event and its temporal phases, are defined as the high-level features. Using this multi-scale, spatio-temporal combination of low- and high-level features, we employ learning methods for accurate recognition of non-manual grammatical markers in ASL sentences.

3.
Study on eye gaze estimation
There are two components to the human visual line of sight: the pose of the head and the orientation of the eyes within their sockets. We have investigated both aspects but concentrate here on eye gaze estimation. We present a novel approach, the "one-circle" algorithm, for measuring eye gaze from a monocular image that zooms in on only one eye of a person. Observing that the iris contour is a circle, we estimate the normal direction of this iris circle, taken as the eye gaze, from its elliptical image. From basic projective geometry, an ellipse can be back-projected into space onto two circles of different orientations. However, by using a geometric constraint, namely that the distances between the eyeball's center and the two eye corners should be equal, the correct solution can be disambiguated. Zooming in on one eye allows us to obtain a higher-resolution image of the iris, thereby achieving higher accuracy in the estimation. A general approach that combines head pose determination with eye gaze estimation is also proposed, in which the search for the eye gaze is guided by the head pose information. The robustness of our gaze determination approach was verified statistically by extensive experiments on synthetic and real image data. The two key contributions are showing that the unique eye gaze direction can be found from a single image of one eye, and that better accuracy is obtained as a consequence.
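The geometric core of the one-circle idea can be illustrated under a simplified weak-perspective assumption (the paper itself uses full projective back-projection): a circle tilted by angle t projects to an ellipse whose minor-to-major axis ratio is cos(t), leaving a two-fold sign ambiguity that corresponds to the two back-projected circles the eye-corner constraint disambiguates.

```python
# Simplified illustration of the geometric idea (not the paper's full
# projective back-projection): under weak perspective, a circle viewed at
# tilt angle t projects to an ellipse with minor/major axis ratio cos(t).
# The sign of the tilt is ambiguous -- the two candidate circles.
import numpy as np

def iris_tilt_from_ellipse(major_axis: float, minor_axis: float) -> tuple:
    """Return the two candidate tilt angles (radians) of the iris circle."""
    ratio = np.clip(minor_axis / major_axis, -1.0, 1.0)
    t = np.arccos(ratio)
    return t, -t  # two back-projected solutions

# Synthetic example: a unit iris circle tilted by 30 degrees.
true_tilt = np.deg2rad(30.0)
theta = np.linspace(0, 2 * np.pi, 100)
pts = np.stack([np.cos(theta), np.sin(theta) * np.cos(true_tilt)], axis=1)
major = pts[:, 0].max() - pts[:, 0].min()   # ~2.0
minor = pts[:, 1].max() - pts[:, 1].min()   # ~2*cos(30 deg)
print([np.rad2deg(t) for t in iris_tilt_from_ellipse(major, minor)])  # ~[30, -30]
```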

4.
Objective: Gaze tracking is an auxiliary system for human-computer interaction. Traditional iris localization methods suffer from high false-detection rates and long running times; this paper proposes a gaze tracking method based on the geometric features of the human eye to improve gaze-tracking accuracy in 2D environments. Method: First, a face detection algorithm locates the face, facial landmark detection locates the eye-corner points, and the eye positions are computed from these corner points. Because directly applying an iris-center localization algorithm is time-consuming, an iris template is first built from iris images and used to detect the iris region; a fine iris-center localization algorithm then locates the iris center. Finally, the eye-corner points and the iris center are extracted, and the angle and distance information contained in these points is combined into an eye-movement feature vector. A neural network model classifies these features, establishing a mapping to gaze points and realizing gaze tracking. Image preprocessing enhances the images, after which the relative iris center and the required feature points are extracted and relatively stable geometric features are built to represent eye movements. Results: Under ordinary laboratory lighting with a fixed head pose, the recognition rate reaches up to 98.9%, with an average of 95.74%. When the head pose varies within a restricted region, the method still maintains a high recognition rate, averaging above 90%. Experimental analysis shows that the method is robust within the restricted region of head movement. Conclusion: This paper proposes combining template matching with fine iris localization to quickly locate the iris center, and uses a neural network to map gaze points to screen regions. Experiments show that the method achieves high accuracy.
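A hedged sketch of the template-matching step that narrows down the iris region, using OpenCV's matchTemplate on synthetic images; in practice the template would be built from real iris crops and followed by the fine iris-center localization the abstract describes.

```python
# Minimal sketch of the template-matching step used to narrow down the
# iris region before fine iris-center localization. The images here are
# synthetic; in practice the template is built from real iris crops.
import cv2
import numpy as np

eye = np.full((60, 100), 200, np.uint8)          # bright synthetic eye patch
cv2.circle(eye, (40, 30), 10, 0, -1)             # dark disc standing in for the iris
template = np.full((24, 24), 200, np.uint8)
cv2.circle(template, (12, 12), 10, 0, -1)        # iris template

scores = cv2.matchTemplate(eye, template, cv2.TM_CCOEFF_NORMED)
_, _, _, top_left = cv2.minMaxLoc(scores)        # best-match location
cx, cy = top_left[0] + 12, top_left[1] + 12      # coarse iris center
print("coarse iris center:", (cx, cy))           # ~ (40, 30)
```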

5.
In this paper we present a novel mechanism to obtain enhanced gaze estimation for subjects looking at a scene or an image. The system makes use of prior knowledge about the scene (e.g. an image on a computer screen) to define a probability map of where in the scene the subject is gazing, in order to find the most probable location. The proposed system helps correct fixations that are erroneously estimated by the gaze estimation device, employing a saliency framework to adjust the resulting gaze point vector. The system is tested in three scenarios: using commercial eye tracking data, enhancing a low-accuracy webcam-based eye tracker, and using a head pose tracker. The correlation between subjects in the commercial eye tracking data is improved by an average of 13.91%. The correlation for the low-accuracy eye gaze tracker is improved by 59.85%, and for the head pose tracker we obtain an improvement of 10.23%. These results show the potential of the system as a way to enhance and self-calibrate different visual gaze estimation systems.
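The correction idea can be sketched as snapping a noisy fixation to the most salient point in a window around it; the Gaussian saliency map and window radius below are illustrative assumptions standing in for the paper's saliency framework.

```python
# Minimal sketch of the correction idea: snap a noisy gaze estimate to the
# most salient point in a window around it. The saliency map here is a
# synthetic Gaussian blob; a real system would compute one from the image.
import numpy as np

def correct_fixation(saliency: np.ndarray, gaze_xy: tuple, radius: int = 40):
    x, y = gaze_xy
    h, w = saliency.shape
    x0, x1 = max(0, x - radius), min(w, x + radius)
    y0, y1 = max(0, y - radius), min(h, y + radius)
    window = saliency[y0:y1, x0:x1]
    dy, dx = np.unravel_index(np.argmax(window), window.shape)
    return x0 + dx, y0 + dy

yy, xx = np.mgrid[0:480, 0:640]
saliency = np.exp(-((xx - 320) ** 2 + (yy - 240) ** 2) / (2 * 30.0 ** 2))
print(correct_fixation(saliency, (300, 260)))  # -> (320, 240), the salient peak
```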

6.
Eye gaze tracking is very useful for quantitatively measuring visual attention in virtual environments. However, most eye trackers have a limited tracking range, e.g., ±35° in the horizontal direction. This paper proposes a method that combines head pose tracking and eye gaze tracking to achieve a large tracking range in virtual driving simulation environments. Multiple parallel multilayer perceptrons are used to reconstruct the relationship between head images and head poses, with head images represented by coefficients extracted through Principal Component Analysis (PCA). Eye gaze tracking provides precise results for the front view, while head pose tracking is more suitable for tracking areas of interest than points of interest in the side view.
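A minimal sketch of the head-pose branch under synthetic-data assumptions: PCA coefficients of flattened head images are fed to an MLP regressor predicting (yaw, pitch). The paper uses multiple parallel MLPs; one is shown here for brevity.

```python
# Minimal sketch of the head-pose branch: PCA coefficients of the head
# image fed to an MLP regressor that predicts (yaw, pitch). All data is
# synthetic; the paper trains multiple parallel MLPs.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
images = rng.random((300, 32 * 32))            # flattened 32x32 head images
poses = rng.uniform(-60, 60, size=(300, 2))    # yaw, pitch in degrees

model = make_pipeline(
    PCA(n_components=20),                      # compact appearance coefficients
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
model.fit(images[:250], poses[:250])
print("predicted (yaw, pitch):", model.predict(images[250:251]))
```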

7.
When estimating human gaze direction from captured eye appearance, most existing methods assume a fixed head pose, because head motion changes eye appearance greatly and makes the estimation inaccurate. To handle this difficult problem, in this paper we propose a novel method that performs accurate gaze estimation without restricting the user's head motion. The key idea is to decompose the original free-head-motion problem into subproblems: an initial fixed-head-pose problem and subsequent compensations that correct the initial estimation biases. For the initial estimation, automatic image rectification and joint alignment with gaze estimation are introduced. Compensation is then done by either learning-based regression or geometric calculation. The merit of this compensation strategy is that the training requirement for allowing head motion is not significantly increased; only a 5-s video clip needs to be captured. Experiments show that our method achieves an average accuracy of around 3° using only a single camera.
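The compensation strategy can be sketched as a regressor that learns the pose-dependent bias of an initial fixed-pose estimate and subtracts it; the linear bias model and synthetic data below are assumptions, not the paper's exact regressor.

```python
# Minimal sketch of the compensation strategy: a regressor learns the bias
# that head motion induces in an initial fixed-pose gaze estimate, then
# subtracts it. All data is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
head_pose = rng.uniform(-30, 30, size=(500, 3))        # yaw, pitch, roll
true_gaze = rng.uniform(-20, 20, size=(500, 2))
bias = 0.4 * head_pose[:, :2]                          # assumed pose-dependent bias
initial_estimate = true_gaze + bias                    # fixed-pose estimator output

compensator = Ridge().fit(head_pose, initial_estimate - true_gaze)
corrected = initial_estimate - compensator.predict(head_pose)
print("mean error before:", np.abs(initial_estimate - true_gaze).mean())
print("mean error after: ", np.abs(corrected - true_gaze).mean())
```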

8.
Gaze estimation reflects a person's focus of attention and plays an important role in understanding subjective states such as emotion and interest. However, the monocular eye images currently used for gaze estimation are easily distorted by changes in head pose, which reduces estimation accuracy. A new classification-based gaze estimation method is proposed. Using a 3D face model and the intrinsic parameters of a monocular camera, a head-pose coordinate system is formed from the 3D coordinates of the centers of the eyes and mouth; the camera coordinate system and the head-pose coordinate system are then combined to build a normalized coordinate system that rectifies the camera coordinate system. The normalized grayscale eye images are restored and enlarged, an appearance-based convolutional neural network classification model is built to estimate gaze direction, and a golden-section search is used to optimize the search and further reduce the error. Experimental results on the MPIIGaze dataset show that, compared with published algorithms of the same kind, the method reduces the mean angular error by about 7.4%.
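A minimal sketch of a golden-section search, the 1-D refinement step the abstract mentions; the quadratic stand-in objective is an assumption, whereas in the paper the objective would be derived from the CNN classifier's output.

```python
# Minimal sketch of a golden-section search, the 1-D refinement step.
# Here it minimizes a stand-in angular-error function.
import math

def golden_section_search(f, lo, hi, tol=1e-5):
    """Locate the minimum of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2          # 1/golden ratio ~ 0.618
    a, b = lo + (1 - inv_phi) * (hi - lo), lo + inv_phi * (hi - lo)
    fa, fb = f(a), f(b)
    while hi - lo > tol:
        if fa < fb:
            hi, b, fb = b, a, fa              # minimum lies in [lo, b]
            a = lo + (1 - inv_phi) * (hi - lo)
            fa = f(a)
        else:
            lo, a, fa = a, b, fb              # minimum lies in [a, hi]
            b = lo + inv_phi * (hi - lo)
            fb = f(b)
    return (lo + hi) / 2

# Stand-in objective: angular error minimized at gaze angle 12.5 degrees.
print(golden_section_search(lambda g: (g - 12.5) ** 2, 0.0, 45.0))
```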

9.
We address the problem of recognizing the visual focus of attention (VFOA) of meeting participants based on their head pose. To this end, the head pose observations are modeled using a Gaussian mixture model (GMM) or a hidden Markov model (HMM) whose hidden states correspond to the VFOA. The novelties of this paper are threefold. First, contrary to previous studies on the topic, in our setup the potential VFOA of a person is not restricted to the other participants only; it includes environmental targets as well (a table and a projection screen), which increases the complexity of the task, with more VFOA targets spread across both the pan and tilt gaze space. Second, we propose a geometric model to set the GMM or HMM parameters by exploiting results from cognitive science on saccadic eye motion, which allows the prediction of the head pose given a gaze target. Third, an unsupervised parameter adaptation step that does not use any labeled data is proposed, which accounts for the specific gazing behavior of each participant. Using a publicly available corpus of eight meetings featuring four persons, we analyze the above methods by evaluating, through objective performance measures, the recognition of the VFOA from head pose information obtained either using a magnetic sensor device or a vision-based tracking system. The results clearly show that in such complex but realistic situations, the VFOA recognition performance is highly dependent on how well the visual targets are separated for a given meeting participant. In addition, the results show that the use of a geometric model with unsupervised adaptation achieves better results than the use of training data to set the HMM parameters.
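The generative model can be sketched with scikit-learn's GaussianMixture over (pan, tilt) head-pose observations, one component per VFOA target. The target means here are synthetic stand-ins; the paper sets them from a cognitive geometric model and adapts them per participant without labels.

```python
# Minimal sketch of the generative idea: head-pose observations (pan, tilt)
# modeled by a Gaussian mixture whose components correspond to VFOA targets.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
targets = {"person A": (-30, 0), "person B": (20, 0), "screen": (0, 15), "table": (0, -25)}
means = np.array(list(targets.values()), dtype=float)
obs = np.vstack([rng.normal(m, 5.0, size=(100, 2)) for m in means])  # pan, tilt

# With well-separated clusters, means_init keeps components aligned with
# the target list; real adaptation would be unsupervised, per participant.
gmm = GaussianMixture(n_components=4, means_init=means, random_state=0).fit(obs)
labels = list(targets)
pose = np.array([[18.0, 2.0]])            # a new head-pose observation
print("recognized VFOA:", labels[int(np.argmax(gmm.predict_proba(pose)))])
```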

10.
Most of the research on sign language recognition concentrates on recognizing only manual signs (hand gestures and shapes), discarding a very important component: the non-manual signals (facial expressions and head/shoulder motion). We address the recognition of signs with both manual and non-manual components using a sequential belief-based fusion technique. The manual components, which carry information of primary importance, are utilized in the first stage. The second stage, which makes use of non-manual components, is only employed if there is hesitation in the decision of the first stage. We employ belief formalism both to model the hesitation and to determine the sign clusters within which the discrimination takes place in the second stage. We have implemented this technique in a sign tutor application. Our results on the eNTERFACE’06 ASL database show an improvement over the baseline system which uses parallel or feature fusion of manual and non-manual features: we achieve an accuracy of 81.6%.
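A hedged sketch of the sequential fusion logic: the manual-sign stage decides alone unless its two top beliefs are too close ("hesitation"), in which case the non-manual stage breaks the tie within the confused cluster. A simple score margin stands in for the paper's belief formalism.

```python
# Minimal sketch of sequential fusion: the manual-sign classifier decides
# alone unless its belief is ambiguous ("hesitation"), in which case a
# non-manual classifier breaks the tie within the confused sign cluster.
# Scores are illustrative stand-ins for the paper's belief values.
import numpy as np

def fuse(manual_scores, nonmanual_scores, hesitation_margin=0.15):
    order = np.argsort(manual_scores)[::-1]
    best, second = order[0], order[1]
    if manual_scores[best] - manual_scores[second] >= hesitation_margin:
        return best                          # first stage is confident
    cluster = [best, second]                 # hesitation: consult stage two
    return cluster[int(np.argmax(nonmanual_scores[cluster]))]

manual = np.array([0.42, 0.40, 0.10, 0.08])      # near-tie between signs 0 and 1
nonmanual = np.array([0.20, 0.55, 0.15, 0.10])   # non-manual cue favors sign 1
print("recognized sign:", fuse(manual, nonmanual))  # -> 1
```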

11.
We introduce a new model for personal recognition based on the 3-D geometry of the face. The model is designed for application scenarios where the acquisition conditions constrain the facial position. The 3-D structure of a facial surface is compactly represented by sets of contours (facial contours) extracted around automatically pinpointed nose tip and inner eye corners. The metric used to decide whether a point on the face belongs to a facial contour is its geodesic distance from a given landmark. Iso-geodesic contours are inherently robust to head pose variations, including in-depth rotations of the face. Since these contours are extracted from rigid parts of the face, the resulting recognition algorithms are insensitive to changes in facial expressions. The facial contours are encoded using innovative pose invariant features, including Procrustean distances defined on pose-invariant curves. The extracted features are combined in a hierarchical manner to create three parallel face recognizers. Inspired by the effectiveness of region ensembles approaches, the three recognizers constructed around the nose tip and inner corners of the eyes are fused both at the feature-level and the match score-level to create a unified face recognition algorithm with boosted performance. The performances of the proposed algorithms are evaluated and compared with other algorithms from the literature on a large public database appropriate for the assumed constrained application scenario.

12.
The accurate location of the eyes in a facial image is important to many applications related to human facial recognition and has attracted considerable research interest in computer vision. However, most prevalent methods assume a frontal pose of the face, and applying them to non-frontal poses can yield erroneous results. In this paper, we propose an eye detection method that can locate the eyes in facial images captured at various head poses. Our proposed method consists of two stages: eye candidate detection and eye candidate verification. In eye candidate detection, eye candidates are obtained using multi-scale iris shape features and the integral image. The size of the iris in face images varies as the head pose changes, and the proposed multi-scale iris shape feature can detect the eyes in such cases. Since it utilizes the integral image, its computational cost is relatively low. The extracted eye candidates are then verified in the eye candidate verification stage using a support vector machine (SVM) based on the feature-level fusion of histogram of oriented gradients (HOG) and cell mean intensity features. We tested the performance of the proposed method on the Chinese Academy of Sciences' Pose, Expression, Accessories, and Lighting (CAS-PEAL) database and the Pointing'04 database. The results confirmed the superiority of our method over the conventional Haar-like detector and two hybrid eye detectors under relatively extreme head pose variations.
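The integral image that keeps the multi-scale feature computation cheap can be sketched in a few lines: after one cumulative-sum pass, any rectangle sum costs four table lookups.

```python
# Minimal sketch of the integral image, which lets the detector sum pixel
# intensities over any rectangle in O(1) -- the trick that keeps the
# multi-scale iris-shape feature computation cheap.
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """Summed-area table with a zero row/column prepended for easy lookup."""
    ii = img.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii: np.ndarray, top: int, left: int, h: int, w: int) -> float:
    """Sum of img[top:top+h, left:left+w] from four table lookups."""
    return (ii[top + h, left + w] - ii[top, left + w]
            - ii[top + h, left] + ii[top, left])

img = np.arange(36, dtype=float).reshape(6, 6)
ii = integral_image(img)
print(rect_sum(ii, 1, 2, 3, 2), img[1:4, 2:4].sum())  # identical sums
```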

13.
Safety, legibility and efficiency are essential for autonomous mobile robots that interact with humans. A key factor in this respect is bi-directional communication of navigation intent, which we focus on in this article with a particular view on industrial logistics applications. In the direction robot-to-human, we study how a robot can communicate its navigation intent using Spatial Augmented Reality (SAR) such that humans can intuitively understand the robot’s intention and feel safe in the vicinity of robots. We conducted experiments with an autonomous forklift that projects various patterns on the shared floor space to convey its navigation intentions. We analyzed trajectories and eye gaze patterns of humans while interacting with the autonomous forklift and carried out stimulated recall interviews (SRI) in order to identify desirable features for the projection of robot intentions. In the direction human-to-robot, we argue that robots in human co-habited environments need human-aware task and motion planning to support safety and efficiency, ideally responding to people’s motion intentions as soon as they can be inferred from human cues. Eye gaze can convey information about intentions beyond what can be inferred from the trajectory and head pose of a person. Hence, we propose eye-tracking glasses as safety equipment in industrial environments shared by humans and robots. In this work, we investigate the possibility of human-to-robot implicit intention transference solely from eye gaze data and evaluate how the observed eye gaze patterns of the participants relate to their navigation decisions. We again analyzed trajectories and eye gaze patterns of humans interacting with the autonomous forklift for clues that could reveal direction intent. Our analysis shows that people primarily gazed at the side of the robot on which they ultimately decided to pass. We discuss the implications of these results and relate them to a control approach that uses human gaze for early obstacle avoidance.

14.
In this paper, we present algorithms to assess the quality of facial images affected by factors such as blurriness, lighting conditions, head pose variations, and facial expressions. We develop face recognition prediction functions for images affected by blurriness, lighting conditions, and head pose variations based upon the eigenface technique. We also develop a classifier for images affected by facial expressions to assess their quality for recognition by the eigenface technique. Our experiments using different facial image databases show that our algorithms are capable of assessing the quality of facial images. These algorithms could be used in a module for facial image quality assessment in a face recognition system. In the future, we will integrate the different measures of image quality to produce a single measure that indicates the overall quality of a face image.
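One way to illustrate an eigenface-based quality cue (an assumption-level sketch, not the paper's exact prediction functions): project a face onto a PCA subspace learned from good-quality faces and score quality by reconstruction error, since degraded faces reconstruct poorly.

```python
# Minimal sketch of an eigenface-style quality cue: project a face onto a
# PCA subspace learned from good-quality faces and use the reconstruction
# error as a quality score. Data is synthetic; real prediction functions
# would be fit per degradation factor (blur, lighting, pose).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
train_faces = rng.random((200, 64 * 64))          # flattened good-quality faces
pca = PCA(n_components=30).fit(train_faces)

def quality_score(face: np.ndarray) -> float:
    recon = pca.inverse_transform(pca.transform(face[None]))[0]
    return -np.linalg.norm(face - recon)          # higher = better quality

clean = train_faces[0]
degraded = clean + rng.normal(0, 0.5, clean.shape)  # stand-in for degradation
print(quality_score(clean) > quality_score(degraded))  # True
```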

15.
Zhao Xinchen, Yang Nan. Journal of Computer Applications, 2020, 40(11): 3295–3299.
Real-time gaze tracking is a key technology for gaze-operated interactive systems. Compared with eye-tracker-based techniques, webcam-based techniques have advantages such as low cost and high availability. To address the low accuracy of existing webcam-based algorithms, which consider only eye-image features, an optimized gaze-tracking algorithm that introduces head-pose analysis is proposed. First, head-pose features are constructed from facial landmark detection results, providing a head-pose context for the calibration data. Then, a new similarity algorithm is developed to compute the similarity between head-pose contexts. Finally, during gaze tracking, the calibration data are filtered by head-pose similarity, and the calibration samples whose head pose is most similar to that of the current input frame are selected for prediction. Extensive experiments on data from users with different characteristics show that, compared with WebGazer, the proposed algorithm reduces the mean error by 58–63 px. The algorithm effectively improves the accuracy and stability of the tracking results and extends the application scenarios of webcam devices in gaze tracking.
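The calibration-filtering idea can be sketched as follows, with cosine similarity standing in for the paper's custom head-pose-context similarity; all data is synthetic.

```python
# Minimal sketch of the calibration-filtering idea: keep only calibration
# samples whose head-pose context is similar to the current frame's, then
# predict from those. Cosine similarity is a stand-in for the paper's
# custom similarity measure.
import numpy as np

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
calib_pose = rng.normal(size=(50, 3))            # head-pose context per sample
calib_gaze = rng.uniform(0, 1000, size=(50, 2))  # calibrated gaze points (px)

def predict(current_pose, k=5):
    sims = np.array([cosine_sim(p, current_pose) for p in calib_pose])
    nearest = np.argsort(sims)[-k:]              # most similar head poses
    return calib_gaze[nearest].mean(axis=0)      # predict from the filtered set

print("predicted gaze point:", predict(rng.normal(size=3)))
```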

16.
Research progress and prospects of eye tracking
Gou Chao, Zhuo Ying, Wang Kang, Wang Fei-Yue. Acta Automatica Sinica, 2022, 48(5): 1173–1192.
Eye tracking refers to the process of automatically detecting the pupil center position or estimating the 3D gaze direction and point of regard, and is widely applied in human-computer interaction, intelligent driving, human-factors engineering, and other fields. Owing to illumination changes across scenes, individual differences in eye physiology, occlusion, and diverse head poses, eye tracking remains a challenging research topic. This survey first outlines the scope of eye-tracking research, then reviews recent domestic and international progress in pupil center detection and gaze estimation, summarizes the main eye-tracking datasets, evaluation metrics, and research results, introduces applications in human-computer interaction, intelligent driving, and other areas, and finally discusses future trends in the field.

17.
In this paper, we propose an On-line Appearance-Based Tracker (OABT) for simultaneous tracking of 3D head pose, lips, eyebrows, eyelids and irises in monocular video sequences. In contrast to previously proposed tracking approaches, which deal with face and gaze tracking separately, our OABT can also be used for eyelid and iris tracking, as well as 3D head pose, lips and eyebrows facial actions tracking. Furthermore, our approach applies an on-line learning of changes in the appearance of the tracked target. Hence, the prior training of appearance models, which usually requires a large amount of labeled facial images, is avoided. Moreover, the proposed method is built upon a hierarchical combination of three OABTs, which are optimized using a Levenberg–Marquardt Algorithm (LMA) enhanced with line-search procedures. This, in turn, makes the proposed method robust to changes in lighting conditions, occlusions and translucent textures, as evidenced by our experiments. Finally, the proposed method achieves head and facial actions tracking in real-time.
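A minimal sketch of the Levenberg–Marquardt update at the heart of such appearance-model fitting: solve (JᵀJ + λI)δ = −Jᵀr and adapt the damping λ based on whether the residual decreased. The toy exponential-fit problem below is an assumption for illustration; the paper additionally enhances LMA with line-search procedures.

```python
# Minimal sketch of a Levenberg-Marquardt step: solve the damped normal
# equations and accept the update only if the residual decreases.
# Toy problem: fit y = a*exp(b*x) to synthetic observations.
import numpy as np

x = np.linspace(0, 1, 30)
y = 2.0 * np.exp(1.5 * x)                       # synthetic observations

def residual(p):
    return p[0] * np.exp(p[1] * x) - y

def jacobian(p):
    return np.stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)], axis=1)

p, lam = np.array([1.0, 1.0]), 1e-2
for _ in range(50):
    r, J = residual(p), jacobian(p)
    delta = np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ r)
    if np.sum(residual(p + delta) ** 2) < np.sum(r ** 2):
        p, lam = p + delta, lam * 0.5           # good step: trust it more
    else:
        lam *= 2.0                              # bad step: damp harder
print("fitted (a, b):", p)                      # -> approx (2.0, 1.5)
```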

18.
Chen Shicun, Zhang Yong, Yin Baocai, Wang Boyue. Pattern Analysis and Applications, 2021, 24(4): 1745–1755.
Nowadays, face detection and head pose estimation have many applications, such as face recognition, aiding gaze estimation, and modeling attention. For these...

19.
The iCat is a user-interface robot with the ability to express a range of emotions through its facial features. This article summarizes our research into whether we can increase the believability and likability of the iCat for its human partners through the application of gaze behaviour. Gaze behaviour serves several functions during social interaction, such as mediating conversation flow, communicating emotional information, and avoiding distraction by restricting visual input. Several types of eye and head movements are necessary for realizing these functions. We designed and evaluated a gaze behaviour system for the iCat robot that implements realistic models of the major types of eye and head movements found in living beings: vergence, the vestibulo-ocular reflex, smooth pursuit movements, and gaze shifts. We discuss how these models are integrated into the software environment of the iCat and can be used to create complex interaction scenarios. We report on user tests and draw conclusions for future evaluation scenarios.
