Similar Articles
20 similar articles found.
1.

We investigate the use of commercial off-the-shelf (COTS) eye-trackers to automatically detect mind wandering—a phenomenon involving a shift in attention from task-related to task-unrelated thoughts—during computerized learning. Study 1 (N = 135 high-school students) tested the feasibility of COTS eye tracking while students learned biology with an intelligent tutoring system called GuruTutor in their classroom. Gaze was successfully recorded in 85% of the sessions; within those, we could track eye gaze in 75% of cases with both eyes and in 95% with at least one eye. In Study 2, we used these data to build automated student-independent detectors of mind wandering, obtaining accuracies (mind wandering F1 = 0.59) substantially better than chance (F1 = 0.24). Study 3 investigated the context-generalizability of mind wandering detectors, finding that models trained on data collected in a controlled laboratory generalized to the classroom more successfully than the reverse. Study 4 investigated gaze- and video-based mind wandering detection, finding that gaze-based detection was superior and that multimodal detection yielded an improvement only in limited circumstances. In Study 5, we tested live mind wandering detection on a new sample of 39 students and found that detection accuracy (mind wandering F1 = 0.40) was considerably above chance (F1 = 0.24), albeit lower than the offline detection accuracy from Study 2 (F1 = 0.59), a difference attributable to the handling of missing data. We discuss our next steps towards developing gaze-based attention-aware learning technologies that increase engagement and learning by combating mind wandering in classroom contexts.
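The chance baseline quoted above (F1 = 0.24) is consistent with a detector that always predicts mind wandering at a base rate of roughly 14%. The abstract does not say how the baseline was computed, so the sketch below is only an illustration of that assumption, not the authors' method:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts: the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def always_positive_f1(base_rate: float) -> float:
    """F1 of a detector that always predicts the positive class.

    Precision equals the base rate p, recall equals 1, so F1 = 2p / (1 + p).
    """
    return 2 * base_rate / (1 + base_rate)

# A mind-wandering base rate near 0.136 reproduces the chance F1 of 0.24
# (the base rate is an assumption, not a figure from the paper).
print(round(always_positive_f1(0.136), 2))  # → 0.24
```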


2.
ABSTRACT

Mind wandering is a commonly intruding cognitive state that leads to diminished performance and increased error risk during a primary task. A controversy over whether easier or more difficult tasks increase mind wandering has led to mind wandering being characterized as two different states: deliberate and spontaneous. We hypothesise that forced engagement via persistent compliant activity may both increase responsiveness and inhibit non-instrumental activities, including deliberate mind wandering. Twenty-eight healthy adults interacted with two pairs of stimuli, each pair comprising a low-interactivity version and a high-interactivity version requiring compliant activity. Mind wandering was assessed by thought probes, subjective responses were rated on visual analogue scales, and reaction times were measured using Superlab. Compliant activity decreased the prevalence of deliberate mind wandering episodes but not of overall mind wandering. Thought-probe response durations were shortened significantly by compliant activity, near-significantly by thinking on-task thoughts, and additively by the combination of both. Deliberate and spontaneous mind wandering elicited equivalent thought-probe durations. We conclude that compliant activity works synergistically with the absence of mind wandering to speed the difficult task of thought-probe response, but not simple reaction times. These results fit an arousal model rather than the attentional-resources model.

3.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized mainly by impaired social communication, stereotyped behaviors, and narrow interests. It has a high disability rate and seriously affects children's healthy development. Subjective clinical diagnosis of ASD is time-consuming and highly subjective, so a fast, economical, and effective objective screening method is urgently needed. Research has found that children with ASD show atypical visual perception of emotions, suggesting that eye-tracking technology could serve as an aid to ASD diagnosis. This paper proposes a model that combines the atypical emotional visual perception patterns of ASD in natural scenes with machine learning to automatically screen for ASD. The model extracts eye-movement scanpath features recorded while emotions are perceived in natural scenes and feeds them to a machine-learning model, so that children with ASD can be identified automatically from their scanpaths. Experimental results show an accuracy of 79.71%, suggesting that the method could become an auxiliary tool for early screening of children with ASD.
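The abstract does not list the exact scanpath features used, so the sketch below only illustrates the kind of eye-movement features such a screening model might consume; the feature names and the (x, y, duration) fixation format are assumptions:

```python
import math

def scanpath_features(fixations):
    """Summarize a scanpath given as a list of (x, y, duration_ms) fixations.

    Returns simple aggregate features of the sort commonly fed to a
    machine-learning classifier (illustrative, not the paper's feature set).
    """
    path_length = sum(
        math.dist(fixations[i][:2], fixations[i + 1][:2])
        for i in range(len(fixations) - 1)
    )
    mean_duration = sum(f[2] for f in fixations) / len(fixations)
    return {
        "n_fixations": len(fixations),
        "scanpath_length": path_length,
        "mean_fixation_duration": mean_duration,
    }

# Three fixations: the second is 5 px from the first, the third repeats it.
print(scanpath_features([(0, 0, 200), (3, 4, 300), (3, 4, 100)]))
```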

4.
Even though eye movements during reading have been studied intensively for decades, applications that track the reading of longer passages of text in real time are rare. The problems encountered in developing such an application (a reading aid, iDict) are described, along with their solutions. Some of the issues are general and concern the broad family of Attention Aware Systems; others are specific to the modality of interest, eye gaze. One of the most difficult problems when using eye tracking to identify the focus of visual attention is the inaccuracy of the eye trackers used to measure the point of gaze. This inaccuracy inevitably affects the design decisions of any application exploiting the point of gaze to localize the point of visual attention. The problem is demonstrated with examples from our experiments. The principles of the drift-correction algorithms that automatically correct the vertical inaccuracy are presented, and the performance of the algorithms is evaluated.

5.
We report on an investigation into people’s behaviors on information search tasks, specifically the relation between eye movement patterns and task characteristics. We conducted two independent user studies (n = 32 and n = 40), one with journalism tasks and the other with genomics tasks. The tasks were constructed to represent the information needs of these two different user groups and to vary in several dimensions according to a task classification scheme. For each participant we classified eye gaze data to construct models of their reading patterns. The reading models were analyzed with respect to the effect of task types and Web page types on reading eye movement patterns. We report on relationships between tasks and individual reading behaviors at the task and page level. Specifically, we show that transitions between scanning and reading behavior in eye movement patterns, and the amount of text processed, may be an implicit indicator of the current task type facets. This may be useful in building user and task models for the personalization of information systems, addressing design demands driven by increasingly complex user interactions with information systems. One of the contributions of this research is a new methodology to model information search behavior and investigate information acquisition and cognitive processing in interactive information tasks.

6.
Many preprocessing techniques intended to normalize artifacts and clean noise induce anomalies, in part due to the discretized nature of the document image and in part due to inherent ambiguity in the input image relative to the desired transformation. The potentially deleterious effects of common preprocessing methods are illustrated through a series of dramatic albeit contrived examples and then shown to affect real applications of ongoing interest to the community through three writer identification experiments conducted on Arabic handwriting. Retaining ruling lines detected by multi-line linear regression, instead of repairing strokes broken by deleting ruling lines, reduced the error rate by 4.5 %. Exploiting word position relative to detected rulings, instead of ignoring it, decreased errors by 5.5 %. Counteracting page skew by rotating extracted contours during feature extraction, instead of rectifying the page image, reduced the error by 1.4 %. All of these accuracy gains are shown to be statistically significant. Analogous methods are advocated for other document processing tasks as topics for future research.

7.
It is difficult to map land covers in the urban core due to the close proximity of high-rise buildings. This difficulty is overcome with a proposed hybrid, hierarchical method that fuses PAN-sharpened WorldView-2 imagery with light detection and ranging (lidar) data for central Auckland, New Zealand, in two stages. After all features were categorized into ‘ground’ and ‘above-ground’ using lidar data, ground features were classified from the satellite data using the object-oriented method. Above-ground covers were grouped into four types from the lidar-derived normalized digital surface model (nDSM) based on rules. Ground and above-ground features were classified at an accuracy of 94.1% (kappa coefficient or κ = 0.913) and 93.7% (κ = 0.873), respectively. After the two results were merged, the nine covers achieved an overall accuracy of 93.7% (κ = 0.902). This accuracy is highly comparable to those reported in the literature, but was achieved at much less computational expense and complexity owing to the hybrid workflow that optimizes the efficiency of the respective classifiers. This hybrid method of classification is robust and applicable to other scenes without modification, as the required parameters are derived automatically from the data to be classified. It is also flexible in incorporating user-defined rules targeting hard-to-discriminate covers. Mapping accuracy from the fused complementary data sets was adversely affected by shadows in the satellite image and by the differential acquisition times of the imagery and lidar data.
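The ‘ground’/‘above-ground’ split rests on the lidar-derived nDSM, which is the digital surface model minus the bare-earth terrain model. A minimal sketch of that step follows; the grid layout and the 2 m height threshold are assumptions for illustration, not values from the paper:

```python
def ndsm(dsm, dtm):
    """Normalized DSM: per-cell surface height minus bare-earth terrain height."""
    return [[s - t for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(dsm, dtm)]

def above_ground_mask(ndsm_grid, height_threshold=2.0):
    """Flag cells whose normalized height exceeds a threshold (assumed 2 m)."""
    return [[h > height_threshold for h in row] for row in ndsm_grid]

# Tiny 2x2 example: a flat terrain at 10 m with one 5 m and one 20 m feature.
dsm = [[10.0, 15.0], [10.5, 30.0]]
dtm = [[10.0, 10.0], [10.0, 10.0]]
print(above_ground_mask(ndsm(dsm, dtm)))  # → [[False, True], [False, True]]
```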

8.
The interactive use of visual interface tools has diversified the use of visualisations. This article reviews the relevant aspects of interaction and challenges the sufficiency of traditional evaluation criteria developed for static graphs. Traditionally, the problem for statisticians has been to maintain perceptual discriminability of details, when quantities of data increase. Currently, however, even non-professional users need to integrate qualitatively different kinds of information. The review of task requirements indicates the use of a visual outline: (1) visual tools can facilitate parallel separation of individual data entities and integration of their features and (2) more focused comparisons require visual memory due to eye movements. The article reports psychophysical experiments that measure performance accuracy and response latency conditioned by the above task requirements. The impact of shape and colour on performance interacted with display times; the times were shorter (100 ms) or longer (1 s) than the duration of typical gaze fixation. The features of graphs in the experiments were derived from a popular internet service. Thus, we describe methods for evaluating visual components of real services and provide general guidelines for visual design of human–computer interaction.

9.
Deep learning has risen in popularity as a face recognition technology in recent years. Facenet, a deep convolutional neural network (DCNN) developed by Google, recognizes faces using 128 bytes per face, and is reported to have achieved 99.96% on the reputed Labelled Faces in the Wild (LFW) dataset. However, the accuracy and validation rate of Facenet drop as the resolution of the images decreases. This research paper aims at developing a new facial recognition system that can produce a higher accuracy rate and validation rate on low-resolution face images. The proposed system, Extended Openface, performs facial recognition using three different features: (i) facial landmarks, (ii) head pose, and (iii) eye gaze. It extracts facial landmarks using a Scattered Gated Expert Network Constrained Local Model (SGEN-CLM), and detects head pose and eye gaze using an Enhanced Constrained Local Neural Field (ECLNF). Extended Openface employs a simple Support Vector Machine (SVM) for training and testing on the face images. The system's performance is assessed on low-resolution datasets such as LFW and the Indian Movie Face Database (IMFDB). The results demonstrate that Extended Openface has a 12% higher accuracy rate and a 22% higher validation rate than Facenet on low-resolution images.

10.
In this paper we present a novel mechanism to obtain enhanced gaze estimation for subjects looking at a scene or an image. The system makes use of prior knowledge about the scene (e.g. an image on a computer screen) to define a probability map of the scene the subject is gazing at, in order to find the most probable location. The proposed system helps correct fixations that are erroneously estimated by the gaze estimation device, employing a saliency framework to adjust the resulting gaze point vector. The system is tested in three scenarios: using commercial eye tracking data, enhancing a low-accuracy webcam-based eye tracker, and using a head pose tracker. The correlation between the subjects in the commercial eye tracking data is improved by an average of 13.91%. The correlation for the low-accuracy eye gaze tracker is improved by 59.85%, and for the head pose tracker we obtain an improvement of 10.23%. These results show the potential of the system as a way to enhance and self-calibrate different visual gaze estimation systems.

11.
Objective: Driver fatigue is one of the main causes of road traffic accidents, and existing methods recognize eye state poorly when the driver's face is partially occluded. This paper therefore proposes a driver eye-fatigue detection method based on the co-occurrence matrix of the self-quotient image and the gradient image. Method: An SSD (single shot multibox detector) face detector with a residual network (ResNet) backbone locates valid face regions in the video, and a facial landmark detection algorithm segments out the local eye-region image. A co-occurrence matrix model of the self-quotient image and gradient image of the driver's eyes is built; the statistical features of the co-occurrence matrix are analyzed, and the best-performing features are selected to judge whether the eyes are open or closed. Two fatigue indicators, the percentage of eyelid closure (PERCLOS) and the maximum closing duration (MCD), are then combined to determine the driver's fatigue state. Results: In driving simulations on a six-degree-of-freedom vehicle-performance virtual simulation platform, with the driver's face recorded and analyzed on video, the method effectively recognized the open/closed state of the eyes when the face was occluded, with an accuracy of 99.12% (98.73% without occlusion), and processed video at about 32 frames/s. A first comparison method, combining histogram-of-oriented-gradients features with an SVM classifier for face detection and judging eye state by the eye aspect ratio, performed poorly under occlusion; a second comparison method, judging eye state with a convolutional neural network (CNN), reached 98.02% accuracy under occlusion but detected blinks poorly. Conclusion: The proposed fatigue detection method based on the self-quotient image/gradient image co-occurrence matrix effectively recognizes eye state and driver fatigue under facial occlusion, with fast detection and high accuracy.
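The two fatigue indicators are simple statistics over the per-frame eye-state decisions: PERCLOS is the fraction of frames judged eyes-closed, and MCD is the longest continuous closed run. A minimal sketch follows; the 32 frames/s rate comes from the abstract, but the fatigue thresholds a deployed system would apply to these values are not given there:

```python
def perclos(closed_frames):
    """Percentage of eyelid closure: fraction of frames judged eyes-closed."""
    return sum(closed_frames) / len(closed_frames)

def max_closing_duration(closed_frames, fps=32):
    """Longest continuous eyes-closed run, in seconds at the given frame rate."""
    longest = run = 0
    for closed in closed_frames:
        run = run + 1 if closed else 0
        longest = max(longest, run)
    return longest / fps

# Eight frames, eyes closed in frames 1-3 and 5-6 (illustrative data).
frames = [False, True, True, True, False, True, True, False]
print(perclos(frames))               # → 0.625
print(max_closing_duration(frames))  # → 0.09375
```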

12.
Eye tracking has been used successfully for some time as a technique for measuring cognitive load in reading, psycholinguistics, writing, language acquisition, etc. Its application as a technique for measuring the reading ease of MT output has not yet, to our knowledge, been tested. We report here on a preliminary study testing the use and validity of an eye tracking methodology as a means of semi-automatically evaluating machine translation output. 50 French machine translated sentences, 25 rated as excellent and 25 rated as poor in an earlier human evaluation, were selected. Ten native speakers of French were instructed to read the MT sentences for comprehensibility. Their eye gaze data were recorded non-invasively using a Tobii 1750 eye tracker. The average gaze time and fixation count were found to be higher for the “bad” sentences, while average fixation duration and pupil dilation were not found to be substantially different between output rated as good and output rated as bad. HTER scores were also found to correlate well with gaze time and fixation count, but not with pupil dilation or fixation duration. We conclude that eye tracking data, in particular gaze time and fixation count, correlate reasonably well with human evaluation of MT output, but that fixation duration and pupil dilation may be less reliable indicators of reading difficulty for MT output. We also conclude that eye tracking has promise as a semi-automatic MT evaluation technique, which does not require bilingual knowledge and which can potentially tap into the end users’ experience of machine translation output.

13.
Gaze interaction affords hands-free control of computers. Pointing to and selecting small targets using gaze alone is difficult because of the limited accuracy of gaze pointing. This is the first experimental comparison of gaze-based interface tools for small-target (e.g. <12 × 12 pixels) point-and-select tasks. We conducted two experiments comparing the performance of dwell, magnification and zoom methods in point-and-select tasks with small targets in single- and multiple-target layouts. Both magnification and zoom showed higher hit rates than dwell. Hit rates were higher when using magnification than when using zoom, but total pointing times were shorter using zoom. Furthermore, participants perceived magnification as more fatiguing than zoom. The higher accuracy of magnification makes it preferable when interacting with small targets. Our findings may guide the development of interface tools to facilitate access to mainstream interfaces for people with motor disabilities and other users in need of hands-free interaction.

14.
The aim of the study was to evaluate the difference in legibility between e-books and paper books by using an eye tracker. Eight male and eight female subjects free of eye disease participated in the experiment. The experiment was conducted using a 2 × 3 within-subject design. The book type (e-book, paper book) and font size (8 pt, 10 pt, 12 pt) were independent variables, and fixation duration time, saccade length, blink rate and subjective discomfort were dependent variables. In the results, all dependent variables showed that reading paper books provided a better experience than reading e-books did. These results indicate that the legibility of e-books needs further improvement, considering fixation duration time, saccade movement, eye fatigue, device and so on.

15.
When first introduced, the cross-ratio (CR) based remote eye tracking method offered many attractive features for natural human gaze-based interaction, such as simple camera setup, no user calibration, and invariance to head motion. However, due to many simplification assumptions, current CR-based methods are still sensitive to head movements. In this paper, we revisit the CR-based method and introduce two new extensions to improve the robustness of the method to head motion. The first method dynamically compensates for scale changes in the corneal reflection pattern, and the second method estimates true coplanar eye features so that the cross-ratio can be applied. We present real-time implementations of both systems, and compare the performance of these new methods using simulations and user experiments. Our results show a significant improvement in robustness to head motion and, for the user experiments in particular, an average reduction of up to 40 % in gaze estimation error was observed.
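The method's name comes from the cross-ratio of four collinear points, the classic projective invariant CR(A, B, C, D) = (AC·BD)/(BC·AD), which lets screen coordinates be recovered from corneal reflections without calibration. The sketch below only demonstrates the invariant itself on scalar positions; it is not the authors' corneal-reflection pipeline, and the projective map's coefficients are arbitrary:

```python
def cross_ratio(a: float, b: float, c: float, d: float) -> float:
    """Cross-ratio of four collinear points given as 1-D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def projective(x: float) -> float:
    """An arbitrary 1-D projective map (coefficients chosen for illustration)."""
    return (2 * x + 1) / (x + 3)

pts = (0.0, 1.0, 2.0, 3.0)
before = cross_ratio(*pts)
after = cross_ratio(*(projective(x) for x in pts))
# The cross-ratio survives the projective map: both values equal 4/3.
print(before, after)
```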

16.
Objective: Gaze tracking is an auxiliary system for human–computer interaction. To address the high misjudgment rate and long runtime of traditional iris localization methods, this paper proposes a gaze-tracking method based on the geometric features of the human eye, improving gaze-tracking accuracy in a 2D setting. Method: A face detection algorithm first locates the face, facial landmark detection locates the eye-corner points, and the eye region is computed from those corners. Because direct iris-center localization is slow, an iris template is first built from iris images and used to detect the iris region, after which a fine iris-center localization algorithm locates the iris center. The eye corners, the iris center, and the angle and distance information contained in these points are then extracted and combined into an eye-movement feature vector. A neural network classifier maps these features to gaze points, implementing gaze tracking. Image preprocessing enhances the images before the relative iris center and the required landmarks are extracted, yielding relatively stable geometric features that represent eye movement. Results: Under ordinary experimental lighting with a fixed head pose, recognition accuracy reached up to 98.9%, with an average of 95.74%. When the head pose varied within a limited region, the average accuracy remained above 90%; experimental analysis shows the method is robust within that region of head movement. Conclusion: Combining template matching with fine iris-center localization quickly locates the iris center, and a neural network maps gaze to screen regions; experiments show the method achieves high accuracy.

17.
This paper presents the first attempt to fuse two different kinds of behavioral biometrics: mouse dynamics and eye movement biometrics. Mouse dynamics were collected without any special equipment, while an affordable eye tracker (The Eye Tribe) was used to gather eye movement data at a frequency of 30 Hz, which is also potentially possible using a common web camera. We show that fusing these techniques is quite natural and that it is easy to prepare an experiment that collects both traits simultaneously. Moreover, fusing the information from both signals gave a 6.8 % equal error rate and 92.9 % accuracy for a relatively short registration time (20 s on average). Achieving such results was possible using dissimilarity matrices based on dynamic time warping distance.
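Dynamic time warping, the distance underlying the dissimilarity matrices above, aligns two variable-length signals by minimizing the cumulative point-wise cost over all monotone alignments. A minimal sketch over 1-D sequences follows (the paper's signals are multidimensional mouse and gaze traces, so a real implementation would use a vector distance per point):

```python
def dtw(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Classic O(len(a) * len(b)) dynamic program with absolute-difference cost.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = minimal cumulative cost aligning a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# A repeated sample costs nothing under warping; a changed value costs its gap.
print(dtw([1, 2, 3], [1, 2, 2, 3]))  # → 0.0
print(dtw([1, 2, 3], [1, 2, 4]))     # → 1.0
```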

18.
程时伟  朱安杰  范菁 《软件学报》2018,29(S2):75-85
Eye tracking is strongly directed by visual attention, so it can be applied to target selection on large displays, avoiding the long spatial movements that mouse input requires. However, selection by eye tracking alone suffers from reduced selection precision and frequent accidental operations. To achieve fast and accurate target selection on large displays, this paper therefore proposes a multimodal interaction method that fuses eye tracking with hand gestures: targets are selected by gaze and the selection is confirmed by gesture. When targets are small or closely spaced, the interaction is further optimized through cursor stabilization and a secondary-selection mechanism. User tests show that the method supports effective selection of targets of different sizes and spacings on a large display; compared with gaze-only selection, task completion speed improved by 16% and task accuracy by 82.6%. For the specific task of hierarchical menu selection, the method improved completion speed by 13.6% and accuracy by 55.7% over gaze-only selection. Overall performance approached that of conventional mouse input, further validating the method's effectiveness in practical applications.

19.
This paper proposes a new gaze-detection method based on a 3-D eye position and the gaze vector of the human eyeball. Seven new developments compared to previous works are presented. First, a method of using three camera systems, i.e., one wide-view camera and two narrow-view cameras, is proposed. The narrow-view cameras use autozooming, focusing, panning, and tilting procedures (based on the detected 3-D eye feature position) for gaze detection. This allows for natural head and eye movement by users. Second, in previous conventional gaze-detection research, one or multiple illuminators were used. These studies did not consider specular reflection (SR) problems, which were caused by the illuminators when working with users who wore glasses. To solve this problem, a method based on dual illuminators is proposed in this paper. Third, the proposed method does not require user-dependent calibration, so all procedures for detecting gaze position operate automatically without human intervention. Fourth, the intrinsic characteristics of the human eye, such as the disparity between the pupillary and the visual axes in order to obtain accurate gaze positions, are considered. Fifth, all the coordinates obtained by the left and right narrow-view cameras, as well as the wide-view camera coordinates and the monitor coordinates, are unified. This simplifies the complex 3-D converting calculation and allows for calculation of the 3-D feature position and gaze position on the monitor. Sixth, to upgrade eye-detection performance when using a wide-view camera, the adaptive-selection method is used. This involves an IR-LED on/off scheme, an AdaBoost classifier, and a principle component analysis method based on the number of SR elements. Finally, the proposed method uses an eigenvector matrix (instead of simply averaging six gaze vectors) in order to obtain a more accurate final gaze vector that can compensate for noise. 
Experimental results show that the root mean square error of gaze detection was about 0.627 cm on a 19-in monitor. The processing speed of the proposed method (used to obtain the gaze position on the monitor) was 32 ms (using a Pentium IV 1.8-GHz PC). It was possible to detect the user's gaze position at real-time speed.

20.
In this paper, we present human emotion recognition systems based on audio and spatio-temporal visual features. The proposed system has been tested on an audio-visual emotion data set with different subjects of both genders. The mel-frequency cepstral coefficient (MFCC) and prosodic features are first identified and then extracted from emotional speech. For facial expressions, spatio-temporal features are extracted from the visual streams. Principal component analysis (PCA) is applied for dimensionality reduction of the visual features, capturing 97 % of the variance. A codebook is constructed for both audio and visual features using Euclidean space, and the histograms of codeword occurrences are employed as input to a state-of-the-art SVM classifier to obtain the judgment of each classifier. The judgments from the classifiers are then combined using the Bayes sum rule (BSR) as a final decision step. The proposed system is tested on a public data set to recognize human emotions. Experimental results showed that using visual features alone yields an average accuracy of 74.15 %, while using audio features alone gives an average recognition accuracy of 67.39 %. Combining both audio and visual features improves the overall system accuracy significantly, to 80.27 %.
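The Bayes sum rule combines classifiers by summing (equivalently, averaging) their per-class posterior probabilities and picking the class with the largest total. A minimal sketch with two hypothetical classifiers follows; the class labels and probability values are illustrative, not taken from the paper:

```python
def bayes_sum_rule(posteriors):
    """Fuse classifier decisions by the sum rule.

    posteriors: one probability vector per classifier, each with one entry
    per class. Returns the index of the class with the largest summed score.
    """
    n_classes = len(posteriors[0])
    fused = [sum(p[c] for p in posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=fused.__getitem__)

# Audio leans towards class 0, video towards class 1; the summed evidence
# (0.8 vs 1.2) favours class 1 here.
audio = [0.6, 0.4]
video = [0.2, 0.8]
print(bayes_sum_rule([audio, video]))  # → 1
```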
