Gaze estimation via bilinear pooling-based attention networks |
| |
Affiliation: | 1. Engineering Research Center of Wideband Wireless Communication Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing 210003, China;2. Hangzhou Joyware Limited Corporation, Hangzhou 310051, China |
| |
Abstract: | Attention mechanism has been found effective for human gaze estimation, and the attention and diversity of learned features are two important aspects of attention mechanism. However, the traditional attention mechanism used in existing gaze model is more prone to utilize first-order information that is attentive but not diverse. Though the existing bilinear pooling-based attention could overcome the shortcoming of traditional attention, it is limited to extract high-order contextual information. Thus we introduce a novel bilinear pooling-based attention mechanism, which could extract the second-order contextual information by the interaction between local deep learned features. To make the gaze-related features robust for spatial misalignment, we further propose an attention-in-attention method, which consists of a global average pooling and an inner attention on the second-order features. For the purpose of gaze estimation, a new bilinear pooling-based attention networks with attention-in-attention is further proposed. Extensive evaluation shows that our method surpasses the state-of-the-art by a big margin. |
| |
Keywords: | Gaze tracking Deep learning Bilinear pooling Attention |
本文献已被 ScienceDirect 等数据库收录! |
|