Joint face and head tracking inside multi-camera smart rooms
Authors: Zhenqiu Zhang, Gerasimos Potamianos, Andrew W Senior, Thomas S Huang
Affiliation: (1) Beckman Institute, University of Illinois, Urbana, IL 61801, USA; (2) IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
Abstract: The paper introduces a novel detection and tracking system that provides both frame-view and world-coordinate human location information, based on video from multiple synchronized and calibrated cameras with overlapping fields of view. The system is developed and evaluated for the specific scenario of a seminar lecturer presenting in front of an audience inside a “smart room”, its aim being to track the lecturer’s head centroid in the three-dimensional (3D) space and also yield two-dimensional (2D) face information in the available camera views. The proposed approach is primarily based on a statistical appearance model of human faces by means of well-known AdaBoost-like face detectors, extended to address the head pose variation observed in the smart room scenario of interest. The appearance module is complemented by two novel components and assisted by a simple tracking drift detection mechanism. The first component of interest is the initialization module, which employs a spatio-temporal dynamic programming approach with appropriate penalty functions to obtain optimal 3D location hypotheses. The second is an adaptive subspace learning based 2D tracking scheme with a novel forgetting mechanism, introduced to reduce tracking drift and increase robustness. System performance is benchmarked on an extensive database of realistic human interaction in the lecture smart room scenario, collected as part of the European integrated project “CHIL”. The system consistently achieves excellent tracking precision, with a 3D mean tracking error of less than 16 cm, and is demonstrated to outperform four alternative tracking schemes. Furthermore, the proposed system performs relatively well in detecting frontal and near-frontal faces in the available frame views.
This work was performed while Zhenqiu Zhang was on a summer internship with the Human Language Technology Department at the IBM T.J. Watson Research Center.
Keywords: Person tracking; Face detection; Multi-camera tracking; Dynamic programming; Adaptive subspace tracking; Mean-shift tracking; AdaBoost; Lecture data; Smart rooms
This article is indexed in SpringerLink and other databases.
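Illustrative sketch: the abstract mentions an initialization module that uses spatio-temporal dynamic programming with penalty functions to select 3D location hypotheses. The Python below is only a minimal sketch of that general idea, not the paper's formulation. It assumes each frame supplies a few candidate 3D points with detection scores, and it uses a hypothetical penalty (Euclidean jump between consecutive frames, scaled by `motion_weight`). The function name `best_track` and all parameters are assumptions made for illustration.

```python
import math


def best_track(candidates, scores, motion_weight=1.0):
    """Viterbi-style dynamic programming over per-frame 3D hypotheses.

    candidates[t] : list of (x, y, z) hypotheses at frame t
    scores[t][i]  : detection confidence of hypothesis i (higher is better)

    Minimizes: sum of (-score) + motion_weight * sum of jump distances.
    The penalty form is a hypothetical stand-in, not taken from the paper.
    """
    n_frames = len(candidates)
    # cost[t][i]: minimal cost of any path ending at hypothesis i of frame t
    cost = [[-s for s in scores[0]]]
    back = [[-1] * len(candidates[0])]

    for t in range(1, n_frames):
        row_cost, row_back = [], []
        for i, p in enumerate(candidates[t]):
            best_j, best_c = 0, math.inf
            for j, q in enumerate(candidates[t - 1]):
                c = cost[t - 1][j] + motion_weight * math.dist(p, q)
                if c < best_c:
                    best_j, best_c = j, c
            row_cost.append(best_c - scores[t][i])
            row_back.append(best_j)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest final hypothesis.
    i = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    path = []
    for t in range(n_frames - 1, -1, -1):
        path.append(candidates[t][i])
        i = back[t][i]
    path.reverse()
    return path


# Toy data: a smooth track near the origin plus distractors far away.
cands = [
    [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0)],
    [(0.1, 0.0, 0.0), (5.0, 5.0, 0.0)],
    [(0.2, 0.0, 0.0), (4.0, 4.0, 0.0)],
]
confs = [[0.9, 0.8], [0.5, 0.9], [0.9, 0.3]]
smooth = best_track(cands, confs, motion_weight=1.0)
greedy = best_track(cands, confs, motion_weight=0.0)
```

With a non-zero motion penalty the middle frame's higher-scoring but distant distractor is rejected in favor of the spatially consistent hypothesis. With `motion_weight=0.0` the selection reduces to taking the best score in each frame independently.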