Audio-visual human recognition using semi-supervised spectral learning and hidden Markov models
Authors: Wei Feng, Lei Xie, Jia Zeng, Zhi-Qiang Liu
Affiliation: 1. Media Computing Group, School of Creative Media, City University of Hong Kong, Hong Kong, China; 2. School of Computer Science, Northwestern Polytechnical University, Xi'an, China; 3. Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
Abstract: This paper presents a multimodal system for reliable human identity recognition under varying conditions. The system fuses face and speech recognition within a general probabilistic framework. For face recognition, we propose a new spectral learning algorithm that considers not only the discriminative relations among the training data but also a generative model for each class. Because labeling faces is costly in practice, our spectral face learning uses a semi-supervised strategy: only a small number of labeled faces are used in the training step, and their labels are optimally propagated to the remaining unlabeled training faces. Besides requiring much less labeled data, the algorithm also provides a natural way to explicitly train an outlier model that approximately represents unauthorized faces. To make the system more robust across environments, face recognition is complemented by a speaker identification agent, which models the statistical variations of fixed-phrase speech using speaker-dependent word hidden Markov models. Experiments on benchmark databases validate the effectiveness of the face recognition and speaker identification agents, and demonstrate that integrating these two independent biometric sources noticeably improves recognition accuracy.
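The abstract does not give the details of the spectral learning algorithm or of the fusion rule, so the following is only a minimal sketch of the two general ideas it mentions, under stated assumptions. `propagate_labels` uses a generic graph-based label propagation scheme (the closed-form "local and global consistency" solution on a Gaussian affinity graph), which is a stand-in and not the authors' method. `fuse_log_scores` assumes the two modalities are independent and combines per-identity log-likelihoods with a weighted sum. All function names, parameters (`sigma`, `alpha`, `w_face`), and the toy data are hypothetical.

```python
import numpy as np


def propagate_labels(X, y, n_classes, sigma=1.0, alpha=0.9):
    """Spread a few known labels over a similarity graph (illustrative only).

    X: (n, d) feature vectors, e.g. face descriptors (assumed input format).
    y: (n,) integer labels, with -1 marking unlabeled samples.
    Returns an (n,) array of predicted labels for every sample.
    """
    n = X.shape[0]
    # Gaussian affinity between all pairs of samples, no self-loops.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    degree = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot matrix of the known labels; unlabeled rows stay all-zero.
    Y = np.zeros((n, n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0
    # Closed-form propagation: F = (I - alpha * S)^{-1} Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)


def fuse_log_scores(face_loglik, speech_loglik, w_face=0.5):
    """Pick the identity with the best weighted sum of log-likelihoods.

    Assumes the face and speech scores are independent, so their
    log-likelihoods add; w_face is a hypothetical reliability weight.
    """
    face_loglik = np.asarray(face_loglik, dtype=float)
    speech_loglik = np.asarray(speech_loglik, dtype=float)
    combined = w_face * face_loglik + (1.0 - w_face) * speech_loglik
    return int(np.argmax(combined))


# Toy data: two well-separated clusters with one labeled sample each.
X_demo = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y_demo = np.array([0, -1, -1, 1, -1, -1])
pred_demo = propagate_labels(X_demo, y_demo, n_classes=2)

# Toy per-identity scores from the two modalities (made-up numbers).
fused_demo = fuse_log_scores([-1.0, -2.0, -3.0], [-5.0, -1.0, -4.0])
```

In this sketch the speech scores would come from per-speaker hidden Markov models evaluated on the spoken phrase; that part is omitted because the abstract gives no model details. An outlier class for unauthorized faces could be represented as one extra column in the label matrix, though how the paper trains it is not stated here.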
This article is indexed in ScienceDirect and other databases.