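Audio-visual Speaker Tracking Based on Dynamic Bayesian Network
Cite this article: JIN Nai-Gao, YIN Fu-Liang, CHEN Zhe. Audio-visual Speaker Tracking Based on Dynamic Bayesian Network[J]. Acta Automatica Sinica, 2008, 34(9): 1083-1089.
Authors: JIN Nai-Gao  YIN Fu-Liang  CHEN Zhe
Affiliation: 1. School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116023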
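Abstract: Multi-sensor information fusion is applied to the speaker tracking problem, and an audio-visual speaker tracking method based on a dynamic Bayesian network is proposed. Within the dynamic Bayesian network, the method uses three perception modalities to obtain measurements related to the speaker's position: microphone-array sound source localization, skin-color face detection, and audio-visual mutual information maximization. Particle filtering then fuses these measurements, and Bayesian inference yields effective tracking of the speaker. Information entropy theory is used to manage the three perception modalities dynamically, improving the overall performance of the tracking system. Experimental results verify the effectiveness of the proposed method.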

Keywords: speaker tracking; dynamic Bayesian network; particle filter; microphone array
Received: 2007-7-9
Revised: 2007-11-26

Audio-visual Speaker Tracking Based on Dynamic Bayesian Network
JIN Nai-Gao, YIN Fu-Liang, CHEN Zhe. Audio-visual Speaker Tracking Based on Dynamic Bayesian Network[J]. Acta Automatica Sinica, 2008, 34(9): 1083-1089.
Authors: JIN Nai-Gao  YIN Fu-Liang  CHEN Zhe
Affiliation: 1. School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116023
Abstract: Multi-sensor data fusion is applied to the speaker tracking problem, and a novel audio-visual speaker tracking approach based on a dynamic Bayesian network is proposed. Exploiting the complementarity and redundancy between a speaker's speech and image, three perception methods are used to acquire tracking information: sound source localization with a microphone array, face detection based on skin color, and mutual information maximization based on audio-visual synchrony. Within the dynamic Bayesian network framework, particle filtering fuses the tracking information, and the perception methods are managed using information entropy theory to improve tracking performance. Experiments on real-world data demonstrate that the proposed method can robustly track the speaker even in the presence of perturbing factors such as high room reverberation and video occlusions.
Keywords: speaker tracking; dynamic Bayesian network; particle filter; microphone array
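The abstract names the ingredients (three position cues, particle-filter fusion, entropy-based perception management) but not the formulas. The following is a minimal sketch of how such a scheme could look, not the authors' algorithm. Everything specific in it is an assumption made for illustration: a 1-D speaker position, a random-walk motion model, Gaussian likelihoods for all three cues, and a heuristic that down-weights a cue when its likelihood over the particles has high entropy (is nearly flat). The function names (`pf_step`, `sensor_reliabilities`, and so on) are hypothetical.

```python
import math
import random


def gaussian_likelihood(z, x, sigma):
    """Unnormalized Gaussian likelihood of measurement z given position x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they sum to 0)."""
    total = sum(values)
    if total <= 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sensor_reliabilities(per_sensor_weights):
    """Assumed heuristic: reliability = 1 - H / H_max, normalized over cues.

    A cue whose likelihood is almost flat over the particles carries little
    position information, has entropy near H_max, and gets a small weight.
    """
    n = len(per_sensor_weights[0])
    max_h = math.log(n) if n > 1 else 1.0
    raw = [max(0.0, 1.0 - entropy(w) / max_h) for w in per_sensor_weights]
    return normalize(raw)


def pf_step(particles, measurements, sigmas, motion_std, rng):
    """One predict / weight / resample cycle fusing several position cues."""
    n = len(particles)
    # Predict: random-walk motion model (an assumption of this sketch).
    predicted = [x + rng.gauss(0.0, motion_std) for x in particles]
    # Per-cue likelihoods, normalized over the particle set.
    per_sensor = [
        normalize([gaussian_likelihood(z, x, s) for x in predicted])
        for z, s in zip(measurements, sigmas)
    ]
    alphas = sensor_reliabilities(per_sensor)
    # Fuse: weighted product of likelihoods, computed in the log domain.
    log_w = [
        sum(a * math.log(w[i] + 1e-300) for a, w in zip(alphas, per_sensor))
        for i in range(n)
    ]
    peak = max(log_w)
    weights = normalize([math.exp(lw - peak) for lw in log_w])
    # Posterior-mean position estimate, then multinomial resampling.
    estimate = sum(w * x for w, x in zip(weights, predicted))
    resampled = rng.choices(predicted, weights=weights, k=n)
    return resampled, estimate, alphas


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    # Hypothetical readings: two sharp cues and one nearly uninformative cue.
    for _ in range(10):
        particles, estimate, alphas = pf_step(
            particles, [5.1, 4.9, 5.0], [0.3, 0.3, 5.0], 0.1, rng
        )
    print(estimate, alphas)
```

One caveat of this particular heuristic: entropy measures only how peaked a cue is, not whether it is correct, so a confident but wrong cue would still receive a high weight. A practical system would need an additional consistency check across cues.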
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.