Real-Time Recognition of Explosion Scenes Based on Audio-Visual Hierarchical Model

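Cite this article:	Zhuang Yueting, Fu Zhenggang, Ye Zhaoyang, Wu Fei. Real-Time Recognition of Explosion Scenes Based on Audio-Visual Hierarchical Model[J]. Journal of Computer-Aided Design & Computer Graphics, 2004, 16(1): 90-97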

Author names:	Zhuang Yueting, Fu Zhenggang, Ye Zhaoyang, Wu Fei

Affiliation:	Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027 (all four authors)

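Funding:	Supported by the National Natural Science Foundation of China (60272031), the Doctoral Program Foundation of the Ministry of Education (20010335049), the National "Tenth Five-Year Plan" Key Science and Technology Project (2001BA101A0703), and the Key Research Project of the Zhejiang Provincial Science and Technology Plan (2003C21010)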

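Abstract (translated from the Chinese):	This paper proposes an algorithm that uses a hierarchical audio-visual model to recognize "explosion" scenes in MPEG multimedia streams, working in the compressed domain under real-time conditions. First, a coarse support vector machine separates explosion and explosion-like audio from all other audio. Several fine-grained support vector machines then distinguish explosion audio from explosion-like audio, which yields the audio explosion candidate scenes. Because most explosion scenes are accompanied by an abrupt visual change, each audio candidate is then checked for whether its corresponding visual features changed, which gives the final recognition result.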
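Keywords (translated from the Chinese):	compressed-domain features; hierarchical support vector machine; audio-visual event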
Real-Time Recognition of Explosion Scenes Based on Audio-Visual Hierarchical Model

Zhuang Yueting, Fu Zhenggang, Ye Zhaoyang, Wu Fei. Real-Time Recognition of Explosion Scenes Based on Audio-Visual Hierarchical Model[J]. Journal of Computer-Aided Design & Computer Graphics, 2004, 16(1): 90-97

Authors:	Zhuang Yueting, Fu Zhenggang, Ye Zhaoyang, Wu Fei

Abstract:	An audio-visual hierarchical model is used to detect explosion scenes in MPEG streams from compressed-domain features. First, a coarse SVM separates explosion and explosion-like audio from all other audio; several fine-grained SVMs then distinguish explosion audio from explosion-like audio. This coarse-to-fine cascade yields the audio explosion candidates. Because most explosion scenes show an obvious visual change, the video corresponding to each candidate is then checked to produce the final result.

Keywords:	compressed features; hierarchical SVM; audio-visual event
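
The three-stage pipeline described in the abstract (coarse audio filter, fine-grained audio filters, visual-change check) can be sketched as a simple cascade. This is only an illustrative sketch, not the authors' implementation: the abstract does not say which features are used, how the fine-grained SVMs are combined, or how visual change is measured. The names `Segment`, `detect_explosions`, `visual_threshold`, and the toy classifiers are all hypothetical, plain functions stand in for trained SVMs, and the rule "every fine-grained classifier must agree" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical type: a trained classifier reduced to a yes/no decision
# on an audio feature vector (stands in for an SVM's predict step).
AudioClassifier = Callable[[Sequence[float]], bool]


@dataclass
class Segment:
    audio_features: Sequence[float]  # placeholder for compressed-domain audio features
    visual_change: float             # placeholder score for how much the picture changes


def detect_explosions(
    segments: Sequence[Segment],
    coarse: AudioClassifier,
    fine_classifiers: Sequence[AudioClassifier],
    visual_threshold: float,
) -> List[int]:
    """Return the indices of segments that pass all three stages."""
    hits = []
    for i, seg in enumerate(segments):
        # Stage 1: coarse filter keeps explosion and explosion-like audio.
        if not coarse(seg.audio_features):
            continue
        # Stage 2 (assumed rule): every fine-grained classifier must
        # prefer "explosion" over its explosion-like alternative.
        if not all(f(seg.audio_features) for f in fine_classifiers):
            continue
        # Stage 3: an audio candidate is confirmed only if the
        # corresponding video shows enough visual change.
        if seg.visual_change >= visual_threshold:
            hits.append(i)
    return hits


# Toy stand-ins for trained classifiers (made-up thresholds).
def coarse_toy(x: Sequence[float]) -> bool:
    return x[0] > 0.5


def fine_toy_a(x: Sequence[float]) -> bool:
    return x[1] > 0.3


def fine_toy_b(x: Sequence[float]) -> bool:
    return x[2] < 0.8


segments = [
    Segment((0.9, 0.6, 0.2), 0.7),  # passes every stage
    Segment((0.2, 0.6, 0.2), 0.9),  # rejected by the coarse filter
    Segment((0.9, 0.1, 0.2), 0.9),  # rejected by a fine-grained classifier
    Segment((0.9, 0.6, 0.2), 0.1),  # audio candidate, but no visual change
]

hits = detect_explosions(segments, coarse_toy, [fine_toy_a, fine_toy_b], 0.5)
print(hits)  # prints [0]
```

Ordering the stages this way means the cheap coarse filter discards most segments before the fine-grained classifiers and the video check run, which is one plausible reason a cascade suits a real-time setting.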
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
