
Video Multimodal Content Analysis Based on Unified Semantic-Space Representation

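Cite this article:	ZHANG De, WANG Ziwei, ZHANG Feng. Video multimodal content analysis based on unified semantic-space representation [J]. 电视技术, 2017, 41(7).

Authors:	张德 (ZHANG De), 王子玮 (WANG Ziwei), 张峰 (ZHANG Feng)

Affiliation:	Information Science Academy, China Electronics Technology Group Corporation, Beijing 100086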

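Abstract:	Video is the most comprehensive carrier in data processing and the one with the broadest content. A video's title is conveyed in text and its content in a sequence of image frames, and some videos also carry background music or voice-over narration. Processing video therefore means multimodal processing of text, images, and sound. With a focus on multimodal processing, a framework for multimodal video content analysis based on unified representation in a semantic space is proposed. Deep neural networks with different architectures handle the video's text, images, and audio separately; to reach a unified result, these differently structured networks are brought together in a semantic space, through which comprehensive cognition is carried out. The proposed architecture is clear and well layered, and provides guidance for modeling video understanding.
Keywords:	semantic space; multimodal; video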
Semantic space representation based multiple model analysis of video

ZHANG De, WANG Ziwei, ZHANG Feng. Semantic space representation based multiple model analysis of video[J]. TV Engineering, 2017, 41(7).

Authors:	ZHANG De, WANG Ziwei, ZHANG Feng

Abstract:	Video is the most comprehensive carrier in data processing and covers the widest range of content. The title of a video is expressed as text and its content as successive image frames; some videos also include background music or commentary. Video processing is therefore multimodal processing of text, images, and sound. Focusing on multimodal processing, this paper presents a framework for analyzing multimodal video content based on a unified semantic-space representation. Deep neural networks of several architectures process the text, images, and audio of a video separately; to obtain a unified result, the outputs of these differently structured networks are mapped into a semantic space, where comprehensive cognition is performed. The proposed architecture is clear and hierarchical, and offers guidance for modeling video understanding.

Keywords:	semantic space; multiple model; video
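
The framework summarized in the abstract (one network per modality, with all outputs brought into a shared semantic space) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the feature sizes, the semantic-space dimension, the random linear projections standing in for trained deep networks, and fusion by averaging.

```python
import math
import random


def make_projection(in_dim, out_dim, rng):
    """Random linear map standing in for a trained modality-specific network (assumption)."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def project(matrix, features):
    """Map one modality's feature vector into the shared semantic space."""
    return l2_normalize([sum(w * x for w, x in zip(row, features)) for row in matrix])


def fuse(embeddings):
    """Combine per-modality embeddings by averaging (a hypothetical fusion rule)."""
    dim = len(embeddings[0])
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    return l2_normalize(mean)


def cosine(a, b):
    """Cosine similarity; inputs are assumed to be unit-length already."""
    return sum(x * y for x, y in zip(a, b))


rng = random.Random(0)
SEMANTIC_DIM = 8  # hypothetical size of the shared semantic space
feature_dims = {"text": 16, "image": 32, "audio": 12}  # hypothetical feature sizes

# One projection per modality; each has a different input size but the same output size.
projections = {m: make_projection(d, SEMANTIC_DIM, rng) for m, d in feature_dims.items()}

# Placeholder features; in practice these would come from the per-modality networks.
features = {m: [rng.random() for _ in range(d)] for m, d in feature_dims.items()}

embeddings = {m: project(projections[m], features[m]) for m in feature_dims}
video_embedding = fuse(list(embeddings.values()))

for modality, emb in embeddings.items():
    print(modality, round(cosine(emb, video_embedding), 3))
```

Because every modality ends up as a vector of the same length, the embeddings can be compared or combined directly, which is the practical benefit of a unified semantic space.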
This article is indexed in Wanfang Data and other databases.
