
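Multimodal Video Content Analysis Based on a Unified Semantic-Space Representation
Cite this article: ZHANG De, WANG Ziwei, ZHANG Feng. 基于语义空间统一表征的视频多模态内容分析技术[J]. 电视技术, 2017, 41(7).
Authors: ZHANG De (张德), WANG Ziwei (王子玮), ZHANG Feng (张峰)
Affiliation: Information Science Academy, China Electronics Technology Group Corporation, Beijing 100086, China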
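Abstract: Among the data carriers handled in data processing, video is the most comprehensive and covers the widest range of content. A video's title is expressed in text and its content in consecutive image frames; some videos also contain background music or voice-over narration. Video processing is therefore multimodal processing of text, images, and sound. Focusing on multimodal processing techniques, this paper proposes a framework for multimodal video content analysis based on a unified semantic-space representation. Deep neural networks of several architectures process the video's text, images, and audio separately. To obtain a unified result, these differently structured networks are mapped into a common semantic space, where integrated understanding takes place. The proposed architecture is clear and well layered, and it offers guidance for modeling video understanding.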

Keywords: semantic space; multimodal; video

Semantic space representation based multiple model analysis of video
ZHANG De, WANG Ziwei, ZHANG Feng. Semantic space representation based multiple model analysis of video[J]. TV Engineering, 2017, 41(7).
Authors: ZHANG De, WANG Ziwei, ZHANG Feng
Abstract: Of all the carriers handled in data processing, video is the most comprehensive and has the broadest content. The title of a video is expressed in text, and its content is expressed in successive image frames. Some videos also include background music or commentary. Video processing is therefore multimodal processing of text, images, and sound. With a focus on multimodal processing, this paper presents a framework for analyzing multimodal video content through a unified representation in semantic space. Deep neural networks with different architectures handle the text, images, and audio of a video separately. To reach a unified result, the outputs of these networks are brought into a single semantic space, and comprehensive cognition is carried out there. The architecture presented in this paper is clear and hierarchical, and it can guide the modeling of video understanding.
Keywords: semantic space; multimodal; video
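
The abstract describes the framework only at a high level: separate networks for text, images, and audio, all mapped into one semantic space. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the random linear projections stand in for trained modality-specific deep networks, and the feature sizes, `SEMANTIC_DIM`, the averaging fusion, and all function names are hypothetical.

```python
import math
import random

SEMANTIC_DIM = 8  # assumed size of the shared semantic space


def make_projection(in_dim, out_dim, seed):
    """Random linear map standing in for a trained modality-specific network."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def embed(features, projection):
    """Project one modality's features into the shared space and normalize."""
    projected = [sum(w * x for w, x in zip(row, features)) for row in projection]
    return l2_normalize(projected)


def fuse(embeddings):
    """Average the per-modality embeddings, then renormalize (assumed fusion rule)."""
    dim = len(embeddings[0])
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    return l2_normalize(mean)


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical feature sizes for the title text, image frames, and audio track.
PROJECTIONS = {
    "text": make_projection(5, SEMANTIC_DIM, seed=1),
    "image": make_projection(12, SEMANTIC_DIM, seed=2),
    "audio": make_projection(7, SEMANTIC_DIM, seed=3),
}


def represent_video(features_by_modality):
    """Map whichever modalities are present into one semantic vector."""
    embeddings = [
        embed(feats, PROJECTIONS[name])
        for name, feats in features_by_modality.items()
    ]
    return fuse(embeddings)


# Made-up input features, used only to exercise the sketch.
video = {
    "text": [0.2, 0.1, 0.9, 0.0, 0.4],
    "image": [0.5] * 12,
    "audio": [0.1, 0.3, 0.2, 0.8, 0.0, 0.6, 0.4],
}
video_vec = represent_video(video)
```

Because every modality lands in the same space, a video without an audio track can be represented by calling `represent_video` with only its text and image features, and two videos can be compared with `cosine` regardless of which modalities each one has.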
This article is indexed in 万方数据 (Wanfang Data) and other databases.