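Scene Graph Generation Model Combined with External Knowledge Base and Adaptive Reasoning
Cite this article: WANG Yini, GAO Yongbin, WAN Weibing, YANG Shuqun, GUO Ruyan. Scene Graph Generation Model Combined with External Knowledge Base and Adaptive Reasoning[J]. Computer Engineering (计算机工程), 2022, 48(9): 230-238.
Authors: WANG Yini (王旖旎), GAO Yongbin (高永彬), WAN Weibing (万卫兵), YANG Shuqun (杨淑群), GUO Ruyan (郭茹燕)
Affiliation: School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600, China
Funding: National Natural Science Foundation of China, Youth Science Fund Project (61802253).
Abstract (translated from the Chinese): To capture important contextual information in a scene graph generation network while reducing the effect of dataset bias on scene graph generation performance, a scene graph generation model based on an external knowledge base and adaptive reasoning is constructed. An object detection module combined with an external knowledge base introduces linguistic prior knowledge, improving the accuracy of relationship-category detection for entity pairs. A Transformer-based context information extraction module is designed, in which two Transformer encoder layers process the candidate boxes and the entity-pair relationship categories, and the self-attention mechanism merges contextual information in stages to obtain important global context. An adaptive reasoning module with special feature fusion is built; it softens the distribution and classifies relationships adaptively according to the visual appearance of entity pairs, which alleviates the long-tailed distribution of entity-pair relationship frequencies and improves the model's reasoning ability. Experimental results on the VG dataset show that, compared with the MOTIFS model, the model improves Top-100 recall by 1.4, 4.3, and 7.1 percentage points on the predicate classification, scene graph classification, and scene graph generation subtasks, respectively, and generates better scene graphs for most relationship categories.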

Keywords: scene graph; visual relationship; external knowledge base; attention mechanism; adaptive reasoning
Received: 2021-08-05
Revised: 2021-10-15

Scene Graph Generation Model Combined with External Knowledge Base and Adaptive Reasoning
WANG Yini, GAO Yongbin, WAN Weibing, YANG Shuqun, GUO Ruyan. Scene Graph Generation Model Combined with External Knowledge Base and Adaptive Reasoning[J]. Computer Engineering, 2022, 48(9): 230-238.
Authors:WANG Yini  GAO Yongbin  WAN Weibing  YANG Shuqun  GUO Ruyan
Affiliation:School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600, China
Abstract: To obtain better contextual information in a Scene Graph Generation (SGG) network while reducing the impact of dataset bias, this study proposes an SGG model based on an external knowledge base and adaptive reasoning. First, the model uses an object detection module combined with an external knowledge base to supply linguistic prior knowledge, which improves the accuracy of relationship-category detection for entity pairs. Second, the model uses a Transformer-based context information extraction module that processes the candidate boxes and entity-pair relationship labels through two Transformer encoder layers and merges the context information in stages using the self-attention mechanism, yielding more meaningful global context information. Finally, because relationship frequencies follow a long-tailed distribution, the model includes an adaptive reasoning module with special feature fusion, which alleviates this problem by softening the distribution and by adaptively reasoning about relationship classification based on the visual appearance of entity pairs. Experimental results on the Visual Genome (VG) dataset show that, compared with the MOTIFS model, the proposed model increases Top-100 Recall (Recall@100, R@100) on the Predicate Classification (PredCls), Scene Graph Classification (SGCls), and Scene Graph Generation (SGGen) subtasks by 1.4, 4.3, and 7.1 percentage points, respectively. Furthermore, the proposed model generates better scene graphs for most relationship categories.
Keywords:scene graph  visual relationship  external knowledge base  attention mechanism  adaptive reasoning  
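
Note on the metric: the abstract reports its gains as Recall@100. As a rough illustration of how Recall@K is usually computed for scene graph triplets, the following minimal Python sketch may help. It is not the authors' evaluation code. The function name `recall_at_k`, the triplet labels, and the scores are all hypothetical, and the sketch assumes a triplet counts as matched when its labels agree exactly; standard SGG evaluation additionally requires the predicted boxes to overlap the ground-truth boxes, which is omitted here.

```python
from typing import List, Tuple

# A relationship triplet: (subject label, predicate label, object label).
Triplet = Tuple[str, str, str]


def recall_at_k(
    predictions: List[Tuple[Triplet, float]],
    ground_truth: List[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found among the k top-scoring predictions.

    Simplifying assumption: matching is by labels only (no box IoU check).
    """
    if not ground_truth:
        return 0.0
    # Rank predictions by confidence, highest first, and keep the top k.
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    top_k = {triplet for triplet, _score in ranked[:k]}
    unique_gt = set(ground_truth)
    hits = sum(1 for triplet in unique_gt if triplet in top_k)
    return hits / len(unique_gt)


# Hypothetical example for a single image.
example_predictions = [
    (("man", "riding", "horse"), 0.9),
    (("man", "on", "horse"), 0.8),
    (("horse", "on", "grass"), 0.4),
    (("man", "wearing", "hat"), 0.3),
]
example_ground_truth = [
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
]

print(recall_at_k(example_predictions, example_ground_truth, k=2))
print(recall_at_k(example_predictions, example_ground_truth, k=4))
```

With K=2 only one of the two annotated relationships falls inside the top-ranked predictions; with K=4 both do. A dataset-level score is typically the average of this per-image value over all test images.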